E‐Shape Analysis Page: 13
Table 2.2. Accuracy of individual bucket.
2.4.1. Spam Filtering
In this application of E-shape, I discuss the effect that E-shape analysis can have on
the spam filtering process. The Bayesian filter proves to be over 99% successful nearly all
of the time. However, to reach the goal of 100%, further analysis is required. The Bayesian
filter uses content and context to classify emails. The process could be enhanced by using
shape analysis to "look at" whether an email is spam or ham, taking content and context
completely out of the equation. Emails that surprise the classifier, either because they
cannot be categorized or because they are uniquely constructed, might make it through the filter.
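To make concrete what classifying on content means, the following is a minimal sketch of a word-count naive Bayes classifier with add-one smoothing. It is an illustration only, not the Bayesian filter evaluated in this thesis. The function names `train` and `classify` and the toy messages are assumptions introduced here.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")

def train(docs):
    """Collect per-label word counts and message counts.

    docs is a list of (label, text) pairs; this toy format is an assumption.
    """
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for label, text in docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability score.

    Assumes every label occurred at least once in training.
    """
    vocab = set().union(*(word_counts[label] for label in LABELS))
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        # Log prior for the label.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data, invented purely for illustration.
training = [
    ("spam", "cheap pills buy now"),
    ("spam", "buy cheap watches"),
    ("ham", "meeting agenda for monday"),
    ("ham", "lunch on monday"),
]
word_counts, doc_counts = train(training)
print(classify("buy cheap pills", word_counts, doc_counts))
```

A classifier of this kind depends entirely on the words in a message, which is why a message built from unfamiliar vocabulary can slip past it.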
The data set used for this case study was the TREC 2007 corpus [7]. The TREC corpora
are widely used in spam testing. The 2007 corpus contains over 74,000 emails. However, for this
study, only the first 7,500 emails were used for analysis. The corpus is approximately 67%
spam and 33% ham and has been hand-labeled by the TREC team.
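The subset selection described above can be sketched as follows. The sketch assumes that labels are available as one `label path` line per message (for example `spam ../data/inmail.1`). That layout, the function name `label_proportions`, and the sample lines are assumptions for illustration, not details given in this text. Note that the 67%/33% split is stated for the corpus, so the first 7,500 messages may not show the same proportions.

```python
from collections import Counter
from itertools import islice

def label_proportions(index_lines, limit=7500):
    """Return the share of each label among the first `limit` index lines.

    Each line is assumed to look like "spam ../data/inmail.1";
    lines that do not split into exactly two fields are skipped.
    """
    counts = Counter()
    for line in islice(index_lines, limit):
        parts = line.split()
        if len(parts) != 2:
            continue
        label, _path = parts
        counts[label.lower()] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}

# Hypothetical index lines, used only to exercise the function.
sample = [
    "spam ../data/inmail.1",
    "ham ../data/inmail.2",
    "spam ../data/inmail.3",
    "spam ../data/inmail.4",
]
proportions = label_proportions(sample, limit=3)
print(proportions)
```

With a real index file, the same function could be called on the open file object, since `islice` reads only the first `limit` lines.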
Bucket  Accuracy  False Negative  False Positive  Total Emails
1       41.37%    14              48              81
2       74.07%    20              20              76
3       80.95%    20              11              59
4       100%      0               0               129
5       68.75%    0               28              90
6       45.83%    0               25              67
7       100%      0               0               78
8       93.10%    14              6               81
9       88.00%    22              17              140
10      100%      0               0               118
11      100%      0               0               70
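For readers who want to work with the bucket figures, the sketch below transcribes the table rows into Python and derives a few simple summaries. The variable names are assumptions. It does not recompute the accuracy column from the error counts, because the formula behind that column is not given on this page.

```python
# Rows transcribed from the table:
# (bucket, accuracy in percent, false negatives, false positives, total emails)
BUCKETS = [
    (1, 41.37, 14, 48, 81),
    (2, 74.07, 20, 20, 76),
    (3, 80.95, 20, 11, 59),
    (4, 100.0, 0, 0, 129),
    (5, 68.75, 0, 28, 90),
    (6, 45.83, 0, 25, 67),
    (7, 100.0, 0, 0, 78),
    (8, 93.10, 14, 6, 81),
    (9, 88.00, 22, 17, 140),
    (10, 100.0, 0, 0, 118),
    (11, 100.0, 0, 0, 70),
]

# Buckets reported at 100% accuracy.
perfect_buckets = [bucket for bucket, accuracy, *_ in BUCKETS if accuracy == 100.0]

# Column totals across all buckets.
total_false_negatives = sum(row[2] for row in BUCKETS)
total_false_positives = sum(row[3] for row in BUCKETS)
total_emails = sum(row[4] for row in BUCKETS)

# Bucket with the lowest reported accuracy.
worst_bucket = min(BUCKETS, key=lambda row: row[1])[0]

print(perfect_buckets, total_false_negatives, total_false_positives, total_emails, worst_bucket)
```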
Sroufe, Paul. E‐Shape Analysis, thesis, December 2009; Denton, Texas. (https://digital.library.unt.edu/ark:/67531/metadc12201/m1/22/: accessed July 17, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu.