Stacking classifiers for anti-spam filtering of e-mail
Sakkis et al., 2001
- Document ID
- 2325942298245495882
- Author
- Sakkis G
- Androutsopoulos I
- Paliouras G
- Karkaletsis V
- Spyropoulos C
- Stamatopoulos P
- Publication year
- 2001
- Publication venue
- arXiv preprint cs/0106040
Snippet
We evaluate empirically a scheme for combining classifiers, known as stacked generalization, in the context of anti-spam filtering, a novel cost-sensitive application of text categorization. Unsolicited commercial e-mail, or "spam", floods mailboxes, causing …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6228—Selecting the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
- G06Q10/107—Computer aided management of electronic mail
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/30707—Clustering or classification into predefined classes
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Sakkis et al. | | Stacking classifiers for anti-spam filtering of e-mail |
Androutsopoulos et al. | | An evaluation of naive bayesian anti-spam filtering |
Trivedi | | A study of machine learning classifiers for spam detection |
Koprinska et al. | | Learning to classify e-mail |
Tretyakov | | Machine learning techniques in spam filtering |
Lowd et al. | | Good Word Attacks on Statistical Spam Filters. |
US8131655B1 (en) | | Spam filtering using feature relevance assignment in neural networks |
US8010614B1 (en) | | Systems and methods for generating signatures for electronic communication classification |
Novovičová et al. | | Conditional mutual information based feature selection for classification task |
Crammer et al. | | Confidence-weighted linear classification for text categorization |
Oda et al. | | Immunity from spam: An analysis of an artificial immune system for junk email detection |
Yang et al. | | An approach to spam detection by naive Bayes ensemble based on decision induction |
Khoussainov et al. | | Email Task Management: An Iterative Relational Learning Approach. |
Trivedi et al. | | An enhanced genetic programming approach for detecting unsolicited emails |
US7096208B2 (en) | | Large margin perceptrons for document categorization |
Faddoul et al. | | Learning multiple tasks with boosted decision trees |
Zhou et al. | | Adaptive spam filtering using dynamic feature spaces |
Cheng et al. | | Personalized spam filtering with semi-supervised classifier ensemble |
L'Huillier et al. | | Online phishing classification using adversarial data mining and signaling games |
Nisar et al. | | Voting-ensemble classification for email spam detection |
Sheu | | An Efficient Two-phase Spam Filtering Method Based on E-mails Categorization. |
Ying et al. | | FrauDetector+ An Incremental Graph-Mining Approach for Efficient Fraudulent Phone Call Detection |
Bickel et al. | | Learning from message pairs for automatic email answering |
Ganiev et al. | | Artificial intelligence-based methods for filtering spam messages in email services |
Trudgian | | Spam classification using nearest neighbour techniques |