Drucker et al., 1999 - Google Patents
Support vector machines for spam categorizationDrucker et al., 1999
View PDF- Document ID
- 18185744430868097683
- Author
- Drucker H
- Wu D
- Vapnik V
- Publication year
- Publication venue
- IEEE Transactions on Neural networks
External Links
Snippet
We study the use of support vector machines (SVM) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one …
- 238000007635 classification algorithm 0 abstract description 2
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/30707—Clustering or classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation, e.g. computer aided management of electronic mail or groupware; Time management, e.g. calendars, reminders, meetings or time accounting
- G06Q10/107—Computer aided management of electronic mail
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Drucker et al. | Support vector machines for spam categorization | |
Rupapara et al. | Impact of SMOTE on imbalanced text features for toxic comments classification using RVVC model | |
Alguliyev et al. | COSUM: Text summarization based on clustering and optimization | |
Magdy et al. | Efficient spam and phishing emails filtering based on deep learning | |
Dewdney et al. | The form is the substance: Classification of genres in text | |
Méndez et al. | Tracking concept drift at feature selection stage in spamhunting: An anti-spam instance-based reasoning system | |
RU2543315C2 (en) | Method of selecting effective versions in search and recommendation systems (versions) | |
Quan et al. | Term weighting schemes for question categorization | |
Mengle et al. | Ambiguity measure feature‐selection algorithm | |
Firte et al. | Spam detection filter using KNN algorithm and resampling | |
Choi et al. | Web page classification | |
Doshi et al. | Movie genre detection using topological data analysis | |
Van Gysel et al. | Reply with: Proactive recommendation of email attachments | |
Liu et al. | Boosting to correct inductive bias in text classification | |
Dasgupta et al. | Enhanced email spam filtering through combining similarity graphs | |
Reddy et al. | Classification of spam messages using random forest algorithm | |
Almeida et al. | Filtering spams using the minimum description length principle | |
Trivedi et al. | A modified content-based evolutionary approach to identify unsolicited emails | |
Hur et al. | Utilizing large language models for detection of sms spam in few-shot settings | |
Yih et al. | Mining online deal forums for hot deals | |
Itskevitch | Automatic hierarchical e-mail classification using association rules | |
Macskassy et al. | Emailvalet: Learning user preferences for wireless email | |
Mock | Dynamic email organization via relevance categories | |
Yang et al. | Intelligent email categorization based on textual information and metadata | |
Kalinov et al. | Building a dynamic classifier for large text data collections |