Reincke, 2003 - Google Patents

Profiling and classification of scientific documents with SAS Text Miner

Document ID
17812926176005621854
Author
Reincke U
Publication year
2003
Publication venue
Third “Knowledge Discovery” Workshop

Snippet

The automatic classification of documents into categories is an increasingly important task. As scientific document collections in the life sciences continue to grow exponentially, retrieving and classifying the appropriate documents by hand can become …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/3071 Clustering or classification including class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/30707 Clustering or classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30864 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386 Retrieval requests
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30244 Information retrieval; Database structures therefor; File system structures therefor in image databases
    • G06F17/30247 Information retrieval; Database structures therefor; File system structures therefor in image databases based on features automatically derived from the image data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30011 Document retrieval systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Similar Documents

Publication | Publication Date | Title
US7603348B2 (en) | | System for classifying a search query
Buntine et al. | | Applying discrete PCA in data analysis
Barbosa et al. | | Combining classifiers to identify online databases
US6701305B1 (en) | | Methods, apparatus and computer program products for information retrieval and document classification utilizing a multidimensional subspace
EP1435581B1 (en) | | Retrieval of structured documents
US7752204B2 (en) | | Query-based text summarization
US20040010485A1 (en) | | Retrieving, detecting and identifying major and outlier clusters in a very large database
Chen et al. | | Automated feature weighting in naive bayes for high-dimensional data classification
Shu et al. | | A neural network-based intelligent metasearch engine
Cummins | | Document score distribution models for query performance inference and prediction
Hull | | Information retrieval using statistical classification
Ding et al. | | User modeling for personalized Web search with self‐organizing map
US20040186833A1 (en) | | Requirements-based knowledge discovery for technology management
Phadnis et al. | | Framework for document retrieval using latent semantic indexing
Nanas et al. | | A comparative evaluation of term weighting methods for information filtering
Reincke | | Profiling and classification of scientific documents with SAS Text Miner
McCarey et al. | | Recommending library methods: An evaluation of the vector space model (VSM) and latent semantic indexing (LSI)
Vadivel et al. | | An Effective Document Category Prediction System Using Support Vector Machines, Mann-Whitney Techniques
Negm et al. | | Investigate the performance of document clustering approach based on association rules mining
Trieschnigg et al. | | Hierarchical topic detection in large digital news archives
Yang et al. | | Decomposition of term-document matrix representation for clustering analysis
Regulski | | Formalization of technological knowledge in the field of metallurgy using document classification tools supported with semantic techniques
Cheng et al. | | Learning to rank relevant documents for information retrieval in bioengineering text corpora
Imran et al. | | Selecting Effective Expansion Terms for Better Information Retrieval
Samundeeswari | | Comparison of neural networks and support vector machines using PCA and ICA for feature reduction