Farkas et al., 2007 - Google Patents
Improving a state-of-the-art named entity recognition system using the world wide web
Farkas et al., 2007
- Document ID
- 5332534771289717334
- Author
- Farkas R
- Szarvas G
- Ormándi R
- Publication year
- 2007
- Publication venue
- Industrial Conference on Data Mining
Snippet
The development of highly accurate Named Entity Recognition (NER) systems can be beneficial to a wide range of Human Language Technology applications. In this paper we introduce three heuristics that exploit a variety of knowledge sources (the World Wide Web …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
- G06F17/278—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/30707—Clustering or classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Szarvas et al. | A multilingual named entity recognition system using boosting and C4.5 decision tree learning algorithms | |
| Chopra et al. | Named entity recognition in Hindi using hidden Markov model | |
| Jabbar et al. | An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach | |
| Rozovskaya et al. | Correcting grammatical verb errors | |
| Prokopidis et al. | A neural NLP toolkit for Greek | |
| Zheng et al. | Dynamic knowledge-base alignment for coreference resolution | |
| Fu et al. | Exploiting multiple sources for open-domain hypernym discovery | |
| Díez Platas et al. | Medieval Spanish (12th–15th centuries) named entity recognition and attribute annotation system based on contextual information | |
| Lamba et al. | A survey on plagiarism detection techniques for Indian regional languages | |
| Espinosa-Anke et al. | Applying dependency relations to definition extraction | |
| Widyantoro et al. | Citation sentence identification and classification for related work summarization | |
| Li et al. | National University of Singapore at the TREC-13 question answering main task | |
| Krithika et al. | Learning to grade short answers using machine learning techniques | |
| Zhao et al. | Learning Question Paraphrases for QA from Encarta Logs | |
| Pasini et al. | Huge automatically extracted training sets for multilingual word sense disambiguation | |
| Kuba et al. | POS tagging of Hungarian with combined statistical and rule-based methods | |
| Farkas et al. | Improving a state-of-the-art named entity recognition system using the world wide web | |
| Loukachevitch et al. | Determining the most frequent senses using Russian linguistic ontology RuThes | |
| Bruches et al. | A system for information extraction from scientific texts in Russian | |
| Alabbas | ArbTE: Arabic textual entailment | |
| Spiegler et al. | Learning the morphology of Zulu with different degrees of supervision | |
| Nothman | Learning named entity recognition from Wikipedia | |
| Bosch et al. | Memory-based morphological analysis and part-of-speech tagging of Arabic | |
| Sinha et al. | Abstractive text summarization for Sanskrit prose: a study of methods and approaches | |
| Ilgen et al. | Exploring feature sets for Turkish word sense disambiguation |