Developing a Persian chunker using a hybrid approach
Kiani et al., 2009
- Document ID
- 3993535500128490738
- Author
- Kiani S
- Akhavan T
- Shamsfard M
- Publication year
- 2009
- Publication venue
- 2009 International Multiconference on Computer Science and Information Technology
Snippet
Text segmentation is the process of recognizing the boundaries of text constituents, such as sentences, phrases and words. This paper focuses on phrase segmentation, also known as chunking. This task has different problems in various natural languages depending on …
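As a rough illustration of what chunking means in practice, the sketch below labels tokens with B-NP / I-NP / O tags using a single hand-written rule over part-of-speech tags. It is not the hybrid method of the paper, which the snippet does not describe; the function name `chunk_noun_phrases`, the toy tag set and the English example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical toy tag set for illustration; not the tag set used in the paper.
NP_TAGS = {"DET", "ADJ", "NOUN"}


def chunk_noun_phrases(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Label each (word, POS) pair as B-NP, I-NP, or O.

    A maximal run of tokens whose POS tag is in NP_TAGS is treated as one
    noun-phrase chunk: the first token gets B-NP, the rest get I-NP.
    """
    labels = []
    inside = False  # True while we are inside a noun-phrase run
    for word, tag in tagged:
        if tag in NP_TAGS:
            labels.append((word, "I-NP" if inside else "B-NP"))
            inside = True
        else:
            labels.append((word, "O"))
            inside = False
    return labels


sentence = [
    ("the", "DET"), ("old", "ADJ"), ("man", "NOUN"),
    ("reads", "VERB"),
    ("a", "DET"), ("book", "NOUN"),
]
for word, label in chunk_noun_phrases(sentence):
    print(word, label)
```

Here the two chunks found are "the old man" and "a book". A single rule like this cannot resolve ambiguous phrase boundaries, which is one reason chunkers often combine rules with statistical or learned components.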
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/2775—Phrasal analysis, e.g. finite state techniques, chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Poon et al. | Unsupervised morphological segmentation with log-linear models | |
| Elayeb | Arabic word sense disambiguation: a review | |
| Antony et al. | Parts of speech tagging for Indian languages: a literature survey | |
| Rozovskaya et al. | Correcting grammatical verb errors | |
| Das et al. | Part of speech tagging in Odia using support vector machine | |
| Jayakrishnan et al. | Multi-class emotion detection and annotation in Malayalam novels | |
| Suárez et al. | A maximum entropy-based word sense disambiguation system | |
| Dien et al. | POS-tagger for English-Vietnamese bilingual corpus | |
| Tlili-Guiassa | Hybrid method for tagging Arabic text | |
| Kumar et al. | Morphological analyzer for agglutinative languages using machine learning approaches | |
| Wan et al. | Enhancing metaphor detection by gloss-based interpretations | |
| Surahio et al. | Prediction system for Sindhi parts of speech tags by using support vector machine | |
| Jayasuriya et al. | Learning a stochastic part of speech tagger for Sinhala | |
| Hellwig | Coarse semantic classification of rare nouns using cross-lingual data and recurrent neural networks | |
| Bruches et al. | A system for information extraction from scientific texts in Russian | |
| Jamwal et al. | A Novel Hybrid Approach for the Designing and Implementation of Dogri Spell Checker | |
| Chakraborty et al. | Syntactic Category based Assamese Question Pattern Extraction using N-grams | |
| Kiani et al. | Developing a Persian chunker using a hybrid approach | |
| Francis | A comprehensive survey on parts of speech tagging approaches in Dravidian languages | |
| Hoste | The mention-pair model | |
| Bosch et al. | Memory-based morphological analysis and part-of-speech tagging of Arabic | |
| Farrah et al. | An hybrid approach to improve part of speech tagging system | |
| Bach et al. | Paraphrase identification in Vietnamese documents | |
| Sampath et al. | Hybrid Tamil spell checker with combined character splitting | |
| Kardan et al. | Improving Persian POS tagging using the maximum entropy model |