Partly supervised Uyghur morpheme segmentation
Ablimit et al., 2008
- Document ID
- 3007314334011393673
- Author
- Ablimit M
- Eli M
- Kawahara T
- Publication year
- 2008
- Publication venue
- Proc. Oriental-COCOSDA Workshop
Snippet
This paper introduces Uighur morpheme segmentation, which is a basic part of the comprehensive effort of the Uighur language corpus compilation, conducted at Xinjiang University in cooperation with Kyoto University. Uighur is an agglutinative language with …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/274—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G06F17/2881—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/211—Formatting, i.e. changing of presentation of document
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Althobaiti et al. | | AraNLP: A Java-based library for the processing of Arabic text |
| Ahmadi | | KLPT–Kurdish language processing toolkit |
| Oflazer | | Turkish and its challenges for language processing |
| Çöltekin | | A set of open source tools for Turkish natural language processing |
| Maamouri et al. | | Diacritization: A challenge to Arabic treebank annotation and parsing |
| Masmoudi et al. | | Transliteration of Arabizi into Arabic script for Tunisian dialect |
| Zeroual et al. | | A new Quranic Corpus rich in morphosyntactical information |
| Mohammed | | Using machine learning to build POS tagger for under-resourced language: the case of Somali |
| Sen et al. | | Bangla natural language processing: A comprehensive review of classical machine learning and deep learning based methods |
| Oflazer et al. | | Turkish and its challenges for language and speech processing |
| Ahmadi et al. | | Towards finite-state morphology of Kurdish |
| Vasiu et al. | | Enhancing tokenization by embedding Romanian language specific morphology |
| Jamro | | Sindhi language processing: A survey |
| Megerdoomian | | Developing a Persian part of speech tagger |
| Malik et al. | | A hybrid model for Urdu Hindi transliteration |
| Ablimit et al. | | Partly supervised Uyghur morpheme segmentation |
| Hellwig | | Morphological disambiguation of classical Sanskrit |
| Lingam et al. | | Rule-based machine translation from English to Telugu with emphasis on prepositions |
| KR101757222B1 (en) | | Paraphrase sentence generation method for a Korean language sentence |
| Toleush et al. | | Development of morphological segmentation for the Kyrgyz language on complete set of endings |
| Kammoun et al. | | The MORPH2 new version: A robust morphological analyzer for Arabic texts |
| Khoufi et al. | | Statistical-based system for morphological annotation of Arabic texts |
| Elsheikh | | Timeline of the development of Arabic PoS taggers and Morphological analysers |
| Adewole et al. | | Token validation in automatic corpus gathering for Yoruba language |
| Khemakhem et al. | | Integrating morpho-syntactic features in English-Arabic statistical machine translation |