Partly supervised Uyghur morpheme segmentation

Ablimit et al., 2008

Document ID: 3007314334011393673
Author: Ablimit M; Eli M; Kawahara T
Publication year: 2008
Publication venue: Proc. Oriental-COCOSDA Workshop

Snippet

This paper introduces Uyghur morpheme segmentation, a basic component of the comprehensive Uyghur language corpus compilation effort conducted at Xinjiang University in cooperation with Kyoto University. Uyghur is an agglutinative language with …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
            - G06F17/2765—Recognition
              - G06F17/277—Lexical analysis, e.g. tokenisation, collocates
            - G06F17/2705—Parsing
              - G06F17/271—Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
            - G06F17/274—Grammatical analysis; Style critique
            - G06F17/2785—Semantic analysis
          - G06F17/28—Processing or translating of natural language
            - G06F17/2809—Data driven translation
              - G06F17/2827—Example based machine translation; Alignment
            - G06F17/2872—Rule based translation
              - G06F17/2881—Natural language generation
            - G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
            - G06F17/2863—Processing of non-latin text
          - G06F17/21—Text processing
            - G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
              - G06F17/2217—Character encodings
            - G06F17/211—Formatting, i.e. changing of presentation of document
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
            - G06F17/30634—Querying
              - G06F17/30657—Query processing
                - G06F17/30675—Query execution
                  - G06F17/30684—Query execution using natural language analysis
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling

Similar Documents

Publication	Publication Date	Title
Althobaiti et al.	2014	AraNLP: A Java-based library for the processing of Arabic text
Ahmadi	2020	KLPT–Kurdish language processing toolkit
Oflazer	2014	Turkish and its challenges for language processing
Çöltekin	2014	A set of open source tools for Turkish natural language processing
Maamouri et al.	2006	Diacritization: A challenge to Arabic treebank annotation and parsing
Masmoudi et al.	2019	Transliteration of Arabizi into Arabic script for Tunisian dialect
Zeroual et al.	2016	A new Quranic Corpus rich in morphosyntactical information
Mohammed	2020	Using machine learning to build POS tagger for under-resourced language: the case of Somali
Sen et al.	2021	Bangla natural language processing: A comprehensive review of classical machine learning and deep learning based methods
Oflazer et al.	2018	Turkish and its challenges for language and speech processing
Ahmadi et al.	2020	Towards finite-state morphology of Kurdish
Vasiu et al.	2020	Enhancing tokenization by embedding Romanian language specific morphology
Jamro	2017	Sindhi language processing: A survey
Megerdoomian	2004	Developing a Persian part of speech tagger
Malik et al.	2009	A hybrid model for Urdu Hindi transliteration
Ablimit et al.	2008	Partly supervised Uyghur morpheme segmentation
Hellwig	2015	Morphological disambiguation of classical Sanskrit
Lingam et al.	2014	Rule-based machine translation from English to Telugu with emphasis on prepositions
KR101757222B1 (en)	2017-07-13	Paraphrase sentence generation method for a Korean language sentence
Toleush et al.	2021	Development of morphological segmentation for the Kyrgyz language on complete set of endings
Kammoun et al.	2010	The MORPH2 new version: A robust morphological analyzer for Arabic texts
Khoufi et al.	2013	Statistical-based system for morphological annotation of Arabic texts
Elsheikh	2018	Timeline of the development of Arabic PoS taggers and Morphological analysers
Adewole et al.	2017	Token validation in automatic corpus gathering for Yoruba language
Khemakhem et al.	2013	Integrating morpho-syntactic features in English-Arabic statistical machine translation