Farhath et al., 2018 - Google Patents

Integration of bilingual lists for domain-specific statistical machine translation for Sinhala-Tamil

Document ID: 17264314013381962167
Authors: Farhath F, Ranathunga S, Jayasena S, Dias G
Publication year: 2018
Publication venue: 2018 Moratuwa Engineering Research Conference (MERCon)

Snippet

Availability of quality parallel data is a major requirement to build a reasonably well performing statistical machine translation (SMT) system. Thus, developing a decent SMT system for a low-resourced language pair like Sinhala and Tamil that does not have a large …
Continue reading at ieeexplore.ieee.org

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G06F17/30669 Translation of the query language, e.g. Chinese to English
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2845 Using very large corpora, e.g. the world wide web [WWW]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G06F17/30684 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2872 Rule based translation
    • G06F17/2881 Natural language generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G06F17/2217 Character encodings
    • G06F17/2223 Handling non-latin characters, e.g. kana-to-kanji conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2795 Thesaurus; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2785 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/274 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers

Similar Documents

Publication, Title
Hearne et al. Statistical machine translation: a guide for linguists and translators
Koehn et al. Empirical methods for compound splitting
Farhath et al. Integration of bilingual lists for domain-specific statistical machine translation for Sinhala-Tamil
Birch et al. Edinburgh SLT and MT system description for the IWSLT 2014 evaluation
Haque et al. TermFinder: log-likelihood comparison and phrase-based statistical machine translation models for bilingual terminology extraction
Ranathunga et al. Si-Ta: Machine translation of Sinhala and Tamil official documents
Abdurakhmonova et al. Linguistic functionality of Uzbek Electron Corpus: uzbekcorpus.uz
Aasha et al. Machine translation from English to Malayalam using transfer approach
Barreiro et al. Linguistic evaluation of support verb constructions by OpenLogos and Google Translate
Ferreira et al. Surface realization shared task 2018 (SR18): The Tilburg University approach
Rabbani et al. A new verb based approach for English to Bangla machine translation
Vandeghinste et al. Parse and corpus-based machine translation
Tennage et al. Handling rare word problem using synthetic training data for Sinhala and Tamil neural machine translation
Farhath et al. Improving domain-specific SMT for low-resourced languages using data from different domains
Li et al. Uzbek-English and Turkish-English morpheme alignment corpora
Musleh et al. Enabling medical translation for low-resource languages
Yashothara et al. Improving Phrase-Based Statistical Machine Translation with Preprocessing Techniques
Haque et al. Ruslan Mitkov, Johanna Monti, Gloria Corpas Pastor, and Violeta Seretan (eds): Multiword units in machine translation and translation technology: Current Issues in Linguistic Theory, Volume 341, John Benjamins Publishing Company, Amsterdam & Philadelphia, 2018, ix + 259 pp, ISBN 978-90-272-0060-0 (HB), ISBN 978-90-272-6420-6 (e-book)
Mohaghegh et al. Improved language modeling for English-Persian statistical machine translation
Costa-Jussa et al. A large Spanish-Catalan parallel corpus release for machine translation
Dubey et al. Generation of bilingual dictionaries using structural properties
Akhtar et al. An unsupervised approach for mapping between vector spaces
Ji et al. Phonetic name matching for cross-lingual spoken sentence retrieval
Way et al. Multiword units in machine translation and translation technology: Ruslan Mitkov, Johanna Monti, Gloria Corpas Pastor, and Violeta Seretan (eds), Book Review.
Alubaidi Hybrid Arabic-English Machine Translation to Solve Reordering and Ambiguity Problems