[go: up one dir, main page]

Nikoulina et al., 2013 - Google Patents

Domain adaptation of statistical machine translation models with monolingual data for cross lingual information retrieval

Nikoulina et al., 2013

Document ID
6505461924216016823
Author
Nikoulina V
Clinchant S
Publication year
Publication venue
European Conference on Information Retrieval

External Links

Snippet

Abstract Statistical Machine Translation (SMT) is often used as a black-box in CLIR tasks. We propose an adaptation method for an SMT model relying on the monolingual statistics that can be extracted from the document collection (both source and target if available). We …
Continue reading at link.springer.com (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2809Data driven translation
    • G06F17/2827Example based machine translation; Alignment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/3066Query translation
    • G06F17/30669Translation of the query language, e.g. Chinese to English
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634Querying
    • G06F17/30657Query processing
    • G06F17/30675Query execution
    • G06F17/30684Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2809Data driven translation
    • G06F17/2845Using very large corpora, e.g. the world wide web [WWW]
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/289Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2872Rule based translation
    • G06F17/2881Natural language generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/28Processing or translating of natural language
    • G06F17/2863Processing of non-latin text
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2705Parsing
    • G06F17/271Syntactic parsing, e.g. based on context-free grammar [CFG], unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2765Recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2795Thesaurus; Synonyms
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/27Automatic analysis, e.g. parsing
    • G06F17/2785Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30613Indexing
    • G06F17/30619Indexing indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20Handling natural language data
    • G06F17/21Text processing
    • G06F17/22Manipulating or registering by use of codes, e.g. in sequence of text characters
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861Retrieval from the Internet, e.g. browsers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores

Similar Documents

Publication Publication Date Title
US9026425B2 (en) Lexical and phrasal feature domain adaptation in statistical machine translation
Hasler et al. Dynamic topic adaptation for phrase-based mt
Neubig et al. Machine translation without words through substring alignment
Tsvetkov et al. Lexicon stratification for translating out-of-vocabulary words
Haque et al. Bilingual termbank creation via log-likelihood comparison and phrase-based statistical machine translation
Saad et al. Cross-lingual semantic similarity measure for comparable articles
Shahnawaz et al. Statistical machine translation system for English to Urdu
Yıldız et al. The effect of parallel corpus quality vs size in English-to-Turkish SMT
Gutierrez-Vasques Bilingual lexicon extraction for a distant language pair using a small parallel corpus
Madhyastha et al. Learning bilingual projections of embeddings for vocabulary expansion in machine translation
Kalita et al. Bengali to assamese statistical machine translation using moses (corpus based)
Nikoulina et al. Domain adaptation of statistical machine translation models with monolingual data for cross lingual information retrieval
Richardson et al. Kyotoebmt: An example-based dependency-to-dependency translation framework
Burlot et al. Two-step mt: Predicting target morphology
Chandra et al. Hunting elusive English in Hinglish and Benglish text: unfolding challenges and remedies
Afli et al. MultiNews: A web collection of an aligned multimodal and multilingual corpus
Bhat et al. A house united: bridging the script and lexical barrier between Hindi and Urdu
Lohar et al. The ADAPT bilingual document alignment system at WMT16
Taslimipoor et al. Bilingual contexts from comparable corpora to mine for translations of collocations
Rasooli et al. Extracting parallel paragraphs and sentences from english-persian translated documents
Gupta et al. Automatic Ranking of MT Outputs using Approximations
Pathak et al. Natural Language Chhattisgarhi: A Literature Survey
Wang et al. Effective hypotheses re-ranking model in statistical machine translation
Ji et al. Phonetic name matching for cross-lingual spoken sentence retrieval
Luekhong et al. A study of a Thai-English translation comparing on applying phrase-based and hierarchical phrase-based translation