Chang et al., 2009 - Google Patents

Unsupervised constraint driven learning for transliteration discovery

Document ID
13647520998040076671
Author
Chang M
Goldwasser D
Roth D
Tu Y
Publication year
2009
Publication venue
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Snippet

This paper introduces a novel unsupervised constraint-driven learning algorithm for identifying named-entity (NE) transliterations in bilingual corpora. The proposed method does not require any annotated data or aligned corpora. Instead, it is bootstrapped using a …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G06F17/277 Lexical analysis, e.g. tokenisation, collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G06F17/30669 Translation of the query language, e.g. Chinese to English
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2863 Processing of non-latin text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/289 Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker

Similar Documents

Publication Publication Date Title
US9026425B2 (en) Lexical and phrasal feature domain adaptation in statistical machine translation
US10025778B2 (en) Training markov random field-based translation models using gradient ascent
Kim et al. Two-stage multi-intent detection for spoken language understanding
US9176936B2 (en) Transliteration pair matching
US8849665B2 (en) System and method of providing machine translation from a source language to a target language
Xu et al. Cross-domain and semisupervised named entity recognition in Chinese social media: A unified model
Darwish et al. Arabic POS tagging: Don’t abandon feature engineering just yet
Woodsend et al. Text rewriting improves semantic role labeling
CN105988990A (en) Device and method for resolving zero anaphora in Chinese language, as well as training method
Namysl et al. NAT: Noise-aware training for robust neural sequence labeling
CN103678318B (en) Multi-word unit extraction method and equipment and artificial neural network training method and equipment
Sammons et al. Relation Alignment for Textual Entailment Recognition.
EP3598321A1 (en) Method for parsing natural language text with constituent construction links
Kim et al. Part-of-speech taggers for low-resource languages using CCA features
Chang et al. Unsupervised constraint driven learning for transliteration discovery
JP5565827B2 (en) A sentence separator training device for language independent word segmentation for statistical machine translation, a computer program therefor and a computer readable medium.
Zhang et al. Active learning with semi-automatic annotation for extractive speech summarization
Goldwasser et al. Transliteration as constrained optimization
Jia et al. Improved discourse parsing with two-step neural transition-based model
Das Semi-Supervised and Latent-Variable Models of Natural Language Semantics
CN103914447A (en) Information processing device and information processing method
Mao et al. A neural joint model with BERT for Burmese syllable segmentation, word segmentation, and POS tagging
Seneviratne Text Simplification Using Natural Language Processing and Machine Learning for Better Language Understandability
Yang et al. Bilingual semantic role labeling inference via dual decomposition
Trye et al. A hybrid architecture for labelling bilingual Māori-English tweets