Unsupervised constraint driven learning for transliteration discovery
Chang et al., 2009
- Document ID
- 13647520998040076671
- Author
- Chang M
- Goldwasser D
- Roth D
- Tu Y
- Publication year
- 2009
- Publication venue
- Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Snippet
This paper introduces a novel unsupervised constraint-driven learning algorithm for identifying named-entity (NE) transliterations in bilingual corpora. The proposed method does not require any annotated data or aligned corpora. Instead, it is bootstrapped using a …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
Similar Documents
Publication | Title |
---|---|
US9026425B2 (en) | Lexical and phrasal feature domain adaptation in statistical machine translation |
US10025778B2 (en) | Training Markov random field-based translation models using gradient ascent |
Kim et al. | Two-stage multi-intent detection for spoken language understanding |
US9176936B2 (en) | Transliteration pair matching |
US8849665B2 (en) | System and method of providing machine translation from a source language to a target language |
Xu et al. | Cross-domain and semisupervised named entity recognition in Chinese social media: A unified model |
Darwish et al. | Arabic POS tagging: Don’t abandon feature engineering just yet |
Woodsend et al. | Text rewriting improves semantic role labeling |
CN105988990A (en) | Device and method for resolving zero anaphora in Chinese language, as well as training method |
Namysl et al. | NAT: Noise-aware training for robust neural sequence labeling |
CN103678318B (en) | Multi-word unit extraction method and equipment and artificial neural network training method and equipment |
Sammons et al. | Relation Alignment for Textual Entailment Recognition |
EP3598321A1 (en) | Method for parsing natural language text with constituent construction links |
Kim et al. | Part-of-speech taggers for low-resource languages using CCA features |
Chang et al. | Unsupervised constraint driven learning for transliteration discovery |
JP5565827B2 (en) | A sentence separator training device for language independent word segmentation for statistical machine translation, a computer program therefor and a computer readable medium |
Zhang et al. | Active learning with semi-automatic annotation for extractive speech summarization |
Goldwasser et al. | Transliteration as constrained optimization |
Jia et al. | Improved discourse parsing with two-step neural transition-based model |
Das | Semi-Supervised and Latent-Variable Models of Natural Language Semantics |
CN103914447A (en) | Information processing device and information processing method |
Mao et al. | A neural joint model with BERT for Burmese syllable segmentation, word segmentation, and POS tagging |
Seneviratne | Text Simplification Using Natural Language Processing and Machine Learning for Better Language Understandability |
Yang et al. | Bilingual semantic role labeling inference via dual decomposition |
Trye et al. | A hybrid architecture for labelling bilingual Māori-English tweets |