Unsupervised constraint driven learning for transliteration discovery

Chang et al., 2009

Document ID: 13647520998040076671
Authors: Chang M; Goldwasser D; Roth D; Tu Y
Publication year: 2009
Publication venue: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Snippet

This paper introduces a novel unsupervised constraint-driven learning algorithm for identifying named-entity (NE) transliterations in bilingual corpora. The proposed method does not require any annotated data or aligned corpora. Instead, it is bootstrapped using a …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker

Similar Documents

Publication	Publication Date	Title
US9026425B2 (en)	2015-05-05	Lexical and phrasal feature domain adaptation in statistical machine translation
US10025778B2 (en)	2018-07-17	Training markov random field-based translation models using gradient ascent
Kim et al.	2017	Two-stage multi-intent detection for spoken language understanding
US9176936B2 (en)	2015-11-03	Transliteration pair matching
US8849665B2 (en)	2014-09-30	System and method of providing machine translation from a source language to a target language
Xu et al.	2018	Cross-domain and semisupervised named entity recognition in chinese social media: A unified model
Darwish et al.	2017	Arabic pos tagging: Don’t abandon feature engineering just yet
Woodsend et al.	2014	Text rewriting improves semantic role labeling
CN105988990A (en)	2016-10-05	Device and method for resolving zero anaphora in Chinese language, as well as training method
Namysl et al.	2020	NAT: Noise-aware training for robust neural sequence labeling
CN103678318B (en)	2016-12-21	Multi-word unit extraction method and equipment and artificial neural network training method and equipment
Sammons et al.	2009	Relation Alignment for Textual Entailment Recognition.
EP3598321A1 (en)	2020-01-22	Method for parsing natural language text with constituent construction links
Kim et al.	2015	Part-of-speech taggers for low-resource languages using CCA features
Chang et al.	2009	Unsupervised constraint driven learning for transliteration discovery
JP5565827B2 (en)	2014-08-06	A sentence separator training device for language independent word segmentation for statistical machine translation, a computer program therefor and a computer readable medium.
Zhang et al.	2012	Active learning with semi-automatic annotation for extractive speech summarization
Goldwasser et al.	2008	Transliteration as constrained optimization
Jia et al.	2018	Improved discourse parsing with two-step neural transition-based model
Das	2012	Semi-Supervised and Latent-Variable Models of Natural Language Semantics
CN103914447A (en)	2014-07-09	Information processing device and information processing method
Mao et al.	2021	A neural joint model with bert for burmese syllable segmentation, word segmentation, and pos tagging
Seneviratne	2024	Text Simplification Using Natural Language Processing and Machine Learning for Better Language Understandability
Yang et al.	2016	Bilingual semantic role labeling inference via dual decomposition
Trye et al.	2022	A hybrid architecture for labelling bilingual māori-english tweets