Thi Minh Huyen Nguyen
Also published as: Thi Minh Huyền Nguyễn, Thi-Minh-Huyen Nguyen, T. M. Huyen Nguyen, Thị Minh Huyền Nguyễn, Thi Minh Huyen Nguyen
2023
Two Neural Models for Multilingual Grammatical Error Detection
Phuong Le-Hong | The Quyen Ngo | Thi Minh Huyen Nguyen
Proceedings of the 12th Workshop on NLP for Computer Assisted Language Learning
2020
Overview of VLSP RelEx shared task: A Data Challenge for Semantic Relation Extraction from Vietnamese News
Vu Tran Mai | Hoang-Quynh Le | Duy-Cat Can | Thi Minh Huyen Nguyen | Tran Ngoc Linh Nguyen | Thanh Tam Doan
Proceedings of the 7th International Workshop on Vietnamese Language and Speech Processing
2010
An empirical study of maximum entropy approach for part-of-speech tagging of Vietnamese texts
Phuong Le-Hong | Azim Roussanaly | Thi Minh Huyen Nguyen | Mathias Rossignol
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs
This paper presents an empirical study on the application of the maximum entropy approach for part-of-speech tagging of Vietnamese text, a language with special characteristics which largely distinguish it from occidental languages. Our best tagger explores and includes useful knowledge sources for tagging Vietnamese text and gives a 93.40% overall accuracy and an 80.69% unknown word accuracy on a test set of the Vietnamese treebank. Our tagger significantly outperforms the tagger that is being used for building the Vietnamese treebank, and as far as we are aware, this is the best tagging result ever published for the Vietnamese language.
Automated Extraction of Tree Adjoining Grammars from a Treebank for Vietnamese
Phuong Le-Hong | Thi Minh Huyen Nguyen | Phuong Thai Nguyen | Azim Roussanaly
Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+10)
2009
Building a Large Syntactically-Annotated Corpus of Vietnamese
Phuong-Thai Nguyen | Xuan-Luong Vu | Thi-Minh-Huyen Nguyen | Van-Hiep Nguyen | Hong-Phuong Le
Proceedings of the Third Linguistic Annotation Workshop (LAW III)
Finite-State Description of Vietnamese Reduplication
Phuong Le Hong | Thi Minh Huyen Nguyen | Azim Roussanaly
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)
2008
Word Segmentation of Vietnamese Texts: a Comparison of Approaches
Quang Thắng Đinh | Hồng Phương Lê | Thị Minh Huyền Nguyễn | Cẩm Tú Nguyễn | Mathias Rossignol | Xuân Lương Vũ
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We present in this paper a comparison between three segmentation systems for the Vietnamese language. Most Vietnamese words are built by semantic composition from about 7,000 syllables, which also carry meaning as isolated words. Identifying word boundaries in a text is therefore not a simple task, and ambiguities often arise. Beyond the presentation of the tested systems, we also propose a standard definition for word segmentation in Vietnamese, and introduce a reference corpus developed for the purpose of evaluating such a task. The results observed confirm that the task can be handled relatively well by automatic means, although a solution needs to be found to take into account out-of-vocabulary words.
A Metagrammar for Vietnamese LTAG
Phương Lê Hồng | Thị Minh Huyền Nguyễn | Azim Roussanaly
Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+9)
2006
Evaluation of multilingual text alignment systems: the ARCADE II project
Yun-Chuang Chiao | Olivier Kraif | Dominique Laurent | Thi Minh Huyen Nguyen | Nasredine Semmar | François Stuck | Jean Véronis | Wajdi Zaghouani
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This paper describes the ARCADE II project, concerned with the evaluation of parallel text alignment systems. The ARCADE II project aims at exploring the techniques of multilingual text alignment through a fine evaluation of the existing techniques and the development of new alignment methods. The evaluation campaign consists of two tracks devoted to the evaluation of alignment at sentence and word level respectively. It differs from ARCADE I in the multilingual aspect and the investigation of lexical alignment.
A Lexicalized Tree-Adjoining Grammar for Vietnamese
H. Phuong Le | T. M. Huyen Nguyen | Laurent Romary | Azim Roussanaly
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite.
A language-independent method for the alignement of parallel corpora
Thi Minh Huyền Nguyễn | Mathias Rossignol
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation