Hoang Quynh Le

Also published as: Hoang-Quynh Le

2025

Beyond the Scientific Document: A Citation-Aware Multi-Granular Summarization Approach with Heterogeneous Graphs
Quoc-An Nguyen | Xuan-Hung Le | Thi-Minh-Thu Vu | Hoang-Quynh Le
Findings of the Association for Computational Linguistics: EMNLP 2025

Scientific summarization remains a challenging task due to the complex characteristics of internal structure and its external relations to other documents. To address this, our proposed model constructs a heterogeneous graph to represent a document and its relevant external citations. This heterogeneous graph enables the model to exploit information across multiple granularities, ranging from fine-grained textual components to the global document structure, and from internal content to external citation context, which facilitates context-aware representations and effectively reduces redundancy. In addition, we develop an effective encoder based on a multi-granularity graph attention mechanism and the triplet loss objective to enhance representation learning performance. Experimental results across three different scenarios consistently demonstrate that our model outperforms existing approaches. Source code is available at: https://github.com/quocanuetcs/CiteHeteroSum.

DRILL Shared Task 2025: The Challenge of Deep Retrieval in the Expansive Legal Landscape
Thi-Hai-Yen Vuong | Tan-Minh Nguyen | Hoang-Trung Nguyen | Trong-Khoi Dao | Ha-Thanh Nguyen | Hoang-Quynh Le
Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing

2024

MERE: A Deep Learning Architecture Using Multi-Fragment Ensemble for Relation Extraction
Hoang-Quynh Le | Duy-Cat Can
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

Contrastive Summarization of User Reviews: An Aspect-based Abstractive Approach
Hung-Manh Hoang | Duc-Loc Vu | Huong Nguyen-Thi-Thuy | Duy-Cat Can | Hoang-Quynh Le
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

2021

This paper describes a system developed for the multi-answer summarization challenge in the MEDIQA 2021 shared task, co-located with the BioNLP 2021 Workshop. We present an abstractive summarization model based on BART, a denoising auto-encoder for pre-training sequence-to-sequence models. Because the task focuses on summarizing answers to consumer health questions, we propose a query-driven filtering phase that automatically selects useful information from the input document. Our approach achieves promising results, ranking second (evaluated on extractive references) and third (evaluated on abstractive references) in the final evaluation.

2020

Overview of VLSP RelEx shared task: A Data Challenge for Semantic Relation Extraction from Vietnamese News
Vu Tran Mai | Hoang-Quynh Le | Duy-Cat Can | Thi Minh Huyen Nguyen | Tran Ngoc Linh Nguyen | Thanh Tam Doan
Proceedings of the 7th International Workshop on Vietnamese Language and Speech Processing

2019

A Richer-but-Smarter Shortest Dependency Path with Attentive Augmentation for Relation Extraction
Duy-Cat Can | Hoang-Quynh Le | Quang-Thuy Ha | Nigel Collier
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

To extract the relationship between two entities in a sentence, two common approaches are (1) using their shortest dependency path (SDP) and (2) using an attention model to capture a context-based representation of the sentence. Each approach suffers from its own disadvantage of either missing or redundant information. In this work, we propose a novel model that combines the advantages of these two approaches. This is based on the basic information in the SDP enhanced with information selected by several attention mechanisms with kernel filters, namely RbSP (Richer-but-Smarter SDP). To exploit the representation behind the RbSP structure effectively, we develop a combined deep neural model with an LSTM network on word sequences and a CNN on RbSP. Experimental results on the SemEval-2010 dataset demonstrate improved performance over competitive baselines. The data and source code are available at https://github.com/catcd/RbSP.

2018

Large-scale Exploration of Neural Relation Classification Architectures
Hoang-Quynh Le | Duy-Cat Can | Sinh T. Vu | Thanh Hai Dang | Mohammad Taher Pilehvar | Nigel Collier
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Experimental performance on the task of relation classification has generally improved using deep neural network architectures. One major drawback of reported studies is that individual models have been evaluated on a very narrow range of datasets, raising questions about the adaptability of the architectures, while making comparisons between approaches difficult. In this work, we present a systematic large-scale analysis of neural relation classification architectures on six benchmark datasets with widely varying characteristics. We propose a novel multi-channel LSTM model combined with a CNN that takes advantage of all currently popular linguistic and architectural features. Our ‘Man for All Seasons’ approach achieves state-of-the-art performance on two datasets. More importantly, in our view, the model allowed us to obtain direct insights into the continued challenges faced by neural language models on this task.

2013

Exploring a Probabilistic Earley Parser for Event Composition in Biomedical Texts
Mai-Vu Tran | Nigel Collier | Hoang-Quynh Le | Van-Thuy Phi | Thanh-Binh Pham
Proceedings of the BioNLP Shared Task 2013 Workshop

2012
