HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task
Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, Ying Qin
Abstract
This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using the Transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performance. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU points per language pair in the En->Multi system and 4.6 BLEU points in the Multi->En system. In addition, we employed a language-independent adapter to further improve system performance. Our submission obtains competitive results in the final evaluation.
- Anthology ID:
- 2020.wat-1.8
- Volume:
- Proceedings of the 7th Workshop on Asian Translation
- Month:
- December
- Year:
- 2020
- Address:
- Suzhou, China
- Editors:
- Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Win Pa Pa, Ondřej Bojar, Shantipriya Parida, Isao Goto, Hidaya Mino, Hiroshi Manabe, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
- Venue:
- WAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 92–97
- URL:
- https://aclanthology.org/2020.wat-1.8/
- DOI:
- 10.18653/v1/2020.wat-1.8
- Bibkey:
- yu-etal-2020-hw
- Cite (ACL):
- Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, and Ying Qin. 2020. HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task. In Proceedings of the 7th Workshop on Asian Translation, pages 92–97, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task (Yu et al., WAT 2020)
- PDF:
- https://aclanthology.org/2020.wat-1.8.pdf
Export citation
@inproceedings{yu-etal-2020-hw,
title = "{HW}-{TSC}{'}s Participation in the {WAT} 2020 Indic Languages Multilingual Task",
author = "Yu, Zhengzhe and
Wu, Zhanglin and
Chen, Xiaoyu and
Wei, Daimeng and
Shang, Hengchao and
Guo, Jiaxin and
Li, Zongyao and
Wang, Minghan and
Li, Liangyou and
Lei, Lizhi and
Yang, Hao and
Qin, Ying",
editor = "Nakazawa, Toshiaki and
Nakayama, Hideki and
Ding, Chenchen and
Dabre, Raj and
Kunchukuttan, Anoop and
Pa, Win Pa and
Bojar, Ond{\v{r}}ej and
Parida, Shantipriya and
Goto, Isao and
Mino, Hidaya and
Manabe, Hiroshi and
Sudoh, Katsuhito and
Kurohashi, Sadao and
Bhattacharyya, Pushpak",
booktitle = "Proceedings of the 7th Workshop on Asian Translation",
month = dec,
year = "2020",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2020.wat-1.8/",
doi = "10.18653/v1/2020.wat-1.8",
pages = "92--97",
abstract = "This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En{\ensuremath{<}}-{\ensuremath{>}}Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition{---}using only the officially provided data. Using transformer as a baseline, our Multi-{\ensuremath{>}}En and En-{\ensuremath{>}}Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En-{\ensuremath{>}}Multi system and an average improvement of 4.6 BLEU scores regarding the Multi-{\ensuremath{>}}En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="yu-etal-2020-hw">
<titleInfo>
<title>HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task</title>
</titleInfo>
<name type="personal">
<namePart type="given">Zhengzhe</namePart>
<namePart type="family">Yu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zhanglin</namePart>
<namePart type="family">Wu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xiaoyu</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Daimeng</namePart>
<namePart type="family">Wei</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hengchao</namePart>
<namePart type="family">Shang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jiaxin</namePart>
<namePart type="family">Guo</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zongyao</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Minghan</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Liangyou</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lizhi</namePart>
<namePart type="family">Lei</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hao</namePart>
<namePart type="family">Yang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ying</namePart>
<namePart type="family">Qin</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2020-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 7th Workshop on Asian Translation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Toshiaki</namePart>
<namePart type="family">Nakazawa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hideki</namePart>
<namePart type="family">Nakayama</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chenchen</namePart>
<namePart type="family">Ding</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Raj</namePart>
<namePart type="family">Dabre</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Anoop</namePart>
<namePart type="family">Kunchukuttan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Win</namePart>
<namePart type="given">Pa</namePart>
<namePart type="family">Pa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ondřej</namePart>
<namePart type="family">Bojar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shantipriya</namePart>
<namePart type="family">Parida</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Isao</namePart>
<namePart type="family">Goto</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hidaya</namePart>
<namePart type="family">Mino</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hiroshi</namePart>
<namePart type="family">Manabe</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Katsuhito</namePart>
<namePart type="family">Sudoh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sadao</namePart>
<namePart type="family">Kurohashi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Pushpak</namePart>
<namePart type="family">Bhattacharyya</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Suzhou, China</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En&lt;-&gt;Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using transformer as a baseline, our Multi-&gt;En and En-&gt;Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En-&gt;Multi system and an average improvement of 4.6 BLEU scores regarding the Multi-&gt;En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation.</abstract>
<identifier type="citekey">yu-etal-2020-hw</identifier>
<identifier type="doi">10.18653/v1/2020.wat-1.8</identifier>
<location>
<url>https://aclanthology.org/2020.wat-1.8/</url>
</location>
<part>
<date>2020-12</date>
<extent unit="page">
<start>92</start>
<end>97</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task
%A Yu, Zhengzhe
%A Wu, Zhanglin
%A Chen, Xiaoyu
%A Wei, Daimeng
%A Shang, Hengchao
%A Guo, Jiaxin
%A Li, Zongyao
%A Wang, Minghan
%A Li, Liangyou
%A Lei, Lizhi
%A Yang, Hao
%A Qin, Ying
%Y Nakazawa, Toshiaki
%Y Nakayama, Hideki
%Y Ding, Chenchen
%Y Dabre, Raj
%Y Kunchukuttan, Anoop
%Y Pa, Win Pa
%Y Bojar, Ondřej
%Y Parida, Shantipriya
%Y Goto, Isao
%Y Mino, Hidaya
%Y Manabe, Hiroshi
%Y Sudoh, Katsuhito
%Y Kurohashi, Sadao
%Y Bhattacharyya, Pushpak
%S Proceedings of the 7th Workshop on Asian Translation
%D 2020
%8 December
%I Association for Computational Linguistics
%C Suzhou, China
%F yu-etal-2020-hw
%X This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En->Multi system and an average improvement of 4.6 BLEU scores regarding the Multi->En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation.
%R 10.18653/v1/2020.wat-1.8
%U https://aclanthology.org/2020.wat-1.8/
%U https://doi.org/10.18653/v1/2020.wat-1.8
%P 92-97
Markdown (Informal)
[HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task](https://aclanthology.org/2020.wat-1.8/) (Yu et al., WAT 2020)