TWI814216B - Method and device for establishing translation model based on triple self-learning - Google Patents
Method and device for establishing translation model based on triple self-learning
- Publication number
- TWI814216B (application number TW111102256A)
- Authority
- TW
- Taiwan
- Prior art keywords
- translation
- original
- vocabulary
- language
- model
- Prior art date
Landscapes
- Machine Translation (AREA)
- Feedback Control In General (AREA)
Abstract
A device for building a translation model based on triple self-learning includes a storage module and a processing module. The storage module stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model that likewise translates a source text into the target language, and a trained vocabulary translation model that translates a source-language word into a target-language word. Using the stored default and auxiliary translation models, the processing module performs a first round of self-learning on the default translation model; using the default translation model, the auxiliary translation model, and the vocabulary translation model, it then performs a second and a third round of self-learning on the default translation model, improving its translation accuracy.
Description
The present invention relates to a method for building a translation model, and in particular to a method and device for building a translation model based on triple self-learning.
Machine translation is a subfield of computational linguistics that studies the use of computer programs to translate text or speech from one natural language into another. Current machine translators can sometimes produce intelligible results, but obtaining more meaningful translations often requires editing and adjusting the input sentences so that the program can analyze them; improving machine translation output therefore still requires human intervention.
In "mT5: A massively multilingual pre-trained text-to-text transformer", Linting Xue et al. propose building a translation model by machine-learning pre-training on a large dataset, greatly reducing the need for human intervention.
However, existing translation models make very noticeable errors when translating technical terms and idioms, so domain-adaptive training is required. Domain-adaptive training in turn requires a large amount of labeled data; when labeled data is insufficient, it cannot be performed.
Therefore, an object of the present invention is to provide a translation model building method based on triple self-learning that improves the translation accuracy of the translation model.
The translation model building method based on triple self-learning of the present invention is implemented by a model building device. The device stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, and a trained vocabulary translation model for translating a source-language word into a target-language word. The method comprises steps (A) through (I).
In step (A), the model building device feeds a plurality of source texts into the default translation model, and the default translation model outputs a target-language translation for each source text.
In step (B), for each target-language translation, the model building device feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and the device obtains a correspondence between the target-language translation and the first source-language back-translation.
In step (C), the model building device adjusts the default translation model according to the source texts and their corresponding first source-language back-translations.
In step (D), for each source text, the model building device compares the source text with its corresponding first source-language back-translation to obtain a plurality of residual back-translation words (words in the first source-language back-translation that differ from the source text) and a plurality of residual source words (words in the source text that differ from the first source-language back-translation).
In step (E), for each source text, the model building device feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word.
In step (F), for each target-language translation, the model building device generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and their correct target-language translation words.
In step (G), for each patched target-language translation, the model building device feeds it into the auxiliary translation model, which outputs a corresponding second source-language back-translation.
In step (H), the model building device adjusts the default translation model according to the source texts and their corresponding second source-language back-translations.
In step (I), the model building device adjusts the default translation model according to the residual source words of the source texts and their correct target-language translation words.
Another object of the present invention is to provide a device for building a translation model based on triple self-learning that improves the translation accuracy of the translation model.
The translation model building device based on triple self-learning of the present invention comprises a storage module and a processing module.
The storage module stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, and a trained vocabulary translation model for translating a source-language word into a target-language word.
The processing module is electrically connected to the storage module. The processing module feeds a plurality of source texts into the default translation model, which outputs a target-language translation for each source text. For each target-language translation, it feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and obtains a correspondence between the target-language translation and the first source-language back-translation; it then adjusts the default translation model according to the source texts and their first source-language back-translations. For each source text, it compares the source text with its first source-language back-translation to obtain the residual back-translation words and the residual source words, and feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word. For each target-language translation, it generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and the correct target-language translation words; it feeds each patched translation into the auxiliary translation model, which outputs a corresponding second source-language back-translation; it then adjusts the default translation model according to the source texts and their second source-language back-translations, and finally adjusts the default translation model according to the residual source words and their correct target-language translation words.
The effect of the present invention is that the translation model building device adjusts the default translation model first according to the source texts and their corresponding first source-language back-translations, then according to the source texts and their corresponding second source-language back-translations, and finally according to the residual source words and their correct target-language translation words, thereby improving the translation accuracy of the default translation model.
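The data flow through steps (A) to (I) can be sketched compactly. The following is only an illustrative sketch, not the patented implementation: the three models are toy stand-in callables, the word-level diff assumes equal-length sentences (the patent uses a soft word alignment algorithm instead), and the function collects the three sets of training pairs rather than performing actual gradient updates.

```python
def word_diff(src, back):
    """Step (D), simplified: source words that differ from the back-translation.
    Assumes equal length and 1-to-1 word order (the patent uses soft word
    alignment instead of this naive zip)."""
    return [s for s, b in zip(src.split(), back.split()) if s != b]

def triple_self_learning_pairs(sources, default_mt, aux_mt, vocab_mt):
    """Collect the training pairs consumed by the three self-learning passes.

    default_mt, aux_mt, vocab_mt are hypothetical stand-ins for the default,
    auxiliary, and vocabulary translation models; aux_mt(text) returns
    (back_translation, alignment), where alignment maps a source word to the
    index of its translation in the target sentence.
    """
    pass1, pass2, pass3 = [], [], []
    for src in sources:
        tgt = default_mt(src)                         # step (A)
        back, align = aux_mt(tgt)                     # step (B)
        pass1.append((src, back))                     # step (C): 1st-pass pair
        fixes = {w: vocab_mt(w) for w in word_diff(src, back)}  # steps (D)-(E)
        words = tgt.split()
        for w, correct in fixes.items():              # step (F): patch the target
            words[align[w]] = correct
        back2, _ = aux_mt(" ".join(words))            # step (G)
        pass2.append((src, back2))                    # step (H): 2nd-pass pair
        pass3.extend(fixes.items())                   # step (I): 3rd-pass word pairs
    return pass1, pass2, pass3

# Toy stand-ins (romanized placeholder vocabulary, purely hypothetical):
def default_mt(s):
    return {"zhongxin maosun": "Citi loss"}[s]

def aux_mt(t):
    table = {
        "Citi loss": ("huaqi sunshi", {"zhongxin": 0, "maosun": 1}),
        "CITIC gross-loss": ("zhongxin maosun", {}),
    }
    return table[t]

def vocab_mt(w):
    return {"zhongxin": "CITIC", "maosun": "gross-loss"}[w]

p1, p2, p3 = triple_self_learning_pairs(["zhongxin maosun"],
                                        default_mt, aux_mt, vocab_mt)
```

After the patch, the second back-translation matches the source exactly, which is what makes the second-pass pair a stronger training signal than the first.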
Before the present invention is described in detail, it should be noted that in the following description similar elements are designated by the same reference numerals.
Referring to Figure 1, an embodiment of the translation model building device 1 based on triple self-learning of the present invention comprises a storage module 11 and a processing module 12 electrically connected to the storage module 11. Notably, in this embodiment the translation model building device 1 is, for example, a computer host, but is not limited thereto.
The storage module 11 stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, a trained vocabulary translation model for translating a source-language word into a target-language word, a plurality of single-word training records, a plurality of compound-word training records, and a plurality of complex-blend training records. Each single-word training record contains a source-language single word and its corresponding target-language single word; each compound-word training record contains a source-language compound word and its corresponding target-language compound word; each complex-blend training record contains a source-language complex blend and its corresponding target-language complex blend.
Notably, in this embodiment the default translation model is, for example, the original mT5 model proposed by Linting Xue et al., the auxiliary translation model is, for example, the Google mT5 model obtained after Google trained the mT5 model, the source language is, for example, Chinese, and the target language is, for example, English, but they are not limited thereto.
Note also that a single word is the smallest unit of vocabulary that can be used independently and carries complete meaning, such as a noun (e.g., 中國信託 "China Trust"), a verb (e.g., 損失 "loss"), a measure word (e.g., 萬元 "ten thousand NTD"), or a preposition (e.g., 達 "up to"). Forming a word requires at least one free morpheme, such as the preposition 達.
A compound word consists of at least two morphemes, which may be combinations of free and bound morphemes. Chinese compounds fall mainly into the following types: (1) modifier-head, in which the first morpheme modifies the second, e.g. 生鐵 "pig iron", where 生 is the modifier and 鐵 "iron" the head; (2) coordinate, in which the two morphemes are near-synonyms, e.g. 採購 "procure", where 採 and 購 both mean "buy"; (3) subject-predicate, in which one morpheme is a verb and the other its subject, e.g. 毛損 "gross loss", where 毛 refers to 毛利 "gross profit" and 損, the head, refers to 損失 "loss"; and (4) verb-object, in which one morpheme is a verb and the other its object, e.g. 掛單 "place a pending order", where 單 is the object of the verb 掛.
A complex blend is a word formed secondarily by grammatical abbreviation of an originally complete sentence or of multiple phrases. There are two main types: (1) contraction of a complete sentence into a short word with a distinct meaning — e.g. 電匯 "wire transfer", whose deep structure is the complete sentence 以電文匯款 "remit money by telegram", abbreviated into a term of art in finance; and (2) contraction of multiple phrases into a short word with a distinct meaning — e.g. 短放 "short-term lending", whose deep structure is the complete phrase 短期放款 ([短期 ADVP] + [放款 VP]), likewise abbreviated into a financial term.
Referring to Figures 1, 2, and 3, the following describes how the translation model building device 1 executes an embodiment of the translation model building method based on triple self-learning of the present invention. The embodiment comprises a vocabulary translation model building procedure and a translation model building procedure.
The vocabulary translation model building procedure comprises steps 21 to 23.
In step 21, the processing module 12 trains a first preset deep learning model on the single-word training records to build a single-word translation model.
In step 22, the processing module 12 trains a second preset deep learning model on the compound-word training records to build a compound-word translation model.
In step 23, the processing module 12 trains a third preset deep learning model on the complex-blend training records to build a complex-blend translation model.
The single-word translation model, the compound-word translation model, and the complex-blend model together constitute the vocabulary translation model.
Notably, in this embodiment the first preset deep learning model is, for example, a rule-based model, the second preset deep learning model is, for example, a noisy-channel-type model, and the third preset deep learning model is, for example, a sequence-to-sequence attention model. In other embodiments, the vocabulary translation model may be replaced by a domain-adaptive phrase-based translation model for translating technical terms and/or idioms, such as a Domain Adaptive Faster RNN Model for Object Detection, but is not limited thereto.
Note also that in this embodiment step 22 follows step 21 and step 23 follows step 22; in other embodiments steps 21 to 23 have no fixed order and may be performed in any order.
The translation model building procedure comprises steps 31 to 39.
In step 31, the processing module 12 feeds a plurality of source texts into the default translation model, which outputs a target-language translation for each source text.
In step 32, for each target-language translation, the processing module 12 feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and obtains a correspondence between the target-language translation and the first source-language back-translation.
In particular, during translation by the auxiliary translation model, the processing module 12 obtains the correspondence via the Transformer attention alignment mechanism, but is not limited thereto.
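One common way to derive a word alignment from a Transformer's cross-attention weights is to take, for each target position, the source position with the maximum attention weight. The patent does not spell out the exact procedure, so the following is only an illustrative sketch with a hypothetical toy attention matrix:

```python
def alignment_from_attention(attn, src_words, tgt_words):
    """Align words via argmax over cross-attention rows.

    attn[i][j] is the (hypothetical) attention weight that target word i pays
    to source word j. The result maps each source word to the index of the
    target word that attends to it most strongly.
    """
    align = {}
    for i, row in enumerate(attn):
        j = max(range(len(row)), key=row.__getitem__)  # argmax over source words
        align.setdefault(src_words[j], i)              # keep first/strongest hit
    return align

# Toy 3x3 attention matrix: each target word attends mostly to one source word.
attn = [
    [0.8, 0.1, 0.1],   # target word 0 -> source word 0
    [0.1, 0.7, 0.2],   # target word 1 -> source word 1
    [0.1, 0.2, 0.7],   # target word 2 -> source word 2
]
align = alignment_from_attention(attn,
                                 ["zhongxin", "maosun", "16200k"],
                                 ["CITIC", "gross-loss", "NT$16,200k"])
```

The resulting mapping from source word to target position is exactly what the patch step needs in order to know which target word to replace.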
In step 33, the processing module 12 adjusts the default translation model according to the source texts and their corresponding first source-language back-translations.
Referring also to Figure 4, step 33 comprises sub-steps 331 and 332.
In sub-step 331, the processing module 12 computes a first loss value from the source texts and their corresponding first source-language back-translations.
In sub-step 332, the processing module 12 adjusts the default translation model according to the first loss value.
In step 34, for each source text, the processing module 12 compares the source text with its corresponding first source-language back-translation to obtain a plurality of residual back-translation words (words in the first source-language back-translation that differ from the source text) and a plurality of residual source words (words in the source text that differ from the first source-language back-translation).
For example, suppose the source text is 「中信集團2020年為配合政府環保政策休爐,導致鎳生鐵產量及營收下滑毛損達新台幣16200仟元」 ("In 2020 CITIC Group idled its furnaces to comply with government environmental policy, causing nickel pig iron output and revenue to fall, with a gross loss of NT$16,200 thousand"), and its first source-language back-translation is 「花旗集團2020年為配合政府環保政策休爐,導致生鐵產量及營收下滑損失1620萬」 ("In 2020 Citigroup idled its furnaces to comply with government environmental policy, causing pig iron output and revenue to fall, with a loss of 16.2 million"). The residual back-translation words are then 花旗 (Citigroup), 生鐵 (pig iron), 損失 (loss), and 1620萬 (16.2 million), and the residual source words are 中信 (CITIC), 鎳生鐵 (nickel pig iron), 毛損 (gross loss), and 16200仟元 (NT$16,200 thousand). Here 花旗/中信 is a company-name translation error, 生鐵/鎳生鐵 a domain-term error, 損失/毛損 a technical-term error, and 1620萬/16200仟元 a numerical error.
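The residual vocabularies above amount to a word-level diff between the tokenized source text and its back-translation. A minimal sketch using Python's `difflib` — an assumption for illustration only, since the patent itself relies on a soft word alignment algorithm, and the romanized tokens below are hypothetical placeholders:

```python
import difflib

def residual_words(source, back_translation):
    """Return (residual_source_words, residual_back_words): the word spans
    where the tokenized source and back-translation disagree."""
    src, back = source.split(), back_translation.split()
    res_src, res_back = [], []
    sm = difflib.SequenceMatcher(a=src, b=back)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op != "equal":                      # a mismatch between the two texts
            res_src.append(" ".join(src[i1:i2]))
            res_back.append(" ".join(back[j1:j2]))
    return res_src, res_back

# Mirrors the CITIC example with placeholder tokens:
rs, rb = residual_words("CITIC reported gross-loss of NT16200k",
                        "Citigroup reported loss of NT16.2M")
```

Unlike a naive position-by-position comparison, `SequenceMatcher` also tolerates insertions and deletions, so the two sentences need not have the same length.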
In step 35, for each source text, the processing module 12 feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word.
Notably, in this embodiment the processing module 12 classifies the residual source words by stemming to determine whether each is a single word, a compound word, or a complex blend, and routes each residual source word to the corresponding sub-model of the vocabulary translation model.
In step 36, for each target-language translation, the processing module 12 generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and the correct target-language translation words.
Referring also to Figure 5, step 36 comprises sub-steps 361 and 362.
In sub-step 361, for each target-language translation, the processing module 12 uses the correspondence and the residual source words to obtain the positions in the target-language translation that correspond to those residual source words.
In sub-step 362, for each target-language translation, the processing module 12 replaces the words at those positions with the correct target-language translation words of the corresponding residual source words, producing the patched target-language translation.
In step 37, for each patched target-language translation, the processing module 12 feeds the patched translation into the auxiliary translation model, which outputs a corresponding second source-language back-translation.
In step 38, the processing module 12 adjusts the default translation model according to the source texts and their corresponding second source-language back-translations.
Referring also to Figure 6, step 38 comprises sub-steps 381 and 382.
In sub-step 381, the processing module 12 computes a second loss value from the source texts and their corresponding second source-language back-translations.
In sub-step 382, the processing module 12 adjusts the default translation model according to the second loss value.
In step 39, the processing module 12 adjusts the default translation model according to the residual source words of the source texts and their correct target-language translation words.
Referring also to Figure 7, step 39 comprises sub-steps 391 to 393.
In sub-step 391, for each source text, the processing module 12 feeds that text's residual source words into the default translation model, which outputs a predicted target-language translation word for each residual source word.
In sub-step 392, the processing module 12 computes a third loss value from the correct and the predicted target-language translation words of the residual source words.
In sub-step 393, the processing module 12 adjusts the default translation model according to the third loss value.
Notably, in this embodiment the processing module 12 uses the soft word alignment algorithm from "Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation" by Renjie Zheng et al.: in sub-step 331 and step 34 it compares each source text with its first source-language back-translation to obtain the first loss value, the residual back-translation words, and the residual source words; in sub-step 381 it compares the source texts with their second source-language back-translations to obtain the second loss value; and in sub-step 392 it compares the correct and the predicted target-language translation words of the residual source words to obtain the third loss value — but is not limited thereto.
Note also that in this embodiment the first, second, and third loss values are computed from the comparison results with a cross-entropy loss function, but are not limited thereto.
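A token-level cross-entropy loss of this kind can be sketched as follows. This is a generic textbook formulation, not the patent's exact implementation; in practice the probabilities would come from the model's softmax output:

```python
import math

def cross_entropy_loss(pred_dists, target_ids):
    """Average token-level cross-entropy: -mean(log p(target token)).

    pred_dists[t] is the model's probability distribution over the vocabulary
    at position t; target_ids[t] is the reference token id at that position.
    """
    nll = [-math.log(dist[tok]) for dist, tok in zip(pred_dists, target_ids)]
    return sum(nll) / len(nll)

# A model that puts probability 1.0 on every reference token has zero loss;
# lower confidence in the reference tokens raises the loss.
perfect = cross_entropy_loss([[1.0, 0.0]], [0])
uncertain = cross_entropy_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1])
```

Minimizing this quantity over the three sets of training pairs is what each of the three adjustment passes amounts to.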
In summary, in the translation model building method and device based on triple self-learning of the present invention, the translation model building device 1 adjusts the default translation model according to the source texts and their corresponding first source-language back-translations (the first round of self-learning), then according to the source texts and their corresponding second source-language back-translations (the second round of self-learning), and finally according to the residual source words and their correct target-language translation words (the third round of self-learning), thereby improving the translation accuracy of the default translation model and achieving the object of the present invention.
The foregoing is merely illustrative of embodiments of the present invention and is not intended to limit the scope of its implementation; all simple equivalent changes and modifications made according to the claims and the contents of the specification remain within the scope covered by the patent of this invention.
1········ translation model building device
11······ storage module
12······ processing module
21~23·· vocabulary translation model building procedure
31~39·· translation model building procedure
331~332 sub-steps of step 33
361~362 sub-steps of step 36
381~382 sub-steps of step 38
391~393 sub-steps of step 39
Other features and effects of the present invention will be clearly presented in the embodiments with reference to the drawings, in which:
Figure 1 is a block diagram illustrating an embodiment of the translation model building device based on triple self-learning of the present invention;
Figure 2 is a flow chart illustrating the vocabulary translation model building procedure of an embodiment of the translation model building method based on triple self-learning of the present invention;
Figure 3 is a flow chart illustrating the translation model building procedure of this embodiment;
Figure 4 is a flow chart illustrating the sub-steps of step 33 of Figure 3;
Figure 5 is a flow chart illustrating the sub-steps of step 36 of Figure 3;
Figure 6 is a flow chart illustrating the sub-steps of step 38 of Figure 3; and
Figure 7 is a flow chart illustrating the sub-steps of step 39 of Figure 3.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW111102256A TWI814216B (en) | 2022-01-19 | 2022-01-19 | Method and device for establishing translation model based on triple self-learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW202331583A TW202331583A (en) | 2023-08-01 |
| TWI814216B true TWI814216B (en) | 2023-09-01 |
Family
ID=88558990
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW111102256A TWI814216B (en) | 2022-01-19 | 2022-01-19 | Method and device for establishing translation model based on triple self-learning |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI814216B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110532573A (en) * | 2018-12-29 | 2019-12-03 | 苏州七星天专利运营管理有限责任公司 | A translation method and system |
| CN110717342A (en) * | 2019-09-27 | 2020-01-21 | 电子科技大学 | Distance parameter alignment translation method based on transformer |
| CN110945594A (en) * | 2017-10-16 | 2020-03-31 | 因美纳有限公司 | Deep learning-based splice site classification |
| TWM627083U (en) * | 2022-01-19 | 2022-05-11 | 中國信託商業銀行股份有限公司 | Device for establishing a translation model based on triple self-learning technology |
- 2022-01-19: TW application TW111102256A filed; granted as patent TWI814216B (active)
Also Published As
| Publication number | Publication date |
|---|---|
| TW202331583A (en) | 2023-08-01 |