TWI814216B - Method and device for establishing translation model based on triple self-learning - Google Patents
Method and device for establishing translation model based on triple self-learning
- Publication number
- TWI814216B (application number TW111102256A)
- Authority
- TW
- Taiwan
- Prior art keywords
- translation
- original
- vocabulary
- language
- model
- Prior art date
Landscapes
- Machine Translation (AREA)
- Feedback Control In General (AREA)
Abstract
A device for building a translation model based on triple self-learning includes a storage module and a processing module. The storage module stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model that likewise translates a source text into the target language, and a trained vocabulary translation model that translates a source-language word into a target-language word. Using the stored default and auxiliary translation models, the processing module performs a first round of self-learning on the default translation model; using the default translation model, the auxiliary translation model, and the vocabulary translation model, it then performs a second and a third round of self-learning on the default translation model, improving its translation accuracy.
Description
The present invention relates to a method for building a translation model, and in particular to a method and device for building a translation model based on triple self-learning.
Machine translation is a subfield of computational linguistics that studies the use of computer programs to translate text or speech from one natural language into another. Current machine translators can sometimes produce intelligible results, but obtaining more meaningful translations often requires editing and adjusting the input sentences so that the program can analyze them; improving machine translation output therefore still requires human intervention.
In "mT5: A massively multilingual pre-trained text-to-text transformer", Linting Xue et al. propose building a translation model by machine-learning pre-training on a large dataset, greatly reducing the need for human intervention.
However, existing translation models make very noticeable errors when translating technical terms and idioms, so domain-adaptive training is required. Domain-adaptive training in turn requires a large amount of labeled data; when labeled data is insufficient, it cannot be performed.
Therefore, an object of the present invention is to provide a translation model building method based on triple self-learning that improves the translation accuracy of the translation model.
The translation model building method based on triple self-learning of the present invention is implemented by a model building device. The device stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, and a trained vocabulary translation model for translating a source-language word into a target-language word. The method comprises steps (A) through (I).
In step (A), the model building device feeds a plurality of source texts into the default translation model, and the default translation model outputs a target-language translation for each source text.
In step (B), for each target-language translation, the model building device feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and the device obtains a correspondence between the target-language translation and the first source-language back-translation.
In step (C), the model building device adjusts the default translation model according to the source texts and their corresponding first source-language back-translations.
In step (D), for each source text, the model building device compares the source text with its corresponding first source-language back-translation to obtain a plurality of residual back-translation words (words in the first source-language back-translation that differ from the source text) and a plurality of residual source words (words in the source text that differ from the first source-language back-translation).
In step (E), for each source text, the model building device feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word.
In step (F), for each target-language translation, the model building device generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and their correct target-language translation words.
In step (G), for each patched target-language translation, the model building device feeds it into the auxiliary translation model, which outputs a corresponding second source-language back-translation.
In step (H), the model building device adjusts the default translation model according to the source texts and their corresponding second source-language back-translations.
In step (I), the model building device adjusts the default translation model according to the residual source words of the source texts and their correct target-language translation words.
Another object of the present invention is to provide a device for building a translation model based on triple self-learning that improves the translation accuracy of the translation model.
The translation model building device based on triple self-learning of the present invention comprises a storage module and a processing module.
The storage module stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, and a trained vocabulary translation model for translating a source-language word into a target-language word.
The processing module is electrically connected to the storage module. The processing module feeds a plurality of source texts into the default translation model, which outputs a target-language translation for each source text. For each target-language translation, it feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and obtains a correspondence between the target-language translation and the first source-language back-translation; it then adjusts the default translation model according to the source texts and their first source-language back-translations. For each source text, it compares the source text with its first source-language back-translation to obtain the residual back-translation words and the residual source words, and feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word. For each target-language translation, it generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and the correct target-language translation words; it feeds each patched translation into the auxiliary translation model, which outputs a corresponding second source-language back-translation; it then adjusts the default translation model according to the source texts and their second source-language back-translations, and finally adjusts the default translation model according to the residual source words and their correct target-language translation words.
The effect of the present invention is that the translation model building device adjusts the default translation model first according to the source texts and their corresponding first source-language back-translations, then according to the source texts and their corresponding second source-language back-translations, and finally according to the residual source words and their correct target-language translation words, thereby improving the translation accuracy of the default translation model.
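The data flow through steps (A) to (I) can be sketched compactly. The following is only an illustrative sketch, not the patented implementation: the three models are toy stand-in callables, the word-level diff assumes equal-length sentences (the patent uses a soft word alignment algorithm instead), and the function collects the three sets of training pairs rather than performing actual gradient updates.

```python
def word_diff(src, back):
    """Step (D), simplified: source words that differ from the back-translation.
    Assumes equal length and 1-to-1 word order (the patent uses soft word
    alignment instead of this naive zip)."""
    return [s for s, b in zip(src.split(), back.split()) if s != b]

def triple_self_learning_pairs(sources, default_mt, aux_mt, vocab_mt):
    """Collect the training pairs consumed by the three self-learning passes.

    default_mt, aux_mt, vocab_mt are hypothetical stand-ins for the default,
    auxiliary, and vocabulary translation models; aux_mt(text) returns
    (back_translation, alignment), where alignment maps a source word to the
    index of its translation in the target sentence.
    """
    pass1, pass2, pass3 = [], [], []
    for src in sources:
        tgt = default_mt(src)                         # step (A)
        back, align = aux_mt(tgt)                     # step (B)
        pass1.append((src, back))                     # step (C): 1st-pass pair
        fixes = {w: vocab_mt(w) for w in word_diff(src, back)}  # steps (D)-(E)
        words = tgt.split()
        for w, correct in fixes.items():              # step (F): patch the target
            words[align[w]] = correct
        back2, _ = aux_mt(" ".join(words))            # step (G)
        pass2.append((src, back2))                    # step (H): 2nd-pass pair
        pass3.extend(fixes.items())                   # step (I): 3rd-pass word pairs
    return pass1, pass2, pass3

# Toy stand-ins (romanized placeholder vocabulary, purely hypothetical):
def default_mt(s):
    return {"zhongxin maosun": "Citi loss"}[s]

def aux_mt(t):
    table = {
        "Citi loss": ("huaqi sunshi", {"zhongxin": 0, "maosun": 1}),
        "CITIC gross-loss": ("zhongxin maosun", {}),
    }
    return table[t]

def vocab_mt(w):
    return {"zhongxin": "CITIC", "maosun": "gross-loss"}[w]

p1, p2, p3 = triple_self_learning_pairs(["zhongxin maosun"],
                                        default_mt, aux_mt, vocab_mt)
```

After the patch, the second back-translation matches the source exactly, which is what makes the second-pass pair a stronger training signal than the first.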
Before the present invention is described in detail, it should be noted that in the following description similar elements are designated by the same reference numerals.
Referring to Figure 1, an embodiment of the translation model building device 1 based on triple self-learning of the present invention comprises a storage module 11 and a processing module 12 electrically connected to the storage module 11. Notably, in this embodiment the translation model building device 1 is, for example, a computer host, but is not limited thereto.
The storage module 11 stores a default translation model for translating a source text into a target-language translation, a trained auxiliary translation model for translating a source text into the target language, a trained vocabulary translation model for translating a source-language word into a target-language word, a plurality of single-word training records, a plurality of compound-word training records, and a plurality of complex-blend training records. Each single-word training record contains a source-language single word and its corresponding target-language single word; each compound-word training record contains a source-language compound word and its corresponding target-language compound word; each complex-blend training record contains a source-language complex blend and its corresponding target-language complex blend.
Notably, in this embodiment the default translation model is, for example, the original mT5 model proposed by Linting Xue et al., the auxiliary translation model is, for example, the Google mT5 model obtained after Google trained the mT5 model, the source language is, for example, Chinese, and the target language is, for example, English, but they are not limited thereto.
Note also that a single word is the smallest unit of vocabulary that can be used independently and carries complete meaning, such as a noun (e.g., 中國信託 "China Trust"), a verb (e.g., 損失 "loss"), a measure word (e.g., 萬元 "ten thousand NTD"), or a preposition (e.g., 達 "up to"). Forming a word requires at least one free morpheme, such as the preposition 達.
A compound word consists of at least two morphemes, which may be combinations of free and bound morphemes. Chinese compounds fall mainly into the following types: (1) modifier-head, in which the first morpheme modifies the second, e.g. 生鐵 "pig iron", where 生 is the modifier and 鐵 "iron" the head; (2) coordinate, in which the two morphemes are near-synonyms, e.g. 採購 "procure", where 採 and 購 both mean "buy"; (3) subject-predicate, in which one morpheme is a verb and the other its subject, e.g. 毛損 "gross loss", where 毛 refers to 毛利 "gross profit" and 損, the head, refers to 損失 "loss"; and (4) verb-object, in which one morpheme is a verb and the other its object, e.g. 掛單 "place a pending order", where 單 is the object of the verb 掛.
A complex blend is a word formed secondarily by grammatical abbreviation of an originally complete sentence or of multiple phrases. There are two main types: (1) contraction of a complete sentence into a short word with a distinct meaning — e.g. 電匯 "wire transfer", whose deep structure is the complete sentence 以電文匯款 "remit money by telegram", abbreviated into a term of art in finance; and (2) contraction of multiple phrases into a short word with a distinct meaning — e.g. 短放 "short-term lending", whose deep structure is the complete phrase 短期放款 ([短期 ADVP] + [放款 VP]), likewise abbreviated into a financial term.
Referring to Figures 1, 2, and 3, the following describes how the translation model building device 1 executes an embodiment of the translation model building method based on triple self-learning of the present invention. The embodiment comprises a vocabulary translation model building procedure and a translation model building procedure.
The vocabulary translation model building procedure comprises steps 21 to 23.
In step 21, the processing module 12 trains a first preset deep learning model on the single-word training records to build a single-word translation model.
In step 22, the processing module 12 trains a second preset deep learning model on the compound-word training records to build a compound-word translation model.
In step 23, the processing module 12 trains a third preset deep learning model on the complex-blend training records to build a complex-blend translation model.
The single-word translation model, the compound-word translation model, and the complex-blend model together constitute the vocabulary translation model.
Notably, in this embodiment the first preset deep learning model is, for example, a rule-based model, the second preset deep learning model is, for example, a noisy-channel-type model, and the third preset deep learning model is, for example, a sequence-to-sequence attention model. In other embodiments, the vocabulary translation model may be replaced by a domain-adaptive phrase-based translation model for translating technical terms and/or idioms, such as a Domain Adaptive Faster RNN Model for Object Detection, but is not limited thereto.
Note also that in this embodiment step 22 follows step 21 and step 23 follows step 22; in other embodiments steps 21 to 23 have no fixed order and may be performed in any order.
The translation model building procedure comprises steps 31 to 39.
In step 31, the processing module 12 feeds a plurality of source texts into the default translation model, which outputs a target-language translation for each source text.
In step 32, for each target-language translation, the processing module 12 feeds that translation into the auxiliary translation model, which outputs a corresponding first source-language back-translation, and obtains a correspondence between the target-language translation and the first source-language back-translation.
In particular, during translation by the auxiliary translation model, the processing module 12 obtains the correspondence via the Transformer attention alignment mechanism, but is not limited thereto.
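One common way to derive a word alignment from a Transformer's cross-attention weights is to take, for each target position, the source position with the maximum attention weight. The patent does not spell out the exact procedure, so the following is only an illustrative sketch with a hypothetical toy attention matrix:

```python
def alignment_from_attention(attn, src_words, tgt_words):
    """Align words via argmax over cross-attention rows.

    attn[i][j] is the (hypothetical) attention weight that target word i pays
    to source word j. The result maps each source word to the index of the
    target word that attends to it most strongly.
    """
    align = {}
    for i, row in enumerate(attn):
        j = max(range(len(row)), key=row.__getitem__)  # argmax over source words
        align.setdefault(src_words[j], i)              # keep first/strongest hit
    return align

# Toy 3x3 attention matrix: each target word attends mostly to one source word.
attn = [
    [0.8, 0.1, 0.1],   # target word 0 -> source word 0
    [0.1, 0.7, 0.2],   # target word 1 -> source word 1
    [0.1, 0.2, 0.7],   # target word 2 -> source word 2
]
align = alignment_from_attention(attn,
                                 ["zhongxin", "maosun", "16200k"],
                                 ["CITIC", "gross-loss", "NT$16,200k"])
```

The resulting mapping from source word to target position is exactly what the patch step needs in order to know which target word to replace.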
In step 33, the processing module 12 adjusts the default translation model according to the source texts and their corresponding first source-language back-translations.
Referring also to Figure 4, step 33 comprises sub-steps 331 and 332.
In sub-step 331, the processing module 12 computes a first loss value from the source texts and their corresponding first source-language back-translations.
In sub-step 332, the processing module 12 adjusts the default translation model according to the first loss value.
In step 34, for each source text, the processing module 12 compares the source text with its corresponding first source-language back-translation to obtain a plurality of residual back-translation words (words in the first source-language back-translation that differ from the source text) and a plurality of residual source words (words in the source text that differ from the first source-language back-translation).
For example, suppose the source text is 「中信集團2020年為配合政府環保政策休爐,導致鎳生鐵產量及營收下滑毛損達新台幣16200仟元」 ("In 2020 CITIC Group idled its furnaces to comply with government environmental policy, causing nickel pig iron output and revenue to fall, with a gross loss of NT$16,200 thousand"), and its first source-language back-translation is 「花旗集團2020年為配合政府環保政策休爐,導致生鐵產量及營收下滑損失1620萬」 ("In 2020 Citigroup idled its furnaces to comply with government environmental policy, causing pig iron output and revenue to fall, with a loss of 16.2 million"). The residual back-translation words are then 花旗 (Citigroup), 生鐵 (pig iron), 損失 (loss), and 1620萬 (16.2 million), and the residual source words are 中信 (CITIC), 鎳生鐵 (nickel pig iron), 毛損 (gross loss), and 16200仟元 (NT$16,200 thousand). Here 花旗/中信 is a company-name translation error, 生鐵/鎳生鐵 a domain-term error, 損失/毛損 a technical-term error, and 1620萬/16200仟元 a numerical error.
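The residual vocabularies above amount to a word-level diff between the tokenized source text and its back-translation. A minimal sketch using Python's `difflib` — an assumption for illustration only, since the patent itself relies on a soft word alignment algorithm, and the romanized tokens below are hypothetical placeholders:

```python
import difflib

def residual_words(source, back_translation):
    """Return (residual_source_words, residual_back_words): the word spans
    where the tokenized source and back-translation disagree."""
    src, back = source.split(), back_translation.split()
    res_src, res_back = [], []
    sm = difflib.SequenceMatcher(a=src, b=back)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op != "equal":                      # a mismatch between the two texts
            res_src.append(" ".join(src[i1:i2]))
            res_back.append(" ".join(back[j1:j2]))
    return res_src, res_back

# Mirrors the CITIC example with placeholder tokens:
rs, rb = residual_words("CITIC reported gross-loss of NT16200k",
                        "Citigroup reported loss of NT16.2M")
```

Unlike a naive position-by-position comparison, `SequenceMatcher` also tolerates insertions and deletions, so the two sentences need not have the same length.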
In step 35, for each source text, the processing module 12 feeds the residual source words into the vocabulary translation model, which outputs a correct target-language translation word for each residual source word.
Notably, in this embodiment the processing module 12 classifies the residual source words by stemming to determine whether each is a single word, a compound word, or a complex blend, and routes each residual source word to the corresponding sub-model of the vocabulary translation model.
In step 36, for each target-language translation, the processing module 12 generates a patched target-language translation from the target-language translation, the correspondence, the residual source words, and the correct target-language translation words.
Referring also to Figure 5, step 36 comprises sub-steps 361 and 362.
In sub-step 361, for each target-language translation, the processing module 12 uses the correspondence and the residual source words to obtain the positions in the target-language translation that correspond to those residual source words.
In sub-step 362, for each target-language translation, the processing module 12 replaces the words at those positions with the correct target-language translation words of the corresponding residual source words, producing the patched target-language translation.
In step 37, for each patched target-language translation, the processing module 12 feeds the patched translation into the auxiliary translation model, which outputs a corresponding second source-language back-translation.
In step 38, the processing module 12 adjusts the default translation model according to the source texts and their corresponding second source-language back-translations.
Referring also to Figure 6, step 38 comprises sub-steps 381 and 382.
In sub-step 381, the processing module 12 computes a second loss value from the source texts and their corresponding second source-language back-translations.
In sub-step 382, the processing module 12 adjusts the default translation model according to the second loss value.
In step 39, the processing module 12 adjusts the default translation model according to the residual source words of the source texts and their correct target-language translation words.
Referring also to Figure 7, step 39 comprises sub-steps 391 to 393.
In sub-step 391, for each source text, the processing module 12 feeds that text's residual source words into the default translation model, which outputs a predicted target-language translation word for each residual source word.
In sub-step 392, the processing module 12 computes a third loss value from the correct and the predicted target-language translation words of the residual source words.
In sub-step 393, the processing module 12 adjusts the default translation model according to the third loss value.
Notably, in this embodiment the processing module 12 uses the soft word alignment algorithm from "Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation" by Renjie Zheng et al.: in sub-step 331 and step 34 it compares each source text with its first source-language back-translation to obtain the first loss value, the residual back-translation words, and the residual source words; in sub-step 381 it compares the source texts with their second source-language back-translations to obtain the second loss value; and in sub-step 392 it compares the correct and the predicted target-language translation words of the residual source words to obtain the third loss value — but is not limited thereto.
Note also that in this embodiment the first, second, and third loss values are computed from the comparison results with a cross-entropy loss function, but are not limited thereto.
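A token-level cross-entropy loss of this kind can be sketched as follows. This is a generic textbook formulation, not the patent's exact implementation; in practice the probabilities would come from the model's softmax output:

```python
import math

def cross_entropy_loss(pred_dists, target_ids):
    """Average token-level cross-entropy: -mean(log p(target token)).

    pred_dists[t] is the model's probability distribution over the vocabulary
    at position t; target_ids[t] is the reference token id at that position.
    """
    nll = [-math.log(dist[tok]) for dist, tok in zip(pred_dists, target_ids)]
    return sum(nll) / len(nll)

# A model that puts probability 1.0 on every reference token has zero loss;
# lower confidence in the reference tokens raises the loss.
perfect = cross_entropy_loss([[1.0, 0.0]], [0])
uncertain = cross_entropy_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1])
```

Minimizing this quantity over the three sets of training pairs is what each of the three adjustment passes amounts to.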
In summary, in the translation model building method and device based on triple self-learning of the present invention, the translation model building device 1 adjusts the default translation model according to the source texts and their corresponding first source-language back-translations (the first round of self-learning), then according to the source texts and their corresponding second source-language back-translations (the second round of self-learning), and finally according to the residual source words and their correct target-language translation words (the third round of self-learning), thereby improving the translation accuracy of the default translation model and achieving the object of the present invention.
The foregoing is merely illustrative of embodiments of the present invention and is not intended to limit the scope of its implementation; all simple equivalent changes and modifications made according to the claims and the contents of the specification remain within the scope covered by the patent of this invention.
1········ translation model building device
11······ storage module
12······ processing module
21~23·· vocabulary translation model building procedure
31~39·· translation model building procedure
331~332 sub-steps of step 33
361~362 sub-steps of step 36
381~382 sub-steps of step 38
391~393 sub-steps of step 39
Other features and effects of the present invention will be clearly presented in the embodiments with reference to the drawings, in which:
Figure 1 is a block diagram illustrating an embodiment of the translation model building device based on triple self-learning of the present invention;
Figure 2 is a flow chart illustrating the vocabulary translation model building procedure of an embodiment of the translation model building method based on triple self-learning of the present invention;
Figure 3 is a flow chart illustrating the translation model building procedure of this embodiment;
Figure 4 is a flow chart illustrating the sub-steps of step 33 of Figure 3;
Figure 5 is a flow chart illustrating the sub-steps of step 36 of Figure 3;
Figure 6 is a flow chart illustrating the sub-steps of step 38 of Figure 3; and
Figure 7 is a flow chart illustrating the sub-steps of step 39 of Figure 3.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW111102256A TWI814216B (en) | 2022-01-19 | 2022-01-19 | Method and device for establishing translation model based on triple self-learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW202331583A TW202331583A (en) | 2023-08-01 |
| TWI814216B true TWI814216B (en) | 2023-09-01 |
Family
ID=88558990
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW111102256A TWI814216B (en) | 2022-01-19 | 2022-01-19 | Method and device for establishing translation model based on triple self-learning |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI814216B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110532573A (en) * | 2018-12-29 | 2019-12-03 | 苏州七星天专利运营管理有限责任公司 | A translation method and system |
| CN110717342A (en) * | 2019-09-27 | 2020-01-21 | 电子科技大学 | Distance parameter alignment translation method based on transformer |
| CN110945594A (en) * | 2017-10-16 | 2020-03-31 | 因美纳有限公司 | Deep learning-based splice site classification |
| TWM627083U (en) * | 2022-01-19 | 2022-05-11 | 中國信託商業銀行股份有限公司 | Device for establishing a translation model based on triple self-learning technology |
- 2022-01-19: TW application TW111102256A filed; granted as patent TWI814216B (active)
Also Published As
| Publication number | Publication date |
|---|---|
| TW202331583A (en) | 2023-08-01 |