CN111401000B - Real-time translation previewing method for online auxiliary translation - Google Patents
- Publication number: CN111401000B
- Application number: CN202010260294.7A
- Authority: CN (China)
- Prior art keywords: translation, atom, html, segment, tag
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention discloses a real-time translation preview method for online assisted translation, in the field of computer-aided translation. The method comprises the following steps: converting the original text file into HTML with a file format converter; parsing the original text and dividing it into Segments, one per sentence; embedding the element id of each Segment into the corresponding sub-tag of the converted HTML by using a cyclic recursion algorithm, so that Segments and sub-tags correspond one to one; and linking the Segments to the HTML through the DOM nodes of the HTML at the front end, so that the translation can be previewed in real time. The algorithm renders the translation produced during assisted translation in the browser in real time for the translator to check and consult, which saves translation time and improves efficiency.
Description
Technical Field
The invention relates to the field of computer-aided translation, and in particular to a real-time translation preview method for online assisted translation.
Background
Current computer-aided translation extracts the text of a document, translates it into the specified target language, and then fills the translated text back into the document. Typically, a translator working in the editor cannot see the original and translated text in the layout of the document being translated. The conventional workaround is to convert the original text into HTML with a file conversion tool and render the HTML in a browser for the translator to view. However, the translations the translator produces while editing still cannot be viewed in real time.
Disclosure of Invention
The embodiment of the invention provides a real-time translation preview method for online assisted translation. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
According to a first aspect of an embodiment of the present invention, there is provided a real-time translation preview method for online assisted translation, comprising the following steps:
converting the original text file into an HTML format;
parsing the original text and dividing it into an array of Segments, one per sentence;
embedding the element id of each Segment into the corresponding sub-tag of the HTML file by using a cyclic recursion algorithm, to form a one-to-one correspondence;
linking the Segments to the HTML through the DOM nodes of the HTML, so that the translation can be previewed in real time.
Preferably, the original document format is one of doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.
Preferably, the original document is converted into the HTML format by using Word's own conversion function or another third-party tool.
Preferably, when the original text is parsed and divided into the Segment array, each Segment is a word, a phrase, or a sentence.
Preferably, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label.
Preferably, the cyclic recursion algorithm comprises the following steps:
defining an Atom class that has two types, Tag and text;
defining the content of each Segment in the Segment list as the text of an Atom, and defining the label of the Segment as the Tag of the Atom;
reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;
mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with id mapping.
Preferably, the Atom class is a custom class.
Preferably, each Segment is composed of one or more Atoms.
Preferably, the sub-tag of the HTML is constituted by the Tag of an Atom.
Preferably, the linkage between the Segments and the HTML is achieved by embedding the Tag of the Atom into the HTML sub-tag.
The technical scheme provided by the embodiment of the invention can comprise the following beneficial effects:
The invention provides an algorithm that renders the translation produced during assisted translation in the browser in real time for the translator to check and consult, which saves translation time and improves efficiency. As shown in FIG. 7, when translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a schematic diagram of the real-time translation preview method for online assisted translation according to an exemplary embodiment;
FIG. 2 is a logic diagram of the recursive and cyclic algorithm according to an exemplary embodiment;
FIG. 3 is an example of an original document according to an exemplary embodiment;
FIG. 4 is a schematic diagram of Segments divided by sentence according to an exemplary embodiment;
FIG. 5 is a diagram illustrating conversion of an original document into HTML by file format conversion according to an exemplary embodiment;
FIG. 6 is a schematic diagram showing a transUnitId embedded in a tag according to an exemplary embodiment;
FIG. 7 is a view of the real-time translation preview effect according to an exemplary embodiment.
Description of the embodiments
The following description and drawings illustrate specific embodiments of the invention sufficiently to enable those skilled in the art to practice them. The embodiments represent only possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others. The scope of the embodiments encompasses the full ambit of the claims and all available equivalents of the claims. Embodiments may be referred to herein, individually or collectively, by the term "invention" merely for convenience, without intending to limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between them. The terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, or apparatus that comprises a list of elements is not limited to those elements but may include other elements not expressly listed. The embodiments are described progressively: each focuses on its differences from the others, and identical or similar parts may be understood by cross-reference between embodiments. Where a structure or product disclosed in an embodiment corresponds to a method part disclosed in the embodiments, its description is kept brief, and the relevant details can be found in the description of the method.
It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
It should be noted that although the steps of the methods of the present disclosure are illustrated in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order or that all of the illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
The invention is further described below with reference to the accompanying drawings and examples:
As shown in FIG. 1, the real-time translation preview method for online assisted translation comprises the following steps:
S1: converting the original text file into an HTML format;
S2: parsing the original text and dividing it into an array of Segments, one per sentence;
S3: embedding the element id of each Segment into the corresponding HTML sub-tag by using a cyclic recursion algorithm, to form a one-to-one correspondence;
S4: linking the Segments to the HTML through the DOM nodes of the HTML, so that the translation can be previewed in real time.
According to the above scheme, further, the original document may be in Word, Excel, PPT, or PDF format. In the example shown in FIG. 3, the original document is a Word document.
In particular embodiments, the file format conversion shown in FIG. 5 may use Word's own conversion function or other third-party open-source tools.
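The text does not name a specific third-party tool. As one hedged illustration, the sketch below assumes LibreOffice is installed and its `soffice` executable is on the PATH; the helper names `build_convert_command` and `convert_to_html` are hypothetical and not part of the described method.

```python
import subprocess
from pathlib import Path


def build_convert_command(src: Path, out_dir: Path) -> list[str]:
    """Build a LibreOffice command line that converts src to HTML inside out_dir."""
    # Assumption: LibreOffice stands in for the unnamed "third-party open-source tool".
    return ["soffice", "--headless", "--convert-to", "html",
            "--outdir", str(out_dir), str(src)]


def convert_to_html(src: Path, out_dir: Path) -> Path:
    """Run the conversion and return the path where the HTML file is expected."""
    subprocess.run(build_convert_command(src, out_dir), check=True)
    return out_dir / (src.stem + ".html")


# Only the command is built here, so the snippet runs without LibreOffice installed.
print(build_convert_command(Path("source.docx"), Path("out")))
```

Calling `convert_to_html` would actually launch the converter, so it is left out of the demonstration line.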
According to the above scheme, further, the text is divided into Segments by sentence, and a Segment may be a word, a phrase, or a sentence. As shown in FIG. 3, the Word document contains two sentences (sentence 1: test.; sentence 2: fast.).
According to the above scheme, further, the Segment array is a Segment list that records the text content of each Segment and its corresponding text label. As shown in FIG. 4, in the code implementation the two sentences are defined as two objects, Segment1 and Segment2, where transUnitId is the sentence label and srcAtom is the sentence content.
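A minimal sketch of such a Segment list follows. The field names `transUnitId` and `srcAtom` come from the text; the `Segment` dataclass, the `split_into_segments` helper, and the punctuation-based splitting rule are assumptions made for illustration, and the sample sentences are invented.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    transUnitId: int  # sentence label (field name taken from the text)
    srcAtom: str      # sentence content (field name taken from the text)


def split_into_segments(text: str) -> list[Segment]:
    """Split text after sentence-ending punctuation and number the pieces from 1."""
    # Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    pieces = [p.strip() for p in re.split(r"(?<=[.!?])\s+", text) if p.strip()]
    return [Segment(transUnitId=i, srcAtom=p) for i, p in enumerate(pieces, start=1)]


segments = split_into_segments("This is a test. It runs fast.")
for seg in segments:
    print(seg.transUnitId, seg.srcAtom)
```

A production segmenter would also need to handle abbreviations, decimal numbers, and languages that do not put whitespace after sentence-ending punctuation.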
According to the above scheme, further, FIG. 2 shows the logic of the recursive and cyclic algorithm in a specific embodiment. The cyclic recursion algorithm comprises the following steps:
S31: defining an Atom class that has two types, Tag and text;
S32: defining the content of each Segment in the Segment list as the text of an Atom, and defining the label of the Segment as the Tag of the Atom;
S33: reading each Atom in a loop and deciding, according to the type of the Atom, whether to put it into a text pool;
S34: mapping each Atom in the text pool to its Tag, finally forming a new set of HTML sub-tags with id mapping.
In a specific embodiment, the Atom class is a custom class rather than a built-in class.
According to the above scheme, further, each Segment is composed of one or more Atoms.
In a specific embodiment, the sub-Tag of the HTML is formed by a Tag of Atom.
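The steps S31 to S34 leave many details open, so the following is only one possible reading, not the patented implementation. It assumes the converted HTML is well-formed enough to parse with `xml.etree.ElementTree`, that a text Atom can be matched to an element by its exact text, and that the id is stored in a hypothetical `data-trans-unit-id` attribute. The names `AtomType`, `build_text_pool`, and `embed_ids` are likewise invented for the sketch.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from enum import Enum


class AtomType(Enum):
    TAG = "tag"
    TEXT = "text"


@dataclass
class Atom:
    type: AtomType
    value: str


@dataclass
class Segment:
    transUnitId: str
    atoms: list[Atom]  # each Segment is composed of one or more Atoms


def build_text_pool(segments: list[Segment]) -> dict[str, str]:
    """Loop over every Atom; only text Atoms enter the pool, keyed by their text."""
    pool: dict[str, str] = {}
    for seg in segments:
        for atom in seg.atoms:
            if atom.type is AtomType.TEXT:
                pool[atom.value] = seg.transUnitId
    return pool


def embed_ids(node: ET.Element, pool: dict[str, str]) -> None:
    """Recursively walk the tree and mark each element whose own text is pooled."""
    text = (node.text or "").strip()
    if text in pool:
        node.set("data-trans-unit-id", pool[text])
    for child in node:
        embed_ids(child, pool)


segments = [
    Segment("1", [Atom(AtomType.TAG, "span"), Atom(AtomType.TEXT, "test.")]),
    Segment("2", [Atom(AtomType.TAG, "span"), Atom(AtomType.TEXT, "fast.")]),
]
root = ET.fromstring("<p><span>test.</span><span>fast.</span></p>")
embed_ids(root, build_text_pool(segments))
print(ET.tostring(root, encoding="unicode"))
```

Keying the pool by text keeps the sketch short, but two Segments with identical text would collide; a real implementation would more likely track document position as well.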
According to the above scheme, further, the linkage between the Segments and the HTML is implemented by embedding the Tag of the Atom into the HTML sub-tag. As shown in FIG. 6, to display the translated content of sentence 1 in the HTML page in real time, the first span tag under the p tag in the figure must be located. The simplest approach is to embed the transUnitId in the tag.
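Once the transUnitId is embedded, updating the preview reduces to a lookup by id followed by a text replacement. The sketch below imitates that step with `xml.etree.ElementTree` instead of a browser DOM; the attribute name `data-trans-unit-id`, the `update_preview` helper, and the sample markup are assumptions for illustration. In a browser, the equivalent lookup would typically be a DOM query such as `document.querySelector`.

```python
import xml.etree.ElementTree as ET

# Hypothetical preview markup in which each span already carries its Segment id.
PREVIEW_HTML = (
    "<p>"
    '<span data-trans-unit-id="1">test.</span>'
    '<span data-trans-unit-id="2">fast.</span>'
    "</p>"
)


def update_preview(root: ET.Element, trans_unit_id: str, translation: str) -> bool:
    """Replace the text of the element tagged with trans_unit_id; report success."""
    node = root.find(f".//*[@data-trans-unit-id='{trans_unit_id}']")
    if node is None:
        return False
    node.text = translation
    return True


root = ET.fromstring(PREVIEW_HTML)
# Simulate the translator finishing sentence 1 in the editor.
update_preview(root, "1", "prueba.")
print(ET.tostring(root, encoding="unicode"))
```

Because only the matching node is touched, the rest of the rendered layout stays as converted, which is what lets the translator see the sentence in context.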
The real-time translation preview method for online assisted translation renders the translation produced during assisted translation in the browser in real time for the translator to check and consult, which saves translation time and improves efficiency. As shown in FIG. 7, when translating the 181st sentence, the translator can see in real time how the translated sentence appears in the translated document.
It is to be understood that the invention is not limited to the arrangements and instrumentality shown in the drawings and described above, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (3)
1. A real-time translation preview method for online assisted translation is characterized by comprising the following steps:
converting the original text file into an HTML format;
parsing the original text and dividing it into an array of Segments, one per sentence; specifically, each Segment obtained by the division is a word, a phrase, or a sentence; the Segment array is a Segment list that records the text content of each Segment and its corresponding text label;
embedding the element id in the sentence Segment into the sub-tag in the HTML format file by using a cyclic recursion algorithm to form a one-to-one correspondence;
the linkage between the sentence Segment and the HTML is realized through the dom node of the HTML, so that the effect of previewing the translation in real time is achieved;
the cyclic recursion algorithm comprises the following steps:
defining an Atom class that has two types, Tag and text, the Atom class being a custom class;
defining Segment content in the Segment list as text of Atom, and defining label of Segment as label Tag of Atom;
the algorithm circularly reads each Atom and judges whether the Atom is put into a text pool or not according to the type of the Atom;
each Atom in the text pool corresponds to the Tag of the Atom to finally form a new HTML sub-Tag set with id mapping;
wherein each Segment is composed of one or more Atom; the sub-Tag of the HTML is composed of Tag of an Atom;
the linkage method between the sentence Segment and the HTML comprises the following steps: the Tag of Atom is embedded into the HTML sub-Tag.
2. The real-time translation preview method for online assisted translation according to claim 1, wherein the original document format is one of doc, docx, rtf, xls, xlsx, ppt, pptx, pdf, sxw, stw, sxc, or stc.
3. The real-time translation preview method for online assisted translation according to claim 2, wherein the original document is converted into the HTML format by using Word's own conversion function or another third-party tool.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010260294.7A CN111401000B (en) | 2020-04-03 | 2020-04-03 | Real-time translation previewing method for online auxiliary translation |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111401000A (en) | 2020-07-10 |
| CN111401000B (en) | 2023-06-20 |
Family ID: 71434942
Family Applications (1)
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202010260294.7A, CN111401000B (en) | Active | 2020-04-03 | 2020-04-03 | Real-time translation previewing method for online auxiliary translation |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111401000B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111985255A (en) * | 2020-09-01 | 2020-11-24 | 北京中科凡语科技有限公司 | Translation method, translation device, electronic device and storage medium |
| CN113705158B (en) * | 2021-09-26 | 2024-05-24 | 上海一者信息科技有限公司 | Method for intelligently restoring original text style in document translation |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102388383A (en) * | 2006-12-08 | 2012-03-21 | 帕特里克·J·霍尔 | Online computer-aided translation |
| CN102567384A (en) * | 2010-12-29 | 2012-07-11 | 盛乐信息技术(上海)有限公司 | Webpage multi-language dynamic switching method and system based on webpage browser engine |
| CN102929867A (en) * | 2011-11-03 | 2013-02-13 | 微软公司 | Technology for automated document translation |
| CN104965866A (en) * | 2015-06-05 | 2015-10-07 | 小米科技有限责任公司 | Method and apparatus for establishing label and style rule binding relation |
| CN105069000A (en) * | 2015-08-24 | 2015-11-18 | 中译语通科技(北京)有限公司 | Interactive prediction input method |
| CN105468697A (en) * | 2015-11-18 | 2016-04-06 | 成都优译信息技术有限公司 | Automatic positioning method used for translation teaching system |
| CN105573969A (en) * | 2006-10-02 | 2016-05-11 | 谷歌公司 | Displaying original text in a user interface with translated text |
| CN105760542A (en) * | 2016-03-15 | 2016-07-13 | 腾讯科技(深圳)有限公司 | Display control method, terminal and server |
| CN106649271A (en) * | 2016-12-19 | 2017-05-10 | 成都优译信息技术股份有限公司 | Translation-based word document analysis method |
| CN107885735A (en) * | 2017-11-21 | 2018-04-06 | 语联网(武汉)信息技术有限公司 | A kind of unrelated document translation method and system of form |
| CN109145260A (en) * | 2018-08-24 | 2019-01-04 | 北京科技大学 | A kind of text information extraction method |
| CN110263351A (en) * | 2019-06-17 | 2019-09-20 | 深圳前海微众银行股份有限公司 | Multilingual translation method, device and equipment for a webpage |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111401000A (en) | 2020-07-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US7472343B2 (en) | Systems, methods and computer programs for analysis, clarification, reporting on and generation of master documents for use in automated document generation | |
| CN110413574A (en) | A kind of method of automatic code generating internationalized resources | |
| JP2004334791A (en) | Machine translation apparatus, data processing method and program | |
| CN111401000B (en) | Real-time translation previewing method for online auxiliary translation | |
| Goodman et al. | Xigt: extensible interlinear glossed text for natural language processing | |
| CN101430684A (en) | Method and apparatus for mutual conversion between Chinese work office software document and documents with other format | |
| JP2003141114A (en) | Third language text generation algorithm and device by multi-language bilingual text input, and program | |
| Sautter et al. | Semi-automated XML markup of biosystematic legacy literature with the GoldenGATE editor | |
| CN113033177A (en) | Method and device for analyzing electronic medical record data | |
| Jacobson et al. | Linguistic documents synchronizing sound and text | |
| Mall et al. | Developing a system for machine translation from Hindi language to English language | |
| Durrani et al. | Improving Egyptian-to-English SMT by mapping Egyptian into MSA | |
| Escartín | Design and compilation of a specialized Spanish-German parallel corpus. | |
| Bański | The logical architecture of CoMParS and its XML implementation | |
| Haverinen et al. | Accurate conversion of dependency parses: targeting the Stanford scheme | |
| JP5994150B2 (en) | Document creation method, document creation apparatus, and document creation program | |
| Hudík et al. | The integration of moses into localization industry | |
| Filip et al. | Internationalization Tag Set (ITS) Version 2.0 | |
| WO2008075087A1 (en) | Code translator and method of automatically translating modelling language code to hardware language code | |
| Declerck et al. | Cross-linking Austrian dialectal Dictionaries through formalized Meanings | |
| Senellart et al. | SYSTRAN translation stylesheets: machine translation driven by XSLT | |
| Huang et al. | Quality Assurance of Automatic Annotation of Very Large Corpora: a Study based on heterogeneous Tagging System. | |
| Choumane et al. | Integrating translation services within a structured editor | |
| Boitet et al. | Towards Higher Quality Internal and Outside Multilingualization of Web Sites | |
| Klyueva et al. | Querying multi-word expressions annotation with CQL |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |