
CN102222070A - Related system and method for approximate words - Google Patents


Info

Publication number
CN102222070A
Authority
CN
China
Prior art keywords
vocabulary
translation
approximate
comparison
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010101500923A
Other languages
Chinese (zh)
Inventor
邱全成
徐胡晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inventec Corp
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp
Priority to CN2010101500923A
Publication of CN102222070A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an association system and method for approximate words. Translated words are repeatedly selected as a first word and a second word and compared with each other, and the first word is associated with the second word according to their degree of similarity, which achieves the technical effect of improving the association between different translated words.

Description

An Association System and Method for Approximate Words

Technical Field

The present invention relates to an association system and method, and in particular to an association system and method for approximate words that associates different translated words according to how similar their spellings are.

Background

In recent years, with the spread of computers and the rapid growth of the Internet, users encounter multilingual documents far more often than before. As a result, translation programs that let users look up the meaning of translated words have sprung up in large numbers.

Generally, a translation program has a vocabulary database that stores translated words, each of which includes vocabulary information. After the user enters a query word, the program looks up the matching translated word in the vocabulary database, and the user learns the meaning of the query word from that translated word's vocabulary information. Users, however, are no longer satisfied with a program that merely displays the matching query result, so enriching the query results has become a problem that vendors are eager to solve.

In view of this, some vendors have proposed adding related words, such as synonyms and antonyms, to the query results to give users richer learning content. If approximate words could also be displayed in the query results, users could memorize more words quickly and the results would be richer still. How to automatically and effectively judge the similarity of different translated words, and to associate them according to that similarity, has therefore become a problem in urgent need of a solution.

In summary, the prior art has long been unable to effectively obtain approximate words for a query word, so an improved technical means is needed to solve this problem.

Summary of the Invention

In view of the problems in the prior art, the present invention provides an association system and method for approximate words.

The association system for approximate words provided by the present invention comprises a vocabulary database, a filtering module, a disassembly module, and an association module. The vocabulary database stores translated words, each of which includes an index value and vocabulary information. The filtering module receives filter conditions, filters the translated words according to those conditions, and repeatedly selects and loads two of the filtered translated words as a first word and a second word. The disassembly module splits the first word into a plurality of first-word letters and the second word into a plurality of second-word letters. The association module compares the first-word letters with the second-word letters, generates a difference value from the number of mismatches, stops the comparison when the difference value exceeds an allowable value, and, when the comparison is completed, embeds the index value of the second word in the vocabulary information of the translated word corresponding to the first word.

The system may further include a query module that receives a query word, looks up the matching translated word in the vocabulary database, and displays the corresponding vocabulary information. The index value of each translated word consists of characters and digits and is unique; the filter conditions include settings for word length, phrases, and part of speech; and the allowable value is a number preset by pressing a function key or clicking with the cursor.

The association method for approximate words of the present invention comprises the following steps: providing translated words in a vocabulary database, each translated word including an index value and vocabulary information; receiving filter conditions, filtering the translated words according to those conditions, and repeatedly selecting and loading two of the filtered translated words as a first word and a second word; splitting the first word into a plurality of first-word letters and the second word into a plurality of second-word letters; and comparing the first-word letters with the second-word letters, generating a difference value from the number of mismatches, stopping the comparison when the difference value exceeds an allowable value, and, when the comparison is completed, embedding the index value of the second word in the vocabulary information of the translated word corresponding to the first word.

The method further includes a step of receiving a query word, looking up the matching translated word in the vocabulary database, and displaying the corresponding vocabulary information. As in the system, the index value of each translated word consists of characters and digits and is unique; the filter conditions include settings for word length, phrases, and part of speech; and the allowable value is a number that can be preset by pressing a function key or clicking with the cursor.

The system and method provided by the present invention are as described above. They differ from the prior art in that the present invention repeatedly selects translated words as the first word and the second word and compares the two, so that the first word is associated with the second word according to their degree of similarity.

Through the above technical means, the present invention achieves the technical effect of improving the association between different translated words.

Description of Drawings

FIG. 1 is a system block diagram of the association system for approximate words of the present invention;

FIG. 2 is a flowchart of the association method for approximate words of the present invention;

FIG. 3 is a schematic diagram of setting filter conditions using the present invention;

FIG. 4 is a schematic diagram of entering a query word to look up vocabulary information with approximate words using the present invention.

[Description of Main Component Symbols]

101 Vocabulary database

102 Filtering module

103 Disassembly module

104 Association module

110 Query module

300 Setting window

301 Cursor

311 Word length setting block

312 Phrase selection component

313 Part-of-speech selection component

321 Confirm component

322 Cancel component

400 Query window

411 Input block

412 Query component

420 Display block

421 Vocabulary information

422 Approximate word

Detailed Description

The embodiments of the present invention are described in detail below with reference to the drawings and examples, so that the process by which the invention applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented.

Before the association system and method for approximate words are described, the terms defined by the present invention are explained. A translated word as used herein includes an index value and vocabulary information. For example, if the translated word is “shy”, its index value may be “~R1234” and its vocabulary information “羞怯的” (Chinese for “shy”). The index value may consist of characters and digits and is unique, although the present invention does not limit the index format to this. The difference value as used herein represents the degree of similarity: the larger the difference value, the more the two translated words differ (the less similar they are), and the smaller the difference value, the less they differ (the more similar they are).
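
The record defined above can be pictured as a small data structure. The following is only a sketch: the class name `TranslatedWord` and its field names are assumptions made for illustration, and the description does not prescribe any storage layout.

```python
from dataclasses import dataclass


@dataclass
class TranslatedWord:
    """One dictionary entry as the description defines it (names are illustrative)."""
    word: str   # the translated word itself, e.g. "shy"
    index: str  # unique index value made of characters and digits, e.g. "~R1234"
    info: str   # vocabulary information shown to the user


shy = TranslatedWord(word="shy", index="~R1234", info="羞怯的")


def indices_are_unique(entries):
    """Check the uniqueness requirement stated for index values."""
    seen = [entry.index for entry in entries]
    return len(seen) == len(set(seen))
```

A uniqueness check such as `indices_are_unique` is one simple way to enforce the requirement that no two translated words share an index value.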

The association system and method for approximate words of the present invention are further described below with reference to the drawings. Referring to FIG. 1, a system block diagram of the association system for approximate words of the present invention, the system comprises a vocabulary database 101, a filtering module 102, a disassembly module 103, and an association module 104. The vocabulary database 101 stores a plurality of translated words, each including an index value and vocabulary information. In practice, the vocabulary database 101 can be implemented with a relational database or with files. Because translated words, index values, and vocabulary information were explained in the definitions above, they are not described again here.

The filtering module 102 receives filter conditions, filters the translated words in the vocabulary database 101 according to those conditions, and repeatedly selects two of the filtered translated words as the first word and the second word. The first word is the primary word, and the second word is the word compared against it. Only after the first word has been compared with all the translated words in the vocabulary database 101 is another translated word selected as the first word. Repeated selection means that, once the first word and the second word have been compared, the first word is kept for the time being and another translated word is selected as the second word for comparison with it. Only when every translated word in the vocabulary database 101 has served as the second word and been compared with the first word is another translated word selected as the first word. This process repeats until all the filtered translated words have been compared with one another. For example, suppose the vocabulary database 101 stores three translated words that satisfy the filter conditions, “A”, “B”, and “C”. If “A” is selected as the first word and “C” as the second word, then after “A” and “C” have been compared, “B” is selected as the second word while the first word stays the same. Only after “A” has been compared with both “B” and “C” is another translated word selected as the first word. In practice, a settings block can be provided so that the user can set word length, phrase, and part of speech as filter conditions. For example, setting the word length to “3” excludes translated words of fewer than three letters, such as “ax”; setting phrase to “0” excludes translated words that are phrases, such as “a lot of”; and setting the part of speech to “preposition” excludes translated words that are prepositions, such as “between”.
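
The filtering and repeated-selection behavior can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the names `Term`, `passes_filter`, and `candidate_pairs` are hypothetical, a phrase is assumed to be any entry containing a space, and the order in which second words are visited is not specified by the description.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Term:
    word: str
    part_of_speech: str


def passes_filter(term, min_length=3, exclude_phrases=True,
                  excluded_pos=("preposition",)):
    """Apply the three filter conditions named in the description."""
    if len(term.word) < min_length:
        return False  # e.g. "ax" when the word length is set to 3
    if exclude_phrases and " " in term.word:
        return False  # assumption: a phrase is an entry with a space in it
    if term.part_of_speech in excluded_pos:
        return False  # e.g. "between" when prepositions are excluded
    return True


def candidate_pairs(terms, **conditions):
    """Return (first word, second word) pairs in selection order.

    The first word stays fixed while every other filtered word takes a
    turn as the second word; only then does the first word change.
    """
    kept = [t for t in terms if passes_filter(t, **conditions)]
    return [(a.word, b.word) for a, b in permutations(kept, 2)]


terms = [
    Term("ax", "noun"),
    Term("a lot of", "phrase"),
    Term("between", "preposition"),
    Term("good", "adjective"),
    Term("god", "noun"),
    Term("book", "noun"),
]
pairs = candidate_pairs(terms)
```

With the sample entries above, only “good”, “god”, and “book” survive the filter, so six ordered pairs are produced. Ordered pairs are used here because the index value is embedded in the first word's entry only.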

The disassembly module 103 splits the first word into a plurality of first-word letters and the second word into a plurality of second-word letters. For example, if the first word is “good”, the disassembly module 103 splits it into the four first-word letters “g”, “o”, “o”, and “d”. Likewise, if the second word is “god”, the disassembly module 103 splits it into the three second-word letters “g”, “o”, and “d”. Splitting a word into letters is a well-known technique and is not described further here.

The association module 104 compares the first-word letters with the second-word letters, generates a difference value from the number of mismatches, stops the comparison when the difference value exceeds the allowable value, and, when the comparison is completed, embeds the index value of the second word in the vocabulary information of the translated word corresponding to the first word. The allowable value can be preset by pressing a function key or clicking with the cursor. Taking the first-word letters and second-word letters produced by the disassembly module 103 above as an example, the comparison can proceed as follows. The first first-word letter “g” is compared with the first second-word letter “g”, and they match. Because they match, the next first-word letter is compared with the next second-word letter: the second first-word letter “o” is compared with the second second-word letter “o”, and they also match. The third first-word letter “o” is compared with the third second-word letter “d”; they do not match, so the difference value becomes “1”. The fourth first-word letter “d” is then compared with the third second-word letter “d”, and they match. This completes the comparison of the first-word letters with the second-word letters. Note that the fourth first-word letter is compared with the third second-word letter because the third first-word letter did not match the third second-word letter: the next first-word letter (here the fourth) must start from the second-word letter at which the mismatch occurred (that is, the one following the last match, here the third) and continue until a match is found. In addition, when a first-word letter fails to match several second-word letters in a row, the mismatch is still counted only once. For example, if the first first-word letter is “a” and the second-word letters are “g”, “o”, and “d”, the letter “a” is compared with “g”, “o”, and “d” in turn; all three comparisons fail, but the number of mismatches is still counted as one.
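
The comparison procedure above can be sketched as a single function. This is one possible reading of the text, and two points are assumptions because the description leaves them open: a first-word letter whose first comparison fails is counted as one mismatch even if it matches a later second-word letter, and a first-word letter left over after the second word is exhausted is also counted as one mismatch. The function name `difference` and the `allowable` parameter are hypothetical.

```python
def difference(first, second, allowable=None):
    """Return the difference value, or None if the comparison was stopped
    because the difference value exceeded the allowable value."""
    start = 0  # position in `second` where the next first-word letter starts
    diff = 0
    for letter in first:
        # A letter contributes at most one mismatch, however many
        # second-word letters it fails to match in a row.
        if start >= len(second) or second[start] != letter:
            diff += 1
            if allowable is not None and diff > allowable:
                return None  # stop comparing this pair
        # On a match, later letters start right after the matched position;
        # if no match is found, the start position is left where it was.
        pos = second.find(letter, start)
        if pos != -1:
            start = pos + 1
    return diff


print(difference("good", "god"))  # 1
print(difference("a", "god"))     # 1
```

Both printed values follow the worked examples in the text: one mismatch for “good” against “god”, and a single counted mismatch for “a” against “g”, “o”, “d”.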

Continuing the example, suppose the first word “good” has the index value “~R1000” and the vocabulary information “好的。” (“good”), and the second word “god” has the index value “~R1011” and the vocabulary information “神像。” (“idol”). With the comparison described above, the difference value on completion is “1”. The association module 104 then embeds the index value “~R1011” of the second word in the vocabulary information of the translated word “good” corresponding to the first word, giving, for example, “好的。~R1011”.
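
The embedding step in this example amounts to appending the second word's index value to the first word's vocabulary information. A minimal sketch, assuming the database is a plain dictionary keyed by word (the layout and the function name `embed` are illustrative, not taken from the description):

```python
# Hypothetical in-memory stand-in for the vocabulary database 101.
database = {
    "good": {"index": "~R1000", "info": "好的。"},
    "god": {"index": "~R1011", "info": "神像。"},
}


def embed(database, first, second):
    """Append the second word's index value to the first word's vocabulary information."""
    database[first]["info"] += database[second]["index"]


embed(database, "good", "god")
print(database["good"]["info"])  # 好的。~R1011
```

Only the first word's entry changes; the entry for “god” is left as it was.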

In addition, the association system for approximate words of the present invention may further include a query module 110, which receives a query word entered by the user, looks up the matching translated word in the vocabulary database 101, and displays the corresponding vocabulary information. Note that when an index value is embedded in the displayed vocabulary information, the translated word corresponding to that index value can also be loaded for the user to look up further. The query procedure is described in detail later with reference to the drawings.

Next, refer to FIG. 2, a flowchart of the association method for approximate words of the present invention. The steps are: providing a plurality of translated words in the vocabulary database 101, each including an index value and vocabulary information (step 210); receiving filter conditions, filtering the translated words according to those conditions, and repeatedly selecting and loading two of the filtered translated words as the first word and the second word (step 220); splitting the first word into a plurality of first-word letters and the second word into a plurality of second-word letters (step 230); and comparing the first-word letters with the second-word letters, generating a difference value from the number of mismatches, stopping the comparison when the difference value exceeds the allowable value, and, when the comparison is completed, embedding the index value of the second word in the vocabulary information of the translated word corresponding to the first word (step 240). Through these steps, translated words are repeatedly selected as the first word and the second word and compared, so that the first word is associated with the second word according to their degree of similarity.

In addition to the above steps, a query word entered by the user can be received, the matching translated word looked up in the vocabulary database 101, and the corresponding vocabulary information displayed (step 250). The query procedure is described in detail later with reference to the drawings.

An embodiment is described below with reference to FIG. 3 and FIG. 4. Refer first to FIG. 3, a schematic diagram of setting filter conditions using the present invention, which includes a setting window 300, a cursor 301, a word length setting block 311, a phrase selection component 312, a part-of-speech selection component 313, a confirm component 321, and a cancel component 322. Note that the present invention does not limit the number or types of components in the setting window 300 to these.

When the user wants to associate translated words, the user can first open the setting window 300 to set the filter conditions, which are then used to select the translated words that may be associated. For example, the user enters a number (such as “3”) in the word length setting block 311, chooses whether to exclude phrases with the phrase selection component 312 (for example, clicking “exclude” excludes phrases), and selects a part of speech, such as preposition, with the part-of-speech selection component 313. The user then clicks the confirm component 321 to finish the settings or clicks the cancel component 322 to discard them.

Once the user clicks the confirm component 321, the filtering module 102 takes the above settings as the filter conditions, filters the translated words in the vocabulary database 101 accordingly, and selects and loads two of the filtered translated words as the first word and the second word. Suppose the two translated words are “book” and “bach”; “book” can then serve as the first word and “bach” as the second word.

Next, the disassembly module 103 splits the first word “book” into the first-word letters “b”, “o”, “o”, and “k”, and the second word “bach” into the second-word letters “b”, “a”, “c”, and “h”. The association module 104 then compares these letters as follows. The first first-word letter “b” is compared with the first second-word letter “b”, and they match. The second first-word letter “o” is compared with the second second-word letter “a”; they do not match, and the difference value is recorded as “1”. The second first-word letter “o” is compared with the third second-word letter “c”; they do not match, but because the comparison is still for the second first-word letter “o”, the difference value stays at “1”. The second first-word letter “o” is compared with the fourth second-word letter “h”; they do not match, and for the same reason the difference value stays at “1”. This ends the comparison for the second first-word letter “o”. The third first-word letter “o” is compared with the second second-word letter “a”; they do not match, and the difference value becomes “2”. In the same way, the third first-word letter “o” is compared in turn with the third second-word letter “c” and the fourth second-word letter “h”, and the difference value stays at “2”. The fourth first-word letter “k” is compared with the second second-word letter “a”; they do not match, and the difference value becomes “3”. Likewise, the fourth first-word letter “k” is compared in turn with the third second-word letter “c” and the fourth second-word letter “h”, and the difference value stays at “3”. This completes the comparison. Note that the third first-word letter “o” starts from the second second-word letter “a” because the mismatch arose at “a”. In other words, each subsequent first-word letter starts comparing from the second-word letter that follows the last match, which here is “a”.
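
The walk-through above can be reproduced step by step with a small tracing generator. This is a sketch of one reading of the procedure, not the patented implementation: the name `trace` is hypothetical, and the handling of a first-word letter that has no second-word letter left to compare with (counted as one mismatch, reported with `None`) is an assumption the text does not cover.

```python
def trace(first, second):
    """Yield (first letter, second letter, matched, running difference)
    for every single comparison, in the order the text describes."""
    start = 0  # where the next first-word letter starts in `second`
    diff = 0
    for letter in first:
        if start >= len(second):
            # Assumption: nothing left to compare with counts as one mismatch.
            diff += 1
            yield letter, None, False, diff
            continue
        counted = False
        for pos in range(start, len(second)):
            matched = letter == second[pos]
            if not matched and not counted:
                diff += 1       # consecutive mismatches count only once
                counted = True
            yield letter, second[pos], matched, diff
            if matched:
                start = pos + 1  # next letter starts after the match
                break


for step in trace("book", "bach"):
    print(step)
```

For “book” and “bach” this prints ten comparisons: one match for “b”, then three failed comparisons each for “o”, “o”, and “k”, with the running difference ending at 3.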

When the comparison is completed, the association module 104 embeds the index value of the second word in the vocabulary information of the translated word corresponding to the first word. In the example above, if the index value of the second word “bach” is “~R3000” and the vocabulary information of the translated word “book” corresponding to the first word “book” is “书本。” (“book”), the association module 104 embeds the index value “~R3000” in that vocabulary information, which becomes “书本。~R3000”. In practice, the association module 104 may also decide whether to embed based on the length of the first word and the difference value produced by the comparison. For example, when the first word is 4 letters long, the index value is embedded only if the difference value is less than or equal to “1” (under this rule the index value of the second word in the example above would not be embedded); when the first word is 5 to 10 letters long, the index value is embedded only if the difference value is less than or equal to “2”; and so on. The present invention does not limit the embedding decision to this approach, however. In addition, before comparing, the association module 104 may first check whether the vocabulary information of the first word already contains the index value of the second word; if not, the comparison is performed, and otherwise no comparison is needed.
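
The optional length-based embedding decision can be sketched as a small lookup. Only the two rules stated in the text (4 letters, and 5 to 10 letters) are encoded; the text leaves other lengths open with “and so on”, so this sketch declines to embed for them, which is an assumption. The function names are hypothetical.

```python
def max_difference_for(length):
    """Largest difference value that still allows embedding for a first
    word of the given length, or None where the text gives no rule."""
    if length == 4:
        return 1
    if 5 <= length <= 10:
        return 2
    return None  # not specified in the description


def should_embed(first_word, difference_value):
    """Decide whether the second word's index value is embedded."""
    limit = max_difference_for(len(first_word))
    return limit is not None and difference_value <= limit


print(should_embed("book", 3))  # False
print(should_embed("good", 1))  # True
```

The first call mirrors the parenthetical remark in the text: with “book” as a 4-letter first word and a difference value of 3, the index value of “bach” would not be embedded under this rule.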

In addition, if the user sets the allowable value to 2 by pressing a function key, for example the keyboard key "F2", then as soon as the difference value reaches 3 during the comparison above, the association module 104 stops the comparison, and the screening module 102 selects another of the screened translated vocabularies as the second vocabulary so that the comparison process can be repeated. Because the process is the same as described above, it is not repeated here.
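
A sketch of the early stop follows, under the same assumed cursor-based comparison as in the description (the cursor into the second vocabulary advances only after a match). The function names and the extra word "boot" are hypothetical and serve only as illustration.

```python
from typing import List, Optional, Tuple


def compare_with_limit(first: str, second: str, allowable: int) -> Optional[int]:
    """Return the difference value, or None if the comparison was stopped."""
    cursor = 0
    difference = 0
    for letter in first:
        if cursor < len(second) and letter == second[cursor]:
            cursor += 1
        else:
            difference += 1
            if difference > allowable:
                return None  # exceeded the allowable value: stop comparing
    return difference


def candidates_within(first: str, screened: List[str],
                      allowable: int) -> List[Tuple[str, int]]:
    """Try each screened word in turn as the second vocabulary."""
    results = []
    for second in screened:
        if second == first:
            continue  # a word is not compared with itself
        difference = compare_with_limit(first, second, allowable)
        if difference is not None:
            results.append((second, difference))
    return results


print(candidates_within("book", ["book", "bach", "boot"], 2))  # [('boot', 1)]
```

With an allowable value of 2, the comparison of "book" with "bach" stops at the third mismatch, and the loop moves on to the next screened word.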

FIG. 4 is a schematic diagram of entering a query vocabulary to look up vocabulary information that has approximate vocabulary according to the present invention. Once all of the screened translated vocabularies have served as the first vocabulary and the second vocabulary and have been compared with one another, the vocabulary information of a translated vocabulary in the vocabulary database 101 may contain an index value, and the translated vocabulary corresponding to that index value is treated as an approximate vocabulary. In practice, the user can look up approximate vocabulary through the query window 400 generated by the query module 110.

For example, the user can type the query vocabulary "book" into the input block 411 and click the query component 412 with the cursor 301. The query module 110 then receives the query vocabulary "book" and looks up the matching translated vocabulary in the vocabulary database 101. Because, as described above, the vocabulary information of the translated vocabulary "book" is "书本。~R3000", the vocabulary information 421 "书本。" is displayed in the display block 420 as shown in FIG. 4, and the translated vocabulary "bach" corresponding to the index value "~R3000" is retrieved from the vocabulary database 101 and displayed as the approximate vocabulary 422.
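
The query step can be illustrated with a small in-memory stand-in for the vocabulary database 101. Everything below is an assumption made for illustration: the dictionary layout, the index value "~R1000" for "book", the placeholder vocabulary information for "bach", and the pattern `~R` followed by digits, which is inferred only from the single example "~R3000".

```python
import re
from typing import List, Optional, Tuple

# Hypothetical stand-in for vocabulary database 101.
VOCABULARY = {
    "book": {"index": "~R1000", "info": "书本。~R3000"},
    "bach": {"index": "~R3000", "info": "<info for bach>"},  # placeholder
}

# Assumed index format, inferred from the example "~R3000".
INDEX_PATTERN = re.compile(r"~R\d+")


def query(word: str) -> Optional[Tuple[str, List[str]]]:
    """Return (information to display, approximate vocabulary) for `word`."""
    entry = VOCABULARY.get(word)
    if entry is None:
        return None
    info = entry["info"]
    embedded = INDEX_PATTERN.findall(info)  # index values embedded earlier
    display = INDEX_PATTERN.sub("", info)   # text shown as the information
    by_index = {e["index"]: w for w, e in VOCABULARY.items()}
    approximate = [by_index[i] for i in embedded if i in by_index]
    return display, approximate


print(query("book"))  # ('书本。', ['bach'])
```

Storing only the index value inside the vocabulary information keeps the association compact, and the approximate vocabulary is resolved at query time by looking the index value up again.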

In summary, the present invention differs from the prior art in that translated vocabularies are continuously selected as the first vocabulary and the second vocabulary and compared with each other, so that the first vocabulary and the second vocabulary are associated according to their degree of approximation. This technical means solves the problems of the prior art and achieves the technical effect of improving the association between different translated vocabularies.

Although the present invention is disclosed above by way of the foregoing embodiments, they are not intended to limit the present invention. Any person skilled in the art may make minor changes and modifications without departing from the spirit and scope of the present invention. The scope of patent protection of the present invention is therefore defined by the claims attached to this specification.

Claims (10)

1. An association system for approximate vocabulary, characterized in that it comprises:
a vocabulary database for storing a plurality of translated vocabularies, each translated vocabulary comprising an index value and vocabulary information;
a screening module for receiving a screening condition, screening the translated vocabularies according to the screening condition, and continuously selecting and loading two of the screened translated vocabularies as a first vocabulary and a second vocabulary;
a disassembly module for disassembling the first vocabulary into a plurality of first vocabulary letters and disassembling the second vocabulary into a plurality of second vocabulary letters; and
an association module for comparing the first vocabulary letters with the second vocabulary letters, generating a difference value according to the number of failed comparisons, stopping the comparison when the difference value is greater than an allowable value, and, when the comparison is complete, embedding the index value of the second vocabulary into the vocabulary information of the translated vocabulary corresponding to the first vocabulary.
2. The association system for approximate vocabulary as claimed in claim 1, characterized in that the system further comprises a query module for receiving a query vocabulary, looking up the matching translated vocabulary in the vocabulary database according to the query vocabulary, and displaying the corresponding vocabulary information.
3. The association system for approximate vocabulary as claimed in claim 1, characterized in that the index value of each translated vocabulary is composed of characters and numerical values and is unique.
4. The association system for approximate vocabulary as claimed in claim 1, characterized in that the screening condition comprises settings for vocabulary length, phrase, and part of speech.
5. The association system for approximate vocabulary as claimed in claim 1, characterized in that the allowable value is a numerical value set in advance by pressing a function key or by clicking with a cursor.
6. An association method for approximate vocabulary, characterized in that it comprises the steps of:
providing a plurality of translated vocabularies in a vocabulary database, each translated vocabulary comprising an index value and vocabulary information;
receiving a screening condition, screening the translated vocabularies according to the screening condition, and continuously selecting and loading two of the screened translated vocabularies as a first vocabulary and a second vocabulary;
disassembling the first vocabulary into a plurality of first vocabulary letters and disassembling the second vocabulary into a plurality of second vocabulary letters; and
comparing the first vocabulary letters with the second vocabulary letters, generating a difference value according to the number of failed comparisons, stopping the comparison when the difference value is greater than an allowable value, and, when the comparison is complete, embedding the index value of the second vocabulary into the vocabulary information of the translated vocabulary corresponding to the first vocabulary.
7. The association method for approximate vocabulary as claimed in claim 6, characterized in that the method further comprises the step of receiving a query vocabulary, looking up the matching translated vocabulary in the vocabulary database according to the query vocabulary, and displaying the corresponding vocabulary information.
8. The association method for approximate vocabulary as claimed in claim 6, characterized in that the index value of each translated vocabulary is composed of characters and numerical values and is unique.
9. The association method for approximate vocabulary as claimed in claim 6, characterized in that the screening condition comprises settings for vocabulary length, phrase, and part of speech.
10. The association method for approximate vocabulary as claimed in claim 6, characterized in that the allowable value is a numerical value set in advance by pressing a function key or by clicking with a cursor.
CN2010101500923A 2010-04-16 2010-04-16 Related system and method for approximate words Pending CN102222070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101500923A CN102222070A (en) 2010-04-16 2010-04-16 Related system and method for approximate words

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101500923A CN102222070A (en) 2010-04-16 2010-04-16 Related system and method for approximate words

Publications (1)

Publication Number Publication Date
CN102222070A true CN102222070A (en) 2011-10-19

Family

ID=44778624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101500923A Pending CN102222070A (en) 2010-04-16 2010-04-16 Related system and method for approximate words

Country Status (1)

Country Link
CN (1) CN102222070A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103246728A (en) * 2013-05-10 2013-08-14 北京大学 Emergency detection method based on document lexical feature variations

Similar Documents

Publication Publication Date Title
US8341167B1 (en) Context based interactive search
KR102067926B1 (en) Apparatus and method for de-identifying personal information contained in electronic documents
US8849653B2 (en) Updating dictionary during application installation
CN105051721B (en) Program for converting natural language descriptions to domain-specific languages for spreadsheets
KR101524889B1 (en) Identification of semantic relationships within reported speech
CN103186639B (en) Data creation method and system
CN100437573C (en) Identifying related names
CN106383836B (en) Attributing actionable attributes to data describing an identity of an individual
CN113678118A (en) data extraction system
BRPI0715523A2 (en) document-centric workflow systems, methods, and software based on document content, metadata, and context
WO2015009586A2 (en) Performing an operation relative to tabular data based upon voice input
KR20090014136A (en) System and method for searching and matching data having ideogrammatic content
CN102016837A (en) System and method for classification and retrieval of Chinese characters and character radicals
CN103678362A (en) Search method and search system
CN111767334A (en) Information extraction method, device, electronic device and storage medium
CN102682045B (en) Recommendation Method and Recommendation System Based on Dynamic Language Model
CN110162617B (en) Method, apparatus, language processing engine and medium for extracting summary information
CN103377224A (en) Method and device for recognizing problem types and method and device for establishing recognition models
JP2006023968A (en) Named expression extraction method and apparatus, and program used therefor
CN102222070A (en) Related system and method for approximate words
TWI432981B (en) Word relational system with approximation word and method thereof
CN100442275C (en) Method and system for authenticating Chinese address data
CN112989011B (en) Data query method, data query device and electronic equipment
TW200947241A (en) Database indexing algorithm and method and system for database searching using the same
CN101996505A (en) Learning system and method for raw word and vocabulary

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20111019