WO2019109663A1 - Cross-language search method and apparatus, and device for cross-language search - Google Patents
- Publication number
- WO2019109663A1 (PCT/CN2018/100990)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- source language
- search term
- source
- translation
- translation result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3337—Translation of the query language, e.g. Chinese to English
Definitions
- the present application relates to the field of information search technology, and in particular, to a cross-language search method and apparatus, and a device for cross-language search.
- for example, when the search term (query) input by the user is "March", a search in a Chinese database may not satisfy the user's needs to the greatest extent, whereas an English database originating from European and American websites may have more and better search results.
- Cross-language search technology combines information retrieval technology with machine translation technology.
- the existing cross-language search process may specifically include: first converting the source language search term into a target language search term through machine translation, and then performing information retrieval in the corresponding language databases according to the source language search term and the target language search term respectively, to obtain multi-language search results, wherein the multi-language search results may include: search results in the source language and search results in the target language.
- the existing solution can translate the target language search result and provide the translated source language translation result to the user.
- users often need to read the source language translation results carefully to pick out the key information they need, which reduces the efficiency with which they obtain information from those results.
- embodiments of the present application have been proposed in order to provide a cross-language search method, a cross-language search apparatus, and a device for cross-language search that overcome the above problems or at least partially solve them. The embodiments of the present application can improve the coverage of the red-marked content in the source language translation result corresponding to the target language search term, and can thereby improve the efficiency of obtaining information from that source language translation result.
- a cross-language search method which includes:
- a cross-language search device including:
- a search term translation module configured to translate a source language search term into a target language search term
- a result obtaining module configured to obtain, for the target language search term, a corresponding target language search result and a source language translation result; wherein the source language translation result is obtained by translating the target language search result;
- a keyword acquisition module configured to obtain, from the source language translation result, a keyword that matches the target language search term
- a mark display module configured to mark and display the keyword in the source language translation result.
- an embodiment of the present application discloses a device for cross-language search, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
- embodiments of the present application disclose one or more machine-readable media having stored thereon instructions that, when executed by one or more processors, cause a device to perform one or more of the aforementioned cross-language search methods.
- the embodiment of the present application obtains a keyword matching the target language search term from the source language translation result, and marks and displays the keyword in the source language translation result. This improves the coverage of the red-marked content in the source language translation result corresponding to the target language search term, and thereby improves the efficiency of obtaining information from that source language translation result.
- FIG. 1 is a schematic diagram of an application environment of a cross-language search method of the present application
- FIG. 2 is a flow chart showing the steps of a first embodiment of a cross-language search method according to the present application
- FIG. 3 is a flow chart of steps of a second embodiment of a cross-language search method according to the present application.
- FIG. 4 is a flow chart showing the steps of a third embodiment of a cross-language search method according to the present application.
- FIG. 5 is a flow chart of steps of a fourth embodiment of a cross-language search method according to the present application.
- FIG. 6 is a structural block diagram of an embodiment of a cross-language search device of the present application.
- FIG. 7 is a block diagram of a device 900 for cross-language search as a terminal of the present application.
- FIG. 8 is a schematic structural diagram of a device for cross-language search as a server of the present application.
- the inventor found in the process of implementing the embodiment of the present application that the existing solution can translate the target language search result into the source language, and mark the content of the resulting source language translation result that is the same as the source language search term. For example, if the source language search term is the Chinese rendering of "Trump" and the corresponding target language search term is "Trump", occurrences of that same Chinese rendering in the source language translation result corresponding to "Trump" can be marked in red. This allows the user to quickly obtain key information included in a source language translation result, which in turn allows the user to quickly determine whether a source language translation result meets his or her information needs.
- in view of the technical problem of low coverage of the red-marked content in the source language translation result in the existing solution, the present application provides a cross-language search solution, which can: translate a source language search term into a target language search term; obtain, for the target language search term, a corresponding target language search result and a source language translation result, wherein the source language translation result may be obtained by translating the target language search result; obtain, from the source language translation result, a keyword matching the target language search term; and mark and display the keyword in the source language translation result.
- the embodiment of the present application obtains a keyword matching the target language search term from the source language translation result, and marks and displays the keyword in the source language translation result. This improves the coverage of the red-marked content in the source language translation result corresponding to the target language search term, and thereby improves the efficiency of obtaining information from that source language translation result.
- the search result corresponding to the target language search term may include: a target language search result and a source language translation result, wherein the source language translation result may be obtained by translating the target language search result. The target language search result and the source language translation result can correspond to the same search result (such as the same web page, video, picture, or piece of music); one difference between the two is the language form.
- for example, suppose the source language search term is the Chinese word for "March", the corresponding target language search term is "March", and the target language search results include "March - Wikipedia", whose source language translation result is the Chinese rendering of "March - Wikipedia". The embodiment of the present application can determine, from that source language translation result, the keyword that matches the target language search term "March", and mark and display that keyword within the translated "March - Wikipedia", thereby improving the coverage of the red-marked content in the source language translation result.
- the embodiment of the present application can be applied to a platform environment with a cross-language search function, such as a search APP (application), a search website (such as a search engine), etc., to improve the efficiency of obtaining information from search results corresponding to target language search words.
- the cross-language search method of the embodiment of the present application is mainly described by using the search APP as an example, and the cross-language search method corresponding to other platform environments such as the search website may be cross-referenced.
- the cross-language search method provided by the embodiment of the present application can be applied to the application environment shown in FIG. 1.
- the client 100 and the server 200 are located in a wired or wireless network, through which the client 100 performs data interaction with the server 200.
- the cross-language search process of the embodiment of the present application may be performed by any one or combination of the client 100 and the server 200:
- the client 100 may receive the source language search term input by the user and send it to the server 200; after receiving the source language search term, the server 200 may translate it into the target language search term; obtain a target language search result corresponding to the target language search term; translate the target language search result from the target language to the source language to obtain a source language translation result; obtain, from the source language translation result, a keyword that matches the target language search term; and send the source language translation result and the keyword to the client 100, so that the client 100 displays the source language translation result and marks and displays the keyword in it.
- alternatively, the client 100 may receive the source language search term input by the user and then, on its own: translate the source language search term into the target language search term; obtain the target language search result corresponding to the target language search term; translate the target language search result from the target language to the source language to obtain a source language translation result; obtain, from the source language translation result, a keyword matching the target language search term; and display the source language translation result, marking and displaying the keyword in it.
- the rich resources of the server 200 can be exploited to improve the efficiency and accuracy of obtaining the source language translation result and the keywords in it. For example, a cloud server can be deployed with a large number of highly configured computing devices; using these computing devices to obtain the source language translation result and the keywords in it can improve acquisition efficiency and accuracy, while saving computing resources on the client 100 side and improving the performance of the terminal on which the client 100 runs.
- the process of obtaining the source language translation result and the keywords in it may be performed by either the server 200 or the client 100; the embodiment of the present application does not limit the specific execution subject.
- the client 100 can run on a terminal, where the terminal specifically includes but is not limited to: a smart phone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, an in-vehicle computer, a desktop computer, a set-top box, a smart TV, a wearable device, and so on.
- referring to FIG. 2, a flow chart of the steps of the first embodiment of the cross-language search method of the present application is shown, which may specifically include the following steps:
- Step 201: Translate a source language search term into a target language search term;
- Step 202: Obtain a corresponding target language search result and a source language translation result for the target language search term; wherein the source language translation result may be obtained by translating the target language search result;
- Step 203: Obtain a keyword that matches the target language search term from the source language translation result;
- Step 204: Mark the keyword in the source language translation result.
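- The four steps can be sketched as a small pipeline. This is only an illustrative sketch, not the applicant's implementation: every callable passed in (`translate_term`, `search`, `translate_result`, `find_keywords`, `mark`) is a hypothetical stand-in, and the Chinese/English strings are toy data chosen to mirror the "March" example.

```python
from typing import Callable, List


def cross_language_search(
    source_term: str,
    translate_term: Callable[[str], str],            # step 201 (hypothetical translator)
    search: Callable[[str], List[str]],              # step 202 (hypothetical search backend)
    translate_result: Callable[[str], str],          # step 202 (target -> source translation)
    find_keywords: Callable[[str, str], List[str]],  # step 203
    mark: Callable[[str, List[str]], str],           # step 204
) -> List[str]:
    """Run steps 201-204 and return the marked source language translation results."""
    target_term = translate_term(source_term)
    marked_results = []
    for target_result in search(target_term):
        source_result = translate_result(target_result)
        keywords = find_keywords(target_term, source_result)
        marked_results.append(mark(source_result, keywords))
    return marked_results


def bracket_mark(text: str, keywords: List[str]) -> str:
    """Toy marker: wrap each keyword in square brackets instead of colouring it."""
    for kw in keywords:
        text = text.replace(kw, f"[{kw}]")
    return text


# Toy stand-ins (all hypothetical) so the sketch runs end to end.
results = cross_language_search(
    "三月",
    translate_term=lambda term: {"三月": "March"}[term],
    search=lambda term: [f"{term} - Wikipedia"],
    translate_result=lambda text: text.replace("March", "三月").replace("Wikipedia", "维基百科"),
    find_keywords=lambda term, src: [w for w in ["三月"] if w in src],
    mark=bracket_mark,
)
```

Passing the stages in as callables reflects the statement that each step may run on either the client or the server.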
- the source language search term may be a search term input by the user using the source language.
- a client of a search APP or a search website can provide a UI (User Interface), and the user can submit a source language search term to the client through a search box, a voice interface, or the like on the UI.
- the client may display the received source language search term in the search box. Therefore, in the embodiment of the present application, the source language search term input by the user may include: the source language search term submitted by the user to the client in any way. It can be understood that the specific manner of obtaining the source language search term is not limited in the embodiment of the present application.
- the source language and the target language may be used to represent two different languages.
- the source language and the target language may be preset by the user, or may be obtained by a search APP or a search website by analyzing the user's search behavior and/or browsing behavior.
- the search APP or the search website may use the language most commonly used by the user as the source language, and a language the user uses other than the source language as the target language.
- for example, if the user's search behavior indicates that the search terms used by the user are Chinese search terms, the source language can be determined to be Chinese; if the user's browsing behavior also indicates that the user has visited a translation website and used it to translate between Chinese and English, the target language can be determined to be English.
- the number of target languages in the embodiment of the present application may be one or more.
- the source language may be Chinese
- the target language may be English, Japanese, Korean, German.
- the cross-language search method of the embodiment of the present application is mainly described by taking Chinese as the source language and English as the target language as an example.
- the cross-language search methods corresponding to the other source languages and the target language may be cross-referenced.
- step 201 may translate the source language search term into a target language search term by the client or server.
- in the process of translating the source language search term, a plurality of different translation results may be obtained; in this case, the one with the highest confidence may be selected from the plurality of different translation results as the target language search term. It can be understood that the specific process of translating the source language search term into the target language search term is not limited. In an application example of the present application, if the source language search term is the Chinese word for "March", the target language search term may be "March". In another application example of the present application, if the source language search term is the Chinese rendering of "Disney", the target language search term may be "disney".
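- Choosing the most confident candidate amounts to a one-line maximum. The sketch below assumes the translator returns `(translation, confidence)` pairs; that return shape and the scores in the usage example are hypothetical.

```python
from typing import List, Tuple


def pick_target_term(candidates: List[Tuple[str, float]]) -> str:
    """Return the candidate translation with the highest confidence score."""
    if not candidates:
        raise ValueError("no translation candidates")
    return max(candidates, key=lambda candidate: candidate[1])[0]
```

For example, `pick_target_term([("March", 0.9), ("parade", 0.1)])` selects `"March"`.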
- step 202 may perform a search according to the target language search term obtained in step 201 to obtain a target language search result. It can be understood that the specific manner of obtaining the target language search result is not limited in the embodiment of the present application.
- step 202 may perform a translation from the target language to the source language on the target language search result to obtain the source language translation result.
- the target language search result and the source language translation result may correspond to the same search result (such as the same webpage, video, picture, music, etc.), and one of the differences is that the language form is different.
- the obtaining process of the source language translation result may include: translating the target language search result from a target language to a source language by using a translation model; wherein the translation model may be a translation model commonly used in the translation field.
- the translation model specific to the search result may also be used.
- the embodiment of the present application does not limit the specific translation model and the specific translation process from the target language to the source language for the target language search result.
- by improving the process of translating the target language search result from the target language to the source language, the consistency between the translation of the target content in the source language translation result and the source language search term can be improved; in this way, the coverage of the red-marked content can be improved.
- the obtaining process of the source language translation result may include: in the process of translating the target language search result from the target language to the source language, translating the target content in the target language search result into a form consistent with the source language search term; wherein the target content is the content in the target language search result that is the same as the target language search term.
- the candidate source language translations of the target language search term "Trump" may include several different renderings, glossed here in English as "Champ", "Trump", "Ace", "Good Man", "Honest Person", etc.
- the embodiment of the present application can improve the coverage of the target content by improving the consistency between the translation result of the target content in the source language translation result and the source language search term.
- for example, if the source language search term corresponding to the target language search term "Trump" is the rendering glossed as "Trump", the target language search term "Trump" can be translated into that rendering, and every "Trump" included in the target language search result is translated the same way; or, if the source language search term corresponding to the target language search term "Trump" is the rendering glossed as "Champ", the target language search term "Trump" can be translated into "Champ", and every "Trump" included in the target language search result is translated into "Champ"; thereby improving the coverage of the red-marked content.
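- One crude way to force this consistency is to cut the target language search result around each occurrence of the target language search term, translate the remaining pieces, and rejoin them with the user's own source language search term. The sketch below is an assumption for illustration only; the source does not say how the translation model enforces consistency, and translating the pieces separately loses sentence context. The `translate` callable is a hypothetical stand-in for a real translation model.

```python
import re
from typing import Callable


def translate_with_forced_term(
    target_text: str,
    target_term: str,
    source_term: str,
    translate: Callable[[str], str],
) -> str:
    """Translate target_text, rendering every whole-word target_term as source_term."""
    pattern = re.compile(r"\b" + re.escape(target_term) + r"\b", re.IGNORECASE)
    pieces = pattern.split(target_text)  # text between occurrences of the term
    translated = [translate(p) if p.strip() else p for p in pieces]
    return source_term.join(translated)
```

With an upper-casing function standing in for the translator, `translate_with_forced_term("Trump wins; trump speaks", "Trump", "特朗普", str.upper)` yields `"特朗普 WINS; 特朗普 SPEAKS"`.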
- Step 203 Obtain, from the source language translation result, a keyword that matches the target language search term.
- the language form corresponding to the keyword may be the source language form, that is, the language form corresponding to the keyword is consistent with the language form corresponding to the translation result of the source language.
- the keyword matching the target language search term may include: the semantics corresponding to the keyword match the semantics corresponding to the target language search term, or the effect the keyword produces on a source language reader matches the effect the target language search term produces on a target language reader.
- the step 203 of obtaining, from the source language translation result, a keyword that matches the target language search term may include: obtaining, according to a translation dictionary, a keyword that matches the target language search term, and finding the content identical to that keyword in the source language translation result.
- the translation dictionary may provide a corresponding source language vocabulary for the target language vocabulary.
- for example, the source language vocabulary provided for the target language vocabulary "March" may include several source language words, glossed here in English as "March", "March", "Forward", and the like.
- a keyword matching the target language search term may be obtained from the source language translation result.
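- The dictionary-based lookup can be sketched as follows. The dictionary contents and its lower-cased keys are illustrative assumptions; a real translation dictionary would be far larger and would need language-appropriate tokenization rather than plain substring checks.

```python
from typing import Dict, List


def keywords_from_dictionary(
    target_term: str,
    source_text: str,
    dictionary: Dict[str, List[str]],
) -> List[str]:
    """Return dictionary translations of target_term that occur in source_text."""
    candidates = dictionary.get(target_term.lower(), [])
    return [word for word in candidates if word in source_text]


# Toy dictionary (illustrative): "march" -> the month, a military march, to advance.
toy_dictionary = {"march": ["三月", "行军", "前进"]}
```

Here `keywords_from_dictionary("March", "三月 - 维基百科", toy_dictionary)` returns `["三月"]`, the only candidate present in the translated text.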
- step 203 may obtain a keyword that matches the target language search term from the source language translation result by a word alignment technique.
- Word alignment refers to the correspondence between the source language vocabulary and the target language vocabulary in a bilingual sentence pair.
- for example, for a bilingual sentence pair whose target language sentence is "I play game in the classroom", the word alignment technique can put the source language words into correspondence with "I", "in", "classroom", "play", and "game" respectively. Since the word alignment technique considers the alignment relationship between mutually translated word pairs in the bilingual sentence pair, the accuracy of the keywords can be improved.
- the process of obtaining, by using the word alignment technique, a keyword that matches the target language search term from the source language translation result may include: establishing a mapping relationship between the target language vocabulary included in the target language search result and the source language vocabulary included in the source language translation result; and searching the mapping relationship according to the target language search term, to obtain the keyword, included in the source language translation result, that matches the target language search term.
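- The mapping-then-lookup procedure might look like the sketch below. It assumes the word alignment model has already produced index pairs `(target_index, source_index)`; that input format, the function names, and the toy tokens are hypothetical, and multi-word search terms are not handled.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def build_mapping(
    target_tokens: Sequence[str],
    source_tokens: Sequence[str],
    alignment: Sequence[Tuple[int, int]],
) -> Dict[str, List[str]]:
    """Map each (lower-cased) target language word to its aligned source language words."""
    mapping: Dict[str, List[str]] = defaultdict(list)
    for target_index, source_index in alignment:
        key = target_tokens[target_index].lower()
        word = source_tokens[source_index]
        if word not in mapping[key]:
            mapping[key].append(word)
    return dict(mapping)


def keywords_from_alignment(target_term: str, mapping: Dict[str, List[str]]) -> List[str]:
    """Look up the source language keywords aligned to the target language search term."""
    return mapping.get(target_term.lower(), [])


# Toy bilingual pair with a one-to-one alignment (illustrative data).
toy_mapping = build_mapping(
    ["March", "-", "Wikipedia"],
    ["三月", "-", "维基百科"],
    [(0, 0), (1, 1), (2, 2)],
)
```

`keywords_from_alignment("March", toy_mapping)` then returns `["三月"]`.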
- the mapping relationship between the target language vocabulary included in the target language search result and the source language vocabulary included in the source language translation result may be established by using a word alignment technique; specifically, a word alignment model may be applied to the target language search result and the source language translation result to establish this mapping relationship.
- the word alignment model can be obtained by training on a bilingual aligned corpus, so that the word alignment model includes statistical information for determining which word pairs in a bilingual sentence pair are translations of each other. Since the word alignment model is related art in the field, it is not described in detail here.
- for example, assuming that the source language search term is the Chinese word for "March", that the corresponding target language search term is "March", and that a target language search result corresponding to "March" has been obtained, the embodiment of the present application can use a word alignment model to align the bilingual sentence pair formed by the target language search result and its source language translation result.
- the above processes of obtaining a keyword matching the target language search term from the source language translation result, by using a translation dictionary or a word alignment technique, are only optional embodiments. In fact, those skilled in the art may adopt other processes according to actual application requirements. For example, the attention mechanism of a neural machine translation model may be used to determine which source language word the target language search term was translated into. The attention mechanism is a selection mechanism for allocating limited information processing capacity: attention is the cognitive process of selectively focusing on certain perceived information, which may be objective or subjective, while ignoring other information received at the same time.
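- A simple way to read keywords off an attention matrix is to keep each generated source language token whose strongest attention falls on the search term. This is a sketch under stated assumptions: `attention[i][j]` is taken to be the weight that output (source language) token `i` places on input (target language) token `j`, and the matrix values are made up. Attention weights are not guaranteed to be reliable alignments, so a real system would likely need further filtering.

```python
from typing import List, Sequence


def keywords_from_attention(
    target_tokens: Sequence[str],
    source_tokens: Sequence[str],
    attention: Sequence[Sequence[float]],
    target_term: str,
) -> List[str]:
    """Return source tokens whose highest attention weight lands on target_term."""
    term_positions = {
        j for j, token in enumerate(target_tokens)
        if token.lower() == target_term.lower()
    }
    keywords: List[str] = []
    for i, row in enumerate(attention):
        best_j = max(range(len(row)), key=row.__getitem__)  # argmax over input tokens
        if best_j in term_positions and source_tokens[i] not in keywords:
            keywords.append(source_tokens[i])
    return keywords


# Made-up attention weights for a three-token toy translation.
toy_attention = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
]
```

With these weights, `keywords_from_attention(["March", "-", "Wikipedia"], ["三月", "-", "维基百科"], toy_attention, "March")` returns `["三月"]`.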
- the specific embodiment of the present application does not limit the specific process of obtaining keywords that match the target language search term from the source language translation result.
- the presentation content corresponding to the source language translation result may include: a title, a summary, a promotion content, and the like
- the step 204 may mark and display the keyword in the presentation content corresponding to the source language translation result.
- the keyword included in the title or the summary may be marked and displayed
- the keyword included in the promotion content may be marked and displayed.
- the keyword may be marked and displayed by changing its font or color; for example, the keyword may be bolded, underlined, italicized, given a three-dimensional effect, set to red, etc. This makes the keyword easier to recognize against presentation content shown in the normal font or normal color.
- alternatively, an identifier may be displayed alongside the keyword. For example, a corresponding icon or the like may be displayed at a position such as a superscript or a subscript of the keyword. It can be understood that the specific marking display process is not limited in the embodiment of the present application.
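- As one possible rendering, the keywords could be wrapped in red-styled markup. The sketch below assumes the results are shown as HTML, which the source does not specify; the `<em style="color:red">` tag is just one illustrative choice among the bold, underline, or colour options listed above.

```python
import html
import re
from typing import Iterable


def mark_keywords(text: str, keywords: Iterable[str]) -> str:
    """Escape text for HTML and wrap every keyword occurrence in a red <em> tag."""
    # Longest keywords first so a longer keyword wins over one of its substrings.
    ordered = sorted({k for k in keywords if k}, key=len, reverse=True)
    if not ordered:
        return html.escape(text)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        parts.append(f'<em style="color:red">{html.escape(match.group())}</em>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Escaping the surrounding text keeps markup characters in a translated title or summary from being interpreted as HTML.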
- the search result corresponding to the target language search term may be used as the first search result, a second search result corresponding to the source language search term may be obtained, and the first search result, in target language form and/or source language form, may be mixed with the second search result; thus, search results derived from multi-language databases can be provided to the user.
- the searching may be performed in a database of the source language to obtain a second search result corresponding to the source language search term.
- the cross-language search method of the embodiment of the present application obtains keywords matching the target language search words from the source language translation results, and displays the keywords in the source language translation results.
- the coverage rate of the red content in the source language translation result corresponding to the target language search term is improved, thereby improving the efficiency of obtaining information from the source language translation result corresponding to the target language search term.
- referring to FIG. 3, a flow chart of the steps of the second embodiment of the cross-language search method of the present application is shown, which may specifically include the following steps:
- Step 301: Translate a source language search term into a target language search term;
- Step 302: Obtain, for the target language search term, a corresponding target language search result and a source language translation result; wherein the source language translation result may be obtained by translating the target language search result;
- Step 303: Obtain a keyword that matches the target language search term from the source language translation result;
- Step 304: If the matching degree between the keyword and the source language search term meets a preset matching degree condition, mark the keyword in the source language translation result.
- in the second embodiment of the method shown in FIG. 3, the process of marking and displaying the keyword in the source language translation result may include: if the matching degree between the keyword and the source language search term meets the preset matching degree condition, marking and displaying the keyword in the source language translation result. Since the matching degree between the marked keyword and the source language search term meets the preset matching degree condition, the association between the marked keyword and the source language search term is strengthened, so that the marked keyword better fits the user's information needs.
- the matching degree between the keyword and the source language search term may include: a similarity between the keyword and the source language search term, and the preset matching degree condition may include: the similarity exceeds a similarity threshold.
- a first word vector corresponding to the keyword and a second word vector corresponding to the source language search term may be determined separately, and the similarity between the first word vector and the second word vector may then be computed with a similarity measure such as Euclidean distance or cosine similarity.
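- The threshold test can be sketched with cosine similarity over two word vectors. The vectors and the default threshold of 0.8 are hypothetical; the source does not say how the word vectors are obtained or what threshold is used.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_mark(
    keyword_vector: Sequence[float],
    query_vector: Sequence[float],
    threshold: float = 0.8,  # hypothetical similarity threshold
) -> bool:
    """Mark the keyword only if its similarity to the search term exceeds the threshold."""
    return cosine_similarity(keyword_vector, query_vector) > threshold
```

If Euclidean distance were used instead, the condition would flip direction, since a smaller distance means a closer match.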
- For example, the matching degree between the keyword “March” and the source language search term “March” may be determined, and if the matching degree meets the preset matching degree condition, the keyword “March” in the source language translation result may be marked.
- Similarly, the matching degree between the keyword “Disney” and the source language search term “Disney” may be determined, and if the matching degree meets the preset matching degree condition, the keyword “Disney” in the source language translation result may be marked.
- If the matching degree does not meet the preset matching degree condition, the keyword in the source language translation result may be left unmarked, so that keywords with low relevance to the source language search term are not marked.
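- The matching degree check described above can be sketched as follows. This is a minimal illustration that assumes the word vectors are already available as plain lists of floats and that cosine similarity is the chosen measure; the function names and the 0.8 threshold are hypothetical and do not come from the application.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_u * norm_v)


def should_mark(keyword_vec, query_vec, threshold=0.8):
    """Preset matching degree condition: similarity exceeds the threshold.

    The default threshold is an assumed value for illustration only.
    """
    return cosine_similarity(keyword_vec, query_vec) > threshold
```

- A keyword whose vector fails the check is simply left unmarked; a Euclidean distance measure could be substituted by inverting the comparison (a smaller distance means a closer match).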
- Referring to FIG. 4, a flow chart of the steps of the third embodiment of the cross-language search method of the present application is shown, which may specifically include the following steps:
- Step 401 Translate a source language search term into a target language search term
- Step 402 Search for a target language search result and a source language translation result for the target language search term; wherein the source language translation result may be obtained by translating the target language search result;
- Step 403 Determine whether the source language translation result includes the source language search term.
- Step 404 If the source language translation result does not include the source language search term, obtain a keyword that matches the target language search term from the source language translation result;
- Step 405 Mark the keyword in the source language translation result.
- Before acquiring the keyword matching the target language search term from the source language translation result, the third embodiment of the method shown in FIG. 4 determines whether the source language translation result includes the source language search term; if it does not, a keyword matching the target language search term is obtained from the source language translation result and marked there. In this way, the source language translation result can still be marked in red even when it does not contain the source language search term.
- the embodiment of the present application does not limit the execution condition corresponding to “obtaining a keyword matching the target language search term from the source language translation result”.
- A keyword matching the target language search term may also be acquired from the source language translation result under other conditions, so that both the source language search term and the keyword are marked in red in the source language translation result.
- the method of the embodiment of the present application may further include: if the source language translation result includes the source language search term, performing markup display on the source language search term in the source language translation result.
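- The branching in steps 403 to 405 can be sketched as follows. This is only an illustrative sketch: the `<em>` tags stand in for whatever red marking the display layer uses, and `find_keyword` is a hypothetical callback that returns a keyword matching the target language search term (or `None`).

```python
def mark(text, term, open_tag="<em>", close_tag="</em>"):
    """Wrap every occurrence of `term` in `text` with marker tags."""
    return text.replace(term, f"{open_tag}{term}{close_tag}")


def mark_translation(translation, source_term, find_keyword):
    """Mark the source term if present; otherwise fall back to a keyword."""
    if source_term in translation:
        # The translation already contains the user's own search term.
        return mark(translation, source_term)
    # Step 404: look for a keyword that matches the target language term.
    keyword = find_keyword(translation)
    if keyword is None:
        return translation  # nothing to mark
    # Step 405: mark the keyword instead.
    return mark(translation, keyword)
```

- For instance, with a callback that recognises “Champ”, the text “Champ gave a speech” is marked even though the source language search term “Trump” does not occur in it.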
- Referring to FIG. 5, a flow chart of the steps of the fourth embodiment of the cross-language search method of the present application is shown, which may specifically include the following steps:
- Step 501 Translate a source language search term into a target language search term
- Step 502 Search for a target language search result and a source language translation result for the target language search term; wherein the source language translation result may be obtained by translating the target language search result;
- Step 503 Obtain, from the source language translation result, a keyword that matches the target language search term.
- Step 504 Mark the keyword in the source language translation result.
- the fourth embodiment of the method shown in FIG. 5 may further include:
- Step 505 Acquire at least two translation results corresponding to the target language search term.
- Step 506 Mark content displayed in the source language translation result that matches at least two translation results corresponding to the target language search term.
- Because the source language translation result is obtained by translating the target language search result, it often admits multiple possible translations. If only the content of the source language translation result that matches the source language search term is marked in red, the coverage of the red-marked content is low.
- For example, the source language translation result may contain several possible translations, such as “Trump” and “Champ”. If the source language search term is “Trump”, only the translation “Trump” can be marked in red, while other translations such as “Champ” cannot, so the coverage of the red-marked content is low.
- The embodiment of the present application can obtain at least two translation results corresponding to the target language search term, and mark the content of the source language translation result that matches those at least two translation results. Since the at least two translation results reflect at least two possible translations of the target language search term into the source language, at least two possible translations can be marked in red in the source language translation result. This improves the coverage of red-marked content in the source language translation result and, in turn, the efficiency of obtaining information from the search results corresponding to the target language search term.
- For example, if the target language search term is “Trump”, the source language translation result of the target language search result corresponding to “Trump” may contain possible translations such as “Trump” and “Champ”. The embodiment of the present application can obtain at least these two possible translations and mark both in red in the source language translation result, thereby improving the coverage of the red-marked content.
- To obtain the at least two translation results corresponding to the target language search term, the embodiment of the present application may provide the following acquisition schemes:
- Acquisition scheme 1: translate the target language search term from the target language into the source language, and select at least two source language translation results from the results obtained by the translation.
- For example, a translation model may be used to translate the target language search term into multiple translation results, from which at least two are selected.
- Selecting at least two source language translation results from the results obtained by the translation may include: selecting, from those results, at least two source language translation results whose confidence meets a preset confidence condition. For example, translating “Trump” may yield multiple translation results, such as “Champ”, “Trump”, “Ace”, “Good Man”, “Honest Man”, “Means”, and “Beyond”; from these, at least two translation results with higher confidence, such as “Champ” and “Trump”, are selected.
- The confidence of a source language translation result may be the translation probability from the target language search term to that source language translation result, and the translation probability may be calculated by the translation model. It can be understood that the embodiment of the present application does not limit how the confidence of a source language translation result is obtained.
- Alternatively, selecting at least two source language translation results from the results obtained by the translation may include: selecting at least two translation results from those results according to the part of speech of the source language search term.
- This optional embodiment filters the translation results according to the part of speech of the source language search term, so that the selected translation results have the same part of speech as the source language search term, which improves the accuracy of the translation results.
- For example, if the part of speech of the source language search term is “noun”, the verb “Beyond” can be filtered out; or, if it is “proper noun”, the verb “Beyond” and the non-proper nouns “Ace”, “Good Man”, “Honest Man”, and “Means” can be filtered out.
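- The two filters of acquisition scheme 1 can be sketched together as follows. The `Candidate` type, the field names, and the confidence numbers in the usage note are hypothetical placeholders; the application only states that a translation model may supply a translation probability and that the part of speech of the source language search term may be used for filtering.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str          # candidate source language translation
    confidence: float  # e.g. a translation probability from the model
    pos: str           # part-of-speech tag of the candidate


def select_translations(candidates, min_confidence=None, allowed_pos=None):
    """Keep candidates passing the optional confidence and POS filters.

    Returns candidate texts ordered from highest to lowest confidence.
    """
    kept = [
        c for c in candidates
        if (min_confidence is None or c.confidence >= min_confidence)
        and (allowed_pos is None or c.pos in allowed_pos)
    ]
    kept.sort(key=lambda c: c.confidence, reverse=True)
    return [c.text for c in kept]
```

- With made-up values such as `Candidate("Champ", 0.45, "proper_noun")`, `Candidate("Trump", 0.40, "proper_noun")`, `Candidate("Ace", 0.08, "noun")`, and `Candidate("Beyond", 0.02, "verb")`, a confidence floor of 0.3 or `allowed_pos={"proper_noun"}` both leave “Champ” and “Trump”.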
- Acquisition scheme 2: in the process of translating the target language search result from the target language into the source language, record the translation results corresponding to the target content in the target language search result, so as to obtain at least two translation results corresponding to the target language search term; wherein the target content is the content in the target language search result that is the same as the target language search term.
- The background and context of a target language search result can cause the same content to be translated differently in different backgrounds and contexts.
- For example, “Apple” can be translated as “fruit”, as “Apple Phone”, or as “Bomb”.
- Acquisition scheme 2 therefore records the translation results corresponding to the target content while the target language search result is being translated from the target language into the source language. Since the target content is the content in the target language search result that is the same as the target language search term, at least two translation results corresponding to the target language search term can be obtained from the recorded content after the translation is completed.
- For example, if the target language search term is “Trump”, the corresponding target content is also “Trump”. During translation, how each occurrence of “Trump” in the target language search result is translated can be recorded; depending on the background or context, it may be translated as “Trump”, “Champ”, “Ace”, and so on, and all translation results that occur are recorded to obtain at least two translation results corresponding to the target language search term.
- Recording the translation result corresponding to the target content in the search result may include: recording every translation result corresponding to the target content; or comparing the current translation result corresponding to the target content with the translation results already recorded, recording it if it differs and not recording it if it is the same.
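- The deduplicating variant of the recording step in acquisition scheme 2 can be sketched as follows. The sketch assumes, hypothetically, that the translator exposes aligned pairs of (target language fragment, source language rendering); the application does not specify such an interface.

```python
def record_translations(aligned_pairs, target_term):
    """Collect the distinct renderings of `target_term` seen during translation.

    `aligned_pairs` is an iterable of (target_fragment, source_rendering)
    tuples produced while translating the target language search result.
    """
    recorded = []
    for target_fragment, source_rendering in aligned_pairs:
        if target_fragment != target_term:
            continue  # not the target content
        # Compare with what is already recorded; record only if different.
        if source_rendering not in recorded:
            recorded.append(source_rendering)
    return recorded
```

- The returned list keeps first-seen order, so it can be passed directly to whatever step marks the matching content.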
- The process of obtaining the at least two translation results corresponding to the target language search term has been described in detail above through acquisition scheme 1 and acquisition scheme 2. It can be understood that those skilled in the art may, according to actual application requirements, adopt either acquisition scheme 1 or acquisition scheme 2, a combination of the two, or another acquisition scheme. The embodiment of the present application does not limit the specific process of obtaining the at least two translation results corresponding to the target language search term.
- In summary, the cross-language search method in the embodiment of the present application can translate the target language search result from the target language into the source language to obtain the source language translation result, and mark the content of the source language translation result that matches the at least two translation results corresponding to the target language search term. Since the at least two translation results reflect at least two possible translations of the target language search term into the source language, at least two possible translations can be marked in red in the source language translation result. This improves the coverage of red-marked content in the source language translation result corresponding to the target language search term, and thereby improves the efficiency with which the user obtains information from the source language translation result.
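- Marking every piece of content that matches any of the translation results (step 506) can be sketched with a single regular expression pass. The `<em>` tags are a stand-in for the red marking, and the function name is hypothetical.

```python
import re


def mark_all(text, translations, open_tag="<em>", close_tag="</em>"):
    """Wrap each occurrence of any translation in `text` with marker tags."""
    # Drop empty strings and try longer alternatives first, so that a short
    # translation never pre-empts a longer one starting at the same position.
    ordered = sorted({t for t in translations if t}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)
```

- Passing only the source language search term reproduces the low-coverage case discussed above; passing both translation results marks both.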
- Referring to FIG. 6, a structural block diagram of an embodiment of a cross-language search device of the present application is shown, which may specifically include: a search term translation module 601, a result acquisition module 602, a keyword acquisition module 603, and a mark display module 604.
- the search term translation module 601 is configured to translate the source language search term into a target language search term
- a result obtaining module 602 configured to search for a target language search result and a source language translation result for the target language search term; wherein the source language translation result is obtained by translating the target language search result;
- the keyword obtaining module 603 is configured to acquire, from the source language translation result, a keyword that matches the target language search term;
- the mark display module 604 is configured to mark the keyword in the source language translation result.
- the keyword obtaining module 603 may include:
- a mapping establishing submodule configured to establish a mapping relationship between the target language vocabulary that the target language search result may include and the source language vocabulary that the source language translation result may include;
- a mapping search submodule configured to search the mapping relationship according to the target language search term, to obtain a keyword that is included in the source language translation result and that matches the target language search term.
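- The two submodules above can be sketched as a dictionary build followed by a lookup. How the aligned vocabulary pairs are produced is not specified here, so `aligned_vocab` is a hypothetical input of (target language word, source language word) pairs.

```python
def build_mapping(aligned_vocab):
    """Map each target language word to the source language words it was rendered as."""
    mapping = {}
    for target_word, source_word in aligned_vocab:
        renderings = mapping.setdefault(target_word, [])
        if source_word not in renderings:
            renderings.append(source_word)
    return mapping


def find_keywords(mapping, target_term, translation):
    """Return the mapped source language words that occur in the translation."""
    return [w for w in mapping.get(target_term, []) if w in translation]
```

- A target language search term with no entry in the mapping simply yields an empty keyword list.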
- the mark display module 604 is configured to: if the matching degree between the keyword and the source language search term meets a preset matching degree condition, mark the keyword in the source language translation result.
- the device may further include:
- a determining module configured to determine, before the keyword obtaining module acquires a keyword matching the target language search term from the source language translation result, whether the source language translation result includes the source language search term, and, if the source language translation result does not include the source language search term, to trigger the keyword obtaining module to acquire a keyword that matches the target language search term from the source language translation result.
- the device may further include:
- a second mark display module configured to mark the source language search term in the source language translation result if the source language translation result includes the source language search term.
- the device may further include:
- a translation result obtaining module configured to acquire at least two translation results corresponding to the target language search term
- the third mark display module is configured to perform mark display on the content of the source language translation result that matches at least two translation results corresponding to the target language search term.
- the source language translation result obtaining submodule may include:
- a screening unit configured to translate the target language search term from the target language into the source language, and select at least two source language translation results from the results obtained by the translation.
- the screening unit may include:
- a first screening subunit configured to select, from the results obtained by the translation, at least two source language translation results whose confidence meets a preset confidence condition;
- a second screening subunit configured to select at least two translation results from the results obtained by the translation according to the part of speech of the source language search term.
- For the device embodiment, the description is relatively brief; for the relevant parts, refer to the description of the method embodiment.
- Embodiments of the present application disclose an apparatus for cross-language search, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for: translating a source language search term into a target language search term; obtaining, for the target language search term, a corresponding target language search result and a source language translation result, wherein the source language translation result is obtained by translating the target language search result; obtaining a keyword matching the target language search term from the source language translation result; and marking the keyword in the source language translation result.
- the obtaining, from the source language translation result, of a keyword that matches the target language search term includes:
- searching the mapping relationship according to the target language search term, to obtain a keyword that is included in the source language translation result and matches the target language search term.
- the flag display comprises:
- the keyword in the source language translation result is marked and displayed.
- the apparatus is further configured to execute, by the one or more processors, the one or more programs to include instructions for:
- if the source language translation result does not include the source language search term, performing the step of acquiring a keyword that matches the target language search term from the source language translation result.
- the apparatus is further configured to execute, by the one or more processors, the one or more programs to include instructions for:
- the source language search term in the source language translation result is marked and displayed.
- the apparatus is further configured to execute, by the one or more processors, the one or more programs to include instructions for:
- the obtaining the at least two translation results corresponding to the target language search term includes:
- the translation results obtained by translating the results obtained from the translation into at least two source languages include:
- At least two translation results are screened from the results obtained by the translation.
- FIG. 7 is a block diagram of a device 900 for cross-language search as a terminal, according to an exemplary embodiment.
- device 900 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
- device 900 can include one or more of the following components: processing component 902, memory 904, power component 906, multimedia component 908, audio component 910, input/output (I/O) interface 912, sensor component 914, and communication component 916.
- Processing component 902 typically controls the overall operation of device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- Processing component 902 can include one or more processors 920 to execute instructions to perform all or part of the steps described above.
- processing component 902 can include one or more modules to facilitate interaction between component 902 and other components.
- processing component 902 can include a multimedia module to facilitate interaction between multimedia component 908 and processing component 902.
- Memory 904 is configured to store various types of data to support operation at device 900. Examples of such data include instructions for any application or method operating on device 900, contact data, phone book data, messages, pictures, videos, and the like.
- the memory 904 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- Power component 906 provides power to various components of device 900.
- Power component 906 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 900.
- the multimedia component 908 includes a screen that provides an output interface between the device 900 and the user.
- the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding motion action, but also the duration and pressure associated with the touch or slide operation.
- the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 910 is configured to output and/or input an audio signal.
- audio component 910 includes a microphone (MIC) that is configured to receive an external audio signal when device 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
- the received audio signal may be further stored in memory 904 or transmitted via communication component 916.
- the audio component 910 also includes a speaker for outputting an audio signal.
- the I/O interface 912 provides an interface between the processing component 902 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
- Sensor assembly 914 includes one or more sensors for providing device 900 with various aspects of status assessment.
- For example, sensor component 914 can detect an open/closed state of device 900 and the relative positioning of components, such as the display and keypad of device 900; sensor component 914 can also detect a change in position of device 900 or of one component of device 900, the presence or absence of user contact with device 900, the orientation or acceleration/deceleration of device 900, and temperature changes of device 900.
- Sensor assembly 914 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
- Sensor assembly 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor component 914 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- Communication component 916 is configured to facilitate wired or wireless communication between device 900 and other devices.
- the device 900 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
- the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel.
- the communication component 916 also includes a near field communication (NFC) module to facilitate short range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- device 900 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
- A non-transitory computer readable storage medium comprising instructions is also provided, such as a memory 904 comprising instructions executable by processor 920 of device 900 to perform the above method.
- the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
- FIG. 8 is a block diagram of a device for cross-language search as a server, according to an exemplary embodiment.
- the server 1900 can vary considerably depending on configuration or performance, and can include one or more central processing units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing application programs 1942 or data 1944.
- the memory 1932 and the storage medium 1930 may provide transient storage or persistent storage.
- the program stored on storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations in the server.
- central processor 1922 can be configured to communicate with storage medium 1930 and to execute, on the server 1900, the series of instruction operations stored in storage medium 1930.
- Server 1900 may also include one or more power sources 1926, one or more wired or wireless network interfaces 1950, one or more input and output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
- a non-transitory computer readable storage medium, wherein instructions in the storage medium, when executed by a processor of a device (terminal or server), enable the device to perform a cross-language search method, the method comprising: translating a source language search term into a target language search term; obtaining, for the target language search term, a corresponding target language search result and a source language translation result, wherein the source language translation result is obtained by translating the target language search result; obtaining, from the source language translation result, a keyword that matches the target language search term; and marking the keyword in the source language translation result.
- The cross-language search method, the cross-language search device, and the device for cross-language search provided by the present application have been described in detail above.
- Specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is only intended to help understand the method of the present application and its core ideas; at the same time, those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to a cross-language search method and apparatus, and an apparatus for cross-language search. The method comprises in particular: translating a source language search term into a target language search term; for the target language search term, acquiring a corresponding target language search result and source language translation result, the source language translation result being obtained by translating the target language search result; acquiring, from the source language translation result, a keyword matching the target language search term; and marking and displaying the keyword in the source language translation result. The embodiments of the present invention improve the coverage of red-marked content in a source language translation result corresponding to a target language search term, and thus improve the efficiency of acquiring information from the source language translation result corresponding to the target language search term.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201711298752.0A CN108255940A (zh) | 2017-12-08 | 2017-12-08 | 一种跨语言搜索方法和装置、一种用于跨语言搜索的装置 |
| CN201711298752.0 | 2017-12-08 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019109663A1 (fr) | 2019-06-13 |
Family
ID=62722397
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2018/100990 Ceased WO2019109663A1 (fr) | 2017-12-08 | 2018-08-17 | Procédé et appareil de recherche interlingue, et appareil de recherche interlingue |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN108255940A (fr) |
| WO (1) | WO2019109663A1 (fr) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108255940A (zh) * | 2017-12-08 | 2018-07-06 | 北京搜狗科技发展有限公司 | 一种跨语言搜索方法和装置、一种用于跨语言搜索的装置 |
| CN109408822A (zh) * | 2018-10-30 | 2019-03-01 | 中译语通科技股份有限公司 | 一种跨语言图书相关度分析方法及系统 |
| CN112528129B (zh) * | 2019-08-28 | 2024-03-22 | 北京搜狗科技发展有限公司 | 多语种翻译系统语种搜索方法及装置 |
| CN111126087B (zh) * | 2019-12-25 | 2023-08-29 | 北京百度网讯科技有限公司 | 领域翻译处理方法、装置及设备 |
| CN112163433B (zh) * | 2020-09-29 | 2022-04-05 | 北京字跳网络技术有限公司 | 关键词汇的匹配方法、装置、电子设备及存储介质 |
| CN116304126A (zh) * | 2023-03-14 | 2023-06-23 | 北京字跳网络技术有限公司 | 跨越自然语言的媒体搜索方法、装置、设备和介质 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140149445A1 (en) * | 2012-11-26 | 2014-05-29 | Inventec Corporation | Webpage querying system and querying method thereof |
| CN106599206A (zh) * | 2016-12-15 | 2017-04-26 | 北京小米移动软件有限公司 | 搜索信息的方法及装置 |
| CN106919642A (zh) * | 2017-01-13 | 2017-07-04 | 北京搜狗科技发展有限公司 | 一种跨语言搜索方法和装置、一种用于跨语言搜索的装置 |
| CN108255940A (zh) * | 2017-12-08 | 2018-07-06 | 北京搜狗科技发展有限公司 | 一种跨语言搜索方法和装置、一种用于跨语言搜索的装置 |
| CN108304412A (zh) * | 2017-01-13 | 2018-07-20 | 北京搜狗科技发展有限公司 | 一种跨语言搜索方法和装置、一种用于跨语言搜索的装置 |
| CN108334526A (zh) * | 2017-01-20 | 2018-07-27 | 北京搜狗科技发展有限公司 | 搜索结果项的展示方法和装置 |
-
2017
- 2017-12-08 CN CN201711298752.0A patent/CN108255940A/zh active Pending
-
2018
- 2018-08-17 WO PCT/CN2018/100990 patent/WO2019109663A1/fr not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN108255940A (zh) | 2018-07-06 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 18885996; Country of ref document: EP; Kind code of ref document: A1 |