CN108073572B - Information processing method and device, simultaneous interpretation system

Info

Publication number: CN108073572B
Application number: CN201611031049.9A
Authority: CN
Inventors: 姜里羊; 王宇光; 陈伟; 程善伯
Original assignee: Beijing Sogou Technology Development Co Ltd
Current assignee: Beijing Sogou Technology Development Co Ltd
Priority date: 2016-11-16
Filing date: 2016-11-16
Publication date: 2022-01-11
Anticipated expiration: 2036-11-16
Also published as: CN108073572A

Abstract

The embodiment of the invention provides an information processing method, an information processing device and a simultaneous interpretation system, wherein the method specifically comprises the following steps: acquiring text information corresponding to a voice signal of a speaking user; according to the clause information contained in the text information, obtaining the clause of which the clause information meets the preset condition from the text information, and using the clause as a target clause needing machine translation currently; and sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language. The embodiment of the invention can control the clauses which need to be translated by a machine at present so as to avoid the situation that the sentence sent to the machine translation device is too long or too short, thereby effectively improving the accuracy rate and the real-time rate of translation.

Description

Information processing method and device, simultaneous interpretation system

Technical Field

The present invention relates to the field of information processing technologies, and in particular, to an information processing method and apparatus, an apparatus for information processing, and a simultaneous interpretation system.

Background

The simultaneous interpretation refers to an interpretation mode that a simultaneous interpreter continuously interprets the speech content of a speaker to listeners without interrupting the speech of the speaker. At present, the simultaneous interpretation technology is widely applied to scenes such as large conferences, lectures, exhibitions, scenic spots and the like. Taking a conference scene as an example, in the conference process, a simultaneous interpretation person sits in a sound insulation room, uses professional equipment to interpret the contents heard from an earphone synchronously into a target language and outputs the target language through a microphone; meanwhile, the conference participants needing the simultaneous interpretation service can obtain the interpreted information from the earphones.

However, in practice, the work intensity of simultaneous interpreters is very high, because they must continuously and synchronously interpret the content heard through the headphones into the target language. For this reason, interpretation is usually carried out by 2 simultaneous interpreters working alternately, each interpreting for about half an hour at a time, which incurs a high labor cost.

In order to save labor cost, an existing scheme may convert a speech signal of a speaking user into text through a speech-to-text conversion device, perform machine translation on the text to obtain text in a target language, and output speech corresponding to the text in the target language. In this existing scheme, the speech-to-text conversion device determines the end of a sentence according to the interval time of the speech signal and, after determining the end of the sentence, transmits the corresponding sentence to the machine translation device.

However, in the process of implementing the present invention, the inventors found that different speaking users often have different speech rates, so judging the end of a sentence according to the interval time of the speech signal, as the existing scheme does, impairs the accuracy of the judgment. For example, when the speaking user's speech rate is too fast, a fixed interval results in a sentence that is too long; when the speech rate is too slow, a fixed interval results in a sentence that is too short. Machine translation usually requires a sentence with complete meaning. If the sentence is too long, the real-time rate of translation decreases, and the excess context also lowers translation accuracy; if the sentence is too short, the context cannot be taken into account, so translation accuracy is likewise affected.

Disclosure of Invention

In view of the above problems, embodiments of the present invention provide an information processing method, an information processing apparatus, an apparatus for information processing, and a simultaneous interpretation system that overcome or at least partially solve the above problems. They can control which clauses currently need to be machine-translated, so as to avoid sending the machine translation device a sentence that is too long or too short, thereby effectively improving the accuracy and real-time rate of translation.

In order to solve the above problem, the present invention discloses an information processing method, comprising:

acquiring text information corresponding to a voice signal of a speaking user;

according to the clause information contained in the text information, obtaining the clause of which the clause information meets the preset condition from the text information, and using the clause as a target clause needing machine translation currently;

and sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language.

Optionally, the clause information includes the number of clauses and the number of words, and the step of obtaining, from the text information, the clause whose clause information meets the preset condition according to the clause information contained in the text information includes:

if the number of preceding clauses in the text information exceeds a first number threshold and the word number of the preceding clauses exceeds a first word number threshold, taking the preceding clauses as target clauses needing machine translation currently; or

if the difference D between the number of preceding clauses in the text information and the delay threshold is a multiple of a second number threshold and the word number of the preceding clauses exceeds a second word number threshold, taking the preceding D clauses as target clauses needing machine translation currently; wherein D is a positive integer.

if the total word number of the text information does not exceed a third word number threshold and a preset flag bit exists in the text information, if the number of preceding clauses in the text information exceeds a first number threshold and the word number of the preceding clauses exceeds a first word number threshold, taking the preceding clauses as target clauses needing machine translation currently; or

if the total word number of the text information exceeds a third word number threshold and no preset flag bit exists in the text information, if the difference D between the number of preceding clauses and a delay threshold in the text information is a multiple of a second number threshold and the word number of the preceding clauses exceeds the second word number threshold, taking the preceding D clauses as target clauses needing machine translation currently; wherein D is a positive integer.
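The delay-based condition above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function name `select_by_delay`, the default threshold values, and the choice to count non-space characters of the leading D clauses as the "word number" are all assumptions made for illustration.

```python
from typing import List, Optional


def select_by_delay(
    clauses: List[str],
    delay_threshold: int = 1,          # hypothetical value
    second_number_threshold: int = 2,  # hypothetical value
    second_word_threshold: int = 10,   # hypothetical value
) -> Optional[List[str]]:
    """Return the leading D clauses if the delay-based condition holds, else None."""
    # D is the difference between the clause count and the delay threshold.
    d = len(clauses) - delay_threshold
    # D must be a positive integer and a multiple of the second number threshold.
    if d <= 0 or d % second_number_threshold != 0:
        return None
    leading = clauses[:d]
    # Assumption: the "word number" is the count of non-space characters.
    word_count = sum(len(c.replace(" ", "")) for c in leading)
    if word_count <= second_word_threshold:
        return None
    return leading
```

With three clauses and a delay threshold of 1, D is 2, so the first two clauses are released for translation while the most recent clause is held back until more text arrives.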

Optionally, the step of obtaining, from the text information, a clause whose clause information meets a preset condition according to clause information included in the text information further includes:

and after the D preceding clauses are taken as the target clauses needing machine translation currently, if a first preset punctuation mark exists in the text information, taking the first preset punctuation mark and the characters before the first preset punctuation mark as the target clauses needing machine translation currently.

Optionally, the method further comprises:

acquiring a first clause with punctuation marks as first preset punctuation marks from the text information;

the step of obtaining the clause with the clause information meeting the preset condition from the text information according to the clause information contained in the text information includes:

and acquiring a first clause of which the clause information meets a preset condition from the text information according to the first clause information contained in the text information.

Optionally, the method further comprises:

writing the acquired text information into a cache region;

and reading the text information from the cache region, and acquiring clauses of which the clause information meets preset conditions from the read text information according to the clause information contained in the read text information.

Optionally, the method further comprises:

and acquiring a second clause with punctuation marks as second preset punctuation marks from the text information, and taking the second clause contained in the text information and clauses before the second clause as target clauses needing machine translation currently.

In another aspect, the present invention discloses an information processing apparatus comprising:

the text acquisition module is used for acquiring text information corresponding to a voice signal of a speaking user;

the target clause acquiring module is used for acquiring clauses of which the clause information meets preset conditions from the text information according to clause information contained in the text information, and the clauses are used as target clauses needing machine translation at present;

and the target clause sending module is used for sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language.

Optionally, the clause information includes the number of clauses and the number of words, and the target clause acquiring module comprises:

a first target clause acquiring submodule, configured to, when the number of preceding clauses in the text information exceeds a first number threshold and the number of words of the preceding clauses exceeds a first word number threshold, take the preceding clauses as target clauses that need to be machine-translated currently; or

a second target clause acquiring submodule, configured to, when a difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold and a number of words of the preceding clauses exceeds a second word number threshold, take the preceding D clauses as target clauses currently needing machine translation; wherein D is a positive integer.

a third target clause acquiring submodule, configured to, when the total word number of the text information does not exceed a third word number threshold and a preset flag bit exists in the text information, if the number of preceding clauses in the text information exceeds a first number threshold and the word number of the preceding clauses exceeds a first word number threshold, take the preceding clauses as target clauses that need to be machine-translated currently; or

a fourth target clause acquiring submodule, configured to, when the total word number of the text information exceeds a third word number threshold and a preset flag bit does not exist in the text information, if a difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold and the word number of the preceding clauses exceeds the second word number threshold, take the preceding D clauses as target clauses currently needing machine translation; wherein D is a positive integer.

Optionally, the target clause obtaining module further includes:

and a fifth target clause acquiring submodule, configured to, after the preceding D clauses are taken as target clauses currently needing to be machine-translated, if a first preset punctuation mark exists in the text information, take the first preset punctuation mark and characters before the first preset punctuation mark as the target clauses currently needing to be machine-translated.

Optionally, the apparatus further comprises:

the first clause acquisition module is used for acquiring a first clause with punctuation marks as first preset punctuation marks from the text information;

the target clause obtaining module includes:

and the sixth target clause acquiring submodule is used for acquiring the first clause of which the clause information meets the preset condition from the text information according to the first clause information contained in the text information.

Optionally, the apparatus further comprises:

the writing module is used for writing the acquired text information into a cache region;

the target clause obtaining module includes:

and the seventh target clause acquiring submodule is used for reading the text information from the cache region and acquiring clauses of which the clause information meets the preset conditions from the read text information according to the clause information contained in the read text information.

Optionally, the apparatus further comprises:

and the second clause processing module is used for acquiring a second clause with punctuation marks as second preset punctuation marks from the text information, and taking the second clause contained in the text information and clauses before the second clause as target clauses needing machine translation currently.

In yet another aspect, an apparatus for information processing is disclosed that includes a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured for execution by the one or more processors to include instructions for:

acquiring text information corresponding to a voice signal of a speaking user;

In another aspect, the present invention discloses a simultaneous interpretation system, comprising:

the first conversion device is used for converting a voice signal of a speaking user into text information;

the aforementioned information processing apparatus;

and the machine translation device is used for translating the target clause obtained by the information processing device into characters in the target language and outputting the characters.

The embodiment of the invention has the following advantages:

the embodiment of the invention does not directly perform machine translation on the text information, but performs machine translation on the target clause after the text information is further processed according to the clause information contained in the text information to obtain the target clause from which the clause information meets the preset condition, namely, the embodiment of the invention can control the clause which needs to be subjected to machine translation currently according to the clause information contained in the text information so as to avoid the situation that the sentence sent to a machine translation device is too long or too short, thereby effectively improving the accuracy and the real-time rate of translation.

Drawings

FIG. 1 is a schematic diagram of an exemplary structure of a simultaneous interpretation system of the present invention;

FIG. 2 is a flowchart illustrating steps of a first embodiment of an information processing method according to the present invention;

FIG. 3 is a flowchart illustrating steps of a second embodiment of an information processing method according to the present invention;

FIG. 4 is a block diagram of an embodiment of an information processing apparatus according to the present invention;

FIG. 5 is a block diagram illustrating an apparatus for information processing as a terminal according to an exemplary embodiment; and

FIG. 6 is a block diagram illustrating an apparatus for information processing as a server according to an exemplary embodiment.

Detailed Description

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

The embodiment of the invention provides an information processing scheme which can acquire text information corresponding to a voice signal of a speaking user, acquire clauses of which the information meets preset conditions from the text information according to clause information contained in the text information, serve as target clauses needing to be translated by a machine at present, and send the target clauses to a machine translation device, so that the machine translation device translates the target clause information into characters of a target language.

In the embodiment of the present invention, the target clause currently required to be machine-translated refers to a clause capable of matching information included in the current text information, that is, the embodiment of the present invention does not directly machine-translate the text information, but performs machine translation of the target clause after further processing the text information according to the clause information included in the text information to obtain the target clause from which the clause information meets the preset condition, that is, the embodiment of the present invention can control the clause currently required to be machine-translated according to the clause information included in the text information, so as to avoid the situation that the sentence sent to the machine translation device is too long or too short, and therefore, the accuracy and real-time rate of translation can be effectively improved.

The embodiment of the invention can be applied to various scenes needing simultaneous interpretation, such as large conferences, lectures, exhibitions, scenic spots and the like.

Referring to FIG. 1, an exemplary structural diagram of a simultaneous interpretation system of the present invention is shown, which may specifically include: a first conversion device 101, an information processing device 102, a machine translation device 103, and a second conversion device 104. The first conversion device 101, the information processing device 102, the machine translation device 103, and the second conversion device 104 may be provided as separate servers or may be provided together in the same server; that is, the embodiment of the present invention does not limit the specific locations of these devices.

Wherein the first conversion means 101 may be used to convert speech signals of a speaking user into text information. In practical applications, the speaking user may be a user speaking and sending a voice signal in the scene that needs simultaneous interpretation, and the voice signal of the speaking user may be received by a microphone or other voice collecting device, and the received voice signal is sent to the first conversion device 101; alternatively, the first conversion apparatus 101 may have a function of receiving a voice signal of a speaking user.

Optionally, the first conversion device 101 may employ speech recognition technology to convert the speech signal of the speaking user into text information. If the speech signal of the speaking user is denoted S, a series of processing steps applied to S yields a corresponding speech feature sequence O, denoted $O = \{O_1, O_2, \ldots, O_i, \ldots, O_T\}$, where $O_i$ is the i-th speech feature and T is the total number of speech features. The sentence corresponding to the speech signal S can be regarded as a word string composed of many words, denoted $W = \{w_1, w_2, \ldots, w_n\}$. The process of speech recognition is to find the most likely word string W given the known speech feature sequence O.
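The search for the most likely word string can be written as a maximization over W. The decomposition into an acoustic term $P(O \mid W)$ and a language-model term $P(W)$ is the standard textbook formulation obtained from Bayes' rule; it is given here as an assumption for illustration and is not stated in the source.

```latex
\hat{W} = \arg\max_{W} P(W \mid O)
        = \arg\max_{W} \frac{P(O \mid W)\, P(W)}{P(O)}
        = \arg\max_{W} P(O \mid W)\, P(W)
```

The last step holds because $P(O)$ does not depend on W and therefore does not change which word string attains the maximum.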

Specifically, the speech recognition is a model matching process, in which a speech model is first established according to the speech characteristics of a person, and a template required for the speech recognition is established by extracting required features through analysis of an input speech signal; the process of recognizing the voice input by the user is a process of comparing the characteristics of the voice input by the user with the template, and finally determining the best template matched with the voice input by the user so as to obtain a voice recognition result. The specific speech recognition algorithm may adopt a training and recognition algorithm based on a statistical hidden markov model, or may adopt other algorithms such as a training and recognition algorithm based on a neural network, a recognition algorithm based on dynamic time warping matching, and the like.

In a scenario requiring simultaneous interpretation, a speaking user may continuously generate a speech signal S, i.e. the speech signal S may correspond to one or more sentences. In an alternative embodiment of the present invention, the first conversion device 101 may judge the end of the sentence according to the interval time of the speech signal S.

However, in practical applications, different speaking users often have different speech rates, so judging the end of a sentence according to the interval time of the speech signal, as the existing scheme does, impairs the accuracy of the judgment. For example, when the speaking user's speech rate is too fast, a fixed interval results in a sentence that is too long; when the speech rate is too slow, a fixed interval results in a sentence that is too short. Machine translation usually requires a sentence with complete meaning. If the sentence is too long, the real-time rate of translation decreases, and the excess context also lowers translation accuracy; if the sentence is too short, the context cannot be taken into account, so translation accuracy is likewise affected.

To address the insufficient accuracy of the sentence-end determination described above, and the resulting problems with translation accuracy and real-time rate, the information processing device 102 may obtain, from the text information, a clause whose clause information meets a preset condition according to the clause information contained in the text information, and use that clause as the target clause currently requiring machine translation. In the embodiment of the present invention, the target clause currently requiring machine translation is a clause that matches the clause information contained in the current text information. In other words, the text information output by the first conversion device 101 is not sent directly to the machine translation device 103; instead, the information processing device 102 further processes it to obtain the clause that matches the clause information contained in the current text information, uses that clause as the target clause currently requiring machine translation, and sends the target clause to the machine translation device 103.

In the embodiment of the invention, a relatively independent single sentence form in a compound sentence (a complete sentence) is called a clause, and the clauses are generally paused and represented by commas or semicolons in writing; the clauses and the clauses have certain relation in meaning, and are connected by some related words (conjunctive words, adverbs with related functions or phrases).

Optionally, the first conversion device 101 may insert corresponding second preset punctuation marks into the text information corresponding to the speech signal of the speaking user, according to the interval time of the speech signal S and its language model. For example, the inserted second preset punctuation marks may include but are not limited to: the comma, the pause mark (enumeration comma), the period, and the like. The information processing device 102 may then obtain the clauses contained in the text information according to the second preset punctuation marks contained in the text information.
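As an illustration of obtaining clauses from punctuated text, the following minimal Python sketch splits text at a configurable set of punctuation marks and keeps each mark attached to its clause. The function name `split_clauses` and the contents of `SECOND_PRESET_MARKS` are assumptions chosen for the example; the source only lists the comma, pause mark, and period as possible marks.

```python
import re
from typing import List

# Hypothetical set of second preset punctuation marks (ASCII and full-width forms).
SECOND_PRESET_MARKS = ",，、。.;；"


def split_clauses(text: str, marks: str = SECOND_PRESET_MARKS) -> List[str]:
    """Split text into clauses, keeping the trailing punctuation mark with each clause."""
    escaped = re.escape(marks)
    # One clause = a run of non-mark characters followed by at most one mark.
    pattern = "[^" + escaped + "]+[" + escaped + "]?"
    clauses = []
    for match in re.finditer(pattern, text):
        clause = match.group(0).strip()
        if clause:
            clauses.append(clause)
    return clauses
```

A call such as `split_clauses("Hello everyone, welcome to the meeting. Let us begin")` yields three clauses, the last of which has no closing mark yet because the speaker has not finished it.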

Optionally, because the clauses contained in the current text information of an overly long sentence (hereinafter, a long sentence) or an overly short sentence (hereinafter, a short sentence) each have their own characteristics, the information processing device 102 may obtain the target clause that matches the clause information contained in the current text information according to that clause information.

The machine translation device 103 is configured to receive the target clause from the information processing device 102 and translate it into text in the target language. The machine translation device 103 may translate the target clause using machine translation technology, that is, the process of using a computer to convert text in one natural language (the source language) into text in another natural language (the target language). For example, the source language and the target language may be Chinese and English respectively, or English and Chinese respectively. The embodiments of the present invention do not limit the specific source language, target language, or machine translation technology.

The second conversion device 104 may receive the characters in the target language from the machine translation device 103, convert the characters in the target language into the speech in the target language, and output the speech. Alternatively, the second conversion device 104 may convert the words in the target language into the voice in the target language by using a text-to-voice conversion technology (e.g., a voice synthesis technology), and output the voice in the target language through a voice playing device such as an earphone or a speaker. It is understood that the embodiment of the present invention does not limit the specific process of converting the text in the target language into the speech in the target language and outputting the speech.

It should be noted that the second conversion device 104 is only an optional device in an optional embodiment of the present invention, and actually, in other embodiments of the present invention, the text in the target language obtained by the machine translation device 103 may be directly output, for example, the text in the target language is displayed on a display device such as a screen for the user to view.
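The data flow among the four devices can be sketched in Python as follows. This is a schematic under stated assumptions, not the patented system: `run_pipeline` and its callable parameters are hypothetical stand-ins for devices 101 to 104, and the `select_target` callable is assumed to return a pair of the target clause (or `None`) and the text that remains pending.

```python
from typing import Callable, List, Optional, Tuple, Union


def run_pipeline(
    speech_chunks: List[bytes],
    speech_to_text: Callable[[bytes], str],                      # stands in for device 101
    select_target: Callable[[str], Tuple[Optional[str], str]],   # stands in for device 102
    translate: Callable[[str], str],                             # stands in for device 103
    text_to_speech: Optional[Callable[[str], bytes]] = None,     # optional device 104
) -> List[Union[str, bytes]]:
    """Feed speech chunks through recognition, clause selection, and translation."""
    pending = ""
    outputs: List[Union[str, bytes]] = []
    for chunk in speech_chunks:
        # Recognized text is accumulated rather than translated immediately.
        pending += speech_to_text(chunk)
        target, pending = select_target(pending)
        if target is None:
            continue  # no clause meets the preset condition yet; keep buffering
        translated = translate(target)
        # Without a second conversion device, the translated text is output directly.
        outputs.append(text_to_speech(translated) if text_to_speech else translated)
    return outputs
```

Passing `text_to_speech=None` corresponds to the embodiment in which the translated text is displayed on a screen instead of being synthesized into speech.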

In summary, in the embodiment of the present invention, the target clause currently required to be machine-translated refers to a clause capable of matching information included in the current text information, that is, the embodiment of the present invention does not directly machine-translate the text information, but performs machine translation of the target clause after further processing the text information according to the clause information included in the text information to obtain the target clause from which the clause information meets the preset condition, that is, the embodiment of the present invention can control the clause currently required to be machine-translated according to the clause information included in the text information, so as to avoid a situation that a sentence sent to a machine translation device is too long or too short, and therefore, the accuracy and the real-time rate of translation can be effectively improved.

Method embodiment one

Referring to fig. 2, a flowchart illustrating steps of a first embodiment of an information processing method according to the present invention is shown, which may specifically include the following steps:

step 201, acquiring text information corresponding to a voice signal of a speaking user;

step 202, obtaining a clause of which the clause information meets a preset condition from the text information according to the clause information contained in the text information, and using the clause as a target clause needing machine translation currently;

step 203, sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language.

The information processing method provided by the embodiment of the invention can be applied to the application environment of the computing equipment. Optionally, the computing device may include: a terminal or a server. The terminal may include, but is not limited to: smart phones, tablets, laptop portable computers, in-vehicle computers, desktop computers, smart televisions, wearable devices, and the like. The server may be a cloud server or a common server, and is configured to provide a target clause that needs to be currently translated by a machine to the machine translation device.

In practical applications, the computing device of the embodiment of the present invention may obtain text information corresponding to a speech signal of a speaking user from other computing devices. Or, the computing device according to the embodiment of the present invention may execute the information processing method flow according to the embodiment of the present invention through a client Application, and the client Application may run on the computing device, for example, the client Application may be any APP (Application program) running on the intelligent terminal, and then the client Application may obtain text information corresponding to the voice signal of the speaking user from other applications of the computing device. Alternatively, the computing device in the embodiment of the present invention may execute the information processing method flow in the embodiment of the present invention through a function device of the client application, and then the function device may obtain text information corresponding to the voice signal of the speaking user from another function device. It is understood that the embodiment of the present invention does not limit the specific manner of acquiring the text information corresponding to the voice signal of the speaking user in step 201.

In an optional embodiment of the present invention, the method of the embodiment of the present invention may further include: writing the acquired text information into a cache region; the step 202 of obtaining the clause whose clause information meets the preset condition from the text information according to the clause information included in the text information may include: and reading the text information from the cache region, and acquiring clauses of which the clause information meets preset conditions from the read text information according to the clause information contained in the read text information. Optionally, a data structure such as a queue, an array, or a linked list may be established in a memory area of the computing device as the cache area, and the specific cache area is not limited in the embodiment of the present invention. The above manner of storing the text information in the cache region can improve the processing efficiency of the text information, and it can be understood that the manner of storing the text information in the magnetic disk is also feasible, and the embodiment of the present invention does not limit the specific storage manner of the text information.
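A queue-based cache region of the kind mentioned above might look like the following minimal Python sketch. The class name `TextBuffer` and its methods are hypothetical; the source leaves the concrete data structure open (queue, array, or linked list).

```python
from collections import deque


class TextBuffer:
    """In-memory cache region that holds recognized text fragments in arrival order."""

    def __init__(self) -> None:
        self._queue = deque()

    def write(self, fragment: str) -> None:
        """Append a newly acquired text fragment to the cache region."""
        self._queue.append(fragment)

    def read_all(self) -> str:
        """Drain the cache region and return the fragments joined in order."""
        parts = []
        while self._queue:
            parts.append(self._queue.popleft())
        return "".join(parts)
```

A `deque` is used here because appending at one end and removing from the other are both constant-time operations, which suits a producer that writes recognized text and a consumer that reads it for clause selection.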

In the embodiment of the invention, the target clause currently needing machine translation is a clause selected to match the clause information contained in the current text information. For example, the clauses of a sentence that is too long (hereinafter, a long sentence) and the clauses of a sentence that is too short (hereinafter, a short sentence) each have their own characteristics, so the clauses whose clause information meets the preset condition may be obtained from the text information according to the clause information that the text information contains.

In an optional embodiment of the present invention, the clause information may include the number of clauses and the number of words. The number of clauses indicates how many clauses the text information contains, and the number of words indicates the number of characters occupied by some or all of the clauses contained in the text information. Because the combination of the number of clauses and the number of words affects the quality of machine translation (its accuracy rate and real-time rate), this combination can be used as a basis for obtaining the target clause.

The embodiment of the present invention may provide the following technical solution for obtaining the target clause according to the number of clauses and the number of words contained in the text information:

Technical solution 1: if the number of preceding clauses in the text information exceeds a first number threshold and the number of words of the preceding clauses exceeds a first word number threshold, the preceding clauses are taken as the target clause currently requiring machine translation. That is, in technical solution 1, the preset conditions may include: the number of preceding clauses in the text information exceeds a first number threshold, and the number of words of the preceding clauses exceeds a first word number threshold.

Technical solution 1 is applicable to the case where the compound sentence corresponding to the clauses included in the text information is a short sentence. It may be determined whether the number of clauses located at the front of the text information exceeds a first number threshold n1, and whether the number of words of those clauses exceeds a first word number threshold m1. If both determinations are yes, the n1 clauses are concatenated in front-to-back order and the concatenation result is sent to the machine translation device for translation, where n1 and m1 are positive integers. Technical solution 1 thus splices the clauses corresponding to short sentences, so that the spliced target clause has a more complete structure and the translation accuracy is improved.

In application example 1 of the present invention, assume that the text information stored in the queue includes two clauses, "the weather is good today" and "let's go out fishing", that the two clauses occupy 15 words in total, and that n1 is 2 and m1 is 10. Since the number of clauses (2) reaches n1 and the number of words (15) exceeds m1, the two clauses can be sent to the machine translation device together. Because several clauses with a more complete structure are sent to the machine translation device as a whole, the accuracy of translation can be improved.
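The check in technical solution 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name is hypothetical, the number of words is taken to be the character count of each clause, and the clause-count comparison is inclusive because application example 1 treats two clauses as satisfying n1 = 2.

```python
def select_short_sentence_target(clauses, n1, m1):
    """Return the leading clauses to translate together, or [] to keep waiting.

    Assumptions (not from the specification): `clauses` is an ordered list of
    strings, words are counted as characters, and the count check uses >=.
    """
    leading = clauses[:n1]
    total_words = sum(len(clause) for clause in leading)
    if len(leading) >= n1 and total_words > m1:
        # Both conditions hold: splice the n1 leading clauses into one target.
        return leading
    return []


# Illustrative input only; the English text stands in for the original clauses.
example = ["The weather is nice today,", "let's go fishing."]
target = select_short_sentence_target(example, n1=2, m1=10)
```

Returning an empty list signals that the buffered clauses should wait for more text rather than be sent as a fragment.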

It can be understood that the above values of n1 and m1 are only optional examples. In practice, a person skilled in the art may determine the specific values of n1 and m1 according to actual application requirements. For example, the current values of n1 and m1 may be tested against two characteristics, translation accuracy and real-time rate, and updated until they pass the test. The current values may start from corresponding initial values, such as an initial value of 1 for n1 and an initial value of 1 for m1. Whether the current values pass the test is judged by the accuracy and real-time rate of translation obtained under those values: the test is passed if both fall within their corresponding preset ranges, and is not passed otherwise. It is understood that the embodiment of the present invention does not limit the specific values of n1 and m1 or the manner of determining them.
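The test-and-update procedure for n1 and m1 might look like the sketch below. The specification does not say how the current values are updated or how accuracy and real-time rate are measured, so the candidate list and the evaluate callback are assumptions introduced purely for illustration.

```python
def tune_thresholds(evaluate, candidates, accuracy_range, realtime_range):
    """Return the first (n1, m1) pair that passes the test, or None.

    `evaluate(n1, m1)` is a hypothetical callback returning a pair
    (accuracy, real_time_rate) measured under those thresholds.
    `candidates` is the assumed update order, e.g. starting from (1, 1).
    """
    for n1, m1 in candidates:
        accuracy, realtime = evaluate(n1, m1)
        accuracy_ok = accuracy_range[0] <= accuracy <= accuracy_range[1]
        realtime_ok = realtime_range[0] <= realtime <= realtime_range[1]
        if accuracy_ok and realtime_ok:
            # Both measurements fall inside their preset ranges: test passed.
            return n1, m1
    # No candidate passed; the caller may widen the search.
    return None
```

Any other search strategy, such as a grid search over both thresholds, fits the same pass/fail criterion.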

In an optional embodiment of the present invention, after the preceding clause is sent to the machine translation apparatus as the target clause currently needing to be subjected to machine translation, the preceding clause may also be deleted in the cache region, so as to effectively save the space occupied by the cache region.

Technical solution 2: if the difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold, and the number of words of the preceding clauses exceeds a second word number threshold, the preceding D clauses are taken as the target clause currently requiring machine translation, where D is a positive integer. That is, in technical solution 2, the preset conditions may include: the difference D between the number of preceding clauses in the text information and the delay threshold is a multiple of the second number threshold, and the number of words of the preceding clauses exceeds the second word number threshold.

Technical solution 2 is applicable to the case where the compound sentence corresponding to the clauses included in the text information is a long sentence. For a long sentence, while the speech signal is being converted into text information, the text corresponding to earlier and later speech signals may affect each other; for example, the text corresponding to an earlier speech signal may change as the text corresponding to a later speech signal arrives. The text information of a long sentence is therefore not completely stable, and, to ensure translation accuracy, translation should be performed only after the structure of the long sentence is basically stable. Technical solution 2 segments the long sentence so that translation can proceed without waiting for the whole long sentence to become fixed, which improves both the real-time rate and the accuracy of translation.

In technical solution 2, the delay threshold k represents the number of unstable clauses at the end of the text information; that is, the last k clauses of the text information are sent with a delay, and k keeps the change of the compound sentence from being too large. The second number threshold n2 represents the number of clauses normally sent each time. Thus, when the text information contains M × n2 + k clauses at the front and the total number of words of these M × n2 + k clauses exceeds the second word number threshold m2, the first M × n2 clauses can be sent to the machine translation device as a whole for translation, where k, n2, M, and m2 are positive integers.

In application example 2 of the present invention, assume that the text information in the queue contains, from front to back, the clauses of the sentence "Good, I want to ask my mom, we have an arrangement today, if not, I will go fishing with you.", and that n2 is 2, m2 is 15, and k is 2. When the text information at the front contains 4 clauses and the total number of words of these 4 clauses exceeds m2, the first (4 - 2) of the 4 clauses can be sent to the machine translation device. Then, when the text information at the front contains 6 clauses and the total number of words of these 6 clauses exceeds m2, the first (6 - 2) of the 6 clauses can be sent to the machine translation device.
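The condition of technical solution 2 can be sketched as below. The function name is hypothetical, words are assumed to be counted as characters, and the sketch is stateless: whether clauses sent in an earlier round are sent again is not specified here and is left to the caller.

```python
def select_long_sentence_target(clauses, n2, m2, k):
    """Return the first D clauses when the condition holds, otherwise [].

    D = len(clauses) - k, so the last k (still unstable) clauses are held
    back. Assumptions: `clauses` is an ordered list of strings and the word
    count is the total character count of all buffered clauses.
    """
    d = len(clauses) - k
    total_words = sum(len(clause) for clause in clauses)
    if d > 0 and d % n2 == 0 and total_words > m2:
        # len(clauses) == M * n2 + k for some positive integer M.
        return clauses[:d]
    return []


# Placeholder clauses; with n2 = 2 and k = 2, a buffer of 4 clauses yields
# the first 2, and a buffer of 6 clauses yields the first 4.
buffered = ["clause-1,", "clause-2,", "clause-3,", "clause-4,"]
target = select_long_sentence_target(buffered, n2=2, m2=15, k=2)
```

Holding back k clauses trades a little latency for stability, since the recognizer may still revise the most recent text.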

In an optional embodiment of the present invention, the step of obtaining, from the text information, a clause whose clause information meets a preset condition according to the clause information included in the text information may further include: after the preceding D clauses are taken as the target clause currently needing machine translation, if a second preset punctuation mark exists in the text information, taking the second preset punctuation mark and the characters before it as the target clause currently needing machine translation. In application example 2 above, after the first (6 - 2) of the 6 clauses are sent to the machine translation device, suppose the text information at the front becomes "Good, I want to ask my mom, we have an arrangement today, if not, I will go fishing with you." and thus includes the second preset punctuation mark "."; all of the text information may then be sent to the machine translation device.

Optionally, the second preset punctuation mark may include sentence-ending marks such as a period or a question mark. A second preset punctuation mark gives the corresponding second clause and the clauses before it a certain independence and hence a definite meaning; that is, the translation accuracy of the second clause and the clauses before it is not affected by the clauses that follow. Therefore, the embodiment of the invention can send the k clauses whose sending was delayed to the machine translation device according to the second preset punctuation mark. Optionally, the second preset punctuation mark may be added by the first conversion device according to the intervals in the speech signal and/or a language model; the embodiment of the present invention does not limit the manner of adding the second preset punctuation mark.

In an optional embodiment of the present invention, after the second preset punctuation mark and the previous characters are sent to the machine translation device as the target clause currently needing to be machine translated, the second preset punctuation mark and the previous characters may be deleted in the cache region, so as to effectively save the space occupied by the cache region.

In practical applications, the embodiment of the present invention may adopt any one or a combination of the foregoing technical solutions 1 and 2 according to practical application requirements. For example, in an optional embodiment of the present invention, it may be determined that a compound sentence corresponding to a clause included in the text information is a short sentence or a long sentence, and if the compound sentence is a short sentence, technical scheme 1 may be adopted, and if the compound sentence is a long sentence, technical scheme 2 may be adopted.

Optionally, whether the compound sentence corresponding to the clauses included in the text information is a short sentence or a long sentence may be determined according to the total word number of the clauses included in the text information and whether those clauses include a preset flag bit. The preset flag bit may be used to identify the end of a sentence, and may be added by the first conversion device according to the analysis result of the speech signal. For example, if the total word number of the text information does not exceed a third word number threshold n3 and a preset flag bit exists in the text information, the compound sentence may be considered a short sentence; conversely, if the total word number of the text information exceeds the third word number threshold n3 and no preset flag bit exists in the text information, the compound sentence may be considered a long sentence. In an application example of the present invention, the third word number threshold n3 may be 30. It can be understood that those skilled in the art can determine the value of n3 according to practical application requirements, and the embodiment of the present invention does not limit its specific value.
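The short/long decision can be expressed as a small function. The name classify_sentence and the string labels are hypothetical. The text only defines two of the four possible combinations of word count and flag bit, so this sketch returns None for the other two rather than guessing.

```python
def classify_sentence(total_words, has_end_flag, n3=30):
    """Classify the buffered compound sentence as "short" or "long".

    `has_end_flag` stands for the presence of the preset flag bit that marks
    the end of a sentence. n3 = 30 follows the application example.
    """
    if total_words <= n3 and has_end_flag:
        return "short"  # technical solution 1 applies
    if total_words > n3 and not has_end_flag:
        return "long"   # technical solution 2 applies
    # Combinations the description does not cover are left undecided here.
    return None
```

A deployment would need a policy for the undecided cases, for example waiting for more text before choosing a solution.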

In an optional embodiment of the invention, the method may further comprise: acquiring, from the text information, a first clause whose punctuation mark is a first preset punctuation mark. In this case, the step 202 of obtaining, from the text information, a clause whose clause information meets a preset condition according to the clause information included in the text information may specifically include: acquiring, from the text information, a first clause whose clause information meets the preset condition according to the information of the first clauses contained in the text information. Optionally, the first preset punctuation mark may include commas, semicolons, and the like. The first clause corresponding to a first preset punctuation mark is usually related in meaning to the clauses before and after it, so the translation accuracy of the first clause and the clauses before it may be affected by the clauses that follow.

In practical applications, step 201, step 202, and step 203 may be performed continuously. That is, step 201 continuously obtains the text information corresponding to the speech signal of the speaking user and outputs the obtained text information to step 202; step 202 continuously obtains, from the text information output by step 201, the target clause that currently needs machine translation; and step 203 sends the target clause output by step 202 to the machine translation device. The machine translation device may be located on the same computing device as the information processing apparatus that executes the information processing method flow of the embodiment of the present invention, or on a different computing device; for example, the machine translation device may be a server corresponding to the client application to which the information processing apparatus belongs.

In an optional embodiment of the invention, the method may further comprise: acquiring, from the text information, a second clause whose punctuation mark is a second preset punctuation mark, and taking the second clause and the clauses before it as the target clause currently needing machine translation. If the text information is stored in a queue, the text information can be read from the queue and traversed from front to back; when a second preset punctuation mark is encountered, the second preset punctuation mark and the characters before it are taken as the target clause. For example, in application example 3 of the present invention, the text information includes "Do you feel good?"; then "?" and the characters before it can be taken as the target clause.
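The front-to-back traversal described above can be sketched as follows. The set of second preset punctuation marks used here is an assumption (the exclamation marks in particular are not named in the text), and the function name is hypothetical.

```python
# Assumed set of second preset punctuation marks (ASCII and full-width).
SECOND_PRESET_MARKS = frozenset(".?!。？！")


def split_at_second_preset_mark(text, marks=SECOND_PRESET_MARKS):
    """Traverse `text` front to back and split at the first mark found.

    Returns (target, remainder). `target` is the mark plus every character
    before it; it is "" when no mark is present, in which case nothing is
    sent yet and `remainder` is the whole text.
    """
    for index, char in enumerate(text):
        if char in marks:
            return text[: index + 1], text[index + 1:]
    return "", text
```

Treating an ASCII period as a sentence end is a simplification; text containing decimals or abbreviations would need a more careful rule.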

In summary, the embodiment of the present invention does not machine-translate the text information directly. Instead, it further processes the text information according to the clause information it contains, obtains a target clause whose clause information meets a preset condition, and then machine-translates that target clause. In other words, the embodiment of the present invention can control which clauses currently need machine translation according to the clause information contained in the text information, so as to avoid sending the machine translation device a sentence that is too long or too short, thereby effectively improving the accuracy and real-time rate of translation.

For example, the embodiment of the invention can splice the clauses corresponding to the short sentences according to the number and the word number of the clauses, so that the spliced target clause has a more complete structure, and the translation accuracy is improved. For another example, the embodiment of the invention can perform translation without completely fixing the whole long sentence by segmenting the long sentence according to the number and the word number of the clauses, so that the real-time rate and the accuracy rate of translation can be improved.

Method embodiment two

Referring to fig. 3, a flowchart illustrating steps of a second embodiment of the information processing method according to the present invention is shown, which may specifically include the following steps:

Step 301, acquiring text information corresponding to a voice signal of a speaking user;

Step 302, identifying the clause type of a clause contained in the text information according to the punctuation mark of the clause; if the clause type is the second clause type, executing step 303; otherwise, if the clause type is the first clause type, executing step 304;

Step 303, taking the second clause included in the text information and the clauses before the second clause as the target clause currently needing machine translation;

Step 304, identifying whether the compound sentence corresponding to the clauses contained in the text information is a short sentence or a long sentence, according to the total word number of the clauses contained in the text information and whether those clauses contain a preset flag bit; if it is a short sentence, executing step 305; otherwise, if it is a long sentence, executing step 306;

Step 305, if the number of preceding clauses in the text information exceeds a first number threshold and the word number of the preceding clauses exceeds a first word number threshold, taking the preceding clauses as the target clause currently needing machine translation;

Step 306, if the difference D between the number of preceding clauses in the text information and the delay threshold is a multiple of a second number threshold and the word number of the preceding clauses exceeds a second word number threshold, taking the preceding D clauses as the target clause currently needing machine translation, and executing step 307; wherein D is a positive integer;

Step 307, after the preceding D clauses are taken as the target clause currently needing machine translation, if a second preset punctuation mark exists in the text information, taking the second preset punctuation mark and the characters before it as the target clause currently needing machine translation;

Step 308, sending the target clause to a machine translation device so that the machine translation device translates the target clause into text in a target language. Here, the target clause may originate from step 303, step 305, step 306, or step 307.

Optionally, before step 302, the text information obtained in step 301 may be written into a cache region, and step 302 may read the text information from the cache region and traverse the characters in the read text information to obtain the clauses it contains, together with information such as their punctuation marks, the number of punctuation marks, and the number of words of the clauses.
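A character traversal of this kind might be sketched as below. The punctuation sets and the "first"/"second" labels are assumptions chosen for illustration, following the comma and semicolon examples for first clauses and sentence-ending marks for second clauses.

```python
FIRST_PRESET_MARKS = frozenset(",;，；")      # assumed first preset marks
SECOND_PRESET_MARKS = frozenset(".?!。？！")  # assumed second preset marks


def split_clauses(text):
    """Traverse `text` once and return (clauses, pending).

    `clauses` is an ordered list of (clause_text, clause_type) pairs, where
    clause_type is "first" or "second" depending on the closing mark.
    `pending` is trailing text that has no closing punctuation mark yet.
    """
    clauses = []
    start = 0
    for index, char in enumerate(text):
        if char in FIRST_PRESET_MARKS:
            clause_type = "first"
        elif char in SECOND_PRESET_MARKS:
            clause_type = "second"
        else:
            continue
        # Keep the closing mark with its clause; drop surrounding spaces.
        clauses.append((text[start: index + 1].strip(), clause_type))
        start = index + 1
    return clauses, text[start:]
```

The number of clauses is then len(clauses), and a character-based word count is the sum of len(clause_text) over the list.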

Further optionally, for the second clause type and the short sentence type, after the target clause is sent to the machine translation device, the corresponding target clause may be deleted from the cache region; for the long sentence type, after the second preset punctuation mark and the characters before it are sent to the machine translation device, the second preset punctuation mark and the characters before it may be deleted from the cache region.
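The branching of steps 302 to 307 can be summarized in one stateless function. This is only a sketch of one reading of the flow: the function name, the (text, type) encoding of clauses, the character-based word count, and the inclusive check on n1 are all assumptions. In this reading, step 307 is reached on a later call, once a second preset punctuation mark has arrived, and is then handled by the same branch as step 303.

```python
def choose_target(clauses, has_end_flag, n1, m1, n2, m2, k, n3):
    """Return the clause texts to send next (possibly an empty list).

    `clauses` is an ordered list of (text, clause_type) pairs with
    clause_type "first" or "second" (hypothetical encoding).
    """
    texts = [text for text, _ in clauses]

    # Steps 302/303 (and 307 on a later call): a clause closed by a second
    # preset punctuation mark is sent together with everything before it.
    for index, (_, clause_type) in enumerate(clauses):
        if clause_type == "second":
            return texts[: index + 1]

    # Step 304: short or long, from total word count and the flag bit.
    total_words = sum(len(text) for text in texts)
    if total_words <= n3 and has_end_flag:
        # Step 305: splice the leading n1 clauses of a short sentence.
        leading = texts[:n1]
        if len(leading) >= n1 and sum(len(t) for t in leading) > m1:
            return leading
    elif total_words > n3 and not has_end_flag:
        # Step 306: hold back the last k clauses of a long sentence.
        d = len(texts) - k
        if d > 0 and d % n2 == 0 and total_words > m2:
            return texts[:d]
    return []
```

After a non-empty result is sent in step 308, the caller would delete the corresponding text from the cache region as described above.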

It should be noted that, for simplicity of description, the method embodiments are described as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the actions involved are not necessarily required by the embodiments of the invention.

Device embodiment

Referring to fig. 4, a block diagram of an embodiment of an information processing apparatus according to the present invention is shown, which may specifically include: a text acquisition module 401, a target clause acquisition module 402, and a target clause sending module 403.

The text acquiring module 401 is configured to acquire text information corresponding to a voice signal of a speaking user;

a target clause acquiring module 402, configured to acquire, from the text information, a clause whose clause information meets a preset condition according to clause information included in the text information, and use the clause as a target clause that needs to be currently translated by a machine;

a target clause sending module 403, configured to send the target clause to a machine translation apparatus, so that the machine translation apparatus translates the target clause information into a character in a target language.
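As a rough illustration of how the three modules cooperate, the sketch below wires them together as plain callables. The class name InformationProcessor, the method run_once, and the callable parameters are hypothetical; the reference numerals in the comments map each callable to the module it stands in for.

```python
class InformationProcessor:
    """Chains the three modules of the device embodiment (hypothetical)."""

    def __init__(self, acquire_text, select_target, send_target):
        self.acquire_text = acquire_text    # text acquisition module 401
        self.select_target = select_target  # target clause acquiring module 402
        self.send_target = send_target      # target clause sending module 403

    def run_once(self):
        """Acquire text, pick a target clause, and send it if one exists."""
        text = self.acquire_text()
        target = self.select_target(text)
        if target:
            self.send_target(target)
        return target
```

Passing the modules in as callables keeps the selection policy replaceable, so either technical solution can be plugged into the same pipeline.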

Optionally, the information of the clause may include: the number of clauses and the number of words, the target clause acquiring module may include:

Optionally, the target clause obtaining module may further include:

Optionally, the apparatus may further include:

the target clause obtaining module may include:

Optionally, the apparatus may further include:

the target clause obtaining module may include:

Optionally, the apparatus may further include:

Since the device embodiment is basically similar to the method embodiment, it is described only briefly; for relevant details, refer to the corresponding description of the method embodiment.

The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 5 is a block diagram illustrating an apparatus for information processing as a terminal according to an exemplary embodiment. For example, terminal 900 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.

Referring to fig. 5, terminal 900 can include one or more of the following components: processing component 902, memory 904, power component 906, multimedia component 908, audio component 910, input/output (I/O) interface 912, sensor component 914, and communication component 916.

Processing component 902 generally controls overall operation of terminal 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing component 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the methods described above. Further, processing component 902 can include one or more modules that facilitate interaction between processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.

Memory 904 is configured to store various types of data to support operation at terminal 900. Examples of such data include instructions for any application or method operating on terminal 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

The power component 906 provides power to the various components of the terminal 900. The power component 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal 900.

The multimedia component 908 includes a screen providing an output interface between the terminal 900 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the terminal 900 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 910 is configured to output and/or input audio signals. For example, audio component 910 includes a Microphone (MIC) configured to receive external audio signals when terminal 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 904 or transmitted via the communication component 916. In some embodiments, audio component 910 also includes a speaker for outputting audio signals.

I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor component 914 includes one or more sensors for providing various aspects of state assessment for the terminal 900. For example, sensor component 914 can detect an open/closed state of terminal 900, a relative positioning of components, such as a display and keypad of terminal 900, a change in position of terminal 900 or a component of terminal 900, the presence or absence of user contact with terminal 900, an orientation or acceleration/deceleration of terminal 900, and a change in temperature of terminal 900. The sensor component 914 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

Communication component 916 is configured to facilitate communications between terminal 900 and other devices in a wired or wireless manner. Terminal 900 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the terminal 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as memory 904 comprising instructions, executable by processor 920 of terminal 900 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

A non-transitory computer-readable storage medium in which instructions, when executed by a processor of a terminal, enable the terminal to perform an information processing method, the method comprising: acquiring text information corresponding to a voice signal of a speaking user; according to the clause information contained in the text information, obtaining the clause of which the clause information meets the preset condition from the text information, and using the clause as a target clause needing machine translation currently; and sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language.

Fig. 6 is a block diagram illustrating an apparatus for information processing as a server according to an example embodiment. The server 1900 may vary widely by configuration or performance and may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors) and memory 1932, one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. Memory 1932 and storage medium 1930 can be, among other things, transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instructions operating on a server. Still further, a central processor 1922 may be provided in communication with the storage medium 1930 to execute a series of instruction operations in the storage medium 1930 on the server 1900.

The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

In an exemplary embodiment, a non-transitory computer readable storage medium is also provided that includes instructions, such as memory 1932 that includes instructions executable by a processor of server 1900 to perform the above-described method. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

A non-transitory computer-readable storage medium in which instructions, when executed by a processor of a server, enable the server to perform a method of information processing, the method comprising: acquiring text information corresponding to a voice signal of a speaking user; according to the clause information contained in the text information, obtaining the clause of which the clause information meets the preset condition from the text information, and using the clause as a target clause needing machine translation currently; and sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

The information processing method, the information processing apparatus, the apparatus for information processing, and the simultaneous interpretation system provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and the implementation of the present invention, and the description of these examples is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, a person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. An information processing method characterized by comprising:

acquiring text information corresponding to a voice signal of a speaking user;

obtaining, from the text information and according to the clause information contained in the text information, a clause whose clause information meets a preset condition, as a target clause currently requiring machine translation; wherein the clause information includes the number of clauses and the number of words, and the obtaining comprises:

if the difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold, and the word number of the preceding clauses exceeds a second word number threshold, taking the preceding D clauses as target clauses currently requiring machine translation; wherein D is a positive integer, and the delay threshold is used to characterize k clauses that are positioned later in the text information and are transmitted with a delay.
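The delayed-sending condition recited above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function and parameter names are hypothetical, the word number is assumed to be counted over the preceding D clauses, and words are assumed to be whitespace-delimited (for Chinese text, a character count might be used instead).

```python
from typing import List, Optional


def select_delayed_target(
    clauses: List[str],
    delay_k: int,
    second_count_threshold: int,
    second_word_threshold: int,
) -> Optional[List[str]]:
    """Return the preceding D clauses if the delay condition holds, else None."""
    # D is the number of buffered clauses minus the k clauses held back.
    d = len(clauses) - delay_k
    if d <= 0:
        return None  # D must be a positive integer
    if d % second_count_threshold != 0:
        return None  # D must be a multiple of the second number threshold
    candidate = clauses[:d]
    # Assumption: word number is the whitespace-delimited word total of the D clauses.
    word_count = sum(len(c.split()) for c in candidate)
    if word_count <= second_word_threshold:
        return None  # word number must exceed the second word number threshold
    return candidate
```

Holding back the last k clauses in this way leaves the most recently recognized text in the buffer, where it can still be joined with what the speaker says next.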

2. The method according to claim 1, wherein the step of obtaining, from the text information, a clause whose information meets a preset condition according to the clause information included in the text information, further comprises:

and if the number of the clauses positioned in front in the text information exceeds a first number threshold and the word number of the clauses positioned in front exceeds a first word number threshold, taking the clauses positioned in front as target clauses needing machine translation currently.
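The first-threshold condition recited above can be sketched in the same spirit. Again, this is only an illustration: the names are hypothetical, "the preceding clauses" is assumed to mean all clauses currently buffered, and words are assumed to be whitespace-delimited.

```python
from typing import List


def select_front_clauses(
    clauses: List[str],
    first_count_threshold: int,
    first_word_threshold: int,
) -> List[str]:
    """Return the buffered clauses if both first thresholds are exceeded, else an empty list."""
    # Assumption: word number is the whitespace-delimited word total of all buffered clauses.
    word_count = sum(len(c.split()) for c in clauses)
    if len(clauses) > first_count_threshold and word_count > first_word_threshold:
        return list(clauses)
    return []
```

Requiring both thresholds to be exceeded means that a run of very short clauses, or a single long clause, does not by itself trigger sending.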

3. The method according to claim 1, wherein the step of obtaining, from the text information, a clause whose information meets a preset condition according to the clause information included in the text information, further comprises:

4. The method according to claim 2 or 3, wherein the step of obtaining, from the text information, the clause whose information meets a preset condition according to the clause information included in the text information, further comprises:

5. A method according to claim 1, 2 or 3, characterized in that the method further comprises:

6. A method according to claim 1, 2 or 3, characterized in that the method further comprises:

writing the acquired text information into a cache region;

7. The method according to any one of claims 1 to 3, further comprising:

8. An information processing apparatus characterized by comprising:

the target clause sending module is used for sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language;

wherein the clause information includes the number of clauses and the number of words, and the target clause acquiring module comprises:

a second target clause acquiring submodule, configured to, when a difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold and the word number of the preceding clauses exceeds a second word number threshold, take the preceding D clauses as target clauses currently requiring machine translation; wherein D is a positive integer, and the delay threshold is used to characterize k clauses that are positioned later in the text information and are transmitted with a delay.

9. The apparatus of claim 8, wherein the target clause obtaining module further comprises:

and the first target clause acquiring submodule is used for taking the preceding clause as a target clause needing machine translation currently when the number of preceding clauses in the text information exceeds a first number threshold and the number of words of the preceding clause exceeds a first word number threshold.

10. The apparatus of claim 8, wherein the target clause obtaining module further comprises:

11. The apparatus according to claim 9 or 10, wherein the target clause obtaining module further comprises:

12. The apparatus of claim 8, 9 or 10, further comprising:

the target clause obtaining module includes:

13. The apparatus of claim 8, 9 or 10, further comprising:

the target clause obtaining module includes:

14. The apparatus of claim 8, 9 or 10, further comprising:

15. An apparatus for information processing, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for:

acquiring text information corresponding to a voice signal of a speaking user;

sending the target clause to a machine translation device so that the machine translation device translates the target clause information into characters in a target language;

wherein the clause information includes the number of clauses and the number of words, and the obtaining, from the text information, of the clause whose clause information meets the preset condition according to the clause information contained in the text information comprises:

if the difference D between the number of preceding clauses in the text information and a delay threshold is a multiple of a second number threshold, and the word number of the preceding clauses exceeds a second word number threshold, taking the preceding D clauses as target clauses currently requiring machine translation; wherein D is a positive integer, and the delay threshold is used to characterize k clauses that are positioned later in the text information and are transmitted with a delay.

16. The apparatus according to claim 15, wherein the obtaining, from the text information, of a clause whose clause information meets a preset condition according to the clause information included in the text information further comprises:

17. The apparatus according to claim 15, wherein the obtaining, from the text information, of a clause whose clause information meets a preset condition according to the clause information included in the text information further comprises:

18. The apparatus according to claim 16 or 17, wherein the obtaining, from the text information, of a clause whose clause information meets a preset condition according to the clause information included in the text information further comprises:

19. The apparatus according to claim 15, 16 or 17, wherein the one or more programs, configured to be executed by the one or more processors, further include instructions for:

the obtaining, from the text information, a clause whose information meets a preset condition according to the clause information included in the text information includes:

20. The apparatus according to claim 15, 16 or 17, wherein the one or more programs, configured to be executed by the one or more processors, further include instructions for:

writing the acquired text information into a cache region;

21. The apparatus according to claim 15, 16 or 17, wherein the one or more programs, configured to be executed by the one or more processors, further include instructions for:

22. A machine-readable medium having stored thereon instructions, which when executed by one or more processors, cause an apparatus to perform the method of one or more of claims 1-7.

23. A simultaneous interpretation system, comprising:

the information processing apparatus according to any one of claims 8 to 14;