CN111797631B - Information processing method and device and electronic equipment
- Publication number: CN111797631B (application CN201910270744.8A)
- Authority: CN (China)
- Legal status: Active
Abstract
The embodiment of the invention provides an information processing method, an information processing device and electronic equipment. The method includes: acquiring text information to be recognized; sequentially determining a set number of words of the text information as language units; performing semantic recognition processing on the language units; and determining the effective semantic information of the text information according to the semantic recognition results of the language units. Because no pre-segmentation of the voice information or the text information is required before semantic recognition, semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved. In addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Description
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to an information processing method, an information processing device and electronic equipment.
Background
With the development of human-machine interaction technology, semantic recognition has become increasingly important. Semantic recognition is the process of extracting feature information from a human voice signal and determining the meaning of the language it carries; it mainly includes a speech recognition stage and a semantic understanding stage. The speech recognition stage converts the human voice signal into text using an acoustic model, and the semantic understanding stage recognizes the meaning of the text using a natural language model.
In the prior art, when a voice signal input by a user is processed, voice activity detection (VAD) is used to determine the starting point and ending point of each voice segment in the continuous voice signal, thereby segmenting the continuous voice signal; speech recognition and semantic understanding are then performed on the segmented voice segments to obtain the user's semantics.
However, in practical applications, because different users speak at different speeds, have different speaking habits and are located in different scenes, segmenting sentences by VAD detection is not accurate enough, so the accuracy of semantic recognition is low.
Disclosure of Invention
The embodiment of the invention provides an information processing method, an information processing device and electronic equipment, which are used for improving the accuracy of semantic recognition.
In a first aspect, an embodiment of the present invention provides an information processing method, including:
acquiring text information to be recognized; and
sequentially determining a set number of words of the text information as language units, performing semantic recognition processing on the language units, and determining the effective semantic information of the text information according to the semantic recognition results of the language units.
Optionally, the semantic recognition result includes a semantic integrity probability score and semantic information, and the determining the effective semantic information of the text information according to the semantic recognition result of the language unit includes:
if the semantic integrity probability scores corresponding to N consecutive language units meet a preset condition, using the semantic information of the N language units as the effective semantic information of the text information, where N is greater than or equal to 1.
Optionally, if the semantic integrity probability scores corresponding to the N consecutive language units meet a preset condition, the step of using the semantic information of the N language units as the valid semantic information of the text information includes:
For any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
And if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
If the semantic integrity probability score of the second language unit is larger than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit and the language units before the fourth language unit are smaller than or equal to the integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
The fourth language unit is located behind the first language unit, and a preset number of language units are arranged between the fourth language unit and the first language unit at intervals.
Optionally, the method further comprises: and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
Optionally, the method further comprises:
And if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
Optionally, after the semantic information of the second language unit is used as the valid semantic information of the text information, the method further includes:
Obtaining cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by prediction according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, taking the predicted reply information as reply information corresponding to the text information.
Optionally, before the obtaining the text information to be identified, the method further includes:
and acquiring voice information input into the intelligent equipment, and performing voice recognition processing on the voice information to obtain text information to be recognized.
Optionally, after determining the valid semantic information of the text information, the method further includes:
Obtaining reply information corresponding to the text information according to the effective semantic information;
And controlling the intelligent equipment to output the reply information.
In a second aspect, an embodiment of the present invention provides an information processing apparatus including:
The acquisition module is used for acquiring text information to be identified;
the first recognition module is configured to sequentially determine a set number of words of the text information as language units, perform semantic recognition processing on the language units, and determine the effective semantic information of the text information according to the semantic recognition results of the language units.
Optionally, the semantic recognition result includes a semantic integrity probability score and semantic information, and the first recognition module is specifically configured to:
if the semantic integrity probability scores corresponding to N consecutive language units meet a preset condition, use the semantic information of the N language units as the effective semantic information of the text information, where N is greater than or equal to 1.
Optionally, the first identifying module is specifically configured to:
For any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information.
Optionally, the first identifying module is specifically configured to:
And if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, the first identifying module is specifically configured to:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
Optionally, the first identifying module is specifically configured to:
If the semantic integrity probability score of the second language unit is larger than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit and the language units before the fourth language unit are smaller than or equal to the integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
The fourth language unit is located behind the first language unit, and a preset number of language units are arranged between the fourth language unit and the first language unit at intervals.
Optionally, the first identifying module is further configured to:
and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
Optionally, the first identifying module is further configured to:
And if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
Optionally, the first identifying module is further configured to:
Obtaining cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by prediction according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, taking the predicted reply information as reply information corresponding to the text information.
Optionally, the apparatus further includes: a second identification module;
The acquisition module is also used for acquiring voice information input into the intelligent equipment;
The second recognition module is used for performing voice recognition processing on the voice information to obtain text information to be recognized.
Optionally, the first identifying module is further configured to:
Obtaining reply information corresponding to the text information according to the effective semantic information;
And controlling the intelligent equipment to output the reply information.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor and memory;
The memory stores computer-executable instructions;
The at least one processor executing computer-executable instructions stored in the memory causes the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium having stored therein computer-executable instructions which, when executed by a processor, implement a method according to any of the first aspects.
In a fifth aspect, embodiments of the present invention provide a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method of any of the first aspects above.
In a sixth aspect, an embodiment of the present invention provides a chip, including a memory for storing a computer program, and a processor for calling and running the computer program from the memory, so that an electronic device on which the chip is mounted performs the method according to any one of the first aspect above.
According to the technical solution provided by the embodiment of the invention, text information to be recognized is obtained, a set number of words of the text information are sequentially determined as language units, semantic recognition processing is performed on the language units, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Therefore, no pre-segmentation of the voice information or the text information is required before semantic recognition, so semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved. In addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a semantic recognition process in the prior art;
FIG. 2 is a first schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 3 is a second schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 4 is a first schematic diagram of a semantic recognition process according to an embodiment of the present invention;
FIG. 5 is a second schematic diagram of a semantic recognition process according to an embodiment of the present invention;
FIG. 6 is a third schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 7 is a first schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 8 is a second schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without inventive effort fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
FIG. 1 is a schematic diagram of a semantic recognition process in the prior art. As shown in FIG. 1, when voice information input by a user is processed, voice activity detection (VAD) is first used to determine the starting point and ending point of each voice segment in the continuous voice information, thereby segmenting the continuous voice information; speech recognition and semantic understanding are then performed on the segmented voice segments to obtain the user's semantics. Specifically, each voice segment is input into an automatic speech recognition (ASR) model to obtain the text information corresponding to the voice segment, and the text information is then input into a natural language processing (NLP) model to obtain the semantic information corresponding to the text information.
However, in practical applications, because different users speak at different speeds, have different speaking habits and are located in different scenes, segmenting sentences by VAD detection is not accurate enough, so the accuracy of semantic recognition is low.
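For reference, the conventional pipeline criticized above can be summarized in a short sketch. The callables `vad_split`, `asr_recognize` and `nlp_understand` are hypothetical placeholders for a VAD module, an ASR model and an NLP model; this is an illustration of the prior-art flow, not code from the patent:

```python
# A minimal sketch of the prior-art pipeline shown in FIG. 1.
# vad_split, asr_recognize and nlp_understand are hypothetical placeholders
# standing in for a VAD module, an ASR model and an NLP model respectively.

def prior_art_pipeline(audio_stream, vad_split, asr_recognize, nlp_understand):
    """Segment the audio with VAD first, then recognize and understand each segment."""
    results = []
    for segment in vad_split(audio_stream):      # start/end points decided by VAD
        text = asr_recognize(segment)            # speech segment -> text
        semantics = nlp_understand(text)         # text -> semantic information
        results.append(semantics)
    return results
```

The key point is that the VAD segmentation happens before any understanding, so a segmentation error propagates to every later stage.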
In order to solve the above problems, an embodiment of the present invention provides an information processing method. In this embodiment, the continuous voice information is not segmented; instead, speech recognition is performed on it directly to obtain the text information to be recognized, a set number of words of the text information are sequentially taken as language units for real-time semantic recognition processing, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Because no pre-segmentation of the voice information or the text information is required, semantic recognition errors caused by segmentation errors are avoided, and the accuracy of semantic recognition is improved. In addition, semantic recognition processing is performed in real time on the language units obtained from speech recognition, which improves the real-time performance of semantic recognition.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
FIG. 2 is a schematic flowchart of an information processing method according to an embodiment of the present invention. The method of this embodiment may be executed by a server or by a controller of a smart device. The smart device may be any electronic device with a human-machine interaction function, including but not limited to: a robot, a smart speaker, a smart home device, a smart wearable device, a smartphone, and the like.
It should be noted that, for ease of description, this embodiment and the subsequent embodiments are described by taking execution by the smart device as an example wherever examples are involved.
As shown in fig. 2, the information processing method may include:
S201: acquire text information to be recognized.
The text information to be recognized may be long text information; that is, it is text information that has not been segmented.
The text information may be entered by a user into the smart device. In one possible scenario, the user enters text information directly into the smart device. In another possible scenario, a user inputs voice information into an intelligent device, and the intelligent device then obtains text information by performing voice recognition on the voice information.
Based on the second scenario described above, S201 may specifically include:
acquiring voice information input into the smart device, and performing speech recognition processing on the voice information to obtain the text information to be recognized.
Specifically, the voice information input into the smart device may be collected through a microphone of the smart device, or the user's voice information collected by another device may be received over a network or via Bluetooth. It should be noted that the embodiments of the present invention are described by taking only these two possible ways of obtaining the user's voice information as examples, but the embodiments of the present invention are not limited thereto.
After the voice information is obtained, voice recognition technology can be adopted to perform voice recognition processing on the voice information in real time, so that text information is obtained. In an alternative embodiment, the speech information is input into an automatic speech recognition ASR model, which outputs the recognized text information.
This embodiment differs from the prior art in that, after the smart device obtains the voice information input by the user, the voice information is not segmented; instead, speech recognition is performed on it directly to obtain the text information. For example, the recognized text information may be a long, unsegmented sentence that contains more than one intent, such as asking about today's weather and then asking the robot to sing a song.
S202: sequentially determining the words of the set number of the text information as language units, carrying out semantic identification processing on the language units, and determining the effective semantic information of the text information according to the semantic identification result of the language units.
In this embodiment, a set number of words read sequentially from the text information are referred to as a language unit. It is understood that the number of words subjected to semantic recognition processing at a time is configurable; for example, the set number may be configured to be 1 or 3. In general, the smaller the configured set number, the higher the semantic recognition accuracy, but the longer the processing takes.
In the embodiment of the invention, while speech recognition is being performed on the voice information, a set number of words are sequentially read from the recognized text information and subjected to semantic recognition processing in real time.
In an alternative implementation, semantic recognition processing is performed by a natural language processing (NLP) model: a set number of words of the text information are sequentially input into the NLP model for semantic recognition processing. Specifically, while the ASR model performs speech recognition on the voice information, the recognized text information is input into the NLP model in real time, a set number of words at a time, as language units, and the NLP model performs semantic recognition processing on them. Because the words are streamed into the NLP model in real time, the real-time performance of semantic recognition can be improved.
The NLP model can typically process a text segment of a certain length at a time. As a possible implementation manner, the NLP model performs word segmentation processing on an input text segment to obtain a keyword sequence, then obtains word vectors with context semantic relations according to the keyword sequence, inputs the word vectors into the classification model for feature extraction, and the classification model outputs the probability of the semantic category to which the text segment belongs according to the extracted features.
Alternatively, the classification model in the NLP model may be a deep neural network model.
For example, assume that the voice information is "今天天气怎么样" ("how is the weather today"). When speech recognition is performed on the voice information, the words obtained in sequence are:
"今", "天", "天", "气", "怎", "么", "样"
Taking a set number of 1 as an example, these words are input into the NLP model one by one in real time for semantic recognition processing. Specifically, the NLP model sequentially performs semantic recognition processing on the growing spliced units "今", "今天", "今天天", "今天天气", and so on up to "今天天气怎么样", and the effective semantic information of the text information is determined according to the obtained semantic recognition results.
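As an illustration of this streaming behaviour, the sketch below feeds the growing spliced text to a scoring function after each newly recognized character. `score_completeness` is a hypothetical stand-in for the NLP model's semantic integrity output, and the toy scorer at the end is for demonstration only:

```python
# Minimal sketch of the streaming recognition described above (set number = 1).
# score_completeness is a hypothetical stand-in for the NLP model: it takes the
# spliced text seen so far and returns a semantic integrity probability score.

def stream_characters(characters, score_completeness):
    """Splice each newly recognized character onto the text so far and score it."""
    spliced = ""
    for ch in characters:
        spliced += ch
        score = score_completeness(spliced)
        print(f"{spliced}: semantic integrity probability score {score:.2f}")

# Toy scorer for demonstration only; real scores come from the NLP model.
stream_characters(
    ["今", "天", "天", "气", "怎", "么", "样"],
    lambda text: 0.95 if text == "今天天气怎么样" else round(0.1 * len(text), 2),
)
```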
In a possible implementation manner, in the process of performing semantic recognition on each language unit of the text information, if semantic integrity probability scores corresponding to N consecutive language units meet a preset condition, the semantic information of the N language units is used as effective semantic information of the text information, and N is greater than or equal to 1.
In this embodiment, when semantic recognition is sequentially performed on a set number of words in text information as language units, if the integrity probability scores corresponding to N consecutive language units therein meet a preset condition, the semantic information of the N language units is used as effective semantic information of the text information. It can be understood that "the semantic integrity probability scores corresponding to the continuous N language units satisfy the preset condition" means that the semantics corresponding to the continuous N language units are relatively complete.
It can be appreciated that when determining whether the semantics are complete according to the semantic integrity probability score of the language unit, a variety of preset conditions can be used for determining. The embodiment of the present invention is not particularly limited thereto.
Illustratively, assume that the text information is "今天天气怎样唱首歌吧" ("How is the weather today? Sing a song."), taken one character per language unit. The semantics of the 1st to 6th language units ("今天天气怎样", "how is the weather today") are relatively complete, and the semantics of the 7th to 10th language units ("唱首歌吧", "sing a song") are relatively complete, so the semantic information corresponding to these two groups of language units is used as the effective semantic information of the text information.
In the information processing method provided by this embodiment, text information to be recognized is obtained, a set number of words of the text information are sequentially determined as language units, semantic recognition processing is performed on the language units, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Therefore, no pre-segmentation of the voice information or the text information is required before semantic recognition, so semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved. In addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Fig. 3 is a flow chart of a second embodiment of an information processing method according to the present invention. This embodiment refines the embodiment shown in fig. 2. As shown in fig. 3, the method of the present embodiment includes:
S301: acquire voice information input into the smart device, and perform speech recognition processing on the voice information to obtain text information to be recognized.
In this embodiment, the specific implementation of S301 is similar to that of the embodiment shown in fig. 2, and will not be repeated here.
S302: sequentially determining the words of the set number of the text information as language units, and carrying out semantic recognition processing on the language units to obtain semantic recognition results, wherein the semantic recognition results comprise: semantic integrity probability scores and semantic information.
In this embodiment, the semantic recognition result includes: semantic integrity probability scores and semantic information. Specifically, when the semantic recognition processing is performed by using the NLP model, the language unit is input into the NLP model, the NLP model performs the semantic recognition processing on the language unit, semantic information of the language unit is output, and meanwhile, a semantic integrity probability score of the language unit is also output.
It will be appreciated that the semantic integrity probability score is used to indicate the integrity of the semantics expressed by the linguistic units. It can be appreciated that the more complete the semantics expressed by the linguistic units, the higher the corresponding semantic integrity probability score; the less complete the semantics expressed by the linguistic units, the lower the corresponding semantic integrity probability score. For example: the semantic integrity probability score for "today's weather" is less than the semantic integrity probability score for "how today's weather".
S303: for any first language unit among the language units, acquire the cached historical language unit, where the historical language unit includes at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet the set condition.
It will be appreciated that the lexical order of the cached historic language units is consistent with the lexical order in the original speech information.
In addition, the cache location of the history language unit is not particularly limited in this embodiment. It will be appreciated that the historic language units may be cached in a cache external to the NLP model.
S304: perform semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit, to obtain a semantic recognition result of the second language unit.
It will be appreciated that the lexical order in the second language unit obtained by concatenation is consistent with the lexical order in the original speech information.
S305: if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information, and the historical language unit is deleted from the cache; and if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
Wherein, the fact that the semantic integrity probability score corresponding to the language unit does not meet the set condition means that: the semantic integrity probability score corresponding to the language unit is lower, namely the semantic incompleteness expressed by the language unit. The fact that the semantic integrity probability score corresponding to the language unit meets the set condition means that: the semantic integrity probability score corresponding to the language unit is higher, namely the semantic integrity of the language unit expression is realized.
The set condition in this embodiment may take various forms and is not particularly limited. In one possible implementation, the set condition is that the semantic integrity probability score is greater than or equal to a preset threshold. That is, when the semantic integrity probability score of a language unit is greater than or equal to the preset threshold, the semantics of the language unit are considered complete, and when it is less than the preset threshold, the semantics are considered incomplete.
This is described below with an example. Speech recognition is performed through the ASR model to obtain long text information. Starting from the beginning of the long text information, a set number of words (denoted language unit 1) are input into the NLP model for semantic recognition. Because language unit 1 is the first language unit and there is no historical language unit in the cache, language unit 1 itself is input into the NLP model to obtain its semantic integrity probability score and semantic information. Two cases are then distinguished.
Case 1: if the semantic integrity probability score of the language unit 1 is greater than or equal to the preset threshold value, the meaning of the language unit 1 is complete, so that the meaning information of the language unit 1 is used as the effective meaning information of the text information. Then, starting from the current initial position of the long text information, a set number of words (marked as a language unit 2) are taken for semantic recognition, and the recognition process is similar to that of the language unit 1.
Case 2: if the semantic integrity probability score of the language unit 1 is smaller than the preset threshold, the semantic integrity probability score of the language unit 1 is determined to be incomplete, and therefore the language unit 1 is cached in the cache. In this case, starting from the current starting position of the long text information, a set number of words (i.e., language unit 2) are taken, a history language unit (i.e., language unit 1) is obtained from the cache, and the language unit 1 and the language unit 2 are spliced to obtain a new language unit.
And then carrying out semantic recognition processing on the new language unit to obtain semantic integrity probability scores and semantic information of the new language unit. When the new language unit is subjected to semantic recognition processing, the following two cases are described.
Case 3: and if the semantic integrity probability score of the new language unit is greater than or equal to a preset threshold value, the semantic information of the new language unit is used as the effective semantic information of the text information. In this case, since the semantic information of the language unit 1 is already included in the semantic information of the new language unit, the language unit 1 is deleted from the cache. Then, starting from the current initial position of the long text information, a set number of words (marked as a language unit 3) are taken for semantic recognition, and the recognition process is similar to that of the language unit 1.
Case 4: if the semantic integrity probability score of the new language unit is smaller than the preset threshold value, the language unit 2 is also stored in the cache to be used as a historical language unit, and the historical language unit comprises the language unit 1 and the language unit 2. In this case, starting from the current starting position of the long text information, a set number of vocabularies (i.e., language unit 3) are taken, historical language units (i.e., language unit 1 and language unit 2) are obtained from the cache, and the language unit 1, the language unit 2 and the language unit 3 are spliced to obtain a new language unit. Then, the new language unit is subjected to semantic recognition processing, and the specific processing process is similar to the above process and is not repeated here.
S306: acquire reply information corresponding to the text information according to the effective semantic information, and control the smart device to output the reply information.
Specifically, according to the effective semantic information, multiple embodiments may be used to obtain the reply information corresponding to the text information. In an alternative embodiment, the knowledge base may be queried for reply information based on the valid semantic information. The knowledge base records reply information corresponding to different semantic information.
In addition, the reply information output by the smart device may be in text form, in a multimedia form such as audio, video or pictures, or in speech form generated by text-to-speech (TTS). The smart device may output the reply information in any one of these forms or a combination of at least two of them, which is not limited in this embodiment.
Note that this embodiment does not limit the sentence pattern of the text information being replied to; for example, declarative sentences, interrogative sentences and exclamatory sentences are all possible. That is, the embodiment replies not only to text information in question form but also to text information in other sentence patterns.
In this embodiment, no pre-segmentation of the voice information is required before semantic recognition, and no pre-segmentation of the text information obtained by speech recognition is required either, so semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved. In addition, a set number of words of the recognized text information are used as language units for real-time semantic recognition processing, which improves the real-time performance of semantic recognition; and when the semantic integrity probability scores of N consecutive language units meet the preset condition, their semantic information is used as the effective semantic information of the text information, which further improves the accuracy of semantic recognition.
The semantic recognition process of this embodiment is described below with reference to FIG. 4. FIG. 4 is a schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in FIG. 4, assume that the words in the text information obtained by performing speech recognition on the voice information are respectively:
"今", "天", "天", "气", "怎", "么", "样"
Each word is taken as a language unit and input into the NLP model in real time, one after another. After the 1st language unit "今" is input into the NLP model, the NLP model calculates and outputs the semantic information (not shown) and the semantic integrity probability score corresponding to "今". As shown in FIG. 4, the semantic integrity probability score of "今" is 0.01; since this is lower than the preset threshold (assumed to be 0.95), the 1st language unit "今" is cached first.
For the 2nd language unit "天" to be recognized, the historical language unit "今" is first taken from the cache and spliced with the 2nd language unit to obtain "今天" ("today"), which is input into the NLP model; the semantic integrity probability score output by the NLP model is 0.1. Since this score is still below the preset threshold, the 2nd language unit "天" is also cached.
For the 3rd language unit "天" to be recognized, the historical language units "今" and "天" are first taken from the cache and spliced with the 3rd language unit to obtain "今天天", which is input into the NLP model; the semantic integrity probability score output by the NLP model is 0.2. Since this score is still low, the 3rd language unit "天" is also cached.
By analogy, as shown in FIG. 4, the semantic integrity probability score corresponding to "今天天气" ("today's weather") is 0.75, that corresponding to "今天天气怎" is 0.8, that corresponding to "今天天气怎么" is 0.9, and that corresponding to "今天天气怎么样" ("how is the weather today") is 0.95.
It may be appreciated that in a specific application, a suitable preset threshold may be set, and when the semantic integrity probability score is smaller than the preset threshold, the current language unit is cached as the context information of the subsequent language unit. When the semantic integrity probability score is larger than the preset threshold value, the meaning is complete, and the current language unit does not need to be cached again. Further, the currently recognized semantic information can be used as the effective semantic information of the text information.
In the above example, each character is treated as a language unit; in this manner the amount of computation is large because semantic recognition processing must be performed for every character. In an alternative embodiment, to save computing resources, multiple words may be used as one language unit, e.g., two or three words as one language unit.
It should be noted that, in the present embodiment and the following embodiments, the semantic integrity probability score and the setting of the preset threshold value of each language unit are only examples, and the present invention is not limited thereto.
In the embodiments shown in FIG. 3 and FIG. 4, when semantic recognition processing is performed sequentially with a set number of words as language units, the semantics of N consecutive language units are considered complete as long as their semantic integrity probability score is greater than or equal to the preset threshold, and the corresponding semantic information is used as the effective semantic information of the text information. In practical applications, to further improve the accuracy of semantic recognition, when the semantic integrity probability score of the N language units is detected to be greater than or equal to the preset threshold, one or more subsequent language units may be further examined to judge how they contribute to semantic integrity. Two alternative implementations are described in detail below with examples.
In a possible implementation manner, the semantic integrity probability score of the second language unit may be determined to meet the set condition according to the following steps:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
In this embodiment, when the semantic integrity probability score of the second language unit is detected to be greater than or equal to the preset threshold, the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit is further checked. If that score decreases, it indicates that the semantics of the second language unit are complete and the third language unit begins a new semantic unit, so the semantic information of the second language unit is taken as the effective semantic information of the text information.
FIG. 5 is a second schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in FIG. 5, assume that the words sequentially input into the NLP model are the characters of "天气怎么样效果不错吧" ("How is the weather? The effect is pretty good, right?"), i.e., "天", "气", "怎", "么", "样", "效", "果", "不", "错", "吧", with the set number being 1.
Assume the preset threshold is 0.8. Referring to FIG. 5, the semantic integrity probability score of "天" is 0.1, that of "天气" ("the weather") is 0.3, that of "天气怎" is 0.7, that of "天气怎么" is 0.75, and that of "天气怎么样" ("how is the weather") is 0.95.
In this embodiment, when the semantic integrity probability score of the first 5 words exceeds the preset threshold, one more word is examined. Splicing the 6th word gives "天气怎么样效", whose semantic integrity probability score is 0.81. That is, splicing the 6th word onto the first 5 words decreases the semantic integrity probability score, so the first 5 words are taken as a text segment with complete semantics, and the semantic information corresponding to the first 5 words is used as the effective semantic information of the text information.
In another alternative embodiment, the semantic integrity probability score of the second language unit may be determined to satisfy a set condition according to the following steps:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit and the language units before the fourth language unit are smaller than or equal to the integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets the set condition.
The fourth language unit is located behind the first language unit, and a preset number of language units are arranged between the fourth language unit and the first language unit at intervals.
In this embodiment, the number of language units between the fourth language unit and the first language unit is not limited; two, three or more language units may be spaced. In practice, the number of spaced language units may be determined according to a preset time threshold. For example, each time the semantic integrity probability score of the second language unit is detected to be greater than or equal to the preset threshold, the next N language units within a preset time period are examined, and if the semantic integrity probability score does not improve within that period, the semantic information of the second language unit is used as the effective semantic information of the text information.
Illustratively, with reference to FIG. 5, assume again that the language units sequentially input into the NLP model are "天", "气", "怎", "么", "样", "效", "果", "不", "错", "吧", with the set number being 1. The semantic integrity probability score corresponding to the first 5 words, "天气怎么样" ("how is the weather"), is 0.95, which is greater than the preset threshold of 0.8.
Assuming N is 3, the subsequent 3 words "效", "果", "不" need to be examined. Referring to FIG. 5, on the basis of the first 5 words, the semantic integrity probability score after splicing the 6th word ("天气怎么样效") is 0.81, the score after further splicing the 7th word ("天气怎么样效果") is 0.8, and the score after continuing to splice the 8th word ("天气怎么样效果不") is 0.7. That is, continuing to splice the subsequent 3 words onto the first 5 words does not improve the semantic integrity probability score. Therefore, the first 5 words are taken as a text segment with complete semantics, and their semantic information is used as the effective semantic information of the text information. The words from "效" onward are recognized as the next sentence.
Compared with the previous embodiment, this implementation continues to examine a plurality of language units after the candidate language unit is found, which reduces misjudgment and further improves the accuracy of semantic recognition.
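Both look-ahead variants can be sketched together as follows. This is an illustrative assumption rather than the patent's implementation: `score` stands in for the NLP model's semantic integrity output, and `lookahead` is the number of subsequent language units to examine (1 for the first variant, a preset number such as 3 for the second); the toy scores mirror the FIG. 5 example:

```python
# Minimal sketch of the look-ahead acceptance check: a candidate segment is taken
# as semantically complete only if its score reaches the preset threshold AND none
# of the next `lookahead` splices yields a higher score.

def is_complete_with_lookahead(candidate, next_units, score, threshold=0.8, lookahead=3):
    """candidate: spliced text of the second language unit.
    next_units: language units recognized after it, in order.
    score(text) -> semantic integrity probability score."""
    base = score(candidate)
    if base < threshold:
        return False
    spliced = candidate
    for unit in next_units[:lookahead]:
        spliced += unit
        if score(spliced) > base:   # a later splice is more complete: keep waiting
            return False
    return True

# Toy scores mirroring the FIG. 5 example (illustrative values only).
toy_scores = {"天气怎么样": 0.95, "天气怎么样效": 0.81,
              "天气怎么样效果": 0.8, "天气怎么样效果不": 0.7}
print(is_complete_with_lookahead("天气怎么样", ["效", "果", "不"],
                                 lambda text: toy_scores.get(text, 0.0)))  # True
```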
In the embodiment shown in FIG. 3, after the effective semantic information of the text information is determined, reply information is obtained according to the effective semantic information in step S306, and the smart device is controlled to output the reply information. That is, in the embodiment shown in FIG. 3, the reply information is obtained after relatively complete semantics have been recognized. In practical applications, sometimes only a few words need to be recognized before the complete semantic information can be predicted. For example, assume the voice information input by the user is "北京天气怎么样" ("how is the weather in Beijing"); in the actual recognition process, after the semantic information corresponding to the first 4 characters "北京天气" ("Beijing weather") is recognized, it can already be predicted that the user is asking about the weather in Beijing. Therefore, the embodiment of the invention also provides a scheme for obtaining reply information in advance according to predicted semantic information, which is described below with reference to FIG. 6.
Fig. 6 is a flowchart illustrating a method for processing information according to an embodiment of the present invention. This embodiment is a further refinement of S306 in the embodiment shown in fig. 3. As shown in fig. 6, S306 may specifically include:
S3061: obtain cached predicted semantic information and predicted reply information corresponding to the predicted semantic information, where the predicted semantic information is obtained by prediction according to the semantic information of the historical language unit.
S3062: if the effective semantic information is consistent with the predicted semantic information, take the predicted reply information as the reply information corresponding to the text information.
Specifically, in this embodiment, when semantic recognition is performed on a language unit whose semantic integrity probability score does not meet the set condition (i.e., whose semantics are incomplete), the language unit is cached as a historical language unit on the one hand; on the other hand, complete semantic information is predicted from the semantic information of the language unit to obtain the predicted semantic information. The predicted reply information corresponding to the predicted semantic information is then obtained, and both the predicted semantic information and the predicted reply information are cached.
It should be noted that there may be multiple ways of predicting the complete semantic information from the incomplete semantic information of a language unit; this embodiment does not limit the prediction method in detail, and an existing semantic prediction method may be adopted.
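As a rough illustration of this caching flow (and not the embodiment's implementation), the following sketch treats the scorer, the semantic predictor and the reply lookup as assumed, pluggable hooks, and only shows where the historical language unit, the prediction semantic information and the prediction reply information would be cached.

```python
class PredictiveCache:
    """Minimal sketch of the caching flow; score_fn, predict_fn and reply_fn are assumed hooks."""

    def __init__(self, score_fn, predict_fn, reply_fn, threshold):
        self.score_fn = score_fn      # semantic-integrity scorer for a spliced unit
        self.predict_fn = predict_fn  # predicts complete semantics from partial semantics
        self.reply_fn = reply_fn      # pre-fetches a reply for predicted semantics
        self.threshold = threshold    # preset completeness threshold
        self.history_unit = ""        # cached historical (incomplete) language unit
        self.predictions = {}         # prediction semantic info -> prediction reply info

    def feed(self, unit):
        """Splice the new unit onto the cached history and update the caches."""
        spliced = self.history_unit + unit
        if self.score_fn(spliced) >= self.threshold:
            self.history_unit = ""    # semantics complete: clear the history cache
            return spliced            # caller can now obtain and output the reply
        # Semantics incomplete: cache the spliced unit as the historical language unit ...
        self.history_unit = spliced
        # ... and cache a predicted complete semantic together with a pre-fetched reply
        predicted = self.predict_fn(spliced)
        if predicted is not None:     # keys are whatever predict_fn returns (assumed hashable)
            self.predictions[predicted] = self.reply_fn(predicted)
        return None
```

The keys of `predictions` are simply whatever representation `predict_fn` returns; the embodiment only requires that a cached prediction can later be compared with the effective semantic information.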
In one possible implementation manner, after a semantic recognition result is obtained for each language unit, if the semantics of the language unit are incomplete, prediction is performed according to the incomplete semantic information, so as to obtain complete prediction semantic information.
In another possible implementation manner, for each language unit whose semantics are incomplete, it is further determined whether the semantic integrity probability score of the language unit is greater than or equal to a prediction threshold, and semantic prediction is performed only if it is; this avoids predicting from semantics that are still far too incomplete and thus saves computing resources. It will be appreciated that the prediction threshold is less than the preset threshold described above.
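A minimal sketch of this gating is given below; the concrete threshold values and function names are assumptions made for illustration and are not prescribed by the embodiment.

```python
PREDICTION_THRESHOLD = 0.5  # assumed value; must be below the preset completeness threshold
PRESET_THRESHOLD = 0.8      # assumed value for the completeness condition

def maybe_predict(score, partial_semantics, predict_fn):
    """Run semantic prediction only when the partial semantics are 'complete enough'."""
    if score >= PRESET_THRESHOLD:
        return None                       # already complete, no prediction needed
    if score < PREDICTION_THRESHOLD:
        return None                       # too incomplete, a prediction would be unreliable
    return predict_fn(partial_semantics)  # worth predicting and pre-fetching a reply
```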
Illustratively, assume that the voice information input by the user is "what is the Beijing weather" and that the prediction threshold is 0.5. In the process of sequentially inputting each word as a language unit into the NLP model for recognition, the semantic integrity probability score obtained by recognizing the 1st word ("north") is 0.01, the score obtained by recognizing the first 2 words ("Beijing") is 0.1, and the score obtained by recognizing the first 3 words ("Beijing days") is 0.2. Since the semantic integrity probability score corresponding to each of these language units is smaller than the prediction threshold of 0.5, the semantics are still very incomplete, and even if semantic prediction were performed, the accuracy of the resulting prediction semantic information would be low. Therefore, no semantic prediction is performed after any of the first 3 words is recognized.
The semantic integrity probability score obtained by recognizing the first 4 words ("Beijing weather") is 0.6. This score is greater than the prediction threshold, and the semantics of the first 4 words are already relatively complete, so it can be predicted from "Beijing weather" that the user intends to query the weather in Beijing. Therefore, in this embodiment, after the first 4 words are recognized, prediction is performed according to the semantic information of the first 4 words to obtain prediction semantic information with complete semantics; Beijing weather information is acquired in advance as the prediction reply information, and the prediction semantic information and the prediction reply information are cached.
Further, after the complete text information "what is the Beijing weather" has been recognized and the effective semantic information has been obtained, one or more pieces of prediction semantic information and prediction reply information have already been cached. The prediction reply information corresponding to the piece of prediction semantic information that is consistent with the effective semantic information can therefore be used directly as the reply information corresponding to the effective semantic information, and the intelligent device is controlled to output the reply information.
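The retrieval step can be sketched as a simple cache lookup. Treating consistency as exact equality of the semantic representations is an assumption made here for illustration; the embodiment only requires the cached prediction semantic information to be consistent with the effective semantic information.

```python
def reply_for(effective_semantics, prediction_cache, fetch_reply_fn):
    """Return the reply for the recognized effective semantic information.

    prediction_cache maps prediction semantic information to pre-fetched replies.
    On a cache hit the pre-fetched reply is reused; on a miss it is fetched now.
    """
    cached = prediction_cache.get(effective_semantics)
    if cached is not None:
        return cached                           # reply was already prepared in advance
    return fetch_reply_fn(effective_semantics)  # fall back to fetching the reply now
```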
In this embodiment, when incomplete semantic information is obtained from the recognition of part of the language units, the complete semantics are predicted from the incomplete semantic information and the predicted reply information is obtained and cached in advance; when the complete semantic information is subsequently recognized, the corresponding reply information only needs to be fetched from the cache, which improves the real-time performance of semantic recognition.
Fig. 7 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present invention. The information processing apparatus of this embodiment may be in the form of software and/or hardware, and the apparatus may be specifically disposed in a server or disposed in an intelligent device.
As shown in fig. 7, the information processing apparatus 700 of the present embodiment includes: an acquisition module 701 and a first identification module 702.
The acquiring module 701 is configured to acquire text information to be identified;
The first recognition module 702 is configured to sequentially determine a set number of vocabularies of the text information as language units, perform semantic recognition processing on the language units, and determine effective semantic information of the text information according to a semantic recognition result of the language units.
The apparatus of this embodiment may be used to execute the method embodiment shown in fig. 2, and its implementation principle and technical effects are similar, and will not be described herein again.
Fig. 8 is a schematic diagram of a second structure of an information processing apparatus according to an embodiment of the present invention. The information processing apparatus 700 of the present embodiment may further include a second identification module 703 on the basis of the embodiment shown in fig. 7.
Optionally, the semantic recognition result includes a semantic integrity probability score and semantic information, and the first recognition module 702 is specifically configured to:
If the semantic integrity probability scores corresponding to N consecutive language units meet the set condition, the semantic information of the N language units is used as the effective semantic information of the text information, where N is greater than or equal to 1.
Optionally, the first identifying module 702 is specifically configured to:
For any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information.
Optionally, the first identifying module 702 is specifically configured to:
And if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, the first identifying module 702 is specifically configured to:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
Optionally, the first identifying module 702 is specifically configured to:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each of the language units before the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
The fourth language unit is located behind the first language unit and is spaced apart from the first language unit by a preset number of language units.
Optionally, the first identifying module 702 is further configured to:
and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
Optionally, the first identifying module 702 is further configured to:
And if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
Optionally, the first identifying module 702 is further configured to:
Obtaining cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by prediction according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, taking the predicted reply information as reply information corresponding to the text information.
Optionally, the obtaining module 701 is further configured to obtain voice information input to the intelligent device;
the second recognition module 703 is configured to perform a voice recognition process on the voice information, so as to obtain text information to be recognized.
Optionally, the first identifying module 702 is further configured to:
Obtaining reply information corresponding to the text information according to the effective semantic information;
And controlling the intelligent equipment to output the reply information.
The information processing device provided by the embodiment of the present invention may be used to execute the technical solution of any of the above method embodiments, and its implementation principle and technical effects are similar, and are not repeated here.
Fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention. The electronic device may be a controller of an intelligent device or may be a server, which is not particularly limited in the embodiment of the present invention. As shown in fig. 9, the electronic device 900 of this embodiment includes: at least one processor 901 and a memory 902, where the processor 901 and the memory 902 are connected by a bus 903.
In a specific implementation process, the at least one processor 901 executes computer-executable instructions stored in the memory 902, so that the at least one processor 901 executes the technical solution of any one of the method embodiments described above.
The specific implementation process of the processor 901 may refer to the above-mentioned method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein again.
In the embodiment shown in fig. 9, it should be understood that the processor may be a central processing unit (Central Processing Unit, CPU for short), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP for short), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the present invention may be embodied as being executed directly by a hardware processor, or executed by a combination of hardware and software modules in the processor.
The memory may include a high-speed RAM memory, and may further include a non-volatile memory (NVM), such as at least one magnetic disk memory.
The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The embodiment of the invention also provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and when a processor executes the computer-executable instructions, the technical solution of any one of the method embodiments described above is implemented.
The computer readable storage medium described above may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Alternatively, the readable storage medium may be integrated into the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC). Alternatively, the processor and the readable storage medium may reside as discrete components in a device.
Embodiments of the present invention also provide a computer program product comprising computer program code for causing a computer to carry out the technical solutions of any of the method embodiments above, when said computer program code is run on a computer.
The embodiment of the invention also provides a chip, which comprises a memory and a processor, wherein the memory is used for storing a computer program, and the processor is used for calling and running the computer program from the memory, so that the electronic equipment provided with the chip executes the technical scheme of any method embodiment.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (20)
1. An information processing method, characterized by comprising:
Acquiring text information to be identified;
sequentially determining, as language units, vocabularies of character strings containing a preset number of characters in the text information, carrying out semantic recognition processing on the language units, and determining the effective semantic information of the text information according to the semantic recognition result of the language units;
The semantic recognition result comprises: a semantic integrity probability score and semantic information; the determining the effective semantic information of the text information according to the semantic recognition result of the language unit comprises the following steps:
For any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information.
2. The method of claim 1, wherein the semantic integrity probability score for the second language unit is determined to satisfy a set condition according to the steps of:
And if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
3. The method of claim 1, wherein the semantic integrity probability score for the second language unit is determined to satisfy a set condition according to the steps of:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
4. The method of claim 1, wherein the semantic integrity probability score for the second language unit is determined to satisfy a set condition according to the steps of:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each of the language units before the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
The fourth language unit is located behind the first language unit and is spaced apart from the first language unit by a preset number of language units.
5. The method according to claim 1, wherein the method further comprises:
and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
6. The method according to claim 1, wherein the method further comprises:
And if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
7. The method of claim 1, wherein after said regarding said semantic information of said second language unit as valid semantic information for said text information, further comprising:
Obtaining cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by prediction according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, taking the predicted reply information as reply information corresponding to the text information.
8. The method according to any one of claims 1 to 7, further comprising, before the obtaining the text information to be identified:
and acquiring voice information input into the intelligent equipment, and performing voice recognition processing on the voice information to obtain text information to be recognized.
9. The method according to any one of claims 1 to 7, further comprising, after determining the valid semantic information of the text information:
Obtaining reply information corresponding to the text information according to the effective semantic information;
And controlling the intelligent equipment to output the reply information.
10. An information processing apparatus, characterized by comprising:
The acquisition module is used for acquiring text information to be identified;
The first recognition module is used for sequentially determining a set number of words of the text information as language units, carrying out semantic recognition processing on the language units, and determining the effective semantic information of the text information according to the semantic recognition result of the language units;
the semantic recognition result comprises: a semantic integrity probability score and semantic information; the first recognition module is specifically configured to:
For any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets the set condition, the semantic information of the second language unit is used as the effective semantic information of the text information.
11. The apparatus of claim 10, wherein the first identification module is specifically configured to:
And if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
12. The apparatus of claim 10, wherein the first identification module is specifically configured to:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and the third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
the third language unit is a language unit which is behind the first language unit and is adjacent to the first language unit.
13. The apparatus of claim 10, wherein the first identification module is specifically configured to:
If the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each of the language units before the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
The fourth language unit is located behind the first language unit and is spaced apart from the first language unit by a preset number of language units.
14. The apparatus of claim 10, wherein the first identification module is further configured to:
and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
15. The apparatus of claim 10, wherein the first identification module is further configured to:
And if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit, and caching the historical language unit into a cache.
16. The apparatus of claim 10, wherein the first identification module is further configured to:
Obtaining cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by prediction according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, taking the predicted reply information as reply information corresponding to the text information.
17. The apparatus according to any one of claims 10 to 16, further comprising: a second identification module;
The acquisition module is also used for acquiring voice information input into the intelligent equipment;
The second recognition module is used for performing voice recognition processing on the voice information to obtain text information to be recognized.
18. The apparatus of any one of claims 10 to 16, wherein the first identification module is further configured to:
Obtaining reply information corresponding to the text information according to the effective semantic information;
And controlling the intelligent equipment to output the reply information.
19. An electronic device, comprising: at least one processor and memory;
The memory stores computer-executable instructions;
The at least one processor executing computer-executable instructions stored in the memory causes the at least one processor to perform the method of any one of claims 1 to 9.
20. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor implement the method of any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270744.8A CN111797631B (en) | 2019-04-04 | 2019-04-04 | Information processing method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270744.8A CN111797631B (en) | 2019-04-04 | 2019-04-04 | Information processing method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111797631A CN111797631A (en) | 2020-10-20 |
CN111797631B (en) | 2024-06-21 |
Family
ID=72804830
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910270744.8A Active CN111797631B (en) | 2019-04-04 | 2019-04-04 | Information processing method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111797631B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114187903B (en) * | 2021-11-12 | 2025-07-22 | 北京百度网讯科技有限公司 | Voice interaction method, device, system, electronic equipment and storage medium |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101124537B (en) * | 2004-11-12 | 2011-01-26 | 马克森斯公司 | Techniques for knowledge discovery by constructing knowledge correlations using terms |
AU2006222742A1 (en) * | 2005-09-28 | 2007-04-19 | Marie Louise Fellbaum Korpi | Computational linguistic analysis |
CN101034390A (en) * | 2006-03-10 | 2007-09-12 | 日电(中国)有限公司 | Apparatus and method for verbal model switching and self-adapting |
CN102236639B (en) * | 2010-04-28 | 2016-08-10 | 三星电子株式会社 | Update the system and method for language model |
CN102243625B (en) * | 2011-07-19 | 2013-05-15 | 北京航空航天大学 | A Semantic Mining Method Based on N-gram Incremental Topic Model |
CN102609407B (en) * | 2012-02-16 | 2014-10-29 | 复旦大学 | Fine-grained semantic detection method of harmful text contents in network |
CN102662987B (en) * | 2012-03-14 | 2015-11-11 | 华侨大学 | A kind of sorting technique of the network text semanteme based on Baidupedia |
CN105183712A (en) * | 2015-08-27 | 2015-12-23 | 北京时代焦点国际教育咨询有限责任公司 | Method and apparatus for scoring English composition |
CN107506360B (en) * | 2016-06-14 | 2020-09-11 | 科大讯飞股份有限公司 | Article scoring method and system |
CN107092675B (en) * | 2017-04-12 | 2020-08-18 | 新疆大学 | A Uyghur Semantic String Extraction Method Based on Statistics and Shallow Language Analysis |
CN108197105B (en) * | 2017-12-28 | 2021-08-24 | Oppo广东移动通信有限公司 | Natural language processing method, device, storage medium and electronic device |
CN108417210B (en) * | 2018-01-10 | 2020-06-26 | 苏州思必驰信息科技有限公司 | Word embedding language model training method, word recognition method and system |
CN108829894B (en) * | 2018-06-29 | 2021-11-12 | 北京百度网讯科技有限公司 | Spoken word recognition and semantic recognition method and device |
CN109325242B (en) * | 2018-09-19 | 2023-06-13 | 苏州大学 | Method, device and equipment for judging whether sentences are aligned based on word pairs and translation |
CN109473093B (en) * | 2018-12-13 | 2023-08-04 | 平安科技(深圳)有限公司 | Speech recognition method, device, computer equipment and storage medium |
- 2019-04-04 CN CN201910270744.8A patent/CN111797631B/en active Active
Non-Patent Citations (2)
Title |
---|
A formal language model based on a category-theory approach; Miao Decheng; Journal of Shaoguan University (韶关学院学报); 9-12 *
An automatic summarization method based on sentence groups; Wang Rongbo et al.; Computer Applications (计算机应用); 58-62 *
Also Published As
Publication number | Publication date |
---|---|
CN111797631A (en) | 2020-10-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111797632B (en) | Information processing method and device and electronic equipment | |
JP6676141B2 (en) | Voice section detection method and apparatus | |
CN108735201B (en) | Continuous speech recognition method, device, equipment and storage medium | |
US9390711B2 (en) | Information recognition method and apparatus | |
JP6677419B2 (en) | Voice interaction method and apparatus | |
WO2021103712A1 (en) | Neural network-based voice keyword detection method and device, and system | |
CN111508497B (en) | Speech recognition method, device, electronic equipment and storage medium | |
CN105931644A (en) | Voice recognition method and mobile terminal | |
CN112151015A (en) | Keyword detection method and device, electronic equipment and storage medium | |
CN110473527B (en) | Method and system for voice recognition | |
CN109036471B (en) | Voice endpoint detection method and device | |
CN114155839B (en) | Voice endpoint detection method, device, equipment and storage medium | |
CN114158283A (en) | Recognition and utilization of misrecognition in automatic speech recognition | |
CN110781687B (en) | Same intention statement acquisition method and device | |
CN112017643B (en) | Speech recognition model training method, speech recognition method and related device | |
CN111540363B (en) | Keyword model and decoding network construction method, detection method and related equipment | |
CN112397053B (en) | Voice recognition method and device, electronic equipment and readable storage medium | |
CN114420102B (en) | Method and device for speech sentence-breaking, electronic equipment and storage medium | |
CN112397073A (en) | Audio data processing method and device | |
CN111862963A (en) | Voice wake-up method, device and equipment | |
CN110020429B (en) | Semantic recognition method and device | |
CN111797631B (en) | Information processing method and device and electronic equipment | |
CN114596840A (en) | Speech recognition method, device, equipment and computer readable storage medium | |
CN111028830B (en) | Local hot word bank updating method, device and equipment | |
CN115858776B (en) | Variant text classification recognition method, system, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |