CN112652308A - Vehicle control voice recognition method and device - Google Patents

Vehicle control voice recognition method and device

Info

Publication number
CN112652308A
Authority
CN
China
Prior art keywords
vehicle control
result
recognition result
matching
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011454179.XA
Other languages
Chinese (zh)
Inventor
周冰
解鹏
于永彦
梁子龙
万娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongfeng Motor Corp
Original Assignee
Dongfeng Motor Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongfeng Motor Corp
Priority to CN202011454179.XA
Publication of CN112652308A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Navigation (AREA)

Abstract

本发明公开了一种车控语音识别方法及装置,其中方法包括:获取云端的第一语音识别结果;将第一语音识别结果在预设的车控词库中进行匹配,获得匹配结果;车控词库与被进行语音控制的车辆匹配;最后,基于匹配结果,判断第一语音识别结果是否包括第一车控类语义,第一车控类语义为车辆支持的车控语义;若是,则基于匹配结果,获得第二语音识别结果;第二语音识别结果用于对车辆进行语音控制;若否,则基于匹配结果,生成提示信息。本发明提高了车控语音识别的可靠性,避免了误识别和错误指令的执行。

Figure 202011454179

The invention discloses a vehicle control voice recognition method and device. The method includes: obtaining a first voice recognition result from the cloud; matching the first voice recognition result in a preset vehicle control word bank to obtain a matching result, where the vehicle control word bank is matched to the vehicle under voice control; and judging, based on the matching result, whether the first voice recognition result includes a first vehicle control type semantic, the first vehicle control type semantic being a vehicle control semantic supported by the vehicle. If so, a second voice recognition result is obtained based on the matching result and is used to perform voice control on the vehicle; if not, prompt information is generated based on the matching result. The invention improves the reliability of vehicle control voice recognition and avoids misrecognition and the execution of wrong instructions.

Figure 202011454179

Description

Vehicle control voice recognition method and device
Technical Field
The invention relates to the technical field of computers, in particular to a vehicle control voice recognition method and device.
Background
In recent years, vehicle-mounted voice systems have become widely used and are an important sign of automobile intelligence. However, current ASR (Automatic Speech Recognition) and NLU (Natural Language Understanding) cannot cope perfectly with accented or dialect speech, so the function of controlling the vehicle by voice may go wrong. The user's vehicle control intention may fail to be executed, or may even be executed incorrectly, which leads to user complaints.
Therefore, prior-art methods recognize the user's vehicle control intention with low accuracy and are prone to misrecognition.
Disclosure of Invention
In view of the above problems, the present invention provides a vehicle control voice recognition method and device that improve the reliability of vehicle control voice recognition and avoid misrecognition and the execution of wrong instructions.
In a first aspect, the present application provides the following technical solutions through an embodiment:
a vehicle control voice recognition method comprises the following steps:
acquiring a first voice recognition result of a cloud; matching the first voice recognition result in a preset vehicle control word bank to obtain a matching result; the vehicle control word bank is matched with a vehicle subjected to voice control; judging whether the first voice recognition result comprises a first vehicle control type semantic or not based on the matching result; the first vehicle control type semantic is a vehicle control semantic supported by the vehicle; if yes, obtaining a second voice recognition result based on the matching result; wherein the second voice recognition result is used for voice control of the vehicle; and if not, generating prompt information based on the matching result.
Optionally, the vehicle control word bank includes a vehicle control action word bank and a vehicle control object word bank, the matching result includes an action matching result and an object matching result, and the matching of the first speech recognition result in a preset vehicle control word bank to obtain a matching result includes:
matching the first voice recognition result in the vehicle control action word bank to obtain an action matching result; and matching the first voice recognition result in the vehicle control object word bank to obtain an object matching result.
Optionally, the determining, based on the matching result, whether the first speech recognition result includes a first vehicle control type semantic includes:
if the action matching result contains the vehicle control action words in the vehicle control action word bank and the object matching result contains the vehicle control object words in the vehicle control object word bank, determining that the first voice recognition result comprises a first vehicle control type semantic; otherwise, determining that the first voice recognition result does not comprise the first vehicle control type semantic.
Optionally, the obtaining a second speech recognition result based on the matching result includes:
and modifying the first voice recognition result based on the matching result to obtain the second voice recognition result.
Optionally, the generating of the prompt information based on the matching result includes:
obtaining a semantic recognition result corresponding to the first voice recognition result; judging whether the semantic recognition result is a second vehicle control type semantic; if yes, generating, based on the semantic recognition result, first prompt information indicating that the phrasing or operation is not supported; and if not, generating second prompt information based on the semantic recognition result.
Optionally, the generating of the first prompt information based on the semantic recognition result includes:
acquiring corresponding phrasing information based on the semantic recognition result; and generating the first prompt information based on the phrasing information.
In a second aspect, based on the same inventive concept, the present application provides the following technical solutions through an embodiment:
a vehicle control voice recognition device comprising:
the acquisition module is used for acquiring a first voice recognition result of the cloud; the matching module is used for matching the first voice recognition result in a preset vehicle control word bank to obtain a matching result; the vehicle control word bank is matched with a vehicle subjected to voice control; the judging module is used for judging whether the first voice recognition result comprises a first vehicle control type semantic or not based on the matching result; the first vehicle control type semantic is a vehicle control semantic supported by the vehicle; the first processing module is used for obtaining, if yes, a second voice recognition result based on the matching result; wherein the second voice recognition result is used for voice control of the vehicle; and the second processing module is used for generating, if not, prompt information based on the matching result.
Optionally, the vehicle control word bank includes a vehicle control action word bank and a vehicle control object word bank, the matching result includes an action matching result and an object matching result, and the matching module is specifically configured to:
matching the first voice recognition result in the vehicle control action word bank to obtain an action matching result; and matching the first voice recognition result in the vehicle control object word bank to obtain an object matching result.
In a third aspect, based on the same inventive concept, the present application provides the following technical solutions through an embodiment:
a vehicle control voice recognition device comprising a processor and a memory coupled to the processor, the memory storing instructions that, when executed by the processor, cause the vehicle control voice recognition device to perform the steps of the method of any one of the first aspects above.
In a fourth aspect, based on the same inventive concept, the present application provides the following technical solutions through an embodiment:
a computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of any of the first aspects.
The vehicle control voice recognition method and device provided in the embodiment can be used for recognizing vehicle control voice, and a proprietary vehicle control word bank is adopted for semantic matching in the whole recognition process instead of directly using cloud semantic recognition. After semantic matching is carried out by using the vehicle control word bank, whether the first voice recognition result contains the first vehicle control type semantic meaning or not can be judged, and therefore the real intention of the user can be recognized. And finally, a second voice recognition result for performing voice control on the vehicle is generated according to the matching result, so that the reliability is improved, and the false recognition and the execution of a false instruction are avoided.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of a first prior art vehicular speech system;
FIG. 2 is a diagram illustrating a second prior art vehicular speech system;
FIG. 3 is a schematic structural diagram of an "instruction execution" module of a second vehicle-mounted voice system in the prior art;
FIG. 4 is a schematic structural diagram of a vehicular speech system to which the method or apparatus of the embodiment of the present invention is applicable;
FIG. 5 is a flowchart illustrating a vehicle control speech recognition method according to a first embodiment of the present invention;
FIG. 6 is a schematic structural diagram illustrating a vehicle control voice recognition device according to a second embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, a vehicle-mounted speech system generally includes a microphone, a recording and noise-reduction module, a wake-up module, speech recognition (ASR), semantic understanding (NLU), a knowledge graph, text-to-speech (TTS), and a speaker. The recording and noise-reduction module and the wake-up module are generally integrated in the head unit. With technical progress in related fields, the ASR, NLU, and TTS may be placed either locally in the head unit or in the cloud; because each placement has its own limitations, vehicle-mounted speech systems on the market often adopt a fused device-cloud scheme.
Therefore, as shown in fig. 2, in the conventional vehicle-mounted speech system, after the wake-up engine is triggered, an audio signal is collected and processed to form an audio file, which is sent along two paths: one to the cloud speech and semantic recognition module (online ASR & NLU) and the other to the local speech and semantic recognition module, each of which performs speech and semantic recognition. Result arbitration is then performed on the cloud and local recognition results, the instruction is executed, and finally voice feedback is given. Of the result arbitration module and the instruction execution module, the former mainly arbitrates between and selects from the device-side and cloud-side semantic results; generally the cloud result takes priority, the local result is used when there is no cloud semantic result, and arbitration criteria such as a timeout are set. The latter is an important interface through which the voice system connects to the vehicle control module. As shown in fig. 3, in the voice vehicle control scheme, the "instruction execution" module of the voice system exchanges information with the vehicle CAN (Controller Area Network) unit of the vehicle system. The "instruction execution" module converts a semantic result into a vehicle control instruction and transmits it to the CAN unit, which converts it into a bus instruction that meets the vehicle communication specifications and safety scenarios, performs logic processing and communication with the other ECU (Electronic Control Unit) nodes of the vehicle, and finally feeds back to the "instruction execution" module the state and result of those ECUs executing the instruction.
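The cloud-first arbitration rule described above can be sketched as follows. This is only an illustrative sketch: the function name `arbitrate`, the latency parameter, and the 2-second default timeout are assumptions made for the example and do not come from the source.

```python
from typing import Optional


def arbitrate(cloud_result: Optional[str],
              local_result: Optional[str],
              cloud_latency_s: float,
              timeout_s: float = 2.0) -> Optional[str]:
    """Pick the semantic result to execute: cloud first, local as fallback.

    The timeout value and the latency argument are hypothetical; the source
    only states that a timeout is one of the arbitration criteria.
    """
    # The cloud result counts only if it exists and arrived within the timeout.
    cloud_usable = cloud_result is not None and cloud_latency_s <= timeout_s
    if cloud_usable:
        return cloud_result
    # No usable cloud result: fall back to the local one (which may be None).
    return local_result


# Cloud result arrives in time, so it wins.
print(arbitrate("open window", "open the window", cloud_latency_s=0.4))
# Cloud returned nothing, so the local result is used.
print(arbitrate(None, "open the window", cloud_latency_s=0.4))
```

A cloud result that arrives after the timeout is treated the same way as a missing one, so the local result is used in that case as well.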
In the embodiment of the present invention, the architecture of the vehicle-mounted speech system may be improved by adding a new semantic recognition module (onboard VC_NLU) between the cloud speech and semantic recognition module (online ASR & NLU) and the result arbitration module (Result arbiter), as shown in fig. 4. The method or device in the embodiment of the invention can be applied in the newly added semantic recognition module. It should be noted that the method or device of the present invention is not limited to the improved vehicle-mounted speech system described above. For example, in the example of fig. 4 the newly added semantic recognition module is located locally in the vehicle, while in other examples it may be located in the cloud; in addition, the method or device of the present invention can also be integrated into other processing modules or devices with processing capability, without limitation. The method and device disclosed in the present invention may also be applied to other devices with control scenarios, such as intelligent robots, boats, and various smart home appliances, without limitation. The method and device of the present invention are described in detail below by way of example.
First embodiment
Referring to fig. 5, fig. 5 is a flowchart illustrating a vehicle control voice recognition method according to a first embodiment of the present invention, where the method includes:
Step S10: acquiring a first voice recognition result of the cloud.
In step S10, after the user wakes up the voice capture module on the vehicle, the vehicle can capture the voice of the user and upload the captured voice data to the cloud. Therefore, voice recognition is carried out at the cloud end, and a first voice recognition result is obtained. In addition, semantic recognition can be performed on the first voice recognition result at the cloud end, so that a semantic recognition result is obtained.
Step S20: matching the first voice recognition result in a preset vehicle control word bank to obtain a matching result; and the vehicle control word bank is matched with the vehicle subjected to voice control.
In step S20, the preset vehicle control word bank is a word bank in which vehicle control actions and vehicle control objects are stored in advance. The word bank can be independent of the cloud word bank, which makes it convenient for automobile manufacturers to maintain. In actual business applications, the cloud speech and semantic recognition engines are usually provided by a third party rather than the automobile manufacturer, so cloud recognition is often difficult to adapt to all vehicle types and cannot meet individual customization requirements. In this embodiment, the automobile manufacturer can update the preset vehicle control word bank with the latest vehicle control actions, vehicle control objects, phrasings, and the like, which ensures the accuracy of later semantic recognition. For example, an automobile manufacturer may assign a specific vehicle control word bank to each vehicle type or to each vehicle family, so that different vehicle types or families use different word banks and the accuracy of semantic recognition is ensured.
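One way to picture a word bank per vehicle type or per vehicle family is a simple lookup, sketched below. The identifiers `model_a` and `family_x`, the word lists, and the rule "try the vehicle type first, then the family" are assumptions for illustration; the source only says that either granularity may be used.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class WordBank:
    actions: FrozenSet[str]   # vehicle control action words
    objects: FrozenSet[str]   # vehicle control object words


# Hypothetical banks; real content would be maintained by the manufacturer.
FAMILY_BANKS: Dict[str, WordBank] = {
    "family_x": WordBank(frozenset({"open", "close"}),
                         frozenset({"air conditioner", "fog lamp"})),
}
MODEL_BANKS: Dict[str, WordBank] = {
    "model_a": WordBank(frozenset({"open", "close"}),
                        frozenset({"air conditioner", "fog lamp",
                                   "atmosphere lamp"})),
}


def select_word_bank(model: str, family: str) -> WordBank:
    """Return the bank for the vehicle type, else for its family (assumed order)."""
    if model in MODEL_BANKS:
        return MODEL_BANKS[model]
    if family in FAMILY_BANKS:
        return FAMILY_BANKS[family]
    raise KeyError(f"no vehicle control word bank for {model!r} / {family!r}")


bank = select_word_bank("model_a", "family_x")
print(sorted(bank.objects))
```

Keeping the banks as plain data separate from the matching code reflects the maintenance point made above: the manufacturer can update words without touching the cloud engine.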
Further, the vehicle control word bank of this embodiment may include a vehicle control action word bank and a vehicle control object word bank. Correspondingly, the matching result may include an action matching result and an object matching result. In this case, step S20 may include: matching the first voice recognition result in the vehicle control action word bank to obtain an action matching result; and matching the first voice recognition result in the vehicle control object word bank to obtain an object matching result. The action matching result has two cases: 1. no corresponding vehicle control action is matched; 2. a corresponding vehicle control action is matched, such as "open", "close", "turn up", "fast forward", or "switch on". The object matching result likewise has two cases: 1. no corresponding vehicle control object is matched; 2. a corresponding vehicle control object is matched, such as "air conditioner", "windshield wiper", "atmosphere lamp", "fog lamp", or "360-degree look around".
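The two-bank matching of step S20 can be sketched as below. The matching rule used here (case-insensitive substring search, longest word first) is an assumption chosen for brevity; the source does not specify the matching algorithm, and the word lists are taken from the examples in the text.

```python
from typing import Iterable, NamedTuple, Optional


class MatchResult(NamedTuple):
    action: Optional[str]  # matched vehicle control action word, or None
    obj: Optional[str]     # matched vehicle control object word, or None


# Example words from the description; a real bank would be vehicle specific.
ACTION_WORDS = ("open", "close", "turn up", "fast forward")
OBJECT_WORDS = ("air conditioner", "windshield wiper", "atmosphere lamp",
                "fog lamp", "360-degree look around")


def find_word(text: str, words: Iterable[str]) -> Optional[str]:
    """Return the longest bank word contained in the text, or None."""
    lowered = text.lower()
    for word in sorted(words, key=len, reverse=True):
        if word in lowered:
            return word
    return None


def match_command(first_result: str) -> MatchResult:
    """Match the cloud ASR text against the action bank and the object bank."""
    return MatchResult(find_word(first_result, ACTION_WORDS),
                       find_word(first_result, OBJECT_WORDS))


print(match_command("Open the fog lamp"))
print(match_command("play some music"))
```

Each bank is searched independently, so a result can match an action without an object (or the reverse), which is exactly the information the later judgment step needs.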
Step S30: judging whether the first voice recognition result comprises a first vehicle control type semantic or not based on the matching result; and the first vehicle control type semantic is a vehicle control semantic supported by the vehicle.
In step S30, the first vehicle control type semantic is a vehicle control semantic specific to the vehicle, and it corresponds to the vehicle control word bank. Therefore, if the action matching result includes a vehicle control action word from the vehicle control action word bank and the object matching result includes a vehicle control object word from the vehicle control object word bank, it may be determined that the first voice recognition result includes the first vehicle control type semantic. Otherwise, it is determined that the first voice recognition result does not include the first vehicle control type semantic. During matching, the matching result can be determined by existing natural language processing algorithms, for example semantic similarity calculation; for frequently misrecognized and ambiguous words, engineers can train the matching algorithm in a targeted way and customize the vehicle control word bank to improve matching accuracy. For example, suppose the user intends to say "open the vehicle condition display interface", the cloud voice recognition result is "open vehicle condition", and the cloud semantic recognition result is "open the window"; the cloud semantic recognition result is clearly wrong. Recognition with the vehicle control word bank can instead yield the correct result, "open vehicle condition". Even if the target result cannot be obtained at first, a maintenance engineer can quickly update the vehicle control word bank or the semantic analysis algorithm to obtain the correct result, without being limited by the cloud.
Through this recognition process, the speech recognized in the cloud undergoes semantic recognition against the vehicle-specific vehicle control word bank, so that the correct semantics can be obtained quickly.
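The judgment rule of step S30 reduces to a conjunction of the two matching results, sketched below. The function names and the use of `None` to represent "not matched" are assumptions for the example.

```python
from typing import Optional


def includes_supported_semantic(action_match: Optional[str],
                                object_match: Optional[str]) -> bool:
    """True only when both an action word and an object word were matched."""
    return action_match is not None and object_match is not None


def next_step(action_match: Optional[str], object_match: Optional[str]) -> str:
    """Route to S40 (obtain second result) or S50 (generate prompt)."""
    if includes_supported_semantic(action_match, object_match):
        return "S40"
    return "S50"


print(next_step("open", "fog lamp"))  # both matched
print(next_step("open", None))        # object missing
```

Requiring both parts means that a recognized action on an unknown object never reaches instruction execution, which is how the method avoids executing wrong instructions.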
Step S40: if yes, obtaining a second voice recognition result based on the matching result; wherein the second voice recognition result is used for voice control of the vehicle.
In step S40, when the first voice recognition result includes the first vehicle control type semantic, a second voice recognition result is obtained based on the matching result. There are two cases for the second voice recognition result determined in this embodiment: 1. When the recognized first vehicle control type semantic is the same as the semantic contained in the first voice recognition result, the first voice recognition result is taken as the second voice recognition result. 2. When the recognized first vehicle control type semantic differs from the semantic contained in the first voice recognition result, the first voice recognition result is corrected based on the matching result to obtain the second voice recognition result. For example, when the user intends to say "turn off the ambience lamp" but the cloud voice recognition result is "turn off the phoenix tail lamp", the matching result found in the vehicle control word bank is "turn off the ambience lamp", and the first voice recognition result "turn off the phoenix tail lamp" is corrected to the second voice recognition result "turn off the ambience lamp". Finally, a corresponding vehicle control instruction is generated from the second voice recognition result to perform voice control on the vehicle.
Step S50: if not, generating prompt information based on the matching result.
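The two cases of step S40 can be sketched as follows. The function assumes the matching step has already produced an action word and an object word, and it rebuilds the corrected text from a fixed "action the object" template; both the function name and the template are assumptions, since the source does not say how the corrected text is formed.

```python
def second_recognition_result(first_result: str, action: str, obj: str) -> str:
    """Return the text used for vehicle control.

    Case 1: the first result already contains the matched words, keep it.
    Case 2: otherwise rebuild the command from the matching result
            (hypothetical template, not specified by the source).
    """
    text = first_result.lower()
    if action in text and obj in text:
        return first_result
    return f"{action} the {obj}"


# Case 1: cloud ASR text already agrees with the matching result.
print(second_recognition_result("turn off the ambience lamp",
                                "turn off", "ambience lamp"))
# Case 2: cloud ASR misheard the object; the matching result corrects it.
print(second_recognition_result("turn off the phoenix tail lamp",
                                "turn off", "ambience lamp"))
```

How the matcher arrives at "ambience lamp" from the misheard text is outside this sketch; the description leaves that to similarity calculation and word bank tuning.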
In step S50, when the first speech recognition result does not include the first vehicle control type semantic, that is, the corresponding semantic for control cannot be found in the vehicle control thesaurus, the prompt information is generated based on the matching result. Specifically, the process of generating the prompt message may include:
First, a semantic recognition result corresponding to the first voice recognition result is obtained; this semantic recognition result is the recognition result of the cloud. Then, it is judged whether the semantic recognition result is a second vehicle control type semantic, that is, a semantic that cannot be recognized through the vehicle control word bank.
1. If the semantic recognition result is the second vehicle control type semantic, the first voice recognition result is beyond the recognition range of the vehicle control word bank defined for the current vehicle. Two cases are possible:
1) the vehicle control word bank contains the semantics which the user wants to express, but the corresponding matching result cannot be matched in the vehicle control word bank through the first voice recognition result.
In this case, first prompt information prompting the user to change the phrasing may be generated based on the semantic recognition result. Specifically, the corresponding phrasing information may be obtained based on the semantic recognition result. The phrasing information is the standard phrasing that matches the semantic recognition result; for example, the standard phrasing corresponding to the vehicle control semantic "open window" is "open the window". The phrasing information may be obtained from a phrasing library pre-stored on the vehicle. The first prompt information is then generated based on the phrasing information, and finally the TTS module on the vehicle outputs it as speech. For example, the first prompt information output may be: "Sorry, the system does not support this. If you want to open the window, you can say 'open the window'."
2) The vehicle control word bank does not contain the semantics which the user wants to express.
This case indicates that the current vehicle does not support the vehicle control action corresponding to the semantic recognition result. First prompt information indicating that the vehicle control function is not supported may be generated. For example, if the semantic recognition result is "open trunk" and the vehicle control word bank does not contain this semantic, that is, the vehicle does not support the function, a voice prompt such as "Sorry, the system does not support this function yet; it is working hard to learn this skill" may be output.
When the phrasing corresponding to the cloud semantic recognition result falls outside the keyword vocabulary defined for the vehicle's vehicle control semantics, a wrong vehicle control instruction might be executed, creating a safety risk. Performing the above judgment and prompting the user effectively avoids the execution of wrong instructions.
2. If the semantic recognition result is not the second vehicle control type semantic, the user's speech cannot be used for vehicle control, and second prompt information may be generated based on the semantic recognition result. The second prompt information may be the semantic recognition result recognized by the cloud, output to the vehicle-mounted human-machine interface for display. Alternatively, a voice prompt may indicate that the request is not supported.
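The branches of step S50 can be sketched as a small decision function. Everything named here is hypothetical: the set `CONTROL_SEMANTICS` stands in for "the cloud semantic is a vehicle control semantic", the dictionary `STANDARD_PHRASING` stands in for the pre-stored phrasing library, and the presence of an entry in that library is used as a proxy for "the vehicle supports the function". The prompt wording is illustrative.

```python
from typing import Dict, Set, Tuple

# Hypothetical data; real content would come from the vehicle's libraries.
CONTROL_SEMANTICS: Set[str] = {"open window", "open trunk"}
STANDARD_PHRASING: Dict[str, str] = {"open window": "open the window"}


def build_prompt(cloud_semantic: str) -> Tuple[str, str]:
    """Return (prompt kind, prompt text) for a result that failed matching."""
    if cloud_semantic in CONTROL_SEMANTICS:
        phrasing = STANDARD_PHRASING.get(cloud_semantic)
        if phrasing is not None:
            # Case 1): function supported, but the phrasing was not matched.
            return ("first",
                    f'Sorry, the system does not support this. '
                    f'You can say "{phrasing}".')
        # Case 2): the vehicle does not support this function.
        return ("first", "Sorry, the system does not support this function yet.")
    # Not a vehicle control semantic: show the cloud result on the HMI.
    return ("second", cloud_semantic)


print(build_prompt("open window"))
print(build_prompt("open trunk"))
print(build_prompt("weather today"))
```

In all three branches no vehicle control instruction is issued; the user only receives guidance or a display of the cloud result.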
In addition, this embodiment provides examples of misrecognition involving the cloud voice recognition result (cloud ASR result), the cloud semantic recognition result (cloud NLU result), and the vehicle-side semantic recognition result (vehicle control word bank recognition / local VC_NLU result), as shown in Tables 1 and 2 below:
TABLE 1
Figure BDA0002827814650000091
TABLE 2
Figure BDA0002827814650000092
Figure BDA0002827814650000101
As Tables 1 and 2 show, the method of this embodiment can effectively recognize the real intention of the user and correct the cloud voice recognition result so that the vehicle control action is executed; even when the vehicle control action cannot be executed, a corresponding guidance prompt can be provided to the user.
It should be noted that, as shown in table 1 and table 2, the method of this embodiment can solve at least the following problems:
1. The cloud ASR result does not match the user intention and the cloud NLU returns no semantic result, so the request fails and the user intention is not executed.
2. Neither the cloud ASR result nor the cloud NLU result matches the user intention, so the request fails and an action that does not match the user intention is executed.
3. The cloud ASR result matches the user intention, but the cloud NLU semantic result does not match the user's vehicle control intention, so the request fails and an action that does not match the user intention is executed.
4. The cloud ASR result matches the user intention, but the cloud NLU returns no semantic result, so the request fails and the user intention is not executed.
5. The cloud ASR result matches the user intention, but the cloud NLU returns an incorrect vehicle control semantic result, so the request fails and an action that does not match the user intention is executed.
In summary, the vehicle control voice recognition method provided in this embodiment includes: acquiring a first voice recognition result of a cloud; matching the first voice recognition result in a preset vehicle control word bank to obtain a matching result; the vehicle control word bank is matched with the vehicle subjected to voice control; finally, judging whether the first voice recognition result comprises a first vehicle control semantic based on the matching result, wherein the first vehicle control semantic is a vehicle control semantic supported by the vehicle; if so, obtaining a second voice recognition result based on the matching result; the second voice recognition result is used for carrying out voice control on the vehicle; and if not, generating prompt information based on the matching result. In the whole recognition process, the special vehicle control word bank is adopted for semantic matching, and cloud semantic recognition is not directly used. After semantic matching is carried out by using the vehicle control word bank, whether the first voice recognition result contains the first vehicle control type semantic meaning or not can be judged, and therefore the real intention of the user can be recognized. And finally, a second voice recognition result for performing voice control on the vehicle is generated according to the matching result, so that the reliability is improved, and the false recognition and the execution of a false instruction are avoided.
Second embodiment
Referring to Fig. 6, based on the same inventive concept, a second embodiment of the present invention provides a vehicle control voice recognition device 300. Fig. 6 is a schematic structural diagram of the functional modules of the vehicle control voice recognition device according to the second embodiment of the present invention. The vehicle control voice recognition device 300 includes:
an obtaining module 301, configured to obtain a first voice recognition result from the cloud;
a matching module 302, configured to match the first voice recognition result against a preset vehicle control word bank to obtain a matching result, wherein the vehicle control word bank corresponds to the vehicle under voice control;
a judging module 303, configured to judge, based on the matching result, whether the first voice recognition result includes a first vehicle control type semantic, wherein the first vehicle control type semantic is a vehicle control semantic supported by the vehicle;
a first processing module 304, configured to obtain, if so, a second voice recognition result based on the matching result, wherein the second voice recognition result is used for voice control of the vehicle;
and a second processing module 305, configured to generate, if not, prompt information based on the matching result.
As an optional implementation manner, the vehicle control word bank includes a vehicle control action word bank and a vehicle control object word bank, the matching result includes an action matching result and an object matching result, and the matching module 302 is specifically configured to:
matching the first voice recognition result in the vehicle control action word bank to obtain an action matching result; and matching the first voice recognition result in the vehicle control object word bank to obtain an object matching result.
As an optional implementation manner, the judging module 303 is specifically configured to:
if the action matching result contains the vehicle control action words in the vehicle control action word bank and the object matching result contains the vehicle control object words in the vehicle control object word bank, determining that the first voice recognition result comprises a first vehicle control type semantic; otherwise, determining that the first voice recognition result does not comprise the first vehicle control type semantic.
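The two-bank matching and the judgment rule described above can be sketched as follows. The word bank contents, the function names, and the whitespace tokenization are assumptions made for illustration; a Chinese-language system would more likely use substring or segmentation-based matching.

```python
# Hypothetical word banks for one vehicle model.
ACTION_WORD_BANK = frozenset({"open", "close", "lock", "unlock"})
OBJECT_WORD_BANK = frozenset({"window", "sunroof", "trunk", "door"})


def match_bank(first_result, word_bank):
    """Return the words of the ASR text that appear in the given word bank."""
    tokens = first_result.lower().split()
    return [t for t in tokens if t in word_bank]


def includes_supported_semantic(first_result):
    """True only if both an action word and an object word were matched."""
    action_result = match_bank(first_result, ACTION_WORD_BANK)
    object_result = match_bank(first_result, OBJECT_WORD_BANK)
    return bool(action_result) and bool(object_result)
```

Under these assumptions, "open the sunroof" passes the check, while "open the music" (action only) and "the window is dirty" (object only) do not.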
As an optional implementation manner, the first processing module 304 is specifically configured to:
correcting the first voice recognition result based on the matching result to obtain the second voice recognition result.
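This passage does not spell out how the correction is carried out. One possible reading, shown here purely as an assumption, is that the matched words are normalized to canonical command words and the filler around them is dropped. The banks, the synonym mappings, and the function names below are all hypothetical.

```python
# Hypothetical word banks mapping surface forms to canonical command words.
ACTION_BANK = {"open": "open", "roll down": "open", "close": "close", "shut": "close"}
OBJECT_BANK = {"window": "window", "windows": "window", "sunroof": "sunroof"}


def first_canonical(text, bank):
    """Return the canonical word for the earliest (then longest) surface form in text."""
    hits = [(text.find(surface), -len(surface), canonical)
            for surface, canonical in bank.items() if surface in text]
    return min(hits)[2] if hits else None


def correct(first_result):
    """Rebuild a canonical 'action object' command, or None if either part is missing."""
    text = first_result.lower()
    action = first_canonical(text, ACTION_BANK)
    obj = first_canonical(text, OBJECT_BANK)
    if action is None or obj is None:
        return None
    return f"{action} {obj}"
```

With these toy banks, a colloquial request such as "please shut the windows" is rewritten to the canonical command "close window", which a downstream vehicle controller could consume without needing the cloud NLU result.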
As an optional implementation manner, the second processing module 305 is specifically configured to:
obtaining a semantic recognition result corresponding to the first voice recognition result; judging whether the semantic recognition result is a second vehicle control type semantic; if so, generating, based on the semantic recognition result, first prompt information that conveys a "not supported" response script; and if not, generating second prompt information based on the semantic recognition result.
As an optional implementation manner, the second processing module 305 is further specifically configured to:
acquiring corresponding response script information based on the semantic recognition result; and generating the first prompt information based on the response script information.
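The prompt branch can be sketched as below. Several things here are assumptions rather than statements from the source: the semantic recognition result is modeled as a dict with hypothetical `domain`, `intent`, and `reply` keys; a "second vehicle control type semantic" is taken to mean a vehicle control semantic that this vehicle does not support; and the script table is invented for the example.

```python
# Hypothetical response scripts keyed by the intent in the semantic result.
SCRIPTS = {
    "open_sunroof": "Sorry, this vehicle has no sunroof.",
}
DEFAULT_SCRIPT = "Sorry, this vehicle does not support that function."


def make_prompt(semantic_result):
    """Return first or second prompt information for an unsupported request."""
    if semantic_result.get("domain") == "vehicle_control":
        # Vehicle control semantic the vehicle cannot execute:
        # look up the response script and use it as the first prompt.
        script = SCRIPTS.get(semantic_result.get("intent"), DEFAULT_SCRIPT)
        return {"type": "first", "text": script}
    # Not a vehicle control request: build the second prompt
    # from the semantic recognition result itself.
    return {"type": "second", "text": semantic_result.get("reply", "")}
```

For example, a result with domain `vehicle_control` and intent `open_sunroof` yields the sunroof script as the first prompt, while a music request falls through to the second prompt.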
It should be noted that the implementation and technical effects of the vehicle control voice recognition device 300 provided by this embodiment are the same as those of the foregoing method embodiment. For brevity, for anything not mentioned in this device embodiment, refer to the corresponding content of the foregoing method embodiment.
Third embodiment
Based on the same inventive concept, a third embodiment of the present invention further provides a vehicle control voice recognition device, which includes a processor and a memory, wherein the memory is coupled to the processor and stores instructions that, when executed by the processor, cause the vehicle control voice recognition device to perform the steps of any one of the above method embodiments.
It should be noted that, in the vehicle control voice recognition device provided by this embodiment, the specific implementation of each step and the resulting technical effects are the same as those of the foregoing method embodiment. For brevity, for anything not mentioned in this embodiment, refer to the corresponding content of the foregoing method embodiment.
Fourth embodiment
Based on the same inventive concept, a fourth embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, performs the steps of any one of the above method embodiments.
It should be noted that, in the computer-readable storage medium provided by this embodiment, the specific implementation of each step performed when the program is executed by the processor, and the resulting technical effects, are the same as those of the foregoing method embodiment. For brevity, for anything not mentioned in this embodiment, refer to the corresponding content of the foregoing method embodiment.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A vehicle control voice recognition method, characterized by comprising:
acquiring a first voice recognition result from the cloud;
matching the first voice recognition result against a preset vehicle control word bank to obtain a matching result, wherein the vehicle control word bank corresponds to the vehicle under voice control;
judging, based on the matching result, whether the first voice recognition result includes a first vehicle control type semantic, wherein the first vehicle control type semantic is a vehicle control semantic supported by the vehicle;
if so, obtaining a second voice recognition result based on the matching result, wherein the second voice recognition result is used for voice control of the vehicle;
if not, generating prompt information based on the matching result.

2. The method according to claim 1, characterized in that the vehicle control word bank includes a vehicle control action word bank and a vehicle control object word bank, the matching result includes an action matching result and an object matching result, and the matching of the first voice recognition result against the preset vehicle control word bank to obtain the matching result comprises:
matching the first voice recognition result against the vehicle control action word bank to obtain the action matching result;
matching the first voice recognition result against the vehicle control object word bank to obtain the object matching result.

3. The method according to claim 2, characterized in that the judging, based on the matching result, whether the first voice recognition result includes the first vehicle control type semantic comprises:
if the action matching result contains a vehicle control action word from the vehicle control action word bank and the object matching result contains a vehicle control object word from the vehicle control object word bank, determining that the first voice recognition result includes the first vehicle control type semantic;
otherwise, determining that the first voice recognition result does not include the first vehicle control type semantic.

4. The method according to claim 1, characterized in that the obtaining of the second voice recognition result based on the matching result comprises:
correcting the first voice recognition result based on the matching result to obtain the second voice recognition result.

5. The method according to claim 1, characterized in that the generating of prompt information based on the matching result comprises:
obtaining a semantic recognition result corresponding to the first voice recognition result;
judging whether the semantic recognition result is a second vehicle control type semantic;
if so, generating, based on the semantic recognition result, first prompt information that conveys a "not supported" response script;
if not, generating second prompt information based on the semantic recognition result.

6. The method according to claim 5, characterized in that the generating of the first prompt information based on the semantic recognition result comprises:
acquiring corresponding response script information based on the semantic recognition result;
generating the first prompt information based on the response script information.

7. A vehicle control voice recognition device, characterized by comprising:
an obtaining module, configured to obtain a first voice recognition result from the cloud;
a matching module, configured to match the first voice recognition result against a preset vehicle control word bank to obtain a matching result, wherein the vehicle control word bank corresponds to the vehicle under voice control;
a judging module, configured to judge, based on the matching result, whether the first voice recognition result includes a first vehicle control type semantic, wherein the first vehicle control type semantic is a vehicle control semantic supported by the vehicle;
a first processing module, configured to obtain, if so, a second voice recognition result based on the matching result, wherein the second voice recognition result is used for voice control of the vehicle;
a second processing module, configured to generate, if not, prompt information based on the matching result.

8. The device according to claim 7, characterized in that the vehicle control word bank includes a vehicle control action word bank and a vehicle control object word bank, the matching result includes an action matching result and an object matching result, and the matching module is specifically configured to:
match the first voice recognition result against the vehicle control action word bank to obtain the action matching result;
match the first voice recognition result against the vehicle control object word bank to obtain the object matching result.

9. A vehicle control voice recognition device, characterized by comprising a processor and a memory, wherein the memory is coupled to the processor and stores instructions that, when executed by the processor, cause the vehicle control voice recognition device to perform the steps of the method according to any one of claims 1-6.

10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1-6.
CN202011454179.XA 2020-12-10 2020-12-10 Vehicle control voice recognition method and device Pending CN112652308A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011454179.XA CN112652308A (en) 2020-12-10 2020-12-10 Vehicle control voice recognition method and device

Publications (1)

Publication Number Publication Date
CN112652308A (en) 2021-04-13

Family

ID=75353549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011454179.XA Pending CN112652308A (en) 2020-12-10 2020-12-10 Vehicle control voice recognition method and device

Country Status (1)

Country Link
CN (1) CN112652308A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114203179A (en) * 2021-10-28 2022-03-18 山东浪潮科学研究院有限公司 Speech semantic understanding method and device
CN116486815A (en) * 2023-04-25 2023-07-25 成都赛力斯科技有限公司 Vehicle-mounted voice signal processing method and device
CN117056473A (en) * 2023-07-20 2023-11-14 美的集团(上海)有限公司 An equipment control method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109003608A (en) * 2018-08-07 2018-12-14 北京东土科技股份有限公司 Court's trial control method, system, computer equipment and storage medium
CN109515449A (en) * 2018-11-09 2019-03-26 百度在线网络技术(北京)有限公司 The method and apparatus interacted for controlling vehicle with mobile unit

Similar Documents

Publication Publication Date Title
CN108922564B (en) Emotion recognition method and device, computer equipment and storage medium
CN112652308A (en) Vehicle control voice recognition method and device
US9202459B2 (en) Methods and systems for managing dialog of speech systems
US20210174797A1 (en) Voice command recognition device and method thereof
CN110001558A (en) Method for controlling a vehicle and device
CN104795065A (en) Method for increasing speech recognition rate and electronic device
CN110570867A (en) Voice processing method and system for locally added corpus
CN117565887A (en) Service recommendation method, vehicle-mounted terminal and vehicle
CN118782032A (en) A method, device and electronic device for processing voice commands
CN113597641A (en) Voice processing method, device and system
US10468017B2 (en) System and method for understanding standard language and dialects
CN115985317A (en) Information processing method, device, vehicle and storage medium
US20140343947A1 (en) Methods and systems for managing dialog of speech systems
US11664018B2 (en) Dialogue system, dialogue processing method
CN115019797A (en) Voice interaction method and server
CN118314899B (en) Voice information processing method, device, vehicle and readable storage medium
CN111027667B (en) Method and device for identifying intention category
CN120162233A (en) Method for generating vehicle computer function test instructions, readable storage medium and program product
CN113539254A (en) Voice interaction method and system based on action engine and storage medium
CN117153160B (en) Voice information recognition method, device, electronic device and storage medium
CN118262707A (en) Semantic recognition method, electronic device and storage medium thereof
US9858918B2 (en) Root cause analysis and recovery systems and methods
CN118629400A (en) Voice interaction method, device, equipment, storage medium and vehicle
CN114529641A (en) Intelligent network connection automobile assistant dialogue and image management system and method
US20150039312A1 (en) Controlling speech dialog using an additional sensor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210413)