CN114155864B - Elevator control method, device, electronic device and readable storage medium - Google Patents
- Publication number
- CN114155864B (application number CN202111421777.1A)
- Authority
- CN
- China
- Prior art keywords
- voice recognition
- recognition result
- call instruction
- recognition information
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Indicating And Signalling Devices For Elevators (AREA)
- Elevator Control (AREA)
Abstract
The application discloses an elevator control method, an elevator control device, an electronic device and a readable storage medium, and belongs to the technical field of artificial intelligence. The method comprises: when two call instructions are received within a first preset time period and a second voice recognition result corresponding to the second call instruction is identical to a first voice recognition result corresponding to the first call instruction, determining a first target recognition result according to the recognition information of the first call instruction other than the recognition information with the highest voice matching score and/or the recognition information of the second call instruction other than the recognition information with the highest voice matching score; and controlling an elevator to respond to the second call instruction according to the first target recognition result. The application can improve the accuracy of recognizing call instructions and the operating efficiency of the elevator.
Description
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to an elevator control method, an elevator control device, an electronic device and a readable storage medium.
Background
With the continuous development of speech recognition technology, the demand for speech recognition in different fields is increasing. To improve the riding experience of users, contact-free intelligent elevators based on voice recognition have emerged.
Currently, when a user speaks a call instruction in a car, an elevator system can recognize the instruction content using voice recognition technology and control elevator operation accordingly. For example, when the instruction content is "stop at the sixth floor", the system lights the button corresponding to the sixth floor and stops the elevator when it reaches the sixth floor. However, the recognized instruction content may be wrong, and the elevator control device cannot detect this, so the elevator operates incorrectly, for example by stopping at a floor where it should not stop or by opening the door when the door should be closed, which lowers the operating efficiency of the elevator.
Disclosure of Invention
The embodiments of the application aim to provide an elevator control method, an elevator control device, an electronic device and a readable storage medium, which can solve the problem of low accuracy in recognizing call instructions.
In a first aspect, an embodiment of the present application provides an elevator control method, including:
when two call instructions are received within a first preset time period, obtaining a second voice recognition result of the second call instruction, wherein the first call instruction is received before the second call instruction;
when the second voice recognition result is the same as the first voice recognition result of the first call instruction, determining a first target recognition result according to first voice recognition information and/or second voice recognition information;
controlling an elevator to respond to the second call instruction according to the first target recognition result;
wherein the first voice recognition information is the voice recognition information in a first voice recognition information set other than the first voice recognition result, the first voice recognition information set comprising at least one piece of information obtained by voice recognition of the first call instruction; and the second voice recognition information is the voice recognition information in a second voice recognition information set other than the second voice recognition result, the second voice recognition information set comprising at least one piece of information obtained by voice recognition of the second call instruction.
In a second aspect, an embodiment of the present application provides an elevator control apparatus including:
a first recognition module, used for obtaining a second voice recognition result of a second call instruction when two call instructions are received within a first preset time period, wherein the first call instruction is received before the second call instruction;
a first determining module, used for determining a first target recognition result according to first voice recognition information and/or second voice recognition information when the second voice recognition result is the same as the first voice recognition result of the first call instruction;
a first control module, used for controlling the elevator to respond to the second call instruction according to the first target recognition result;
wherein the first voice recognition information is the voice recognition information in a first voice recognition information set other than the first voice recognition result, the first voice recognition information set comprising at least one piece of information obtained by voice recognition of the first call instruction; and the second voice recognition information is the voice recognition information in a second voice recognition information set other than the second voice recognition result, the second voice recognition information set comprising at least one piece of information obtained by voice recognition of the second call instruction.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instruction stored on the memory and executable on the processor, the program or instruction implementing the steps of the method according to the first aspect when executed by the processor.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor perform the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.
In the embodiment of the application, when two call instructions are received within a first preset time period and the voice recognition results of the two call instructions are the same, it may be determined that the current voice recognition result may be wrong, and then the target recognition result corresponding to the second call instruction may be determined according to other recognition information of the first call instruction except the recognition information with the highest voice matching score and/or other recognition information of the second call instruction except the recognition information with the highest voice matching score. The application can sense the voice recognition result which is possibly wrong, and eliminate the wrong voice recognition result under the condition of wrong voice recognition result, determine the instruction content corresponding to the call instruction according to other voice recognition information, and timely control the elevator to respond, thereby improving the recognition accuracy of the call instruction and further improving the operation efficiency of the elevator.
Drawings
Fig. 1 is a flowchart of an elevator control method provided by an embodiment of the present application;
Fig. 2 is a block diagram of an elevator control apparatus according to an embodiment of the present application;
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which are obtained by a person skilled in the art based on the embodiments of the present application, fall within the scope of protection of the present application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type, and are not limited to the number of objects, such as the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The elevator control method provided by the embodiment of the application is described in detail through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a schematic flow chart of an elevator control method according to an embodiment of the present application. The elevator control method may be performed by an elevator control system.
Step 101, when two call instructions are received within a first preset time period, a second voice recognition result of the second call instruction is obtained, and the first call instruction is received before the second call instruction.
In a specific implementation, the two call instructions are a first call instruction and a second call instruction, where the second call instruction is received after the first call instruction and the time interval between the two does not exceed the first preset time period. It is understood that the first preset time period may be short, for example 1 second, 2 seconds or 3 seconds, so that the interval between the first call instruction and the second call instruction is short. The user who issues the first call instruction may or may not be the same person as the user who issues the second call instruction. The first call instruction and the second call instruction are audio instructions that the elevator control system detects in the ambient audio signal and that carry call keywords. The call keywords may be floor keywords, or keywords such as "open the door" or "close the door", and may be determined according to the actual situation; the embodiments of the present application are not limited in this respect.
When the second call instruction is received, the elevator control system can conduct voice recognition on the second call instruction to obtain a voice recognition result corresponding to the second call instruction. In one example, the elevator control system includes a voice recognition module for voice recognition of received call instructions.
In an alternative implementation, the elevator control system is preset with an acoustic model, a language model and a decoder. The input of the acoustic model is the acoustic features of the audio corresponding to the second call instruction, such as Mel-frequency cepstral coefficients (MFCC) or filter bank (FBANK) features. The output of the acoustic model is at least one pronunciation combination corresponding to the audio, together with the pronunciation matching probability and the acoustic matching score of each pronunciation combination, where the acoustic matching score is determined based on the pronunciation matching probability. The input of the language model is the at least one pronunciation combination output by the acoustic model. The output of the language model is at least one piece of recognition information (a combination of characters and words) corresponding to the audio, together with the language matching probability and the language matching score of each piece of recognition information, where the language matching score is determined based on the language matching probability. The decoder combines the outputs of the acoustic model and the language model to construct a decoding graph, searches for the best matching path, and outputs the recognition information with the highest voice matching score as the voice recognition result of the second call instruction.
It should be understood that the implementation of voice recognition of the second call instruction is not limited thereto, and specific reference may be made to the description of voice recognition in the related art, which is not limited thereto.
Step 102, when the second voice recognition result is the same as the first voice recognition result of the first call instruction, determining a first target recognition result according to the first voice recognition information and/or the second voice recognition information.
The first voice recognition information is the voice recognition information in a first voice recognition information set other than the first voice recognition result, the first voice recognition information set comprising at least one piece of information obtained by voice recognition of the first call instruction. The second voice recognition information is the voice recognition information in a second voice recognition information set other than the second voice recognition result, the second voice recognition information set comprising at least one piece of information obtained by voice recognition of the second call instruction.
In practical applications, after a user inputs a call instruction, the elevator control system determines the user's intention from the obtained voice recognition result and controls the elevator to respond to the call instruction. For example, when the user says "stop at the sixth floor" in the car, the elevator control system may recognize the floor information "sixth floor" contained in the call instruction and light the button corresponding to the sixth floor in the elevator car. In general, once a user who needs to go to the sixth floor has confirmed that the sixth-floor button is lit, that user will not input another floor-selection call instruction.
In the embodiment of the application, if two call instructions are received within a short time and the voice recognition results obtained for the two call instructions are the same, this indicates that the elevator responded incorrectly to the earlier first call instruction, which is why the user called a second time. For example, the user says "stop at the sixth floor" in the car, but the floor information recognized by the elevator control system is "sixteenth floor", and the button corresponding to the sixteenth floor in the elevator car is lit. After confirming that the sixth-floor button is not lit, the user who needs to go to the sixth floor again inputs a similar call instruction such as "stop at the sixth floor", but the floor information recognized by the elevator control system is still "sixteenth floor". In this case, the elevator control system may refrain from responding to the second call instruction according to the second voice recognition result, and instead redetermine the recognition result corresponding to the second call instruction, that is, the first target recognition result, according to the first voice recognition information and/or the second voice recognition information.
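The trigger condition described in steps 101 and 102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the names `CallInstruction` and `needs_redetermination`, and the 3-second value of `FIRST_PRESET_DURATION_S`, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Assumed value; the description only offers 1, 2 or 3 seconds as examples.
FIRST_PRESET_DURATION_S = 3.0


@dataclass
class CallInstruction:
    received_at: float       # time of receipt, in seconds
    recognition_result: str  # recognition information with the highest voice matching score


def needs_redetermination(first: CallInstruction,
                          second: CallInstruction,
                          window_s: float = FIRST_PRESET_DURATION_S) -> bool:
    """Return True when the second call arrives within the window after the
    first call and both calls produced the same voice recognition result."""
    interval = second.received_at - first.received_at
    same_result = second.recognition_result == first.recognition_result
    return 0 <= interval <= window_s and same_result


# The user says "stop at the sixth floor" twice; both times the top result is wrong.
first_call = CallInstruction(received_at=10.0, recognition_result="sixteenth floor")
second_call = CallInstruction(received_at=12.0, recognition_result="sixteenth floor")
repeat_detected = needs_redetermination(first_call, second_call)
```

When `repeat_detected` is true, the system would go on to consult the lower-ranked recognition information rather than act on the repeated top result.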
In a specific implementation, when the elevator control system performs voice recognition on the first call instruction or the second call instruction, it generally obtains several pieces of recognition information that may match the instruction. Taking the first call instruction as an example, these pieces of recognition information are referred to as the first voice recognition information set. Each piece of recognition information in the first voice recognition information set corresponds to a voice matching score, which is determined based on the probability that the recognition information matches the first call instruction. In general, the final recognition result of voice recognition is the recognition information with the highest voice matching score in the first voice recognition information set.
When the recognition information with the highest voice matching score is wrong, the correct recognition information may be among the remaining recognition information. Illustratively, the voice recognition information of a call instruction includes "sixteenth floor", "sixth floor" and "twenty-sixth floor", arranged in descending order of voice matching score. If "sixteenth floor", which has the highest voice matching score, is wrong, then "sixth floor" or "twenty-sixth floor" may be correct. Determining the final recognition result from "sixth floor" or "twenty-sixth floor" therefore gives higher recognition accuracy than having the elevator respond according to "sixteenth floor", which has already been judged wrong.
The elevator control system may determine the target recognition result corresponding to the second call instruction, referred to as the first target recognition result, according to the voice recognition information in the first voice recognition information set other than the first voice recognition result and/or the voice recognition information in the second voice recognition information set other than the second voice recognition result.
In particular, in an alternative embodiment, the elevator control system may determine the first voice recognition information according to the voice matching score of each piece of recognition information in the first voice recognition information set. For example, a first preset range may be set in advance, and the recognition information whose voice matching score falls within the first preset range may be determined as the first voice recognition information, where the upper limit of the first preset range is smaller than the voice matching score of the first voice recognition result. Alternatively, the recognition information with the second highest or third highest voice matching score may be determined as the first voice recognition information. In another alternative embodiment, the elevator control system may determine the first voice recognition information based on the acoustic matching scores of the pieces of recognition information in the first voice recognition information set. The specific approach may be determined according to actual requirements and is not limited herein.
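The two score-based selection options described above (a preset score range, or the second and third ranked items) might look like the following sketch. The function names, the score values 90.0 and 70.0, and the range bounds 80.0 and 89.0 are hypothetical; the description does not specify concrete values for the first preset range.

```python
from typing import Dict, List


def ranked_alternatives(scores: Dict[str, float]) -> List[str]:
    """Recognition information sorted by descending voice matching score,
    with the top-scoring item (the recognition result) removed."""
    ranked = sorted(scores, key=lambda text: scores[text], reverse=True)
    return ranked[1:]


def alternatives_in_range(scores: Dict[str, float],
                          lower: float, upper: float) -> List[str]:
    """Alternatives whose score lies in [lower, upper]. The upper limit must
    be below the score of the recognition result, as the description requires."""
    top_score = max(scores.values())
    if upper >= top_score:
        raise ValueError("upper limit must be smaller than the top score")
    return [text for text in ranked_alternatives(scores)
            if lower <= scores[text] <= upper]


# Hypothetical first voice recognition information set with its scores.
first_set = {"sixteenth floor": 90.0, "sixth floor": 87.0, "twenty-sixth floor": 70.0}

in_range = alternatives_in_range(first_set, lower=80.0, upper=89.0)
second_and_third = ranked_alternatives(first_set)[:2]
```

With these assumed scores, the range-based option keeps only "sixth floor", while the rank-based option keeps both lower-ranked items.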
Note that, the embodiment for determining the second speech recognition information may be adaptively adjusted with reference to the above embodiment for determining the first speech recognition information, which is not described herein.
Step 103, controlling the elevator to respond to the second call instruction according to the first target recognition result.
In a specific implementation, the elevator control system can control the elevator to respond to the second call instruction according to the specific content of the first target recognition result, for example by lighting the floor button, opening the elevator door or closing the elevator door. For details, refer to the description in the related art, which is not repeated herein. In one example, the elevator control system includes an elevator control module for receiving the recognition results of the voice recognition module and controlling the elevator to respond according to those results.
In the embodiment of the application, when two call instructions are received within the first preset time period and the voice recognition results of the two call instructions are the same, it can be determined that the current voice recognition result is possibly wrong, and then the target recognition result corresponding to the second call instruction can be determined according to other recognition information except the recognition information with the highest voice matching score of the first call instruction and/or other recognition information except the recognition information with the highest voice matching score of the second call instruction. The application can sense the voice recognition result which is possibly wrong, and eliminate the wrong voice recognition result under the condition of wrong voice recognition result, determine the instruction content corresponding to the call instruction according to other voice recognition information, and timely control the elevator to respond, thereby improving the recognition accuracy of the call instruction and further improving the operation efficiency of the elevator.
Optionally, the first voice recognition information is the information with the second highest voice matching score in the first voice recognition information set, and/or the second voice recognition information is the information with the second highest voice matching score in the second voice recognition information set.
In this embodiment, the elevator control system may determine the first target recognition result according to the recognition information with the second highest voice matching score corresponding to the two call instructions, so as to further improve the accuracy of instruction recognition.
In particular, in an alternative embodiment, the elevator control system may be preset with an acoustic model and a language model for performing voice recognition on the received call instruction. The acoustic model may output at least one pronunciation combination corresponding to the call instruction, together with the pronunciation matching probability and the acoustic matching score of each pronunciation combination. The language model may output at least one piece of recognition information (a combination of characters and words) corresponding to the call instruction, together with the language matching probability and the language matching score of each piece of recognition information. The elevator control system may take the sum of the acoustic matching score and the language matching score corresponding to each piece of recognition information as the voice matching score of that recognition information, and determine the information with the second highest voice matching score on that basis.
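As a rough sketch of this scoring rule, the snippet below sums an acoustic matching score and a language matching score per piece of recognition information and picks the runner-up. It assumes each piece of recognition information has already been associated with a single acoustic score, which simplifies the pronunciation-combination stage; the function names and all score values are hypothetical.

```python
from typing import Dict, Tuple


def voice_matching_scores(acoustic: Dict[str, int],
                          language: Dict[str, int]) -> Dict[str, int]:
    """Voice matching score = acoustic matching score + language matching score,
    computed for every piece of recognition information present in both inputs."""
    return {text: acoustic[text] + language[text]
            for text in language if text in acoustic}


def second_highest(scores: Dict[str, int]) -> Tuple[str, int]:
    """Return the (recognition information, score) pair ranked second."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two pieces of recognition information")
    return ranked[1]


# Hypothetical per-item scores for one call instruction.
acoustic_scores = {"sixteenth floor": 48, "sixth floor": 46, "twenty-sixth floor": 40}
language_scores = {"sixteenth floor": 44, "sixth floor": 41, "twenty-sixth floor": 38}

combined = voice_matching_scores(acoustic_scores, language_scores)
runner_up = second_highest(combined)
```

Here the top-ranked item would be "sixteenth floor" and the runner-up "sixth floor", mirroring the example used in the description.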
Once the first voice recognition information and/or the second voice recognition information has been determined, three specific embodiments are optionally available:
In the first embodiment, the elevator control system may determine the first target recognition result according to the first voice recognition information. That is, the elevator control system may determine the first target recognition result based on the recognition information with the second highest voice matching score in the first voice recognition information set.
In an alternative embodiment, the elevator control system may determine the first voice recognition information as the first target recognition result. For example, suppose the previously received call instruction is "stop at the sixth floor" and its voice recognition information comprises three items, which in descending order of voice matching score are "sixteenth floor", "sixth floor" and "twenty-sixth floor"; the elevator control system may then determine "sixth floor" as the first target recognition result. In another alternative embodiment, the elevator control system may recognize the second call instruction again based on the first voice recognition information, for example by raising the weight coefficient of the first voice recognition information during voice recognition.
In the second embodiment, the elevator control system may determine the first target recognition result according to the second voice recognition information. That is, the elevator control system may determine the first target recognition result based on the recognition information with the second highest voice matching score in the second voice recognition information set.
In an alternative embodiment, the elevator control system may determine the second voice recognition information as the first target recognition result. In another alternative embodiment, the elevator control system may recognize the second call instruction again based on the second voice recognition information.
In the third embodiment, the elevator control system may determine the first target recognition result according to both the first voice recognition information and the second voice recognition information. In an alternative embodiment, whichever of the first voice recognition information and the second voice recognition information has the higher voice matching score may be determined as the first target recognition result. Illustratively, if the recognition information with the second highest voice matching score for the previous call instruction is "sixth floor" with a voice matching score of 87, and the recognition information with the second highest voice matching score for the current call instruction is "twenty-sixth floor" with a voice matching score of 83, the elevator control system can determine "sixth floor" as the first target recognition result.
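The third embodiment reduces to a comparison of two scored candidates, which can be sketched as below using the scores 87 and 83 from the example. The function name `pick_higher` is hypothetical, and breaking a tie in favour of the first voice recognition information is an assumption; the description does not say how ties are handled.

```python
from typing import Tuple

# (recognition information, voice matching score)
Candidate = Tuple[str, float]


def pick_higher(first_info: Candidate, second_info: Candidate) -> str:
    """Return the recognition information with the higher voice matching score.
    Ties go to the first candidate (an assumption made for this sketch)."""
    if first_info[1] >= second_info[1]:
        return first_info[0]
    return second_info[0]


# Second-highest items for the previous and the current call instruction.
target = pick_higher(("sixth floor", 87), ("twenty-sixth floor", 83))
```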
In other embodiments, the first voice recognition information may be the information with the second highest acoustic matching score in the first voice recognition information set, and/or the second voice recognition information may be the information with the second highest acoustic matching score in the second voice recognition information set; this is not limited herein.
Optionally, the step 102 includes:
determining the second voice recognition information as the first target recognition result.
In the related art, the voice recognition model is generally optimized based on each received call instruction, so the recognition information obtained by performing voice recognition on the later second call instruction may be better than the recognition information obtained for the earlier call instruction.
In this embodiment, the elevator control system may determine the first target recognition result according to the second voice recognition information alone. Further optionally, when the second voice recognition information is unique, the second voice recognition information is determined as the first target recognition result, so as to further improve the accuracy of instruction recognition. In a specific example, the second voice recognition information is determined according to the voice matching score; further, the second voice recognition information is the recognition information corresponding to the second call instruction with the second highest voice matching score.
Optionally, the step 102 includes:
And determining a first target recognition result according to the first voice recognition information, the second voice recognition information, the first voice recognition result and the second voice recognition result.
In this embodiment, the ladder control system may determine the first target recognition result by combining the first voice recognition result and the second voice recognition result.
In particular, in an alternative embodiment, the ladder control system may calculate a difference between the first language identification information and the voice matching score of the first voice identification result, determine a first difference, calculate a difference between the second language identification information and the voice matching score of the second voice identification result, determine a second difference, and then further determine the first target identification result according to the first difference and the second difference. The first speech recognition information is determined to be a first target recognition result if the first difference is greater than the second difference, the second speech recognition information is determined to be a first target recognition result if the second difference is greater than the first difference, and the first speech recognition result is determined to be a first target recognition result if the first difference is equal to the second difference, optionally, the speech matching score between the first speech recognition information and the second speech recognition information is greater.
Alternatively, the elevator control system may calculate the ratio of the voice matching score of the first voice recognition information to the voice matching score of the first voice recognition result to obtain a first ratio, calculate the ratio of the voice matching score of the second voice recognition information to the voice matching score of the second voice recognition result to obtain a second ratio, and then determine the first target recognition result according to the first ratio and the second ratio. If the first ratio is greater than the second ratio, the first voice recognition information is determined as the first target recognition result; if the second ratio is greater than the first ratio, the second voice recognition information is determined as the first target recognition result; and if the first ratio is equal to the second ratio, optionally, whichever of the first voice recognition information and the second voice recognition information has the greater voice matching score is determined as the first target recognition result.
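The difference-based and ratio-based comparisons can be sketched in one function, offered here as a minimal illustration rather than as the implementation of the application. The function name `pick_first_target`, the `mode` parameter, and all scores are hypothetical; the sketch assumes that the scores of the top results are positive so that the ratio is defined, and it breaks a full tie in favor of the first voice recognition information, which the text does not specify.

```python
def pick_first_target(first_info: str, first_info_score: float, first_result_score: float,
                      second_info: str, second_info_score: float, second_result_score: float,
                      mode: str = "ratio") -> str:
    """Choose between the two alternative candidates by ratio or by difference."""
    if mode == "ratio":
        # Ratio of the alternative's score to the score of the (suspect) top result.
        first_metric = first_info_score / first_result_score
        second_metric = second_info_score / second_result_score
    else:
        # Difference between the alternative's score and the top result's score.
        first_metric = first_info_score - first_result_score
        second_metric = second_info_score - second_result_score

    if first_metric > second_metric:
        return first_info
    if second_metric > first_metric:
        return second_info
    # Optional tie rule from the text: prefer the candidate with the greater score.
    # Assumption of this sketch: a full tie falls back to the first candidate.
    return first_info if first_info_score >= second_info_score else second_info


# Hypothetical scores: first call (top 0.60, alternative "tenth floor" 0.55),
# second call (top 0.70, alternative "fourth floor" 0.35).
by_ratio = pick_first_target("tenth floor", 0.55, 0.60, "fourth floor", 0.35, 0.70)
by_difference = pick_first_target("tenth floor", 0.55, 0.60,
                                  "fourth floor", 0.35, 0.70, mode="difference")
```

In both modes a larger value means the alternative candidate scored closer to the top result of its own call instruction, so in this example "tenth floor" is selected either way.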
It should be noted that, in other embodiments, the elevator control system may also determine the first target recognition result according to the recognition information with the highest acoustic matching score and the recognition information with the suboptimal acoustic matching score in the first voice recognition information set, together with the recognition information with the highest acoustic matching score and the recognition information with the suboptimal acoustic matching score in the second voice recognition information set. The recognition information with the suboptimal acoustic matching score may be the recognition information with the second-highest acoustic matching score. For the specific determination process, reference may be made to the description of the foregoing embodiments, which is not repeated here.
Optionally, the step 102 includes:
and when the second voice recognition result is the same as the first voice recognition result of the first call instruction and the voiceprint feature of the second call instruction is matched with the voiceprint feature of the first call instruction, determining a first target recognition result according to the first voice recognition information and/or the second voice recognition information.
In this embodiment, if the elevator control system recognizes that the voiceprint feature of the second call instruction matches the voiceprint feature of the first call instruction, it may determine that the same user has input two call instructions in succession within a short time. If the recognition results of the two call instructions are also the same, the elevator control system may conclude that both call instructions were recognized incorrectly, and may then perform the step of determining the first target recognition result according to the first voice recognition information and/or the second voice recognition information.
Requiring that the two call instructions come from the same person avoids wrongly judging a recognition result to be erroneous merely because different users have the same call demand. For example, user A speaks a first call instruction, "stop at the sixth floor", in the car; the elevator control system recognizes the floor information corresponding to the call instruction as "sixth floor" and lights the call button corresponding to the sixth floor in the car. User B then inputs a second call instruction, "I'm going to the sixth floor", and the floor information recognized by the elevator control system is also "sixth floor". If the elevator control system did not perform voiceprint matching, it would perform the operation of determining the first target recognition result according to the first voice recognition information and/or the second voice recognition information, and the response to the second call instruction would be wrong. With voiceprint matching, the system can determine that the two call instructions come from different users; for the second call instruction, it may control the elevator to respond according to the recognition result "sixth floor", for example by performing no further operation once it determines that the call button corresponding to the sixth floor is already lit.
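The gating condition described above can be sketched as follows. This is only an illustrative sketch: the application does not state how voiceprint features are compared, so the use of cosine similarity over embedding vectors, the threshold of 0.8, the function names, and the example vectors are all assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two voiceprint vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_reselect(first_result: str, second_result: str,
                    first_voiceprint: Sequence[float],
                    second_voiceprint: Sequence[float],
                    threshold: float = 0.8) -> bool:
    """Re-select a target only if the same speaker repeated the same recognized result."""
    same_result = first_result == second_result
    same_speaker = cosine_similarity(first_voiceprint, second_voiceprint) >= threshold
    return same_result and same_speaker


# Hypothetical voiceprint vectors: user A twice, then user B.
user_a_first = [0.90, 0.10, 0.30]
user_a_again = [0.88, 0.12, 0.31]
user_b = [0.10, 0.90, 0.20]

repeat_by_same_user = should_reselect("sixth floor", "sixth floor", user_a_first, user_a_again)
request_by_other_user = should_reselect("sixth floor", "sixth floor", user_a_first, user_b)
```

With these values the repeated instruction from user A triggers re-selection, while the identical request from user B does not, so user B's "sixth floor" is handled as an ordinary call.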
In this embodiment, the two steps of voiceprint recognition of the second call instruction and voice recognition of the second call instruction by the elevator control system may be performed synchronously or asynchronously, and the execution order of the two steps is not limited.
In this embodiment, in an optional implementation manner, after the step 103, the method further includes:
when a third call instruction is received within a second preset time period after the elevator is controlled to respond to the second call instruction, acquiring the voiceprint feature of the third call instruction, and acquiring a third voice recognition result corresponding to the third call instruction;
when the third voice recognition result is the same as the first target recognition result and the voiceprint feature of the third call instruction is matched with the voiceprint feature of the second call instruction, performing voice recognition on the third call instruction again based on the first target recognition result and the second target recognition result to obtain a third target recognition result;
Controlling an elevator to respond to the third call instruction according to the third target identification result;
The second target recognition result is the first voice recognition result or the second voice recognition result.
In this embodiment, if the user of the first call instruction and the user of the second call instruction are the same person, and the elevator has been controlled to respond to the second call instruction based on the first target recognition result, then when a third call instruction from the same user is received and its recognition result is again the first target recognition result, it may be determined that the first target recognition result is probably still wrong.
In this case, there are two optional embodiments:
In the first embodiment, the elevator control system may perform voice recognition on the third call instruction again based on the first target recognition result and the second target recognition result, so as to obtain a third target recognition result. That is, when voice recognition is performed on the third call instruction again, the recognition may be optimized using the knowledge that the first target recognition result and the second target recognition result are erroneous results, so as to improve the accuracy of the repeated recognition of the third call instruction.
In this embodiment, optionally, when voice recognition is performed on the third call instruction again, hotword suppression may be applied to the first target recognition result and the second target recognition result. Illustratively, hotwords may be set for the acoustic model and/or the language model, where the hotwords include the first target recognition result and the second target recognition result, and the weight coefficient of the hotwords is set to be smaller than a preset value or set to be negative, so as to suppress the influence of the first target recognition result and the second target recognition result on the voice recognition result.
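The effect of a negative hotword weight can be illustrated with a simplified stand-in. The application describes setting hotwords for the acoustic model and/or the language model; the sketch below instead applies the weight when rescoring a list of candidate hypotheses, which is an assumption made purely to keep the example self-contained. The function name, the weight of -1.0, and the scores are hypothetical.

```python
from typing import Dict, Iterable


def rescore_with_suppression(hypotheses: Dict[str, float],
                             suppressed: Iterable[str],
                             weight: float = -1.0) -> str:
    """Pick the best hypothesis after adding a negative weight to suppressed hotwords."""
    suppressed_set = set(suppressed)
    rescored = {
        # Hypotheses already known to be wrong receive the (negative) hotword weight.
        text: score + (weight if text in suppressed_set else 0.0)
        for text, score in hypotheses.items()
    }
    return max(rescored, key=rescored.get)


# Hypothetical candidate scores for the third call instruction.
third_hypotheses = {"sixth floor": 0.60, "tenth floor": 0.55, "fourth floor": 0.50}
third_target = rescore_with_suppression(third_hypotheses, ["sixth floor", "tenth floor"])
```

Here the two previously rejected results are pushed below "fourth floor", which then becomes the third target recognition result.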
In the second embodiment, the elevator control system may determine the third target recognition result according to the third voice recognition information and/or the fourth voice recognition information. The third voice recognition information is the voice recognition information in the first voice recognition information set other than the first target recognition result and the second target recognition result, and the fourth voice recognition information is the voice recognition information in the second voice recognition information set other than the first target recognition result and the second target recognition result.
For this embodiment, reference may be made to the description of the above embodiments; to avoid repetition, details are not repeated here.
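A minimal sketch of the second embodiment follows. The text only states that the third target recognition result is determined from the remaining candidates of the two sets; choosing the remaining candidate with the highest score across both sets is an assumption of this sketch, as are the function name and the example scores.

```python
from typing import Dict, Optional


def third_target_from_remaining(first_set: Dict[str, float],
                                second_set: Dict[str, float],
                                first_target: str,
                                second_target: str) -> Optional[str]:
    """Pick a third target among candidates not already tried as target results."""
    excluded = {first_target, second_target}
    remaining: Dict[str, float] = {}
    for candidates in (first_set, second_set):
        for text, score in candidates.items():
            if text not in excluded:
                # Keep the best score seen for each remaining candidate.
                remaining[text] = max(score, remaining.get(text, float("-inf")))
    if not remaining:
        return None  # Every candidate has already been rejected.
    return max(remaining, key=remaining.get)


# Hypothetical recognition information sets for the first and second call instructions.
first_set = {"sixth floor": 0.60, "tenth floor": 0.55, "fourth floor": 0.30}
second_set = {"sixth floor": 0.70, "tenth floor": 0.40, "fourth floor": 0.35}
third_choice = third_target_from_remaining(first_set, second_set,
                                           first_target="tenth floor",
                                           second_target="sixth floor")
```

With both "tenth floor" and "sixth floor" excluded, the only remaining candidate in this example is "fourth floor".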
It should be noted that the elevator control method provided by the embodiments of the present application may be executed by an elevator control device, or by a control module in the elevator control device for executing the elevator control method. In the embodiments of the present application, the elevator control device is described by taking the case in which an elevator control device executes the elevator control method as an example.
Referring to fig. 2, fig. 2 is a block diagram of an elevator control apparatus provided in an embodiment of the present application.
As shown in fig. 2, the elevator control apparatus 200 includes:
A first recognition module 201, configured to obtain a second speech recognition result of a second call instruction when two call instructions are received within a first preset duration, where the first call instruction is received before the second call instruction;
A first determining module 202, configured to determine a first target recognition result according to first voice recognition information and/or second voice recognition information when the second voice recognition result is the same as the first voice recognition result of the first call instruction;
a first control module 203, configured to control an elevator to respond to the second call instruction according to the first target recognition result;
The first voice recognition information is other voice recognition information except for the first voice recognition result in a first voice recognition information set, the first voice recognition information set comprises at least one piece of information obtained by voice recognition of the first call instruction, the second voice recognition information is other voice recognition information except for the second voice recognition result in a second voice recognition information set, and the second voice recognition information set comprises at least one piece of information obtained by voice recognition of the second call instruction.
Optionally, the first voice recognition information is the information with the second highest voice matching score in the first voice recognition information set, and/or the second voice recognition information is the information with the second highest voice matching score in the second voice recognition information set.
Optionally, the first determining module 202 includes:
and the first determining unit is used for determining the second voice recognition information as a first target recognition result.
Optionally, the first determining module 202 includes:
a second determining unit configured to determine a first target recognition result according to the first voice recognition information, the second voice recognition information, the first voice recognition result, and the second voice recognition result;
Wherein the third voice recognition information is the first voice recognition result, and the fourth voice recognition information is the second voice recognition result.
Optionally, the second determining unit includes:
A first calculating subunit, configured to calculate a ratio of the voice matching score of the first voice recognition information to the voice matching score of the first voice recognition result, to obtain a first ratio;
a second calculating subunit, configured to calculate a ratio of the voice matching score of the second voice recognition information to the voice matching score of the second voice recognition result, to obtain a second ratio;
And the first determining subunit is used for determining a first target identification result according to the first ratio and the second ratio.
Optionally, the first determining subunit is specifically configured to:
Determining the greater of the first ratio and the second ratio;
Determining the first voice recognition information as a first target recognition result in the case that the larger value is the first ratio;
and determining the second voice recognition information as a first target recognition result in the case that the larger value is the second ratio.
Optionally, the first determining module 202 is specifically configured to:
And when the second voice recognition result is the same as the first voice recognition result of the first call instruction and the voiceprint feature of the second call instruction is matched with the voiceprint feature of the first call instruction, determining a first target recognition result according to the first voice recognition information and/or the second voice recognition information.
Optionally, the elevator control device 200 further includes:
The second recognition module is used for acquiring the voiceprint feature of the third call instruction and acquiring a third voice recognition result corresponding to the third call instruction when the third call instruction is received within a second preset time period after the elevator is controlled to respond to the second call instruction;
The third recognition module is used for carrying out voice recognition on the third call instruction again based on the first target recognition result and the second target recognition result to obtain a third target recognition result when the third voice recognition result is the same as the first target recognition result and the voiceprint characteristics of the third call instruction are matched with the voiceprint characteristics of the second call instruction;
The second control module is used for controlling the elevator to respond to the third call instruction according to the third target identification result;
The second target recognition result is the first voice recognition result or the second voice recognition result.
The elevator control device 200 can implement the respective processes of the method embodiment corresponding to fig. 1, and achieve the same advantageous effects, and for avoiding repetition, a detailed description is omitted here.
The embodiment of the application also provides electronic equipment. Referring to fig. 3, the electronic device 300 may include a processor 301, a memory 302, and a computer program 3021 stored in the memory 302 and capable of running on the processor 301, where the computer program 3021, when executed by the processor 301, may implement any steps in the method embodiment corresponding to fig. 1 and achieve the same beneficial effects, and will not be described herein.
Those of ordinary skill in the art will appreciate that all or a portion of the steps of implementing the methods of the embodiments described above may be implemented by hardware associated with program instructions, where the program may be stored on a readable medium. The embodiment of the present application further provides a readable storage medium, where a computer program is stored, where the computer program when executed by a processor may implement any step in the method embodiment corresponding to fig. 1, and may achieve the same technical effect, so that repetition is avoided, and no further description is given here.
The readable storage medium may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
While the foregoing is directed to the preferred embodiments of the present application, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.
Claims (8)
1. An elevator control method, comprising:
When two call instructions are received within a first preset time period, a second voice recognition result of a second call instruction is obtained, and the first call instruction is received before the second call instruction;
When the second voice recognition result is the same as the first voice recognition result of the first call instruction, determining a first target recognition result according to first voice recognition information and/or second voice recognition information;
Controlling an elevator to respond to the second call instruction according to the first target identification result;
The first voice recognition information is other voice recognition information except the first voice recognition result in a first voice recognition information set, and the first voice recognition information set comprises at least one piece of information obtained by voice recognition of the first call instruction; the second voice recognition information is other voice recognition information except the second voice recognition result in a second voice recognition information set, and the second voice recognition information set comprises at least one piece of information obtained by voice recognition of the second call instruction;
wherein the user of the first call instruction is not the same person as the user of the second call instruction;
the determining the first target recognition result according to the first voice recognition information and/or the second voice recognition information comprises the following steps:
determining a first target recognition result according to the first voice recognition information, the second voice recognition information, the first voice recognition result and the second voice recognition result;
The determining a first target recognition result according to the first voice recognition information, the second voice recognition information, the first voice recognition result and the second voice recognition result includes:
calculating the ratio of the voice matching score of the first voice recognition information to the voice matching score of the first voice recognition result to obtain a first ratio;
Calculating the ratio of the voice matching score of the second voice recognition information to the voice matching score of the second voice recognition result to obtain a second ratio;
Determining a first target identification result according to the first ratio and the second ratio;
the determining a first target recognition result according to the first ratio and the second ratio includes:
Determining the greater of the first ratio and the second ratio;
Determining the first voice recognition information as a first target recognition result in the case that the larger value is the first ratio;
and determining the second voice recognition information as a first target recognition result in the case that the larger value is the second ratio.
2. The method of claim 1, wherein the first speech recognition information is the information with the second highest speech match score in the first set of speech recognition information and/or the second speech recognition information is the information with the second highest speech match score in the second set of speech recognition information.
3. The method according to claim 1, wherein determining the first target recognition result from the first speech recognition information and/or the second speech recognition information comprises:
And determining the second voice recognition information as a first target recognition result.
4. The method according to claim 1, wherein determining a first target recognition result from first speech recognition information and/or second speech recognition information when the second speech recognition result is the same as the first speech recognition result of the first call instruction comprises:
And when the second voice recognition result is the same as the first voice recognition result of the first call instruction and the voiceprint feature of the second call instruction is matched with the voiceprint feature of the first call instruction, determining a first target recognition result according to the first voice recognition information and/or the second voice recognition information.
5. The method of claim 4, wherein after said controlling an elevator to respond to said second call instruction based on said first target identification, said method further comprises:
when a third call instruction is received within a second preset time period after the control elevator responds to the second call instruction, acquiring the voiceprint feature of the third call instruction, and acquiring a third voice recognition result corresponding to the third call instruction;
when the third voice recognition result is the same as the first target recognition result and the voiceprint feature of the third call instruction is matched with the voiceprint feature of the second call instruction, performing voice recognition on the third call instruction again based on the first target recognition result and the second target recognition result to obtain a third target recognition result;
Controlling an elevator to respond to the third call instruction according to the third target identification result;
The second target recognition result is the first voice recognition result or the second voice recognition result.
6. An elevator control apparatus, comprising:
The first recognition module is used for acquiring a second voice recognition result of a second call instruction when two call instructions are received within a first preset time period, wherein the first call instruction is received before the second call instruction;
The first determining module is used for determining a first target recognition result according to the first voice recognition information and/or the second voice recognition information when the second voice recognition result is the same as the first voice recognition result of the first call instruction;
the first control module is used for controlling the elevator to respond to the second call instruction according to the first target identification result;
The first voice recognition information is other voice recognition information except the first voice recognition result in a first voice recognition information set, and the first voice recognition information set comprises at least one piece of information obtained by voice recognition of the first call instruction; the second voice recognition information is other voice recognition information except the second voice recognition result in a second voice recognition information set, and the second voice recognition information set comprises at least one piece of information obtained by voice recognition of the second call instruction;
wherein the user of the first call instruction is not the same person as the user of the second call instruction;
the first determining module includes:
a second determining unit configured to determine a first target recognition result according to the first voice recognition information, the second voice recognition information, the first voice recognition result, and the second voice recognition result;
The second determination unit includes:
A first calculating subunit, configured to calculate a ratio of the voice matching score of the first voice recognition information to the voice matching score of the first voice recognition result, to obtain a first ratio;
a second calculating subunit, configured to calculate a ratio of the matching score of the second speech recognition information to the speech matching score of the second speech recognition result, to obtain a second ratio;
A first determining subunit, configured to determine a first target recognition result according to the first ratio and the second ratio;
the first determining subunit is specifically configured to:
Determining the greater of the first ratio and the second ratio;
Determining the first voice recognition information as a first target recognition result in the case that the larger value is the first ratio;
and determining the second voice recognition information as a first target recognition result in the case that the larger value is the second ratio.
7. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method of any of claims 1 to 5.
8. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1 to 5.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111421777.1A CN114155864B (en) | 2021-11-26 | 2021-11-26 | Elevator control method, device, electronic device and readable storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114155864A (en) | 2022-03-08 |
| CN114155864B (en) | 2025-03-25 |
Family ID=80458356
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111421777.1A (Active), CN114155864B (en) | 2021-11-26 | 2021-11-26 | Elevator control method, device, electronic device and readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114155864B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2016062069A (en) * | 2014-09-22 | 2016-04-25 | 株式会社日立製作所 | Speech recognition method and speech recognition apparatus |
| CN105810188A (en) * | 2014-12-30 | 2016-07-27 | 联想(北京)有限公司 | Information processing method and electronic equipment |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107704102B (en) * | 2017-10-09 | 2021-08-03 | 北京新美互通科技有限公司 | Text input method and device |
| CN110689881B (en) * | 2018-06-20 | 2022-07-12 | 深圳市北科瑞声科技股份有限公司 | Speech recognition method, speech recognition device, computer equipment and storage medium |
| KR102809420B1 (en) * | 2018-11-07 | 2025-05-21 | 삼성전자주식회사 | Electronic apparatus and method for voice recognition |
| KR102153220B1 (en) * | 2019-05-20 | 2020-09-07 | 주식회사 모두의연구소 | Method for outputting speech recognition results based on determination of sameness and appratus using the same |
| CN110556127B (en) * | 2019-09-24 | 2021-01-01 | 北京声智科技有限公司 | Method, device, equipment and medium for detecting voice recognition result |
| CN111816174B (en) * | 2020-06-24 | 2024-09-03 | 北京小米松果电子有限公司 | Speech recognition method, device and computer readable storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |