CN107819946A - Method, device and mobile terminal for speech recognition - Google Patents
Method, device and mobile terminal for speech recognition
- Publication number
- CN107819946A (application CN201711038376.1A)
- Authority
- CN
- China
- Prior art keywords
- recognition
- data
- voice data
- recognition result
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
- H04M1/72454—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2250/00—Details of telephonic subscriber devices
- H04M2250/74—Details of telephonic subscriber devices with voice recognition means
Landscapes
- Engineering & Computer Science (AREA)
- Environmental & Geological Engineering (AREA)
- Human Computer Interaction (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Collating Specific Patterns (AREA)
- Telephone Function (AREA)
Abstract
An embodiment of the present invention discloses a speech recognition method, a device, and a mobile terminal. The method includes: acquiring input fingerprint data; if the fingerprint data is predetermined fingerprint data, acquiring input voice data and performing speech recognition on the voice data to obtain a recognition result; and, when an input operation on an information input box is detected, inputting the recognition result into the information input box. With this embodiment, a fingerprint triggers the speech recognition function, which recognizes the voice data and stores the resulting recognition result for later use. The user can therefore start speech recognition quickly and accurately, which greatly shortens the calling path of the speech recognition function and simplifies the speech recognition process.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a speech recognition method, device, and mobile terminal.
Background
With the continuous development of terminal technology, communicating through terminals has become an important way for people to stay in touch. The most common form is communication through text and characters, in which the user invokes the text and character input method installed on the terminal device, edits the text or characters, and sends them to a designated user. However, entering text or characters through an input method is usually slow, which easily hampers communication between users.
Voice input, being fast and efficient, has become another way for people to communicate. However, not all applications support sending voice data between users. To improve the efficiency of text or character input, the voice data input by the user can therefore be converted into text and characters through speech recognition, and the converted text and characters can then be sent to other users. Moreover, as speech recognition algorithms have advanced, the speed and accuracy of speech recognition have improved further, and more and more text and character input methods have added a speech recognition function.
Nevertheless, starting the speech recognition function still takes several steps. For example, to send a message the user has to tap the information input box to bring up the text and character input method, find the speech recognition function within that input method, and select voice as the input and text and characters as the output, so that the voice data is recognized as text or characters and entered into the information input box. The user thus has to go through multiple steps to enable speech recognition; the calling path of the speech recognition function is long, which makes the speech recognition process cumbersome.
Summary of the invention
An embodiment of the present invention provides a speech recognition method to solve the problem that the speech recognition process in the prior art is cumbersome.
To solve the above technical problem, the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a speech recognition method, the method comprising:
acquiring input fingerprint data;
if the fingerprint data is predetermined fingerprint data, acquiring input voice data and performing speech recognition on the voice data to obtain a recognition result;
when an input operation on an information input box is detected, inputting the recognition result into the information input box.
Optionally, after the recognition result is obtained, the method further comprises:
storing the recognition result;
recording the storage time point of the recognition result;
the inputting of the recognition result into the information input box when an input operation on the information input box is detected comprises:
when an input operation on the information input box is detected and the time interval between the current time point and the storage time point is less than a predetermined first time threshold, inputting the recognition result into the information input box.
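The first-time-threshold condition can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the patent: the names `StoredResult` and `text_for_input_box`, the use of seconds as the time unit, and the explicit `now` argument are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredResult:
    """A recognition result together with its storage time point (hypothetical type)."""
    text: str
    stored_at: float  # storage time point, in seconds


def text_for_input_box(result: StoredResult, now: float,
                       first_threshold_s: float) -> Optional[str]:
    """Return the text to insert, or None when the stored result is too old.

    The result is used only if the interval between the current time point
    and the storage time point is strictly less than the first time threshold.
    """
    if now - result.stored_at < first_threshold_s:
        return result.text
    return None
```

Passing `now` explicitly keeps the check deterministic; a caller on a real device would presumably supply a monotonic clock reading.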
Optionally, the recognition result includes recognition results of a plurality of pieces of voice data,
and the inputting of the recognition result into the information input box comprises:
displaying the recognition results of the plurality of pieces of voice data;
when an operation instruction indicating that selection has ended is received, acquiring the recognition result of at least one piece of voice data selected from the recognition results of the plurality of pieces of voice data;
inputting the acquired recognition result of the at least one piece of voice data into the information input box.
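The selection step can be illustrated with a small Python sketch. The function name `join_selected`, the use of list indices to represent the user's selection, and the choice to concatenate the selected results in display order are assumptions made for the example; the source does not say how several selected results are combined.

```python
from typing import Iterable, List


def join_selected(results: List[str], selected: Iterable[int],
                  separator: str = "") -> str:
    """Combine the selected recognition results into one string for the input box.

    `selected` holds the indices chosen by the user before the "selection
    ended" instruction; at least one result must be selected.
    """
    indices = sorted(set(selected))  # display order, duplicates ignored (assumption)
    if not indices:
        raise ValueError("at least one recognition result must be selected")
    return separator.join(results[i] for i in indices)
```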
Optionally, the inputting of the recognition result into the information input box comprises:
displaying the recognition results of the plurality of pieces of voice data;
when an editing instruction is received, setting the recognition results of the plurality of pieces of voice data to a to-be-edited state;
when an operation instruction indicating that editing has ended is received, acquiring the edited recognition result;
inputting the edited recognition result into the information input box.
Optionally, after the recognition results of the plurality of pieces of voice data are displayed, the method further comprises:
if the time interval between the current time point and the most recent storage time point among the recognition results of the plurality of pieces of voice data is less than a predetermined second time threshold, inputting the recognition result with the most recent storage time point into the information input box;
the setting of the recognition results of the plurality of pieces of voice data to a to-be-edited state when an editing instruction is received comprises:
if the time interval between the current time point and the most recent storage time point among the recognition results of the plurality of pieces of voice data is not less than the predetermined second time threshold, setting the recognition results of the plurality of pieces of voice data to a to-be-edited state when an editing instruction is received.
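The branch on the second time threshold can be sketched as follows. This is a hedged illustration: the function name `choose_action`, the action labels `"insert_latest"` and `"await_edit"`, and the representation of each result as a `(text, stored_at)` pair in seconds are hypothetical.

```python
from typing import List, Optional, Tuple


def choose_action(results: List[Tuple[str, float]], now: float,
                  second_threshold_s: float) -> Tuple[str, Optional[str]]:
    """Decide what to do after the stored recognition results are displayed.

    If the most recently stored result is newer than the second time
    threshold, it is inserted directly; otherwise the results wait for an
    editing instruction from the user.
    """
    if not results:
        raise ValueError("no recognition results to choose from")
    latest_text, latest_time = max(results, key=lambda item: item[1])
    if now - latest_time < second_threshold_s:
        return "insert_latest", latest_text
    return "await_edit", None
```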
Optionally, the acquiring of input fingerprint data comprises:
acquiring the input fingerprint data when it is detected that the user has performed fingerprint recognition continuously for a duration that reaches a preset duration threshold.
Optionally, the performing of speech recognition on the voice data to obtain a recognition result comprises:
during speech recognition of the voice data, when a predetermined end-recognition operation is detected, stopping the acquisition of voice data and obtaining the recognition result; or,
during speech recognition of the voice data, when the amount of recognition-result data output per unit time within a predetermined duration is less than a preset value, stopping the acquisition of voice data and obtaining the recognition result.
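The second stop condition can be sketched in Python under one possible reading: recognition stops once every unit of time in the most recent window has produced less output than the preset value. That reading, the function name `should_stop`, and the choice of characters per second as the measure of output are assumptions; the source does not fix the unit or the exact comparison.

```python
from typing import Sequence


def should_stop(chars_per_second: Sequence[int], window_s: int,
                min_chars_per_s: int) -> bool:
    """Return True when recognition output has stayed low for a whole window.

    `chars_per_second` holds the number of recognized characters produced in
    each elapsed second, most recent last. Recognition stops only after a full
    window of seconds has elapsed and every second in it is below the minimum.
    """
    if window_s <= 0 or len(chars_per_second) < window_s:
        return False
    recent = chars_per_second[-window_s:]
    return all(count < min_chars_per_s for count in recent)
```

With a window of 3 seconds and a minimum of 1 character per second, this amounts to stopping after three consecutive seconds in which nothing was recognized.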
In a second aspect, an embodiment of the present invention provides a speech recognition device, the device comprising:
a data acquisition module, configured to acquire input fingerprint data;
a recognition result determination module, configured to, if the fingerprint data is predetermined fingerprint data, acquire input voice data and perform speech recognition on the voice data to obtain a recognition result;
a first input module, configured to input the recognition result into an information input box when an input operation on the information input box is detected.
Optionally, the device further comprises:
a storage module, configured to store the recognition result;
a recording module, configured to record the storage time point of the recognition result;
the first input module is configured to input the recognition result into the information input box when an input operation on the information input box is detected and the time interval between the current time point and the storage time point is less than a predetermined first time threshold.
Optionally, the recognition result includes recognition results of a plurality of pieces of voice data,
and the first input module comprises:
a display unit, configured to display the recognition results of the plurality of pieces of voice data;
a selection unit, configured to, when an operation instruction indicating that selection has ended is received, acquire the recognition result of at least one piece of voice data selected from the recognition results of the plurality of pieces of voice data;
an input unit, configured to input the acquired recognition result of the at least one piece of voice data into the information input box.
Optionally, the first input module comprises:
the display unit, configured to display the recognition results of the plurality of pieces of voice data;
a state setting unit, configured to set the recognition results of the plurality of pieces of voice data to a to-be-edited state when an editing instruction is received;
an editing unit, configured to acquire the edited recognition result when an operation instruction indicating that editing has ended is received;
the input unit, configured to input the edited recognition result into the information input box.
Optionally, the device further comprises:
a second input module, configured to input the recognition result with the most recent storage time point into the information input box if the time interval between the current time point and the most recent storage time point among the recognition results of the plurality of pieces of voice data is less than a predetermined second time threshold;
the state setting unit is configured to set the recognition results of the plurality of pieces of voice data to a to-be-edited state when an editing instruction is received, if the time interval between the current time point and the most recent storage time point among the recognition results of the plurality of pieces of voice data is not less than the predetermined second time threshold.
Optionally, the data acquisition module is configured to acquire the input fingerprint data when it is detected that the user has performed fingerprint recognition continuously for a duration that reaches a preset duration threshold.
Optionally, the recognition result determination module is configured to, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when a predetermined end-recognition operation is detected; or, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when the amount of recognition-result data output per unit time within a predetermined duration is less than a preset value.
In a third aspect, an embodiment of the present invention provides a mobile terminal, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the speech recognition method provided by the above embodiments.
As can be seen from the technical solutions provided by the above embodiments, an embodiment of the present invention acquires input fingerprint data and determines whether it is predetermined fingerprint data. If so, it acquires input voice data and performs speech recognition on it to obtain a recognition result, and then inputs the recognition result into the information input box when an input operation on that box is detected. In this way a fingerprint triggers the speech recognition function, which recognizes the voice data and stores the resulting recognition result for later use. The user can therefore start speech recognition quickly and accurately, which greatly shortens the calling path of the speech recognition function and simplifies the speech recognition process.
Description of drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments recorded in the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 shows an embodiment of a speech recognition method of the present invention;
Fig. 2 is a schematic diagram of a speech recognition display interface of the present invention;
Fig. 3 shows another embodiment of a speech recognition method of the present invention;
Fig. 4 is a schematic diagram of a display interface for selecting a recognition result according to the present invention;
Fig. 5 shows yet another embodiment of a speech recognition method of the present invention;
Fig. 6 shows yet another embodiment of a speech recognition method of the present invention;
Fig. 7 shows an embodiment of a speech recognition device of the present invention;
Fig. 8 shows an embodiment of a mobile terminal of the present invention.
Detailed description
Embodiments of the present invention provide a speech recognition method, device, and mobile terminal.
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiment 1
As shown in Fig. 1, an embodiment of the present invention provides a speech recognition method. The method may be executed by a terminal device used by a user, which may be a device such as a personal computer or a mobile terminal device such as a mobile phone or tablet computer. The method makes it convenient for the user to invoke the speech recognition function or a speech recognition application quickly. The method may specifically include the following steps:
In step S102, input fingerprint data is acquired.
The fingerprint data may be data obtained by collecting the user's fingerprint with a fingerprint recognition component and then analyzing and processing it. It may be the fingerprint data of any one of the user's fingers, or the fingerprint data of several of the user's fingers.
In implementation, with the continuous development of terminal technology, communicating through terminals has become an important way for people to stay in touch. The most common form is communication through text and characters: the user invokes the text and character input method installed on the terminal device, edits text or characters with the character keys the input method provides, and enters the edited text or characters into a text box. Once input is complete, the user can tap the send button for the text box, and the terminal device sends the text and/or characters in the text box to the designated user. However, entering text or characters through an input method is usually slow, which easily hampers communication between users. Voice input, being fast and efficient, has become another way for people to communicate, but not all applications support sending voice data between users. To improve the efficiency of text or character input, the embodiment of the present invention therefore provides an implementation in which the voice data input by the user is converted into text and characters through speech recognition, and the converted text and characters are then sent to other users. As speech recognition algorithms have advanced, the speed and accuracy of speech recognition have improved further, and more and more text and character input methods have added a speech recognition function. Even so, starting the speech recognition function still takes several steps. For example, to send a message the user taps the information input box, the terminal device brings up the text and character input method, and the user finds the speech recognition function within that input method and enables it. The terminal device then turns on the microphone to receive voice data, recognizes the voice data as text or characters, and enters them into the information input box. The user thus has to go through multiple steps whenever speech recognition is needed, which makes enabling it cumbersome (that is, the calling path of the speech recognition function is long). For this reason, the embodiment of the present invention provides a technical solution that enables speech recognition quickly, which may specifically include the following:
Because fingerprint recognition is secure, fast, and efficient, a fingerprint can be used to start the speech recognition function. Specifically, the terminal device may provide a fingerprint setting option. When the user wants to start speech recognition by fingerprint, the user taps the fingerprint setting option and the terminal device obtains and displays a fingerprint setting page. On this page the user can select or set the start policy for the speech recognition function, for example starting it with a single piece of fingerprint data or with a combination of several pieces of fingerprint data. The user then places one or more fingers on the fingerprint recognition component of the terminal device, which acquires one or more pieces of the user's fingerprint data to serve as the fingerprint data that starts the speech recognition function. When setup is complete, the user taps the Finish button on the fingerprint setting page, and the terminal device obtains the fingerprint data entered on that page and stores it.
When the user needs to start the speech recognition function (for example, to record the content of a voice call with a friend on a mobile phone, to record what a teacher says in a lecture or class, or to send a message to a friend when typing through the input method is inconvenient), the user places the corresponding finger on the fingerprint recognition component according to the preset start policy. The fingerprint recognition component collects the fingerprint data of that finger, and the input fingerprint data is thereby acquired.
In step S104, if the above fingerprint data is predetermined fingerprint data, input voice data is acquired and speech recognition is performed on it to obtain a recognition result.
The predetermined fingerprint data may be the fingerprint data used to start the speech recognition function. It may be the fingerprint data of one finger or of several fingers; for how it is set, see the relevant content of step S102 above. The voice data may be voice data of any content input by the user. The recognition result may be the text and/or other characters corresponding to the content of the voice data; for example, the recognition result may consist of text and English letters, or of text and digits.
In implementation, after the terminal device obtains the input fingerprint data through step S102, it can compare that fingerprint data with each piece of fingerprint data pre-stored in the terminal device. If the input fingerprint data matches none of the pre-stored fingerprint data, the terminal device can output a prompt informing the user that the fingerprint data just entered is incorrect, and the user can enter fingerprint data again. If the input fingerprint data matches the predetermined fingerprint data that is used, among the pre-stored fingerprint data, to start the speech recognition function, the speech recognition function can be started. The terminal device can then turn on the microphone to collect voice data and perform real-time speech recognition on the collected voice data using the speech recognition algorithm preset in the speech recognition function. That is, the collected voice data is converted into text and/or other characters, and the converted text and/or other characters form the recognition result of the voice data.
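The fingerprint comparison in step S104 can be sketched as follows. This is an assumption-laden illustration rather than the patent's implementation: the function `classify_fingerprint`, the outcome labels, and the `matches` callback are hypothetical, and on a real device the comparison would be performed by the fingerprint sensor stack rather than by simple equality. The source also does not say what happens when the fingerprint matches a stored fingerprint that is not assigned to speech recognition, so the sketch simply reports that case as `"no_action"`.

```python
from typing import Callable, Dict, Set


def classify_fingerprint(sample: str,
                         enrolled: Dict[str, str],
                         trigger_ids: Set[str],
                         matches: Callable[[str, str], bool]) -> str:
    """Classify an input fingerprint against the pre-stored fingerprint data.

    `enrolled` maps a fingerprint id to its stored template; `trigger_ids`
    names the templates that start speech recognition.
    """
    matched = [fid for fid, template in enrolled.items()
               if matches(sample, template)]
    if not matched:
        return "prompt_retry"        # no stored fingerprint matched: prompt the user
    if any(fid in trigger_ids for fid in matched):
        return "start_recognition"   # predetermined fingerprint: open the microphone
    return "no_action"               # behaviour not specified by the source
```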
In addition, while recognizing the received voice data in real time, the terminal device can also display the recognition result in real time for the user to preview. As shown in Fig. 2, during a voice call with user A the user has started the terminal device's speech recognition function by fingerprint, and the recognition result of the voice data is displayed at a predetermined position on the display interface of the terminal device.
It should be noted that, after obtaining the recognition result, the terminal device can store it so that it can be used later. The storage location can be preset by the user: the terminal device provides a setting page for the storage location of recognition results, which the user can open to enter the storage area or storage location. After the terminal device obtains the recognition result of the voice data through step S104, it can store the recognition result in that storage area or at that storage location.
In addition, to speed up data exchange, the recognition result can be stored in memory or in a cache, which makes later retrieval of the recognition result faster. The recognition result can also be stored in the clipboard, which likewise makes later retrieval faster.
In step S106, when an input operation on an information input box is detected, the recognition result is entered into that information input box.
In implementation, when the user needs to send information to another user, for example to a friend in an instant messaging application, the user can open an information editing page or chat interface on the terminal device. That page or interface can include an information input box; when the user taps the box, the terminal device detects an input operation on it. The terminal device can then obtain the recognition result produced in step S104 and enter it into the information input box. The user can further edit the recognition result in the box and, when finished, tap the confirm or send button on the information editing page or chat interface, whereupon the terminal device stores or sends the recognition result in the information input box.
This embodiment of the present invention provides a speech recognition method. Input fingerprint data is acquired and it is determined whether that data is the predetermined fingerprint data; if it is, input voice data is acquired and speech recognition is performed on it to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is entered into the box. The speech recognition function is thus triggered by a fingerprint, and the voice data is recognized and the corresponding result stored for later use. The user can start the speech recognition function quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
Embodiment Two
As shown in FIG. 3, this embodiment of the present invention provides a speech recognition method. The method can be executed by a terminal device, which can be a device such as a personal computer or a mobile terminal device such as a mobile phone or tablet computer, and which can be the terminal device used by the user. The method can make it convenient for the user to quickly invoke a speech recognition function or speech recognition application. The method can specifically include the following steps:
In step S302, when it is detected that the user has performed fingerprint recognition continuously for a duration that reaches a preset duration threshold, the input fingerprint data is acquired.
The preset duration threshold can be set according to the actual situation, for example 2 seconds or 3 seconds.
In implementation, a terminal device may have several settings that use a fingerprint for verification or for launching an application, for example unlocking the phone screen or verifying a payment by fingerprint. To distinguish among these fingerprint settings, a fingerprint can be combined with a specific operation so as to reduce interference with the operation of other applications. In this embodiment, the fingerprint is combined with the duration of continuous fingerprint recognition to trigger the speech recognition function. Specifically, the user can enter a fingerprint on the fingerprint settings page and set the preset duration threshold for continuous fingerprint recognition; for example, the user can enter the index finger's fingerprint data through the fingerprint recognition component and set the preset duration threshold to 2 seconds. When the user needs to start the speech recognition function, the user places the index finger on the fingerprint recognition component, and the terminal device uses the component to measure how long the user performs fingerprint recognition continuously. If the user keeps the index finger on the component for 2 seconds, the terminal device can determine that the user wants to start the speech recognition function; the fingerprint recognition component then collects the finger's fingerprint data, and the input fingerprint data is thereby acquired.
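The long-press condition of step S302 can be sketched as follows. This is a hedged illustration, not the patent's implementation: the function name `should_capture_fingerprint` and the representation of time as seconds in floating point are assumptions, and a real device would obtain touch events from its fingerprint sensor driver.

```python
from typing import Optional


def should_capture_fingerprint(touch_down_s: Optional[float],
                               now_s: float,
                               threshold_s: float = 2.0) -> bool:
    """Return True once the finger has rested on the sensor long enough.

    touch_down_s is the time the finger was placed on the sensor, or None
    if no finger is currently present (hypothetical representation).
    """
    if touch_down_s is None:
        return False
    # Continuous contact time must reach the preset duration threshold.
    return (now_s - touch_down_s) >= threshold_s
```

A brief touch, such as the one used to unlock the screen, stays below the threshold and therefore does not start the speech recognition function.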
It should be noted that combining a fingerprint with a specific operation not only reduces interference with the operation of other applications but also helps avoid accidental operation by the user.
In step S304, if the fingerprint data is the predetermined fingerprint data, input voice data is acquired. During speech recognition of the voice data, when the amount of recognition result data output per unit time within a predetermined duration is less than a preset value, acquisition of voice data is stopped and the recognition result is obtained.
The preset value can be set according to the actual situation, for example 1 word or character, or 2 words or characters. The amount of recognition result data output per unit time can be the number of words or characters recognized per second, or the number recognized every 3 seconds, and so on.
In implementation, for the process of acquiring the input voice data and performing speech recognition on it when the fingerprint data is the predetermined fingerprint data, see the description of step S104 in Embodiment One; it is not repeated here. To improve recognition efficiency, a mechanism for detecting the end of speech input can be provided: a threshold (the preset value) is set for the amount of recognition result data output per unit time within the predetermined duration, for example 1 word or character of valid recognition result per second over 3 consecutive seconds. During speech recognition the terminal device can monitor the output rate of recognition results in real time. When it detects that, over 3 consecutive seconds, fewer than 1 word or character of valid recognition result is output per second, it can determine that the user has stopped speaking; it can then stop speech recognition and turn off the speech recognition function, yielding the final recognition result.
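The automatic stop condition can be illustrated with a short sketch that scans per-second character counts. It assumes the recognizer reports how many valid characters it produced in each one-second interval; the function name `find_stop_index` and the default values (a 3-second window and a minimum of 1 character per second, taken from the example in the text) are illustrative choices.

```python
from typing import Iterable, Optional


def find_stop_index(counts_per_second: Iterable[int],
                    window: int = 3,
                    min_chars: int = 1) -> Optional[int]:
    """Return the index of the second at which recognition should stop.

    Recognition stops once `window` consecutive seconds have each produced
    fewer than `min_chars` recognized characters. Returns None if that
    never happens in the given sequence.
    """
    quiet_seconds = 0
    for index, count in enumerate(counts_per_second):
        # Extend the quiet run, or reset it when enough characters arrive.
        quiet_seconds = quiet_seconds + 1 if count < min_chars else 0
        if quiet_seconds >= window:
            return index
    return None
```

For the sequence `[3, 2, 0, 0, 1, 0, 0, 0, 4]` the two silent seconds at indices 2 and 3 are interrupted by output at index 4, so the stop is only signaled at index 7, after three uninterrupted silent seconds.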
Besides stopping automatically, speech recognition can be stopped manually by the user. Specifically, during speech recognition of the voice data, when a predetermined end-recognition operation is detected, acquisition of voice data is stopped and the recognition result is obtained.
The end-recognition operation can be implemented in several ways: by entering preset fingerprint data, by combining a fingerprint with a specific operation (such as the duration of continuous fingerprint recognition described above), or by tapping a designated button, and so on. This embodiment of the present invention places no limitation on this.
In implementation, taking the combination of a fingerprint with a specific operation as an example: if the user needs to stop speech recognition while the voice data is being recognized, the user can place the finger corresponding to the preset fingerprint data on the fingerprint recognition component and hold it there for a certain duration. The terminal device can then determine that the user wants to stop speech recognition and turn off the speech recognition function, yielding the final recognition result.
It should be noted that if the speech recognition function is started through steps S302 and S304 because of an accidental operation by the user, the terminal device can display the recognition result on the screen for the user to preview. The user thereby learns that the speech recognition function is on and is prompted to turn it off promptly.
In step S306, the recognition result is stored.
For the specific processing of step S306, see the relevant description of step S104 above; it is not repeated here.
There may be multiple pieces of input voice data, in which case the recognition result includes the recognition results of multiple pieces of voice data, and the user may want to use one or more of them. This embodiment of the present invention therefore also provides corresponding processing; see steps S308 to S316 below.
In step S308, the storage time point of the recognition result is recorded.
In step S310, when an input operation on the information input box is detected and the time interval between the current time point and the storage time point is less than a predetermined first time threshold, the recognition results of the multiple pieces of voice data are displayed.
The first time threshold can be set according to the actual situation, for example 20 seconds or 30 seconds.
In implementation, to simplify data storage, the recognition result in step S306 can be stored in the clipboard. When the user needs to send information to another user, for example to a friend in an instant messaging application, the user can open an information editing page or chat interface on the terminal device. That page or interface can include an information input box; when the user taps the box, the terminal device detects an input operation on it. To speed up sending, the terminal device can first decide whether the recognition result should be used as the input. The decision can be based on the time interval between the current time point and the storage time point, for which a first time threshold is set. After detecting the input operation on the information input box, the terminal device obtains the current time point and compares it with the storage time point. If the interval between them is not less than the predetermined first time threshold, the terminal device invokes the input method, and the user can enter text, digits, letters or other characters through it. If the interval is less than the predetermined first time threshold, the terminal device can determine that the user wants to use the recognition result as the input; as shown in FIG. 4, it can then open the clipboard and display the recognition results of the multiple pieces of voice data (for example in a drop-down list or on a newly opened page) for the user to view and select.
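The decision made when the information input box receives focus can be sketched as a single comparison. The function name `on_input_focus` and the returned labels are hypothetical; the 20-second default is the example value given for the first time threshold.

```python
def on_input_focus(now_s: float,
                   stored_at_s: float,
                   first_threshold_s: float = 20.0) -> str:
    """Choose what to show when the user taps the information input box."""
    interval = now_s - stored_at_s
    if interval < first_threshold_s:
        # The result was stored recently: offer the stored recognition results.
        return "show_recognition_results"
    # Otherwise fall back to the ordinary input method (keyboard).
    return "open_input_method"
```

Note that an interval exactly equal to the threshold counts as "not less than" it, so the input method is opened in that case.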
It should be noted that the recognition result may include several different recognition results whose storage time points differ. The time interval between the current time point and the storage time point can therefore be computed as the interval between the current time point and the most recent storage time point, or by first averaging the storage time points and then taking the interval between the current time point and that average, and so on.
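The two ways of computing the interval can be written out as follows. This is a minimal sketch: the function name `interval_since_storage` and the `mode` parameter are assumptions introduced for illustration, and time points are plain seconds.

```python
from typing import Sequence


def interval_since_storage(now_s: float,
                           stored_at_s: Sequence[float],
                           mode: str = "latest") -> float:
    """Interval between now and the storage time of several results.

    mode="latest" uses the most recent storage time point;
    mode="mean" uses the average of all storage time points.
    """
    if not stored_at_s:
        raise ValueError("at least one storage time point is required")
    if mode == "latest":
        reference = max(stored_at_s)
    elif mode == "mean":
        reference = sum(stored_at_s) / len(stored_at_s)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return now_s - reference
```

With results stored at 70 s and 90 s and the current time at 100 s, the "latest" variant gives 10 s and the "mean" variant gives 20 s, so the choice can change which side of the first time threshold the interval falls on.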
In step S312, when an operation instruction indicating that selection has ended is received, the recognition result of at least one piece of voice data selected from the recognition results of the multiple pieces of voice data is acquired.
The end-of-selection operation instruction can be triggered in several ways, for example by tapping a designated button or by any other preset operation. This embodiment of the present invention places no limitation on this.
In implementation, as shown in FIG. 4, after the terminal device displays the recognition results of the multiple pieces of voice data, the user can look through them for the results to use and select them (in FIG. 4 the user has selected recognition result 1). After selecting, the user can tap the confirm button on the page that displays the recognition results. The terminal device then generates the end-of-selection operation instruction and acquires, from that page, the recognition result of the at least one piece of voice data selected by the user.
In step S314, the acquired recognition result of the at least one piece of voice data is entered into the information input box.
It should be noted that steps S310 to S314 enter the recognition result into the information input box by way of user selection. In practice, to simplify the flow, the recognition result can instead be entered as follows: when an input operation on the information input box is detected and the time interval between the current time point and the storage time point is less than the predetermined first time threshold, the recognition result is entered into the information input box.
In addition, because the recognition result may include several different recognition results whose total amount of data may be large, the recognition result with the most recent storage time point can be the one entered into the information input box.
In step S316, the recognition result in the information input box is stored or sent.
This embodiment of the present invention provides a speech recognition method. Input fingerprint data is acquired and it is determined whether that data is the predetermined fingerprint data; if it is, input voice data is acquired and speech recognition is performed on it to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is entered into the box. The speech recognition function is thus triggered by a fingerprint, and the voice data is recognized and the corresponding result stored for later use. The user can start the speech recognition function quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
In addition, this embodiment of the present invention optimizes the triggering speed and parallelism of the speech recognition function and the intelligence with which it is turned off, enables recognition results to be invoked quickly, makes it easier to further edit and reuse multiple recognition results, and effectively broadens the ways of entering text or characters as well as the usage scenarios and efficiency of speech recognition.
Embodiment Three
As shown in FIG. 5, this embodiment of the present invention provides a speech recognition method. The method can be executed by a terminal device, which can be a device such as a personal computer or a mobile terminal device such as a mobile phone or tablet computer, and which can be the terminal device used by the user. The method can make it convenient for the user to quickly invoke a speech recognition function or speech recognition application. The method can specifically include the following steps:
In step S502, when it is detected that the user has performed fingerprint recognition continuously for a duration that reaches a preset duration threshold, the input fingerprint data is acquired.
In step S504, if the fingerprint data is the predetermined fingerprint data, input voice data is acquired. During speech recognition of the voice data, when the amount of recognition result data output per unit time within a predetermined duration is less than a preset value, acquisition of voice data is stopped and the recognition result is obtained.
Besides stopping automatically, speech recognition can be stopped manually by the user. Specifically, during speech recognition of the voice data, when a predetermined end-recognition operation is detected, acquisition of voice data is stopped and the recognition result is obtained.
In step S506, the recognition result is stored.
Steps S502 to S506 are the same as steps S302 to S306 in Embodiment Two, respectively; for their specific processing, see the relevant descriptions of steps S302 to S306, which are not repeated here.
There may be multiple pieces of input voice data, in which case the recognition result includes the recognition results of multiple pieces of voice data, and the user may want to use one or more of them. The user may also need to edit the selected recognition results, for example by deleting, reordering or modifying them. This embodiment of the present invention therefore also provides corresponding processing; see steps S508 to S518 below.
In step S508, the storage time point of the recognition result is recorded.
In step S510, when an input operation on the information input box is detected and the time interval between the current time point and the storage time point is less than a predetermined first time threshold, the recognition results of the multiple pieces of voice data are displayed.
Steps S508 and S510 are the same as steps S308 and S310 in Embodiment Two, respectively; for their specific processing, see the relevant descriptions of steps S308 and S310, which are not repeated here.
In step S512, when an editing instruction is received, the recognition results of the multiple pieces of voice data are set to a to-be-edited state.
In implementation, the page that displays the recognition results of the multiple pieces of voice data can include several operation buttons. For example, if an edit button is added to FIG. 4, the page can include an edit button, a cancel button and a confirm button, where the edit button lets the user edit the recognition results. If the user finds that the displayed recognition results do not fully match the information the user wants to write, the user can tap the edit button. The terminal device then places the recognition results of the multiple pieces of voice data on an editing page, and those results are accordingly set to the to-be-edited state.
In step S514, when an operation instruction to end editing is received, the edited recognition result is acquired.
The end-of-editing operation instruction can be triggered in several ways, for example by tapping a designated button or by any other preset operation. This embodiment of the present invention places no limitation on this.
In implementation, the user can perform any one or more of deleting, modifying and reordering on the recognition results of the multiple pieces of voice data on the editing page, and can thereby edit them into the information the user wants to write. When editing is complete, the user can tap the confirm button on the editing page. The terminal device then generates and executes the end-of-editing operation instruction to acquire the recognition result as edited by the user on the editing page (that is, the edited recognition result).
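The three editing operations can be sketched on a simple list of recognized strings. The representation of an edit as a tuple and the name `apply_edits` are assumptions made for illustration; the patent does not specify how edits are encoded.

```python
from typing import List, Sequence, Tuple


def apply_edits(results: Sequence[str], edits: Sequence[Tuple]) -> List[str]:
    """Apply delete, modify and reorder edits to a list of results.

    Each edit is one of (hypothetical encoding):
      ("delete", index)
      ("modify", index, new_text)
      ("reorder", [new order given as indices into the current list])
    """
    edited = list(results)  # work on a copy so the stored results are untouched
    for edit in edits:
        kind = edit[0]
        if kind == "delete":
            del edited[edit[1]]
        elif kind == "modify":
            edited[edit[1]] = edit[2]
        elif kind == "reorder":
            edited = [edited[i] for i in edit[1]]
        else:
            raise ValueError(f"unknown edit: {kind!r}")
    return edited
```

Indices always refer to the list as it stands after the preceding edits, which mirrors a user acting on the editing page one step at a time.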
In step S516, the edited recognition result is entered into the information input box.
In step S518, the recognition result in the information input box is stored or sent.
This embodiment of the present invention provides a speech recognition method. Input fingerprint data is acquired and it is determined whether that data is the predetermined fingerprint data; if it is, input voice data is acquired and speech recognition is performed on it to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is entered into the box. The speech recognition function is thus triggered by a fingerprint, and the voice data is recognized and the corresponding result stored for later use. The user can start the speech recognition function quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
In addition, this embodiment of the present invention optimizes the triggering speed and parallelism of the speech recognition function and the intelligence with which it is turned off, enables recognition results to be invoked quickly, makes it easier to further edit and reuse multiple recognition results, and effectively broadens the ways of entering text or characters as well as the usage scenarios and efficiency of speech recognition.
Embodiment Four
As shown in FIG. 6, this embodiment of the present invention provides a speech recognition method. The method can be executed by a terminal device, which can be a device such as a personal computer or a mobile terminal device such as a mobile phone or tablet computer, and which can be the terminal device used by the user. The method can make it convenient for the user to quickly invoke a speech recognition function or speech recognition application. The method can specifically include the following steps:
In step S602, when it is detected that the user has performed fingerprint recognition continuously for a duration that reaches a preset duration threshold, the input fingerprint data is acquired.
In step S604, if the fingerprint data is the predetermined fingerprint data, input voice data is acquired. During speech recognition of the voice data, when the amount of recognition result data output per unit time within a predetermined duration is less than a preset value, acquisition of voice data is stopped and the recognition result is obtained.
Besides stopping automatically, speech recognition can be stopped manually by the user. Specifically, during speech recognition of the voice data, when a predetermined end-recognition operation is detected, acquisition of voice data is stopped and the recognition result is obtained.
In step S606, the recognition result is stored.
Steps S602 to S606 are the same as steps S302 to S306 in Embodiment Two, respectively; for their specific processing, see the relevant descriptions of steps S302 to S306, which are not repeated here.
考虑到输入的语音数据可能是多个,相应的,识别结果中也应该包括多个语音数据的识别结果,而用户选择使用的识别结果可能是其中的一个或多个识别结果,而且,用户可能最希望使用最近存储时间点的识别结果,或者,用户可能还需要对识别结果进行删除、排序或修改等编辑操作,为此,本发明实施例还提供了相应的处理过程,具体可以参见下述步骤S608~步骤S620。Considering that there may be multiple input voice data, correspondingly, the recognition result should also include the recognition results of multiple voice data, and the recognition result selected by the user may be one or more of the recognition results, and the user may It is most desirable to use the recognition results at the most recent storage point in time, or the user may also need to perform edit operations such as deleting, sorting, or modifying the recognition results. For this reason, the embodiment of the present invention also provides a corresponding processing process. For details, please refer to the following Step S608 to step S620.
In step S608, the storage time point of the recognition result is recorded.
In step S610, when an input operation on an information input box is detected and the interval between the current time point and the storage time point is less than a predetermined first time threshold, the recognition results of the multiple pieces of voice data are displayed.
Steps S608 to S610 are the same as steps S308 to S310 in Embodiment 2, respectively. For the specific processing of steps S608 to S610, refer to the description of steps S308 to S310; details are not repeated here.
If the user performs an information input operation right after finishing speech recognition, the user most likely wants the recognition result with the most recent storage time point. A policy can therefore be set to determine, before the recognition results are edited, whether the user wants the most recently stored result. If not, steps S612 to S616 below are performed; if so, step S618 below is performed.
In step S612, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is not less than a predetermined second time threshold, then, when an editing instruction is received, the recognition results of the multiple pieces of voice data are set to a to-be-edited state.
The second time threshold can be set according to actual conditions, for example 10 seconds or 8 seconds. The first time threshold is greater than the second time threshold; for example, the first time threshold is 20 seconds and the second time threshold is 10 seconds.
In step S614, when an operation instruction to end editing is received, the edited recognition result is acquired.
In step S616, the edited recognition result is input into the information input box.
Steps S612 to S616 are the same as steps S512 to S516 in Embodiment 3, respectively. For the specific processing of steps S612 to S616, refer to the description of steps S512 to S516; details are not repeated here.
In step S618, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is less than the predetermined second time threshold, the recognition result with the most recent storage time point is input into the information input box.
In practice, for example, suppose the recognition results of the multiple pieces of voice data comprise three results: recognition result 1, stored at 10:32:20; recognition result 2, stored at 10:31:50; and recognition result 3, stored at 10:31:10. The first time threshold is 20 seconds, the second time threshold is 10 seconds, and the current time point is 10:32:28. The most recent storage time point is then 10:32:20, and the interval between the current time point 10:32:28 and 10:32:20 is 8 seconds, which is less than the second time threshold of 10 seconds. Recognition result 1, stored at 10:32:20, can therefore be input directly into the information input box.
Note that the user can still edit the content of a recognition result after it has been input into the information input box.
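The threshold decision in steps S610 to S618 can be sketched with the timestamps from the example above. The function name `choose_action`, the action labels, and the behaviour when the first time threshold is exceeded (returning `"none"`) are assumptions made for illustration; the text only specifies what happens inside the two thresholds.

```python
from datetime import datetime, timedelta

FIRST_THRESHOLD = timedelta(seconds=20)   # example value from the text
SECOND_THRESHOLD = timedelta(seconds=10)  # example value from the text


def choose_action(now, stored):
    """Decide what to do when an input operation on the input box is detected.

    stored maps a result name to its storage time point (hypothetical structure).
    """
    latest = max(stored, key=stored.get)   # result with the most recent storage time
    gap = now - stored[latest]
    if gap < SECOND_THRESHOLD:
        return ("input_latest", latest)    # step S618: input the latest result directly
    if gap < FIRST_THRESHOLD:
        return ("show_list", None)         # steps S610/S612: display results for editing
    return ("none", None)                  # assumption: nothing is offered


def t(text):
    """Parse an HH:MM:SS time of day."""
    return datetime.strptime(text, "%H:%M:%S")


stored = {"result_1": t("10:32:20"), "result_2": t("10:31:50"), "result_3": t("10:31:10")}
print(choose_action(t("10:32:28"), stored))  # the 8-second gap from the example
```

Because the first threshold is larger than the second, the second-threshold check is done first; a gap that passes it necessarily passes the first-threshold check as well.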
In step S620, the recognition result in the information input box is stored or sent.
In addition, if the recognition result in the clipboard contains information such as a mobile phone number, an email address, a link, or an account number, the user can also perform shortcut operations on it, such as quick dialing, sending a short message, or opening a browser.
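One way to find such items in a recognition result is simple pattern matching, sketched below. The patterns are deliberately rough assumptions (an 11-digit number starting with 1 for a mobile number, a basic email shape, an `http`/`https` link); the text does not say how the terminal detects them, and `find_shortcuts` is a hypothetical helper.

```python
import re

# Illustrative patterns only; real detection would need to be more careful.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "link": re.compile(r"https?://[^\s]+"),
    "phone": re.compile(r"(?<!\d)1\d{10}(?!\d)"),
}


def find_shortcuts(text):
    """Return (kind, matched_text) pairs found in a recognition result."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((kind, match.group()))
    return found


print(find_shortcuts("call 13800138000 or mail a.b@example.com"))
```

Each kind could then be mapped to a shortcut, for example a phone number to quick dialing or a short message, and a link to opening the browser.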
The embodiment of the present invention provides a speech recognition method. Input fingerprint data is acquired and checked against predetermined fingerprint data. If it matches, input voice data is acquired and recognized to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is input into the box. The fingerprint thus triggers the speech recognition function, which recognizes the voice data and produces a recognition result that is stored for later use. The user can start speech recognition quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
In addition, the embodiment of the present invention improves the triggering speed and parallelism of the speech recognition function and the intelligence with which it is closed, enables quick retrieval of recognition results, makes it easier to further edit and reuse multiple recognition results, and effectively broadens the usage scenarios and efficiency of text or character input and of speech recognition.
Embodiment 5
The above is the speech recognition method provided by the embodiments of the present invention. Based on the same idea, an embodiment of the present invention also provides a speech recognition device, as shown in FIG. 7.
The speech recognition device includes a data acquisition module 701, a recognition result determination module 702, and a first input module 703, wherein:
the data acquisition module 701 is configured to acquire input fingerprint data;
the recognition result determination module 702 is configured to, if the fingerprint data is predetermined fingerprint data, acquire input voice data and perform speech recognition on the voice data to obtain a recognition result;
the first input module 703 is configured to input the recognition result into an information input box when an input operation on the information input box is detected.
In an embodiment of the present invention, the device further includes:
a storage module, configured to store the recognition result;
a recording module, configured to record the storage time point of the recognition result;
the first input module 703 is configured to input the recognition result into the information input box when an input operation on the information input box is detected and the interval between the current time point and the storage time point is less than a predetermined first time threshold.
In an embodiment of the present invention, the recognition result includes the recognition results of multiple pieces of voice data,
and the first input module 703 includes:
a display unit, configured to display the recognition results of the multiple pieces of voice data;
a selection unit, configured to, when an operation instruction indicating that selection has ended is received, acquire the recognition result of at least one piece of voice data selected from the recognition results of the multiple pieces of voice data;
an input unit, configured to input the acquired recognition result of the at least one piece of voice data into the information input box.
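The selection and input units can be pictured with a small sketch. The function `input_selected`, the use of list indices to represent the user's selection, and appending the chosen results to the existing box text in selection order are all assumptions made for illustration.

```python
def input_selected(results, selected, box_text=""):
    """Append the selected recognition results to the input box text.

    results  -- recognition results of the multiple pieces of voice data
    selected -- indices chosen by the user before ending the selection
    """
    if not selected:
        # The text requires the result of at least one piece of voice data.
        raise ValueError("at least one recognition result must be selected")
    chosen = [results[i] for i in selected]
    return box_text + "".join(chosen)


# The user picks the first and third of three displayed results.
print(input_selected(["hello ", "world", "!"], [0, 2]))
```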
In an embodiment of the present invention, the first input module 703 includes:
the display unit, configured to display the recognition results of the multiple pieces of voice data;
a state setting unit, configured to set the recognition results of the multiple pieces of voice data to a to-be-edited state when an editing instruction is received;
an editing unit, configured to acquire the edited recognition result when an operation instruction to end editing is received;
the input unit, configured to input the edited recognition result into the information input box.
In an embodiment of the present invention, the device further includes:
a second input module, configured to, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is less than a predetermined second time threshold, input the recognition result with the most recent storage time point into the information input box;
the state setting unit is configured to, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is not less than the predetermined second time threshold, set the recognition results of the multiple pieces of voice data to a to-be-edited state when an editing instruction is received.
In an embodiment of the present invention, the device further includes:
a processing module, configured to store or send the recognition result in the information input box.
In an embodiment of the present invention, the data acquisition module 701 is configured to acquire the input fingerprint data when it is detected that the duration for which the user has continuously performed fingerprint recognition reaches a preset duration threshold.
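The duration check on the fingerprint sensor can be sketched as below. This assumes the sensor is polled and reports `(timestamp_in_seconds, finger_on_sensor)` samples in time order; that sample format and the function name `press_reaches_threshold` are illustrative assumptions, not part of the text.

```python
def press_reaches_threshold(samples, threshold_s):
    """Return True once one uninterrupted touch lasts at least threshold_s seconds."""
    start = None
    for timestamp, touching in samples:
        if not touching:
            start = None          # finger lifted: the continuous press is broken
            continue
        if start is None:
            start = timestamp     # a new continuous press begins
        if timestamp - start >= threshold_s:
            return True           # preset duration threshold reached
    return False


held = [(0.0, True), (0.3, True), (0.6, True)]
lifted = [(0.0, True), (0.3, False), (0.4, True), (0.7, True)]
print(press_reaches_threshold(held, 0.5), press_reaches_threshold(lifted, 0.5))
```

Requiring a continuous press separates a deliberate trigger of speech recognition from a brief touch, such as an ordinary unlock.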
In an embodiment of the present invention, the recognition result determination module 702 is configured to, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when a predetermined end-recognition operation is detected; or, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when the amount of recognition-result data output per unit time within a predetermined duration is less than a preset value.
The speech recognition device provided by the embodiment of the present invention can implement each process implemented by the terminal device in the method embodiments of FIG. 1 to FIG. 6. To avoid repetition, details are not repeated here.
The embodiment of the present invention provides a speech recognition device. Input fingerprint data is acquired and checked against predetermined fingerprint data. If it matches, input voice data is acquired and recognized to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is input into the box. The fingerprint thus triggers the speech recognition function, which recognizes the voice data and produces a recognition result that is stored for later use. The user can start speech recognition quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
In addition, the embodiment of the present invention improves the triggering speed and parallelism of the speech recognition function and the intelligence with which it is closed, enables quick retrieval of recognition results, makes it easier to further edit and reuse multiple recognition results, and effectively broadens the usage scenarios and efficiency of text or character input and of speech recognition.
Embodiment 6
FIG. 8 is a schematic diagram of the hardware structure of a mobile terminal implementing the embodiments of the present invention.
The mobile terminal 800 includes, but is not limited to, components such as a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, a processor 810, and a power supply 811. Those skilled in the art will understand that the mobile terminal structure shown in FIG. 8 does not limit the mobile terminal; the mobile terminal may include more or fewer components than shown, combine certain components, or arrange the components differently. In the embodiments of the present invention, mobile terminals include, but are not limited to, mobile phones, tablet computers, notebook computers, palmtop computers, vehicle-mounted terminals, wearable devices, and pedometers.
The processor 810 is configured to acquire input fingerprint data;
the processor 810 is further configured to, if the fingerprint data is predetermined fingerprint data, acquire input voice data and perform speech recognition on the voice data to obtain a recognition result;
the input unit 804 is configured to input the recognition result into an information input box when an input operation on the information input box is detected.
In addition, the memory 809 is configured to store the recognition result;
the processor 810 is further configured to record the storage time point of the recognition result;
the input unit 804 is further configured to input the recognition result into the information input box when an input operation on the information input box is detected and the interval between the current time point and the storage time point is less than a predetermined first time threshold.
In addition, the recognition result includes the recognition results of multiple pieces of voice data,
and the input unit 804 is configured to display the recognition results of the multiple pieces of voice data; when an operation instruction indicating that selection has ended is received, acquire the recognition result of at least one piece of voice data selected from the recognition results of the multiple pieces of voice data; and input the acquired recognition result of the at least one piece of voice data into the information input box.
In addition, the input unit 804 is configured to display the recognition results of the multiple pieces of voice data; when an editing instruction is received, set the recognition results of the multiple pieces of voice data to a to-be-edited state; when an operation instruction to end editing is received, acquire the edited recognition result; and input the edited recognition result into the information input box.
In addition, the input unit 804 is further configured to, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is less than a predetermined second time threshold, input the recognition result with the most recent storage time point into the information input box;
the input unit 804 is further configured to, if the interval between the current time point and the most recent storage time point among the recognition results of the multiple pieces of voice data is not less than the predetermined second time threshold, set the recognition results of the multiple pieces of voice data to a to-be-edited state when an editing instruction is received.
In addition, the processor 810 is further configured to store or send the recognition result in the information input box.
In addition, the processor 810 is further configured to acquire the input fingerprint data when it is detected that the duration for which the user has continuously performed fingerprint recognition reaches a preset duration threshold.
In addition, the processor 810 is further configured to, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when a predetermined end-recognition operation is detected; or, during speech recognition of the voice data, stop acquiring voice data and obtain the recognition result when the amount of recognition-result data output per unit time within a predetermined duration is less than a preset value.
The embodiment of the present invention provides a mobile terminal. Input fingerprint data is acquired and checked against predetermined fingerprint data. If it matches, input voice data is acquired and recognized to obtain a recognition result, and when an input operation on an information input box is detected, the recognition result is input into the box. The fingerprint thus triggers the speech recognition function, which recognizes the voice data and produces a recognition result that is stored for later use. The user can start speech recognition quickly and accurately, which greatly shortens the path for invoking the function and simplifies the speech recognition process.
In addition, the embodiment of the present invention improves the triggering speed and parallelism of the speech recognition function and the intelligence with which it is closed, enables quick retrieval of recognition results, makes it easier to further edit and reuse multiple recognition results, and effectively broadens the usage scenarios and efficiency of text or character input and of speech recognition.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 801 can be used to receive and send signals while information is being sent and received or during a call. Specifically, it receives downlink data from a base station and passes it to the processor 810 for processing, and it sends uplink data to the base station. Typically, the radio frequency unit 801 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, and a duplexer. In addition, the radio frequency unit 801 can also communicate with networks and other devices through a wireless communication system.
Through the network module 802, the mobile terminal provides the user with wireless broadband Internet access, for example helping the user send and receive email, browse web pages, and access streaming media.
The audio output unit 803 can convert audio data received by the radio frequency unit 801 or the network module 802, or stored in the memory 809, into an audio signal and output it as sound. The audio output unit 803 can also provide audio output related to a specific function performed by the mobile terminal 800 (for example, a call signal reception sound or a message reception sound). The audio output unit 803 includes a speaker, a buzzer, a receiver, and the like.
The input unit 804 is used to receive audio or video signals. The input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042. The graphics processing unit 8041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The processed image frames can be displayed on the display unit 806. Image frames processed by the graphics processing unit 8041 can be stored in the memory 809 (or another storage medium) or sent via the radio frequency unit 801 or the network module 802. The microphone 8042 can receive sound and process it into audio data. In telephone call mode, the processed audio data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 801 and then output.
The mobile terminal 800 also includes at least one sensor 805, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 8061 according to the ambient light level, and the proximity sensor can turn off the display panel 8061 and/or the backlight when the mobile terminal 800 is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used to identify the posture of the mobile terminal (for example, switching between landscape and portrait, related games, and magnetometer posture calibration) and for vibration-recognition functions (for example, a pedometer or tap detection). The sensor 805 may also include a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and so on, which are not described further here.
The display unit 806 is used to display information input by the user or information provided to the user. The display unit 806 may include a display panel 8061, which may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The user input unit 807 can be used to receive input numeric or character information and to generate key signal input related to user settings and function control of the mobile terminal. Specifically, the user input unit 807 includes a touch panel 8071 and other input devices 8072. The touch panel 8071, also called a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 8071 with a finger, a stylus, or any other suitable object or accessory). The touch panel 8071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch and the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 810, and receives and executes commands sent by the processor 810. The touch panel 8071 can be implemented in several types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 8071, the user input unit 807 may also include other input devices 8072. Specifically, the other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not described further here.
Further, the touch panel 8071 can be overlaid on the display panel 8061. When the touch panel 8071 detects a touch operation on or near it, it passes the operation to the processor 810 to determine the type of touch event, and the processor 810 then provides corresponding visual output on the display panel 8061 according to the type of touch event. Although in FIG. 8 the touch panel 8071 and the display panel 8061 are two independent components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 8071 and the display panel 8061 can be integrated to implement the input and output functions of the mobile terminal; this is not limited here.
The interface unit 808 is the interface through which an external device connects to the mobile terminal 800. For example, the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device that has an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The interface unit 808 can be used to receive input (for example, data or power) from an external device and pass the received input to one or more elements within the mobile terminal 800, or to transfer data between the mobile terminal 800 and an external device.
The memory 809 can be used to store software programs and various data. The memory 809 may mainly include a program storage area and a data storage area. The program storage area can store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area can store data created through use of the mobile phone (such as audio data and a phone book). In addition, the memory 809 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
处理器810是移动终端的控制中心,利用各种接口和线路连接整个移动终端的各个部分,通过运行或执行存储在存储器809内的软件程序和/或模块,以及调用存储在存储器809内的数据,执行移动终端的各种功能和处理数据,从而对移动终端进行整体监控。处理器810可包括一个或多个处理单元;优选的,处理器810可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器810中。The processor 810 is the control center of the mobile terminal, which uses various interfaces and lines to connect various parts of the entire mobile terminal, by running or executing software programs and/or modules stored in the memory 809, and calling data stored in the memory 809 , execute various functions of the mobile terminal and process data, so as to monitor the mobile terminal as a whole. The processor 810 may include one or more processing units; preferably, the processor 810 may integrate an application processor and a modem processor, wherein the application processor mainly processes the operating system, user interface and application programs, etc., and the modem The processor mainly handles wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 810 .
移动终端800还可以包括给各个部件供电的电源811(比如电池),优选的,电源811可以通过电源管理系统与处理器810逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。The mobile terminal 800 can also include a power supply 811 (such as a battery) for supplying power to various components. Preferably, the power supply 811 can be logically connected to the processor 810 through a power management system, so as to manage charging, discharging, and power consumption through the power management system. and other functions.
Preferably, an embodiment of the present invention also provides a mobile terminal including a processor 810, a memory 809, and a computer program stored in the memory 809 and executable on the processor 810. When executed by the processor 810, the computer program implements each process of the above speech recognition method embodiments and achieves the same technical effects; to avoid repetition, the details are not repeated here.
Embodiment Seven
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above speech recognition method embodiments and achieves the same technical effects; to avoid repetition, the details are not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of the present invention provides a computer-readable storage medium. Input fingerprint data is acquired and checked against predetermined fingerprint data; if it matches, input voice data is acquired and speech recognition is performed on it to obtain a recognition result; when an input operation on an information input box is detected, the recognition result is entered into that box. In this way, the fingerprint triggers the speech recognition function, which recognizes the voice data and stores the corresponding recognition result for later use. The user can therefore start speech recognition quickly and accurately, which greatly shortens the call path to the speech recognition function and simplifies the speech recognition process.
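The flow described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `VoiceInputSession`, its method names, and the `recognize` callback are hypothetical, fingerprint matching is reduced to a byte comparison purely for illustration, and inserting the most recently stored result is an assumption about which result is used.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VoiceInputSession:
    """Hypothetical model of fingerprint-triggered speech recognition."""

    enrolled_fingerprint: bytes        # stands in for the "predetermined fingerprint data"
    recognize: Callable[[bytes], str]  # stand-in for a real speech recognizer
    results: List[str] = field(default_factory=list)  # stored recognition results

    def on_fingerprint(self, fingerprint: bytes, voice_data: bytes) -> Optional[str]:
        """Recognize the voice data only if the fingerprint matches, then store the result."""
        if fingerprint != self.enrolled_fingerprint:
            return None  # not the predetermined fingerprint: recognition is not started
        text = self.recognize(voice_data)
        self.results.append(text)  # kept for later use
        return text

    def on_input_box_operation(self) -> Optional[str]:
        """Return the text to enter when an input operation on an input box is detected."""
        # Assumption: the most recently stored result is the one inserted.
        return self.results[-1] if self.results else None


# Toy recognizer for demonstration: treats the "audio" bytes as UTF-8 text.
session = VoiceInputSession(b"enrolled", recognize=lambda audio: audio.decode("utf-8"))
session.on_fingerprint(b"other", b"ignored")   # mismatch: nothing is recognized or stored
session.on_fingerprint(b"enrolled", b"hello")  # match: the result is stored
print(session.on_input_box_operation())
```

Separating recognition (on fingerprint match) from insertion (on input-box operation) mirrors the description's point that results are stored first and used later.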
In addition, the embodiments of the present invention optimize the triggering speed and parallelism of the speech recognition function and the intelligence with which it is closed, enable recognition results to be called up quickly, make it easier to further edit and reuse multiple recognition results, and effectively broaden the usage scenarios and efficiency of text or character input and of speech recognition.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include non-permanent memory in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are merely embodiments of the present invention and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of its claims.
Claims (15)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711038376.1A (CN107819946B) | 2017-10-27 | 2017-10-27 | Method, device and mobile terminal for voice recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711038376.1A (CN107819946B) | 2017-10-27 | 2017-10-27 | Method, device and mobile terminal for voice recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107819946A | 2018-03-20 |
CN107819946B | 2019-09-27 |
Family
ID=61603369
Family Applications (1)
Application Number | Status | Publication | Title | Priority Date | Filing Date |
---|---|---|---|---|---|
CN201711038376.1A | Active | CN107819946B | Method, device and mobile terminal for voice recognition | 2017-10-27 | 2017-10-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107819946B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106161811A (en) * | 2016-06-24 | 2016-11-23 | 维沃移动通信有限公司 | A kind of reminding method and mobile terminal |
CN107193914A (en) * | 2017-05-15 | 2017-09-22 | 广东艾檬电子科技有限公司 | A kind of pronunciation inputting method and mobile terminal |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111367504A (en) * | 2018-12-26 | 2020-07-03 | 商派软件有限公司 | A data selector and data selection method applicable to all scenarios |
CN111367504B (en) * | 2018-12-26 | 2021-01-26 | 商派软件有限公司 | A data selector and data selection method applicable to all scenarios |
CN111625508A (en) * | 2020-06-01 | 2020-09-04 | 联想(北京)有限公司 | Information processing method and device |
CN112860011A (en) * | 2020-12-31 | 2021-05-28 | 维沃移动通信有限公司 | Folding electronic device and control method and control device thereof |
CN112860011B (en) * | 2020-12-31 | 2024-04-26 | 维沃移动通信有限公司 | Foldable electronic device and control method and control device thereof |
Also Published As
Publication number | Publication date |
---|---|
CN107819946B (en) | 2019-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108021305B (en) | Method, device and mobile terminal for application association startup | |
CN107613131B (en) | Application program do not disturb method, mobile terminal and computer readable storage medium | |
CN107896279A (en) | Screenshotss processing method, device and the mobile terminal of a kind of mobile terminal | |
CN108647058A (en) | A method for starting an application program and a mobile terminal | |
CN108108214A (en) | A kind of guiding method of operating, device and mobile terminal | |
WO2021136159A1 (en) | Screenshot method and electronic device | |
CN110908513B (en) | A data processing method and electronic device | |
CN108334272B (en) | A control method and mobile terminal | |
CN106933351B (en) | A method, device and mobile terminal for starting a camera in a mobile terminal | |
CN107870674B (en) | A program starting method and mobile terminal | |
US10951754B2 (en) | Method for responding to incoming call by means of fingerprint recognition, storage medium, and mobile terminal | |
US20150253894A1 (en) | Activation of an electronic device with a capacitive keyboard | |
CN107623794A (en) | A voice data processing method, device and mobile terminal | |
CN107066090B (en) | Method for controlling fingerprint identification module and mobile terminal | |
CN108133708A (en) | A kind of control method of voice assistant, device and mobile terminal | |
CN109683768A (en) | A kind of operating method and mobile terminal of application | |
CN111580911A (en) | Operation prompting method and device for terminal, storage medium and terminal | |
CN107819946B (en) | Method, device and mobile terminal for voice recognition | |
CN108664818A (en) | A kind of unlock control method and device | |
CN108076229A (en) | Application running state control method and mobile terminal | |
CN110932964A (en) | Information processing method and device | |
CN108696642B (en) | Method for arranging icons and mobile terminal | |
CN107562356B (en) | Fingerprint identification positioning method and device, storage medium and electronic equipment | |
CN111405043B (en) | Information processing method and device and electronic equipment | |
CN108170360B (en) | Control method of gesture function and mobile terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||