
CN102074231A - Speech recognition method and speech recognition system - Google Patents

Speech recognition method and speech recognition system

Info

Publication number
CN102074231A
CN102074231A
Authority
CN
China
Prior art keywords
speech recognition
user
information
scene
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010106143655A
Other languages
Chinese (zh)
Inventor
冯雁
杨永胜
黄石磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wanyinda Co ltd
Original Assignee
Wanyinda Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wanyinda Co., Ltd.
Priority to CN2010106143655A
Publication of CN102074231A
Legal status: Pending

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A speech recognition method comprising the steps of: collecting speech; extracting features from the collected speech; acquiring the user's scene information and matching a grammar model or language model according to that scene information; and performing a pattern-matching algorithm with the matched grammar model or language model to obtain a speech recognition result. The method can improve the accuracy of speech recognition. A corresponding speech recognition system is also provided.

Description

Speech recognition method and speech recognition system
[technical field]
The present invention relates to speech recognition technology, and in particular to a speech recognition method and a speech recognition system.
[background technology]
Speech recognition converts the vocabulary content of human speech into computer-readable input such as key presses, binary codes, or character strings. A traditional speech recognition method collects speech and then performs feature extraction on it: the speech waveform is passed through linear or nonlinear operations to obtain a sequence of feature vectors, which a pattern-matching algorithm then maps to the closest pronunciation-unit sequence in the model, and finally to a recognition result. However, this traditional method performs pattern matching only against the collected speech and fixed acoustic and language (or grammar) models, so its recognition accuracy is limited.
[summary of the invention]
Accordingly, it is necessary to provide a speech recognition method that can improve speech recognition accuracy.
A speech recognition method comprises the following steps:
collecting speech;
extracting features from the collected speech;
acquiring the user's scene information, and matching a grammar model or language model according to the scene information;
performing a pattern-matching algorithm with the matched grammar model or language model to obtain a speech recognition result.
Preferably, the method further comprises the step of acquiring the user's location information and matching a grammar model or language model according to the location information.
Preferably, the method further comprises the step of matching a pronunciation dictionary according to the location information and scene information;
in that case, the step of performing the pattern-matching algorithm to obtain the speech recognition result becomes:
performing the pattern-matching algorithm with the matched grammar model, language model, and pronunciation dictionary to obtain the speech recognition result.
Preferably, the location information is a geographic position or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene-change data produced during the user's interaction.
Preferably, the location information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene-change data actively set or changed by the user.
It is also necessary to provide a speech recognition system that can improve speech recognition accuracy.
A speech recognition system comprises a client and a server that interacts with it. The client comprises:
a speech acquisition module for collecting speech;
a first communication module for sending the collected speech to the server.
The server comprises:
a second communication module for receiving the speech sent by the first communication module;
a feature extraction module for extracting features from the speech;
a speech recognition module for acquiring the user's scene information, matching a grammar model or language model according to the scene information, and performing a pattern-matching algorithm with the matched grammar model or language model to obtain a speech recognition result.
Preferably, the client further comprises:
an information acquisition module for acquiring the user's scene information and location information;
the first communication module being further configured to send the scene information and location information to the server.
Preferably, the speech recognition module is further configured to acquire the user's location information and match a grammar model or language model according to the location information; and the server further comprises a database for storing the user's location information and scene information.
Preferably, the speech recognition module is further configured to match a pronunciation dictionary according to the location information and scene information, and to perform the pattern-matching algorithm with the matched grammar model, language model, and pronunciation dictionary to obtain the speech recognition result.
Preferably, the location information is location information or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene-change data produced during the user's interaction.
Preferably, the location information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene-change data actively set or changed by the user.
In the speech recognition method and system above, a grammar model or language model is matched according to the user's scene information, so the parameters of the model used by the pattern-matching algorithm change with the user's scene. The model therefore adapts to the user's interaction scenario, which improves the accuracy of speech recognition.
[description of drawings]
Fig. 1 is a flowchart of a speech recognition method in one embodiment;
Fig. 2 is a structural diagram of a speech recognition system in one embodiment;
Fig. 3 is a structural diagram of a speech recognition system in another embodiment.
[embodiment]
Fig. 1 shows the flow of a speech recognition method in one embodiment, which comprises the following steps:
Step S102: collect speech. In one embodiment, speech is entered through client software installed on a mobile terminal: the user clicks a key to enter speech-capture mode, speaks, and clicks the key again to finish. The client software captures the speech and can send it to a background server for processing.
Step S104: extract features from the collected speech. The collected data is a speech waveform; after feature extraction, acoustic features of the speech are obtained. Traditional speech feature extraction algorithms can be applied to the waveform, for example extracting MFCCs (Mel-frequency cepstral coefficients), LPC (linear predictive coding) coefficients, or speech energy.
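As a minimal illustration of frame-based feature extraction, the sketch below splits a waveform into overlapping frames and computes the short-time log energy of each — one of the classic features mentioned above. The frame and hop sizes (25 ms / 10 ms at 16 kHz) are conventional choices, not values from the patent, and a real front end would compute MFCCs rather than energy alone.

```python
import math

def frame_signal(samples, frame_len=400, hop=160):
    """Split a waveform into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

def log_energy(frame):
    """Short-time log energy of one frame (epsilon avoids log(0))."""
    energy = sum(s * s for s in frame)
    return math.log(energy + 1e-10)

# A synthetic "waveform": a quiet stretch followed by a louder one.
wave = [0.01] * 800 + [0.5] * 800
feats = [log_energy(f) for f in frame_signal(wave)]
```

The resulting feature vector sequence (here, one scalar per frame) is what the pattern-matching stage consumes.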
Step S106: acquire the user's scene information and match a grammar model or language model according to it. Scene information means the scene-change data produced during the user's interaction: as the user runs various applications through the client software on the mobile terminal, the interaction with the application system generates scene-change data, for example the query scope and query results produced when looking up shopping information or flight information.
An appropriate grammar model or language model is matched to the user's scene information. For example, when the user queries company names, a grammar model in which company names have a high probability of occurrence is used; when the user queries clothing-shop information, a grammar or language model in which clothing-shop names have a high probability is used instead.
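The scene-to-model matching described above can be sketched as a simple lookup table with a fallback; the scene labels and model names here are illustrative only, not identifiers from the patent.

```python
# Hypothetical mapping from interaction scenes to model names.
SCENE_MODELS = {
    "company_query":  "grammar_company_names",
    "flight_query":   "grammar_flight_info",
    "clothing_query": "grammar_clothing_shops",
}
DEFAULT_MODEL = "grammar_general"

def match_model(scene_info):
    """Pick the grammar/language model whose vocabulary best fits the scene."""
    return SCENE_MODELS.get(scene_info, DEFAULT_MODEL)

chosen = match_model("flight_query")
```

In practice the "model" selected this way would re-weight word probabilities for the decoder, rather than being a mere string.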
Step S108: perform the pattern-matching algorithm with the matched grammar model or language model to obtain the recognition result. The resources needed for speech recognition include speech models, grammar models, and pronunciation dictionaries. Given the acoustic features obtained above, the best-matching result is found among these resources; the traditional Viterbi algorithm can be used to perform the recognition and obtain the result.
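The Viterbi search mentioned here can be illustrated with a textbook implementation over a toy two-state model. The states, transition, and emission probabilities below are invented for demonstration; a real recognizer runs this search over acoustic and language model scores.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state path for an observation sequence."""
    # V[t][s] = (best probability of a path ending in state s at time t,
    #            predecessor state on that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = (prob, prev)
    # Backtrack from the most probable final state.
    best = max(states, key=lambda s: V[-1][s][0])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, V[t][path[0]][1])
    return path

# Toy model: state "A" tends to emit "x", state "B" tends to emit "y".
states = ("A", "B")
start_p = {"A": 0.9, "B": 0.1}
trans_p = {"A": {"A": 0.6, "B": 0.4}, "B": {"A": 0.5, "B": 0.5}}
emit_p = {"A": {"x": 0.8, "y": 0.1}, "B": {"x": 0.2, "y": 0.9}}
path = viterbi(("x", "y"), states, start_p, trans_p, emit_p)
```

Changing the grammar/language model, as step S106 does, amounts to changing the probabilities this search maximizes over.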
Because the user's scene information changes the parameters of the grammar or language model, the model used by the pattern-matching algorithm adapts to the user's interaction scenario, which improves the accuracy of speech recognition.
In one embodiment, the method further comprises acquiring the user's location information and matching a grammar model or language model according to it. The location information may be a geographic position or GPS fix automatically detected and provided by the user's terminal device. It may also be a geographic position the user actively provides or modifies, and likewise the scene information may be scene-change data the user actively sets or changes. For example, a geographic position the user fills in through the client software is stored on the server as part of the user's profile and is updated when the user modifies it. GPS fixes can be obtained in real time, so when the user's position changes, a fresh fix yields the user's current position. A user-set geographic position can also be used for matching: if the terminal device detects that the user is currently in Beijing but the user has set the location to Shanghai, the grammar or language model is matched according to Shanghai.
A relation table between location information and grammar/language models can be maintained on the server, so that once the user's location information is obtained, the appropriate models can be matched. For example, if the user's location is the Beijing area, grammar and language models in which Beijing-area names predominate are matched; when the user moves from Beijing to Shanghai, the updated location causes Shanghai-area models to be matched instead.
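The server-side relation table between regions and models can be sketched as below; the region keys and model names are hypothetical placeholders, and a default pair stands in for regions not in the table.

```python
# Hypothetical relation table between regions and (grammar, language) models.
REGION_MODELS = {
    "Beijing":  ("grammar_beijing", "lm_beijing"),
    "Shanghai": ("grammar_shanghai", "lm_shanghai"),
}

def models_for_location(region, default=("grammar_general", "lm_general")):
    """Re-resolve the models whenever the user's reported region changes."""
    return REGION_MODELS.get(region, default)

pair = models_for_location("Shanghai")
```

When a fresh GPS fix reports a new region, calling this again effects the Beijing-to-Shanghai switch described above.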
In another embodiment, the method further comprises matching a pronunciation dictionary according to the user's location information and scene information. In this embodiment, an appropriate grammar model, language model, and pronunciation dictionary are all matched, and the pattern-matching algorithm is then performed against the matched grammar model, language model, and pronunciation dictionary to obtain the recognition result.
After the pattern-matching algorithm runs, a word sequence of one or more words is obtained; among the candidate words, the word with the highest probability of occurrence is chosen to form the word sequence, which is the recognition result. The result can be symbols, numbers, or text: for example, if the collected speech is the word "today", the recognized result may be rendered as the characters "今天", the pinyin "jintian", or another form, and an application program can process the result further.
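Selecting the highest-probability word at each position can be sketched as follows; the candidate lists and probabilities are invented for illustration.

```python
def best_sequence(hypotheses):
    """At each position, keep the candidate word with the highest probability."""
    return [max(cands, key=lambda wp: wp[1])[0] for cands in hypotheses]

# Hypothetical per-position (word, probability) candidates for one utterance.
hyps = [[("today", 0.7), ("to day", 0.2), ("two day", 0.1)]]
result = best_sequence(hyps)
```

This greedy per-position choice is the simplest reading of the text; a full decoder would score whole sequences jointly.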
Fig. 2 shows a speech recognition system in one embodiment. The system comprises a client 100 and a server 200 that interacts with the client 100.
The client 100 comprises a speech acquisition module 102 and a first communication module 104: the speech acquisition module 102 collects speech, and the first communication module 104 sends the collected speech to the server 200. In one embodiment, the user enters speech through application software installed on a mobile terminal: input starts after a key click and stops after another click; the speech acquisition module 102 captures the speech, and the first communication module 104 sends it to the server 200 for processing.
The server 200 comprises a second communication module 202, a feature extraction module 204, and a speech recognition module 206. The second communication module 202 receives the speech sent by the first communication module 104; the feature extraction module 204 performs feature extraction on that speech; and the speech recognition module 206 acquires the user's scene information, matches a grammar model or language model according to it, and performs the pattern-matching algorithm with the matched model to obtain the recognition result. The user's scene information is the scene-change data produced during the user's interaction.
In this embodiment, the data received by the second communication module 202 is a speech waveform, from which the feature extraction module 204 derives acoustic features; traditional feature extraction algorithms can be used to extract, for example, MFCCs (Mel-frequency cepstral coefficients), LPC (linear predictive coding) coefficients, or speech energy.
The user generates scene-change data through the various applications installed on the mobile terminal, for example the query scope and query results produced when looking up shopping information or flight information. The speech recognition module 206 matches an appropriate grammar or language model to the user's scene information: when the user queries company names, a grammar or language model in which company names have a high probability of occurrence is used; when the user queries clothing-shop information, a grammar or language model in which clothing-shop names have a high probability is used instead. The resources needed for recognition include speech models, grammar models, and pronunciation dictionaries; given the acoustic features obtained above, the speech recognition module 206 finds the best-matching result among these resources, for example using the traditional Viterbi algorithm, and obtains the recognition result.
Because the user's scene information changes the parameters of the grammar or language model, the model used by the pattern-matching algorithm adapts to the user's interaction scenario, which improves the accuracy of speech recognition.
Fig. 3 shows a speech recognition system in another embodiment. On the basis of the preceding embodiment, the client 100 further comprises an information acquisition module 106, and the server 200 further comprises a database 208.
The information acquisition module 106 acquires the user's scene information and location information; in this embodiment, the first communication module 104 sends both to the server 200. The scene information is the scene-change data produced during the user's interaction, and the location information may be a geographic position or GPS fix automatically detected and provided by the user's terminal device. It may also be a geographic position the user actively provides or modifies: for example, a position the user fills in through the client software is stored in the database 208 of the server 200 as part of the user's profile, and the database 208 is updated when the user modifies it. GPS fixes can be obtained in real time, so when the user's position changes, a fresh fix yields the current position. A user-set geographic position can also be used for matching: if the terminal device detects that the user is currently in Beijing but the user has set the location to Shanghai, the grammar and language models are matched according to Shanghai.
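The precedence described here — a location the user actively sets overrides the automatically detected one — can be sketched as a small helper; the function name and region strings are illustrative assumptions.

```python
def effective_location(user_set, detected):
    """A user-set location (e.g. Shanghai) overrides the detected one (e.g. Beijing)."""
    return user_set if user_set is not None else detected

loc = effective_location("Shanghai", "Beijing")
```

With no user-set value, the detected or GPS-derived region is used as-is.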
The database 208 stores the user's location information and scene information. It can also store the speech recognition resources, i.e., the speech models, grammar models, and pronunciation dictionaries used for recognition.
The speech recognition module 206 is also used to acquire the user's location information and match a grammar model or language model according to it. A relation table between location information and grammar/language models can be maintained in the database 208; once the speech recognition module 206 obtains the user's location information, it can match the appropriate models. For example, if the user's location is the Beijing area, grammar and language models in which Beijing-area names predominate are matched; when the user moves from Beijing to Shanghai, the updated location causes Shanghai-area models to be matched instead.
In another embodiment, the speech recognition module 206 also acquires the location information and scene information, matches a pronunciation dictionary according to them, and performs the pattern-matching algorithm with the matched grammar model, language model, and pronunciation dictionary to obtain the recognition result.
After the speech recognition module 206 runs the pattern-matching algorithm, a word sequence of one or more words is obtained; among the candidate words, the word with the highest probability of occurrence is chosen to form the word sequence, which is the recognition result. The result can be symbols, numbers, or text: for example, if the collected speech is the word "today", the recognized result may be rendered as the characters "今天", the pinyin "jintian", or another form, and an application program can process the result further.
The embodiments above express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the claims. Those of ordinary skill in the art can make variations and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection is therefore defined by the appended claims.

Claims (11)

1. A speech recognition method, comprising the following steps: collecting speech; performing feature extraction on the collected speech; obtaining the user's scene information, and matching a grammar model or language model according to the scene information; and performing a pattern-matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
2. The speech recognition method according to claim 1, further comprising the step of obtaining the user's location information and matching a grammar model or language model according to the location information.
3. The speech recognition method according to claim 2, further comprising the step of matching a pronunciation dictionary according to the location information and scene information; wherein the step of performing the pattern-matching algorithm to obtain the speech recognition result is: performing the pattern-matching algorithm according to the matched grammar model, language model, and pronunciation dictionary to obtain the speech recognition result.
4. The speech recognition method according to claim 2 or 3, wherein the location information is a geographic position or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene-change data produced during the user's interaction.
5. The speech recognition method according to claim 2 or 3, wherein the location information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene-change data actively set or changed by the user.
6. A speech recognition system, comprising a client and a server interacting with it, wherein the client comprises: a speech acquisition module for collecting speech; and a first communication module for sending the collected speech to the server; and the server comprises: a second communication module for receiving the speech sent by the first communication module; a feature extraction module for performing feature extraction on the speech; and a speech recognition module for obtaining the user's scene information, matching a grammar model or language model according to the scene information, and performing a pattern-matching algorithm according to the matched grammar model or language model to obtain a speech recognition result.
7. The speech recognition system according to claim 6, wherein the client further comprises an information acquisition module for obtaining the user's scene information and location information; and the first communication module is further configured to send the scene information and location information to the server.
8. The speech recognition system according to claim 6, wherein the speech recognition module is further configured to obtain the user's location information and match a grammar model or language model according to the location information; and the server further comprises a database storing the user's location information and scene information.
9. The speech recognition system according to claim 6, wherein the speech recognition module is further configured to match a pronunciation dictionary according to the location information and scene information, and to perform the pattern-matching algorithm according to the matched grammar model, language model, and pronunciation dictionary to obtain the speech recognition result.
10. The speech recognition system according to any one of claims 6 to 9, wherein the location information is location information or GPS positioning information automatically detected and provided by the user's terminal device, and the scene information is scene-change data produced during the user's interaction.
11. The speech recognition system according to any one of claims 6 to 9, wherein the location information is a geographic position or GPS positioning information actively provided or modified by the user, and the scene information is scene-change data actively set or changed by the user.
CN2010106143655A 2010-12-30 2010-12-30 Speech recognition method and speech recognition system Pending CN102074231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010106143655A CN102074231A (en) 2010-12-30 2010-12-30 Speech recognition method and speech recognition system


Publications (1)

Publication Number Publication Date
CN102074231A (en) 2011-05-25

Family

ID=44032749

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010106143655A Pending CN102074231A (en) 2010-12-30 2010-12-30 Speech recognition method and speech recognition system

Country Status (1)

Country Link
CN (1) CN102074231A (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103474063A (en) * 2013-08-06 2013-12-25 福建华映显示科技有限公司 Voice recognition system and method
CN103514875A (en) * 2012-06-29 2014-01-15 联想(北京)有限公司 Voice data matching method and electronic equipment
CN103594085A (en) * 2012-08-16 2014-02-19 百度在线网络技术(北京)有限公司 Method and system providing speech recognition result
CN103903611A (en) * 2012-12-24 2014-07-02 联想(北京)有限公司 Speech information identifying method and equipment
CN105161110A (en) * 2015-08-19 2015-12-16 百度在线网络技术(北京)有限公司 Bluetooth connection-based speech recognition method, device and system
CN105225665A (en) * 2015-10-15 2016-01-06 桂林电子科技大学 A kind of audio recognition method and speech recognition equipment
CN105336326A (en) * 2011-09-28 2016-02-17 苹果公司 Speech recognition repair using contextual information
CN105448292A (en) * 2014-08-19 2016-03-30 北京羽扇智信息科技有限公司 Scene-based real-time voice recognition system and method
CN105488044A (en) * 2014-09-16 2016-04-13 华为技术有限公司 Data processing method and device
CN105516289A (en) * 2015-12-02 2016-04-20 广东小天才科技有限公司 Method and system for assisting voice interaction based on position and action
WO2016101577A1 (en) * 2014-12-24 2016-06-30 中兴通讯股份有限公司 Voice recognition method, client and terminal device
CN105788598A (en) * 2014-12-19 2016-07-20 联想(北京)有限公司 Speech processing method and electronic device
CN105845133A (en) * 2016-03-30 2016-08-10 乐视控股(北京)有限公司 Voice signal processing method and apparatus
CN105869635A (en) * 2016-03-14 2016-08-17 江苏时间环三维科技有限公司 Speech recognition method and system
CN105989836A (en) * 2015-03-06 2016-10-05 腾讯科技(深圳)有限公司 Voice acquisition method, device and terminal equipment
CN106128462A (en) * 2016-06-21 2016-11-16 东莞酷派软件技术有限公司 Speech Recognition Method and System
CN106205606A (en) * 2016-08-15 2016-12-07 南京邮电大学 A kind of dynamic positioning and monitoring method based on speech recognition and system
CN106228983A (en) * 2016-08-23 2016-12-14 北京谛听机器人科技有限公司 Scene process method and system during a kind of man-machine natural language is mutual
CN106558306A (en) * 2015-09-28 2017-04-05 广东新信通信息系统服务有限公司 Method for voice recognition, device and equipment
CN106683662A (en) * 2015-11-10 2017-05-17 中国电信股份有限公司 Speech recognition method and device
CN103956169B (en) * 2014-04-17 2017-07-21 北京搜狗科技发展有限公司 A kind of pronunciation inputting method, device and system
CN107316635A (en) * 2017-05-19 2017-11-03 科大讯飞股份有限公司 Speech recognition method and device, storage medium, and electronic device
CN107483714A (en) * 2017-06-28 2017-12-15 努比亚技术有限公司 Voice communication method, mobile terminal, and computer-readable storage medium
CN107785014A (en) * 2017-10-23 2018-03-09 上海百芝龙网络科技有限公司 Semantic understanding method for home scenarios
CN107845382A (en) * 2012-06-21 2018-03-27 谷歌有限责任公司 Dynamic language model
CN107945792A (en) * 2017-11-06 2018-04-20 百度在线网络技术(北京)有限公司 Speech processing method and device
CN108242237A (en) * 2016-12-26 2018-07-03 现代自动车株式会社 Speech processing device, vehicle having the same, and speech processing method
CN108831505A (en) * 2018-05-30 2018-11-16 百度在线网络技术(北京)有限公司 Method and apparatus for identifying the usage scenario of an application
CN108924370A (en) * 2018-07-23 2018-11-30 携程旅游信息技术(上海)有限公司 Call center outbound-call speech waveform analysis method, system, device, and storage medium
CN109065045A (en) * 2018-08-30 2018-12-21 出门问问信息科技有限公司 Speech recognition method, device, electronic device, and computer-readable storage medium
CN109509466A (en) * 2018-10-29 2019-03-22 Oppo广东移动通信有限公司 Data processing method, terminal, and computer storage medium
CN109509473A (en) * 2019-01-28 2019-03-22 维沃移动通信有限公司 Voice control method and terminal device
CN109801619A (en) * 2019-02-13 2019-05-24 安徽大尺度网络传媒有限公司 Intelligent cross-language speech recognition and conversion method
CN109920429A (en) * 2017-12-13 2019-06-21 上海擎感智能科技有限公司 Data processing method and system for in-vehicle speech recognition
CN110085228A (en) * 2019-04-28 2019-08-02 广西盖德科技有限公司 Voice code application method, application client, and system
CN110299136A (en) * 2018-03-22 2019-10-01 上海擎感智能科技有限公司 Processing method and system for speech recognition
CN110364165A (en) * 2019-07-18 2019-10-22 青岛民航凯亚系统集成有限公司 Voice query method for flight dynamic information
CN110459203A (en) * 2018-05-03 2019-11-15 百度在线网络技术(北京)有限公司 Intelligent voice guidance method, device, equipment, and storage medium
CN110827824A (en) * 2018-08-08 2020-02-21 Oppo广东移动通信有限公司 Voice processing method, device, storage medium, and electronic device
CN111048091A (en) * 2019-12-30 2020-04-21 苏州思必驰信息科技有限公司 Voice recognition method, voice recognition device, and computer-readable storage medium
WO2020119541A1 (en) * 2018-12-11 2020-06-18 阿里巴巴集团控股有限公司 Voice data identification method, apparatus, and system
CN111986651A (en) * 2020-09-02 2020-11-24 上海优扬新媒信息技术有限公司 Human-machine interaction method and device, and intelligent interaction terminal
CN112102833A (en) * 2020-09-22 2020-12-18 北京百度网讯科技有限公司 Voice recognition method, device, equipment, and storage medium
CN112102815A (en) * 2020-11-13 2020-12-18 深圳追一科技有限公司 Speech recognition method, speech recognition device, computer equipment, and storage medium
CN114120979A (en) * 2022-01-25 2022-03-01 荣耀终端有限公司 Optimization method, training method, device, and medium for a speech recognition model
WO2023029442A1 (en) * 2021-08-30 2023-03-09 佛山市顺德区美的电子科技有限公司 Smart device control method and apparatus, smart device, and readable storage medium
CN115881105A (en) * 2022-12-01 2023-03-31 湖北星纪时代科技有限公司 Speech recognition method, electronic device, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1293428A (en) * 2000-11-10 2001-05-02 清华大学 Information check method based on speech recognition
CN1674091A (en) * 2005-04-18 2005-09-28 南京师范大学 Speech recognition method for geographic information and its application in navigation systems
CN101345051A (en) * 2008-08-19 2009-01-14 南京师范大学 Voice control method for a geographic information system with quantitative parameters
US20090228281A1 (en) * 2008-03-07 2009-09-10 Google Inc. Voice Recognition Grammar Selection Based on Context
CN101593518A (en) * 2008-05-28 2009-12-02 中国科学院自动化研究所 Method for balancing a real-scene corpus and a finite-state-network corpus

Cited By (62)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105336326A (en) * 2011-09-28 2016-02-17 苹果公司 Speech recognition repair using contextual information
CN107845382A (en) * 2012-06-21 2018-03-27 谷歌有限责任公司 Dynamic language model
CN103514875A (en) * 2012-06-29 2014-01-15 联想(北京)有限公司 Voice data matching method and electronic device
CN103594085A (en) * 2012-08-16 2014-02-19 百度在线网络技术(北京)有限公司 Method and system for providing speech recognition results
CN103594085B (en) * 2012-08-16 2019-04-26 百度在线网络技术(北京)有限公司 Method and system for providing speech recognition results
CN103903611B (en) * 2012-12-24 2018-07-03 联想(北京)有限公司 Voice information recognition method and device
CN103903611A (en) * 2012-12-24 2014-07-02 联想(北京)有限公司 Voice information recognition method and device
CN103474063B (en) * 2013-08-06 2015-12-23 福建华映显示科技有限公司 Voice recognition system and method
CN103474063A (en) * 2013-08-06 2013-12-25 福建华映显示科技有限公司 Voice recognition system and method
CN103956169B (en) * 2014-04-17 2017-07-21 北京搜狗科技发展有限公司 Voice input method, device, and system
CN105448292A (en) * 2014-08-19 2016-03-30 北京羽扇智信息科技有限公司 Scene-based real-time voice recognition system and method
CN105448292B (en) * 2014-08-19 2019-03-12 北京羽扇智信息科技有限公司 A scene-based real-time speech recognition system and method
CN105488044A (en) * 2014-09-16 2016-04-13 华为技术有限公司 Data processing method and device
CN105788598A (en) * 2014-12-19 2016-07-20 联想(北京)有限公司 Speech processing method and electronic device
WO2016101577A1 (en) * 2014-12-24 2016-06-30 中兴通讯股份有限公司 Voice recognition method, client and terminal device
CN105989836A (en) * 2015-03-06 2016-10-05 腾讯科技(深圳)有限公司 Voice acquisition method, device and terminal equipment
CN105161110A (en) * 2015-08-19 2015-12-16 百度在线网络技术(北京)有限公司 Bluetooth connection-based speech recognition method, device and system
CN105161110B (en) * 2015-08-19 2017-11-17 百度在线网络技术(北京)有限公司 Speech recognition method, device, and system based on Bluetooth connection
CN106558306A (en) * 2015-09-28 2017-04-05 广东新信通信息系统服务有限公司 Voice recognition method, device, and equipment
CN105225665A (en) * 2015-10-15 2016-01-06 桂林电子科技大学 Speech recognition method and speech recognition device
CN106683662A (en) * 2015-11-10 2017-05-17 中国电信股份有限公司 Speech recognition method and device
CN105516289A (en) * 2015-12-02 2016-04-20 广东小天才科技有限公司 Method and system for assisting voice interaction based on position and action
CN105869635B (en) * 2016-03-14 2020-01-24 江苏时间环三维科技有限公司 Voice recognition method and system
CN105869635A (en) * 2016-03-14 2016-08-17 江苏时间环三维科技有限公司 Speech recognition method and system
WO2017166631A1 (en) * 2016-03-30 2017-10-05 乐视控股(北京)有限公司 Voice signal processing method, apparatus and electronic device
CN105845133A (en) * 2016-03-30 2016-08-10 乐视控股(北京)有限公司 Voice signal processing method and apparatus
CN106128462A (en) * 2016-06-21 2016-11-16 东莞酷派软件技术有限公司 Speech recognition method and system
CN106205606A (en) * 2016-08-15 2016-12-07 南京邮电大学 Dynamic positioning and monitoring method and system based on speech recognition
CN106228983B (en) * 2016-08-23 2018-08-24 北京谛听机器人科技有限公司 Scene processing method and system for human-machine natural language interaction
CN106228983A (en) * 2016-08-23 2016-12-14 北京谛听机器人科技有限公司 Scene processing method and system for human-machine natural language interaction
CN108242237A (en) * 2016-12-26 2018-07-03 现代自动车株式会社 Speech processing device, vehicle having the same, and speech processing method
CN107316635A (en) * 2017-05-19 2017-11-03 科大讯飞股份有限公司 Speech recognition method and device, storage medium, and electronic device
CN107483714A (en) * 2017-06-28 2017-12-15 努比亚技术有限公司 Voice communication method, mobile terminal, and computer-readable storage medium
CN107785014A (en) * 2017-10-23 2018-03-09 上海百芝龙网络科技有限公司 Semantic understanding method for home scenarios
CN107945792A (en) * 2017-11-06 2018-04-20 百度在线网络技术(北京)有限公司 Speech processing method and device
CN107945792B (en) * 2017-11-06 2021-05-28 百度在线网络技术(北京)有限公司 Speech processing method and device
CN109920429A (en) * 2017-12-13 2019-06-21 上海擎感智能科技有限公司 Data processing method and system for in-vehicle speech recognition
CN110299136A (en) * 2018-03-22 2019-10-01 上海擎感智能科技有限公司 Processing method and system for speech recognition
CN110459203A (en) * 2018-05-03 2019-11-15 百度在线网络技术(北京)有限公司 Intelligent voice guidance method, device, equipment, and storage medium
CN108831505A (en) * 2018-05-30 2018-11-16 百度在线网络技术(北京)有限公司 Method and apparatus for identifying the usage scenario of an application
CN108924370A (en) * 2018-07-23 2018-11-30 携程旅游信息技术(上海)有限公司 Call center outbound-call speech waveform analysis method, system, device, and storage medium
CN108924370B (en) * 2018-07-23 2020-12-15 携程旅游信息技术(上海)有限公司 Call center outbound-call speech waveform analysis method, system, device, and storage medium
CN110827824A (en) * 2018-08-08 2020-02-21 Oppo广东移动通信有限公司 Voice processing method, device, storage medium and electronic equipment
CN109065045A (en) * 2018-08-30 2018-12-21 出门问问信息科技有限公司 Speech recognition method, device, electronic device, and computer-readable storage medium
CN109509466A (en) * 2018-10-29 2019-03-22 Oppo广东移动通信有限公司 Data processing method, terminal and computer storage medium
CN111312233A (en) * 2018-12-11 2020-06-19 阿里巴巴集团控股有限公司 Voice data identification method, device and system
WO2020119541A1 (en) * 2018-12-11 2020-06-18 阿里巴巴集团控股有限公司 Voice data identification method, apparatus and system
CN109509473A (en) * 2019-01-28 2019-03-22 维沃移动通信有限公司 Voice control method and terminal device
CN109509473B (en) * 2019-01-28 2022-10-04 维沃移动通信有限公司 Voice control method and terminal device
CN109801619A (en) * 2019-02-13 2019-05-24 安徽大尺度网络传媒有限公司 Intelligent cross-language speech recognition and conversion method
CN110085228A (en) * 2019-04-28 2019-08-02 广西盖德科技有限公司 Voice code application method, application client, and system
CN110364165A (en) * 2019-07-18 2019-10-22 青岛民航凯亚系统集成有限公司 Voice query method for flight dynamic information
CN111048091A (en) * 2019-12-30 2020-04-21 苏州思必驰信息科技有限公司 Voice recognition method, voice recognition equipment and computer readable storage medium
CN111986651A (en) * 2020-09-02 2020-11-24 上海优扬新媒信息技术有限公司 Human-machine interaction method and device, and intelligent interaction terminal
CN111986651B (en) * 2020-09-02 2023-09-29 度小满科技(北京)有限公司 Human-machine interaction method and device, and intelligent interaction terminal
CN112102833A (en) * 2020-09-22 2020-12-18 北京百度网讯科技有限公司 Voice recognition method, device, equipment and storage medium
CN112102833B (en) * 2020-09-22 2023-12-12 阿波罗智联(北京)科技有限公司 Speech recognition method, device, equipment and storage medium
CN112102815A (en) * 2020-11-13 2020-12-18 深圳追一科技有限公司 Speech recognition method, speech recognition device, computer equipment and storage medium
CN112102815B (en) * 2020-11-13 2021-07-13 深圳追一科技有限公司 Speech recognition method, speech recognition device, computer equipment and storage medium
WO2023029442A1 (en) * 2021-08-30 2023-03-09 佛山市顺德区美的电子科技有限公司 Smart device control method and apparatus, smart device, and readable storage medium
CN114120979A (en) * 2022-01-25 2022-03-01 荣耀终端有限公司 Optimization method, training method, device and medium of voice recognition model
CN115881105A (en) * 2022-12-01 2023-03-31 湖北星纪时代科技有限公司 Speech recognition method, electronic device, and computer-readable storage medium

Similar Documents

Publication Publication Date Title
CN102074231A (en) Speech recognition method and speech recognition system
CN103187053B (en) Input method and electronic equipment
US9542938B2 (en) Scene recognition method, device and mobile terminal based on ambient sound
US9299347B1 (en) Speech recognition using associative mapping
CN103164403B (en) The generation method and system of video index data
CN102592591B (en) Dual-band speech encoding
CN112925945A (en) Conference summary generation method, device, equipment and storage medium
CN101599270A (en) Voice server and voice control method
WO2014101717A1 (en) Voice recognizing method and system for personalized user information
CN105895103A (en) Speech recognition method and device
KR20150134993A (en) Method and Apparatus of Speech Recognition Using Device Information
CN103514882B (en) A kind of speech recognition method and system
KR101302563B1 (en) System and method for constructing named entity dictionary
CN102510426A (en) Personal assistant application access method and system
CN102541505A (en) Voice input method and system
JP2002125047A5 (en)
CN102543076A (en) Speech training method for a voice input method and corresponding system
CN104142831B (en) Application search method and device
CN113409774A (en) Voice recognition method and device and electronic equipment
CN106356054A (en) Method and system for collecting information of agricultural products based on voice recognition
CN113593580B (en) Voiceprint recognition method and device
WO2019075829A1 (en) Voice translation method and apparatus, and translation device
CN104240698A (en) Voice recognition method
JP5112978B2 (en) Speech recognition apparatus, speech recognition system, and program
CN1545694A (en) Client-server based distributed speech recognition system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110525