DE69912754D1 - Apparatus and method for simultaneous multimode dictation - Google Patents
Apparatus and method for simultaneous multimode dictation
Info
- Publication number
- DE69912754D1
- Authority
- DE
- Germany
- Prior art keywords
- recognition
- sequence
- model
- modules
- command
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/193—Formal grammars, e.g. finite state automata, context free grammars or word networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US7733798P | 1998-03-09 | 1998-03-09 | |
| US7773898P | 1998-03-12 | 1998-03-12 | |
| US7792298P | 1998-03-13 | 1998-03-13 | |
| PCT/US1999/005090 WO1999046763A1 (en) | 1998-03-09 | 1999-03-09 | Apparatus and method for simultaneous multimode dictation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| DE69912754D1 (de) | 2003-12-18 |
Family
ID=27373082
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| DE69912754T (Expired - Lifetime), DE69912754D1 (de) | 1998-03-09 | 1999-03-09 | Apparatus and method for simultaneous multimode dictation |
Country Status (8)
| Country | Link |
|---|---|
| US (1) | US6292779B1 (de) |
| EP (1) | EP1062660B1 (de) |
| JP (1) | JP2002507010A (de) |
| AT (1) | ATE254328T1 (de) |
| AU (1) | AU2901299A (de) |
| CA (1) | CA2321299A1 (de) |
| DE (1) | DE69912754D1 (de) |
| WO (1) | WO1999046763A1 (de) |
Families Citing this family (53)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6799169B1 (en) * | 1999-08-13 | 2004-09-28 | International Business Machines Corporation | Method and system for modeless operation of a multi-modal user interface through implementation of independent decision networks |
| US6912499B1 (en) * | 1999-08-31 | 2005-06-28 | Nortel Networks Limited | Method and apparatus for training a multilingual speech model set |
| US6581033B1 (en) | 1999-10-19 | 2003-06-17 | Microsoft Corporation | System and method for correction of speech recognition mode errors |
| US6600497B1 (en) * | 1999-11-15 | 2003-07-29 | Elliot A. Gottfurcht | Apparatus and method to navigate interactive television using unique inputs with a remote control |
| US7020845B1 (en) * | 1999-11-15 | 2006-03-28 | Gottfurcht Elliot A | Navigating internet content on a television using a simplified interface and a remote control |
| GB0003903D0 (en) | 2000-02-18 | 2000-04-05 | Canon Kk | Improved speech recognition accuracy in a multimodal input system |
| US6741963B1 (en) * | 2000-06-21 | 2004-05-25 | International Business Machines Corporation | Method of managing a speech cache |
| WO2002029783A1 (en) * | 2000-09-30 | 2002-04-11 | Intel Corporation | Method and system for using rule-based knowledge to build a class-based domain specific statistical language model |
| US6983239B1 (en) * | 2000-10-25 | 2006-01-03 | International Business Machines Corporation | Method and apparatus for embedding grammars in a natural language understanding (NLU) statistical parser |
| US20020072914A1 (en) | 2000-12-08 | 2002-06-13 | Hiyan Alshawi | Method and apparatus for creation and user-customization of speech-enabled services |
| US7027987B1 (en) * | 2001-02-07 | 2006-04-11 | Google Inc. | Voice interface for a search engine |
| DE10120513C1 (de) * | 2001-04-26 | 2003-01-09 | Siemens Ag | Method for determining a sequence of sound units for synthesizing a speech signal of a tonal language |
| WO2003001319A2 (en) * | 2001-06-26 | 2003-01-03 | Vladimir Grigorievich Yakhno | Method for recognising information images using automatically controlled adaptation and system for carrying out said method |
| US20040150676A1 (en) * | 2002-03-25 | 2004-08-05 | Gottfurcht Elliot A. | Apparatus and method for simple wide-area network navigation |
| US7366645B2 (en) * | 2002-05-06 | 2008-04-29 | Jezekiel Ben-Arie | Method of recognition of human motion, vector sequences and speech |
| US7693720B2 (en) | 2002-07-15 | 2010-04-06 | Voicebox Technologies, Inc. | Mobile systems and methods for responding to natural language speech utterance |
| KR100504982B1 (ko) * | 2002-07-25 | 2005-08-01 | (주) 메카트론 | Environment-adaptive multiple speech recognition apparatus and speech recognition method |
| US7191130B1 (en) * | 2002-09-27 | 2007-03-13 | Nuance Communications | Method and system for automatically optimizing recognition configuration parameters for speech recognition systems |
| US20040138883A1 (en) * | 2003-01-13 | 2004-07-15 | Bhiksha Ramakrishnan | Lossless compression of ordered integer lists |
| US7171358B2 (en) * | 2003-01-13 | 2007-01-30 | Mitsubishi Electric Research Laboratories, Inc. | Compression of language model structures and word identifiers for automated speech recognition systems |
| GB2418764B (en) * | 2004-09-30 | 2008-04-09 | Fluency Voice Technology Ltd | Improving pattern recognition accuracy with distortions |
| ES2237345B1 (es) * | 2005-02-28 | 2006-06-16 | Prous Institute For Biomedical Research S.A. | Method for converting phonemes to written text, and corresponding computer system and computer program |
| US7640160B2 (en) | 2005-08-05 | 2009-12-29 | Voicebox Technologies, Inc. | Systems and methods for responding to natural language speech utterance |
| US7620549B2 (en) | 2005-08-10 | 2009-11-17 | Voicebox Technologies, Inc. | System and method of supporting adaptive misrecognition in conversational speech |
| US7949529B2 (en) | 2005-08-29 | 2011-05-24 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
| EP1796080B1 (de) | 2005-12-12 | 2009-11-18 | Gregory John Gadbois | Multi-voice speech recognition |
| US8781837B2 (en) * | 2006-03-23 | 2014-07-15 | Nec Corporation | Speech recognition system and method for plural applications |
| US20080086311A1 (en) * | 2006-04-11 | 2008-04-10 | Conwell William Y | Speech Recognition, and Related Systems |
| US8073681B2 (en) | 2006-10-16 | 2011-12-06 | Voicebox Technologies, Inc. | System and method for a cooperative conversational voice user interface |
| US7818176B2 (en) | 2007-02-06 | 2010-10-19 | Voicebox Technologies, Inc. | System and method for selecting and presenting advertisements based on natural language processing of voice-based input |
| US9129599B2 (en) * | 2007-10-18 | 2015-09-08 | Nuance Communications, Inc. | Automated tuning of speech recognition parameters |
| US8140335B2 (en) | 2007-12-11 | 2012-03-20 | Voicebox Technologies, Inc. | System and method for providing a natural language voice user interface in an integrated voice navigation services environment |
| US9305548B2 (en) | 2008-05-27 | 2016-04-05 | Voicebox Technologies Corporation | System and method for an integrated, multi-modal, multi-device natural language voice services environment |
| US8364481B2 (en) | 2008-07-02 | 2013-01-29 | Google Inc. | Speech recognition with parallel recognition tasks |
| JP5478903B2 (ja) * | 2009-01-22 | 2014-04-23 | 三菱重工業株式会社 | Robot, speech recognition device, and program |
| US8326637B2 (en) | 2009-02-20 | 2012-12-04 | Voicebox Technologies, Inc. | System and method for processing multi-modal device interactions in a natural language voice services environment |
| WO2011059997A1 (en) | 2009-11-10 | 2011-05-19 | Voicebox Technologies, Inc. | System and method for providing a natural language content dedication service |
| US9171541B2 (en) | 2009-11-10 | 2015-10-27 | Voicebox Technologies Corporation | System and method for hybrid processing in a natural language voice services environment |
| WO2011071484A1 (en) * | 2009-12-08 | 2011-06-16 | Nuance Communications, Inc. | Guest speaker robust adapted speech recognition |
| JP2012047924A (ja) * | 2010-08-26 | 2012-03-08 | Sony Corp | Information processing apparatus, information processing method, and program |
| US9620122B2 (en) * | 2011-12-08 | 2017-04-11 | Lenovo (Singapore) Pte. Ltd | Hybrid speech recognition |
| EP2733697A1 (de) * | 2012-11-16 | 2014-05-21 | QNX Software Systems Limited | Application service interface to ASR |
| US9477753B2 (en) * | 2013-03-12 | 2016-10-25 | International Business Machines Corporation | Classifier-based system combination for spoken term detection |
| US10186262B2 (en) * | 2013-07-31 | 2019-01-22 | Microsoft Technology Licensing, Llc | System with multiple simultaneous speech recognizers |
| JP5709955B2 (ja) * | 2013-09-30 | 2015-04-30 | 三菱重工業株式会社 | Robot, speech recognition device, and program |
| EP3195145A4 (de) | 2014-09-16 | 2018-01-24 | VoiceBox Technologies Corporation | Voice commerce |
| WO2016044321A1 (en) | 2014-09-16 | 2016-03-24 | Min Tang | Integration of domain information into state transitions of a finite state transducer for natural language processing |
| US9747896B2 (en) | 2014-10-15 | 2017-08-29 | Voicebox Technologies Corporation | System and method for providing follow-up responses to prior natural language inputs of a user |
| US10431214B2 (en) | 2014-11-26 | 2019-10-01 | Voicebox Technologies Corporation | System and method of determining a domain and/or an action related to a natural language input |
| US10614799B2 (en) | 2014-11-26 | 2020-04-07 | Voicebox Technologies Corporation | System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance |
| US10089977B2 (en) * | 2015-07-07 | 2018-10-02 | International Business Machines Corporation | Method for system combination in an audio analytics application |
| US10331784B2 (en) | 2016-07-29 | 2019-06-25 | Voicebox Technologies Corporation | System and method of disambiguating natural language processing requests |
| US10607606B2 (en) | 2017-06-19 | 2020-03-31 | Lenovo (Singapore) Pte. Ltd. | Systems and methods for execution of digital assistant |
Family Cites Families (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6101468A (en) | 1992-11-13 | 2000-08-08 | Dragon Systems, Inc. | Apparatuses and methods for training and operating speech recognition systems |
| US5638486A (en) | 1994-10-26 | 1997-06-10 | Motorola, Inc. | Method and system for continuous speech recognition using voting techniques |
| US5832430A (en) * | 1994-12-29 | 1998-11-03 | Lucent Technologies, Inc. | Devices and methods for speech recognition of vocabulary words with simultaneous detection and verification |
| US5677991A (en) | 1995-06-30 | 1997-10-14 | Kurzweil Applied Intelligence, Inc. | Speech recognition system using arbitration between continuous speech and isolated word modules |
| US5794196A (en) | 1995-06-30 | 1998-08-11 | Kurzweil Applied Intelligence, Inc. | Speech recognition system distinguishing dictation from commands by arbitration between continuous speech and isolated word modules |
| US5737489A (en) * | 1995-09-15 | 1998-04-07 | Lucent Technologies Inc. | Discriminative utterance verification for connected digits recognition |
| US5799279A (en) | 1995-11-13 | 1998-08-25 | Dragon Systems, Inc. | Continuous speech recognition of text and commands |
| DE19635754A1 (de) * | 1996-09-03 | 1998-03-05 | Siemens Ag | Speech processing system and method for speech processing |
| US6029124A (en) * | 1997-02-21 | 2000-02-22 | Dragon Systems, Inc. | Sequential, nonparametric speech recognition and speaker identification |
| US6076056A (en) * | 1997-09-19 | 2000-06-13 | Microsoft Corporation | Speech recognition system for recognizing continuous and isolated speech |
| US6182038B1 (en) * | 1997-12-01 | 2001-01-30 | Motorola, Inc. | Context dependent phoneme networks for encoding speech information |
-
1999
- 1999-03-09 JP application JP2000536068A (publication JP2002507010A, ja): Withdrawn
- 1999-03-09 US application US09/267,925 (publication US6292779B1, en): Expired - Lifetime
- 1999-03-09 AT application AT99909926T (publication ATE254328T1, de): IP Right Cessation
- 1999-03-09 EP application EP99909926A (publication EP1062660B1, de): Expired - Lifetime
- 1999-03-09 WO application PCT/US1999/005090 (publication WO1999046763A1, en): Ceased
- 1999-03-09 AU application AU29012/99A (publication AU2901299A, en): Abandoned
- 1999-03-09 CA application CA002321299A (publication CA2321299A1, en): Abandoned
- 1999-03-09 DE application DE69912754T (publication DE69912754D1, de): Expired - Lifetime
Also Published As
| Publication number | Publication date |
|---|---|
| WO1999046763A1 (en) | 1999-09-16 |
| US6292779B1 (en) | 2001-09-18 |
| JP2002507010A (ja) | 2002-03-05 |
| CA2321299A1 (en) | 1999-09-16 |
| EP1062660A1 (de) | 2000-12-27 |
| EP1062660B1 (de) | 2003-11-12 |
| ATE254328T1 (de) | 2003-11-15 |
| AU2901299A (en) | 1999-09-27 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| DE69912754D1 (de) | Apparatus and method for simultaneous multimode dictation | |
| US10074363B2 (en) | Method and apparatus for keyword speech recognition | |
| Feng et al. | End-to-End Speech Emotion Recognition Combined with Acoustic-to-Word ASR Model. | |
| Nishimura et al. | Singing Voice Synthesis Based on Deep Neural Networks. | |
| US8321218B2 (en) | Searching in audio speech | |
| JP5386692B2 (ja) | Interactive learning device | |
| Eide | Distinctive features for use in an automatic speech recognition system. | |
| Demircan et al. | Feature extraction from speech data for emotion recognition | |
| DE60005326D1 (de) | Recognition units with complementary language models | |
| Masuko et al. | Imposture using synthetic speech against speaker verification based on spectrum and pitch. | |
| RU2016144006A (ru) | Method of performing a multi-mode dialogue between a humanoid robot and a user, computer program product, and humanoid robot for implementing said method | |
| US20110218802A1 (en) | Continuous Speech Recognition | |
| Rudnicky et al. | Interactive problem solving with speech | |
| CN107972028A (zh) | Human-computer interaction method, apparatus, and electronic device | |
| Chadha et al. | Current challenges and application of speech recognition process using natural language processing: A survey | |
| Tabbaa et al. | Computer-aided training for Quranic recitation | |
| Bozkurt et al. | Improving automatic emotion recognition from speech signals. | |
| Ullah et al. | Speech emotion recognition using deep neural networks | |
| El Amrani et al. | Towards using CMU sphinx tools for the holy Quran recitation verification | |
| Reynolds et al. | Integration of speaker and speech recognition systems | |
| JP2013061402A (ja) | Spoken language evaluation device, method, and program | |
| Yousfi et al. | Automatic speech recognition for the holy Qur'an, A review | |
| Wallich | Putting speech recognizers to work: While advances in signal processing and algorithms would extend their usefulness, limited models are already meeting many inspection and inventory applications | |
| Mishra et al. | Incremental emotion recognition. | |
| Jalalvand et al. | A classifier combination approach for Farsi accents recognition |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 8332 | No legal effect for DE | |