DE29914227U1

DE29914227U1 - Device for converting voice commands and / or speech texts into keyboard and / or mouse movements and / or texts

Info

Publication number: DE29914227U1
Application number: DE29914227U
Authority: DE
Original assignee: Individual
Current assignee: Individual
Priority date: 1999-08-13
Filing date: 1999-08-13
Publication date: 2000-07-27
Anticipated expiration: 2009-08-14

Description

Int. Ref.: GbmO799A

Device for converting voice commands and/or voice texts into keyboard and/or mouse movements and/or texts

The invention relates to a device for converting voice commands and/or voice texts into keyboard and/or mouse movements and/or texts.

It is known that a keyboard, a touch screen and/or a computer mouse is used to operate computer systems.

Keyboards have been known as a control element for computer systems since the early days of the PC and its predecessors, around 1980. Computer mice have been in use since the first graphical user interface was introduced by Apple Inc., around 1986.

The mouse movements are generated by movements of the hand, and a menu item or program command is triggered by pressing the mouse button with the index finger.

Automatic speech recognition can be used both to convert spoken language into keyboard commands and/or mouse movements and text, and to verify the identity of a user.

Various research projects on speech recognition techniques have been under way since around 1950. Since 1980, recognition capabilities have improved significantly through the development of statistical methods such as the hidden Markov model (HMM).

It is already known from the literature (Schukat-Talamazzini, E.G. (1995), Automatische Spracherkennung: Grundlagen, statistische Modelle und effiziente Algorithmen, Vieweg Verlag, Braunschweig) that speech recognition methods are based either on a comparison between stored reference patterns and the unknown utterance, or on a description of the individual words of the vocabulary by means of stochastic models. An utterance consisting of digital samples is first broken down into a sequence of speech blocks of a given duration, and a set of feature values is then calculated for each speech block. Each set yields a so-called feature vector. In the model-based approach, the statistical properties of the feature values are captured by probability density functions with corresponding means and variances. These means and variances must first be determined in a training phase from a large number of representative training utterances in order to obtain a reference set (a model). To recognize an unknown utterance, probabilities are then calculated for each of the models that represent the words of the vocabulary.
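The model-based approach described above can be illustrated with a minimal Python sketch. It is not taken from the cited literature or from the device itself: the two features (block energy and zero-crossing rate), the single diagonal Gaussian per word, the block length of 80 samples and the synthetic sine "utterances" are all assumptions chosen to keep the example small. A real recognizer would use richer features and a temporal model such as an HMM.

```python
import math

def frame_signal(samples, frame_len):
    """Split digital samples into consecutive blocks of fixed length (remainder dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def feature_vector(frame):
    """Two illustrative features per block: mean energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (energy, crossings / (len(frame) - 1))

def train_model(utterances, frame_len):
    """Training phase: estimate per-feature mean and variance from training utterances."""
    vectors = [feature_vector(f) for u in utterances for f in frame_signal(u, frame_len)]
    n, dims = len(vectors), len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dims)]
    # A small floor keeps the variance strictly positive.
    variances = [max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
                 for d in range(dims)]
    return means, variances

def log_likelihood(vectors, model):
    """Sum of diagonal-Gaussian log densities over all feature vectors."""
    means, variances = model
    total = 0.0
    for v in vectors:
        for x, m, var in zip(v, means, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
    return total

def recognize(samples, models, frame_len):
    """Return the vocabulary word whose model gives the unknown utterance the highest score."""
    vectors = [feature_vector(f) for f in frame_signal(samples, frame_len)]
    return max(models, key=lambda word: log_likelihood(vectors, models[word]))

def tone(freq_hz, n=800, rate_hz=10_000):
    """Synthetic stand-in for a recorded utterance (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq_hz * t / rate_hz) for t in range(n)]

FRAME_LEN = 80  # assumed block length in samples
models = {
    "low": train_model([tone(f) for f in (180, 200, 220)], FRAME_LEN),
    "high": train_model([tone(f) for f in (1800, 2000, 2200)], FRAME_LEN),
}
print(recognize(tone(190), models, FRAME_LEN))
print(recognize(tone(2100), models, FRAME_LEN))
```

The sketch keeps the three steps from the text separate: block segmentation, feature vector calculation, and scoring of each word model against the unknown utterance.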

These statistical methods have been extended by incorporating and combining them with methods from neural network technology.

What all of these methods and devices have in common is that they have to run directly on a host PC. They therefore place a heavy load on it and complicate handling, because there are often long waiting times until the recognition process is completed. They require very large amounts of CPU power and memory and are unreliable at speaker identification.

From US-PS 5,659,665 it is known to convert predefined voice commands into keystroke data and to loop these into the keyboard interface.

The control units for computers and terminals currently available on the market are based on mechanical input devices, in which either a key is pressed or a mouse is moved. This has the disadvantage that this type of operation, and the use of computers or terminals in general, must first be learned, so that the use of computers remains closed to large parts of the population.

Furthermore, compared with automatic voice input and output (according to the present invention), these control units, which are implemented as switches, buttons, keyboards or mice, are considerably more error-prone, more susceptible to faults and more laborious to handle for data input and output. In addition, such systems always require appropriate skills and knowledge of how they work and how they are operated (e.g. in the case of the keyboard), which often increases the time required and hence the cost. It is also often a hindrance in the actual work process to have to take eyes and hands off the object and/or document being worked on in order to make entries with the mechanical aid.

The object of the present invention is to create a device which ensures reliable automatic conversion of speech into keyboard commands, mouse movements and/or text, which works efficiently and robustly even in the presence of background noise, and which enables reliable identification of the speaker.

To achieve this object, it is proposed that a computer unit with a speech recognition unit converts voice commands or voice texts, received via a microphone, into keyboard or mouse commands and/or texts by means of automatic speech recognition and speaker identification, and transfers them to a computer in the form of digital values via the USB interface or another bidirectional interface.

Any conversion of the transferred data that may be necessary can be carried out by driver software running on the computer or terminal.

Commands or data can be returned to the user by voice output through the microphone-loudspeaker combination connected to this unit.

The above is explained in more detail with reference to the following drawings, which show:

Fig. 1 a computer unit with a speech recognition unit, a microphone, and a USB interface or another bidirectional interface, in plan view,

Fig. 2 a computer unit with further USB connections and a USB distributor, in plan view,

Fig. 3 a computer unit with a PCMCIA (Personal Computer Memory Card Interface Association) slot, in plan view,

Fig. 4 a computer unit with a speech recognition unit, a speaker identification unit and a speech generation unit, consisting of a clock generator, a CPU (Central Processor Unit), an instruction memory and/or data memory, a microphone, a loudspeaker and an analog input and output circuit, in plan view.

Fig. 1 shows a computer unit (1) with a speech recognition unit (2). This speech recognition unit (2) converts voice commands or voice texts, picked up by a microphone (3), into keyboard or mouse commands and/or texts. The converted digital values are then transferred to a computer or terminal via the USB interface (4) or any other bidirectional interface (5), such as a serial RS232 interface.

Fig. 2 shows the computer unit (1) with further USB connections (6). This arrangement makes it possible to combine the converted voice commands or voice data with other data from the peripheral devices and to forward them via the USB distributor (7). For example, the quantity of a goods delivery can be spoken and its product number scanned. The computer unit (1) shown assembles the data according to the specifications of the host computer and sends them as one contiguous key sequence.

Fig. 3 shows a computer unit (1) with a PCMCIA (Personal Computer Memory Card Interface Association) slot (8). This allows the computer unit (1) to be expanded with a wide variety of PCMCIA cards, such as Ethernet or wireless network cards.

Fig. 4 shows a computer unit (1) with a speech recognition unit (2), a speaker identification unit (15) and a speech generation unit (9), which consists of a clock generator (10), a CPU (Central Processor Unit) (11), an instruction memory and/or data memory (12), a microphone (3), a loudspeaker (13) and an analog input and output circuit (14). With this device, complex data entries can first be assembled in a dialog-oriented manner (through spoken dialogs using voice output) and then sent to the host computer or terminal as one contiguous key sequence.

In the device according to the invention, in order to convert the voice commands or voice texts into keyboard or mouse commands and/or texts by means of automatic speech recognition and speaker identification, a speech signal is digitized at a predetermined clock rate, e.g. 100 µs. The speech signal is modified and/or transformed, and/or upstream feature-extraction algorithms (such as digital filters) are used. This signal is additionally and/or exclusively fed to the genetic programs (GPs). The digital signal can be modified and/or transformed in that phoneme and/or word identification is performed on the basis of neural networks (NN) and the classification result is fed to an NN in the form of digital values. Phoneme or word identification can also be performed on the basis of fuzzy logic (FL); in that case the classification result is fed to an FL function in the form of digital values. To identify the speaker, the classification result obtained by the GPs from the speech signal is used.
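As a small worked illustration of the digitization step, the following Python sketch assumes that the stated 100 µs is the interval between successive samples, which corresponds to a sampling rate of 10 kHz. The pre-emphasis filter is only one common example of an upstream digital filter; the text does not say which filters the device uses, and the function names and the coefficient are hypothetical.

```python
def sampling_rate_hz(period_us):
    """Convert a sampling interval in microseconds into a sampling rate in hertz."""
    return 1_000_000 / period_us

def pre_emphasis(samples, alpha=0.95):
    """First-order high-pass FIR filter: y[n] = x[n] - alpha * x[n-1].

    Assumed example of an upstream feature-extraction filter; alpha is illustrative.
    """
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out

rate = sampling_rate_hz(100)   # 100 microseconds between samples
print(rate, rate / 2)          # sampling rate and Nyquist frequency in Hz
print(pre_emphasis([1.0, 1.0, 1.0], alpha=0.5))
```

Under this assumption, signal components up to 5 kHz can be represented, which covers most of the energy of speech.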

The following examples illustrate this:

Example 1

A computer mouse can be controlled, and the interface of a computer operating system navigated, by voice control based on GP (genetic programs) and/or NN algorithms and/or fuzzy logic. This makes it possible to produce a computer mouse with which the user can alternatively enter operating system commands directly by voice, open menus, start programs or issue control commands without first moving the mouse pointer to the corresponding position and clicking.

Example 2

For data input and output in logistics, the voice commands entered via the microphone (3) can be combined with the data from other peripheral devices and then transferred as a data stream to a higher-level computer via the USB interface (4). If, for example, an article bearing a barcode also has to be given a quantity during an order-picking process, the user can enter the data in any order. The article number is captured by the scanner connected to the USB interface (4), and the user speaks the picked quantity either before or after scanning. The system can distinguish between the two types of input and forwards the complete data record

a) only once all data is available,

b) in a predefined form, e.g. first the article number and then the quantity.
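The forwarding rules a) and b) can be sketched in Python as follows. The class name, the field names, the tab separator and the terminating newline are hypothetical choices for illustration; the text only specifies that the record is forwarded when complete and in a predefined order.

```python
from dataclasses import dataclass, field

# Predefined output order: first the article number, then the quantity.
FIELD_ORDER = ("article_number", "quantity")

# Which input source supplies which field (assumed mapping).
SOURCE_TO_FIELD = {"scanner": "article_number", "voice": "quantity"}

@dataclass
class RecordAssembler:
    pending: dict = field(default_factory=dict)

    def add(self, source, value):
        """Store one input; return the key sequence once the record is complete, else None."""
        self.pending[SOURCE_TO_FIELD[source]] = value
        if all(name in self.pending for name in FIELD_ORDER):
            # Rule b): emit the fields in the predefined order, regardless of arrival order.
            record = "\t".join(self.pending[name] for name in FIELD_ORDER) + "\n"
            self.pending.clear()
            return record
        # Rule a): nothing is forwarded while data is still missing.
        return None

assembler = RecordAssembler()
print(assembler.add("voice", "12"))                 # quantity spoken first
print(repr(assembler.add("scanner", "4711")))       # scan completes the record
```

Because the source of each input identifies the field, the user is free to scan and speak in either order, while the host always receives the same sequence.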

Example 3

In a warehouse, the captured voice data can be passed on via the PCMCIA (Personal Computer Memory Card Interface Association) slot (8) and a card inserted into it, e.g. a radio modem.

Example 4

With the device according to the invention, the speech recognition unit (2) can provide emulation of the keyboard driver. For this purpose, software runs in the background of the operating system and checks at short intervals (< 50 ms) whether data is arriving at the USB interface (4). This data is then converted into the same system commands that a keyboard driver generates and is sent via the API (Application Programming Interface) of the operating system either to the currently active foreground application or to a predefined application. As a result, no change is needed within the target application, since it already responds to keystrokes.
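The polling behaviour of such background software can be sketched in Python without any hardware. In this sketch a `queue.Queue` stands in for the USB interface and a callback stands in for the operating system API that delivers keystrokes; both are assumptions, as are the function names and the 20 ms interval, which was chosen only because it lies below the 50 ms stated in the text.

```python
import queue
import time

POLL_INTERVAL_S = 0.02  # assumed value below the stated 50 ms

def poll_once(incoming, send_key):
    """Drain all pending data and emit one key event per character.

    incoming: queue of bytes objects (stand-in for the USB interface)
    send_key: callable taking one character (stand-in for the OS keyboard API)
    Returns the number of key events emitted.
    """
    count = 0
    while True:
        try:
            data = incoming.get_nowait()
        except queue.Empty:
            return count
        for ch in data.decode("ascii"):
            send_key(ch)
            count += 1

def run(incoming, send_key, cycles):
    """Poll a fixed number of times, sleeping between polls."""
    for _ in range(cycles):
        poll_once(incoming, send_key)
        time.sleep(POLL_INTERVAL_S)

usb = queue.Queue()
usb.put(b"42")
usb.put(b"\n")
received = []
run(usb, received.append, cycles=2)
print(received)
```

The target application only ever sees the calls to `send_key`, which is why it needs no modification.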

A further function of this software is the conversion of a spoken phrase such as "Datei öffnen" ("Open file") into so-called shortcuts. These shortcuts allow a function to be activated directly via key combinations. In the above example this is "Ctrl + O". This is done using tables and matrices that are created before use.
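A table of this kind can be sketched in Python as a simple dictionary lookup. The sketch assumes that the shortcut in the example above is Ctrl plus the letter O; the "save file" entry, the tuple representation of a key combination and the function name are hypothetical additions for illustration.

```python
# Table created before use: recognized phrase -> key combination.
SHORTCUTS = {
    "datei öffnen": ("ctrl", "o"),  # example from the text (assumed to be the letter O)
    "open file": ("ctrl", "o"),     # English equivalent
    "save file": ("ctrl", "s"),     # hypothetical extra entry
}

def phrase_to_shortcut(phrase):
    """Normalize case and whitespace, then look the phrase up; None if unknown."""
    return SHORTCUTS.get(" ".join(phrase.lower().split()))

print(phrase_to_shortcut("Datei öffnen"))
print(phrase_to_shortcut("close everything"))
```

Keeping the mapping in a table means that support for a new application only requires new table entries, not new recognition logic.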

Example 5

With the device according to the invention, the computer unit (1) can likewise use the speech recognition unit (2) to convert voice commands into keyboard and/or mouse commands that enable voice-controlled operation of a web browser. The spoken words are converted into shortcuts, which allow a browser function to be activated directly via key combinations. This is done using tables and matrices that are created before use.

Example 6

With the device according to the invention, the computer unit (1) can use the speech recognition unit (2) to convert voice commands so as to enable voice-controlled operation of an e-mail program. Operating the mouse and typing on the keyboard are then no longer necessary.

Furthermore, the device according to the invention also enables voice-controlled operation of a newsreader. Further examples are the voice-controlled terminal emulation of database software, the voice-controlled operation of spreadsheet software, and the voice-controlled operation of a PPS (production planning and control) system.

Likewise, a merchandise management system or an accounting system can be operated by voice.

Finally, the device can also be used in all applications in which the mechanical controls cannot be operated, for example because both hands are needed for other tasks.

The advantage of this invention is that it provides a device which enables reliable automatic speech recognition, can easily be connected or integrated as a peripheral device, and replaces the hitherto customary mechanical operation of a computer unit with voice operation. This greatly simplifies the operation and use of many software programs. The effort required to learn how to operate a computer is greatly reduced, and work processes become faster and more reliable, which saves considerable cost and time.

Claims

1. Device for converting voice commands and/or voice texts into keyboard and/or mouse movements and/or texts, characterized in that a computer unit ( 1 ) with a speech recognition unit ( 2 ) via a microphone ( 3 ) converts voice commands or voice texts into keyboard or mouse commands and/or texts by means of automatic speech recognition and speaker identification and transfers them to a computer in the form of digital values via the USB interface ( 4 ) or another bidirectional interface ( 5 ).

2. Device according to claim 1, characterized in that the computer unit ( 1 ) additionally has further USB connections ( 6 ) and thus realizes a USB distributor ( 7 ).

3. Device according to one of claims 1 to 2, characterized in that the computer unit ( 1 ) combines the voice commands entered via a microphone ( 3 ) with the data from other peripheral devices and then transmits them as a data stream to a computer via the USB interface ( 4 ) or another bidirectional interface ( 5 ).

4. Device according to one of claims 1 to 3, characterized in that the computer unit ( 1 ) has a PCMCIA (Personal Computer Memory Card Interface Association) slot ( 8 ) for accommodating peripheral devices, such as wireless network cards.

5. Device according to one of claims 1 to 4, characterized in that the computer unit ( 1 ) contains a speech recognition unit ( 2 ), a speaker identification unit ( 15 ) and a speech generation unit ( 9 ), which consists of a clock generator ( 10 ), CPU (Central Processor Unit) ( 11 ), instruction memory and/or data memory ( 12 ), microphone ( 3 ), a loudspeaker ( 13 ) and an analog input and output circuit ( 14 ).

6. Device according to one of claims 1 to 5, characterized in that the emulation of the keyboard driver is made possible by the speech recognition unit ( 2 ).

7. Device according to one of claims 1 to 6, characterized in that the computer unit ( 1 ) converts voice commands into keyboard and/or mouse commands by the voice recognition unit ( 2 ), which enables the voice-controlled operation of a web browser.

8. Device according to one of claims 1 to 7, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of an e-mail program.

9. Device according to one of claims 1 to 8, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a newsreader.

10. Device according to one of claims 1 to 9, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a terminal emulation.

11. Device according to one of claims 1 to 10, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a database software.

12. Device according to one of claims 1 to 11, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a spreadsheet software.

13. Device according to one of claims 1 to 12, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a PPS (production planning control) system.

14. Device according to one of claims 1 to 13, characterized in that the computer unit ( 1 ) converts voice commands through the voice recognition unit ( 2 ), which enables the voice-controlled operation of a merchandise management system.

15. Device according to one of claims 1 to 14, characterized in that the computer unit ( 1 ) converts voice commands by the voice recognition unit ( 2 ), which enables the voice-controlled operation of an accounting system.