JPH01116700A

JPH01116700A - Voice recognition control system

Info

Publication number: JPH01116700A
Application number: JP62273490A
Authority: JP
Inventors: Takahiko Ogita; 荻田　隆彦
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1987-10-30
Filing date: 1987-10-30
Publication date: 1989-05-09

Abstract

(57) [Abstract] This publication contains application data that predates electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

[Overview]

The invention relates to a speech recognition control method that picks the recognized word from a group of candidate words selected on the basis of similarity information for an input speech word. Its purpose is to select, with a simple configuration, a recognized word that is both accurate and safe, that is, one that cannot cause abnormal operation. In this method, when the first candidate word (the candidate with the highest similarity) is not in the group of fail-safe processing target words, the first candidate word is taken as the recognized word. When the first candidate word is a fail-safe processing target word, it is still taken as the recognized word if the candidate group contains no other fail-safe processing target word; if the candidate group does contain another fail-safe processing target word, fail-safe processing is performed.

[Industrial Field of Application]

The present invention relates to a speech recognition control method, and in particular to a speech recognition control method that picks the recognized word from a group of multiple candidate words selected on the basis of similarity information for an input speech word.

[Conventional technology]

Speech recognition is widely used for voice control of machines and equipment, for voice data entry, and for similar tasks. In these applications a misrecognized utterance becomes an erroneous control input or an erroneous data entry, so its consequences can be very serious. In particular, where misrecognition is likely to cause abnormal operation and a dangerous situation, the recognition rate must be made high enough to avoid misrecognition.

Speech is usually recognized by so-called pattern matching. A dictionary holding the speech information of many standard words is prepared in advance, several standard words are selected as candidate words on the basis of their similarity to the input speech word, and the first candidate word, the one with the highest similarity, is taken as the recognized word. For example, when the distance between the input speech word and a standard word serves as the similarity information, the standard words lying within a predetermined distance of the input speech word become the candidate words, and the first candidate word, the one at the smallest distance, becomes the recognized word. The recognition rate can then be raised by tightening the distance used to select candidates, so that a word is accepted as recognized only when the result is more certain.
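The distance-based selection described above can be sketched as follows. This is a minimal illustration, not the method of any particular recognizer: the function names `select_candidates` and `recognize_conventional`, the dictionary of precomputed distances, and the threshold values are all assumptions introduced here.

```python
from typing import Dict, List, Optional, Tuple


def select_candidates(
    distances: Dict[str, float], max_distance: float
) -> List[Tuple[str, float]]:
    """Return (word, distance) pairs within max_distance, nearest first."""
    within = [(word, d) for word, d in distances.items() if d <= max_distance]
    return sorted(within, key=lambda pair: pair[1])


def recognize_conventional(
    distances: Dict[str, float], max_distance: float
) -> Optional[str]:
    """Conventional scheme: accept the nearest candidate, or None if none qualify."""
    candidates = select_candidates(distances, max_distance)
    return candidates[0][0] if candidates else None


# Hypothetical distances between one utterance and each dictionary word.
example = {"right": 0.2, "left": 0.5, "stop": 0.35, "up": 0.9}
first_choice = recognize_conventional(example, max_distance=0.6)
```

Lowering `max_distance` corresponds to tightening the validity boundary: fewer utterances are accepted, but the accepted ones are more certain.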

Thus, conventional speech recognition control methods improved the recognition rate and reduced misrecognition by tightening the validity boundary of the similarity information and treating an input as recognized only when the result was more certain.

Furthermore, where misrecognition could cause abnormal operation and a dangerous situation, an emergency stop switch was provided, and danger was prevented by operating the switch to halt the equipment when it operated abnormally.

[Problem that the invention seeks to solve]

As described above, conventional speech recognition control methods raised the recognition rate by tightening the validity boundary of the similarity information and taking the first candidate word as the recognition result.

However, tightening the validity boundary alone cannot reliably prevent misrecognition, and because the first of the candidate words was immediately taken as the recognition result, the achievable recognition rate was limited. Consequently, the possibility of abnormal operation caused by misrecognition could not be eliminated.

As a result, in fields where misrecognition leads to abnormal operation, the range in which speech recognition devices could be used was restricted, and separate measures for preventing danger were required.

An object of the present invention is to provide an improved speech recognition control method that can select, with a simple configuration, a recognized word for an input speech word that is accurate and safe, with no risk of causing an abnormality.

[Means for solving problems]

First, the principle of the present invention is explained. When speech recognition based on similarity information for an input speech word yields several candidate words, including a first candidate word, there is no guarantee that the first candidate word matches the input speech word. Taken as a whole, however, the candidate group is extremely likely to contain a word that matches the input speech word, and in practice this probability can be regarded as 1.

Some spoken words, on the other hand, carry an asymmetric risk: abnormal operation is likely if such a word is misrecognized as another word, yet no abnormal operation can result when another word is misrecognized as it.

Therefore, when such a spoken word is present in the candidate group, taking it as the recognition result cannot cause abnormal operation, and fail-safe behavior is achieved. Hereinafter, such a spoken word, and any other word as which it is likely to be misrecognized, is called a first-type fail-safe processing target word (first-type F-S word).

In addition, the device that uses the recognition result may accept groups of spoken words, such as "right" and "left" or "forward" and "backward", that designate operations or processes which the device cannot carry out at the same time or which contradict one another.

When words from such a group are present together in the candidate group, whichever candidate is taken as the recognition result is likely to be a misrecognition that causes abnormal operation. Fail-safe behavior is therefore achieved by invalidating the spoken word in this case.

Hereinafter, such spoken words are called second-type fail-safe processing target words (second-type F-S words).

Words that are subject to fail-safe processing to guard against malfunction, abnormal operation, and the like, including the first and second types above, are collectively called fail-safe processing target words.

Based on the idea described above, the present invention keeps fail-safe processing target words out of the recognized words so that an accurate and safe recognition result is obtained, and performs fail-safe processing when fail-safe processing target words are present in the candidate group, thereby guaranteeing the safety of the device that uses the result.

The solution adopted by the present invention is described below with reference to FIG. 1, which shows the basic configuration of the invention as a flowchart.

In FIG. 1, S1 and S2 are processing steps. In step S1, when the first candidate word is not included in the group of words subject to fail-safe processing, the first candidate word is taken as the recognition result.

In step S2, when the first candidate word is included in the group of words subject to fail-safe processing, the first candidate word is taken as the recognized word if the candidate group contains no other fail-safe processing target word, and fail-safe processing is performed if the candidate group does contain another fail-safe processing target word.

[Operation]

Speech recognition is performed on the basis of similarity information for the input speech word, several candidate words are selected in descending order of similarity, and the candidate with the highest similarity is designated the first candidate word.

First, it is determined whether the first candidate word belongs to the group of fail-safe processing target words. If it does not, taking the first candidate as the recognition result cannot cause abnormal operation, so the first candidate, which has the highest similarity, is output as the recognized word (step S1).

If the first candidate word does belong to the group of fail-safe processing target words, it is then determined whether the candidate group contains any other fail-safe processing target word. If it does not, taking the first candidate word as the recognition result cannot cause abnormal operation, so the first candidate word is output as the recognized word.

If another such word is present, fail-safe processing is performed (step S2).
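The S1/S2 decision described above can be sketched as follows. This is a minimal sketch, assuming a non-empty candidate list ordered by descending similarity. The function name `select_recognized_word`, the `fail_safe` callback, and the `"REJECT"` marker used in the example call are assumptions made for illustration, since the text leaves the content of the fail-safe processing to each embodiment.

```python
from typing import Callable, List, Set


def select_recognized_word(
    candidates: List[str],
    fail_safe_words: Set[str],
    fail_safe: Callable[[List[str]], str],
) -> str:
    """Apply steps S1 and S2 to candidates ordered best-first (must be non-empty)."""
    first = candidates[0]
    # Step S1: a first candidate outside the fail-safe group is accepted as is.
    if first not in fail_safe_words:
        return first
    # Step S2: look for any other fail-safe target word among the candidates.
    has_other = any(word in fail_safe_words for word in candidates[1:])
    if not has_other:
        return first
    # Another fail-safe target word is present: defer to fail-safe processing.
    return fail_safe(candidates)


# Hypothetical usage: the fail-safe handler simply reports a rejection marker.
outcome = select_recognized_word(
    ["right", "left", "1"], {"right", "left"}, lambda cands: "REJECT"
)
```

Passing the fail-safe handler as a parameter keeps the S1/S2 skeleton the same while the handler varies from one device to another.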

The group of fail-safe processing target words and the content of the fail-safe processing are chosen to suit the device that uses the speech recognition result (specifics are given in the Embodiments section below).

In this way, a recognized word is obtained that is accurate and safe, with no risk of causing abnormal operation. Moreover, because fail-safe processing is performed whenever the safety of the recognition result is in doubt, the safety of the device using the result can be guaranteed without any special safety equipment. This broadens the range of applications for speech recognition devices.

[Embodiments]

Embodiments of the present invention are described with reference to FIGS. 2 to 6. FIG. 2 is an explanatory diagram of the speech recognition control device used to implement each embodiment, FIG. 3 is an explanatory diagram of the first-type table used in the first embodiment, FIG. 4 is a processing flowchart of the first embodiment, FIG. 5 is an explanatory diagram of the second-type table used in the second embodiment, and FIG. 6 is a processing flowchart of the second embodiment.

(A) Configuration of the speech recognition control device

In the speech recognition control device of FIG. 2, 11 is a voice input microphone that converts an input spoken word into an electrical speech word signal.

12 is a speech recognition processing unit. It contains a processor (not shown), performs speech recognition on the input speech word on the basis of similarity information, and outputs the group of candidate words together with the first candidate, which has the highest similarity.

13 is a main memory that stores the program, data, and table (131 or 132) used to carry out the speech recognition control of the present invention. 131 is the first-type table and 132 is the second-type table; they are described with reference to FIGS. 3 and 5.

14 is a main processing unit. It contains a processor (not shown) and performs speech recognition control according to the program in main memory 13.

15 is an external communication processing unit that transfers the recognition result, or the content of the fail-safe processing, to a host (not shown).

(B) First embodiment

A first embodiment of the present invention is described with reference to FIGS. 3 and 4.

The first embodiment deals with first-type fail-safe processing target words.

First, the first-type table 131 of FIG. 3 is described. In FIG. 3, the left column is the word column, which lists the words to be recognized. "Right", "left", "up", "down", and "stop" designate the corresponding movement or stop operations, and "1" and "2" designate the travel distance.

The right column is the constraint instruction column, which marks the first-type fail-safe processing target words that are constraint instruction words. A constraint instruction word is a word that is likely to cause abnormal operation if it is misrecognized as another word, but that cannot cause abnormal operation when another word is misrecognized as it.

The center column is the constraint condition column, which marks the first-type fail-safe processing target words that are constraint condition words. A constraint condition word is another word as which the constraint instruction word described above is likely to be misrecognized.

The constraint condition words and constraint instruction words are chosen to suit the device that uses the recognition result. The choice takes into account not just a high recognition rate but also low similarity to other words, ease of pronunciation, and similar factors.

In FIG. 3, "stop" is chosen as the constraint instruction word and "right", "left", "up", and "down" as the constraint condition words. The first-type table 131 is prepared in advance and stored in main memory 13.

Next, the operation of the first embodiment is described step by step with reference to the processing flowchart of FIG. 4. In that flowchart, steps S1 and S2 are as described for FIG. 1.

Steps S111 to S114 are the internal processing of step S1, and steps S211 and S212 are the internal processing of step S2. Step S114 is shared by steps S1 and S2.

■ Steps S111 and S112

The main processing unit 14 instructs the speech recognition processing unit 12 to start accepting voice input (step S111).

Following this instruction, the speech recognition processing unit 12 accepts the voice input and performs recognition on the input speech word. That is, it compares the input speech word data with reference speech data (for example, standard pattern data), selects several candidate words in descending order of similarity, and returns them to the main processing unit 14 (step S112).

The set of candidate words is chosen, for example, by a statistical method that examines the distribution of the similarity values; typically three to eight candidate words are selected.

The candidate word with the highest similarity is the first candidate word.

■ Steps S113 and S114

The main processing unit 14 refers to the first-type table 131 and determines whether the first candidate word returned by the speech recognition processing unit 12 is a constraint condition word (step S113).

If the first candidate word is not a constraint condition word, taking it as the recognition result cannot cause abnormal operation. The first candidate word, which has the highest similarity, is therefore taken as the recognition result and sent through the external communication processing unit 15 to the host (not shown) (step S114).

For example, if the first candidate word is "1", the recognition result is "1".

■ Steps S211, S114, and S212

If step S113 finds that the first candidate word is a constraint condition word, the main processing unit 14 again refers to the first-type table 131 and determines whether the candidate group contains a constraint instruction word (step S211).

If no constraint instruction word is present, taking the first candidate word as the recognition result cannot cause abnormal operation, so the first candidate word is taken as the recognized word (step S114).

If a constraint instruction word is present, it is taken as the recognized word in place of the first candidate word and sent through the external communication processing unit 15 to the host (step S212). For example, if "right" is the first candidate word but the candidate group contains the constraint instruction word "stop", the recognition result is "stop" rather than "right". Fail-safe behavior is achieved in this way.
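The first-embodiment flow can be sketched as follows. This is a minimal sketch, assuming a non-empty candidate list ordered by descending similarity and using the FIG. 3 example words in English. The two sets standing in for the first-type table 131 and the function name `recognize_first_embodiment` are assumptions made for illustration, and the step labels in the comments follow the description of FIG. 4 in the text.

```python
from typing import List

# Stand-in for the first-type table 131 (FIG. 3), using the example words.
CONSTRAINT_CONDITION_WORDS = {"right", "left", "up", "down"}
CONSTRAINT_INSTRUCTION_WORDS = {"stop"}


def recognize_first_embodiment(candidates: List[str]) -> str:
    """Return the word to send to the host; candidates are ordered best-first."""
    first = candidates[0]
    # Step S113: is the first candidate a constraint condition word?
    if first not in CONSTRAINT_CONDITION_WORDS:
        return first  # Step S114: accept the first candidate.
    # Step S211: does the candidate group contain a constraint instruction word?
    for word in candidates:
        if word in CONSTRAINT_INSTRUCTION_WORDS:
            return word  # Step S212: substitute the constraint instruction word.
    return first  # Step S114: no constraint instruction word, accept as is.


# The example from the text: "right" ranks first but "stop" is also a candidate.
result = recognize_first_embodiment(["right", "stop", "left"])
```

With the example candidates, the sketch yields "stop" rather than "right", matching the substitution described in the text.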

Note that the processing of steps S113, S211, and S114 is equivalent to taking the constraint instruction word as the recognition result whenever it is included in the candidate group, regardless of whether a constraint condition word is present. Such processing is therefore also included in the first embodiment.

(C) Second embodiment

A second embodiment of the present invention is described with reference to FIGS. 5 and 6. The second embodiment deals with second-type fail-safe processing target words.

First, the second-type table 132 of FIG. 5 is described. In FIG. 5, the left column is the word column, which lists the words to be recognized. As in the first-type table 131 of FIG. 3, "right", "left", "up", "down", and "stop" designate the corresponding movement or stop operations, and "1" and "2" designate the travel distance.

The right column is the second-type F-S word column, which marks the words to be recognized that are second-type fail-safe processing target words (second-type F-S words).

Like the constraint condition words and constraint instruction words among the first-type F-S words, the second-type F-S words are chosen to suit the device that uses the recognition result. The choice takes into account not just a high recognition rate but also similarity to other words, ease of pronunciation, and similar factors.

In FIG. 5, "right", "left", "up", and "down" are chosen as second-type F-S words. The second-type table 132 is prepared in advance and stored in main memory 13.

Next, the operation of the second embodiment is described step by step with reference to the processing flowchart of FIG. 6. In that flowchart, steps S1 and S2 are as described for FIG. 1. Steps S121 to S124 are the internal processing of step S1, and steps S221 and S222 are the internal processing of step S2.

Step S124 is shared by steps S1 and S2.

■ Steps S121 and S122

The main processing unit 14 instructs the speech recognition processing unit 12 to start accepting voice input (step S121).

Following this instruction, the speech recognition processing unit 12 accepts the voice input, performs recognition on the input speech word, selects several candidate words in descending order of similarity, and returns them to the main processing unit 14. The candidate word with the highest similarity is the first candidate word (step S122).

■ Steps S123 and S124

The main processing unit 14 refers to the second-type table 132 and determines whether the first candidate word returned by the speech recognition processing unit 12 is a second-type F-S word (step S123).

If the first candidate word is not a second-type F-S word, taking it as the recognition result cannot cause abnormal operation. The first candidate word, which has the highest similarity, is therefore taken as the recognition result and sent through the external communication processing unit 15 to the host (step S124).

For example, if the first candidate word is "1", the recognition result is "1".

■ Steps S221, S124, and S222

If step S123 finds that the first candidate word is a second-type F-S word, the main processing unit 14 again refers to the second-type table 132 and determines whether the candidate group contains another second-type F-S word (step S221).

If no other second-type F-S word is present, taking the first candidate word as the recognition result cannot cause abnormal operation, so the first candidate word is taken as the recognition result (step S124).

If another second-type F-S word is present, whichever candidate is taken as the recognition result is likely to be a misrecognition that causes abnormal operation. The input spoken word is therefore invalidated, and the host is notified of the rejection through the external communication processing unit 15.

Fail-safe behavior is achieved in this way.
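The second-embodiment flow can be sketched as follows. This is a minimal sketch, assuming a non-empty candidate list ordered by descending similarity and using the FIG. 5 example words in English. The set standing in for the second-type table 132, the function name `recognize_second_embodiment`, and the use of `None` to represent a rejection are assumptions made for illustration.

```python
from typing import List, Optional

# Stand-in for the second-type table 132 (FIG. 5), using the example words.
TYPE2_FS_WORDS = {"right", "left", "up", "down"}


def recognize_second_embodiment(candidates: List[str]) -> Optional[str]:
    """Return the recognized word, or None when the input is rejected."""
    first = candidates[0]
    # Step S123: is the first candidate a second-type F-S word?
    if first not in TYPE2_FS_WORDS:
        return first  # Step S124: accept the first candidate.
    # Step S221: is another second-type F-S word among the remaining candidates?
    if any(word in TYPE2_FS_WORDS for word in candidates[1:]):
        return None  # Reject: the host is notified instead of receiving a word.
    return first  # Step S124: no conflicting word, accept the first candidate.


# "right" and "left" compete, so the utterance is invalidated.
result = recognize_second_embodiment(["right", "left", "1"])
```

Returning `None` keeps the rejection distinct from every valid word, so a caller cannot mistake it for a command.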

(D) Third embodiment

When the candidate group contains either a first-type or a second-type F-S word, there is a risk of abnormal operation.

In the third embodiment, fail-safe processing is performed when the candidate group contains a first-type or second-type F-S word.

Specifically, after steps S111 and S112 of the first embodiment (or steps S121 and S122 of the second embodiment), the processing from step S113 onward of the first embodiment and the processing from step S123 onward of the second embodiment are carried out in parallel.

That is, in the third embodiment the fail-safe processing target word in step S1 is a constraint condition word or a second-type fail-safe processing target word. In the former case, the other fail-safe processing target word in step S2 is a constraint instruction word; in the latter case, it is a second-type fail-safe processing target word.
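One way to combine the two checks is sketched below. The text says only that the two branches run in parallel and does not say how their outcomes are merged, so the precedence used here (substituting a constraint instruction word wins over rejection) is purely an assumption for illustration, as are the function name `recognize_third_embodiment`, the word sets, and the use of `None` for a rejection.

```python
from typing import List, Optional

# Stand-ins for tables 131 and 132, using the example words from FIGS. 3 and 5.
CONSTRAINT_CONDITION_WORDS = {"right", "left", "up", "down"}
CONSTRAINT_INSTRUCTION_WORDS = {"stop"}
TYPE2_FS_WORDS = {"right", "left", "up", "down"}


def recognize_third_embodiment(candidates: List[str]) -> Optional[str]:
    """Return the recognized word, or None when the input is rejected."""
    first, rest = candidates[0], candidates[1:]
    # First-type branch: a constraint condition word ranked first gives way to
    # a constraint instruction word found elsewhere in the candidate group.
    if first in CONSTRAINT_CONDITION_WORDS:
        for word in rest:
            if word in CONSTRAINT_INSTRUCTION_WORDS:
                return word  # Assumed precedence: the safe word is preferred.
    # Second-type branch: conflicting second-type F-S words invalidate the input.
    if first in TYPE2_FS_WORDS and any(w in TYPE2_FS_WORDS for w in rest):
        return None
    # Neither branch triggered: accept the first candidate.
    return first


# Both branches could trigger here; under the assumed precedence "stop" wins.
result = recognize_third_embodiment(["right", "stop", "left"])
```

A device for which rejection is the safer outcome could evaluate the second-type branch first instead.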

The processing in each step of the third embodiment is clear from the descriptions of the first and second embodiments above and is not repeated here.

[Effects of the Invention]

As described above, the present invention provides the following effects.

(1) A recognized word that is accurate and safe, with no risk of causing abnormal operation, is obtained for the input speech.

(2) Because fail-safe processing is performed whenever the safety of the recognition result is in doubt, the safety of a device that uses the speech recognition result can be guaranteed without any special safety equipment.

(3) Through (1) and (2), speech recognition can now be used widely, including in fields where it could not be used before because limited recognition rates created a risk of abnormal operation.

[Brief explanation of the drawing]

FIG. 1 is an explanatory diagram of the basic configuration of the present invention. FIG. 2 is an explanatory diagram of the speech recognition device used to implement each embodiment. FIG. 3 is an explanatory diagram of the first-type table used in the first embodiment. FIG. 4 is a processing flowchart of the first embodiment. FIG. 5 is an explanatory diagram of the second-type table used in the second embodiment. FIG. 6 is a processing flowchart of the second embodiment.

In FIG. 2: 11, voice input microphone; 12, speech recognition processing unit; 13, main memory; 14, main processing unit; 15, external communication processing unit; 131, first-type table; 132, second-type table.

Patent applicant: Fujitsu Ltd.

Claims

[Claims]

(1) A speech recognition control method that selects a recognized word from a group of candidate words selected on the basis of similarity information for an input speech word, characterized in that: (A) when the first candidate word, which has the highest similarity, is not included in the fail-safe processing target word group, the first candidate word is taken as the recognized word (step S_1); and (B) when the first candidate word is included in the fail-safe processing target word group, the first candidate word is taken as the recognized word if the candidate word group contains no other fail-safe processing target word, and fail-safe processing is performed if the candidate word group contains another fail-safe processing target word (step S_2).

(2) The speech recognition control method according to claim 1, characterized in that the other fail-safe processing target word in step S_2 is a constraint instruction word, namely a word that is highly likely to cause abnormal operation if it is misrecognized as another word but that poses no risk of abnormal operation even if another word is misrecognized as it; the fail-safe processing target word in step S_1 is a constraint condition word, namely a word as which the constraint instruction word is highly likely to be misrecognized; and fail-safe processing is performed.

(3) The speech recognition control method according to claim 1, characterized in that the fail-safe processing target words are a second-type fail-safe processing target word group whose members designate operations or processes that cannot be performed simultaneously or that contradict one another, and the fail-safe processing in step S_2 invalidates the input speech word.

(4) The speech recognition control method according to claim 1, characterized in that the fail-safe processing target word in step S_1 is a constraint condition word or a second-type fail-safe processing target word; in the former case, the other fail-safe processing target word in step S_2 is a constraint instruction word; and in the latter case, the other fail-safe processing target word in step S_2 is also a second-type fail-safe processing target word.