
JPH01177600A - Voice recognition error correcting device - Google Patents

Voice recognition error correcting device

Info

Publication number
JPH01177600A
JPH01177600A
Authority
JP
Japan
Prior art keywords
symbol
output
section
input
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP63001488A
Other languages
Japanese (ja)
Other versions
JPH0580000B2 (en)
Inventor
Kenichi Iso
健一 磯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to JP63001488A priority Critical patent/JPH01177600A/en
Publication of JPH01177600A publication Critical patent/JPH01177600A/en
Publication of JPH0580000B2 publication Critical patent/JPH0580000B2/ja
Granted legal-status Critical Current


Abstract

PURPOSE: To practically improve the recognition performance of an acoustic recognition unit by using a backpropagation network model to correct the time series of symbols, containing misrecognitions, obtained as the acoustic recognition result. CONSTITUTION: A symbol string obtained as the acoustic recognition result is stored in an input buffer section 1, and an input window section 2 successively cuts out fixed-length symbol strings from the input buffer section 1 while shifting the starting point one symbol at a time; each time the inference result for an input is output, the corresponding symbol in the input buffer section 1 is rewritten with the output symbol. The output of a backpropagation network model section 3 is stored in an output buffer section 4, and a first control section 6 shifts the starting position of the input window section 2 by one symbol each time one symbol is output, so that the next correction operation is performed. When a second control section 7 detects that correction has been performed up to the last symbol in the input buffer section 1, it writes the stored contents of the output buffer section 4 back to the input buffer section 1 and performs the correction operation again; after this process has been repeated a fixed number of times, the contents of the output buffer section 4 are written to a correction result output section 8. Thus, the recognition performance of the acoustic recognition unit is improved.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a speech recognition error correction device, and in particular to an improvement of a speech recognition error correction device that corrects a time series of symbols containing errors obtained as a recognition result in a speech recognition device (for example, a phoneme symbol string obtained as a result of phoneme recognition, or a word symbol string obtained as a result of word recognition), taking into account the preceding and following context within the time series.

[Conventional Technology]

As a method of correcting errors by taking the preceding and following context in the time series into account, there is a method that calculates, from data of the recognition domain, the probability of appearance (conditional probability) of the center symbol when the surrounding symbol string is fixed, and stores these probabilities in a table; when a time series containing errors is given, the symbol string is rewritten using the tabulated conditional probabilities so that the posterior probability is maximized. For example, when correction is performed by considering three symbols before and after the center, the conditional probability P is expressed as follows.

(Formula 1)  P(s0 | s1 s2 s3 s4 s5 s6 s7)

Here s_i denotes the i-th symbol, and P denotes the probability that the true symbol s0 was misrecognized as the observed center symbol s4. The correction result for the center symbol s4 is determined from (s1 s2 s3 s4 s5 s6 s7); that is, the correction result ŝ0 is given by

(Formula 2)  ŝ0 = argmax over s0 of P(s0 | s1 s2 s3 s4 s5 s6 s7)
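As a minimal sketch of this prior-art table-based method, correction amounts to an argmax lookup in the conditional-probability table. The context window, symbols, and probabilities below are hypothetical illustration values, not data from the patent:

```python
# Hypothetical conditional-probability table for the prior-art method.
# Maps a 7-symbol window (s1..s3, observed center s4, s5..s7) to a
# distribution over candidate true center symbols s0.
table = {
    ("k", "a", "t", "e", "s", "a", "n"): {"a": 0.7, "e": 0.2, "o": 0.1},
}

def correct_center(window, table):
    """Replace the center symbol of a 7-symbol window by the argmax candidate."""
    probs = table.get(tuple(window))
    if probs is None:          # unseen context: leave the center symbol unchanged
        return window[3]
    return max(probs, key=probs.get)

print(correct_center(["k", "a", "t", "e", "s", "a", "n"], table))  # -> a
```

The table grows with every distinct context seen, which is exactly the storage problem the next section describes.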

[Problems to Be Solved by the Invention]

However, in the above method, widening the surrounding context taken into account makes the size of the conditional-probability table grow exponentially, which is not practical. That is, letting L be the length of the context taken into account and M the number of symbol types, it follows from the definition of the conditional probability (Formula 1) that the size of the table is ~O(M^L) (where ~O() denotes the order of the size). Moreover, the amount of computation for the optimization that maximizes the posterior probability also becomes non-negligible. Furthermore, when the surrounding context contains many errors, stable error correction becomes difficult.
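The exponential growth of the table can be checked numerically; assuming, for illustration, M = 26 symbol types:

```python
M = 26                       # number of symbol types (illustrative value)
for L in (3, 5, 7):          # context length taken into account
    print(L, M ** L)         # table entries grow as ~O(M^L)
# -> 3 17576
#    5 11881376
#    7 8031810176
```

Even a modest context of seven symbols already requires billions of table entries, which motivates the network-based approach below.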

An object of the present invention is to avoid the difficulty of realization caused by the enormous storage capacity required for the conditional-probability table described above; to realize error correction adapted to the error tendencies of the acoustic recognition unit by performing supervised learning of error correction using the recognition results of the acoustic recognition unit; to greatly reduce the amount of computation, since no optimization calculation is required at correction time; and, by sequentially correcting the input symbol string with the correction results, to provide a recognition error correction device that enables stable error correction using a surrounding context containing fewer errors.

If the recognition error correction device according to the present invention is used as a post-processing stage of the acoustic recognition unit, substantially the same effect is obtained as improving the recognition performance of the acoustic recognition unit itself.

[Means for Solving the Problems]

The speech recognition error correction device according to the present invention, in correcting recognition errors contained in the time series of symbols obtained as a result of recognition in speech recognition, comprises: an input buffer section that stores the time series; an input window section that cuts out fixed-length symbol strings from the time series of symbols stored in the input buffer section, sequentially shifting the starting point one symbol at a time from the beginning; a backpropagation network model section that has been trained in advance, with supervision, on symbol strings containing errors so that, taking as input the fixed-length symbol string obtained as the output of the input window section, it outputs the correct answer for the center symbol of that string; a rewriting section that, at the time the backpropagation network model section outputs a symbol, rewrites the corresponding symbol in the input buffer section with the corrected symbol; a first control section that then shifts the starting point of the input window section, which cuts out the fixed-length symbol string from the input buffer section, by one symbol and causes the backpropagation network model section to perform the correction operation for the next symbol; an output buffer section that stores the symbol string output by the backpropagation network model section; a second control section that, upon detecting that the symbol at the end of the symbol string in the input buffer section has been corrected, writes the contents of the output buffer section back to the input buffer section and causes the correction operation to be repeated; and a correction result output section that outputs the contents of the output buffer section as the correction result after the correction operation has been repeated a fixed number of times.

[Operation]

The basic principle of the present invention is to correct the time series of symbols, containing misrecognitions, obtained as the acoustic recognition result in speech recognition, using a backpropagation network model trained in advance with supervision. The principle of the invention is explained in detail below.

The symbol string obtained as the output of the acoustic recognition unit when input speech is recognized contains some errors, reflecting the error tendencies of the acoustic recognition unit, due to recognition errors that are at present unavoidable. The present invention corrects this error-containing time series of symbols by taking the surrounding context into account, and thereby substantially improves the recognition performance of the acoustic recognition unit.

For the correction, a backpropagation network model, devised as a model of associative memory and pattern recognition, is used. The details of this model are given in "Parallel Networks that Learn to Pronounce English Text", T. J. Sejnowski and C. R. Rosenberg, Complex Systems, Vol. 1 (1987), pp. 145-168.

The model is generally composed hierarchically of three types of layers, as shown in Fig. 2, called the input unit layer, the hidden unit layer, and the output unit layer.

Processing elements called units are arranged in each layer. Each unit receives inputs from the units of the adjacent layer on the input side and sends its output to the units of the adjacent layer on the output side. The input-output response of each unit is given as follows.

(Formula 4)  y_j^(n) = f(x_j^(n) − θ_j^(n)),  where x_j^(n) = Σ_i w_ij^(n) y_i^(n−1)

(Formula 5)  f(x) = (1 + e^(−x))^(−1)

Here x is the input to a unit, y is the output of the unit, and θ is the threshold of the unit; the superscript denotes the layer counted from the input layer (n = 1, ..., N), the subscript denotes the unit within a layer, and w_ij^(n) is the weight of the connection to unit j of the n-th layer. f(x), given in (Formula 5), is the nonlinear saturating response function common to all units. In short, each unit takes as its input the difference between the weighted sum of the outputs of the units of the adjacent upper layer and a predetermined threshold, and determines its output by a kind of threshold logic.

When data are given to the input layer of this model, the information (data) is propagated to the output layer while being processed successively in the adjacent lower layers, and the outputs of the units of the output layer constitute the model's inference result for the given input data.
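The unit response of Formulas 4 and 5 and the layer-by-layer propagation can be sketched as a forward pass in the following way; the weights and thresholds here are arbitrary illustration values, not values from the patent:

```python
import math

def f(x):
    """Nonlinear saturating response function, Formula 5: f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_output(y_prev, weights, thresholds):
    """Formula 4: y_j = f(sum_i w_ij * y_i - theta_j) for every unit j of a layer."""
    return [f(sum(w_ij * y_i for w_ij, y_i in zip(col, y_prev)) - th)
            for col, th in zip(weights, thresholds)]

# Tiny 3-layer network: 2 input units -> 2 hidden units -> 1 output unit.
x = [1.0, 0.0]                                              # input unit layer
hidden = layer_output(x, [[0.5, -0.3], [0.8, 0.1]], [0.2, -0.1])
output = layer_output(hidden, [[1.2, -0.7]], [0.3])          # inference result
print(output)
```

Each call to `layer_output` propagates the data one layer toward the output, exactly as the paragraph above describes.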

In the present invention, the model is configured so that, when a fixed-length symbol string cut out from a symbol string containing errors is presented to the input layer, the error correction result (inference result) for the center symbol of the presented fixed-length symbol string is output at the output layer.

Next, the learning method (backpropagation learning) that determines the inter-unit connections so that the model performs the desired inference is described. The training data are either fixed-length symbol strings cut out from the error-containing symbol strings actually output by the acoustic recognition unit for input speech, or pseudo-data obtained by probabilistically adding errors to error-free symbol strings under an assumed confusion tendency between symbols. These data are presented to the input layer, the correct answer for the center symbol is presented at the output layer, and backpropagation learning is repeated. In the backpropagation method, the desired inference result (output data) for the input data is given as a teacher signal, and the inter-unit connections are repeatedly modified in the direction that reduces the difference (error) between the model's inference result and the teacher signal. In practice, this corresponds to finding the inter-unit connections that minimize the error function, defined by the following equation, between the model output y_i^(N) at the output layer (the N-th layer) and the desired output (answer) ŷ_i for the given input.

(Formula 6)  E = (1/2) Σ_i (y_i^(N) − ŷ_i)^2

Since this function depends on all the inter-unit connections through y^(N), the minimization may be carried out with E as the evaluation function. The resulting backpropagation learning algorithm is described in detail in the literature cited above.
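As a hedged sketch of this minimization, the following trains a tiny 1-1-1 network (one input, one hidden unit, one output unit, thresholds omitted for brevity) by gradient descent on the E of Formula 6; the initial weights, learning rate, and single training pair are illustrative, not values from the patent:

```python
import math

def f(x):
    """Sigmoid response function shared by all units (Formula 5)."""
    return 1.0 / (1.0 + math.exp(-x))

def train(steps=200, lr=0.5):
    """Gradient descent on E = (1/2)(y - target)^2; returns the final error E."""
    w1, w2 = 0.5, -0.4          # illustrative initial inter-unit connections
    x, target = 1.0, 1.0        # one training pair; target is the teacher signal
    for _ in range(steps):
        h = f(w1 * x)                        # hidden unit output
        y = f(w2 * h)                        # output unit, y^(N)
        dy = (y - target) * y * (1 - y)      # dE/d(net input), using f' = f(1-f)
        g2 = dy * h                          # dE/dw2
        g1 = dy * w2 * h * (1 - h) * x       # dE/dw1, error back-propagated
        w2 -= lr * g2                        # modify connections to reduce E
        w1 -= lr * g1
    return 0.5 * (f(w2 * f(w1 * x)) - target) ** 2   # final E (Formula 6)

print(train())  # final E after training is much smaller than the initial E
```

The same chain-rule bookkeeping, extended over all units of each layer, is the backpropagation algorithm detailed in the cited literature.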

When correction is performed using the trained model, fixed-length symbol strings are cut out sequentially from the symbol string output by the acoustic recognition unit for the input speech, shifting the starting point one symbol at a time, and are input to the backpropagation network model. When the model outputs the correction result for the center symbol of the input fixed-length symbol string, the corresponding symbol in the input symbol time series is rewritten with that symbol.

As a result, the first half of the fixed-length symbol string presented to the input unit layer of the model always consists of more reliable symbols that have already been corrected, so that the error correction performed by the model is carried out more stably.

Since the symbol string corrected by the model in this way may still contain errors that could not be corrected, the entire symbol string once corrected by the model is given to the model again as input for error correction in order to fix the remaining errors. By repeating this process, a symbol string with progressively fewer errors is obtained.
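The whole correction procedure described above — windows cut out with the start point shifted one symbol at a time, the center symbol rewritten in place so that later windows see already-corrected context, and the pass over the buffer repeated a fixed number of times — can be sketched as follows; `model` here is a trivial stand-in for the trained network:

```python
def correct(symbols, model, window_len=7, passes=3):
    """Sliding-window error correction with in-place rewriting."""
    half = window_len // 2
    buf = list(symbols)                     # input buffer section
    for _ in range(passes):                 # repeat the pass a fixed number of times
        out = list(buf)                     # output buffer section
        for start in range(len(buf) - window_len + 1):
            window = buf[start:start + window_len]
            corrected = model(window)       # inference for the center symbol
            buf[start + half] = corrected   # rewrite: later windows see corrections
            out[start + half] = corrected
        buf = out                           # write the output buffer back
    return buf

# Stand-in model: "fixes" any 'x' appearing at the window center.
model = lambda w: 'a' if w[len(w) // 2] == 'x' else w[len(w) // 2]
print(''.join(correct(list("abcxabcxabc"), model)))  # -> abcaabcaabc
```

Because `buf` is updated as soon as each center symbol is corrected, the left half of every later window contains corrected symbols, mirroring the stability argument above.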

[Embodiment]

Fig. 1 is a block diagram showing one embodiment of a device implementing the present invention. The input buffer section 1 stores the symbol string that is the acoustic recognition result. The input window section 2 cuts out fixed-length symbol strings from the input buffer section 1, shifting the starting point one symbol at a time, and every time the backpropagation network model section 3 outputs an inference result for its input, the corresponding symbol in the input buffer section is rewritten with the output symbol. The output buffer section 4 stores the output of the backpropagation network model section 3, and the first control section 6 causes the backpropagation network to perform the next correction operation. When the second control section 7 detects that correction has reached the last symbol of the input buffer section 1, it writes the stored contents of the output buffer section 4 back to the input buffer section 1 and causes the correction operation to be performed again; after this process has been repeated a fixed number of times, the contents of the output buffer section 4 are written out to the correction result output section 8.

[Effects of the Invention]

As described above, according to the present invention, errors in the symbol string output by the acoustic recognition unit can be corrected in a bottom-up manner by using the surrounding context. Furthermore, writing the correction result back to the input symbol string one symbol at a time makes it possible to perform error correction using a more reliable context, and repeatedly re-inputting the entire output symbol string of the model for error correction makes it possible to obtain a correction result with few errors.

The effect of the present invention consequently amounts to improving the recognition performance of the acoustic recognition unit, and makes it possible to achieve high accuracy for the speech recognition device as a whole.

Furthermore, letting L be the length of the context taken into account, M the number of symbol types, and H the number of hidden units, the order of the storage capacity required for execution is ~O(L·M·H), which makes possible a substantial reduction compared with the prior art.
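Comparing the two storage requirements numerically, with illustrative values M = 26, L = 7, H = 50 (not figures from the patent):

```python
M, L, H = 26, 7, 50        # symbol types, context length, hidden units (illustrative)
table_entries = M ** L     # prior art: conditional-probability table, ~O(M^L)
net_weights = L * M * H    # network connections, ~O(L*M*H)
print(table_entries, net_weights)  # -> 8031810176 9100
```

The network's storage grows only linearly in each of L, M, and H, which is the reduction claimed above.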

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing one embodiment of the present invention. Fig. 2 is a diagram showing the general configuration of the backpropagation network model. 1 denotes the input buffer section; 2, the input window section; 3, the backpropagation network model section; 4, the output buffer section; 5, the rewriting section; 6, the first control section; 7, the second control section; and 8, the correction result output section.

Agent: Patent Attorney Susumu Uchihara

Claims (1)

[Claims] 1. A speech recognition error correction device for correcting, in speech recognition, recognition errors contained in a time series of symbols obtained as a recognition result, comprising: an input buffer section that stores the time series; an input window section that cuts out fixed-length symbol strings from the time series of symbols stored in the input buffer section, sequentially shifting the starting point one symbol at a time from the beginning; a backpropagation network model section that has been trained in advance, with supervision, on symbol strings containing errors so that, taking as input the fixed-length symbol string obtained as the output of the input window section, it outputs the correct answer for the center symbol of that string; a rewriting section that, at the time the backpropagation network model section outputs a symbol, rewrites the corresponding symbol in the input buffer section with the corrected symbol; a first control section that shifts the starting point of the input window section, which cuts out the fixed-length symbol string from the input buffer section, by one symbol and causes the backpropagation network model section to perform the correction operation for the next symbol; an output buffer section that stores the symbol string output by the backpropagation network model section; a second control section that, upon detecting that the symbol at the end of the symbol string in the input buffer section has been corrected, writes the contents of the output buffer section back to the input buffer section and causes the correction operation to be repeated again; and a correction result output section that outputs the contents of the output buffer section as the correction result when the correction operation has been repeated a fixed number of times.
JP63001488A 1988-01-06 1988-01-06 Voice recognition error correcting device Granted JPH01177600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63001488A JPH01177600A (en) 1988-01-06 1988-01-06 Voice recognition error correcting device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63001488A JPH01177600A (en) 1988-01-06 1988-01-06 Voice recognition error correcting device

Publications (2)

Publication Number Publication Date
JPH01177600A true JPH01177600A (en) 1989-07-13
JPH0580000B2 JPH0580000B2 (en) 1993-11-05

Family

ID=11502828

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63001488A Granted JPH01177600A (en) 1988-01-06 1988-01-06 Voice recognition error correcting device

Country Status (1)

Country Link
JP (1) JPH01177600A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004341518A (en) * 2003-04-25 2004-12-02 Sony Internatl Europ Gmbh Speech recognition processing method
US7020606B1 (en) * 1997-12-11 2006-03-28 Harman Becker Automotive Systems Gmbh Voice recognition using a grammar or N-gram procedures
US7343288B2 (en) 2002-05-08 2008-03-11 Sap Ag Method and system for the processing and storing of voice information and corresponding timeline information
US7406413B2 (en) 2002-05-08 2008-07-29 Sap Aktiengesellschaft Method and system for the processing of voice data and for the recognition of a language
DE102014201730A1 (en) 2013-03-26 2014-10-02 Toyota Boshoku Kabushiki Kaisha INTERNAL COMPONENT FOR A VEHICLE
JP2016161313A (en) * 2015-02-27 2016-09-05 株式会社日立アドバンストシステムズ Positioning system
US11096436B2 (en) 2013-08-29 2021-08-24 Toyota Boshoku Kabushiki Kaisha Beadings

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7020606B1 (en) * 1997-12-11 2006-03-28 Harman Becker Automotive Systems Gmbh Voice recognition using a grammar or N-gram procedures
US7343288B2 (en) 2002-05-08 2008-03-11 Sap Ag Method and system for the processing and storing of voice information and corresponding timeline information
US7406413B2 (en) 2002-05-08 2008-07-29 Sap Aktiengesellschaft Method and system for the processing of voice data and for the recognition of a language
JP2004341518A (en) * 2003-04-25 2004-12-02 Sony Internatl Europ Gmbh Speech recognition processing method
DE102014201730A1 (en) 2013-03-26 2014-10-02 Toyota Boshoku Kabushiki Kaisha INTERNAL COMPONENT FOR A VEHICLE
DE102014201730B4 (en) * 2013-03-26 2017-10-19 Toyota Boshoku Kabushiki Kaisha INTERNAL COMPONENT FOR A VEHICLE
US11096436B2 (en) 2013-08-29 2021-08-24 Toyota Boshoku Kabushiki Kaisha Beadings
JP2016161313A (en) * 2015-02-27 2016-09-05 株式会社日立アドバンストシステムズ Positioning system

Also Published As

Publication number Publication date
JPH0580000B2 (en) 1993-11-05

Similar Documents

Publication Publication Date Title
Bahl et al. Maximum mutual information estimation of hidden Markov model parameters for speech recognition
US5577164A (en) Incorrect voice command recognition prevention and recovery processing method and apparatus
KR102167719B1 (en) Method and apparatus for training language model, method and apparatus for recognizing speech
JP6628350B2 (en) Method for learning recurrent neural network, computer program therefor, and speech recognition device
JPH0355837B2 (en)
US8494847B2 (en) Weighting factor learning system and audio recognition system
US7035802B1 (en) Recognition system using lexical trees
Franco et al. Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system
US12125478B2 (en) System and method for a natural language understanding system based on iterative intent detection and slot filling neural layers
US11893983B2 (en) Adding words to a prefix tree for improving speech recognition
US20180293494A1 (en) Local abbreviation expansion through context correlation
JP2020020872A (en) Discriminator, learnt model, and learning method
US20220122586A1 (en) Fast Emit Low-latency Streaming ASR with Sequence-level Emission Regularization
JP2020042257A (en) Voice recognition method and apparatus
US20130138441A1 (en) Method and system for generating search network for voice recognition
JPH01177600A (en) Voice recognition error correcting device
JP2000298663A (en) Recognition device using neural network and learning method thereof
KR20200120595A (en) Method and apparatus for training language model, method and apparatus for recognizing speech
US8204738B2 (en) Removing bias from features containing overlapping embedded grammars in a natural language understanding system
JP2021039220A (en) Speech recognition device, learning device, speech recognition method, learning method, speech recognition program, and learning program
US20240185839A1 (en) Modular Training for Flexible Attention Based End-to-End ASR
JPH01177599A (en) Voice recognition error correcting device
KR102654803B1 (en) Method for detecting speech-text alignment error of asr training data
KR20210091919A (en) Methods and Apparatus for learning deep learning models that can handle complex problems
US7206738B2 (en) Hybrid baseform generation