JPH08305805A

JPH08305805A - Text recognition device

Info

Publication number: JPH08305805A
Application number: JP7111651A
Authority: JP
Inventors: Kiyoshi Tashiro; 潔田代
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1995-05-10
Filing date: 1995-05-10
Publication date: 1996-11-22

Abstract

PURPOSE: To narrow down candidates efficiently and accurately in character recognition processing and character segmentation processing. CONSTITUTION: The text recognition device comprises a single-character recognition means 201, which performs character recognition on the image of one character and outputs multiple character codes as the recognition result for that character in descending order of certainty, and a character prediction means 202, which predicts the character that will appear next from the character recognition results obtained by the single-character recognition means 201 up to a given point in time and outputs its character code. When character codes predicted by the character prediction means 202 are present, the character recognition result is corrected so that the predicted character codes rise in rank, and is then output.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a text recognition device that sequentially recognizes each character in an image containing a plurality of characters arranged according to certain rules or with certain statistical properties, such as Japanese or English sentences or formatted data strings, and outputs a sequence of character codes.

[0002] By using an online character recognition means as the single-character recognition means, the invention can also be applied to an online text recognition device that sequentially recognizes text entered online, such as Japanese or English sentences or formatted data strings, and outputs a sequence of character codes.

[0003] Likewise, by using a phoneme recognition means as the single-character recognition means, the invention can be applied to a continuous speech recognition device that sequentially recognizes continuously uttered Japanese or English sentences, formatted data strings, and the like, and outputs a sequence of character codes.

[0004]

[Prior Art] In conventional text recognition or character recognition devices, when recognition performance is to be improved by exploiting properties of the target character sequence, such as the grammatical rules or statistical properties of Japanese text, it has been common to perform character recognition on a set of characters and then verify and correct the resulting sequence of recognition results. In such techniques, processing that exploits the properties of the character sequence is performed only after character recognition has finished for all of the characters concerned, for example all those within a text region.

[0005] FIG. 1 shows the operation flow of text recognition according to the prior art.

[0006] (S101) Each character is recognized to obtain a recognition result sequence.

[0007] (S102) The recognition result sequence is grammatically verified and corrected.

[0008] (S103) The contents of the recognition result sequence are output.

[0009] For example, Japanese Examined Patent Publication No. 6-32091 discloses a technique for recognizing the characters of Japanese text in which word sequences are detected from strings of multiple recognition candidate characters using a word dictionary, grammatical connection information, and the like, and the detected word sequences are taken as the reading result. With such a technique, however, correction is difficult when the correct character is not among the recognition candidates or when character segmentation is wrong.

[0010] Handling these cases requires either drastically increasing the number of recognition candidates or recognizing every potentially correct character segmentation path and storing the results, which increases both the amount of computation and the required storage capacity.

[0011] Japanese Unexamined Patent Publication No. 6-20105 discloses a technique in which a predictive LR (Left-to-Right) parser predicts the next character and a character matching unit determines the existence probability of each predicted character.

[0012] This technique, however, has the drawback that if the character matching unit makes an error, that is, if the existence probability it produces is inaccurate, all subsequent predictions become inaccurate and recognition becomes difficult. Likewise, when the text to be recognized contains a word that is not in the word dictionary, all subsequent predictions become inaccurate.

[0013] Moreover, the technique does not consider the case in which there are two or more character region candidates presumed to be the image of a single character, so it is not practical for recognizing, for example, ordinary printed text or text handwritten at free pitch.

[0014]

[Problems to Be Solved by the Invention] In techniques typified by Japanese Examined Patent Publication No. 6-32091, the properties of the target character sequence are used to verify and correct the recognition results after a series of character recognition processes has finished, so they cannot be used during character recognition itself. Consequently, they cannot be used to narrow down candidates in character recognition or character segmentation at the time that processing is performed.

[0015] As a result, either the properties of the target character sequence cannot be used effectively, or a large number of candidates from character recognition and character segmentation must be stored.

[0016] The technique of Japanese Unexamined Patent Publication No. 6-20105 includes a character prediction means, the predictive LR parser unit, and can therefore use the properties of the target character sequence at the time character recognition is performed. However, if character recognition or prediction is inaccurate at some point, processing of the subsequent characters becomes difficult. This is because characters that were not predicted are excluded from character recognition, and because characters to which character recognition assigns a low existence probability are not used as a basis for predicting subsequent characters.

[0017] An object of the present invention is therefore to make it possible to narrow down candidates in character recognition and character segmentation efficiently and accurately.

[0018]

[Means for Solving the Problems] The present invention is a text recognition device that sequentially recognizes each character in an image containing a character string and outputs a sequence of character codes, comprising: a first single-character recognition means that performs character recognition on the image of one character and obtains a character recognition result containing the character codes corresponding to that image and, for each character code, a certainty indicating how likely it is; a character prediction means that predicts the character codes that will appear next from the sequence of character recognition results obtained by the first single-character recognition means up to a given point in time; and a second single-character recognition means that orders the character codes on the basis of the character codes and certainties obtained by the first single-character recognition means and the character codes predicted by the character prediction means, and obtains the character recognition result for the image of one character.

[0019] The present invention is also a text recognition device that sequentially recognizes each character in an image containing a character string and outputs a sequence of multiple ordered character codes, comprising: a first single-character recognition means that calculates, for the image region of one character, the distance or similarity to the standard pattern of each character code and obtains a character recognition result containing the character codes corresponding to that image region and their distances or similarities; a character prediction means that predicts the character codes that will appear next from the sequence of character recognition results obtained by the first single-character recognition means up to a given point in time; and a second single-character recognition means that applies a predetermined operation to the distance or similarity, obtained by the first single-character recognition means, of each character code predicted by the character prediction means, orders the character codes on the basis of the result of that operation, and obtains the character recognition result for the image of one character.

[0020] The present invention is also a text recognition device that sequentially recognizes each character in an image containing a character string and outputs a sequence of multiple ordered character codes, comprising: a first single-character recognition means that performs character recognition on the image of one character and obtains a character recognition result containing the character codes corresponding to that image; a character prediction means that predicts the character codes that will appear next from the sequence of character recognition results obtained by the single-character recognition means up to a given point in time; and a second single-character recognition means that, on the basis of the character codes obtained by the first single-character recognition means and the character codes predicted by the character prediction means, performs character recognition on the image of one character and obtains the character codes corresponding to that image ordered by certainty, that is, by how likely each character code is.

[0021] The present invention is further a text recognition device that sequentially recognizes each character in an image containing a character string and outputs a sequence of character codes, comprising: a character segmentation means that cuts out character region candidates, each corresponding to one character, from the image; a first single-character recognition means that performs character recognition on each one-character region candidate cut out by the character segmentation means and obtains a character recognition result containing the character codes for that candidate and certainties indicating how likely they are; a score calculation means that calculates, from the character codes and certainties obtained by the first single-character recognition means, a character region score representing the certainty of the one-character region candidate; a character prediction means that predicts the character codes that will appear next from the sequence of character recognition results obtained by the first single-character recognition means up to a given point in time; and a second single-character recognition means that, on the basis of the character codes and certainties obtained by the first single-character recognition means, the region score calculated by the score calculation means, and the character codes predicted by the character prediction means, determines the character region from among the character region candidates, orders the character codes, and obtains the character recognition result corresponding to that character region.

[0022] The device is further characterized in that the first single-character recognition means and the second single-character recognition means have the same configuration and are implemented as a single piece of hardware or a single piece of software.

[0023]

[Operation] The text recognition device according to the present invention includes a character prediction means. While the characters of the text are being recognized, it predicts the character that will appear next and uses the prediction both in recognition by the single-character recognition means and in the determination of character regions by the character segmentation means, so that candidates in character recognition and character segmentation can be narrowed down efficiently and accurately.

[0024] Because the single-character recognition means also processes characters that the character prediction means did not predict, processing of subsequent characters can continue correctly even when character recognition or prediction is inaccurate.

[0025] Giving the first and second single-character recognition means the same configuration and implementing them as a single piece of hardware or a single piece of software simplifies the configuration.

[0026]

[Embodiments] The features of the present invention are described in detail below on the basis of embodiments, with reference to the drawings. Because the configuration is simplified when the first and second single-character recognition means recited in the claims have the same configuration and are implemented as a single piece of hardware or software, the embodiments below show that arrangement. The invention is not limited to it, however, and the first and second single-character recognition means may be provided separately.

[0027] FIG. 2 shows an example of the configuration of an embodiment of the text recognition device according to the present invention. Reference numeral 201 denotes the single-character recognition means, 202 the character prediction means, 203 a memory, 204 the recognition result sequence stored in the memory 203, and 205 the predicted character set stored in the memory 203.

[0028] Recognition results obtained by the single-character recognition means 201 are appended in order to the recognition result sequence 204 and stored. Characters predicted by the character prediction means 202 are stored as the predicted character set 205. The single-character recognition means 201 refers to the predicted character set 205 in the course of its processing, and the character prediction means 202 refers to the recognition result sequence 204 in the course of its processing.

[0029] Existing techniques can be used for the character prediction means 202. For example, as in the technique of Japanese Unexamined Patent Publication No. 6-20105, an LR table for predicting the next character can be prepared in advance and consulted using the recognition result sequence obtained so far to obtain the characters that can appear next. Furthermore, because the present invention can refer to the recognition result sequence obtained so far, when parsing fails because of an unregistered word, a misrecognition, or the like, subsequent characters can still be predicted by resetting the stack and restarting parsing from the necessary position.
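The recovery behaviour described above can be illustrated with a small sketch. A full LR parser does not fit in a short example, so a plain word-list prefix lookup is swapped in for the LR table; the word list `WORDS`, the function `predict_next`, and the policy of dropping leading characters until parsing can resume are all assumptions made for illustration, not the method of the cited publication.

```python
# Hypothetical word list standing in for the LR table (an assumption, not from the patent).
WORDS = ["ビジネス", "ビジョン", "として", "の"]


def predict_next(recognized: str, words=WORDS) -> set:
    """Return the characters that may follow `recognized`.

    The parse state is the part of `recognized` that belongs to the word
    currently being read. If no word continues that state (word boundary,
    unregistered word, or misrecognition), leading characters are dropped one
    at a time until parsing can resume; an empty state is a fresh start.
    """
    for start in range(len(recognized) + 1):
        state = recognized[start:]
        following = {w[len(state)] for w in words
                     if w.startswith(state) and len(w) > len(state)}
        if following:
            return following
    return set()  # only reached when `words` is empty
```

With this sketch, a recognition result that matches no word does not stop prediction: the state falls back to the empty prefix and the first characters of all words are predicted again.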

[0030] As another technique for implementing the character prediction means, the N-gram statistical data described in, for example, "Japanese Document Recognition Using Double Markov Model" (Proceedings of the 1994 Spring Conference of the Institute of Electronics, Information and Communication Engineers, 7-223) can be used. With N-gram statistical data, the probability of each character appearing next can be obtained from the last (N-1) characters of the recognition result sequence.
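As a rough sketch of N-gram based prediction, the following code counts character N-grams in a training corpus and returns, for the last (N-1) recognized characters, the relative frequency of each possible next character. The function names `train_ngram` and `next_char_probs`, the default N = 3, and the absence of smoothing are assumptions for illustration; the cited paper's actual model is not reproduced here.

```python
from collections import Counter, defaultdict


def train_ngram(corpus, n=3):
    """Count, for every (n-1)-character context, which character follows it."""
    counts = defaultdict(Counter)
    for text in corpus:
        for i in range(len(text) - n + 1):
            context, nxt = text[i:i + n - 1], text[i + n - 1]
            counts[context][nxt] += 1
    return counts


def next_char_probs(counts, recognized, n=3):
    """Return {character: probability} for the character after `recognized`.

    Only the last (n-1) characters of the recognition result are used.
    An unseen context yields an empty dict (no smoothing in this sketch).
    """
    context = recognized[-(n - 1):] if n > 1 else ""
    seen = counts.get(context)
    if not seen:
        return {}
    total = sum(seen.values())
    return {ch: c / total for ch, c in seen.items()}
```

The characters with non-zero probability can serve as the predicted character set, and the probabilities themselves are one candidate for a per-character prediction score.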

[0031] FIG. 3 shows the overall operation flow of the embodiment of the text recognition device according to the present invention.

[0032] (S301) The recognition result sequence 204 and the predicted character set 205 are cleared.

[0033] (S302) With reference to the predicted character set 205, one character of the text to be recognized is recognized by the single-character recognition means 201.

[0034] (S303) The recognition result obtained is appended to the recognition result sequence 204.

[0035] (S304) With reference to the recognition result sequence 204, the character prediction means 202 predicts the characters that may appear next.

[0036] (S305) The predicted characters obtained are written to the predicted character set 205.

[0037] (S306) Check whether all characters in the text to be recognized have been recognized. If not, repeat from (S302). If so, proceed to (S307).

[0038] (S307) The contents of the recognition result sequence 204 are output.
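The flow of steps S301 to S307 can be sketched as a simple loop. The function name `recognize_text` and the two callables `recognize_one` and `predict_next` are hypothetical stand-ins for the single-character recognition means 201 and the character prediction means 202; their signatures are assumptions for illustration.

```python
def recognize_text(char_images, recognize_one, predict_next):
    """Recognize characters one by one, feeding predictions back into recognition.

    recognize_one(image, predicted) -> ranked candidate list for one character
    predict_next(results)           -> set of characters expected to come next
    """
    results = []       # recognition result sequence 204, cleared in S301
    predicted = set()  # predicted character set 205, cleared in S301
    # S306: loop until every character has been processed. Unlike the flow
    # chart, this loop also accepts an empty input without running S302.
    for image in char_images:
        candidates = recognize_one(image, predicted)  # S302
        results.append(candidates)                    # S303
        predicted = predict_next(results)             # S304 and S305
    return results                                    # S307
```

The point of the structure is that prediction runs inside the recognition loop, so each character is recognized with the predictions derived from all earlier results.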

[0039] The configuration and operation of the single-character recognition means 201 described above are explained in detail below.

[0040] FIG. 4 shows an example of the configuration of the single-character recognition means 201 in the text recognition device. Reference numeral 401 denotes a feature extraction means, 402 a distance calculation means, 403 a standard feature storage means, 404 a distance value storage means, and 405 a distance value sorting means.

[0041] Various features can be extracted by the feature extraction means 401, for example mesh features, peripheral features (see Hashimoto, "Introduction to Character Recognition", The Telecommunications Association, March 20, 1982, pp. 83-85), and directional line element features. Multiple features may also be used in combination.

[0042] Various distance calculation methods can be used in the distance calculation means 402, for example the Euclidean distance, the Mahalanobis distance, or a pseudo-Bayes function.

[0043] The standard feature storage means 403 stores features extracted, by the same method as the feature extraction means 401, from character images collected in advance, each associated with a character number. Each character number is in turn stored in association with a character code, so a character number uniquely determines a character code.

[0044] The distance value storage means 404 stores the distance values calculated by the distance calculation means 402 in association with character numbers.

[0045] The distance value sorting means 405 sorts the distance values stored in the distance value storage means 404 together with their character numbers and outputs, as the character recognition result, the character numbers corresponding to the top-ranked distance values. The distance values themselves may also be output along with the character numbers so that the character prediction means 202 and other components can refer to them.

[0046] FIG. 5 shows the operation flow of the single-character recognition means 201 in the embodiment of the text recognition device according to the present invention.

[0047] (S501) The feature extraction means 401 extracts a feature vector for recognition from the image of one character. This feature vector is called X.

[0048] (S502) The integer variable i, which represents the character number being processed, is set to 0.

[0049] (S503) The distance calculation means 402 calculates the distance between the feature vector X and the standard feature Fi, and the value is stored in the distance value storage means 404. This value is called Di. The standard feature Fi is the feature vector stored in the standard feature storage means 403 in association with character number i.

[0050] (S504) Check whether the character corresponding to number i is in the predicted character set 205. If it is, execute (S505). If not, proceed to (S506).

[0051] (S505) The distance value Di is multiplied by a predetermined constant, 0.9, and the result is stored in the distance value storage means 404.

[0052] (S506) 1 is added to the variable i.

[0053] (S507) Check whether the variable i is smaller than the total number of characters I. If it is, repeat from (S503). If not, proceed to (S508).

[0054] (S508) The distance value sorting means 405 sorts the distance values Di stored in the distance value storage means 404 in ascending order, that is, smallest first, and outputs the top-ranked character numbers.
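Steps S502 to S508 can be sketched as follows, assuming the feature vector X from S501 is already available. The Euclidean distance is used as one of the options the text names; the function name `recognize_char`, keying the standard features by character instead of by character number, and the default of three output candidates are assumptions for illustration.

```python
import math


def recognize_char(x, standard_features, predicted, factor=0.9, top_k=3):
    """Rank characters by distance to feature vector `x`, favouring predicted ones.

    standard_features: {character: feature vector}, playing the role of the
                       standard feature storage means 403
    predicted:         predicted character set 205
    """
    distances = {}                                  # distance value storage means 404
    for char, f in standard_features.items():       # loop over i (S502, S506, S507)
        d = math.dist(x, f)                         # S503: Euclidean distance Di
        if char in predicted:                       # S504
            d *= factor                             # S505: shrink the distance
        distances[char] = d
    ranked = sorted(distances, key=distances.get)   # S508: ascending order
    return ranked[:top_k]
```

For instance, with standard features `{"ス": (0.0, 0.0), "ヌ": (1.0, 0.0)}` and a noisy input `(0.52, 0.0)`, "ヌ" ranks first without prediction, while passing `predicted={"ス"}` reduces the distance of "ス" from 0.52 to 0.468 and moves it ahead of "ヌ" at 0.48. Unpredicted characters remain in the ranking, which is what lets recognition continue when the prediction is wrong.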

[0055] The loop from (S503) to (S507) applies the same processing to the character for each number i, so the processing time can be shortened by parallelizing it, for example by providing several distance calculation means 402.

[0056] Through (S505), characters that were predicted by the character prediction means 202 and are in the predicted character set 205 have their distance to the standard feature changed to a smaller value. The constant 0.9 by which Di is multiplied in (S505) may be another value; a value smaller than 1 is effective. The optimum value depends on factors such as the accuracy of the character prediction means 202. The constant can also be changed each time according to the number of characters in the predicted character set 205. It is also effective to change the constant according to the character type of the character corresponding to number i, for example 0.95 for hiragana, 0.93 for katakana, and 0.9 for kanji.
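The character-type-dependent constant can be sketched as a lookup on the character's script. Classifying characters by Unicode block is an assumption made here for convenience (the patent does not say how the character type is determined), as are the function name `script_factor` and the choice of 1.0, meaning no correction, for any other script.

```python
def script_factor(ch: str) -> float:
    """Return the multiplier applied to Di for a predicted character `ch`."""
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:   # Hiragana block
        return 0.95
    if 0x30A0 <= code <= 0x30FF:   # Katakana block
        return 0.93
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs (kanji)
        return 0.9
    return 1.0                     # assumption: leave other scripts uncorrected
```

In this example kanji receive the strongest correction and hiragana the weakest, following the three values given in the text.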

[0057] In (S505), a constant may be added to the distance value Di instead of multiplying by one. In that case a constant smaller than 0, that is, a negative value, is effective.
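The two correction variants for (S505) can be placed side by side in a short sketch. The function name `correct_distance` and its `mode` parameter are hypothetical; the constants in the usage below are example values only.

```python
def correct_distance(d, is_predicted, mode="multiply", constant=0.9):
    """Return the corrected distance for one character.

    mode="multiply": d * constant, with constant < 1 to favour predicted characters
    mode="add":      d + constant, with constant < 0 to favour predicted characters
    Unpredicted characters keep their distance unchanged.
    """
    if not is_predicted:
        return d
    if mode == "multiply":
        return d * constant
    if mode == "add":
        return d + constant
    raise ValueError(f"unknown mode: {mode}")
```

One difference worth noting: multiplication scales the correction with the distance, while addition shifts every predicted character by the same amount and can produce negative values for very small distances. The ascending sort in (S508) still works in that case.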

[0058] The effect of the present invention is explained next by comparison with the prior art.

[0059] FIG. 6 shows how recognition proceeds in a prior-art text recognition device. Reference numeral 601 denotes part of a text image. Reference numeral 602 denotes the contents of the recognition result sequence; for each character, the first-, second-, and third-ranked recognition candidates are shown from top to bottom.

[0060] In this example, noise has been added to the character "ス" in the text "ビジネスとしての" ("as a business"), and the correct character "ス" is not among the top three recognition candidates. The correct result therefore cannot be obtained even by grammatical verification and correction after recognition of each character has finished.

[0061] Possible ways of improving the prior art so that cases like this example are recognized correctly include storing more candidates in the recognition result sequence and considering characters outside the recognition candidates during grammatical verification and correction. Both have drawbacks, however, such as increased storage capacity, increased computation, and more misrecognitions caused by side effects.

[0062] FIG. 7 shows how recognition proceeds in the embodiment of the text recognition device according to the present invention.

[0063] Reference numerals 702, 704, 706, 708, and 710 denote the contents of the recognition result sequence 204, and 703, 705, 707, 709, and 711 denote the contents of the predicted character set 205.

[0064] 702 is the contents of the recognition result sequence 204 when the first character "ビ" has been recognized by the single-character recognition means 201; 703 is the contents of the predicted character set 205 obtained by the character prediction means 202 with reference to 702; 704 is the contents of the recognition result sequence 204 when the second character "ジ" has been recognized by the single-character recognition means 201 with reference to 703; and so on. From the recognition result sequence 706, in which the first three characters have been recognized, the predicted character set 707 containing "ス" is obtained, so the distance value Di for the character "ス" is modified and "ス" becomes the first-ranked candidate.

[0065] Thus, according to the present invention, a character that the conventional technique could not recognize correctly even with grammatical rules can be recognized correctly. Moreover, even when "ス" becomes only the second- or third-ranked candidate rather than the first, it can still be made the first-ranked candidate by combining the present invention with the same post-recognition grammatical verification and correction used in the conventional technique.

[0066] In the text recognition device according to the present invention, a prediction score may also be assigned to each character in the predicted character set 205 obtained by the character prediction means 202. The prediction score indicates the degree to which each character is predicted; for example, it takes a larger value the more strongly the character is predicted to appear next. When the N-gram statistical data described above is used for the character prediction means 202, the statistical probability that each character appears next is available, so that value can be used directly as the prediction score.
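As an illustration of how N-gram statistics could supply prediction scores, the following is a minimal sketch that uses character bigrams (N = 2). The function names, the tiny training corpus, and the choice of bigrams are assumptions made for this example only; the text does not specify how the statistical data is built or stored.

```python
from collections import Counter, defaultdict


def build_bigram_model(corpus):
    """Count, for each character, how often each character follows it."""
    following = defaultdict(Counter)
    for text in corpus:
        for prev, nxt in zip(text, text[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(model, recognized):
    """Return {character: prediction score} for the character after `recognized`.

    The score is the relative frequency with which the character followed the
    last recognized character in the training corpus.
    """
    if not recognized or recognized[-1] not in model:
        return {}
    counts = model[recognized[-1]]
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


# Hypothetical training text, chosen only to make the example concrete.
corpus = ["ビジネス", "ビジョン", "ネスト"]
model = build_bigram_model(corpus)
scores = predict_next(model, "ビジネ")
```

With this toy corpus, "ネ" is always followed by "ス", so "ス" receives the highest possible score, while after "ジ" the scores are split between "ネ" and "ョ".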

[0067] In the text recognition device according to the present invention that uses the prediction score, a function of the prediction score is used in place of the constant in the processing of (S505) in FIG. 5. For example, letting p be the prediction score, Di is multiplied by [Equation 1].

[0068] Using a function of the prediction score to correct the distance value makes it possible to adjust the degree of correction according to the degree to which each character is predicted. That is, by changing the distance value to a smaller value the more strongly the character is predicted to appear next, grammatical properties can be reflected more accurately in the character recognition result.
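The effect of a score-dependent correction can be sketched as follows. Equation 1 is not reproduced in this text, so the multiplier used here, 1 - alpha * p with p in [0, 1], is purely a stand-in chosen for illustration and should not be read as the function the device actually uses. The names `correct_distances`, `rank`, and `alpha`, and the sample distance values, are likewise hypothetical.

```python
def correct_distances(distances, predicted, alpha=0.5):
    """Shrink the distance of each predicted character according to its score.

    distances: {character: distance value Di}
    predicted: {character: prediction score p, assumed to lie in [0, 1]}
    alpha:     assumed tuning parameter in (0, 1); larger means stronger correction
    """
    corrected = dict(distances)
    for ch, p in predicted.items():
        if ch in corrected:
            # Stand-in for Equation 1: a factor that decreases as p grows.
            corrected[ch] *= 1.0 - alpha * p
    return corrected


def rank(distances):
    """Characters ordered from smallest to largest distance."""
    return sorted(distances, key=distances.get)


# Hypothetical distances for one character image.
distances = {"ヌ": 4.0, "ス": 5.0, "又": 6.0}
corrected = correct_distances(distances, {"ス": 0.9})
```

Here "ス" starts in second place and moves to first after the correction, while characters that were not predicted keep their original distances. Any function that decreases monotonically in p would show the same qualitative behavior.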

[0069] With the configuration and operation described with reference to FIGS. 4 to 7, the distance value Di to the standard feature Fi is changed to a smaller value for each character that is predicted by the character prediction means 202 and included in the predicted character set 205, so such characters rank higher in the character recognition result. Furthermore, the degree to which a predicted character moves up can be adjusted by changing the value of the constant.

[0070] FIG. 8 is a diagram showing another example of the configuration of the single character recognition means 201 in the embodiment of the text recognition device according to the present invention. In the figure, 801 denotes a first feature extraction means, 802 a first distance calculation means, 803 a first standard feature storage means, 804 a first distance value storage means, 805 a first distance value sorting means, 806 an intermediate candidate adding means, 807 a second feature extraction means, 808 a second distance calculation means, 809 a second standard feature storage means, 810 a second distance value storage means, and 811 a second distance value sorting means.

[0071] The first feature extraction means 801 and the second feature extraction means 807 extract either the same features or mutually different features from the character image. When the same features are used, the configuration can be simplified by implementing the two means as a single piece of hardware or a single software module. In that case, the configuration can be simplified further by implementing the first standard feature storage means 803 and the second standard feature storage means 809 as a single piece of hardware or a single memory area.

[0072] To obtain a highly accurate recognition result at high speed, it is effective to have the first feature extraction means 801 use features that can be extracted in a relatively short time and the second feature extraction means 807 use features that are relatively accurate. One effective combination is to use mesh features in the first feature extraction means 801 and a combination of mesh features, peripheral features, and directional line element features in the second feature extraction means 807.

[0073] The first distance calculation means 802 and the second distance calculation means 808 use either the same distance calculation method or mutually different methods. When the same method is used, the configuration can be simplified by implementing the two means as a single piece of hardware or a single software module.

[0074] To obtain a highly accurate recognition result at high speed, it is effective to have the first distance calculation means 802 use a distance calculation method that can be computed in a relatively short time and the second distance calculation means 808 use a method that is relatively accurate. One effective combination is to use the Euclidean distance in the first distance calculation means 802 and a pseudo-Bayes function in the second distance calculation means 808.

[0075] The first distance value storage means 804 and the second distance value storage means 810, and likewise the first distance value sorting means 805 and the second distance value sorting means 811, can each be implemented as a single piece of hardware or a single software module to simplify the configuration.

[0076] The first distance value sorting means 805 sorts the distance values stored in the first distance value storage means 804 together with their character numbers, and takes the character numbers corresponding to the several smallest sorted distance values as the intermediate candidates.

[0077] FIG. 9 is a diagram showing the operation flow of the single character recognition means in the text recognition device shown in FIG. 8.

[0078] (S901) The first feature extraction means 801 extracts a first feature vector for recognition from the character image of one character. This feature vector is called X.

[0079] (S902) 0 is assigned to the integer variable i, which represents the character number being processed.

[0080] (S903) The first distance calculation means 802 calculates the distance between X and the first standard feature Fi, and the value is stored in the first distance value storage means 804. This value is denoted Di. The first standard feature Fi is the feature vector stored in the first standard feature storage means 803 in association with the character number i.

[0081] (S904) 1 is added to the variable i.

[0082] (S905) It is checked whether the variable i is smaller than the total number of characters I. If it is, processing repeats from (S903). Otherwise, processing proceeds to (S906).

[0083] (S906) The first distance value sorting means 805 sorts the distance values Di stored in the first distance value storage means 804 in ascending order, that is, smallest first, and the top N characters obtained are taken as the intermediate candidates.

[0084] (S907) If any character is included in the predicted character set 205 but not in the intermediate candidates, the intermediate candidate adding means 806 adds it to the intermediate candidates.

[0085] (S908) The number of intermediate candidates is assigned to the integer variable J.

[0086] (S909) The second feature extraction means 807 extracts a second feature vector for recognition from the character image of one character. This feature vector is called X'.

[0087] (S910) 0 is assigned to the integer variable j, which indexes the intermediate candidate being processed.

[0088] (S911) The character number of the j-th intermediate candidate is assigned to the integer variable i.

[0089] (S912) The second distance calculation means 808 calculates the distance between X' and the second standard feature F'i, and the value is stored in the second distance value storage means 810. This value is denoted Dj. The second standard feature F'i is the feature vector stored in the second standard feature storage means 809 in association with the character number i.

[0090] (S913) 1 is added to the variable j.

[0091] (S914) It is checked whether the variable j is smaller than the number of intermediate candidates J. If it is, processing repeats from (S911). Otherwise, processing proceeds to (S915).

[0092] (S915) The second distance value sorting means 811 sorts the distance values Dj stored in the second distance value storage means 810 in ascending order, that is, smallest first, and the top-ranked character numbers obtained are output.
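The flow of (S901) to (S915) can be sketched in a few lines. This is a simplified illustration, not the device's implementation: both stages use the Euclidean distance, feature extraction is assumed to have been done already (the caller passes the first and second feature vectors directly), and all names and sample values are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_two_stage(x1, x2, std1, std2, predicted, n=2, top=3):
    """Two-stage single character recognition with intermediate candidate addition.

    x1, x2:     first and second feature vectors of the character image
    std1, std2: {character: standard feature} for the first and second stage
    predicted:  set of characters in the predicted character set
    n:          number of intermediate candidates kept after the first stage
    top:        number of candidates output at the end
    """
    # S902 to S905: coarse distance to every character.
    coarse = {ch: euclidean(x1, f) for ch, f in std1.items()}
    # S906: the n nearest characters become the intermediate candidates.
    candidates = sorted(coarse, key=coarse.get)[:n]
    # S907: add predicted characters that the coarse stage dropped.
    for ch in sorted(predicted):
        if ch in std2 and ch not in candidates:
            candidates.append(ch)
    # S910 to S914: fine distance for the intermediate candidates only.
    fine = {ch: euclidean(x2, std2[ch]) for ch in candidates}
    # S915: output the nearest candidates.
    return sorted(fine, key=fine.get)[:top]


# Hypothetical data: the coarse feature is noisy and pushes "ス" out of the top 2.
std1 = {"ヌ": [0.0], "又": [1.0], "ス": [3.0]}
std2 = {"ヌ": [0.0, 0.0], "又": [1.0, 1.0], "ス": [3.0, 3.0]}
x1, x2 = [0.5], [2.8, 3.1]

with_prediction = recognize_two_stage(x1, x2, std1, std2, {"ス"})
without_prediction = recognize_two_stage(x1, x2, std1, std2, set())
```

In this toy case the coarse stage alone never passes "ス" to the second stage, so it cannot appear in the output. Once "ス" is in the predicted character set, it is added at (S907) and the more discriminating second stage ranks it first.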

[0093] The value of N used in the processing of (S906) is, for example, 10. N need only be less than or equal to the total number of characters I and greater than or equal to the number of candidates finally output. A larger N tends to raise the accuracy of the final recognition candidates but increases the amount of computation.

[0094] The processing of (S907) is the characteristic feature of the present invention. When the target character image contains noise or deformation, the conventional technique may drop the correct character from the intermediate candidates, so that the recognition result is wrong. According to the present invention, even in such a case, the intermediate candidate adding means 806 ensures that the correct character is included in the intermediate candidates as long as it is in the predicted character set 205. Because the second feature extraction means 807 and the second distance calculation means 808 perform higher-precision processing than the first feature extraction means 801 and the first distance calculation means 802, they are less affected by noise and deformation. A correct character added by the intermediate candidate adding means 806 therefore has a high probability of ranking near the top of the recognition result, and the accuracy of the recognition result improves accordingly.

[0095] Although FIGS. 8 and 9 show an example with two candidate extraction stages, the present invention is applicable to any configuration with two or more candidate extraction stages. Providing an intermediate candidate adding means at each stage prevents the correct character from being dropped from the intermediate candidates at that stage.

[0096] As described above, with the configuration and operation described with reference to FIGS. 8 and 9, most of the misrecognitions that occur in single character recognition processing with two or more candidate extraction stages because the correct character is dropped from the candidates at an intermediate stage can be remedied, and the accuracy of the final text recognition can be improved.

[0097] FIG. 10 is a diagram showing still another example of the configuration of the embodiment of the text recognition device according to the present invention. In the figure, 1001 denotes a single character recognition means, 1002 a character prediction means, 1006 a character segmentation means, 1003 a memory, 1004 a recognition result sequence stored in the memory 1003, 1005 a predicted character set stored in the memory 1003, 1007 character area candidates stored in the memory 1003, and 1008 a temporary recognition result stored in the memory 1003.

[0098] The recognition result obtained by the single character recognition means 1001 is stored in the temporary recognition result 1008. The characters predicted by the character prediction means 1002 are stored as the predicted character set 1005. The character area candidates obtained by the character segmentation means 1006 are stored in the character area candidates 1007. The recognition results corresponding to the character areas determined by the character segmentation means 1006 are added to the recognition result sequence 1004 in order and stored.

[0099] The single character recognition means 1001 refers to the character area candidates 1007 in the course of its processing. The character prediction means 1002 refers to the recognition result sequence 1004 in the course of its processing. The character segmentation means 1006 refers to the predicted character set 1005 and the temporary recognition result 1008 in the course of its processing.

[0100] FIG. 11 is a diagram showing the operation flow of the text recognition device shown in FIG. 10.

[0101] (S1101) The recognition result sequence 1004 and the predicted character set 1005 are cleared.

[0102] (S1102) The character area candidates 1007 and the temporary recognition result 1008 are cleared.

[0103] (S1103) The character segmentation means 1006 extracts, from the text to be recognized, character area candidates each presumed to be one character, and stores them in the character area candidates 1007.

[0104] (S1104) The single character recognition means 1001 recognizes the image corresponding to one of the character area candidates.

[0105] (S1105) The recognition result obtained is added to the temporary recognition result 1008.

[0106] (S1106) It is checked whether all of the character area candidates have been recognized. If not, processing repeats from (S1104). If so, processing proceeds to (S1107).

[0107] (S1107) The character segmentation means 1006 calculates a character area score for each character area candidate with reference to the temporary recognition result 1008. If any of the characters in the temporary recognition result for a character area candidate is included in the predicted character set 1005, the character area score is multiplied by the constant 2. The character area candidate with the largest character area score is determined as the character area.

[0108] (S1108) The recognition result corresponding to the determined character area is added from the temporary recognition result 1008 to the recognition result sequence 1004.

[0109] (S1109) The character prediction means 1002 predicts the characters that may appear next.

[0110] (S1110) The predicted characters obtained are written to the predicted character set 1005.

[0111] (S1111) It is checked whether all the characters in the text to be recognized have been recognized. If not, processing repeats from (S1102). If so, processing proceeds to (S1112).

[0112] (S1112) The contents of the recognition result sequence 1004 are output.
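The loop of (S1101) to (S1112) can be sketched as below. This is an illustrative simplification under several assumptions: segmentation, recognition, and prediction are passed in as callables, only the first-ranked character of each determined area is kept in the result, the character area score is the reciprocal of the first-ranked distance, and images are represented by string tokens. None of the names or sample values come from the source.

```python
def recognize_text(propose_regions, recognize, predict, boost=2.0):
    """Segment and recognize a line of text, following (S1101) to (S1112).

    propose_regions(pos) -> list of (end_pos, image) candidates starting at pos;
                            an empty list means the text is exhausted.
    recognize(image)     -> list of (char, distance), nearest first, distance > 0.
    predict(chars)       -> set of characters expected after the list `chars`.
    """
    result = []          # recognition result sequence (1004)
    predicted = set()    # predicted character set (1005)
    pos = 0
    while True:
        regions = propose_regions(pos)               # S1103
        if not regions:                              # S1111
            break
        best = None
        for end, image in regions:                   # S1104 to S1106
            candidates = recognize(image)
            score = 1.0 / candidates[0][1]           # reciprocal of the top distance
            if any(ch in predicted for ch, _ in candidates):
                score *= boost                       # S1107
            if best is None or score > best[0]:
                best = (score, end, candidates[0][0])
        _, pos, char = best                          # S1108
        result.append(char)
        predicted = predict(result)                  # S1109, S1110
    return "".join(result)                           # S1112


# Toy data: after "ト", the next glyph can be cut as its left half ("ノ") or whole ("ル").
REGIONS = {0: [(1, "img_ト")], 1: [(2, "img_ノ"), (3, "img_ル")], 2: [(3, "img_レ")], 3: []}
RESULTS = {"img_ト": [("ト", 4.0)], "img_ノ": [("ノ", 5.8)],
           "img_ル": [("ル", 6.4)], "img_レ": [("レ", 5.0)]}


def predict_after(chars):
    return {"ル"} if chars and chars[-1] == "ト" else set()


with_prediction = recognize_text(REGIONS.get, RESULTS.get, predict_after)
without_prediction = recognize_text(REGIONS.get, RESULTS.get, predict_after, boost=1.0)
```

With the boost, the whole-glyph cut wins and the output is "トル". With the boost disabled (a factor of 1.0), the narrower cut has the larger raw score and the glyph is split into "ノ" and "レ".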

[0113] In the above operation flow, the constant 2 by which the character area score is multiplied in (S1107) may be another value. A value greater than 1 is effective. The optimum value depends on factors such as the accuracy of the character prediction means 1002. The value of the constant may also be changed each time according to the number of characters in the predicted character set 1005. It is also effective to change the value of the constant according to the character type of the character common to the temporary recognition result 1008 and the predicted character set 1005: for example, 1.5 when the common character is hiragana, 1.7 when it is katakana, and 2 when it is kanji. It is likewise effective to change the value of the constant according to the rank, among the recognition candidates, of the character common to the temporary recognition result 1008 and the predicted character set 1005: for example, 2 when the first-ranked candidate of the temporary recognition result 1008 is included in the predicted character set 1005, 1.3 when the second-ranked candidate is included, and 1.1 when the third-ranked candidate is included.
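The two example schedules above (by character type and by candidate rank) might be expressed as follows. The constants are the ones given in the text; the function names, the Unicode ranges used to classify characters, and the fallback factor of 1.0 for cases the text does not cover are assumptions of this sketch.

```python
def boost_by_script(ch):
    """Constant chosen by the character type of the matching character."""
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:      # Hiragana block
        return 1.5
    if 0x30A0 <= code <= 0x30FF:      # Katakana block
        return 1.7
    if 0x4E00 <= code <= 0x9FFF:      # CJK Unified Ideographs (kanji)
        return 2.0
    return 1.0                        # assumption: no boost for other character types


RANK_BOOST = {1: 2.0, 2: 1.3, 3: 1.1}


def boost_by_rank(candidates, predicted):
    """Constant chosen by the rank of the best-ranked candidate that was predicted.

    candidates: characters of the temporary recognition result, best first
    predicted:  set of characters in the predicted character set
    """
    for rank, ch in enumerate(candidates, start=1):
        if ch in predicted:
            return RANK_BOOST.get(rank, 1.0)   # assumption: 1.0 below third place
    return 1.0
```

For example, `boost_by_rank(["ノ", "ル", "レ"], {"ル"})` gives 1.3 because the predicted character is the second-ranked candidate.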

[0114] In the embodiment described above, the constant is applied when any of the characters in the temporary recognition result 1008 is included in the predicted character set 1005, but other conditions may be used. For example, the constant may be applied only when the first-ranked candidate of the temporary recognition result 1008 is included in the predicted character set 1005.

[0115] FIG. 12 is a diagram showing an example of the effect of the text recognition device shown in FIG. 10.

[0116] In the figure, 1201 is part of a text image. 1202 and 1209 are determined character areas. 1203 and 1210 are the contents of the recognition result sequence 1004. 1204 is the contents of the predicted character set 1005. 1205 and 1206 are character area candidates 1007. 1207 and 1208 are the contents of the temporary recognition result 1008.

[0117] Reference numeral 1202 shows the state of the character areas at the time the characters up to "ベクト" have been recognized in the part 1201 of the text image; the determined character areas are shown as black rectangles. At this point the contents of the recognition result sequence 1004 are as shown at 1203. The contents of the predicted character set 1005 obtained by the character prediction means 1002 with reference to 1203 are as shown at 1204.

[0118] The next character, "ル", consists of two parts when its vertical projection is examined, so the character area candidates 1205 and 1206 are obtained. In the figure, each character area candidate 1007 is shown as a gray rectangle.

[0119] The contents of the temporary recognition result 1008 obtained by having the single character recognition means 1001 recognize each character area candidate are shown at 1207 and 1208. In each, the left column lists the first-, second-, and third-ranked candidates from top to bottom, and the numerical value on the right is the character area score. The character area score indicates how character-like the character area is; for example, the reciprocal of the distance of the first-ranked candidate in the character recognition processing is used.

[0120] The reciprocal of the distance of the first-ranked candidate is "0.172" for the character area candidate 1205 and "0.155" for the character area candidate 1206.

[0121] The character segmentation means 1006 determines the character area candidate with the largest character area score as the character area. If the reciprocal of the distance of the first-ranked candidate were used as is, the character area candidate 1205 would be determined as the character area, causing a character segmentation error.

[0122] In the present invention, however, the first-ranked candidate "ル" of the temporary recognition result 1208 corresponding to the character area candidate 1206 is included in the predicted character set 1204, so the reciprocal of the distance for the character area candidate 1206, "0.155", is multiplied by the constant 2 to give a character area score of "0.310". The temporary recognition result 1207 corresponding to the character area candidate 1205, on the other hand, contains no character in the predicted character set 1204, so its character area score remains "0.172".

[0123] As a result, the character area score "0.310" of the character area candidate 1206 is the largest, the character area candidate 1206 is determined as the character area, the character areas become as shown at 1209, and the correct character segmentation result is obtained.
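The arithmetic of this example can be checked with a few lines. The two base scores and the constant 2 are the values given in the text; the variable names and the boolean table marking which candidate contains a predicted character are introduced only for this illustration.

```python
# Reciprocal of the first-ranked distance for each character area candidate.
base_scores = {1205: 0.172, 1206: 0.155}
# Whether the candidate's temporary recognition result contains a predicted character.
contains_predicted = {1205: False, 1206: True}
BOOST = 2  # the constant used in (S1107)

final_scores = {
    region: score * BOOST if contains_predicted[region] else score
    for region, score in base_scores.items()
}
chosen = max(final_scores, key=final_scores.get)
```

Without the correction, candidate 1205 would win with 0.172 against 0.155. With it, candidate 1206 scores 0.155 × 2 = 0.310 and is chosen.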

[0124] The contents of the temporary recognition result 1208 corresponding to the character area candidate 1206, which has been determined as the character area, are written to the recognition result sequence 1004, and the contents of the recognition result sequence 1004 become as shown at 1210.

[0125] As described above, in the embodiment shown in FIG. 10, with the configuration and operation described with reference to FIGS. 10 to 12, the character area score of a character area candidate corresponding to a character that is predicted by the character prediction means and included in the predicted character set is changed to a larger value, so that candidate is more likely to be determined as the character area. Furthermore, the degree to which a predicted character is favored can be adjusted by changing the value of the constant.

[0126] The embodiments described above can be combined with one another, and using them together can further improve character recognition accuracy.

[0127]

[Effects of the Invention] As described above, according to the present invention, when the character prediction means has predicted one or more character codes, the character recognition result is corrected so that each predicted character code ranks higher before it is output. Candidates in character recognition processing and character segmentation processing can therefore be narrowed down efficiently and accurately.

[0128] In addition, even when the character recognition processing or the prediction processing is inaccurate, processing of the subsequent characters can continue accurately.

[Brief description of drawings]

FIG. 1 is a diagram showing the operation flow of text recognition according to the conventional technique.

FIG. 2 is a diagram showing an example of the configuration of an embodiment of the text recognition device according to the present invention.

FIG. 3 is a diagram showing a schematic operation flow of the embodiment of the text recognition device according to the present invention.

FIG. 4 is a diagram showing an example of the configuration of the single character recognition means 201 in the text recognition device.

FIG. 5 is a diagram showing the operation flow of the single character recognition means 201 in the embodiment of the text recognition device according to the present invention.

FIG. 6 is a diagram showing how recognition proceeds in a text recognition device according to the conventional technique.

FIG. 7 is a diagram showing how recognition proceeds in the embodiment of the text recognition device according to the present invention.

FIG. 8 is a diagram showing another example of the configuration of the single character recognition means in the embodiment of the text recognition device according to the present invention.

FIG. 9 is a diagram showing the operation flow of the single character recognition means in the text recognition device shown in FIG. 8.

FIG. 10 is a diagram showing still another example of the configuration of the embodiment of the text recognition device according to the present invention.

FIG. 11 is a diagram showing the operation flow of the text recognition device shown in FIG. 10.

FIG. 12 is a diagram showing an example of the effect of the text recognition device shown in FIG. 10.

[Explanation of symbols]

201: single character recognition means; 202: character prediction means; 203: memory; 204: recognition result sequence; 205: predicted character set; 401: feature extraction means; 402: distance calculation means; 403: standard feature storage means; 404: distance value storage means; 405: distance value sorting means; 601: part of a text image; 602: contents of the recognition result sequence; 702, 704, 706, 708, 710: contents of the recognition result sequence; 703, 705, 707, 709, 711: contents of the predicted character set; 801: first feature extraction means; 802: first distance calculation means; 803: first standard feature storage means; 804: first distance value storage means; 805: first distance value sorting means; 806: intermediate candidate adding means; 807: second feature extraction means; 808: second distance calculation means; 809: second standard feature storage means; 810: second distance value storage means; 811: second distance value sorting means; 1001: single character recognition means; 1002: character prediction means; 1003: memory; 1004: recognition result sequence; 1005: predicted character set; 1006: character segmentation means; 1007: character area candidates; 1008: temporary recognition result; 1201: part of a text image; 1202, 1209: determined character areas; 1203, 1210: contents of the recognition result sequence; 1204: contents of the predicted character set; 1205, 1206: character area candidates; 1207, 1208: contents of the temporary recognition result

Claims

[Claims]

1. A text recognition device that sequentially recognizes each character in an image containing a character string and outputs a sequence of character codes, comprising: first single character recognition means for performing character recognition processing on an image of one character and obtaining a character recognition result that includes a character code corresponding to the image of the one character and an accuracy indicating the certainty of that character code; character prediction means for predicting the character code that appears next from the sequence of the character recognition results; and second single character recognition means for determining the character recognition result for the image of the one character based on the character code and accuracy obtained by the first single character recognition means and on the character code predicted by the character prediction means.

2. A text recognition device that sequentially recognizes each character in an image including a character string and outputs a sequence of a plurality of ordered character codes, comprising: first single character recognition means for calculating, for an image area of one character, a distance or similarity to the standard pattern of each character code, and obtaining a character recognition result that includes the character codes corresponding to the image area of the one character and the distance or similarity; character prediction means for predicting the character code that appears next from the sequence of character recognition results obtained by the first single character recognition means; and second single character recognition means for applying a predetermined operation to the distance or similarity that the first single character recognition means calculated for each character code predicted by the character prediction means, ordering the character codes based on the result of the operation, and obtaining the character recognition result for the image of the one character.

3. A text recognition device that sequentially recognizes each character in an image including a character string and outputs a sequence of a plurality of ordered character codes, comprising: first single character recognition means for performing character recognition processing on an image of one character and obtaining a character recognition result including character codes corresponding to the image of the one character; character prediction means for predicting the character code that appears next from the sequence of character recognition results obtained by the single character recognition means up to a certain point in time; and second single character recognition means for performing character recognition processing on the image of the one character with respect to the character codes obtained by the first single character recognition means and the character codes predicted by the character prediction means, and for ordering and determining the character codes corresponding to the image of the one character based on an accuracy indicating the certainty of each character code.

4. A text recognition device that sequentially recognizes each character in an image including a character string and outputs a sequence of character codes, comprising: character cutting means for cutting out a character area candidate for one character from the image; first single character recognition means for performing character recognition processing on the character area candidate for one character cut out by the character cutting means, and obtaining a character recognition result that includes a character code for the character area candidate and an accuracy indicating its certainty; score calculation means for calculating a character area score indicating the certainty of the character area candidate for the one character, based on the character code and accuracy obtained by the first single character recognition means; character prediction means for predicting the character code that appears next from the sequence of character recognition results obtained by the single character recognition means up to a certain point in time; and second single character recognition means for determining the character area from among the character area candidates, ordering the character codes, and obtaining the character recognition result corresponding to the character area, based on the character code and accuracy obtained by the first single character recognition means, the character area score calculated by the score calculation means, and the character code predicted by the character prediction means.

5. The text recognition device according to any one of claims 1, 3, and 4, wherein the accuracy is a value corresponding to the distance between a feature extracted from a character image and a standard feature.

6. The text recognition device according to claim 5, wherein the first single character recognition means and the second single character recognition means have the same configuration and are realized by a single piece of hardware or a single piece of software.
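
For illustration only, the reordering recited in claim 2 might be sketched in Python as follows. The claim leaves the "predetermined operation" unspecified. Multiplying the distance of each predicted character code by a constant factor below 1 is an assumption made for this sketch, as are the names Candidate, rerank, and factor, none of which appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    code: str        # character code of the candidate
    distance: float  # distance to the standard pattern (smaller is better)


def rerank(candidates, predicted, factor=0.5):
    """Reorder candidates after favoring the predicted character codes.

    Assumed operation: the distance of every candidate whose code is in
    `predicted` is multiplied by `factor`; other distances are unchanged.
    The candidates are then sorted by ascending adjusted distance.
    """
    adjusted = [
        Candidate(c.code, c.distance * factor if c.code in predicted else c.distance)
        for c in candidates
    ]
    return sorted(adjusted, key=lambda c: c.distance)


# Hypothetical result of the first single character recognition means.
first_result = [Candidate("c", 0.30), Candidate("e", 0.32), Candidate("o", 0.35)]

# Hypothetical output of the character prediction means.
predicted_codes = {"e"}

print([c.code for c in rerank(first_result, predicted_codes)])  # ['e', 'c', 'o']
```

In this sketch the predicted code "e" moves from second to first place because its adjusted distance (0.16) falls below that of "c" (0.30). With an empty predicted set, the original order is kept.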