JPH02300988A - character recognition device

Info

Publication number: JPH02300988A
Application number: JP1122543A
Authority: JP
Inventors: Yoshihiro Kitamura; 義弘北村; Hisafumi Saika; 尚史齋鹿
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1989-05-16
Filing date: 1989-05-16
Publication date: 1990-12-13
Anticipated expiration: 2013-12-02
Also published as: JP2832035B2

Abstract

PURPOSE: To recognize specific characters such as "一" (the kanji for "one"), "1", and "-" quickly and accurately by judging an input character to be a specific character when the ratio of the number of character constituent elements to the character frame area falls within a prescribed range. CONSTITUTION: A width/height extraction unit 251 and a width/height expansion/contraction unit 252 stretch or shrink the character image information so that its width and height take prescribed values. From this processed image, a character frame area calculation unit 253 measures the area of the rectangular character frame that the input character touches at its top, bottom, left, and right, and a character component count measurement unit 254 measures the number of character constituent elements. A specific character determination unit 255 obtains the ratio of the element count to the frame area and, if the ratio falls within the prescribed range, judges the input character to be a specific character such as "一", "1", or "-". The specific characters can thus be recognized quickly and accurately without running recognition processing such as pattern matching.

Description

DETAILED DESCRIPTION OF THE INVENTION [Field of Industrial Application] This invention relates to a character recognition device that can accurately recognize specific characters consisting essentially of a single line extending in the horizontal or vertical direction, such as "一" (the kanji for "one"), "1", and "-".

[Description of the Related Art] An optical character reader (OCR) is known as a character recognition device that recognizes the character information of a text by computer processing or the like. An OCR photoelectrically converts the character information of the characters to be recognized, such as handwritten or printed characters, and cuts out character image information, based on the resulting electrical signal, one character at a time. A recognition unit then recognizes the input characters one by one, according to a predetermined recognition algorithm, from the character image information cut out for each character.

Conventionally, when such an OCR recognizes an input character, it first performs a coarse classification based on global features of the character image information (for example, a complexity index or a four-side code) and selects several candidate characters. It then performs detailed recognition among the candidates selected by the coarse classification, based on detailed features of the character image information (for example, composite similarity or mixed similarity).

[Problems to Be Solved by the Invention] However, the conventional OCR character recognition method described above provides no classification or recognition algorithm for individual characters (categories), in either the coarse classification or the detailed recognition. All of the characters to be classified or recognized must therefore be handled by the same algorithm. As a result, the normalization process also stretches specific characters such as "一", "1", and "-" to the predetermined width and height, turning them into roughly rectangular shapes filled with the black information that makes up the character area. Such a specific character may then be misrecognized as another character with many strokes (that is, one with a large amount of black information in its character area), so the recognition rate for the specific characters is low.

Raising the recognition rate for these specific characters, however, requires complex processing based on a complex recognition algorithm, which increases the cost of the character recognition device.

An object of this invention is therefore to provide a character recognition device that can recognize specific characters such as "一", "1", and "-" quickly and accurately by simple processing.

[Means for Solving the Problems] To achieve the above object, this invention is a character recognition device in which a width/height extraction unit extracts the width and height of an input character from input character image information, a width/height expansion/contraction unit stretches or shrinks the character, based on the extracted width and height values, so that its width and height take predetermined values, and the input character is recognized from the character image information so processed. The device is characterized by comprising: a character frame area calculation unit that calculates, from the expansion/contraction-processed character image information, the area of the rectangular character frame that the input character touches at its top, bottom, left, and right; a character component count measurement unit that measures, from the same processed character image information, the number of elements constituting the character area of the input character; and a specific character determination unit that obtains the ratio of the number of character components to the character frame area, from the values supplied by those two units, and judges an input character whose ratio falls within a predetermined range to be a specific character.

In the character recognition device of this invention, the specific character determination unit is desirably configured to distinguish vertically long specific characters from horizontally long ones based on the width and height values of the input character extracted by the width/height extraction unit.

[Operation] When character image information is input, the width/height extraction unit extracts the width and height of the input character from it, and the width/height expansion/contraction unit stretches or shrinks the character, based on the extracted width and height values, so that its width and height take predetermined values. The character frame area calculation unit then calculates, from the processed character image information, the area of the rectangular character frame that the input character touches at its top, bottom, left, and right. Meanwhile, the character component count measurement unit measures, from the same processed information, the number of elements constituting the character area of the input character.

The specific character determination unit then obtains the ratio of the number of character components to the character frame area, from the frame area supplied by the character frame area calculation unit and the component count supplied by the character component count measurement unit, and judges an input character whose ratio falls within a predetermined range to be a specific character. An input character judged to be a specific character in this way no longer needs to go through recognition processing.

Furthermore, if the specific character determination unit is configured to distinguish vertically long specific characters from horizontally long ones based on the width and height values extracted by the width/height extraction unit, the device can tell the two kinds apart when it judges a specific character.

[Embodiment] This invention is described in detail below with reference to the illustrated embodiment.

FIG. 1 is a block diagram of the character recognition device of this invention.

Character image information input from an input unit 1, which consists of an image scanner or the like, is binarized by a preprocessing unit 2 and subjected to noise removal. Characters are then cut out from this binarized, noise-free character image information, processed by normalization and the like, and output to a recognition unit 3. The recognition unit 3 extracts feature parameters of the character pattern from the image information for each character supplied by the preprocessing unit 2 and recognizes the input character by pattern matching between the extracted feature parameters and standard patterns. The recognition result is displayed on a recognition result display unit 4, where misrecognized and rejected characters are corrected. A control unit 5 controls the input unit 1, the preprocessing unit 2, the recognition unit 3, and the recognition result display unit 4 to carry out the character recognition operation.

FIG. 2 is a detailed block diagram of the preprocessing unit 2 in FIG. 1. A communication unit 21 exchanges information with the control unit 5 of the OCR main body; character image information input from the input unit 1 and sent out by the control unit 5 is taken in through the communication unit 21. An image receiving unit 22 is a buffer used when a noise removal unit 23 and a normalization unit 25 perform the image processing described below, and the character image information taken in through the communication unit 21 is stored temporarily in this image receiving unit 22.

The noise removal unit 23 removes noise such as smudges by erasing isolated regions of a few bits or less from the character image information.
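The erasure of small isolated regions can be sketched as follows. This is only an illustration under stated assumptions, not the circuit of the noise removal unit 23: the image is taken to be a list of rows of 0/1 values (1 = black), regions are taken to be 8-connected, and the name `remove_small_regions` and the limit `max_size=3` are hypothetical, since the text says only "a few bits or less".

```python
from collections import deque


def remove_small_regions(image, max_size=3):
    """Return a copy of a binary image with black regions of <= max_size pixels erased.

    Assumptions (not from the source): 1 means black, regions are 8-connected,
    and max_size stands in for the unspecified "a few bits".
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Collect one connected region with a breadth-first search.
            region = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Small regions are treated as smudges and erased.
            if len(region) <= max_size:
                for ry, rx in region:
                    result[ry][rx] = 0
    return result
```

Applied to an image containing a five-pixel stroke and a single stray pixel, the function keeps the stroke and clears the stray pixel, leaving the input image itself unmodified.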

A character segmentation unit 24 then cuts out, one character at a time, the character image information from which the noise removal unit 23 has removed noise.

The normalization unit 25 then performs normalization on the noise-free character image information cut out for each character, removing variations in character size, slant, position, line width, and so on, so that the character size and the like become uniform. The normalization unit 25 also performs the specific character determination process of this invention, as detailed later. After the noise removal unit 23 or the normalization unit 25 has performed its processing, the character image information for each character is sent to the control unit 5 through the communication unit 21.

FIG. 3 is a more detailed block diagram of the part of the normalization unit 25 concerned with the specific character determination process.

The process of judging specific characters such as "一", "1", and "-" according to this invention is described in detail below with reference to FIG. 3.

The character image information taken in through the communication unit 21, cleaned of noise by the noise removal unit 23, and stored in the image receiving unit 22 is read into the normalization unit 25 and input to a width/height extraction unit 251.

The width/height extraction unit 251 extracts the width and height of the character from the input character image information for that character. A width/height expansion/contraction unit 252 then stretches or shrinks the character, based on the width and height values supplied by the width/height extraction unit 251, so that its width and height match predetermined normalization reference values. This can be done, for example, as follows: the character is stretched or shrunk vertically and horizontally so that its top, bottom, left, and right all touch a rectangular frame of a preset size.

Thus, for a specific character such as "1", which is stretched horizontally until it reaches the predetermined width, the rectangular frame ends up well filled with the black information that makes up the character area.
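The extraction and expansion/contraction steps, and the way a thin stroke ends up filling the frame, can be illustrated with a minimal sketch. The following is an assumption-laden example rather than the method of units 251 and 252: the image is a list of rows of 0/1 values, the frame is a hypothetical 8 x 8 square, resampling is plain nearest-neighbour (which is exact when stretching but can drop edge pixels when shrinking), and the function names are invented for illustration.

```python
def bounding_box(image):
    """Return (top, left, height, width) of the black pixels, or None if there are none."""
    if not image:
        return None
    rows = [y for y, row in enumerate(image) if any(row)]
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1] - rows[0] + 1, cols[-1] - cols[0] + 1


def stretch_to_frame(image, size=8):
    """Resample the character's bounding box onto a size x size frame.

    The frame size and the nearest-neighbour sampling are illustrative
    assumptions; the source does not specify either.
    """
    box = bounding_box(image)
    if box is None:
        return [[0] * size for _ in range(size)]
    top, left, height, width = box
    # Each target pixel copies the source pixel at the proportional position.
    return [
        [image[top + (y * height) // size][left + (x * width) // size]
         for x in range(size)]
        for y in range(size)
    ]
```

For a vertical stroke one pixel wide and four pixels tall, `bounding_box` reports a width of 1 and a height of 4, and `stretch_to_frame` returns a frame in which every pixel is black, which is the situation the paragraph above describes for "1".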

The normalization performed by the width/height extraction unit 251 and the width/height expansion/contraction unit 252 is a conventional, widely used normalization process.

The character image information processed by the width/height expansion/contraction unit 252 is output to a character frame area calculation unit 253 and a character component count measurement unit 254.

The character frame area calculation unit 253 calculates, from the input character image information, the area W of the character frame, that is, the smallest rectangle enclosing the character (the rectangle that the character touches at its top, bottom, left, and right). The character component count measurement unit 254, for its part, measures the number of bits a that make up the character area. A specific character determination unit 255 then calculates a determination constant C = a/W from the frame area W supplied by the character frame area calculation unit 253 and the bit count a supplied by the character component count measurement unit 254. The determination constant C expresses the area of the character region relative to the area of the character frame: the larger its value, the more nearly the two areas are equal and the more completely the frame is filled with the black information making up the character area. In other words, character image information whose determination constant C is at or above a certain threshold θ can be regarded as a specific character such as "一", "1", or "-" that has been stretched by the width/height expansion/contraction unit 252.

A character with many strokes, by contrast, does not fill its character frame with black information the way a specific character does, even after the width/height expansion/contraction unit 252 has processed its image, so its determination constant C is comparatively small and it is easy to tell apart from the specific characters. The specific character determination unit 255 therefore judges that the recognition category of the character is a specific character such as "一", "1", or "-" when the value of the determination constant C is greater than the threshold θ.
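The determination constant C = a/W and the threshold test can be sketched in a few lines. This is an illustrative example under assumptions, not the implementation of units 253 to 255: the image is a list of rows of 0/1 values (1 = black), the function names are hypothetical, and the value `theta=0.9` is invented because the source gives no numerical threshold. The sketch uses the strict comparison C > θ.

```python
def fill_ratio(image):
    """Determination constant C = a / W for a binary image (1 = black).

    a is the number of black pixels and W is the area of the smallest
    rectangle enclosing them. A blank image yields 0.0 by convention
    (an assumption; the source does not discuss blank input).
    """
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        return 0.0
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    frame_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)  # W
    black_count = sum(sum(row) for row in image)                      # a
    return black_count / frame_area


def is_specific_character(image, theta=0.9):
    """Judge a normalized character image; theta is a hypothetical threshold."""
    return fill_ratio(image) > theta
```

A fully black 4 x 4 frame gives C = 1.0 and is judged a specific character, whereas a cross shape drawn in a 5 x 5 frame has 9 black pixels in a frame of area 25, so C = 0.36 and it is left for ordinary recognition.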

The specific character determination unit 255 further identifies, from the width and height values of the character extracted by the width/height extraction unit 251, whether the specific character judged as above is a vertically long one such as "1" or a horizontally long one such as "一" or "-".

In this way the specific character determination unit 255 judges whether the input character is a vertically long or a horizontally long specific character, and the result is output to the control unit 5 in FIG. 1 through the communication unit 21 in FIG. 2. Because recognition processing is then unnecessary for that specific character, the control unit 5 instructs the recognition unit 3 not to perform it. Specific characters can thus be recognized quickly and accurately.
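The orientation decision and the skipping of recognition can be pictured with the following sketch. It is an illustration under assumptions, not the control logic of the control unit 5: `width` and `height` stand for the values extracted before stretching, `ratio` for the determination constant C, `recognize` for a hypothetical callback representing the pattern-matching recognizer, and `theta=0.9` for an invented threshold. How a tie between width and height is resolved is not stated in the source; the sketch arbitrarily treats it as horizontal.

```python
from enum import Enum


class SpecificKind(Enum):
    VERTICAL = "vertical"
    HORIZONTAL = "horizontal"


def judge(width, height, ratio, theta=0.9):
    """Return the kind of specific character, or None for an ordinary character."""
    if ratio <= theta:
        return None
    # Orientation comes from the size measured before stretching.
    # A tie is treated as horizontal here (an arbitrary assumption).
    return SpecificKind.VERTICAL if height > width else SpecificKind.HORIZONTAL


def handle_character(width, height, ratio, recognize, theta=0.9):
    """Skip the recognizer for specific characters; otherwise delegate to it."""
    kind = judge(width, height, ratio, theta)
    if kind is not None:
        return kind.value
    return recognize()
```

With this arrangement the costly `recognize` callback runs only for characters whose ratio does not exceed the threshold, which is the source of the speed gain described above.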

If a specific character has been written or printed at a slant, however, the determination method above may fail to identify it as a specific character. If standard patterns for specific characters such as "一", "1", and "-" are prepared in advance, as for other characters, the recognition unit 3 can still recognize a slanted specific character.

As described above, the character recognition device of this embodiment provides the normalization unit 25 of the preprocessing unit 2 with the width/height extraction unit 251, the width/height expansion/contraction unit 252, the character frame area calculation unit 253, the character component count measurement unit 254, and the specific character determination unit 255. The width/height expansion/contraction unit 252 normalizes the width and height of the input character.

When the determination constant C (= a/W), based on the area W of the smallest rectangular character frame enclosing the character as calculated by the character frame area calculation unit 253 and on the number of bits a making up the character as measured by the character component count measurement unit 254, is greater than the threshold θ, the specific character determination unit 255 judges that the recognition category of the character is a specific character such as "一", "1", or "-". The specific character determination unit 255 also judges, from the width and height values extracted by the width/height extraction unit 251, whether the character is vertically or horizontally long, and so identifies whether the specific character is a vertically long or a horizontally long one. The specific characters can therefore be identified quickly by simple processing, without the pattern-matching recognition performed in the recognition unit 3. In addition, because the specific characters are judged with the determination constant C, which is based on the number of bits making up the input character after its width and height have been normalized, they can be recognized reliably.

In the embodiment above, the character frame area calculation unit 253, the character component count measurement unit 254, and the specific character determination unit 255 used for specific character determination are provided in the normalization unit 25 of the preprocessing unit 2. The invention is not limited to this arrangement, however: these units may be provided independently of the normalization unit 25, or in the recognition unit 3 so that specific character determination runs as a preprocessing step of recognition.

In the embodiment above, a slanted specific character is recognized by the ordinary recognition processing of the recognition unit 3. The invention is not limited to this either. A second threshold θ0, smaller than the threshold θ, may be set, and specific characters (including ones slanted to some degree) judged provisionally against θ0. When the determination constant C lies in the range θ0 ≤ C ≤ θ, the normalization unit 25 applies rotation normalization to the character image information and the determination constant C is calculated again.

If the recalculated value C' of the determination constant satisfies C' > θ, the character may then be confirmed as a specific character.
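The two-threshold variant can be summarized as a small decision function. This is a sketch under stated assumptions: `fill_ratio` and `rotate_normalize` are hypothetical callbacks standing for the computation of C and for the rotation normalization of the normalization unit 25, neither of which is specified in detail in the source, and the values `theta=0.9` and `theta0=0.6` are invented for illustration.

```python
def judge_with_rotation_retry(image, fill_ratio, rotate_normalize,
                              theta=0.9, theta0=0.6):
    """Return True if the image is judged to be a specific character.

    fill_ratio(image) -> float computes the determination constant C.
    rotate_normalize(image) -> image removes the slant (hypothetical callback).
    theta0 < theta; both values are illustrative assumptions.
    """
    c = fill_ratio(image)
    if c > theta:
        # Clearly a specific character; no retry needed.
        return True
    if theta0 <= c <= theta:
        # Possibly a slanted specific character: deskew, then recompute C'.
        c_prime = fill_ratio(rotate_normalize(image))
        return c_prime > theta
    # Below theta0: treated as an ordinary character.
    return False
```

The rotation step runs only for characters in the intermediate band, so upright specific characters and clearly ordinary characters incur no extra cost.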

[Effects of the Invention] As is clear from the above, the character recognition device of this invention comprises a character frame area calculation unit, a character component count measurement unit, and a specific character determination unit. Working from character image information that has been stretched or shrunk by the width/height extraction unit and the width/height expansion/contraction unit, the specific character determination unit obtains the ratio of the number of character components measured by the character component count measurement unit to the area, calculated by the character frame area calculation unit, of the rectangular character frame that the input character touches at its top, bottom, left, and right, and judges an input character whose ratio falls within a predetermined range to be a specific character such as "一", "1", or "-". The specific characters can therefore be recognized quickly and accurately by simple processing, without recognition processing such as pattern matching.

Moreover, because the specific character determination unit is configured to distinguish vertically long specific characters from horizontally long ones based on the width and height values of the input character extracted by the width/height extraction unit, the device can tell the two kinds apart when judging a specific character and can recognize the specific characters still more accurately.

[Brief explanation of drawings]

FIG. 1 is a block diagram of the character recognition device according to this invention, FIG. 2 is a detailed block diagram of the preprocessing unit in FIG. 1, and FIG. 3 is a more detailed block diagram of the part of the normalization unit in FIG. 2 concerned with the specific character determination process.
DESCRIPTION OF SYMBOLS 1: input unit, 2: preprocessing unit, 3: recognition unit, 4: recognition result display unit, 5: control unit, 21: communication unit, 22: image receiving unit, 23: noise removal unit, 24: character segmentation unit, 25: normalization unit, 251: width/height extraction unit, 252: width/height expansion/contraction unit, 253: character frame area calculation unit, 254: character component count measurement unit, 255: specific character determination unit.

Claims

(1) A character recognition device in which a width/height extraction unit extracts the width and height of an input character from input character image information, a width/height expansion/contraction unit performs expansion/contraction processing, based on the extracted width and height values, so that the width and height of the input character take predetermined values, and the input character is recognized based on the character image information subjected to the expansion/contraction processing, the device comprising: a character frame area calculation unit that calculates, based on the character image information subjected to the expansion/contraction processing, the area of a rectangular character frame that the input character touches at its top, bottom, left, and right; a character component count measurement unit that measures, based on the character image information subjected to the expansion/contraction processing, the number of elements constituting the character area of the input character; and a specific character determination unit that obtains the ratio of the number of character components to the character frame area, based on the character frame area value from the character frame area calculation unit and the number of character components from the character component count measurement unit, and determines an input character for which the value of this ratio falls within a predetermined range to be a specific character.

(2) The character recognition device according to claim 1, wherein the specific character determination unit is configured to distinguish vertically long specific characters from horizontally long specific characters based on the width value and height value of the input character extracted by the width/height extraction unit.