JP3998439B2

JP3998439B2 - Image processing apparatus, image processing method, and program causing computer to execute these methods

Info

Publication number: JP3998439B2
Application number: JP2001212731A
Authority: JP
Inventors: 昌利大西
Original assignee: Glory Ltd
Current assignee: Glory Ltd
Priority date: 2001-07-12
Filing date: 2001-07-12
Publication date: 2007-10-24
Anticipated expiration: 2021-07-12
Also published as: JP2003030644A

Description

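[0001]
[Technical Field of the Invention]
The present invention relates to an image processing apparatus, an image processing method, and a program causing a computer to execute the method, which remove noise components contained in an input image when the input image is compared with a reference image to collate the two images. In particular, it relates to an image processing apparatus, an image processing method, and a program that can discriminate a form efficiently and accurately when the form is discriminated using characters in the form.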
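[0002]
[Prior Art]
Techniques for discriminating the type of a form without relying on a discrimination code or discrimination mark are known. For example, Japanese Patent Application Laid-Open No. H4-268685 discloses a method for discriminating form types in which horizontal and vertical ruled-line segments are extracted from input form image data, the image is divided into a plurality of areas, the direction, length, and position of the line segments extracted in each area are converted into a vector pattern, and the pattern is compared against the feature vectors of standard patterns.
[0003]
When line segments are extracted and used as features as in this prior art, the segments become broken because of scanner characteristics, rotation correction, and the like, so interpolation processing that joins the segments is required. Such interpolation, however, may join segments that are actually separate.
[0004]
For this reason, techniques have been proposed that discriminate forms accurately by preventing the drop in discrimination accuracy caused by image variations. For example, Japanese Patent Application No. 2000-95514, filed by the present applicant, discloses a form discriminating apparatus that discriminates the form type using the ratio of black pixels located in adjacent pixel rows in the horizontal and vertical directions.
[0005]
However, when several forms have exactly the same ruled-line features, they cannot be distinguished by those features, so this prior art discriminates the form by performing character recognition on characters contained in the image.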
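[0006]
[Problems to Be Solved by the Invention]
However, when characters are simply cut out of the image and recognized, as in this prior art, the form cannot be identified accurately if the image contains a lot of noise. In particular, forms are sometimes filled in through carbon paper, and smudging from the carbon paper often prevents characters from being read correctly.
[0007]
Therefore, in a scheme that discriminates forms using characters in the form, how accurately character recognition can be performed is an important issue for accurate discrimination. In particular, affiliated card companies often use forms with the same ruled-line layout, so character recognition accuracy strongly affects form discrimination accuracy.
[0008]
The present invention has been made to solve the above problems of the prior art, and its object is to provide an image processing apparatus, an image processing method, and a program causing a computer to execute the method that can discriminate a form efficiently and accurately when the form is discriminated using characters in the form.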
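[0009]
[Means for Solving the Problems]
To solve the above problems and achieve the object, the image processing apparatus according to the invention of claim 1 is an image processing apparatus that removes noise components contained in an input image when the input image is compared with a reference image to collate the two images, comprising: template generation means for generating a template in which a pixel is set to black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black, and all other pixels are set to white; and noise-cut image generation means for performing die-cutting, in which pixels of the input image other than those corresponding to the black pixels of the template generated by the template generation means are set to white, thereby generating a noise-cut image in which the noise components have been removed from the input image.
[0010]
The image processing apparatus according to the invention of claim 2 is the apparatus of claim 1, wherein the template generation means comprises minimum value filtering means for applying a minimum value filter to the reference image to dilate its black pixel portions, and binarization means for binarizing the image to which the minimum value filter has been applied by the minimum value filtering means.
[0011]
The image processing apparatus according to the invention of claim 3 is the apparatus of claim 1 or 2, wherein, among the pixels forming the noise-cut image generated by the noise-cut image generation means, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets.
[0012]
The image processing apparatus according to the invention of claim 4 is the apparatus of claim 1, 2, or 3, further comprising fraud determination means for determining whether the input image is a fraudulent image based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image generated by the noise-cut image generation means.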
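[0013]
The image processing method according to the invention of claim 5 is an image processing method that removes noise components contained in an input image when the input image is compared with a reference image to collate the two images, comprising: a template generation step of generating a template in which a pixel is set to black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black, and all other pixels are set to white; and a noise-cut image generation step of performing die-cutting, in which pixels of the input image other than those corresponding to the black pixels of the template generated in the template generation step are set to white, thereby generating a noise-cut image in which the noise components have been removed from the input image.
[0014]
The image processing method according to the invention of claim 6 is the method of claim 5, wherein the template generation step comprises a minimum value filtering step of applying a minimum value filter to the reference image to dilate its black pixel portions, and a binarization step of binarizing the image to which the minimum value filter has been applied in the minimum value filtering step.
[0015]
The image processing method according to the invention of claim 7 is the method of claim 5 or 6, wherein, among the pixels forming the noise-cut image generated in the noise-cut image generation step, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets.
[0016]
The image processing method according to the invention of claim 8 is the method of claim 5, 6, or 7, further comprising a fraud determination step of determining whether the input image is a fraudulent image based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image generated in the noise-cut image generation step.
[0017]
The program according to the invention of claim 9 causes a computer to execute the method according to any one of claims 5 to 8.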
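[0018]
[Embodiments of the Invention]
Preferred embodiments of the image processing apparatus, the image processing method, and the program causing a computer to execute the method according to the present invention are described in detail below with reference to the accompanying drawings. In this embodiment, the invention is applied to a form discriminating apparatus.
[0019]
FIG. 1 is a functional block diagram showing the configuration of the form discriminating apparatus used in this embodiment. The form discriminating apparatus 10 shown in the figure performs a first-stage form discrimination using ruled-line features and, when the first stage cannot discriminate the form accurately, a second-stage discrimination using characters in the form. The invention is characterized particularly by the second stage.
[0020]
In the first-stage discrimination, the features of reference images are registered in a dictionary in advance. When an image of a form to be discriminated is input, the ruled-line features of this input image are extracted and compared with the dictionary to determine the form type.
[0021]
The feature used by the form discriminating apparatus 10 is a black pixel ratio that reflects the ruled lines preprinted on the form and constituting its essential content. The apparatus 10 performs no line-segment interpolation, because such interpolation may lower discrimination accuracy. The black pixel ratio is the proportion of black pixels contained in a pixel row of a predetermined section extending horizontally or vertically from the pixel of interest, and is computed for each pixel of the image data.
[0022]
In the second-stage discrimination, the operator enters, as text, the characters contained in a specific area of the form together with the reference image, and detailed discrimination is performed by comparing the characters in that specific area of the input image with this text. Detailed discrimination is needed because the form type sometimes cannot be determined from the ruled-line-based black pixel ratio alone, for example when the ruled lines are identical and only the printed characters differ.
[0023]
Specifically, the form discriminating apparatus 10 applies a minimum value filter to the reference character string image to dilate its black pixel portions, binarizes the result, and creates a template from this binary image and the input character string image. It then die-cuts the input character string image with this template to generate a noise-cut image in which noise components have been removed from the input character string image. After the characters in the noise-cut image are recognized, they are compared with the text string entered in advance by the operator and stored, to determine whether the characters match.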
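[0024]
As shown in FIG. 1, the form discriminating apparatus 10 comprises an image input unit 101, a ruled-line feature extraction unit 102, a dictionary creation unit 103, a ruled-line feature dictionary 104, a specific area dictionary 105, a ruled-line feature matching unit 106, a detail determination unit 107, and an output unit 108.
[0025]
The image input unit 101 is a scanner that optically inputs image data of a form. When set to the "reference image input mode", in which a reference image is input (selected with a switching button or the like, not shown), the image input unit 101 outputs to the ruled-line feature extraction unit 102 the input grayscale image data together with binary image data in which white pixels have the value '0' and black pixels the value '1'. When set to the "collation mode", in which forms are collated, it outputs the input grayscale image data to the detail determination unit 107 and the binary image to the ruled-line feature extraction unit 102.
[0026]
The ruled-line feature extraction unit 102 is a processing unit that extracts ruled-line features (feature values) from the binary image data received from the image input unit 101. In the reference image input mode it outputs the ruled-line features and the grayscale image data to the dictionary creation unit 103; in the collation mode it outputs the ruled-line features to the ruled-line feature matching unit 106.
[0027]
The dictionary creation unit 103 is a processing unit that, on receiving the ruled-line features and grayscale image data of a reference image from the ruled-line feature extraction unit 102, creates or adds to the ruled-line feature dictionary 104 and the specific area dictionary 105 based on this information.
[0028]
Specifically, when creating or adding to the ruled-line feature dictionary 104, the dictionary creation unit 103 registers the ruled-line features in the ruled-line feature dictionary 104 in association with the form type.
[0029]
When creating or adding to the specific area dictionary 105, the dictionary creation unit 103 applies a minimum value filter to the image data of the specific area of the form (the reference character string image) to dilate the black pixel portions, binarizes the result, and registers the resulting binary image data and the text string entered in advance by the operator in the specific area dictionary 105 in association with the form type.
[0030]
The ruled-line feature dictionary 104 stores ruled-line features for each form type. The specific area dictionary 105 stores, for each form type, the binary image of the specific area and the text string entered by the operator.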
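[0031]
The ruled-line feature matching unit 106 is a processing unit that matches the ruled-line features of the image data of the form to be discriminated against the ruled-line features of each reference image stored in the ruled-line feature dictionary 104, selects a plurality of candidates based on their distance from the image data to be discriminated, and outputs them to the detail determination unit 107.
[0032]
For this matching, methods widely used in conventional character recognition can be applied; for example, identification can be based on the Euclidean distance.
[0033]
The detail determination unit 107 is a processing unit that determines in detail which of the candidates received from the ruled-line feature matching unit 106 is closest to the input image. It cuts out the image data of a specific area (the input character string image) from the image data of the form to be discriminated received from the image input unit 101, and collates the text string obtained by character recognition of this input character string image with the text string registered in the specific area dictionary 105.
[0034]
For example, when two forms have identical ruled lines, as shown in FIG. 7(a) and (b), the form type cannot be determined from ruled-line features alone, so the characters contained in the specific areas of the input image and the reference image (character strings, logos, and the like that characterize the form, such as the form title or company name) are cut out and compared.
[0035]
The detail determination unit 107 does not merely compare the input character string image with the text string of the reference image, however. It makes a noise-tolerant determination that allows for noise in the input character string image. This point is described in detail later.
[0036]
The output unit 108 is a processing unit that outputs the determination result received from the detail determination unit 107. The result can be the registered form closest to the form to be discriminated, or a ranked list of candidates.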
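[0037]
Next, the ruled-line feature extraction performed by the ruled-line feature extraction unit 102 shown in FIG. 1 is described in more detail. FIG. 2 is an explanatory diagram illustrating the concept of this extraction processing.
[0038]
As shown in FIG. 2(a) and (b), the ruled-line feature extraction unit 102 calculates, for each of the horizontal and vertical directions and centered on the pixel of interest, the proportion of black pixels (black pixel ratio) contained in section Pi (i = 1, 2, 3, ..., K), whose length is pi × 2 + 1 dots.
[0039]
Specifically, for horizontal section 1 shown in FIG. 2(a), the pixel values are examined up to 8 pixels to the left and right of the pixel of interest. Here,
Section length = 8 × 2 + 1 = 17 dots
Number of black pixels in the section = 11 dots.
[0040]
However, to eliminate the influence of noise and of vertical ruled lines (ruled lines running in a direction other than the counting direction), runs of black pixels whose length is at or below a threshold are not counted. In FIG. 2(a), for example, black pixels A and B each have a run length of 1 and are therefore not counted.
[0041]
Consequently,
Black pixel ratio = (11 - 2) / 17 = 0.529.
Horizontal sections 2 and 3, and the vertical direction shown in FIG. 2(b), are handled in the same way.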
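As a minimal sketch of the calculation in [0039] to [0041], the Python below counts black pixels in a 17-dot window while skipping runs whose length does not exceed a threshold. The function name, the parameter names, and the example row are assumptions made for illustration. The pixel layout of FIG. 2(a) is not reproduced in the text, so the row is merely one arrangement with 11 black pixels of which two are isolated.

```python
def black_pixel_ratio(row, center, half_len, run_threshold=1):
    """Black pixel ratio in the window center-half_len .. center+half_len.

    row holds 1 for black and 0 for white. Runs of black pixels whose
    length is at or below run_threshold are not counted (assumed default 1).
    """
    lo = max(0, center - half_len)
    hi = min(len(row), center + half_len + 1)
    counted = 0
    run = 0
    for px in row[lo:hi] + [0]:  # trailing 0 closes a run that reaches the edge
        if px == 1:
            run += 1
        else:
            if run > run_threshold:
                counted += run
            run = 0
    # Divide by the nominal section length (2 * half_len + 1 dots).
    return counted / (2 * half_len + 1)


# Hypothetical 17-dot row: 11 black pixels, two of them isolated.
row = [0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0]
ratio = black_pixel_ratio(row, center=8, half_len=8)
print(round(ratio, 3))  # 0.529, i.e. (11 - 2) / 17
```

Dividing by the full section length even when the window is clipped at the image border is a choice of this sketch; the text does not say how borders are handled.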
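[0042]
The form image is then divided into M × N blocks, and the black pixel ratios of the pixels in each block are summed to give the ruled-line feature. The ruled-line feature therefore has M × N × 2 (horizontal and vertical) × K dimensions.
[0043]
If a ratio is added only when it exceeds a threshold, variable elements such as noise and handwritten characters can be excluded, because handwritten characters and noise consist of line segments that are short compared with ruled lines and therefore yield small black pixel ratios within a section.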
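The block accumulation of [0042] and [0043] might look like the following sketch for a single direction and a single section. The function name, the mapping of pixels to blocks by integer division, and the sample values are assumptions; a full feature vector would concatenate such lists for both directions and all K sections, giving M × N × 2 × K values.

```python
def block_features(ratio_map, m, n, min_ratio=0.0):
    """Sum per-pixel black pixel ratios into m x n blocks (row-major list).

    Only ratios strictly greater than min_ratio are added, which drops
    the small ratios produced by noise and handwriting.
    """
    h, w = len(ratio_map), len(ratio_map[0])
    feats = [0.0] * (m * n)
    for y in range(h):
        by = y * m // h          # block row containing pixel row y
        for x in range(w):
            bx = x * n // w      # block column containing pixel column x
            r = ratio_map[y][x]
            if r > min_ratio:
                feats[by * n + bx] += r
    return feats


# Hypothetical 4 x 4 ratio map divided into 2 x 2 blocks.
ratio_map = [
    [1.0, 1.0, 0.2, 0.0],
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.3, 0.3],
    [0.0, 0.0, 0.3, 0.9],
]
features = block_features(ratio_map, m=2, n=2, min_ratio=0.4)
print(features)  # [3.5, 0.0, 0.0, 0.9]
```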
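[0044]
Next, an example of ruled-line feature extraction by the ruled-line feature extraction unit 102 shown in FIG. 1 is described. FIG. 3 is an explanatory diagram showing this example.
[0045]
Given an input image containing ruled lines in the shape of the character "ロ" (a rectangle), as shown in FIG. 3(a), and taking a section length of 3 dots and ignoring the run-length threshold, the per-pixel black pixel ratios in the horizontal direction are as shown in FIG. 3(b) and those in the vertical direction as shown in FIG. 3(c).
[0046]
When the image is divided into 3 × 3 blocks as shown in FIG. 3(d) and the horizontal ratios of FIG. 3(b) are summed per block, the ruled-line feature shown in FIG. 3(e) is obtained. Likewise, summing the vertical ratios of FIG. 3(c) per block gives the ruled-line feature shown in FIG. 3(f).
[0047]
Because the ruled-line feature extraction unit 102 uses black pixel ratios and the ruled-line features derived from them, no processing to interpolate breaks in ruled-line segments is needed, and features can be obtained stably even when ruled-line segments are broken by processing such as rotation correction.
[0048]
Using several sections, as shown in FIG. 2, captures the features of ruled lines of various lengths faithfully. Although not done in this embodiment, the ruled lines in the input image may be thickened before feature extraction to suppress variation due to rotation. Various techniques well known in character recognition for raising the recognition rate, such as blurring, may also be applied to obtain features that are robust to positional shift.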
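[0049]
Next, the procedure for registering a form in the dictionaries as a comparison target for discrimination is described. FIG. 4 is a flowchart showing this procedure.
[0050]
As shown in the figure, the image of the form is first captured from the image input unit 101 (step S401), and the image is preprocessed as necessary (step S402). This preprocessing does not include line-segment interpolation.
[0051]
The ruled-line feature extraction unit 102 then calculates the horizontal and vertical black pixel ratios for the sections specified in advance (step S403) and sums these ratios per block to extract the ruled-line features (step S404).
[0052]
The dictionary creation unit 103 registers the ruled-line features extracted by the ruled-line feature extraction unit 102 in the ruled-line feature dictionary 104 (step S405), then matches them against the ruled-line features previously registered in the ruled-line feature dictionary 104 to check whether the form can be discriminated (steps S406 to S407).
[0053]
If the form cannot be discriminated (No at step S407), specific area information (the binary image data of the specific area and the text string) is additionally registered in the specific area dictionary 105 (step S408), and this is repeated. The processing ends once discrimination becomes possible (Yes at step S407).
[0054]
For example, when detailed determination is performed using character strings, the character string (text data) in a distinctive specific area of each form (a title, company name, or similar character string) and its position are registered in advance.
[0055]
Through this series of processes, the ruled-line features of the various forms are registered in the ruled-line feature dictionary 104, and the binary image data and text data of the specific areas are registered in the specific area dictionary 105, before any form is discriminated.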
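[0056]
Next, the form discrimination procedure of the form discriminating apparatus 10 shown in FIG. 1 is described. FIG. 5 is a flowchart showing this procedure.
[0057]
As shown in the figure, to discriminate the type of a form, the image of the form is first captured from the image input unit 101 (step S501), and the image is preprocessed as necessary (step S502). This preprocessing does not include line-segment interpolation.
[0058]
The ruled-line feature extraction unit 102 then calculates the horizontal and vertical black pixel ratios for the sections specified in advance (step S503) and sums these ratios per block to extract the ruled-line features (step S504).
[0059]
The ruled-line feature matching unit 106 then matches the ruled-line features extracted by the ruled-line feature extraction unit 102 against those registered in the ruled-line feature dictionary 104 (step S505), checks whether the distance value is within a predetermined threshold, and sorts the candidate forms in order of increasing distance.
[0060]
If the value is within the predetermined threshold, the detail determination unit 107 performs the detailed determination described later (step S506) and the result is output (step S507). If it is not, the result is output as it is via the detail determination unit 107 (step S507).
[0061]
That is, if the first- and second-ranked candidates are separated by at least a certain threshold, the first-ranked candidate is output as the determination result. If they are not, the character string in the specific area is recognized, and if that still does not settle the matter, another specific area is recognized as well. The detailed determination procedure is described later.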
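The candidate ranking and the decision to fall back on detailed determination can be sketched as follows, under the reading given in [0061]: candidates are sorted by Euclidean distance (one of the measures the text mentions), and detailed determination is triggered when the gap between the first and second candidates is smaller than a threshold. The function names, the dictionary layout, the feature values, and the threshold values are all hypothetical.

```python
import math


def rank_candidates(feature, dictionary):
    """Return (distance, form_type) pairs sorted nearest first."""
    return sorted(
        (math.dist(feature, ref), form_type)
        for form_type, ref in dictionary.items()
    )


def needs_detail_check(ranked, gap_threshold):
    """True when the top two candidates are too close to separate."""
    if len(ranked) < 2:
        return False
    return ranked[1][0] - ranked[0][0] < gap_threshold


# Hypothetical ruled-line feature dictionary with three registered forms.
dictionary = {
    "form_A": [3.5, 0.0, 0.0, 0.9],
    "form_B": [3.4, 0.0, 0.0, 1.0],
    "form_C": [0.0, 2.0, 2.0, 0.0],
}
ranked = rank_candidates([3.5, 0.0, 0.0, 0.9], dictionary)
print(ranked[0][1])                     # form_A
print(needs_detail_check(ranked, 0.5))  # True: form_B is almost as close
```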
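[0062]
Through this series of processes, forms can be discriminated using the ruled-line features and the characters in the specific areas, based on the ruled-line feature dictionary 104 and the specific area dictionary 105.
[0063]
Next, the configuration of the dictionary creation unit 103 shown in FIG. 1 is described. FIG. 8 is a functional block diagram showing this configuration. As shown in the figure, the dictionary creation unit 103 comprises a ruled-line feature dictionary creation unit 800, which creates the ruled-line feature dictionary 104, and a specific area dictionary creation unit 810, which creates the specific area dictionary 105.
[0064]
The specific area dictionary creation unit 810 has a cut-out processing unit 811, a minimum value filtering unit 812, and a binarization processing unit 813. The minimum value filtering unit 812 corresponds to the minimum value filtering means of claim 2, and the binarization processing unit 813 to the binarization means of claim 2.
[0065]
The cut-out processing unit 811 is a processing unit that cuts out the reference character string image located at a specific position in the reference image. The position to cut out is assumed to be set in advance.
[0066]
The minimum value filtering unit 812 is a processing unit that applies a minimum value filter to the reference character string image to dilate the black pixel portions in it. The minimum value filter is a mask-pattern operation that sets the pixel of interest to the smallest pixel value found among its 8 neighboring pixels.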
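A minimum value filter of this kind can be sketched in a few lines of Python on a grayscale image stored as a list of rows, where low values are dark. Two points are assumptions of this sketch rather than statements from the text: the pixel of interest itself is included in the 3 × 3 window (the usual form of the filter), and the window is simply clipped at the image border.

```python
def min_filter(img):
    """3 x 3 minimum filter: each pixel takes the darkest value in its window."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(
                img[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            )
    return out


# Hypothetical 5 x 5 white image (255) with one black pixel (0) in the center.
img = [[255] * 5 for _ in range(5)]
img[2][2] = 0
dilated = min_filter(img)
black_count = sum(row.count(0) for row in dilated)
print(black_count)  # 9: the single black pixel has grown to a 3 x 3 square
```

Because black is the low value, taking the minimum thickens the strokes, which is what lets the later template tolerate a small positional shift of the input.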
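[0067]
The binarization processing unit 813 is a processing unit that binarizes the grayscale image to which the minimum value filtering unit 812 has applied the minimum value filter. For this binarization, the discriminant criterion method, which uses the gray-level histogram, or a similar method can be applied.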
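The text does not spell out the discriminant criterion method. Assuming it denotes the widely used histogram thresholding that maximizes the between-class variance (often called Otsu's method), a sketch could look like this. The function names and the sample image are hypothetical, and the output uses 1 for black and 0 for white, matching the binary convention stated for the image input unit 101.

```python
def otsu_threshold(pixels, levels=256):
    """Threshold t that maximizes between-class variance (dark class: <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in the dark class
    sum0 = 0    # sum of gray levels in the dark class
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(img, t):
    """Return a binary image with 1 = black (gray level <= t), 0 = white."""
    return [[1 if p <= t else 0 for p in row] for row in img]


# Hypothetical 4 x 4 grayscale patch: dark strokes (10) on a light ground (200).
gray = [
    [10, 200, 200, 200],
    [200, 10, 10, 200],
    [200, 10, 200, 200],
    [200, 200, 10, 200],
]
t = otsu_threshold([p for row in gray for p in row])
binary = binarize(gray, t)
```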
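[0068]
Next, the processing procedure of the specific area dictionary creation unit 810 shown in FIG. 8 is described. FIG. 9 is a flowchart showing this procedure, and FIG. 10 is an explanatory diagram showing an example of a character string image.
[0069]
As shown in FIG. 9, when a reference image is input to the specific area dictionary creation unit 810 (step S901), the cut-out processing unit 811 cuts out the reference character string image located in the specific area of the reference image, such as that shown in FIG. 10(a) (step S902).
[0070]
The minimum value filtering unit 812 then applies a minimum value filter to this reference character string image, dilating its black pixel portions as shown in FIG. 10(b) (step S903).
[0071]
The binarization processing unit 813 binarizes the filtered reference character string image by the discriminant criterion method or the like, obtaining a binary image such as that shown in FIG. 10(c) (step S904). The binary image thus obtained is stored in the specific area dictionary 105 together with the text string entered separately by the operator.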
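[0072]
Next, the configuration of the detail determination unit 107 shown in FIG. 1 is described. FIG. 11 is a functional block diagram showing this configuration. As shown in the figure, the detail determination unit 107 has a template creation unit 1101, a noise-cut image extraction unit 1102, a fraud determination unit 1103, a character recognition processing unit 1104, and a character determination unit 1105. The template creation unit 1101 corresponds to the template generation means of claim 1, and the noise-cut image extraction unit 1102 to the noise-cut image generation means. The fraud determination unit 1103 corresponds to the fraud determination means of claim 4.
[0073]
The template creation unit 1101 is a processing unit that creates a template in which a pixel is black where both the pixel of the binary image read from the specific area dictionary 105 (the binary image obtained by binarizing the minimum-value-filtered reference character string image) and the corresponding pixel of the binary image of the input character string image are black.
[0074]
The noise-cut image extraction unit 1102 is a processing unit that die-cuts the pixel values of the template portion from the input character string image, extracting a noise-cut image in which noise components have been removed from the input character string image.
[0075]
The fraud determination unit 1103 is a processing unit that judges the possibility of fraud that abuses the template-based die-cutting of the invention. If every pixel of the input character image is black, die-cutting with the template simply cuts out each character of the reference character image. The fraud determination unit 1103 therefore treats such input images as fraudulent and rejects them.
[0076]
The character recognition processing unit 1104 is a processing unit that recognizes and cuts out each character contained in the noise-cut image. The character determination unit 1105 is a processing unit that compares the characters cut out of the noise-cut image with the text string read from the specific area dictionary to determine whether each character matches.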
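[0077]
Next, the processing procedure of the detail determination unit 107 shown in FIG. 11 is described. FIG. 12 is a flowchart showing this procedure, and FIG. 13 is an explanatory diagram showing an example of a character string image.
[0078]
As shown in FIG. 12, when an input image is input to the detail determination unit 107 (step S1201), the template creation unit 1101 cuts out the input character string image located in the specific area of the input image (step S1202). Because an input image, unlike a reference image, is often accompanied by positional shift, a slightly wider character string search area is set as shown in FIG. 13(a), and the input character string image is then cut out as shown in FIG. 13(b). In this example, a receipt stamp is superimposed on the input character string image as noise.
[0079]
The template creation unit 1101 then binarizes this input character string image using the discriminant criterion method or the like, as shown in FIG. 13(c) (step S1203), and creates a template from this binarized input character string image and the binary image of the reference character string image retrieved from the specific area dictionary 105 (step S1204).
[0080]
Specifically, it generates a template in which a pixel is black where both the pixel of the binary image obtained by binarizing the reference character string image and the corresponding pixel of the binary image of the input character string image are black.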
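In terms of binary images with 1 for black and 0 for white, the template described here is a pixel-wise logical AND of the two images. The sketch below illustrates this; the function name and the tiny sample images are hypothetical, and both images are assumed to have the same size and to be already aligned.

```python
def make_template(ref_bin, in_bin):
    """Template pixel is 1 (black) only where both binary images are black."""
    return [
        [r & i for r, i in zip(ref_row, in_row)]
        for ref_row, in_row in zip(ref_bin, in_bin)
    ]


# Hypothetical 3 x 4 binary images (1 = black, 0 = white).
ref_bin = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
in_bin = [
    [1, 0, 0, 1],   # the right-hand 1s are noise, e.g. part of a stamp
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
template = make_template(ref_bin, in_bin)
print(template)  # [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
```

Noise that falls outside the dilated reference strokes never appears in the template, which is why it disappears in the later die-cutting.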
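[0081]
The noise-cut image extraction unit 1102 then uses this template to die-cut the pixel values (gray-level data) of the input character string image, creating a noise-cut image such as that shown in FIG. 13(d) (step S1205). Each pixel forming the background of the noise-cut image is given, as its pixel value, the representative gray level of the high-value class (white) obtained when the gray-level histogram of the input character string image is split in two by the discriminant criterion method.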
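A sketch of this die-cutting step follows. It keeps the original gray level wherever the template is black and fills every other pixel with a background value. The text only says that the background receives a "representative" gray level of the white class, so taking the rounded mean of the pixels above the split threshold is an assumption, as are the function name, the fallback value 255, and the sample data.

```python
def noise_cut(gray, template, t):
    """Keep gray levels under black template pixels; fill the rest with white.

    t is the threshold that splits the gray-level histogram in two. The
    fill value is the rounded mean of the light class (levels > t), which
    is an assumed choice of representative value.
    """
    white = [p for row in gray for p in row if p > t]
    background = round(sum(white) / len(white)) if white else 255
    return [
        [g if m == 1 else background for g, m in zip(gray_row, mask_row)]
        for gray_row, mask_row in zip(gray, template)
    ]


# Hypothetical 3 x 3 input: dark pixels at (0,2) and (2,0) are noise.
gray = [
    [20, 240, 30],
    [250, 25, 245],
    [40, 235, 230],
]
template = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
cut = noise_cut(gray, template, t=128)
print(cut)  # [[20, 240, 240], [240, 25, 240], [240, 240, 240]]
```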
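[0082]
The fraud determination unit 1103 binarizes this noise-cut image using the discriminant criterion method or the like, as shown in FIG. 13(e) (step S1206), and then checks how much of the noise has been removed. Specifically, it subtracts the number of black pixels in the binary image of FIG. 13(e) from the number in the binary image of FIG. 13(c) to obtain the amount of change (step S1207), and divides this amount by the total number of pixels in the input character string image (step S1208).
[0083]
It then checks whether this value is at or below a predetermined value (step S1209). If it is not (No at step S1209), the input image is rejected (step S1210). A large value means a greater likelihood that the template-based die-cutting is being exploited fraudulently, so when the value exceeds the predetermined value the image is rejected as possibly fraudulent. A value of about 0.07 (7%) can be used as the predetermined value.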
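Steps S1207 to S1210 amount to a simple ratio test, sketched below on binary images with 1 for black. The function names and the sample images are hypothetical; the default limit of 0.07 is the example value given in the text.

```python
def count_black(bin_img):
    """Number of black (1) pixels in a binary image."""
    return sum(sum(row) for row in bin_img)


def is_suspicious(in_bin, cut_bin, max_ratio=0.07):
    """True when die-cutting removed more than max_ratio of all pixels."""
    total = sum(len(row) for row in in_bin)
    change = count_black(in_bin) - count_black(cut_bin)
    return change / total > max_ratio


# Hypothetical 10 x 10 cases. The noise-cut image has 20 black pixels.
cut_bin = [[1] * 10 for _ in range(2)] + [[0] * 10 for _ in range(8)]
all_black = [[1] * 10 for _ in range(10)]            # fully blackened input
lightly_noisy = [row[:] for row in cut_bin]
lightly_noisy[5][0] = lightly_noisy[5][1] = lightly_noisy[5][2] = 1

print(is_suspicious(all_black, cut_bin))      # True: (100 - 20) / 100 = 0.80
print(is_suspicious(lightly_noisy, cut_bin))  # False: (23 - 20) / 100 = 0.03
```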
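[0084]
For convenience, the subsequent character recognition and character collation are not described here; they are performed using known character recognition techniques.
[0085]
As described above, according to this embodiment, the template creation unit 1101 creates a template in which a pixel is black where both the pixel of the binary image obtained by binarizing the minimum-value-filtered reference character string image and the corresponding pixel of the binary image of the input character string image are black, and the noise-cut image extraction unit 1102 die-cuts the pixel values of the template portion from the input character string image to extract a noise-cut image with the noise components removed. Noise in the input image can therefore be removed efficiently.
[0086]
Although this embodiment applies the minimum value filter to the reference character string image before binarizing it, the invention is not limited to this; the image may be binarized directly without the filter. The filter is applied to allow for positional shift of the input image, so when little shift occurs the reference character string image can simply be binarized as it is.
[0087]
Likewise, although this embodiment registers the text string entered by the operator in the specific area dictionary 105, the invention is not limited to this; the character string contained in the specific area of the reference image may instead be obtained by character recognition or the like and registered in the specific area dictionary 105.
[0088]
For convenience, this embodiment has been described for the case where character portions have low gray levels (black) and the background has high gray levels (white), but the invention is not limited to this and also applies when character portions have high gray levels (white) and the background has low gray levels (black). For example, when white-on-black (reversed) characters are printed on a form, the character portions have high gray levels (white) and the background has low gray levels (black).
[0089]
Even in that case, as described in detail above, a minimum value filter is applied to the binary image obtained by binarizing the reference character string image to dilate the black pixel portions (the background of the reversed characters); a template is created in which a pixel is black where both the pixel of this image and the corresponding pixel of the binarized input character string image are black; and the pixel values of the template portion are die-cut from the input character string image to extract a noise-cut image with noise components removed. Forms containing white-on-black characters can thus also be discriminated efficiently and accurately.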
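[0090]
[Effects of the Invention]
As described in detail above, according to the invention of claim 1, a template is generated in which a pixel is black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black and all other pixels are white, and die-cutting is performed in which pixels of the input image other than those corresponding to the black pixels of the template are set to white, generating a noise-cut image with the noise components removed from the input image. This yields an image processing apparatus capable of removing noise efficiently.
[0091]
According to the invention of claim 2, a minimum value filter is applied to the reference image to dilate its black pixel portions and the filtered image is binarized. This yields an image processing apparatus that can also cope efficiently with positional shift of the input image.
[0092]
According to the invention of claim 3, among the pixels forming the noise-cut image, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets. This yields an image processing apparatus that can reconstruct appropriate pixel values for pixels that are not die-cut targets.
[0093]
According to the invention of claim 4, whether the input image is a fraudulent image is determined based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image. This yields an image processing apparatus that can efficiently prevent fraud that exploits template-based die-cutting.
[0094]
According to the invention of claim 5, a template is generated in which a pixel is black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black and all other pixels are white, and die-cutting is performed in which pixels of the input image other than those corresponding to the black pixels of the template are set to white, generating a noise-cut image with the noise components removed from the input image. This yields an image processing method capable of removing noise efficiently.
[0095]
According to the invention of claim 6, a minimum value filter is applied to the reference image to dilate its black pixel portions and the filtered image is binarized. This yields an image processing method that can also cope efficiently with positional shift of the input image.
[0096]
According to the invention of claim 7, among the pixels forming the noise-cut image, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets. This yields an image processing method that can reconstruct appropriate pixel values for pixels that are not die-cut targets.
[0097]
According to the invention of claim 8, whether the input image is a fraudulent image is determined based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image. This yields an image processing method that can efficiently prevent fraud that exploits template-based die-cutting.
[0098]
According to the invention of claim 9, the program causes a computer to execute the method according to any one of claims 5 to 8, so the operations of any one of claims 5 to 8 can be realized by a computer.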
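[Brief Description of the Drawings]
[FIG. 1] A functional block diagram showing the configuration of the form discriminating apparatus used in the embodiment of the invention.
[FIG. 2] An explanatory diagram illustrating the concept of ruled-line feature extraction by the ruled-line feature extraction unit shown in FIG. 1.
[FIG. 3] An explanatory diagram showing an example of ruled-line feature extraction by the ruled-line feature extraction unit shown in FIG. 1.
[FIG. 4] A flowchart showing the procedure for registering a form in the dictionaries as a comparison target for discrimination.
[FIG. 5] A flowchart showing the form discrimination procedure of the form discriminating apparatus shown in FIG. 1.
[FIG. 6] A diagram showing an example of a form to be discriminated in this embodiment.
[FIG. 7] An explanatory diagram of forms subjected to detailed determination by the detail determination unit shown in FIG. 1.
[FIG. 8] A functional block diagram showing the configuration of the dictionary creation unit shown in FIG. 1.
[FIG. 9] A flowchart showing the processing procedure of the specific area dictionary creation unit shown in FIG. 8.
[FIG. 10] An explanatory diagram showing an example of a character string image.
[FIG. 11] A functional block diagram showing the configuration of the detail determination unit shown in FIG. 1.
[FIG. 12] A flowchart showing the processing procedure of the detail determination unit shown in FIG. 11.
[FIG. 13] An explanatory diagram showing an example of a character string image.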
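[Description of Reference Numerals]
10 Form discriminating apparatus
101 Image input unit
102 Ruled-line feature extraction unit
103 Dictionary creation unit
104 Ruled-line feature dictionary
105 Specific area dictionary
106 Ruled-line feature matching unit
107 Detail determination unit
108 Output unit
800 Ruled-line feature dictionary creation unit
810 Specific area dictionary creation unit
811 Cut-out processing unit
812 Minimum value filtering unit
813 Binarization processing unit
1101 Template creation unit
1102 Noise-cut image extraction unit
1103 Fraud determination unit
1104 Character recognition processing unit
1105 Character determination unit
[0001]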
BACKGROUND OF THE INVENTION
The present invention relates to an image processing apparatus, an image processing method, and a program for causing a computer to execute the method, which remove noise components included in an input image when the input image is compared with a reference image to collate both images. In particular, the present invention relates to an image processing apparatus, an image processing method, and a program that can efficiently and accurately discriminate a form when the form is discriminated using characters in the form.
[0002]
[Prior art]
Conventionally, techniques are known for discriminating the type of a form without relying on a discrimination code or a discrimination mark. For example, Japanese Patent Application Laid-Open No. H4-268685 discloses a form type discrimination method that extracts horizontal and vertical ruled-line segments from input form image data, divides the image into multiple areas, converts the direction, length, and position of the line segments extracted for each area into a vector pattern, and compares the pattern with the feature vectors of standard patterns.
[0003]
If a line segment is extracted and used as a feature value as in this prior art, the line segment is interrupted due to scanner characteristics, rotation correction, etc., so it is necessary to perform interpolation processing to connect the line segments. Doing so may lead to the connection of separate line segments.
[0004]
For this reason, techniques have been proposed for accurately discriminating forms by preventing a decrease in discrimination accuracy due to image fluctuations. For example, Japanese Patent Application No. 2000-95514, filed by the present applicant, discloses a form discriminating apparatus that discriminates a form type by using the ratio of black pixels located in pixel rows adjacent to each other in the horizontal and vertical directions.
[0005]
However, when there are a plurality of forms having the same ruled line feature, it is impossible to distinguish them. In this prior art, the form discrimination is performed by recognizing the characters included in the image.
[0006]
[Problems to be solved by the invention]
However, as in this prior art, when character recognition is performed by simply cutting out characters from an image, the form cannot be identified accurately if the image contains a lot of noise. In particular, forms are sometimes filled in through carbon paper, so characters often cannot be read accurately because of smudging from the carbon paper.
[0007]
For this reason, in a method of discriminating a form using characters in the form, how accurately character recognition can be performed is an important issue for discriminating the form with high accuracy. In particular, affiliated card companies often use forms with the same ruled-line layout, so the accuracy of character recognition greatly affects the discrimination accuracy of the form.
[0008]
The present invention has been made to solve the above-described problems of the prior art, and its object is to provide an image processing apparatus, an image processing method, and a program for causing a computer to execute the method that can efficiently and accurately discriminate a form when the form is discriminated using characters in the form.
[0009]
[Means for Solving the Problems]
In order to solve the above-described problems and achieve the object, the image processing apparatus according to the first aspect of the present invention is an image processing apparatus that removes noise components included in an input image when the input image is compared with a reference image to collate both images, comprising: template generating means for generating a template in which a pixel is set to black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black, and all other pixels are set to white; and noise-cut image generating means for performing die-cutting, in which the pixels of the input image other than those corresponding to the black pixels of the template generated by the template generating means are set to white, thereby generating a noise-cut image in which the noise components have been removed from the input image.
[0010]
An image processing apparatus according to a second aspect of the present invention is the apparatus according to the first aspect, wherein the template generating means comprises minimum value filtering means for applying a minimum value filter to the reference image to dilate its black pixel portions, and binarizing means for binarizing the image to which the minimum value filter has been applied by the minimum value filtering means.
[0011]
An image processing apparatus according to a third aspect of the present invention is the apparatus according to the first or second aspect, wherein, among the pixels forming the noise-cut image generated by the noise-cut image generating means, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets.
[0012]
An image processing apparatus according to a fourth aspect of the present invention is the apparatus according to the first, second, or third aspect, further comprising fraud determination means for determining whether the input image is a fraudulent image based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image generated by the noise-cut image generating means.
[0013]
An image processing method according to a fifth aspect of the present invention is an image processing method that removes noise components included in an input image when the input image is compared with a reference image to collate both images, comprising: a template generation step of generating a template in which a pixel is set to black where both the pixel of the binary image obtained by binarizing the reference image and the corresponding pixel of the binary image of the input image are black, and all other pixels are set to white; and a noise-cut image generation step of performing die-cutting, in which the pixels of the input image other than those corresponding to the black pixels of the template generated in the template generation step are set to white, thereby generating a noise-cut image in which the noise components have been removed from the input image.
[0014]
An image processing method according to a sixth aspect of the present invention is the method according to the fifth aspect, wherein the template generation step comprises a minimum value filtering step of applying a minimum value filter to the reference image to dilate its black pixel portions, and a binarization step of binarizing the image to which the minimum value filter has been applied in the minimum value filtering step.
[0015]
An image processing method according to a seventh aspect of the present invention is the method according to the fifth or sixth aspect, wherein, among the pixels forming the noise-cut image generated in the noise-cut image generation step, each pixel that does not carry a pixel value die-cut from the input image has, as its pixel value, a value obtained from the pixel values of a plurality of pixels in the input image that are not die-cut targets.
[0016]
An image processing method according to an eighth aspect of the present invention is the method according to the fifth, sixth, or seventh aspect, further comprising a fraud determination step of determining whether the input image is a fraudulent image based on the difference between the number of black pixels in the binary image obtained by binarizing the input image and the number of black pixels in the binary image of the noise-cut image generated in the noise-cut image generation step.
[0017]
According to a ninth aspect of the present invention, a program causes a computer to execute the method according to any one of the fifth to eighth aspects.
[0018]
DETAILED DESCRIPTION OF THE INVENTION
Exemplary embodiments of an image processing apparatus, an image processing method, and a program for causing a computer to execute the method according to the present invention will be described below in detail with reference to the accompanying drawings. In the present embodiment, the case where the present invention is applied to a form classification device is shown.
[0019]
FIG. 1 is a functional block diagram showing the configuration of the form discriminating apparatus used in this embodiment. The form discriminating apparatus 10 shown in the figure performs a first-stage form discrimination using the feature amount of the ruled lines and, when the first stage cannot discriminate the form accurately, a second-stage form discrimination using characters in the form. The feature of the present invention lies particularly in the second-stage form discrimination.
[0020]
In this first-stage form discrimination, the feature amounts of reference images are registered in advance as a dictionary. When the image of a form to be discriminated is input, the ruled-line feature amount of this input image is extracted and compared with the dictionary to determine the type of form.
[0021]
Here, the feature amount used in the form discriminating apparatus 10 is a black pixel ratio that reflects the ruled lines preprinted on the form and forming its essential content. The apparatus 10 does not perform line segment interpolation processing or the like, because such processing may lower the discrimination accuracy. The black pixel ratio is the ratio of black pixels included in a pixel row of a predetermined section extending in the horizontal or vertical direction from the target pixel, and is a value obtained for each pixel of the image data.
[0022]
In the second-stage form discrimination, the operator inputs, as text, the characters included in a specific area of the form together with the reference image, and detailed discrimination is performed by comparing the characters included in that specific area of the input image with the text characters. Detailed discrimination is performed because the form type sometimes cannot be determined from the ruled-line-based black pixel ratio alone, for example when the ruled lines are exactly the same and only the printed characters differ.
[0023]
Specifically, the form discriminating apparatus 10 applies a minimum value filter to the reference character string image to dilate its black pixel portions, binarizes the result, and creates a template from this binary image and the input character string image. Then, by die-cutting the input character string image using this template, it generates a noise-cut image in which noise components have been removed from the input character string image. After the characters in the noise-cut image are recognized, they are compared with the text character string previously input by the operator and stored, to determine whether the characters match.
[0024]
As shown in FIG. 1, the form discriminating apparatus 10 includes an image input unit 101, a ruled-line feature extraction unit 102, a dictionary creation unit 103, a ruled-line feature dictionary 104, a specific area dictionary 105, a ruled-line feature matching unit 106, a detail determination unit 107, and an output unit 108.
[0025]
The image input unit 101 is a scanner that optically inputs form image data. When the image input unit 101 is set to the "reference image input mode", in which a reference image is input, by a switching button (not shown) or the like, it outputs to the ruled-line feature extraction unit 102 the input grayscale image data together with binary image data in which white pixels have a pixel value of '0' and black pixels have a pixel value of '1'. When the "collation mode" for collating forms is set, it outputs the input grayscale image data to the detail determination unit 107 and the binary image to the ruled-line feature extraction unit 102.
[0026]
The ruled-line feature extraction unit 102 is a processing unit that extracts ruled-line features (feature amounts) from the binary image data received from the image input unit 101. In the reference image input mode, it outputs the ruled-line features and the grayscale image data to the dictionary creation unit 103. In the collation mode, it outputs the ruled-line features to the ruled-line feature matching unit 106.
[0027]
The dictionary creation unit 103 is a processing unit that, when it receives the ruled-line features and grayscale image data of a reference image from the ruled-line feature extraction unit 102, creates or adds to the ruled-line feature dictionary 104 and the specific area dictionary 105 based on this information.
[0028]
Specifically, when the dictionary creation unit 103 creates or adds to the ruled-line feature dictionary 104, the ruled-line features and the form type are associated with each other and registered in the ruled-line feature dictionary 104.
[0029]
On the other hand, when the dictionary creation unit 103 creates or adds to the specific area dictionary 105, it applies a minimum value filter to the image data of the specific area of the form (the reference character string image) to dilate the black pixel portions, binarizes the result, and registers the resulting binary image data and the text character string previously input by the operator in the specific area dictionary 105 in association with the form type.
[0030]
The ruled-line feature dictionary 104 is a dictionary that stores ruled-line features in association with each form type, and the specific area dictionary 105 is a dictionary that stores, for each form type, the binary image of the specific area and the text character string input by the operator.
[0031]
The ruled-line feature matching unit 106 is a processing unit that matches the ruled-line features (feature amounts) of the image data of the form to be discriminated against the ruled-line features (feature amounts) of each reference image stored in the ruled-line feature dictionary 104, selects a plurality of candidates based on the distance from the image data to be discriminated, and outputs the selected candidates to the detail determination unit 107.
[0032]
Note that a technique widely used in conventional character recognition or the like can be applied as such collation processing, and for example, identification can be performed based on the Euclidean distance or the like.
[0033]
The detail determination unit 107 is a processing unit that determines in detail which one of the plurality of candidates received from the ruled line feature matching unit 106 is closest to the input image, and is a form to be determined received from the image input unit 101. The image data of the specific area (input character string image) is cut out from the image data, and the text character string obtained by character recognition of the input character string image is collated with the text character string registered in the specific area dictionary 105.
[0034]
For example, as shown in FIGS. 7A and 7B, when the ruled lines of two forms are exactly the same, the type of the form cannot be determined only by the ruled line characteristics, and therefore the input image and the reference image Characters (character strings, logos, etc. that form the form such as form title and company name) included in each specific area are cut out and compared.
[0035]
However, the detail determination unit 107 does not simply compare and collate the text character string of the input character string image with that of the reference image; it makes a noise-resistant determination that takes into account the case where the input character string image contains noise. This point is described in detail later.
[0036]
The output unit 108 is a processing unit that outputs the determination result received from the detail determination unit 107. As the determination result, a registered form closest to the form to be determined can be output, but a plurality of candidates can also be output in order.
[0037]
Next, ruled line feature extraction processing by the ruled line feature extraction unit 102 shown in FIG. 1 will be described in more detail. FIG. 2 is an explanatory diagram for explaining the concept of ruled line feature extraction processing by the ruled line feature extraction unit 102 shown in FIG. 1.
[0038]
As shown in FIGS. 2A and 2B, the ruled line feature extraction unit 102 calculates, for each section Pi (i = 1, 2, 3, ..., K) extending in the horizontal and vertical directions around the target pixel (section length pi × 2 + 1 dots), the ratio of black pixels contained in the section (black pixel ratio).
[0039]
Specifically, in the case of the horizontal section 1 shown in FIG. 2A, the pixel values of the pixels up to 8 pixels to the left and right of the target pixel are examined. In this case,
Section length = 8 × 2 + 1 = 17 dots
Number of black pixels in the section = 11 dots.
[0040]
However, in order to eliminate the influence of noise and vertical ruled lines (ruled lines in a direction different from the counting direction), black pixels are not counted when the number of consecutive black pixels is too small. For example, in FIG. 2A, black pixels A and B are not counted because their continuous number is 1.
[0041]
Therefore,
Black pixel ratio = (11 - 2) / 17 = 0.529.
Note that the horizontal sections 2 and 3 and the vertical direction shown in the figure are handled in the same manner.
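The worked example above can be reproduced with a short sketch. It assumes a single pixel row stored as a list in which 1 means black and 0 means white, and it assumes that a run of black pixels is counted only when its length is at least `min_run`. The function name, the `min_run` parameter, and the choice to divide by the nominal section length even near the image border are assumptions for illustration.

```python
def black_pixel_ratio(row, center, half_len, min_run=2):
    """Black pixel ratio in the section of length 2 * half_len + 1 around center.

    Runs of consecutive black pixels shorter than min_run are ignored, which
    suppresses isolated noise and ruled lines crossing the counting direction.
    """
    start = max(0, center - half_len)
    end = min(len(row), center + half_len + 1)
    section = list(row[start:end])

    counted = 0
    run = 0
    for pixel in section + [0]:  # trailing 0 flushes the last run
        if pixel == 1:
            run += 1
        else:
            if run >= min_run:
                counted += run
            run = 0
    return counted / (2 * half_len + 1)


if __name__ == "__main__":
    # 17 pixels, 11 of them black, two of which (indices 5 and 13) are isolated.
    row = [1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0]
    print(round(black_pixel_ratio(row, center=8, half_len=8), 3))  # 0.529
```

The same function can be applied to a column of the image to obtain the vertical black pixel ratio.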
[0042]
Thereafter, the form image is divided into M × N blocks, and the black pixel ratio of each pixel in the block is added to obtain a ruled line feature. Note that the number of dimensions of the ruled line feature is M × N × 2 (horizontal / vertical) × K dimensions.
[0043]
At this time, if the black pixel ratio is added only when it is larger than a certain threshold value, it is possible to eliminate fluctuation factors such as noise and handwritten characters. This is because handwritten characters and noise are a collection of short line segments compared to ruled lines, and the black pixel ratio in the section is also small.
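The block-wise accumulation with a threshold can be sketched as follows. This is a minimal illustration assuming that the per-pixel black pixel ratios for one direction and one section are already available as a 2D list, that M is the number of block rows and N the number of block columns, and that a ratio is added only when it exceeds `threshold`. The function name and the even partition of the image are assumptions.

```python
def block_features(ratio_map, m, n, threshold=0.0):
    """Sum per-pixel black pixel ratios inside each of m x n blocks.

    Ratios that do not exceed threshold are skipped so that short strokes
    (handwriting, noise) contribute nothing to the ruled line feature.
    """
    height = len(ratio_map)
    width = len(ratio_map[0])
    features = [[0.0] * n for _ in range(m)]
    for y in range(height):
        block_y = y * m // height
        for x in range(width):
            block_x = x * n // width
            ratio = ratio_map[y][x]
            if ratio > threshold:
                features[block_y][block_x] += ratio
    return features


if __name__ == "__main__":
    ratios = [[0.5, 0.2],
              [1.0, 0.8]]
    print(block_features(ratios, m=1, n=2, threshold=0.3))
```

Repeating this for the horizontal and vertical directions and for each of the K sections yields the M × N × 2 × K values described above.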
[0044]
Next, an example of ruled line feature extraction by the ruled line feature extraction unit 102 shown in FIG. 1 will be described more specifically. FIG. 3 is an explanatory diagram illustrating an example of ruled line feature extraction by the ruled line feature extraction unit 102 illustrated in FIG. 1.
[0045]
When the input image contains a ruled line in the shape of “R” as shown in FIG. 3A, and the section length is 3 dots and the threshold of the continuous number is not considered, the black pixel ratio of each pixel in the horizontal direction is as shown in FIG. 3B, and the black pixel ratio of each pixel in the vertical direction is as shown in FIG. 3C.
[0046]
Then, when the image is divided into 3 × 3 blocks as shown in FIG. 3D and the black pixel ratio of each pixel in the horizontal direction is added for each block, the horizontal ruled line feature shown in the figure is obtained. Likewise, when the black pixel ratio of each pixel in the vertical direction shown in FIG. 3C is added for each block, the vertical ruled line feature shown in the figure is obtained.
[0047]
As described above, because the ruled line feature extraction unit 102 uses ruled line features based on the black pixel ratio as the feature amount, no processing for interpolating breaks in ruled line segments is required, and the feature amount can be acquired stably even if a line segment is interrupted by processing such as rotation correction.
[0048]
Further, if there are a plurality of sections as shown in FIG. 2, the features of ruled lines of various lengths can be obtained faithfully. Although not performed in the present embodiment, it is possible to suppress fluctuation due to rotation by performing a process of thickening the ruled line on the input image before feature extraction. In addition, it is possible to acquire a feature amount that is resistant to misalignment by applying various processes for increasing the recognition rate widely known in character recognition such as blurring.
[0049]
Next, a processing procedure for registering a form as a comparison target at the time of discrimination will be described. FIG. 4 is a flowchart showing a processing procedure when a form is registered as a dictionary for comparison at the time of discrimination.
[0050]
As shown in the figure, when registering a form as a comparison target at the time of discrimination, a form image is first taken in from the image input unit 101 (step S401), and image preprocessing is performed as necessary (step S402). However, this preprocessing does not include line segment interpolation processing or the like.
[0051]
Thereafter, the ruled line feature extraction unit 102 calculates the black pixel ratio in the horizontal and vertical directions for the sections designated in advance (step S403), and adds the black pixel ratio for each block to extract the ruled line feature (step S404).
[0052]
Then, the dictionary creation unit 103 registers the ruled line feature extracted by the ruled line feature extraction unit 102 in the ruled line feature dictionary 104 (step S405), and collates it with the ruled line features registered in the ruled line feature dictionary 104 in the past to confirm whether the form can be discriminated (steps S406 to S407).
[0053]
As a result, if the form cannot be discriminated (No at step S407), the process of additionally registering specific area information (binary image data and text character string of the specific area) in the specific area dictionary 105 is repeated (step S408), and the process ends when the form can be discriminated (Yes at step S407).
[0054]
For example, when a detailed determination is made using a character string, the character string (text data) in a specific area that is characteristic of each form (a character string such as a title or company name) and its position are registered in advance.
[0055]
By performing the above-described series of processing, the ruled line features of various forms can be registered in the ruled line feature dictionary 104, and the binary image data and text data of the specific areas can be registered in the specific area dictionary 105, prior to the discrimination of forms.
[0056]
Next, a procedure for discriminating a form by the form discriminating apparatus 10 shown in FIG. 1 will be described. FIG. 5 is a flowchart showing a form discrimination processing procedure performed by the form discriminating apparatus 10 shown in FIG. 1.
[0057]
As shown in the figure, when determining the type of a form, first, an image of the form is fetched from the image input unit 101 (step S501), and image preprocessing is performed as necessary (step S502). However, this pre-processing does not include line segment interpolation processing or the like.
[0058]
After that, the ruled line feature extraction unit 102 calculates the black pixel ratio in the horizontal and vertical directions for the sections designated in advance (step S503), and adds the black pixel ratio for each block to extract the ruled line feature (step S504).
[0059]
The ruled line feature matching unit 106 collates the ruled line feature extracted by the ruled line feature extraction unit 102 with the ruled line features registered in the ruled line feature dictionary 104 (step S505), checks whether the distance value is within a predetermined threshold value, and sorts the form candidates in ascending order of distance.
[0060]
If the distance value is within the predetermined threshold value, the detail determination unit 107 performs the detailed determination described later (step S506) and outputs the determination result (step S507); otherwise, the determination result is output as is via the detail determination unit 107 (step S507).
[0061]
That is, among the form candidates, if the difference in distance between the first place and the second place is equal to or greater than a certain threshold, the first place is output as the determination result. Otherwise, the character string in the specific area is recognized, and if the form still cannot be determined, another specific area is recognized. The detailed determination processing procedure will be described later.
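The decision between immediate output and detailed determination can be sketched as below. This is an illustration only: it assumes the candidates arrive as a list of (distance, form type) pairs sorted nearest first, and it treats a gap exactly equal to the threshold as sufficient, which the text does not specify. The function name `decide` and the parameter `gap_threshold` are hypothetical.

```python
def decide(ranked, gap_threshold):
    """Return (form_type, needs_detail) for candidates sorted nearest first.

    If the first candidate is clearly ahead of the second, it is returned
    directly; otherwise the caller should run the detailed determination.
    """
    if not ranked:
        return None, False
    if len(ranked) == 1:
        return ranked[0][1], False
    gap = ranked[1][0] - ranked[0][0]
    if gap >= gap_threshold:
        return ranked[0][1], False
    return None, True


if __name__ == "__main__":
    print(decide([(1.0, "A"), (5.0, "B")], gap_threshold=2.0))
    print(decide([(1.0, "A"), (1.5, "B")], gap_threshold=2.0))
```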
[0062]
By performing the above-described series of processing, it is possible to determine a form using the ruled line feature based on the ruled line feature dictionary 104 and the specific area dictionary 105 and the characters in the specific area.
[0063]
Next, the configuration of the dictionary creation unit 103 shown in FIG. 1 will be specifically described. FIG. 8 is a functional block diagram showing the configuration of the dictionary creation unit 103 shown in FIG. 1. As shown in the figure, the dictionary creation unit 103 includes a ruled line feature dictionary creation unit 800 that creates the ruled line feature dictionary 104 and a specific area dictionary creation unit 810 that creates the specific area dictionary 105.
[0064]
The specific area dictionary creation unit 810 includes a cutout processing unit 811, a minimum value filtering unit 812, and a binarization processing unit 813. The minimum value filtering unit 812 corresponds to the minimum value filtering unit of claim 2, and the binarization processing unit 813 corresponds to the binarization unit of claim 2.
[0065]
The cutout processing unit 811 is a processing unit that cuts out a reference character string image located at a specific position in the reference image. Note that a specific position to be cut out is set in advance.
[0066]
The minimum value filtering unit 812 is a processing unit that applies a minimum value filter to the reference character string image and expands the black pixel portion in the reference character string image. Here, the minimum value filter is a mask pattern process in which the minimum pixel value among the neighboring pixels located in the 8-neighborhood of the target pixel is used as the pixel value of the target pixel.
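The effect of the minimum value filter can be seen in a small sketch. It assumes a grayscale image stored as a 2D list in which low values are dark, takes the minimum over the 3 × 3 window that contains the target pixel and its 8 neighbors (including the target pixel itself is an assumption), and simply clips the window at the image border. The function name is hypothetical.

```python
def minimum_filter(image):
    """Replace each pixel with the minimum of its 3 x 3 neighborhood.

    Because dark pixels have low values, this expands (thickens) the black
    portions of the image by one pixel in every direction.
    """
    height = len(image)
    width = len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = min(
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
    return out


if __name__ == "__main__":
    img = [[255] * 5 for _ in range(5)]
    img[2][2] = 0  # a single dark pixel
    for row in minimum_filter(img):
        print(row)
```

A single dark pixel grows into a 3 × 3 dark patch, which is what gives the later template some tolerance to positional deviation.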
[0067]
The binarization processing unit 813 is a processing unit that binarizes the density image to which the minimum value filter is applied by the minimum value filtering unit 812. In this binarization, a discrimination criterion method using a density histogram can be applied.
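One way to read the discrimination criterion method using a density histogram is as the between-class variance maximization commonly known as Otsu's method; that reading is an assumption here, not something the specification states. The sketch below uses integer density values from 0 to 255 and treats pixels at or below the chosen threshold as black (1). The function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Pick the threshold that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class (values <= t)
    sum0 = 0    # sum of density values in the dark class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map a 2D grayscale image to 1 (black) / 0 (white)."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


if __name__ == "__main__":
    image = [[10, 200], [200, 10]]
    t = otsu_threshold([p for row in image for p in row])
    print(t, binarize(image, t))
```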
[0068]
Next, the processing procedure of the specific area dictionary creation unit 810 shown in FIG. 8 will be specifically described. FIG. 9 is a flowchart illustrating a processing procedure of the specific area dictionary creation unit 810 illustrated in FIG. 8, and FIG. 10 is an explanatory diagram illustrating an example of a character string image.
[0069]
As shown in FIG. 9, in the specific area dictionary creation unit 810, when a reference image is first input (step S901), the cutout processing unit 811 cuts out the reference character string image located in the specific area of the reference image, as shown for example in FIG. 10A (step S902).
[0070]
Thereafter, the minimum value filtering unit 812 applies a minimum value filter to the reference character string image to expand the black pixel portion of the reference character string image as shown in FIG. 10B (step S903).
[0071]
Then, the binarization processing unit 813 binarizes the reference character string image to which the minimum value filter has been applied using a discrimination criterion method or the like, and acquires a binary image as shown in FIG. 10C (step S904). The binary image obtained in this way and a text character string separately input by the operator are stored in the specific area dictionary 105.
[0072]
Next, the configuration of the detail determination unit 107 shown in FIG. 1 will be specifically described. FIG. 11 is a functional block diagram illustrating a configuration of the detail determination unit 107 illustrated in FIG. 1. As shown in the figure, the detail determination unit 107 includes a template creation unit 1101, a noise cut image extraction unit 1102, a fraud determination unit 1103, a character recognition processing unit 1104, and a character determination unit 1105. The template creation unit 1101 corresponds to the template generation means of claim 1, and the noise cut image extraction unit 1102 corresponds to the noise cut image generation means of claim 1. The fraud determination unit 1103 corresponds to the fraud determination means of claim 4.
[0073]
The template creation unit 1101 is a processing unit that creates a template in which a pixel is a black pixel when both the pixel of the binary image read from the specific area dictionary 105 (the binary image obtained by binarizing the reference character string image to which the minimum value filter has been applied) and the corresponding pixel of the binary image of the input character string image are black pixels.
[0074]
The noise cut image extraction unit 1102 is a processing unit that cuts out the pixel values of the template portion from the input character string image and thereby extracts a noise cut image in which noise components have been removed from the input character string image.
[0075]
The fraud determination unit 1103 is a processing unit that determines the possibility of fraud that exploits the die cutting using a template according to the present invention. That is, when die cutting is performed using a template, if all the pixels of the input character string image are black pixels, each character of the reference character string image is cut out as is owing to the effect of the template. The fraud determination unit 1103 therefore regards such an input image as fraudulent and rejects it.
[0076]
The character recognition processing unit 1104 is a processing unit that cuts out and recognizes each character included in the noise cut image, and the character determination unit 1105 is a processing unit that compares the characters recognized from the noise cut image with the text character string read from the specific area dictionary 105 and determines whether or not each character matches.
[0077]
Next, the processing procedure of the detail determination unit 107 shown in FIG. 11 will be described. FIG. 12 is a flowchart illustrating a processing procedure of the detail determination unit 107 illustrated in FIG. 11, and FIG. 13 is an explanatory diagram illustrating an example of a character string image.
[0078]
As shown in FIG. 12, when an input image is input to the detail determination unit 107 (step S1201), the template creation unit 1101 cuts out the input character string image located in the specific area of the input image (step S1202). Specifically, unlike the reference image, the input image is often accompanied by positional deviation or the like. Therefore, a slightly wider character string search area is provided as shown in FIG. 13A, and the input character string image is then cut out as shown in FIG. 13B. Note that in this example a receipt stamp is superimposed as noise on the input character string image.
[0079]
Thereafter, the template creation unit 1101 binarizes the input character string image using a discrimination criterion method or the like as shown in FIG. 13C (step S1203), and creates a template from the binary image of the input character string image and the binary image of the reference character string image read from the specific area dictionary 105 (step S1204).
[0080]
Specifically, a template is generated in which a pixel is a black pixel when both the pixel of the binary image obtained by binarizing the reference character string image and the corresponding pixel of the binary image of the input character string image are black pixels.
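Template creation amounts to a pixel-wise logical AND of the two binary images. The following minimal sketch assumes both images have the same size and are stored as 2D lists with 1 for black and 0 for white; the function name `make_template` is hypothetical.

```python
def make_template(reference_binary, input_binary):
    """Template pixel is black (1) only where both binary images are black."""
    return [
        [1 if ref == 1 and inp == 1 else 0 for ref, inp in zip(ref_row, inp_row)]
        for ref_row, inp_row in zip(reference_binary, input_binary)
    ]


if __name__ == "__main__":
    reference = [[1, 1, 0],
                 [0, 1, 0]]
    scanned = [[1, 0, 1],   # the rightmost black pixel is noise
               [0, 1, 1]]
    print(make_template(reference, scanned))  # [[1, 0, 0], [0, 1, 0]]
```

Black pixels that appear only in the input image, such as a stamp overlapping the background, never reach the template.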
[0081]
Thereafter, the noise cut image extraction unit 1102 cuts out the pixel values (density data) of the input character string image using this template and creates a noise cut image as shown in FIG. 13D (step S1205). Note that each pixel forming the background portion of the noise cut image is assigned, as its pixel value, a representative density value of the high density class (white) obtained when the density histogram of the input character string image is divided into two by the discrimination criterion method.
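The die cutting step can be sketched as follows, assuming a grayscale input image and a template of the same size with 1 for black. The specification does not say how the representative value of the white class is computed; using the rounded mean of the pixels above the binarization threshold is an assumption made here, as are the function names and the fallback value 255.

```python
def white_class_representative(input_gray, threshold):
    """Rounded mean density of the pixels above threshold (assumed definition)."""
    whites = [p for row in input_gray for p in row if p > threshold]
    return round(sum(whites) / len(whites)) if whites else 255


def die_cut(input_gray, template, background):
    """Keep input densities where the template is black; fill the rest."""
    return [
        [gray if mask == 1 else background for gray, mask in zip(gray_row, mask_row)]
        for gray_row, mask_row in zip(input_gray, template)
    ]


if __name__ == "__main__":
    input_gray = [[20, 30, 240],
                  [250, 25, 245]]
    template = [[1, 0, 0],
                [0, 1, 0]]
    background = white_class_representative(input_gray, threshold=128)
    print(die_cut(input_gray, template, background))
```

The dark pixel with density 30, which is not covered by the template, is replaced by the background value, while the original densities of the character pixels are preserved.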
[0082]
Further, the fraud determination unit 1103 binarizes the noise cut image using a discrimination criterion method or the like as shown in FIG. 13E (step S1206), and checks how much of the noise component has been removed. Specifically, the number of black pixels of the binary image shown in FIG. 13E is subtracted from the number of black pixels of the binary image shown in FIG. 13C to obtain a change amount (step S1207), and the value obtained by dividing this change amount by the total number of pixels of the input character string image is calculated (step S1208).
[0083]
Then, it is checked whether or not this value is equal to or smaller than a predetermined value (step S1209). If it is not equal to or smaller than the predetermined value (No at step S1209), the input image is rejected (step S1210). In other words, a large value obtained by dividing the change amount by the total number of pixels in the input character string image indicates a high possibility that the die cutting with the template is being exploited, so an input image whose value exceeds the predetermined value is rejected as possibly fraudulent. Here, a value of about “0.07 (7%)” can be used as the predetermined value.
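The check in steps S1207 to S1209 can be illustrated with a short sketch. It assumes both binary images are 2D lists of the same size with 1 for black, and uses 0.07 as the limit mentioned above; the function name and return shape are hypothetical.

```python
def fraud_check(input_binary, noise_cut_binary, limit=0.07):
    """Return (rejected, ratio) from the drop in black pixels after die cutting."""
    black_before = sum(sum(row) for row in input_binary)
    black_after = sum(sum(row) for row in noise_cut_binary)
    total_pixels = sum(len(row) for row in input_binary)
    ratio = (black_before - black_after) / total_pixels
    return ratio > limit, ratio


if __name__ == "__main__":
    # A solid black 10 x 10 input from which die cutting leaves 20 black pixels.
    all_black = [[1] * 10 for _ in range(10)]
    cut = [[1] * 10 for _ in range(2)] + [[0] * 10 for _ in range(8)]
    print(fraud_check(all_black, cut))  # (True, 0.8)
```

An input that is almost entirely black loses most of its black pixels in die cutting, so its ratio far exceeds the limit and it is rejected.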
[0084]
Here, for convenience of explanation, explanation of character recognition and character collation performed thereafter is omitted, but these are performed using a known character recognition technique or the like.
[0085]
As described above, according to the present embodiment, the template creation unit 1101 creates a template in which a pixel is a black pixel when both the pixel of the binary image obtained by binarizing the reference character string image to which the minimum value filter has been applied and the corresponding pixel of the binary image of the input character string image are black pixels, and the noise cut image extraction unit 1102 cuts out the pixel values of the template portion from the input character string image to extract a noise cut image in which noise components have been removed from the input character string image. The noise in the input image can therefore be removed efficiently.
[0086]
In this embodiment, the minimum value filter is applied to the reference character string image before it is binarized. However, the present invention is not limited to this, and the reference character string image can also be binarized as is, without applying the minimum value filter. That is, the minimum value filter is applied to allow for positional deviation of the input image, so when little positional deviation occurs, the reference character string image can be binarized as is.
[0087]
In this embodiment, the text character string input by the operator is registered in the specific area dictionary 105. However, the present invention is not limited to this, and a character string obtained by performing character recognition on the specific area of the reference image can also be registered in the specific area dictionary 105.
[0088]
In the present embodiment, for the sake of convenience of explanation, the case where the character portion has a low density value (black) and the background portion has a high density value (white) has been described, but the present invention is not limited to this. Instead, it can also be applied to the case where the character portion has a high density value (white) and the background portion has a low density value (black). For example, when white letters are printed on a form, the character portion has a high density value (white) and the background portion has a low density value (black).
[0089]
Even in such a case, as described in detail in the above embodiment, the black pixel portion (the background portion of the white characters) of the reference character string image is expanded by applying a minimum value filter, a template is created in which a pixel is a black pixel when both the pixel of the resulting binary image and the corresponding pixel of the binary image obtained by binarizing the input character string image are black pixels, and the pixel values of the template portion are cut out from the input character string image to extract a noise cut image in which noise components have been removed. In this way, even a form containing white characters can be discriminated efficiently and with high accuracy.
[0090]
[Effects of the invention]
As described above in detail, according to the invention of claim 1, a template is generated in which the pixel value of a pixel is a black pixel when both the pixel forming the binary image obtained by binarizing the reference image and the corresponding pixel forming the binary image of the input image are black pixels, and the pixel values of the other pixels are white pixels. Die cutting is then performed that sets to white pixels the pixels of the input image other than the portions corresponding to the black pixels of the template, so that a noise cut image in which noise components are removed from the input image is generated. An image processing apparatus capable of efficiently removing noise is thus obtained.
[0091]
According to the invention of claim 2, the minimum value filter is applied to the reference image to expand the black pixel portion, and the image to which the minimum value filter has been applied is binarized, so that an image processing apparatus capable of efficiently coping with positional deviation is obtained.
[0092]
According to the invention of claim 3, among the pixels forming the noise cut image, each pixel that does not have a pixel value cut out from the input image is obtained from the pixel values of a plurality of pixels in the input image that are not to be cut out. Since the obtained value is configured to have the pixel value, there is an effect that an image processing apparatus capable of reconstructing the pixel value of the pixel that is not to be punched into an appropriate value is obtained.
[0093]
According to the invention of claim 4, whether or not the input image is a fraudulent image is determined based on the difference between the number of black pixels of the binary image obtained by binarizing the input image and the number of black pixels of the binary image of the noise cut image, so that an image processing apparatus capable of efficiently preventing fraud that exploits the die cutting using the template is obtained.
[0094]
According to the invention of claim 5, a template is generated in which the pixel value of a pixel is a black pixel when both the pixel forming the binary image obtained by binarizing the reference image and the corresponding pixel forming the binary image of the input image are black pixels, and the pixel values of the other pixels are white pixels. Die cutting is then performed that sets to white pixels the pixels of the input image other than the portions corresponding to the black pixels of the template, so that a noise cut image in which noise components are removed from the input image is generated. An image processing method capable of efficiently removing noise is thus obtained.
[0095]
According to the invention of claim 6, the minimum value filter is applied to the reference image to expand the black pixel portion, and the image to which the minimum value filter has been applied is binarized, so that an image processing method capable of efficiently coping with positional deviation is obtained.
[0096]
According to the invention of claim 7, among the pixels forming the noise cut image, each pixel having no pixel value cut out from the input image is determined from the pixel values of a plurality of pixels in the input image not to be cut out. Since the obtained value is used as the pixel value, there is an effect that an image processing method capable of reconstructing the pixel value of the pixel that is not to be punched into an appropriate value is obtained.
[0097]
According to the invention of claim 8, whether or not the input image is a fraudulent image is determined based on the difference between the number of black pixels of the binary image obtained by binarizing the input image and the number of black pixels of the binary image of the noise cut image, so that an image processing method capable of efficiently preventing fraud that exploits the die cutting using the template is obtained.
[0098]
According to the invention of claim 9, the computer is caused to execute the method according to any one of claims 5 to 8, so that the operation according to any one of claims 5 to 8 can be realized by a computer.
[Brief description of the drawings]
FIG. 1 is a functional block diagram showing a configuration of a form discriminating apparatus used in an embodiment of the present invention.
FIG. 2 is an explanatory diagram for explaining a concept of ruled line feature extraction processing by a ruled line feature extraction unit shown in FIG. 1;
FIG. 3 is an explanatory diagram illustrating an example of ruled line feature extraction by the ruled line feature extraction unit illustrated in FIG. 1;
FIG. 4 is a flowchart showing a processing procedure when a form is registered as a dictionary for comparison at the time of discrimination.
FIG. 5 is a flowchart showing a form discrimination processing procedure performed by the form discrimination apparatus shown in FIG. 1;
FIG. 6 is a diagram illustrating an example of a form to be determined in the present embodiment.
FIG. 7 is an explanatory diagram for explaining a form whose details are determined by a detail determination unit shown in FIG. 1;
FIG. 8 is a functional block diagram showing a configuration of a dictionary creation unit shown in FIG. 1;
FIG. 9 is a flowchart showing a processing procedure of a specific area dictionary creation unit shown in FIG. 8;
FIG. 10 is an explanatory diagram illustrating an example of a character string image.
FIG. 11 is a functional block diagram illustrating a configuration of a detail determination unit illustrated in FIG. 1.
FIG. 12 is a flowchart illustrating a processing procedure of a detail determination unit illustrated in FIG. 11.
FIG. 13 is an explanatory diagram illustrating an example of a character string image.
[Explanation of symbols]
10 Form discrimination device
101 Image input unit
102 Ruled line feature extraction unit
103 Dictionary creation unit
104 Ruled line feature dictionary
105 Specific area dictionary
106 Ruled line feature matching unit
107 Detail determination unit
108 Output unit
800 Ruled line feature dictionary creation unit
810 Specific area dictionary creation unit
811 Cutout processing unit
812 Minimum value filtering unit
813 Binarization processing unit
1101 Template creation unit
1102 Noise cut image extraction unit
1103 Fraud determination unit
1104 Character recognition processing unit
1105 Character determination unit

Claims

An image processing apparatus that removes noise components included in an input image when the input image is compared with a reference image to collate the two images, the apparatus comprising:
template generation means for generating a template in which the pixel value of a pixel is a black pixel when both the pixel forming a binary image obtained by binarizing the reference image and the corresponding pixel forming a binary image of the input image are black pixels, and the pixel values of the other pixels are white pixels; and
noise cut image generation means for generating a noise cut image in which noise components are removed from the input image by performing die cutting that sets to white pixels the pixel values of the pixels of the input image other than the portions corresponding to the black pixels of the template generated by the template generation means.

The image processing apparatus according to claim 1, wherein the template generation means comprises minimum value filtering means for expanding a black pixel portion by applying a minimum value filter to the reference image, and binarization means for binarizing the image to which the minimum value filter has been applied by the minimum value filtering means.

Among the pixels forming the noise cut image generated by the noise cut image generation means, each pixel having no pixel value cut out from the input image is determined from the pixel values of a plurality of pixels in the input image that are not to be cut out. The image processing apparatus according to claim 1, wherein the obtained value is a pixel value.

The image processing apparatus according to claim 1, further comprising fraud determination means for determining whether or not the input image is a fraudulent image based on the difference between the number of black pixels of the binary image obtained by binarizing the input image and the number of black pixels of the binary image of the noise cut image generated by the noise cut image generation means.

An image processing method for removing noise components included in an input image when the input image is compared with a reference image to collate the two images, the method comprising:
a template generation step of generating a template in which the pixel value of a pixel is a black pixel when both the pixel forming a binary image obtained by binarizing the reference image and the corresponding pixel forming a binary image of the input image are black pixels, and the pixel values of the other pixels are white pixels; and
a noise cut image generation step of generating a noise cut image in which noise components are removed from the input image by performing die cutting that sets to white pixels the pixel values of the pixels of the input image other than the portions corresponding to the black pixels of the template generated in the template generation step.

The image processing method according to claim 5, wherein the template generation step includes a minimum value filtering step of expanding a black pixel portion by applying a minimum value filter to the reference image, and a binarization step of binarizing the image to which the minimum value filter has been applied in the minimum value filtering step.

Among the pixels forming the noise cut image generated by the noise cut image generation step, each pixel that does not have a pixel value cut out from the input image is determined from pixel values of a plurality of pixels in the input image that are not to be cut out. The image processing method according to claim 5, wherein the obtained value is a pixel value.

The image processing method according to claim 5, further comprising a fraud determination step of determining whether or not the input image is a fraudulent image based on the difference between the number of black pixels of the binary image obtained by binarizing the input image and the number of black pixels of the binary image of the noise cut image generated in the noise cut image generation step.

A program that causes a computer to execute the method according to any one of claims 5 to 8.