JP3885001B2 - Image processing method and image processing apparatus

Info

Publication number: JP3885001B2
Application number: JP2002188353A
Authority: JP
Inventors: 邦浩山本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2002-06-27
Filing date: 2002-06-27
Publication date: 2007-02-21
Anticipated expiration: 2022-06-27
Also published as: JP2004030430A

Description

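[0001]
[Technical field of the invention]
The present invention relates to an image processing method and an image processing apparatus, and more particularly to an image processing method and an image processing apparatus that detect a rectangular image.
[0002]
[Prior art]
Various algorithms for detecting and cutting out a rectangular region from an original image have been proposed. For example, Japanese Patent Application Laid-Open No. 8-237537 describes a technique that first extracts contours and then searches for a rectangular region by looking for portions where the contour forms a straight line.
[0003]
[Problems to be solved by the invention]
However, with conventional rectangle extraction algorithms, it is difficult in a noisy environment to detect the straight portions of a rectangular contour and, further, to associate four straight lines with one another to detect a single rectangle. Accurate extraction was therefore difficult, and the accuracy of the extraction result was not sufficient.
[0004]
The present invention has been made to solve the above problem, and its object is to provide an image processing method and an image processing apparatus capable of detecting a rectangular region with high accuracy even in a noisy environment.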
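[0005]
[Means for solving the problems]
As one means of achieving the above object, the image processing method of the present invention comprises the following steps.
[0006]
That is, an image processing method for detecting a rectangular region from an image comprises: a histogram creation step of creating, for an image obtained by binarizing an original image, histograms of the number of black pixels in the horizontal and vertical directions; a rectangular range detection step of detecting the range in which the rectangular region exists in the binary image by approximating each histogram with a trapezoid; and an inclination direction detection step of detecting the inclination direction of the rectangular region within that range. In the rectangular range detection step, the rectangular region is assumed to be a right-angled rectangle, and, so that the histogram takes the shape of a trapezoid whose left and right oblique sides are symmetric, the accuracy of the straight-line approximation by regression analysis is compared between the left and right oblique sides, and the data of one oblique side is corrected to match the data obtained from the other oblique side, which has the smaller error.
[0007]
The method further comprises a labeling step of labeling the binary image, and in the histogram creation step a histogram of the number of labeled pixels is created.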
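[0008]
[Embodiments of the invention]
An embodiment of the present invention is described in detail below with reference to the drawings.
[0009]
<First Embodiment>
● System configuration
FIG. 1 is a block diagram showing the configuration of a computer system in which the present invention is executed. Reference numeral 101 denotes a CPU, which controls the entire system. Reference numeral 102 denotes a keyboard, which is used together with a mouse 102a to input data into the system. Reference numeral 103 denotes a display device, such as a CRT or a liquid crystal display. Reference numerals 104 and 105 denote a ROM and a RAM, which constitute the storage of the system and store the programs the system executes and the data it uses. Reference numeral 106 denotes a hard disk drive (HDD) and 107 a floppy disk drive (FDD); these constitute the external storage devices used for the system's file system. Reference numeral 108 denotes a printer.
[0010]
Reference numeral 109 denotes a network I/F connected to a LAN or WAN. Through the I/F 109, data files such as computer programs and digital images can be exchanged with other computer systems. Programs and data are recorded on the hard disk 106 or loaded directly into the RAM 105 and executed.
[0011]
Reference numeral 110 denotes an image scanner, which reads a photograph or printed matter on the platen and loads it into the RAM 105 as multi-valued digital image data.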
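[0012]
● Rectangular image overview
FIG. 2 shows an example of an image captured by the image scanner 110 and binarized. A rectangle in the figure represents a rectangular document image such as a photograph (hereinafter referred to as a rectangular image), and the three ellipses represent noise. The noise is simplified in the figure for explanation; actual noise is irregular in size and shape, and there are often several hundred to several thousand noise components. The image is binarized by the well-known simple binarization method of comparing against a predetermined threshold, and a detailed description is omitted here.
[0013]
The image shown in FIG. 2 is held in the RAM 105 as a bitmap W pixels wide and H pixels high. Each pixel is 1 byte deep; a black pixel has the value 255 and a white pixel the value 0. The bitmap in the RAM 105 can be accessed as a two-dimensional array, and the pixel that is i-th in the horizontal direction and j-th in the vertical direction, counting from the upper left of the bitmap, is written buf(i,j). In this embodiment, the coordinates of the region where the rectangular image exists are determined from this bitmap by the method described below.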
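[0014]
● Histogram creation
FIG. 3 shows a histogram of black-pixel counts along the horizontal direction, assuming an ideal state in which the binary image of FIG. 2 contains no noise; the upper part of the figure shows the binary image and the lower part an example histogram. The figure shows that the histogram forms an approximate trapezoid corresponding to the four vertices of the rectangle in the binary image. Because an actual binary image contains noise, the histogram is never a perfect trapezoid, but its trapezoidal shape provides enough information to locate the vertices in the horizontal direction.
[0015]
FIG. 4 is a flowchart of the process that creates the histogram shown in FIG. 3. Below, the array that stores the histogram data is denoted hist().
[0016]
First, in steps S401 and S402, the variables x and y are each set to 0, and in step S403 hist(x) is set to 0.
[0017]
In step S404, the pixel buf(x,y) on the bitmap is compared with the value 255. If they differ, the process proceeds to step S406; if they are equal, it proceeds to step S405, increments hist(x) by 1, and then proceeds to step S406.
[0018]
In step S406, y is incremented by 1, and in step S407 y is compared with the bitmap height H. If y and H differ, the process returns to step S404; if they are equal, the process proceeds to step S408 and increments x by 1, and in step S409 x is compared with the bitmap width W. If x and W differ, the process returns to step S402; if they are equal, the process ends.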
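The loop of FIG. 4 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes the bitmap is a list of rows, so the text's buf(x,y) becomes `buf[y][x]`, and the function name and the `target` parameter are hypothetical.

```python
def column_histogram(buf, width, height, target=255):
    """For each column x, count the pixels in that column equal to `target`."""
    hist = [0] * width                 # S403: every hist(x) starts at 0
    for x in range(width):             # S401, S408, S409: loop over columns
        for y in range(height):        # S402, S406, S407: loop over rows
            if buf[y][x] == target:    # S404: is this a black pixel?
                hist[x] += 1           # S405: count it
    return hist
```

Keeping the compared value as a parameter lets the same routine count pixels carrying a label value instead of 255.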
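[0019]
● Vertex detection processing
The above processing yields the array hist(), a histogram of black-pixel counts. As shown in FIG. 3, this histogram forms a trapezoid bounded by the X axis and three straight lines. In this embodiment, the position where the left side of the trapezoid meets the X axis is called a, the position where the left side meets the top base is b, the position where the right side meets the top base is c, and the position where the right side meets the X axis is d.
[0020]
The method of analyzing the left side of the trapezoid formed by the histogram to obtain the positions a and b is described using the flowchart of FIG. 5.
[0021]
First, in step S501, the peak point P at which hist(x) is greatest is detected. In the histogram of FIG. 3, there is a peak of height 800 near X=400. Next, in step S502, the point F at which the height is half the peak is detected in the forward direction from P. In the histogram of FIG. 3, the position near X=100 corresponds to F. In step S503, the point F0 at which the height is 1/4 of the peak is detected in the forward direction from F, and in step S504 the point F1 at which the height is 3/4 of the peak is detected in the backward direction from F.
[0022]
Next, in step S505, a straight line L as shown in FIG. 3 is obtained by fitting a line, using well-known regression analysis, to all points with F0 ≦ X ≦ F1. In step S506, the point where L meets the X axis is calculated and taken as a. In step S507, the point where L reaches 98% of the height of the peak P is calculated and taken as b. The 98% factor compensates for the fact that the peak P includes noise and is therefore somewhat higher than the true peak.
The flowchart of FIG. 5 has been described for the case of searching forward from the peak P to obtain a and b; searching backward from P in the same way yields c and d.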
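Steps S501 to S507 can be sketched as below. The sketch assumes that "forward" means toward smaller X (consistent with F lying near X=100 and P near X=400 in FIG. 3), uses ordinary least squares for the regression, and assumes the rising edge spans at least two distinct points with a nonzero slope. All function names are hypothetical.

```python
def walk_down(hist, start, level):
    """Move left from `start` while the histogram stays above `level`."""
    x = start
    while x > 0 and hist[x] > level:
        x -= 1
    return x


def walk_up(hist, start, level):
    """Move right from `start` while the histogram stays below `level`."""
    x = start
    while x < len(hist) - 1 and hist[x] < level:
        x += 1
    return x


def fit_left_edge(hist, top_ratio=0.98):
    """Return (a, b, sse) for the left oblique side of the trapezoid."""
    peak_x = max(range(len(hist)), key=hist.__getitem__)   # S501
    peak = hist[peak_x]
    f = walk_down(hist, peak_x, peak / 2)                  # S502
    f0 = walk_down(hist, f, peak / 4)                      # S503
    f1 = walk_up(hist, f, peak * 3 / 4)                    # S504

    # S505: least-squares line y = slope * x + intercept over F0 <= X <= F1.
    xs = list(range(f0, f1 + 1))
    ys = [hist[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Squared error of the fit, usable later to judge how noisy this side is.
    sse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))

    a = -intercept / slope                                 # S506: L meets the X axis
    b = (top_ratio * peak - intercept) / slope             # S507: L reaches 98% of P
    return a, b, sse
```

Applying the same function to the reversed histogram and mapping the results back (x becomes len(hist) - 1 - x) is one way to obtain c and d.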
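[0023]
The above describes vertex position detection in the horizontal direction; applying exactly the same method with the vertical and horizontal axes swapped gives vertex position detection in the vertical direction. As shown in FIG. 6, this yields four coordinate values in each of the horizontal and vertical directions around the rectangle, that is, eight variables (ax, bx, cx, dx, ay, by, cy, dy), as vertex candidates.
[0024]
The eight variables are uniquely determined by the rectangle, but the converse does not hold: the rectangle cannot be determined from the eight variables. As shown in FIG. 7, a rectangle based on the eight variables can have four different sets of vertex coordinates depending on its skew direction (hereinafter, skew mode), and the variables alone do not tell which one applies. In FIG. 7, the special case in which the rectangle is perfectly horizontal (not shown) is mode 0, and the four skew modes are modes 1, 2, -1, and -2. Because the images in each of these modes produce histograms of the same shape, the skew mode must be determined by a method other than the histogram.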
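[0025]
● Skew mode determination
The skew mode determination processing in this embodiment is described in detail below.
[0026]
FIG. 6 corresponds to the mode 1 state shown in FIG. 7. If the black pixels inside the two triangular regions shown hatched in FIG. 6 are counted, the result should ideally be 0, because those regions contain no black pixels (the halftone-dot portion in the figure). Below, the hatched region in FIG. 6 is called the mode 1 template. For the other modes shown in FIG. 7, the two triangular regions at the top of the rectangle are likewise used as the template for each mode.
[0027]
As shown in FIG. 8, when the mode 2 template (hatched in the figure) is applied to a rectangular image that is actually in mode 1 (halftone-dot portion in the figure), part of the rectangle intrudes into the template, so the black pixel count is not 0. This embodiment uses this property to determine the skew mode of the rectangle.
[0028]
Specifically, the number of black pixels is counted with each of the four mode templates applied to the rectangular image under examination, and the mode whose template gives the smallest count is chosen as the skew mode of the image. In other words, the template that the rectangle intrudes into least is taken to represent the skew that the rectangular image most closely matches. Alternatively, the black pixels outside each template can be counted, and the skew mode determined from the template that gives the largest count.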
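The selection rule (smallest black-pixel count inside the template) can be sketched as follows. How the triangular templates are built from the eight variables depends on FIGS. 6 and 7, which are not reproduced here, so this sketch simply takes each template as a precomputed collection of (x, y) pixel coordinates. That representation and the function name are assumptions made for illustration.

```python
def pick_skew_mode(buf, templates, black=255):
    """Return (best_mode, counts) where best_mode has the fewest black pixels.

    `templates` maps a mode number (1, 2, -1, -2) to the list of (x, y)
    coordinates covered by that mode's two triangular regions.
    """
    counts = {
        mode: sum(1 for (x, y) in pixels if buf[y][x] == black)
        for mode, pixels in templates.items()
    }
    best_mode = min(counts, key=counts.get)
    return best_mode, counts
```

With noisy input the counts are rarely exactly 0, which is why the rule takes the minimum rather than testing for zero.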
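[0029]
With the above processing, the coordinates of the four vertices of the rectangular region in the binary image can be determined. Based on this information, well-known skew correction and image extraction can be applied to the original image to cut out, for example, a rectangular photograph placed on the image scanner 110.
[0030]
For the skew correction, simple interpolation (nearest neighbor) or linear interpolation (bilinear) is preferable; the former is faster and the latter gives better image quality, so the choice can be made according to the application.
[0031]
As described above, this embodiment makes it possible to detect a rectangular region accurately even in a noisy environment.
[0032]
If the rectangular region is a perfect right-angled rectangle, the ideal histogram is a trapezoid whose left and right oblique sides are symmetric. Because of noise, however, the two oblique sides may not be symmetric; that is, the distance between a and b in FIG. 3 may differ from the distance between c and d. In that case, the trapezoid approximation is made to match the data obtained from the oblique side that is considered less affected by noise. Specifically, the accuracy of the straight-line approximation by regression analysis (the squared error between the line and the actual data) is compared for the left and right oblique sides, and the side with the smaller error is adopted.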
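One possible reading of this correction is sketched below. The text only says that the side with the smaller regression error is adopted; which of the two points on the noisier side is kept fixed is not specified, so keeping the top-base points b and c and moving the foot points a or d is purely an assumption of this sketch, as is the function name.

```python
def symmetrize_trapezoid(a, b, c, d, sse_left, sse_right):
    """Make the oblique sides equally wide, trusting the side with less error."""
    if sse_left <= sse_right:
        # Left fit is more reliable: move d so that d - c equals b - a.
        return a, b, c, c + (b - a)
    # Right fit is more reliable: move a so that b - a equals d - c.
    return b - (d - c), b, c, d
```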
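[0033]
Similarly, the quadrilateral formed by connecting the four vertices obtained by rectangle detection may, because of noise, not be a right-angled rectangle; adjacent sides may not be perpendicular. In this case too, the accuracy of rectangle detection can be improved by correcting the quadrilateral into a right-angled rectangle based on the data considered less affected by noise.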
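[0034]
<Second Embodiment>
A second embodiment of the present invention is described below. The system configuration in the second embodiment is the same as in the first embodiment, so its description is omitted.
[0035]
The first embodiment dealt with the case where the image contains one rectangular region. A typical scenario is scanning a single photograph placed, not necessarily straight, on the platen of the image scanner 110 and then automatically performing skew correction and image extraction to obtain the photograph.
[0036]
However, the platen of a typical flatbed scanner accommodates documents of about A4 size, so several ordinary photographs (so-called L size) can be placed on it at once. If multiple photographs can be read in a single scan and only the rectangular regions representing the photographs cut out of the resulting image, scanning becomes more efficient.
[0037]
The second embodiment is therefore characterized by detecting a plurality of rectangular images in a single image.
[0038]
The detection of multiple rectangular images in the second embodiment is described with reference to FIG. 9. The left part of FIG. 9 shows the binarized image obtained when two photographs are placed apart from each other in the sub-scanning direction of the platen (the long direction in the figure), and the right part shows the histogram of that image along the sub-scanning direction. As the figure shows, when several photographs are placed apart, several trapezoids appear in the histogram. In the second embodiment, this histogram is used to split the image in two between the trapezoids, that is, at the position indicated by the dash-dot line L in FIG. 9. The method of the first embodiment can then be applied to each of the resulting regions to detect the rectangular image in each.
[0039]
FIG. 9 illustrates photographs placed apart in the sub-scanning direction; when they are placed apart in the main scanning direction (the short direction in the figure), a histogram along the main scanning direction can likewise be created to split the image in that direction and extract each rectangular image.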
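Locating the cut position between two trapezoids can be sketched as follows. The text does not say how the position of line L is chosen, so the threshold `gap_level`, the minimum gap width `min_gap`, and the choice of the gap midpoint are assumptions of this sketch; the function name is hypothetical.

```python
def find_split_positions(hist, gap_level=0, min_gap=1):
    """Return one split index for each low run lying between two high runs."""
    splits = []
    seen_object = False      # becomes True once the first trapezoid is reached
    gap_start = None         # first index of the current low run, if any
    for x, count in enumerate(hist):
        if count > gap_level:
            if seen_object and gap_start is not None and x - gap_start >= min_gap:
                splits.append((gap_start + x - 1) // 2)   # middle of the gap
            seen_object = True
            gap_start = None
        elif gap_start is None:
            gap_start = x
    return splits
```

On noisy scans the histogram between photographs rarely drops to exactly 0, so a `gap_level` above 0 and a `min_gap` of several pixels would probably be needed in practice.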
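[0040]
As described above, the second embodiment makes it possible to detect a plurality of rectangular images in a single image.
[0041]
<Third Embodiment>
A third embodiment of the present invention is described below. The system configuration in the third embodiment is the same as in the first embodiment, so its description is omitted.
[0042]
The third embodiment is characterized by applying ideal labeling to a binary image such as that of FIG. 2 to remove the influence of noise and detect rectangular images with higher accuracy.
[0043]
Here, ideal labeling of the binary image of FIG. 2 means, based on the connectivity of black pixels, assigning for example the pixel value 1 to every pixel of the upper rectangular object and the pixel value 2 to every pixel of the lower rectangular object. Small noise components are given no label and keep the pixel value 255.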
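[0044]
● Outline of labeling process
A labeling method that achieves this ideal labeling is described in detail below.
[0045]
FIG. 10 corresponds to the upper-left part of FIG. 2, including the upper rectangular object, and illustrates how the block that serves as the starting point of the fill processing in the third embodiment is searched for.
[0046]
First, as shown in FIG. 10, the entire W×H-pixel bitmap of FIG. 2 is divided into blocks of w×h pixels, and a block whose interior consists entirely of black pixels with the value 255 (hereinafter referred to as a "black block") is searched for in the direction of the arrows in the figure. The block drawn as a black rectangle in the figure is the black block that was found.
[0047]
In the third embodiment, fast labeling is achieved by filling the region whose black pixels are connected to the black block that was found. FIG. 11 is a flowchart of the overall labeling process in the third embodiment.
[0048]
First, in step S1401, the variable c is set to 1, and in step S1402 a black block search is performed. The black block search is described in detail later.
[0049]
Next, in step S1403, it is determined whether a black block was found. If one was found, the process proceeds to step S1404; otherwise the process ends.
[0050]
In step S1404, starting from the black block that was found, the connected black-pixel region is filled with the label c. The fill processing is described in detail later.
[0051]
In step S1405, c is incremented by 1, the process returns to step S1402, and the above processing is repeated.
[0052]
As a result, labels 1, 2, …, c are assigned only to objects of at least a certain size.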
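[0053]
● Black block search
The black block search of step S1402 in FIG. 11 is described in detail with the flowchart of FIG. 12.
[0054]
First, in steps S1501 to S1504, the variables y, x, j, and i are each initialized to 0. In step S1505, it is determined whether buf(x+i,y+j) is 255, indicating a black pixel. If it is 255, the process proceeds to step S1506; otherwise it proceeds to step S1510.
[0055]
In step S1506, i is incremented by 1. In step S1507, if i equals the block width w, the process proceeds to step S1508; otherwise it returns to step S1505.
[0056]
In step S1508, j is incremented by 1. In step S1509, if j equals the block height h, the process ends, and the w×h-pixel block whose upper-left corner is (x,y) is detected as a black block. If j differs from h, the process returns to step S1504.
[0057]
In step S1510, w is added to x. In step S1511, if x is greater than or equal to the image width W, the process proceeds to step S1512; otherwise it returns to step S1503.
[0058]
In step S1512, h is added to y. In step S1513, if y is greater than or equal to the image height H, the process ends, and it is determined that no black block exists. If y is less than H, the process returns to step S1502.
[0059]
The above processing detects a black block of w×h pixels in the W×H-pixel binary image.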
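The grid-aligned search of FIG. 12 can be sketched as below. The flowchart as described does not mention what happens when a block would extend past the image edge; this sketch only visits blocks that lie fully inside the image, which is an assumption. The bitmap is again taken to be a list of rows (`buf[y][x]`), and the function and parameter names are hypothetical.

```python
def find_black_block(buf, width, height, bw, bh, black=255):
    """Return the upper-left (x, y) of the first all-black bw x bh grid block."""
    for y in range(0, height - bh + 1, bh):          # S1512: y advances by h
        for x in range(0, width - bw + 1, bw):       # S1510: x advances by w
            if all(buf[y + j][x + i] == black        # S1505: every pixel black?
                   for j in range(bh) for i in range(bw)):
                return x, y                          # S1509: black block found
    return None                                      # S1513: no black block
```

Because `all()` stops at the first non-black pixel, the block is abandoned as early as in the flowchart.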
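[0060]
● Filling the connected region
The fill of the connected region in step S1404 of FIG. 11 is described in detail with the flowchart of FIG. 13.
[0061]
Various methods of filling an object are known, and any of them may be used in the third embodiment. A basic method is therefore described here, and for simplicity the object is assumed to be a simple convex shape.
[0062]
As a result of the black block search, a block whose upper-left corner is (x,y) has been detected as a black block, as shown in FIG. 14. In the fill processing of the third embodiment, the region connected by black pixels is filled with the label c, starting from the center point of this black block, (x+w/2, y+h/2). That is, pixel values of 255 are replaced with the value c.
[0063]
Specifically, in step S601, the black connected portion on the scan line containing the center point (x+w/2, y+h/2) of the black block is filled with the label c. That is, the segment joining the left end (x0, y+h/2) and the right end (x1, y+h/2) of that scan line within the object containing the black block of FIG. 14 (the segment that cuts the object horizontally in FIG. 14) is filled with the label c. This segment joining (x0, y+h/2) and (x1, y+h/2) is called the connected line segment below, and its fill processing is described in detail later.
[0064]
In step S602, the black connected portion below the connected line segment is filled, and in step S603 the black connected portion above it is filled. The fill processing of steps S602 and S603 is described in detail later.
[0065]
As a result, the region connected to the black block by black pixels is filled with the label c.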
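[0066]
● Filling the connected line segment
The fill of the connected line segment in step S601 of FIG. 13 is described in detail with the flowchart of FIG. 15. For simplicity, the center point of the black block that serves as the starting point is written (X,Y); that is, X=x+w/2 and Y=y+h/2.
[0067]
First, in step S801, X is assigned to i, and in step S802 i is decremented by 1. In step S803, it is determined whether buf(i,Y) equals 255. If it does not, the process proceeds to step S804; if it does, the process returns to step S802 and decrements i again.
[0068]
In step S804, i+1 is stored in x0, and in step S805 X is assigned to i. In step S806, i is incremented by 1, and in step S807 it is determined whether buf(i,Y) equals 255. If it does not, the process proceeds to step S808; if it does, the process returns to step S806 and increments i again.
[0069]
In step S808, i-1 is stored in x1; in step S809, x0 is assigned to i; and in step S810, c is assigned to buf(i,Y). In step S811, i is incremented by 1, and in step S812, if i is greater than x1 the process ends; otherwise it returns to step S810.
[0070]
This processing stores the x coordinates of the two ends of the connected line segment in x0 and x1 and fills the segment with the label c.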
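The horizontal run fill of FIG. 15 can be sketched as follows. Unlike the flowchart as described, the sketch stops at the image border so that it cannot index outside the row; that guard, the list-of-rows layout (`buf[y][x]`), and the names are assumptions added for illustration.

```python
def fill_run(buf, width, seed_x, row_y, label, black=255):
    """Fill the black run through (seed_x, row_y) with `label`; return (x0, x1)."""
    left = seed_x
    while left - 1 >= 0 and buf[row_y][left - 1] == black:       # S802, S803
        left -= 1
    right = seed_x
    while right + 1 < width and buf[row_y][right + 1] == black:  # S806, S807
        right += 1
    for i in range(left, right + 1):                             # S809 to S812
        buf[row_y][i] = label
    return left, right
```

The returned pair corresponds to x0 and x1, the ends of the connected line segment that the fills above and below it then scan.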
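[0071]
● Filling below (above) the connected line segment
The fill of the black connected portion below the connected line segment in step S602 of FIG. 13 is described in detail with the flowchart of FIG. 16. Here too Y=y+h/2, so the black connected portion below the connected line segment joining (x0,Y) and (x1,Y) is filled with the label c.
[0072]
First, in step S901, Y is incremented by 1, and in step S902 x0 is assigned to i. In step S903, it is determined whether buf(i,Y) equals 255. If it does not, the process proceeds to step S904; if it does, the process proceeds to step S906.
[0073]
In step S904, i is incremented by 1. In step S905, if i is greater than x1 the process ends; otherwise it returns to step S903.
[0074]
In step S906, i is assigned to X, and in step S907 the connected line segment on the scan line is filled horizontally, starting from (X,Y). This scan-line fill was shown in the flowchart of FIG. 15, so its detailed description is omitted here.
[0075]
The fill of the black connected portion above the connected line segment in step S603 of FIG. 13 is basically the same as the downward fill shown in the flowchart of FIG. 16: Y is decremented instead of incremented in step S901, and the other steps are the same.
[0076]
With the labeling described above, only objects that contain a black block of the predetermined size, that is, objects with at least a certain area, are labeled. Small noise regions are therefore never labeled, and fast labeling is possible.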
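The overall loop of FIG. 11 can be sketched end to end as below. Since the text allows any fill method, this sketch swaps in a simple stack-based, 4-connected flood fill in place of the scan-line fill of FIGS. 13 to 16. All names are hypothetical, the bitmap is a list of rows (`buf[y][x]`), and the sketch assumes fewer than 255 objects so that a label never collides with the black value 255.

```python
def find_black_block(buf, width, height, bw, bh, black=255):
    """Grid-aligned search for an all-black bw x bh block (None if absent)."""
    for y in range(0, height - bh + 1, bh):
        for x in range(0, width - bw + 1, bw):
            if all(buf[y + j][x + i] == black
                   for j in range(bh) for i in range(bw)):
                return x, y
    return None


def flood_fill(buf, width, height, seed_x, seed_y, label, black=255):
    """Relabel every black pixel 4-connected to the seed (not the patent's fill)."""
    stack = [(seed_x, seed_y)]
    while stack:
        x, y = stack.pop()
        if 0 <= x < width and 0 <= y < height and buf[y][x] == black:
            buf[y][x] = label
            stack.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])


def label_large_objects(buf, width, height, bw, bh):
    """Label objects containing a black block; return how many were labeled."""
    label = 1                                                  # S1401
    while True:
        block = find_black_block(buf, width, height, bw, bh)   # S1402
        if block is None:                                      # S1403
            return label - 1
        x, y = block
        # S1404: fill from the block center with the current label.
        flood_fill(buf, width, height, x + bw // 2, y + bh // 2, label)
        label += 1                                             # S1405
```

Once an object is filled its pixels no longer equal 255, so the next search skips it and the loop terminates when only noise smaller than a block remains.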
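[0077]
● Rectangle detection after labeling
After the labeling, in the two-dimensional array buf() of FIG. 2, every pixel of the upper rectangular object holds the pixel value 1 as label 1, every pixel of the lower rectangular object holds the pixel value 2 as label 2, the black pixels of the remaining small noise components hold 255, and the white background pixels hold 0.
[0078]
In the third embodiment, the position (vertex coordinates) of the rectangular object is determined for each labeled region. That is, the rectangle detection described in the first and second embodiments is applied to the labeled image. Specifically, in the histogram creation flowchart of FIG. 4, the value against which buf() is compared in step S404 is changed from 255, indicating a black pixel, to the label value (1 or 2 in this case). The regions of label 1 and label 2 are thereby detected as rectangles.
[0079]
Two rectangles were used as an example here, but it goes without saying that the labeling can be applied to a binary image containing any number of rectangles and rectangle detection performed for each label.
[0080]
As described above, the third embodiment applies ideal labeling to the binary image to remove the influence of noise and detect rectangular images with higher accuracy.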
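[0081]
<Fourth Embodiment>
A fourth embodiment of the present invention is described below. The system configuration in the fourth embodiment is the same as in the first embodiment, so its description is omitted.
[0082]
In the third embodiment, the search for a black block first divided the whole binary image into blocks and then checked whether the interior of each block consisted entirely of black pixels. Other search methods can also be used, and the fourth embodiment describes one in which the upper-left coordinates of the block are varied continuously within the binary image. This method has a higher computational load but improves black block detection performance.
[0083]
FIG. 17 shows the search for a black block inside a rectangular object in the fourth embodiment. As the figure shows, the search origin (x,y) in the fourth embodiment lies off the block boundaries shown in FIG. 10 for the third embodiment.
[0084]
FIG. 18 is a flowchart of the black block search in the fourth embodiment, which is described in detail below.
[0085]
First, in steps S1101 to S1104, the variables y, x, j, and i are each initialized to 0. In step S1105, it is determined whether buf(x+i,y+j) is 255, indicating a black pixel. If it is 255, the process proceeds to step S1106; otherwise it proceeds to step S1110.
[0086]
In step S1106, i is incremented by 1. In step S1107, if i equals the block width w, the process proceeds to step S1108; otherwise it returns to step S1105.
[0087]
In step S1108, j is incremented by 1. In step S1109, if j equals the block height h, the process ends, and the w×h-pixel block whose upper-left corner is (x,y) is detected as a black block. If j differs from h, the process returns to step S1104.
[0088]
In step S1110, i+1 is added to x. Simply incrementing x by 1 here would ultimately give exactly the same result, but adding i+1 avoids wasted searching and speeds up the processing.
[0089]
In step S1111, if x is greater than or equal to the image width W, the process proceeds to step S1112; otherwise it returns to step S1103.
[0090]
In step S1112, y is incremented by 1. In step S1113, if y equals the image height H, the process ends, and it is determined that no black block exists. If y does not equal H, the process returns to step S1102.
[0091]
The above processing detects a black block of w×h pixels in the W×H-pixel binary image. The filling of the connected black-pixel region and the rectangle detection can then be performed as in the third embodiment.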
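The sliding search of FIG. 18, including the jump of i+1 in step S1110, can be sketched as below. The jump is safe because every candidate block starting between x and x+i on the same row would still contain the non-black pixel at column x+i. As in the earlier sketches, restricting the search to blocks that lie fully inside the image is an assumption, and the names are hypothetical.

```python
def find_black_block_sliding(buf, width, height, bw, bh, black=255):
    """Return the upper-left (x, y) of the first all-black block at any offset."""
    for y in range(height - bh + 1):                 # S1112: y advances by 1
        x = 0
        while x <= width - bw:
            skip = 0
            for j in range(bh):
                for i in range(bw):
                    if buf[y + j][x + i] != black:   # S1105 fails at offset i
                        skip = i + 1                 # S1110: jump past that column
                        break
                if skip:
                    break
            if not skip:
                return x, y                          # S1109: black block found
            x += skip
    return None                                      # S1113: no black block
```

A 2×2 search over an object whose corner sits at an odd coordinate shows the difference from the grid-aligned search, which can step over such an object entirely.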
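[0092]
As described above, the fourth embodiment detects a black block of the predetermined size with higher accuracy. Only valid objects are therefore labeled, with higher accuracy, which in turn enables more accurate rectangle detection.
[0093]
In the third and fourth embodiments, a block is judged to be a black block when all its pixels are black, but depending on the noise, many white pixels may be mixed into the interior of a black region. In the present invention, therefore, the number of black pixels inside the block may be counted, and the block judged to be a black block when this count exceeds a predetermined threshold.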
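The relaxed criterion can be sketched as a small predicate. The threshold value itself is not given in the text, so `min_black` is a hypothetical parameter that the caller would have to tune; the function name is likewise an assumption.

```python
def is_black_block(buf, x, y, bw, bh, min_black, black=255):
    """True if the bw x bh block at (x, y) holds more than `min_black` black pixels."""
    count = sum(1 for j in range(bh) for i in range(bw)
                if buf[y + j][x + i] == black)
    return count > min_black
```

Setting `min_black` to `bw * bh - 1` reproduces the strict all-black test of the third and fourth embodiments.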
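[0094]
The black block searched for need not be rectangular; it may be a polygon such as a hexagon, or a circle or ellipse, and it may be made selectable according to the expected shape of the object.
[0095]
The size of the search block is also not limited to a single predetermined size. For example, the search may start with a large block size and, if no object is found, be repeated with progressively smaller sizes. This avoids the risk of mistaking an object smaller than expected for noise and ignoring it.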
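[0096]
[Other embodiments]
The present invention may be applied to a system made up of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copier or a facsimile machine).
[0097]
It goes without saying that the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which is recorded the program code of software that implements the functions of the embodiments described above, and having the computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium.
[0098]
In this case, the program code read from the storage medium itself implements the functions of the embodiments, and the storage medium storing that program code constitutes the present invention.
[0099]
Storage media that can be used to supply the program code include, for example, floppy disks, hard disks, optical disks, magneto-optical disks, CD-ROMs, CD-Rs, magnetic tape, non-volatile memory cards, and ROMs.
[0100]
It also goes without saying that the invention covers not only the case where the functions of the embodiments are implemented by the computer executing the program code it has read, but also the case where an OS (operating system) or the like running on the computer performs part of the actual processing according to the instructions of the program code, and the functions of the embodiments are implemented by that processing.
[0101]
It further goes without saying that the invention covers the case where the program code read from the storage medium is written to memory provided on a function expansion board inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like on that board or unit performs part or all of the actual processing according to the instructions of the program code, and the functions of the embodiments are implemented by that processing.
[0102]
[Effects of the invention]
As described above, the present invention makes it possible to detect a rectangular region with high accuracy even in a noisy environment.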
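[Brief description of the drawings]
[FIG. 1] A block diagram showing the general configuration of an image processing system in an embodiment of the present invention.
[FIG. 2] A diagram showing an example of a binary image to be processed in the embodiment.
[FIG. 3] A diagram showing an example of a horizontal histogram of black-pixel counts in the embodiment.
[FIG. 4] A flowchart showing the histogram calculation processing in the embodiment.
[FIG. 5] A flowchart showing the method of approximating the histogram with a trapezoid in the embodiment.
[FIG. 6] A diagram showing an example of the detected coordinates of a rectangular image in the embodiment.
[FIG. 7] A diagram showing the four skew modes in the embodiment.
[FIG. 8] A diagram explaining the principle of skew mode determination in the embodiment.
[FIG. 9] A diagram showing the histogram obtained when several photographs are placed, in the second embodiment.
[FIG. 10] A diagram explaining the black block search method in the third embodiment.
[FIG. 11] A flowchart outlining the labeling processing in the third embodiment.
[FIG. 12] A flowchart showing the black block search processing in the third embodiment.
[FIG. 13] A flowchart showing the fill processing in the third embodiment.
[FIG. 14] A diagram explaining the method of filling the connected line segment in the third embodiment.
[FIG. 15] A flowchart showing the fill processing for the connected line segment in the third embodiment.
[FIG. 16] A flowchart showing the fill processing for the region below the connected line segment in the third embodiment.
[FIG. 17] A diagram explaining the black block search method in the fourth embodiment.
[FIG. 18] A flowchart showing the black block search processing in the fourth embodiment.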
[Description of reference numerals]
101 CPU
102 Keyboard
102a Mouse
103 Display unit
104 ROM
105 RAM
106 Hard disk
107 Floppy disk
108 Printer
109 Network I/F
110 Image scanner
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image processing method and an image processing apparatus, and more particularly to an image processing method and an image processing apparatus that detect a rectangular image.
[0002]
[Prior art]
Conventionally, various algorithms for detecting and cutting out a rectangular area from an original image have been proposed. For example, Japanese Patent Application Laid-Open No. 8-237537 describes a technique for searching a rectangular region by first extracting a contour and searching for a portion where the contour forms a straight line.
[0003]
[Problems to be solved by the invention]
However, with conventional rectangle extraction algorithms, it is difficult in a noisy environment to detect the straight portions of a rectangular contour and, further, to associate four straight lines with one another to detect a single rectangle. Accurate extraction was therefore difficult, and the accuracy of the extraction result was not sufficient.
[0004]
The present invention has been made to solve the above problem, and its object is to provide an image processing method and an image processing apparatus capable of detecting a rectangular region with high accuracy even in a noisy environment.
[0005]
[Means for Solving the Problems]
As a technique for achieving the above object, the image processing method of the present invention includes the following steps.
[0006]
That is, an image processing method for detecting a rectangular region from an image comprises: a histogram creation step of creating, for an image obtained by binarizing an original image, histograms of the number of black pixels in the horizontal and vertical directions; a rectangular range detection step of detecting the range in which the rectangular region exists in the binary image by approximating each histogram with a trapezoid; and an inclination direction detection step of detecting the inclination direction of the rectangular region within that range. In the rectangular range detection step, the rectangular region is assumed to be a right-angled rectangle, and, so that the histogram takes the shape of a trapezoid whose left and right oblique sides are symmetric, the accuracy of the straight-line approximation by regression analysis is compared between the left and right oblique sides, and the data of one oblique side is corrected to match the data obtained from the other oblique side, which has the smaller error.
[0007]
The method further comprises a labeling step of labeling the binary image, and in the histogram creation step a histogram of the number of labeled pixels is created.
[0008]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment according to the present invention will be described in detail with reference to the drawings.
[0009]
<First Embodiment>
● System configuration
FIG. 1 is a block diagram showing a configuration of a computer system in which the present invention is executed. A CPU 101 controls the entire system. Reference numeral 102 denotes a keyboard which is used for data input to the system together with the mouse 102a. Reference numeral 103 denotes a display device, which includes a CRT, a liquid crystal, and the like. 104 is a ROM, and 105 is a RAM, which constitutes a storage device of the system, and stores programs executed by the system and data used by the system. 106 is a hard disk device (HDD), 107 is a floppy disk device (FDD), and constitutes an external storage device used for the file system of the system. Reference numeral 108 denotes a printer.
[0010]
Reference numeral 109 denotes a network I/F, which is connected to a LAN or WAN. Data files such as computer programs and digital images can be exchanged with other computer systems via the I/F 109. Programs and data are recorded on the hard disk 106 or loaded directly into the RAM 105 and executed.
[0011]
Reference numeral 110 denotes an image scanner, which can read a photo or printed matter on a platen and import it into the RAM 105 as multivalued digital image data.
[0012]
● Rectangle image overview
FIG. 2 shows an example of an image taken from the image scanner 110 and binarized. A rectangle in the figure indicates a rectangular document image (hereinafter referred to as a rectangular image) such as a photograph, and three ellipses indicate noise. Although the noise is shown in a simplified manner in the figure for the sake of explanation, the actual noise is indefinite in size and shape, and the number thereof is often about several hundred to several thousand. In addition, it is assumed that the binarization of an image uses a known simple binarization method by comparison with a predetermined threshold, and detailed description thereof is omitted here.
[0013]
The image shown in FIG. 2 is held in the RAM 105 as a bitmap W pixels wide and H pixels high. Each pixel is 1 byte deep; a black pixel has the value 255 and a white pixel the value 0. The bitmap in the RAM 105 can be accessed as a two-dimensional array, and the pixel that is i-th in the horizontal direction and j-th in the vertical direction, counting from the upper left of the bitmap, is written buf(i,j). In this embodiment, the coordinates of the region where the rectangular image exists are determined from this bitmap by the following method.
[0014]
● Histogram creation
FIG. 3 shows a histogram of black-pixel counts along the horizontal direction, assuming an ideal state in which the binary image of FIG. 2 contains no noise; the upper part of the figure shows the binary image and the lower part an example histogram. The figure shows that the histogram forms an approximate trapezoid corresponding to the four vertices of the rectangle in the binary image. Because an actual binary image contains noise, the histogram is never a perfect trapezoid, but its trapezoidal shape provides enough information to locate the vertices in the horizontal direction.
[0015]
FIG. 4 is a flowchart of the process that creates the histogram shown in FIG. 3. Below, the array that stores the histogram data is denoted hist().
[0016]
First, in steps S401 and S402, the variables x and y are each set to 0, and in step S403 hist(x) is set to 0.
[0017]
In step S404, the pixel buf(x,y) on the bitmap is compared with the value 255. If they differ, the process proceeds to step S406; if they are equal, it proceeds to step S405, increments hist(x) by 1, and then proceeds to step S406.
[0018]
In step S406, the variable y is incremented by 1 and then in step S407, y is compared with the height H of the bitmap. If y and H are different, the process returns to step S404. If they are equal, the process proceeds to step S408, where the variable x is incremented by 1, and x and the bitmap width W are compared in step S409. If x and W are different, the process returns to step S402, but if they are equal, the process ends.
[0019]
● Vertex detection processing
Through the above processing, a histogram array hist () obtained by counting black pixels is obtained. As shown in FIG. 3, this histogram has a trapezoid surrounded by the X axis and three straight lines. In the present embodiment, in this trapezoid, a position where the left side intersects the X axis is a, a position where the left side and the upper base intersect is b, a position where the right side and the upper base intersect is c, and a position where the right side intersects the X axis is d.
[0020]
Here, a method of obtaining the positions a and b by analyzing the left side of the trapezoid represented by the histogram will be described using the flowchart of FIG.
[0021]
First, in step S501, the peak point P at which hist(x) is greatest is detected. In the histogram of FIG. 3, there is a peak of height 800 near X=400. Next, in step S502, the point F at which the height is half the peak is detected in the forward direction from P. In the histogram of FIG. 3, the position near X=100 corresponds to F. In step S503, the point F0 at which the height is 1/4 of the peak is detected in the forward direction from F, and in step S504 the point F1 at which the height is 3/4 of the peak is detected in the backward direction from F.
[0022]
Next, in step S505, a straight line L as shown in FIG. 3 is obtained by performing a linear approximation using a well-known regression analysis for all points of F0 ≦ X ≦ F1. In step S506, a point where the straight line L intersects the X axis is calculated, and this point is obtained as a. In step S507, a point where the straight line L is 98% of the height of the peak P is calculated, and this point is obtained as b. This is because the peak P includes noise and is corrected to 98% in consideration of the fact that the peak P appears more than the actual peak.
The flowchart of FIG. 5 has been described for the case of searching forward from the peak P to obtain a and b; searching backward from P in the same way yields c and d.
[0023]
The above describes vertex position detection in the horizontal direction; applying exactly the same method with the vertical and horizontal axes swapped gives vertex position detection in the vertical direction. As shown in FIG. 6, this yields four coordinate values in each of the horizontal and vertical directions around the rectangle, that is, eight variables (ax, bx, cx, dx, ay, by, cy, dy), as vertex candidates.
[0024]
The eight variables are uniquely determined by the rectangle, but the converse does not hold: the rectangle cannot be determined from the eight variables. As shown in FIG. 7, a rectangle based on the eight variables can have four different sets of vertex coordinates depending on its skew direction (hereinafter referred to as skew mode), and the variables alone do not tell which one applies. In FIG. 7, the special case in which the rectangle is perfectly horizontal (not shown) is mode 0, and the four skew modes are modes 1, 2, -1, and -2. Because the images in each of these modes produce histograms of the same shape, the skew mode must be determined by a method other than the histogram.
[0025]
● Skew mode judgment
Hereinafter, the skew mode determination processing in the present embodiment will be described in detail.
[0026]
FIG. 6 corresponds to the mode 1 state shown in FIG. 7. If the black pixels inside the two triangular regions shown hatched in FIG. 6 are counted, the result should ideally be 0, because those regions contain no black pixels (the halftone-dot portion in the figure). Below, the hatched region in FIG. 6 is called the mode 1 template. For the other modes shown in FIG. 7, the two triangular regions at the top of the rectangle are likewise used as the template for each mode.
[0027]
As shown in FIG. 8, when the mode 2 template (hatched in the figure) is applied to a rectangular image that is actually in mode 1 (halftone-dot portion in the figure), part of the rectangle intrudes into the template, so the black pixel count is not 0. This embodiment uses this property to determine the skew mode of the rectangle.
[0028]
Specifically, the number of black pixels is counted with each of the four mode templates applied to the rectangular image under examination, and the mode whose template gives the smallest count is chosen as the skew mode of the image. In other words, the template that the rectangle intrudes into least is taken to represent the skew that the rectangular image most closely matches. Alternatively, the black pixels outside each template can be counted, and the skew mode determined from the template that gives the largest count.
[0029]
With the above processing, the four vertex coordinates of the rectangular area in the binary image can be specified. Therefore, based on this information, for example, a rectangular photographic image arranged on the image scanner 110 can be cut out by performing known skew correction and image cut-out processing on the original image.
[0030]
For the known skew correction processing, simple interpolation (nearest neighbor) or linear interpolation (bilinear) is preferable; the former is faster and the latter gives better image quality, so the choice can be made according to the application.
[0031]
As described above, according to the present embodiment, a rectangular region can be accurately detected even in a noisy environment.
[0032]
If the rectangular region is a perfect right-angled rectangle, the ideal histogram is a trapezoid whose left and right oblique sides are symmetric. Because of noise, however, the two oblique sides may not be symmetric; that is, the distance between a and b in FIG. 3 may differ from the distance between c and d. In that case, the trapezoid approximation is made to match the data obtained from the oblique side that is considered less affected by noise. Specifically, the accuracy of the straight-line approximation by regression analysis (the squared error between the line and the actual data) is compared for the left and right oblique sides, and the side with the smaller error is adopted.
[0033]
Similarly, the quadrilateral formed by connecting the four vertices obtained by rectangle detection may, because of noise, not be a right-angled rectangle; adjacent sides may not be perpendicular. In this case too, the accuracy of rectangle detection can be improved by correcting the quadrilateral into a right-angled rectangle based on the data considered less affected by noise.
[0034]
<Second Embodiment>
Hereinafter, a second embodiment according to the present invention will be described. The system configuration in the second embodiment is the same as that of the first embodiment described above, and a description thereof will be omitted.
[0035]
The first embodiment dealt with the case where the image contains one rectangular region. A typical scenario is scanning a single photograph placed, not necessarily straight, on the platen of the image scanner 110 and then automatically performing skew correction and image extraction to obtain the photograph.
[0036]
However, the platen of a typical flatbed scanner accommodates documents of about A4 size, so several ordinary photographs (so-called L size) can be placed on it at once. If multiple photographs can be read in a single scan and only the rectangular regions representing the photographs cut out of the resulting image, scanning becomes more efficient.
[0037]
Therefore, the second embodiment is characterized in that a plurality of rectangular images are detected from one image.
[0038]
The detection of multiple rectangular images in the second embodiment is described with reference to FIG. 9. The left part of FIG. 9 shows the binarized image obtained when two photographs are placed apart from each other in the sub-scanning direction of the platen (the long direction in the figure), and the right part shows the histogram of that image along the sub-scanning direction. As the figure shows, when several photographs are placed apart, several trapezoids appear in the histogram. In the second embodiment, this histogram is used to split the image in two between the trapezoids, that is, at the position indicated by the dash-dot line L in FIG. 9. The method of the first embodiment can then be applied to each of the resulting regions to detect the rectangular image in each.
[0039]
FIG. 9 illustrates photographs placed apart in the sub-scanning direction; when they are placed apart in the main scanning direction (the short direction in the figure), a histogram along the main scanning direction can likewise be created to split the image in that direction and extract each rectangular image.
[0040]
As described above, according to the second embodiment, a plurality of rectangular images can be detected from one image.
[0041]
<Third Embodiment>
The third embodiment according to the present invention will be described below. The system configuration in the third embodiment is the same as that in the first embodiment described above, and a description thereof will be omitted.
[0042]
The third embodiment is characterized in that an ideal labeling process is performed on a binary image such as that shown in FIG. 2, thereby eliminating the influence of noise and detecting a rectangular image with higher accuracy.
[0043]
Here, ideal labeling of the binary image shown in FIG. 2 means that, based on the connectivity of black pixels, the pixel value 1 is given to all the pixels included in the upper rectangular object and the pixel value 2 is given to all the pixels included in the lower rectangular object, for example. The minute noise is not given a label, and its pixel value 255 is left as it is.
[0044]
● Outline of labeling process
Hereinafter, a labeling method for realizing the ideal labeling will be described in detail.
[0045]
FIG. 10 corresponds to the upper left portion of FIG. 2, including the upper rectangular object, and illustrates the method of searching for a block that serves as the starting point of the filling process in the third embodiment.
[0046]
First, as shown in FIG. 10, the entire W pixel × H pixel bitmap shown in FIG. 2 is divided into blocks of w pixels × h pixels, and a block that is entirely occupied by black pixels having the value 255 (hereinafter referred to as a “black block”) is searched for in the direction of the arrow in the figure. The blocks indicated by black rectangles in FIG. 3 are the black blocks found.
[0047]
In the third embodiment, high-speed labeling is realized by filling the region of black pixels connected to a found black block. FIG. 11 is a flowchart illustrating the entire labeling process according to the third embodiment.
[0048]
First, in step S1401, the value 1 is substituted into the variable c, and black block search is performed in step S1402. Details of the black block search process will be described later.
[0049]
Next, in step S1403, it is determined whether a black block is found. If found, the process proceeds to step S1404. If not found, the process ends.
[0050]
In step S1404, starting from the found black block, the black pixel connection region is filled with the label c. Details of the filling process will be described later.
[0051]
In step S1405, the variable c is incremented by 1. The process returns to step S1402, and the above processing is repeated.
[0052]
As a result of this processing, labels 1, 2,..., C can be given only to objects having a size larger than a predetermined size.
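The loop of steps S1401 to S1405 can be sketched as follows. This is only an illustration under stated assumptions: the search and fill routines are passed in as the hypothetical callables find_black_block and fill_connected, the image is a list of rows, and the fill is assumed to overwrite every 255-valued pixel of the found block so that the same block is not found again.

```python
def label_objects(buf, find_black_block, fill_connected):
    """Assign labels 1, 2, ... to every object that contains a black block.

    find_black_block(buf) returns the (x, y) of a black block, or None.
    fill_connected(buf, x, y, label) replaces the connected 255-valued
    pixels with `label` in place.
    Returns the number of labels assigned. Labels are assumed to stay
    below 255 so that they never collide with the black pixel value.
    """
    c = 1                                  # S1401: first label value
    while True:
        found = find_black_block(buf)      # S1402: search for a black block
        if found is None:                  # S1403: none left, so finish
            return c - 1
        x, y = found
        fill_connected(buf, x, y, c)       # S1404: fill with label c
        c += 1                             # S1405: next label
```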
[0053]
● Black block search processing
Details of the black block search process shown in step S1402 of FIG. 11 will be described below with reference to the flowchart of FIG.
[0054]
First, in steps S1501 to S1504, variables y, x, j, and i are each initialized with a value of 0. In step S1505, it is determined whether buf (x + i, y + j) is 255 indicating a black pixel. If 255, the process proceeds to step S1506. If not 255, the process proceeds to step S1510.
[0055]
In step S1506, the variable i is incremented by 1. If i is equal to w, which indicates the block width, in step S1507, the process proceeds to step S1508; otherwise, the process returns to step S1505.
[0056]
In step S1508, the variable j is incremented by 1. If j is equal to h, which indicates the block height, in step S1509, the process ends; at this time, the w × h pixel block having (x, y) as its upper left starting point has been detected as a black block. If j is different from h, the process returns to step S1504.
[0057]
In step S1510, w is added to x. If x is equal to or larger than W indicating the image width in step S1511, the process proceeds to step S1512. Otherwise, the process returns to step S1503.
[0058]
In step S1512, h is added to y. If y is equal to or greater than H indicating the image height in step S1513, the process is terminated, and at this time, it is determined that there is no black block. If y is less than H, the process returns to step S1502.
[0059]
Through the above processing, a black block of w pixels × h pixels is detected from the binary image of W pixels × H pixels.
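The grid search described above can be sketched as follows. The sketch assumes the image is a list of rows indexed as buf[y][x] (the text writes buf (x, y)), and, unlike the flowchart, it only tests blocks that lie completely inside the image; the function name is hypothetical.

```python
def find_black_block(buf, w, h):
    """Return the upper-left (x, y) of the first all-black w x h block, or None.

    Blocks are tested on a fixed grid with pitch w horizontally and h
    vertically, scanning left to right and then top to bottom.
    """
    H = len(buf)
    W = len(buf[0]) if H else 0
    for y in range(0, H - h + 1, h):           # S1512: y advances by h
        for x in range(0, W - w + 1, w):       # S1510: x advances by w
            # S1505 to S1509: every pixel of the block must be 255
            if all(buf[y + j][x + i] == 255
                   for j in range(h) for i in range(w)):
                return (x, y)
    return None
```

Because the grid is fixed, an object is found only if it fully covers at least one grid-aligned block, which is what makes small noise regions invisible to the search.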
[0060]
● Filling the connected area
The details of the connecting area filling process shown in step S1404 of FIG. 11 will be described with reference to the flowchart of FIG.
[0061]
Various methods are known as the object filling process, and any painting method may be used in the third embodiment. Accordingly, the basic method will be described here, and it is assumed that the object has a simple shape that is convex outward for the sake of simplicity.
[0062]
As a result of the black block search process described above, a block having (x, y) as the upper left starting point is detected as a black block as shown in FIG. Therefore, in the filling process of the third embodiment, the region connected by black pixels is filled with the label c, starting from the center point (x + w / 2, y + h / 2) of the black block. That is, pixels having the value 255 are replaced with the value c.
[0063]
Specifically, first, in step S601, the black connected portion on the scanning line that includes the center point (x + w / 2, y + h / 2) of the black block is filled with the label c. That is, the line segment connecting the left end (x0, y + h / 2) and the right end (x1, y + h / 2) of the scanning line within the object including the black block shown in FIG. 14 (the line segment that cuts across the object in the horizontal direction in FIG. 14) is filled with the label c. The line segment connecting (x0, y + h / 2) and (x1, y + h / 2) is hereinafter referred to as the connecting line segment; the details of its filling process will be described later.
[0064]
In step S602, the black connected portion below the connecting line segment is filled, and in step S603, the black connected portion above the connecting line segment is filled. Details of the painting process in steps S602 and S603 will be described later.
[0065]
As a result of this processing, the region of black pixels connected to the black block can be filled with the label c.
[0066]
● Fill line processing
The details of the connecting line segment filling process shown in step S601 of FIG. 13 will be described with reference to the flowchart of FIG. For simplicity of explanation, the center point (x + w / 2, y + h / 2) of the black block, which is the starting point of the processing, is denoted by the coordinates (X, Y). That is, X = x + w / 2 and Y = y + h / 2.
[0067]
First, X is substituted for variable i in step S801, and i is decremented by 1 in step S802. In step S803, it is determined whether buf (i, Y) is equal to the value 255. If they are different, the process proceeds to step S804. If they are equal, the process returns to step S802 and i is further decremented.
[0068]
In step S804, the value i + 1 is stored in the variable x0, and in step S805, X is substituted for the variable i. In step S806, i is incremented by 1. In step S807, it is determined whether buf (i, Y) is equal to the value 255. If they are different, the process proceeds to step S808. If they are equal, the process returns to step S806 and i is further incremented.
[0069]
In step S808, the value i-1 is stored in the variable x1, in step S809, the value x0 is substituted into the variable i, and in step S810, c is substituted into buf (i, Y). In step S811, i is incremented by 1 and if i is larger than x1 in step S812, the process ends. If not, the process returns to step S810.
[0070]
By this processing, the x coordinates at both ends of the connecting line segment can be stored in the variables x0 and x1, and the connecting line segment can be filled with the label c.
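The connecting line segment filling of steps S801 to S812 can be sketched as follows. The sketch assumes the image is a list of rows indexed as buf[y][x] and that (X, Y) is a black pixel; it also adds image-boundary checks that the flowchart description does not mention. The function name is hypothetical.

```python
def fill_line(buf, X, Y, c):
    """Fill the horizontal run of 255-valued pixels through (X, Y) with c.

    Returns (x0, x1), the x coordinates of both ends of the filled run.
    """
    row = buf[Y]
    i = X
    while i - 1 >= 0 and row[i - 1] == 255:        # S802, S803: walk left
        i -= 1
    x0 = i                                         # S804: left end
    i = X
    while i + 1 < len(row) and row[i + 1] == 255:  # S806, S807: walk right
        i += 1
    x1 = i                                         # S808: right end
    for i in range(x0, x1 + 1):                    # S809 to S812: write label
        row[i] = c
    return x0, x1
```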
[0071]
● Filling processing from the connecting line to the bottom (upper)
The details of the process of filling the black connected portion below the connecting line segment, shown in step S602 of FIG. 13, will be described with reference to the flowchart of FIG. Here too, Y = y + h / 2, so the black connected portion below the connecting line segment connecting (x0, Y) and (x1, Y) is filled with the label c.
[0072]
First, the variable Y is incremented by 1 in step S901, and x0 is substituted for variable i in step S902. In step S903, it is determined whether buf (i, Y) is equal to the value 255. If they are different, the process proceeds to step S904, and if they are equal, the process proceeds to step S906.
[0073]
In step S904, i is incremented by a value of 1. If i is larger than x1 in step S905, the process ends. If not, the process returns to step S903.
[0074]
In step S906, i is substituted for X, and in step S907, (X, Y) is used as a starting point, and the connecting line segment on the scanning line is filled in the horizontal direction. Note that the painting process on the scanning line is shown in the flowchart of FIG. 15, and therefore detailed description thereof is omitted here.
[0075]
Note that the process for filling the black connected portion above the connecting line segment, shown in step S603 in FIG. 13, is basically the same as the process for filling in the downward direction shown in the flowchart of FIG. The only difference is that Y is decremented instead of incremented; the other steps are performed in the same way.
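Steps S601 to S603 can be combined into one sketch. It rests on several assumptions that go beyond the text: the object is convex as stated earlier, the process returns to step S901 after each line is filled, the ends x0 and x1 are updated by each line fill, and image-boundary checks are added. All function names are hypothetical, and the image is a list of rows indexed as buf[y][x].

```python
def fill_line(buf, X, Y, c):
    """Fill the horizontal run of 255-valued pixels through (X, Y); return its ends."""
    row = buf[Y]
    x0 = X
    while x0 - 1 >= 0 and row[x0 - 1] == 255:
        x0 -= 1
    x1 = X
    while x1 + 1 < len(row) and row[x1 + 1] == 255:
        x1 += 1
    for i in range(x0, x1 + 1):
        row[i] = c
    return x0, x1


def fill_vertical(buf, x0, x1, Y, c, step):
    """Fill rows below (step=+1) or above (step=-1) the segment (x0..x1, Y)."""
    Y += step                                        # S901
    while 0 <= Y < len(buf):
        # S902 to S905: look for a black pixel between x0 and x1 on this row
        seed = next((i for i in range(x0, x1 + 1) if buf[Y][i] == 255), None)
        if seed is None:
            break                                    # nothing connected: done
        x0, x1 = fill_line(buf, seed, Y, c)          # S906, S907
        Y += step


def fill_connected(buf, X, Y, c):
    """Fill the convex black region containing (X, Y) with the label c."""
    x0, x1 = fill_line(buf, X, Y, c)                 # S601
    fill_vertical(buf, x0, x1, Y, c, +1)             # S602: downward
    fill_vertical(buf, x0, x1, Y, c, -1)             # S603: upward
```

For non-convex objects a row may contain several separate runs, which this simple scheme would miss; a general flood fill would be needed in that case.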
[0076]
According to the labeling process as described above, it is possible to label only an object including a black block having a predetermined size, that is, an object having an area larger than the predetermined size. Therefore, labeling is not performed on a minute noise region, and high-speed labeling is possible.
[0077]
● Rectangle detection after labeling
As a result of the above labeling processing, in the two-dimensional array buf () for the image shown in FIG. 2, the pixel value 1 is stored as label 1 for all the pixels included in the upper rectangular object, the pixel value 2 is stored as label 2 for all the pixels included in the lower rectangular object, the pixel value 255 remains for the black pixels of the other minute noise portions, and the value 0 is stored for the white pixels of the background.
[0078]
In the third embodiment, the position (vertex coordinates) of the rectangular object is determined for each label area. That is, the rectangle detection processing described in the first and second embodiments is performed on the labeled image. Specifically, in the histogram creation flowchart shown in FIG. 4 described above, it suffices to change the value compared with buf () in step S404 from the value 255 indicating a black pixel to a label value (in this case, 1 or 2). As a result, the areas of label 1 and label 2 are each detected as rectangles.
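The change of comparison value can be sketched as follows: the same counting that would be done for 255 is done for a given label instead. The sketch assumes a list-of-rows image, and the function name is hypothetical; it returns per-row and per-column counts without asserting which of them the text calls horizontal or vertical.

```python
def label_histograms(buf, label):
    """Count pixels equal to `label` per row and per column.

    Returns (row_counts, col_counts). Passing 255 counts black pixels;
    passing 1 or 2 counts only the pixels of that labeled object.
    """
    width = len(buf[0]) if buf else 0
    row_counts = [sum(1 for v in row if v == label) for row in buf]
    col_counts = [sum(1 for row in buf if row[x] == label)
                  for x in range(width)]
    return row_counts, col_counts
```

Since noise pixels keep the value 255, they contribute to neither labeled histogram.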
[0079]
Although the case where two rectangles are included has been described here as an example, it goes without saying that the above labeling process can be performed on a binary image including an arbitrary number of rectangles and that the rectangle detection process can be performed for each label.
[0080]
As described above, according to the third embodiment, by performing an ideal labeling process on a binary image, it is possible to eliminate the influence of noise and detect a rectangular image with higher accuracy.
[0081]
<Fourth embodiment>
The fourth embodiment according to the present invention will be described below. The system configuration in the fourth embodiment is the same as that in the first embodiment described above, and a description thereof will be omitted.
[0082]
In the third embodiment described above, when searching for a black block from a binary image, the entire binary image is first divided into blocks, and then it is checked whether or not all the inside of each block is black pixels. However, other methods can be applied as the black block search method. For example, in the fourth embodiment, a method of continuously changing the upper left coordinate of a block in a binary image will be described. According to this method, although the calculation load increases, the black block detection performance can be improved.
[0083]
FIG. 17 is a diagram illustrating a state where a black block in a rectangular object is searched in the fourth embodiment. According to the figure, it can be seen that the search starting point (x, y) in the fourth embodiment is located away from the block boundary shown in FIG. 10 in the third embodiment.
[0084]
FIG. 18 shows a flowchart of black block search processing in the fourth embodiment, which will be described in detail.
[0085]
First, in steps S1101 to S1104, variables y, x, j, and i are each initialized with a value of 0. In step S1105, it is determined whether buf (x + i, y + j) is 255 indicating a black pixel. If 255, the process proceeds to step S1106. If not 255, the process proceeds to step S1110.
[0086]
In step S1106, the variable i is incremented by 1. If i is equal to w, which indicates the block width, in step S1107, the process proceeds to step S1108; otherwise, the process returns to step S1105.
[0087]
In step S1108, the variable j is incremented by 1. If j is equal to h, which indicates the block height, in step S1109, the process ends; at this time, the w × h pixel block having (x, y) as its upper left starting point has been detected as a black block. If j is different from h, the process returns to step S1104.
[0088]
In step S1110, i + 1 is added to x. Note that even if x is simply incremented by 1 here, the same result is finally obtained. However, by adding i + 1, search waste can be eliminated and the processing speed can be increased.
[0089]
If x is greater than or equal to W indicating the image width in step S1111, the process proceeds to step S1112, and if not, the process returns to step S1103.
[0090]
In step S1112, y is incremented by a value 1, and in step S1113, if y is equal to H indicating the image height, the process is terminated, and at this time, it is determined that there is no black block. If y is not equal to H, the process returns to step S1102.
[0091]
Through the above processing, a black block of w pixels × h pixels is detected from the binary image of W pixels × H pixels. Thereafter, similarly to the third embodiment described above, it is possible to perform black pixel connection region filling and rectangle detection processing.
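The search of the fourth embodiment, including the i + 1 skip of step S1110, can be sketched as follows. The sketch assumes a list-of-rows image indexed as buf[y][x] and only tests blocks that lie completely inside the image; the function name is hypothetical.

```python
def find_black_block_sliding(buf, w, h):
    """Return the upper-left (x, y) of the first all-black w x h block, or None.

    The block position moves one pixel at a time instead of on a fixed grid.
    """
    H = len(buf)
    W = len(buf[0]) if H else 0
    for y in range(H - h + 1):                 # S1112: y advances by 1
        x = 0
        while x <= W - w:
            bad = None                         # column offset of a non-black pixel
            for j in range(h):
                for i in range(w):
                    if buf[y + j][x + i] != 255:   # S1105
                        bad = i
                        break
                if bad is not None:
                    break
            if bad is None:
                return (x, y)                  # every pixel was black
            # S1110: every block starting at x+1 .. x+bad on this row would
            # still contain the non-black pixel, so skip past it
            x += bad + 1
    return None
```

The skip is safe because any block whose left edge lies between x + 1 and x + i still covers column x + i, where a non-black pixel was just found.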
[0092]
As described above, according to the fourth embodiment, a black block of a predetermined size can be detected with higher accuracy. Therefore, labeling for only valid objects can be performed with higher accuracy, and as a result, rectangular detection with higher accuracy is possible.
[0093]
In the third and fourth embodiments described above, the black block determination condition is that all the pixels in the block are black pixels. However, depending on the state of noise, for example, white pixels may be mixed into a black area. Therefore, in the present invention, the number of black pixels in the block may be counted, for example, and the block may be determined to be a black block when this number exceeds a predetermined threshold.
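The relaxed determination condition can be sketched as follows. The sketch expresses the threshold as a ratio of the block area and uses a greater-than-or-equal comparison; both the default value 0.9 and the function name are assumptions chosen for illustration.

```python
def is_black_block(buf, x, y, w, h, min_ratio=0.9):
    """Return True if at least min_ratio of the w x h block at (x, y) is black.

    The image is a list of rows indexed as buf[y][x]; black pixels are 255.
    """
    black = sum(1 for j in range(h) for i in range(w)
                if buf[y + j][x + i] == 255)
    return black >= min_ratio * w * h
```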
[0094]
Further, the black block to be searched for does not necessarily have to be a rectangle; it may be a polygon such as a hexagon, a circle, or an ellipse, and its shape may be selected according to the assumed shape of the object.
[0095]
Also, the size of the search block is not limited to one predetermined size. For example, the search may be started from a large block size, and if no object is found, the size may be gradually reduced and the search may be repeated. As a result, it is possible to avoid a risk that a smaller object than assumed is mistaken for noise and ignored.
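The repeated search with decreasing block sizes can be sketched as follows. The list of sizes and the search routine find_block are supplied by the caller; both names are hypothetical, and how the sizes are chosen is left open, as in the text.

```python
def find_block_multiscale(buf, sizes, find_block):
    """Try block sizes in the given order (largest first is assumed).

    sizes is a sequence of (w, h) pairs; find_block(buf, w, h) returns an
    (x, y) position or None. Returns ((x, y), (w, h)) for the first hit,
    or None if no size yields a black block.
    """
    for w, h in sizes:
        found = find_block(buf, w, h)
        if found is not None:
            return found, (w, h)
    return None
```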
[0096]
[Other Embodiments]
Note that the present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).
[0097]
Needless to say, the object of the present invention can also be achieved by supplying a storage medium storing software program codes that realize the functions of the above-described embodiments to a system or apparatus, and having the computer (or CPU or MPU) of the system or apparatus read and execute the program codes stored in the storage medium.
[0098]
In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiment, and the storage medium storing the program code constitutes the present invention.
[0099]
As a storage medium for supplying the program code, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile memory card, a ROM, or the like can be used.
[0100]
In addition, it goes without saying that the functions of the above-described embodiments are realized not only when the computer executes the read program code, but also when an OS (operating system) running on the computer performs part of the actual processing based on the instructions of the program code and the functions of the above-described embodiments are realized by that processing.
[0101]
Further, it goes without saying that the present invention also covers the case where, after the program code read from the storage medium is written to a memory provided in a function expansion board inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided in the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0102]
[Effect of the Invention]
As described above, according to the present invention, a rectangular region can be detected with high accuracy even in a noisy environment.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of an image processing system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a binary image to be detected in the present embodiment.
FIG. 3 is a diagram illustrating an example of a horizontal histogram of the number of black pixels in the present embodiment.
FIG. 4 is a flowchart showing histogram calculation processing in the present embodiment.
FIG. 5 is a flowchart showing a method for approximating a trapezoid of a histogram in the present embodiment.
FIG. 6 is a diagram illustrating an example of detected coordinates of a rectangular image in the present embodiment.
FIG. 7 is a diagram showing four skew feeding modes in the present embodiment.
FIG. 8 is a diagram for explaining the principle of skew mode determination in the present embodiment.
FIG. 9 is a diagram showing a state of a histogram when a plurality of photographs are arranged in the second embodiment.
FIG. 10 is a diagram for explaining a black block search method according to the third embodiment.
FIG. 11 is a flowchart showing an outline of a labeling process in the third embodiment.
FIG. 12 is a flowchart showing black block search processing in the third embodiment.
FIG. 13 is a flowchart showing a painting process in the third embodiment.
FIG. 14 is a diagram for explaining a connecting line segment filling method according to the third embodiment.
FIG. 15 is a flowchart showing a connecting line segment filling process in the third embodiment.
FIG. 16 is a flowchart showing a process of filling a region below a connecting line segment in the third embodiment.
FIG. 17 is a diagram for explaining a black block search method according to the fourth embodiment.
FIG. 18 is a flowchart showing black block search processing in the fourth embodiment.
[Explanation of symbols]
101 CPU
102 keyboard
102a mouse
103 Display
104 ROM
105 RAM
106 hard disk
107 floppy disk
108 Printer
109 Network I / F
110 Image scanner

Claims

An image processing method for detecting a rectangular area from an image,
A histogram creating step of creating a histogram of the number of black pixels in the horizontal and vertical directions in an image obtained by binarizing an original image;
A rectangular range detecting step of detecting the existence range of the rectangular area in the binary image by approximating the histogram to a trapezoid;
An inclination direction detecting step for detecting an inclination direction of the rectangular area within the existence range;
wherein, in the rectangular range detection step, on the assumption that the rectangular area is a rectangle, the accuracy of linear approximation by regression analysis is compared between the left and right hypotenuses so that the shape of the histogram forms a trapezoid whose left and right hypotenuses are symmetric, and the data of the other hypotenuse is corrected to match the data obtained from the hypotenuse with the smaller error.

In the rectangular range detection step,
Performing a trapezoidal approximation for each of the histograms in the horizontal and vertical directions of the binary image;
The image processing method according to claim 1, wherein an existence range of a rectangular area is detected for each of the binary image in a horizontal direction and a vertical direction.

The image processing method according to claim 2, wherein in the rectangular range detection step, a plurality of vertex coordinate candidates for the rectangular area are detected.

4. The image processing method according to claim 3, wherein, in the rectangular range detecting step, the approximated trapezoidal vertex coordinates are detected as a plurality of vertex coordinate candidates for the rectangular region.

5. The image processing method according to claim 4, wherein in the tilt direction detecting step, coordinates of four vertices are specified from a plurality of vertex coordinate candidates of the rectangular area.

In the tilt direction detection step,
Based on a plurality of vertex coordinate candidates for the rectangular area, assuming a plurality of inclination directions of the rectangular area,
For the rectangular area indicated by each of the plurality of inclination directions, the number of black pixels existing outside the rectangular area is counted,
6. The image processing method according to claim 5, wherein the coordinates of the four vertices are specified by specifying one inclination direction based on each counting result.

In the tilt direction detection step,
Based on a plurality of vertex coordinate candidates for the rectangular area, assuming a plurality of inclination directions of the rectangular area,
For the rectangular area indicated by each of the plurality of inclination directions, the black pixels existing inside the rectangular area are counted,
6. The image processing method according to claim 5, wherein the coordinates of the four vertices are specified by specifying one inclination direction based on each counting result.

6. The image processing method according to claim 5, further comprising a skew correction step of performing skew correction on a rectangular area determined by the coordinates of the four vertices detected in the tilt direction detection step.

9. The image processing method according to claim 8, wherein the skew correction step performs bilinear interpolation.

9. The image processing method according to claim 8, wherein the skew correction step performs interpolation by a nearest neighbor method.

The image processing method according to claim 1, wherein the trapezoidal approximation is performed by comparing the accuracy of linear approximation by regression analysis for the left and right hypotenuses constituting the trapezoid and fitting to the data obtained from the hypotenuse with the smaller error.

12. The image processing method according to claim 11, wherein the rectangular area is an area showing a photographic image.

2. The image processing method according to claim 1, further comprising a binarization step of binarizing a multi-value original image to obtain the binary image.

14. The image processing method according to claim 13, wherein the original image is an image in which one photographic image is arranged.

The image processing method according to claim 13, wherein, when the original image is an image in which a plurality of photographic images are arranged at positions separated in the main scanning direction, the binary image is divided in the main scanning direction for each photographic image in the histogram creation step, and the histogram is created for each of the divided binary images.

The image processing method according to claim 13, wherein, when the original image is an image in which a plurality of photographic images are arranged at positions separated in the sub-scanning direction, the binary image is divided in the sub-scanning direction for each photographic image in the histogram creation step, and the histogram is created for each of the divided binary images.

The image processing method according to claim 1, further comprising a labeling step of labeling the binary image, wherein in the histogram creating step, a histogram of the number of labeled pixels is created.

The labeling process includes
A black block detecting step of detecting, as a black block, a block of a predetermined size in the binary image in which the number of black pixels is a predetermined ratio or more;
18. The image processing method according to claim 17, further comprising a labeling step of replacing black pixels in the detected black block and black pixels connected to the black block with a predetermined label value.

19. The image processing method according to claim 18, wherein, in the black block detecting step, a block consisting entirely of black pixels is detected as the black block.

The black block detection step includes
A dividing step of dividing the entire binary image into blocks of the predetermined size;
The image processing method according to claim 18, further comprising: a determination step of determining whether each block is a black block.

In the black block detection step,
19. The image processing method according to claim 18, wherein black blocks are detected by sequentially shifting the block positions of the predetermined size in the binary image.

The image processing method according to claim 18, wherein, in the black block detecting step, a black block having a size smaller than the predetermined size is detected when no black block of the predetermined size is detected in the binary image.

The labeling step includes
A reference labeling step of replacing a reference line in the black block with the label value;
An upper labeling step of replacing black pixels connected above the reference line with the label value;
The image processing method according to claim 18, further comprising a lower labeling step of replacing a black pixel connected to a lower part than the reference line with the label value.

19. The image processing method according to claim 18, wherein the black block is rectangular.

19. The image processing method according to claim 18, wherein the black block is a polygon.

The image processing method according to claim 18, wherein the black block is circular.

An image processing apparatus for detecting a rectangular area from an image,
A histogram creating means for creating a histogram of the number of black pixels in the horizontal and vertical directions in an image obtained by binarizing an original image;
Rectangular range detection means for detecting the existence range of the rectangular area in the binary image by approximating the histogram to a trapezoid;
Inclination direction detecting means for detecting the inclination direction of the rectangular region within the existence range,
wherein the rectangular range detecting means, on the assumption that the rectangular area is a rectangle, compares the accuracy of linear approximation by regression analysis between the left and right hypotenuses so that the shape of the histogram forms a trapezoid whose left and right hypotenuses are symmetric, and corrects the data of the other hypotenuse to match the data obtained from the hypotenuse with the smaller error.

An image processing program for detecting a rectangular area from an image,
the program causing a computer to execute:
A histogram creating step of creating a histogram of the number of black pixels in the horizontal and vertical directions in an image obtained by binarizing an original image;
A rectangular range detecting step of detecting the existence range of the rectangular area in the binary image by approximating the histogram to a trapezoid;
An inclination direction detection step of detecting an inclination direction of the rectangular area within the existence range,
wherein, in the rectangular range detection step, on the assumption that the rectangular area is a rectangle, the accuracy of linear approximation by regression analysis is compared between the left and right hypotenuses so that the shape of the histogram forms a trapezoid whose left and right hypotenuses are symmetric, and the data of the other hypotenuse is corrected to match the data obtained from the hypotenuse with the smaller error.