JP2019114024A - Device, method and program for setting information related to scan image

Info

Publication number: JP2019114024A
Application number: JP2017246571A
Authority: JP
Inventors: 大次郎宮本; Daijiro Miyamoto
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2017-12-22
Filing date: 2017-12-22
Publication date: 2019-07-11
Anticipated expiration: 2037-12-22
Also published as: JP7030505B2

Abstract

PROBLEM TO BE SOLVED: To appropriately detect an area on which character recognition processing is to be performed, based on a position designated by a user on a screen. SOLUTION: An apparatus displays a preview screen of a scan image represented by scan image data. The apparatus extracts, from the scan image, character string areas that are each presumed to be a character string, and detects as a target area any extracted character string area that at least partially overlaps a selection area based on a first coordinate and a second coordinate designated in the preview screen. Information related to the scan image data is set using the characters obtained as a result of character recognition processing of the target area. SELECTED DRAWING: FIG. 10

Description

The present invention relates to a technique for setting information related to a scan image obtained by scanning.

Conventionally, there is a technique in which character recognition processing is performed on image data obtained by scanning a paper document (hereinafter referred to as scan image data), and the recognized characters are used as the file name of the electronic file of that paper document.

Patent Document 1 discloses that, on an operation panel displaying scan image data, a rectangular area is designated by a finger operation such as a swipe or a drag, and a file is created whose file name consists of the characters obtained by performing character recognition processing on that area. Patent Document 1 also discloses a technique of defining another area at a position shifted by a predetermined amount from the position of the area designated by the finger, and performing character recognition processing on that other area as well.

Patent Document 1: Japanese Patent Laid-Open No. 2015-215878 (JP 2015-215878 A)

When a user is made to select a character string area on a screen, the area selected by the user may not match the area of that character string in the scan image. This tendency is pronounced when the user selects a character string area on a small screen such as an operation panel. In Patent Document 1, character recognition processing is performed on an area defined at a position shifted by a predetermined amount from the position of the rectangular area designated by the user's finger. However, the shifted position does not necessarily coincide with the character string area the user wants. As a result, character recognition processing is performed repeatedly on areas shifted by the predetermined amount, and the load of character recognition processing tends to become large.

An object of the present invention is to appropriately detect an area on which character recognition processing is to be performed, based on a position designated by a user on a screen.

An apparatus according to one aspect of the present invention is an apparatus for setting information related to scan image data obtained by scanning a document, the apparatus comprising: display control means for displaying a preview screen of a scan image represented by the scan image data; extraction means for extracting character string areas, each presumed to be a character string, from the scan image; detection means for detecting, as a target area, an extracted character string area that at least partially overlaps a selection area based on a first coordinate and a second coordinate designated in the preview screen; recognition means for performing character recognition processing on the target area; and setting means for setting the information using the characters obtained as a result of the character recognition processing.
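The detection step described above can be illustrated with a minimal Python sketch. It assumes that the selection area is the axis-aligned rectangle spanned by the first and second coordinates; the text only says the selection area is "based on" those coordinates, so this is one possible reading. The names `Rect`, `selection_from_points`, and `detect_target_regions` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; (x, y) is the upper-left corner, in pixels."""
    x: int
    y: int
    width: int
    height: int

    def overlaps(self, other: "Rect") -> bool:
        # True when the two rectangles share at least part of their area.
        return (self.x < other.x + other.width
                and other.x < self.x + self.width
                and self.y < other.y + other.height
                and other.y < self.y + self.height)


def selection_from_points(p1, p2) -> Rect:
    """Assumption: the selection area is the rectangle spanned by the two points."""
    (x1, y1), (x2, y2) = p1, p2
    return Rect(min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))


def detect_target_regions(regions, p1, p2):
    """Return the extracted character string areas that overlap the selection."""
    selection = selection_from_points(p1, p2)
    return [r for r in regions if r.overlaps(selection)]


# Hypothetical character string areas and a swipe from (5, 15) to (60, 25).
regions = [Rect(10, 10, 100, 20), Rect(10, 50, 100, 20)]
targets = detect_target_regions(regions, (5, 15), (60, 25))
```

In this sketch only the first area is returned, because the swipe crosses it but stays above the second. Only the returned areas would then be passed to character recognition, so no area outside the selection needs to be recognized.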

According to the present invention, a target area on which character recognition processing is to be performed can be appropriately detected based on a position designated by a user on a screen.

FIG. 1 is an overall view of the system.
FIG. 2 is a hardware configuration diagram of the MFP.
FIG. 3 is a hardware configuration diagram of the file server.
FIG. 4 is a software configuration diagram of the MFP.
FIG. 5 is a flowchart showing a series of processes up to upload.
FIG. 6 is a diagram showing a scan setting screen of the MFP.
FIG. 7 is a diagram showing a preview screen of the MFP.
FIG. 8 is a diagram showing a preview screen of the MFP.
FIG. 9 is a diagram showing an upload setting screen of the MFP.
FIG. 10 is a flowchart showing file name generation processing.
FIG. 11 is a diagram showing an operation example on the preview screen of the MFP.
FIG. 12 is a diagram showing an operation example on the preview screen of the MFP.
FIG. 13 is a diagram showing an operation example on the preview screen of the MFP.
FIG. 14 is a diagram showing a preview screen of the MFP.
FIG. 15 is a diagram showing a preview screen of the MFP.
FIG. 16 is a diagram showing a preview screen of the MFP.
FIG. 17 is a flowchart showing file name generation processing.

Hereinafter, embodiments for carrying out the present invention will be described with reference to the drawings. The following embodiments do not limit the invention according to the claims, and not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

<<First Embodiment>>
<Overall Configuration>
FIG. 1 is a diagram showing the overall configuration of an image processing system according to the present embodiment. The image processing system includes an MFP 110 and a file server 120. The MFP 110 and the file server 120 are communicably connected to each other via a LAN (Local Area Network).

The MFP (Multi Function Printer) 110 is a multifunction peripheral having a plurality of functions, such as a scanner and a printer, and is an example of an image processing apparatus. The file server 120 is an example of an external server that stores and manages digitized document files. The image processing system of the present embodiment includes the MFP 110 and the file server 120, but is not limited to this configuration. For example, the MFP 110 may also serve as the file server 120. The connection may also be made via the Internet or the like instead of the LAN. The MFP 110 is also connected to a PSTN (Public Switched Telephone Network) and can perform facsimile communication of image data with a facsimile apparatus (not shown).

<Hardware Configuration of MFP>
FIG. 2 is a hardware configuration diagram of the MFP 110. The MFP 110 includes a control unit 210, an operation unit 220, a printer unit 221, a scanner unit 222, and a modem 223. The control unit 210 includes the following units 211 to 219 and controls the overall operation of the MFP 110. The CPU 211 reads out a control program stored in the ROM 212, and executes and controls the various functions of the MFP 110, such as reading, printing, and communication. The RAM 213 is used as the main memory of the CPU 211 and as a temporary storage area such as a work area. In the present embodiment, one CPU 211 executes each process shown in the flowcharts described later using one memory (the RAM 213 or the HDD 214), but the configuration is not limited to this. For example, a plurality of CPUs and a plurality of RAMs or HDDs may cooperate to execute each process. The HDD 214 is a large-capacity storage unit that stores image data and various programs. The operation unit I/F 215 is an interface that connects the operation unit 220 and the control unit 210. The operation unit 220 includes a touch panel, a keyboard, and the like, and accepts operations, inputs, and instructions from the user. The printer I/F 216 is an interface that connects the printer unit 221 and the control unit 210. Image data for printing is transferred from the control unit 210 to the printer unit 221 via the printer I/F 216 and printed on a recording medium. The scanner I/F 217 is an interface that connects the scanner unit 222 and the control unit 210. The scanner unit 222 reads a document set on a document table (not shown) or an ADF (Auto Document Feeder) to generate image data (scan image data), and inputs the scan image data to the control unit 210 via the scanner I/F 217. In addition to printing out (copying) the scan image data generated by the scanner unit 222 from the printer unit 221, the MFP 110 can send it as a file or by e-mail. The modem I/F 218 is an interface that connects the modem 223 and the control unit 210. The modem 223 performs facsimile communication of image data with a facsimile apparatus on the PSTN. The network I/F 219 is an interface that connects the control unit 210 (MFP 110) to the LAN. The MFP 110 uses the network I/F 219 to transmit image data and information to external apparatuses on the LAN (such as the file server 120) and to receive various kinds of information.

<Hardware Configuration of File Server>
FIG. 3 is a hardware configuration diagram of the file server 120. The file server 120 includes a CPU 311, a ROM 312, a RAM 313, an HDD 314, and a network I/F 315. The CPU 311 controls the overall operation of the file server 120 by reading out a control program stored in the ROM 312 and executing various processes. The RAM 313 is used as the main memory of the CPU 311 and as a temporary storage area such as a work area. The HDD 314 is a large-capacity storage unit that stores image data and various programs. The network I/F 315 is an interface that connects the file server 120 to the LAN. The file server 120 uses the network I/F 315 to transmit and receive various kinds of information to and from other apparatuses on the LAN (such as the MFP 110).

<Software Configuration of MFP>
FIG. 4 is a software configuration diagram of the MFP 110. The software of the MFP 110 is roughly divided into two parts: a native function module 410 and an additional function module 420. The units included in the native function module 410 are provided in the MFP 110 as standard, whereas the additional function module 420 is an application additionally installed on the MFP 110. The additional function module 420 is a Java (registered trademark) based application, which makes it easy to add functions to the MFP 110. Other additional function modules (additional applications), not shown, may also be installed on the MFP 110.

The native function module 410 includes a scan execution unit 411 and an image data storage unit 412. The additional function module 420 includes a scan instruction unit 421, a metadata generation unit 422, an image analysis unit 423, an upload instruction unit 424, a file generation unit 425, and a display control unit 426.

The display control unit 426 displays, on the liquid crystal display unit with a touch panel function of the operation unit 220 of the MFP 110, a UI screen for accepting the user's scan settings as well as scan start operations, inputs, and instructions (for example, FIG. 6; details are described later). The scan instruction unit 421 requests the scan execution unit 411 to perform scan processing, together with the scan settings corresponding to the user instructions input via the UI screen.

The scan execution unit 411 receives a scan request, including the scan settings, from the scan instruction unit 421. In accordance with the scan request, the scan execution unit 411 generates scan image data by having the scanner unit 222 read the image on the document via the scanner I/F 217. The generated scan image data is sent to the image data storage unit 412. The scan execution unit 411 sends a scan image identifier, which uniquely identifies the stored scan image data, to the scan instruction unit 421. The image data storage unit 412 stores the scan image data received from the scan execution unit 411 in the HDD 214.

The scan instruction unit 421 acquires, from the image data storage unit 412, the scan image data corresponding to the scan image identifier received from the scan execution unit 411. The scan instruction unit 421 then requests the metadata generation unit 422 to generate a file name for the acquired scan image data.
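The exchange of scan image identifiers between the scan execution unit, the image data storage unit, and the scan instruction unit can be pictured with the following Python sketch. It is an in-memory stand-in only: the specification stores data on the HDD 214 and does not state how identifiers are generated, so the use of UUIDs and the class name `ImageDataStore` are assumptions made for illustration.

```python
import uuid


class ImageDataStore:
    """Hypothetical stand-in for the image data storage unit 412."""

    def __init__(self):
        self._images = {}

    def save(self, scan_image_data: bytes) -> str:
        # Assumption: a random UUID serves as the unique scan image identifier.
        image_id = uuid.uuid4().hex
        self._images[image_id] = scan_image_data
        return image_id

    def load(self, image_id: str) -> bytes:
        # Raises KeyError if the identifier is unknown.
        return self._images[image_id]


store = ImageDataStore()
# The scan execution side saves the data and passes on only the identifier.
image_id = store.save(b"dummy scan image bytes")
# The scan instruction side later retrieves the data by that identifier.
data = store.load(image_id)
```

Passing an identifier rather than the image itself keeps the messages between modules small, which is one plausible reason for the design described above.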

The metadata generation unit 422 sends a UI screen display instruction to the display control unit 426. Based on this display instruction, the display control unit 426 displays, on the liquid crystal display unit with a touch panel function of the operation unit 220 of the MFP 110, a UI screen for accepting operations, inputs, and instructions from the user for generating a file name (for example, FIG. 7(a); details are described later). The metadata generation unit 422 sends an instruction to the display control unit 426 to display a preview image of the received scan image data on the UI screen, and instructs the image analysis unit 423 to analyze the scan image data.

Based on the instruction from the metadata generation unit 422, the image analysis unit 423 performs layout analysis processing and character recognition processing on the scan image data. The image analysis unit 423 returns the processing results to the metadata generation unit 422.

The metadata generation unit 422 generates metadata based on the user instructions and the analysis results. Metadata is information related to the scan image data. One example of such information is the file name given to the scan image data. In the following, the present embodiment is described using the case where the metadata is a file name. The metadata generation unit 422 generates a file name based on the user instructions and the analysis results. The metadata generation unit 422 sends the scan image identifier and the generated file name to the upload instruction unit 424, and instructs it to upload the scan image data to the file server 120.

The upload instruction unit 424 sends a UI screen display instruction to the display control unit 426. Based on this display instruction, the display control unit 426 displays a UI screen on the liquid crystal display unit with a touch panel function of the operation unit 220 of the MFP 110. This UI screen is a screen for accepting folder path settings and upload operations, inputs, and instructions from the user (for example, FIG. 9; details are described later).

On receiving an upload instruction from the user, the upload instruction unit 424 instructs the file generation unit 425 to generate a file from the scan image data indicated by the scan image identifier.

The file generation unit 425 acquires the scan image data corresponding to the designated scan image identifier from the image data storage unit 412, and generates a file to be transmitted to the file server 120.

The upload instruction unit 424 connects to the file server 120 and transmits the file, using the folder path that was set, the file generated by the file generation unit 425, and the file name generated by the metadata generation unit 422.

The upload instruction unit 424 has an SMB (Server Message Block) client function. This allows it to perform file and folder operations, using SMB, on the file server 120, which has an SMB server function. Besides SMB, WebDAV (Distributed Authoring and Versioning protocol for the WWW) may be used. FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol), or the like may also be used. SOAP (Simple Object Access Protocol), REST (Representational State Transfer), or the like, which are not intended specifically for file transmission, may also be used.

<Flowchart of Overall Processing>
FIG. 5 is a flowchart showing the overall flow of control from generation of scan image data to upload. This series of processes is realized by the CPU 211 in the control unit 210 executing a control program stored in the HDD 214. The details are described below.

In step 501, the scan instruction unit 421 instructs the display control unit 426 to display a scan setting screen. The display control unit 426 displays, on the operation unit 220, the scan setting screen for making various settings for scan processing.

FIG. 6 is a diagram showing an example of the scan setting screen 600. The scan setting screen 600 of FIG. 6 has five setting buttons 601 to 605. The [Color setting] button 601 is used to set color or monochrome for scanning a document. The [Resolution setting] button 602 is used to set the resolution for scanning a document. The [Duplex reading setting] button 603 is used when both sides of a document are to be scanned. The [Mixed document setting] button 604 is used when documents of different sizes are to be scanned together. The [Image format setting] button 605 is used to designate the storage format of the scan image data. When a setting is made with the setting buttons 601 to 605, the candidates (options) that can be set within the range supported by the MFP 110 are displayed, and the user selects the desired one from the displayed candidates. The setting buttons described above are merely examples; not all of these setting items need to be present, and other setting items may be present. The user makes detailed settings for scan processing via the scan setting screen 600. The [Cancel] button 620 is used to cancel the scan settings. The [Scan start] button 621 is used to instruct the start of scan processing for a document set on the document table or the like.

In step 502, the scan instruction unit 421 determines whether the [Scan start] button 621 or the [Cancel] button 620 has been pressed. If it determines that the [Scan start] button 621 has been pressed, the scan instruction unit 421 causes the scan execution unit 411 to execute scan processing with the settings selected via the scan setting buttons 601 to 605. If it determines that the [Cancel] button 620 has been pressed, the processing ends.

In step 503, the scan execution unit 411 issues a scan instruction to the scanner unit 222, and the document is scanned. The scan image data generated by the scan is stored in the image data storage unit 412, and the corresponding scan image identifier is notified to the scan instruction unit 421.

In step 504, the scan instruction unit 421 acquires the scan image data corresponding to the scan image identifier from the image data storage unit 412.

In step 505, the metadata generation unit 422 sends the image analysis unit 423 an instruction to perform layout analysis on the scan image data acquired from the image data storage unit 412. The image analysis unit 423 performs layout analysis on the scan image data. For example, it analyzes the layout of the scan image, such as character string areas and graphic areas, by extracting a histogram of the scan image or by extracting clusters of pixels. A character string area is an area (image area) presumed to be a character string. A character string area may also be an area containing a single character. The layout analysis processing may include processing that makes layout analysis easier, such as correcting the skew of the scan image or detecting its orientation and rotating it. The image analysis unit 423 passes information on the character string areas found by layout analysis (hereinafter referred to as character string area information) to the metadata generation unit 422.
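The specification mentions only "extracting a histogram" or "clusters of pixels" and does not give the layout analysis algorithm. As one hedged illustration, the Python sketch below uses a horizontal projection profile, a common histogram-based way to find rows of text in a binarized image. The function name `find_text_rows`, the `min_ink` threshold, and the toy bitmap are all hypothetical.

```python
def find_text_rows(bitmap, min_ink=1):
    """Return (top, height) bands of consecutive rows that contain ink.

    bitmap is a list of rows, each a list of 0 (background) or 1 (ink).
    The per-row ink count is the horizontal projection histogram.
    """
    bands = []
    start = None
    for y, row in enumerate(bitmap):
        ink = sum(row)
        if ink >= min_ink:
            if start is None:
                start = y  # a new band of text rows begins here
        elif start is not None:
            bands.append((start, y - start))  # the band ended on the previous row
            start = None
    if start is not None:
        # The last band runs to the bottom edge of the image.
        bands.append((start, len(bitmap) - start))
    return bands


# Toy 5-row binarized image: ink in rows 1-2 and in row 4.
bitmap = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
bands = find_text_rows(bitmap)
```

A full layout analysis would repeat the same idea column-wise inside each band to obtain the left edge and width of each character string area, and would handle skew correction first, as the text notes.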

Table 1 shows an example of the character string area information obtained by layout analysis.

In Table 1 above, [Number] is a number that uniquely identifies each character string area that was found. In this example, serial numbers from 1 to 11 are assigned in the order of recognition. [X coordinate of area] indicates the X coordinate of the upper-left corner of each character string area. [Y coordinate of area] indicates the Y coordinate of the upper-left corner of each character string area. Hereinafter, unless otherwise noted, the "coordinates" of a character string area mean the position coordinates of its upper-left corner. [Width of area] indicates the distance from the left side to the right side of each character string area. [Height of area] indicates the distance from the top side to the bottom side of each character string area. In the present embodiment, [X coordinate of area], [Y coordinate of area], [Width of area], and [Height of area] are all expressed in pixels, but they may be expressed in points, inches, or the like. The information on each character string area extracted from the scan image is passed to the metadata generation unit 422 as image analysis data. The image analysis data is in, for example, CSV or XML format, but may be in another format. The data may also be stored temporarily in the HDD 214 and passed at a predetermined timing.
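As a minimal sketch of how such character string area information might be exchanged in CSV form, the Python code below parses one record per area into a small data class. The column names (`number`, `x`, `y`, `width`, `height`) and the sample values are assumptions for illustration; the specification names the fields but does not define the CSV layout, and the sample rows are not the values of Table 1.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class TextRegion:
    """One character string area; (x, y) is its upper-left corner, in pixels."""
    number: int
    x: int
    y: int
    width: int
    height: int


def parse_regions(csv_text: str):
    """Parse image analysis data in a hypothetical CSV layout with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        TextRegion(
            number=int(row["number"]),
            x=int(row["x"]),
            y=int(row["y"]),
            width=int(row["width"]),
            height=int(row["height"]),
        )
        for row in reader
    ]


# Made-up sample data, not taken from Table 1.
sample = "number,x,y,width,height\n1,55,5,20,10\n2,5,20,30,5\n"
parsed = parse_regions(sample)
```

Keeping the upper-left corner plus width and height, as the text describes, makes the right and bottom edges a simple addition when the areas are later compared with a user selection.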

In step 506, the metadata generation unit 422 instructs the display control unit 426 to display a preview image. The display control unit 426 displays a preview screen on the touch panel of the operation unit 220, using the scan image data received from the scan instruction unit 421. The user can set the file name of the scan image data via the preview screen.

FIG. 7(a) is a diagram showing an example of the preview screen 700. In the preview screen, the scan image represented by the loaded scan image data is displayed in a preview area 710 at the center of the screen. A plurality of buttons 711 to 714 for changing the display state are also displayed in the preview area 710 together with the scan image. The buttons 711 and 712 appear when the scan image cannot be displayed in its entirety, and are used to scroll the display area vertically. The touch panel of the MFP 110 is usually not very large. Therefore, for example, when the scan image is obtained by reading an A4 portrait document with horizontal writing, the initial setting is a reduced, top-aligned display in which the entire width (short side) of the scan image just fits in the preview area 710. That is, in the initial setting, the lower part of the A4 portrait scan image is not displayed in the preview area 710. In this case, pressing the "↓" button 712 scrolls the display area downward so that the lower part can be displayed. If the scan image is, for example, A4 landscape or A3, buttons for scrolling the display area horizontally may also be provided. The buttons 713 and 714 are used to enlarge and reduce the display area: pressing the "+" button 713 zooms in, and pressing the "-" button 714 zooms out. These button operations may instead be performed by finger operations on the preview screen, such as swipe, pinch-out, and pinch-in.
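Because the preview shows a reduced, scrollable view of the scan image, a position touched on the preview generally has to be related back to scan image pixels. The Python sketch below shows one way this could be computed for the fit-to-width, top-aligned initial display described above. The function names and the sample pixel dimensions are hypothetical, and the conversion itself is an assumption; this passage of the specification does not spell it out.

```python
def initial_preview_scale(image_width: int, preview_width: int) -> float:
    """Scale at which the full image width just fits the preview area."""
    return preview_width / image_width


def preview_to_image(px, py, scale, scroll_x=0, scroll_y=0):
    """Map a point in the preview area to scan image pixel coordinates.

    scroll_x and scroll_y are the current scroll offsets, measured in
    preview pixels from the upper-left corner of the scaled image.
    """
    return (round((px + scroll_x) / scale), round((py + scroll_y) / scale))


# Hypothetical sizes: a 2480-pixel-wide scan shown in a 620-pixel-wide preview.
scale = initial_preview_scale(2480, 620)
# A touch at (100, 50) after scrolling down by 200 preview pixels.
point = preview_to_image(100, 50, scale, scroll_y=200)
```

Zooming with the "+" and "-" buttons would simply change `scale`, so the same mapping applies at any magnification.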

ファイル名入力欄７０１は、スキャン画像に対するファイル名を表示する。初期状態ではスキャンした時の日時を示す文字列などが設定される。プレビュー領域７１０上の文字列領域をユーザが指でなぞる操作（スワイプ操作またはフリック操作）を行うと、なぞった領域に対応する文字列が、ファイル名入力欄７０１に入力される。詳細な処理については、後述する。［戻る］ボタン７２０は、プレビュー表示を中止する場合に用いるボタンである。［次へ］ボタン７２１は、読み込まれたスキャン画像データのアップロード先を設定する画面に移行するためのボタンである。また、ボタン７０２はファイル名のフォーマット等を設定するためのボタンである。なお、上述した各種ボタンの種類、各文字列領域の表示や選択の態様は一例にすぎず、これに限定されない。例えば、ファイル名入力欄７０１に表示された文字列を修正・変更したり、ファイル名を確定したりするためのボタンがあってもよい。 A file name input field 701 displays the file name for the scan image. In the initial state, a character string indicating the date and time of scanning is set. When the user performs an operation (swipe operation or flick operation) to trace the character string area on the preview area 710 with a finger, a character string corresponding to the traced area is input to the file name input field 701. Detailed processing will be described later. A [Back] button 720 is a button used to cancel the preview display. [Next] button 721 is a button for shifting to a screen for setting an upload destination of the read scan image data. A button 702 is a button for setting the format of the file name and the like. Note that the types of various buttons described above and the manner of displaying and selecting each character string area are merely an example, and the present invention is not limited thereto. For example, there may be a button for correcting or changing the character string displayed in the file name input field 701 or determining the file name.

FIG. 7B shows the preview area 710 of FIG. 7A overlaid with the coordinates and numbers of the character string areas listed in Table 1. The position of each character string area is indicated by a dotted rectangle, and its number is displayed next to it. This embodiment is described mainly using the form shown in FIG. 7A, in which the character string areas are not made explicit on the preview screen of the touch panel; however, a form that displays the character string areas, as shown in FIG. 7B, may also be adopted.

ステップ５０７においてメタデータ生成部４２２は、ユーザからの入力指示に基づいてスキャン画像に対するファイル名を生成する。ファイル名の生成処理の詳細については後述する。 In step 507, the metadata generation unit 422 generates a file name for the scan image based on the input instruction from the user. Details of the file name generation process will be described later.

FIG. 8 shows the state of the preview screen 700 after the file name has been generated in step S507. In this example, the character strings corresponding to "見積書" (quotation), "東京株式会社" (Tokyo Co., Ltd.), and "２０１７年０４月１４日" (April 14, 2017) were selected in sequence, so the character string "見積書＿東京株式会社＿２０１７年０４月１４日" is displayed (set) in the file name input field 701. In the preview area 710, rectangles 801, 802, and 803 indicate the character strings that the user traced and that were used for the file name. When the desired file name has been generated and the user presses the [Next] button 721, the process proceeds to step 508.

ステップ５０８においてメタデータ生成部４２２は、［次へ］ボタン７２１が押されたか［戻る］ボタン７２０が押されたかを判定する。［次へ］ボタン７２１が押されたと判定すると、ステップ５０９へ進み、［戻る］ボタン７２０が押されたと判定するとステップ５０１へ戻る。 In step 508, the metadata generation unit 422 determines whether the “next” button 721 is pressed or the “back” button 720 is pressed. If it is determined that the "Next" button 721 is pressed, the process proceeds to step 509. If it is determined that the "Back" button 720 is pressed, the process returns to step 501.

ステップ５０９においてメタデータ生成部４２２は、ファイル名入力欄７０１に設定されたファイル名を取得する。メタデータ生成部４２２は、取得したファイル名とスキャン画像識別子とをアップロード指示部４２４へ渡す。 In step 509, the metadata generation unit 422 acquires the file name set in the file name input field 701. The metadata generation unit 422 passes the acquired file name and scan image identifier to the upload instruction unit 424.

ステップ５１０においてアップロード指示部４２４は、表示制御部４２６にアップロード設定画面の表示を指示する。表示制御部４２６は、操作部２２０のタッチパネル上にアップロード設定画面を表示する。ユーザは、アップロード設定画面を介して、ファイルサーバ１２０へのアップロードに関する詳細設定を行う。 In step 510, the upload instruction unit 424 instructs the display control unit 426 to display the upload setting screen. The display control unit 426 displays an upload setting screen on the touch panel of the operation unit 220. The user performs detailed settings for uploading to the file server 120 via the upload setting screen.

FIG. 9 shows an example of the upload setting screen 900. The user enters, in the [Folder Path] input field 901, the folder path used for external transfer to the file server 120. In the example of FIG. 9, "\\Server1\Share\ScanData" is entered as the folder path. The folder path can also be entered by displaying an address book reference screen (not shown) from the [Address Book] button 902 and having the user select an address from the address book data stored in the HDD 214 of the MFP 110. In addition to the folder path of the file server 120, the address book stores the user name and password used for access. The user name and password are used when uploading a file to the file server 120. The [File Name] label 903 displays the name of the file to be stored on the file server so that the user can easily recognize it. The [Back] button 920 is used to cancel the detailed upload settings. The [Upload] button 921 instructs uploading to the folder path set in the [Folder Path] input field 901.

ステップ５１１においてアップロード指示部４２４は、［アップロード］ボタン９２１が押されたか［戻る］ボタン９２０が押されたかを判定する。［アップロード］ボタン９２１が押されたと判定すると、ステップ５１２へ進み、［戻る］ボタン９２０が押されたと判定すると、ステップ５０６へ戻る。 In step 511, the upload instruction unit 424 determines whether the “upload” button 921 is pressed or the “return” button 920 is pressed. If it is determined that the "upload" button 921 is pressed, the process proceeds to step 512, and if it is determined that the "back" button 920 is pressed, the process returns to step 506.

ステップ５１２においてアップロード指示部４２４は、スキャン画像識別子に対応するスキャン画像データから、アップロードするファイルを生成する。ステップＳ５１３においてアップロード指示部４２４は、ステップ５１２で生成したファイルを、ステップＳ５０９で取得したファイル名で、ステップＳ５１０で設定されたファイルサーバのフォルダへアップロードする。 In step 512, the upload instruction unit 424 generates a file to be uploaded from the scan image data corresponding to the scan image identifier. In step S513, the upload instruction unit 424 uploads the file generated in step 512 to the folder of the file server set in step S510 with the file name acquired in step S509.

The above is the operation control from generation of the scan image data to upload according to this embodiment. This embodiment assumes that the processing in steps 505 to 507 is performed on one page of image data generated by scanning. As a variation, a button for running image analysis on the next page may be provided in the preview screen 700, and the next page obtained by that analysis may be previewed so that character strings forming the file name can also be set from character string areas on the second and subsequent pages.

<File Name Generation Process>
Next, the file name generation process (step S507) in this embodiment is described in detail. As described above, the character string areas in the entire scan image have already been extracted in step S505 by the layout analysis of the image analysis unit 423. A character string area is an area (image area) presumed to contain a character string. The characters (text data) contained in a character string area (image area) are extracted by subsequently performing character recognition processing (OCR: Optical Character Recognition) on that area. Character recognition processing recognizes characters (text data) by, for example, matching the pixel group contained in a character string area against a dictionary registered in advance. This processing can be time-consuming. For this reason, this embodiment does not run character recognition on every character string area extracted by the layout analysis one after another; instead, it runs character recognition only on the character string area the user wants, which speeds up processing.

In this embodiment, the character string area desired by the user is determined when the user traces over the desired character string in the preview area 710. Specifically, character recognition processing is performed on any character string area that at least partially overlaps the area traced by the user in the preview area 710. With this processing, character recognition is performed on the character string area the user wants even when the traced area does not fully contain the extracted character string area. The process is described below with reference to a flowchart.

FIG. 10 is a flowchart of the file name generation process performed by the metadata generation unit 422. In step 1001, the metadata generation unit 422 determines whether the [Next] button 721 or the [Back] button 720 has been pressed on the preview screen 700. If either button has been pressed, the file name generation process ends and processing returns to the flowchart of FIG. 5. If neither button has been pressed, the process proceeds to step 1002.

ステップ１００２においてメタデータ生成部４２２は、ユーザがプレビュー領域７１０をタッチしたか否かを判定する。ユーザによってタッチされたと判定すると、ステップ１００３へ進む。タッチされたと判定されない場合、ステップＳ１００１に戻る。 In step 1002, the metadata generation unit 422 determines whether the user has touched the preview area 710. If it is determined that the user has touched, the process proceeds to step 1003. If it is not determined that the touch has been made, the process returns to step S1001.

ステップ１００３においてメタデータ生成部４２２は、プレビュー領域７１０においてユーザがタッチした座標を取得し、第１の操作座標として保持する。 In step 1003, the metadata generation unit 422 acquires the coordinates touched by the user in the preview area 710 and holds the coordinates as the first operation coordinates.

ステップ１００４においてメタデータ生成部４２２は、ユーザがタッチしている座標が変化したか、すなわち、ユーザが、タッチした指を移動したか否かを判定する。ユーザがタッチしている座標が変化していた場合、ステップ１００５に進む。タッチしている座標が変化していない場合、ステップＳ１００６に進む。 In step 1004, the metadata generation unit 422 determines whether the coordinates touched by the user have changed, that is, the user has moved the touched finger. If the coordinates touched by the user have changed, the process proceeds to step 1005. If the coordinates being touched have not changed, the process proceeds to step S1006.

In step S1005, the metadata generation unit 422 causes the display control unit 426 to display the area from the first operation coordinate held in step S1003 to the currently touched coordinate as the selection area, superimposed on the preview area 710. FIG. 11 shows an example of the touch panel of the operation unit 220 displaying the selection area in the preview area 710. In the example of FIG. 11, when the finger moves (swipes) from the first operation coordinate 1101, where the touch started, to position 1102, a rectangular area whose opposite vertices are the first operation coordinate 1101 and the current touch coordinate 1102 is displayed. In this example, the finger is not lifted from the preview area 710 during the movement from the first operation coordinate 1101 to position 1102. If the finger moves to another position (not shown), the rectangular area extending to that position is displayed. In other words, the displayed rectangular area (that is, the selection area) changes as the finger moves. Although FIG. 11 shows the selection area as a rectangular area, a straight line connecting the first operation coordinate 1101 and the current touch coordinate 1102 may be displayed instead.

ステップＳ１００６においてメタデータ生成部４２２は、ユーザが、タッチしている指を離したか否かを判定する。指が離れたと判定するとステップＳ１００７に進み、指を離していないと判定すると、ステップＳ１００４に戻り指を離すまでステップＳ１００４〜１００６の処理を繰り返す。 In step S1006, the metadata generation unit 422 determines whether the user has released the touching finger. If it is determined that the finger has been released, the process proceeds to step S1007. If it is determined that the finger is not released, the process returns to step S1004 and repeats the processing of steps S1004 to S1006 until the finger is released.

ステップＳ１００７においてメタデータ生成部４２２は、指が離れた座標を第２の操作座標として保持する。すなわち、タッチされた状態が解消された時点の座標を第２の操作座標として保持する。ステップＳ１００７の処理が終わった時点で、第１の操作座標と第２の操作座標とが確定した状態になる。すなわち、選択領域が確定された状態となる。 In step S1007, the metadata generation unit 422 holds the coordinates at which the finger is separated as the second operation coordinates. That is, the coordinates at the time when the touched state is canceled are held as the second operation coordinates. When the process of step S1007 ends, the first operation coordinates and the second operation coordinates are determined. That is, the selected area is determined.
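The touch handling in steps S1003 to S1007 can be sketched as a small state holder. This is a minimal illustration, not the embodiment's implementation: `Rect`, `selection_rect`, and `SwipeTracker` are hypothetical names, and coordinates are assumed to be in preview-area pixels with Y increasing downward.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def selection_rect(first, second):
    """Rectangle whose opposite vertices are the two operation coordinates."""
    (x1, y1), (x2, y2) = first, second
    return Rect(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


class SwipeTracker:
    """Follows one touch-down, move, release sequence."""

    def __init__(self):
        self.first = None    # first operation coordinate (S1003)
        self.second = None   # second operation coordinate (S1007)

    def touch_down(self, x, y):
        self.first = (x, y)
        self.second = None

    def touch_move(self, x, y):
        # S1005: selection area to draw while the finger is still down
        return selection_rect(self.first, (x, y))

    def touch_up(self, x, y):
        # S1007: the release point fixes the selection area
        self.second = (x, y)
        return selection_rect(self.first, self.second)
```

Normalizing with `min` and `max` means the same rectangle results whether the user swipes left to right or right to left.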

In step S1008, the metadata generation unit 422 detects the character string areas that overlap the selection area determined by the processing up to step S1007. As described above, the selection area may be the rectangular area whose opposite vertices are the first and second operation coordinates, or the straight line connecting the first and second operation coordinates. Alternatively, every coordinate recorded as the touch position changes may be stored, and all the stored points may be used as the selection area. In step S1008, a character string area is detected as overlapping the selection area if the two overlap even partially.

FIG. 12A illustrates the details of step S1008. It shows the state in which the user has traced the preview area 710 on the touch panel from the first operation coordinate 1201 to the second operation coordinate 1202. The selection area 1210 is the rectangular area whose opposite vertices are the first operation coordinate 1201 and the second operation coordinate 1202. Character string areas overlapping the selection area are detected by checking, for each character string area, whether any part of it overlaps the selection area, and collecting every character string area for which this is true. In the example of FIG. 12A, of character string areas 1 to 10, character string areas 1, 2, and 3 partially overlap the selection area, so these are detected as the character string areas overlapping the selection area.
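The overlap check of step S1008 amounts to an axis-aligned rectangle intersection test. The sketch below assumes a rectangular selection area; `Rect`, `overlaps`, and `find_overlapping` are hypothetical names, and the coordinates are made up rather than taken from Table 1.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a, b):
    """True when the two rectangles share any interior area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def find_overlapping(selection, regions):
    """Return the numbers of all character string areas touching the selection."""
    return [num for num, rect in regions.items() if overlaps(selection, rect)]


# Assumed layout: three areas on one row and a fourth on a lower row.
regions = {
    1: Rect(10, 10, 40, 30),
    2: Rect(45, 10, 75, 30),
    3: Rect(80, 10, 110, 30),
    4: Rect(10, 50, 200, 70),
}
hits = find_overlapping(Rect(30, 15, 90, 25), regions)
```

Because the comparison is strict, rectangles that only share an edge are not counted. A selection given as a straight line would need a non-strict or segment-based test instead.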

ステップＳ１００９においてメタデータ生成部４２２は、ステップＳ１００８の検出処理の結果、選択領域と重なる文字列領域があるか否かを判定する。重なる文字列領域があった場合、ステップＳ１０１０に進む。重なる文字列領域がなかった場合、ステップＳ１０１１に進む。 In step S1009, the metadata generation unit 422 determines whether there is a character string area overlapping with the selected area as a result of the detection processing in step S1008. If there is an overlapping character string area, the process proceeds to step S1010. If there is no overlapping character string area, the process advances to step S1011.

If there is an overlapping character string area, then in step S1010 the metadata generation unit 422 uses the selection area and the character string areas overlapping it to detect the character recognition target area, that is, the area on which character recognition is performed. In this embodiment, the character recognition target area is the circumscribed rectangle that contains the selection area and the character string areas overlapping it. For example, the character recognition target area for the operation of FIG. 12A is area 1220 shown in FIG. 12B.
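The circumscribed rectangle of step S1010 is the smallest rectangle that contains the selection area and every overlapping character string area. A minimal sketch, with hypothetical names (`Rect`, `recognition_target`) and made-up coordinates:

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def recognition_target(selection, overlapping):
    """Circumscribed rectangle of the selection and the overlapping areas."""
    rects = [selection, *overlapping]
    return Rect(
        min(r.left for r in rects),
        min(r.top for r in rects),
        max(r.right for r in rects),
        max(r.bottom for r in rects),
    )


# The selection touches the first and third areas only; the gap between
# them is still covered because the result spans both.
target = recognition_target(
    Rect(30, 15, 90, 25),
    [Rect(10, 10, 40, 30), Rect(80, 10, 110, 30)],
)
```

Since the result spans everything between the outermost areas, a character whose own area was missed by the layout analysis but lies between two detected areas still falls inside the target.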

The reason for using the selection area together with the character string areas overlapping it as the character recognition target area is as follows. The area the user traces with a finger does not always fully contain the character string area. If only the traced area were used as the character recognition target area, character recognition would be performed on an image area in which parts of characters are cut off, and an appropriate recognition result could not be obtained. In this embodiment, layout analysis is performed beforehand and the character string areas have already been extracted. Therefore, when the traced area overlaps a character string area even slightly, the character recognition target area is extended beyond the selection area to cover the whole overlapping character string area. This makes it easier to obtain the character recognition result the user wants.

FIG. 13A shows an example in which the second operation coordinate 1302 stops short of enclosing the character "書" in the horizontal direction, yet the selection still overlaps part of character string area 3. Even when such a selection area 1310 is selected, the character recognition target area is area 1320 shown in FIG. 13B. In other words, even when the selection area in the preview area is not drawn so as to enclose a given character, a character recognition target area that includes that character is detected as long as the selection area partially overlaps a detected character string area.

Another example is described with reference to FIG. 14. As a result of the layout analysis, part of a document may fail to be recognized as a character string area even though characters are present there. This can be caused, for example, by reduced analysis accuracy due to faint or blurred characters. FIG. 14A shows a case in which the character string area for "積" was not extracted for some reason. In this embodiment, the circumscribed rectangle containing the selection area and the character string areas overlapping it is detected as the character recognition target area. Accordingly, the circumscribed rectangle containing the selection area 1410 of FIG. 14 and character string areas 1 and 3, which overlap the selection area 1410, is detected as the character recognition target area. That is, area 1420 in FIG. 14B is detected as the character recognition target area.

The character string areas overlapping the selection area may span two lines. In that case, the area containing the selection area and both lines of character string areas overlapping it can be detected as the character recognition target area. Alternatively, based on the layout result, the line that has the largest overlap with the selection area in the second direction (Y direction), which intersects the first direction (X direction) along which the characters are arranged, may be detected as the character recognition target area.
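The second option, choosing the line with the larger overlap in the Y direction, can be sketched as follows. The names `Rect`, `vertical_overlap`, and `pick_line` are hypothetical, and the tie-breaking behavior (first line wins) is an assumption the text does not specify.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def vertical_overlap(a, b):
    """Length of the shared Y extent of two rectangles (0 if none)."""
    return max(0, min(a.bottom, b.bottom) - max(a.top, b.top))


def pick_line(selection, lines):
    """Return the line whose Y overlap with the selection is largest."""
    return max(lines, key=lambda line: vertical_overlap(selection, line))


# The selection dips mostly into the second line, so that line is chosen.
upper = Rect(0, 10, 120, 22)
lower = Rect(0, 24, 120, 36)
chosen = pick_line(Rect(10, 18, 100, 34), [upper, lower])
```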

Returning to the flowchart of FIG. 10: if no character string area overlaps the selection area, then in step S1011 the metadata generation unit 422 detects the character recognition target area from the selection area alone. The selection area can be used as the character recognition target area as is, but if its height (its extent in the second direction) is insufficient, the character recognition target area contains only part of the character image and the characters cannot be detected. Therefore, when a character string area exists to the left or right of the selection area (in the first direction), the Y coordinate and height of the selection area can be corrected to match the Y coordinate and height of that character string area, and the corrected selection area can be detected as the character recognition target area.
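One way to picture the Y-coordinate correction in step S1011 is to snap the selection's top and bottom to a character string area that lies on the same row. This is a sketch under assumptions: the text does not say how the neighboring area is chosen, so picking the horizontally nearest area that shares some Y extent is my own simplification, and `Rect` and `snap_to_neighbor` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def snap_to_neighbor(selection, regions):
    """Adopt the top and bottom of the nearest area on the same row."""

    def shares_row(r):
        return min(selection.bottom, r.bottom) - max(selection.top, r.top) > 0

    def horizontal_gap(r):
        return max(r.left - selection.right, selection.left - r.right, 0)

    neighbors = [r for r in regions if shares_row(r)]
    if not neighbors:
        return selection  # nothing to align with; keep the selection as is
    nearest = min(neighbors, key=horizontal_gap)
    return selection._replace(top=nearest.top, bottom=nearest.bottom)


# A thin selection to the right of an area on the same row grows to its height.
fixed = snap_to_neighbor(
    Rect(50, 14, 90, 18),
    [Rect(10, 10, 40, 30), Rect(10, 50, 40, 70)],
)
```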

It is also possible that no character string area overlaps the selection area because the traced position was off. For example, when tracing with a finger, the coordinates the user intends to designate may not match the coordinates detected on the screen. Therefore, when step S1009 determines that no character string area overlaps the selection area, the selection area may be reset by shifting it up and down by a predetermined amount (a few points), and detection of overlapping character string areas may be performed again. If an overlapping character string area is found by this second detection, processing may be added that detects, as the character recognition target area, the area containing the shifted selection area and the character string areas overlapping it. As a result of step S1010 or step S1011 described above, the character recognition target area is identified.
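The retry with a shifted selection area can be sketched as below. The shift amount of 5 and the order of attempts (original, up, down) are assumptions, as the text only says "a predetermined amount (a few points)"; `Rect`, `overlaps`, and `detect_with_retry` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a, b):
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def detect_with_retry(selection, regions, shift=5):
    """Try the selection as traced, then shifted up, then shifted down."""
    for dy in (0, -shift, shift):
        moved = selection._replace(top=selection.top + dy,
                                   bottom=selection.bottom + dy)
        hits = [r for r in regions if overlaps(moved, r)]
        if hits:
            return moved, hits
    return selection, []  # still nothing: fall back to the original selection


# The trace landed just below the text row; shifting up finds it.
row = Rect(10, 10, 100, 30)
moved, hits = detect_with_retry(Rect(20, 32, 80, 36), [row])
```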

In step S1012, the metadata generation unit 422 performs character recognition on the character recognition target area of the scan image and acquires the characters and their character coordinates. For example, in the example of FIG. 12B, performing character recognition on area 1220 yields the character string (text data) "見積書" (quotation). The character coordinates of the individual characters "見", "積", and "書" are also acquired.

In step S1013, the metadata generation unit 422 detects, from the character string acquired by the character recognition processing in step S1012, the characters to be used in the file name (hereinafter, file name usage characters). In this embodiment, the character string obtained by character recognition is not used as the file name as is; instead, the characters to be used in the file name are detected from it. This is because the character recognition target area in this embodiment is the area containing the selection area and every character string area that overlaps it even partially. In other words, a deliberately wide character recognition target area is detected so that the character string the user wants is more likely to be captured. In some cases, however, the layout analysis detects a character string area that is larger than what the user wants. Character recognition is then performed on a character recognition target area that contains that whole area, so using the acquired character string as the file name without change may not give the result the user wants. That is, the user may want to use only some of the characters in the character string acquired from the character recognition target area. Therefore, in this embodiment, the characters to be used in the file name are detected based on the character coordinates of each character in the acquired character string (text data) and on the selection area. A concrete example follows with reference to the drawings.

FIG. 15A shows the state in which the user has traced from the first operation coordinate 1501 to the second operation coordinate 1502 and the selection area 1510 has been detected. Only character string area 4 is detected as overlapping the selection area 1510, and the circumscribed rectangle containing the selection area 1510 and character string area 4 is detected as the character recognition target area. In the case of FIG. 15A, therefore, the character string (text data) obtained by the character recognition processing in step S1012 is "東京株式会社御中" (the company name "東京株式会社" followed by the honorific "御中"). The character recognition processing also yields the character coordinates of each character, as shown in FIG. 15B, which draws the circumscribed rectangle of each character based on its character coordinates. In step S1013, each character whose character coordinates overlap the selection area at least partially is detected as part of the character string (text data) to be used for the file name. In FIG. 15B, the characters "東", "京", "株", "式", "会", and "社" overlap the selection area, whereas "御" and "中" do not. As a result, "東京株式会社" is detected as the file name usage characters. With this processing, the characters corresponding to the selection area chosen by the user are detected as the file name usage characters, so the characters the user wants are obtained.
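The per-character filtering of step S1013 can be sketched as follows. The OCR result is modeled as a list of (character, rectangle) pairs, which is an assumed data shape; `Rect`, `overlaps`, and `file_name_characters` are hypothetical names, and the coordinates are invented for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a, b):
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def file_name_characters(selection, chars):
    """Keep only the recognized characters whose boxes touch the selection."""
    return "".join(ch for ch, box in chars if overlaps(selection, box))


# Assumed OCR result: eight characters, each 20 px wide, on one row.
text = "東京株式会社御中"
chars = [(ch, Rect(10 + 20 * i, 50, 30 + 20 * i, 70))
         for i, ch in enumerate(text)]

# The swipe ends inside the box of the sixth character.
picked = file_name_characters(Rect(12, 55, 125, 65), chars)
```

The filter uses the same partial-overlap rule as the area detection, so a character is kept even when the swipe covers only part of its box.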

In step S1014, the metadata generation unit 422 determines whether the file name has already been changed on the preview screen 700. If the file name has never been edited, it still holds its initial value, so the process proceeds to step S1015, sets the file name to empty, and then proceeds to step S1016. If the file name has already been edited, characters selected by an earlier touch operation have already been added, so the process proceeds directly to step S1016. In step S1016, the file name usage characters detected in the current step S1013 are appended to the file name. The metadata generation unit 422 then causes the display control unit 426 to set and display the resulting character string in the file name input field 701. A delimiter (for example, a hyphen "-" or an underscore "_") may be added before the file name usage characters are appended.
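Steps S1014 to S1016 can be sketched as a small builder that discards the initial date-based name on the first selection and then appends with a delimiter. `FileNameBuilder` is a hypothetical name, the initial value is a made-up timestamp, and the ASCII underscore is an assumed delimiter choice.

```python
class FileNameBuilder:
    """Accumulates selected strings into a file name."""

    def __init__(self, initial, delimiter="_"):
        self.name = initial        # e.g. a scan date and time string
        self.edited = False        # S1014: has the name been changed yet?
        self.delimiter = delimiter

    def append(self, text):
        if not self.edited:
            self.name = ""         # S1015: drop the initial name
            self.edited = True
        if self.name:
            self.name += self.delimiter
        self.name += text          # S1016: add the newly detected characters
        return self.name


builder = FileNameBuilder("20170414_101500")
builder.append("見積書")
builder.append("東京株式会社")
final_name = builder.append("2017年04月14日")
```

Adding the delimiter only when the name is non-empty avoids a leading separator on the first selection.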

The flowchart of FIG. 10 was described using an example in which the point where the touch starts is the first operation coordinates and the point where the finger is released is the second operation coordinates, but the first and second operation coordinates may be designated by other methods. For example, the user may touch twice, with the first touch giving the first operation coordinates and the second touch giving the second operation coordinates. If multi-touch is supported, the coordinates of multiple simultaneous touches by the user may be used as the first and second operation coordinates.

As described above, in the present embodiment, an area that includes the selection area chosen by the user and any character string area at least partially overlapping that selection area is determined as the character recognition target area, and character recognition processing is performed on that area. With this processing, even if the selection area chosen by the user does not designate the character string area accurately, the target area for the character recognition processing the user wants can be detected appropriately. As a result, the character string that the user wants to use as information related to the scan image data is obtained appropriately.

<< Embodiment 2 >>
In Embodiment 1, an example was described in which characters whose character coordinates at least partially overlap the selection area are acquired as the characters used for the file name. Because the liquid crystal display of the operation unit 220 of the MFP 110 has a small operation area, the user's operation may not always be performed accurately. This embodiment describes a form in which simple editing is performed after the characters used for the file name have been acquired.

FIG. 16 shows correction buttons displayed on the screen for making corrections after the user has edited the file name by designating a selection area three times on the preview screen. In FIG. 16, the user intended to enter "見積書" ("Estimate") and "東京株式会社" ("Tokyo Co., Ltd.") followed by "2017年04月" ("April 2017"), but because the operation area is small the operation was inaccurate and "2017年04月1" was entered instead. To make such operation mistakes easy to correct, correction buttons 1601 to 1604 are displayed in the vicinity of the selected character string area. The vicinity of the character string area may or may not overlap the character string area itself. In FIG. 16, the portions of the image (the character recognition target areas) containing the characters that make up the currently set file name are highlighted. Pressing correction button 1601 adds one character at the head of the selection area, and pressing correction button 1602 deletes the character at the head. Similarly, pressing correction button 1603 deletes the last character of the selection area, and pressing correction button 1604 adds one character at the end. To correct "2017年04月1" to "2017年04月", the user presses correction button 1603 to delete one character from the end. If there is no character to add at the head, pressing correction button 1601 adds nothing; the same applies to correction button 1604. Alternatively, when there is no character to add, the corresponding correction button need not be displayed. Correction buttons 1601 to 1604 are displayed for each selection area, and each press adds or deletes the corresponding character in the file name input field 701.
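One way to picture the four correction buttons is as operations on a character range within the recognized line. The sketch below is an assumption-laden illustration: the `SelectedRun` class, its method names, and the full line text "2017年04月14日" are hypothetical, and treating a delete on an empty selection as a no-op is a choice made here, not something the text specifies.

```python
from dataclasses import dataclass


@dataclass
class SelectedRun:
    """Selected characters as a range within one recognized line (hypothetical)."""
    line_text: str  # all recognized characters of the character string area
    start: int      # index of the first selected character
    end: int        # index one past the last selected character

    @property
    def text(self) -> str:
        return self.line_text[self.start:self.end]

    def add_head(self) -> None:
        # Button 1601: extend by one character at the head, if one exists.
        if self.start > 0:
            self.start -= 1

    def delete_head(self) -> None:
        # Button 1602: drop the first selected character.
        if self.start < self.end:
            self.start += 1

    def delete_tail(self) -> None:
        # Button 1603: drop the last selected character.
        if self.start < self.end:
            self.end -= 1

    def add_tail(self) -> None:
        # Button 1604: extend by one character at the end, if one exists.
        if self.end < len(self.line_text):
            self.end += 1


# Hypothetical full line; the user's trace accidentally caught the trailing "1".
run = SelectedRun(line_text="2017年04月14日", start=0, end=9)
before = run.text
run.delete_tail()   # button 1603
after = run.text
run.add_head()      # button 1601: nothing precedes the first character
```

Keeping the range separate from the file name means each selection area can carry its own set of buttons, which matches the statement that the buttons are displayed per selection area.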

An example using button-like icons that make adding and deleting characters easy to understand was described here, but the form is not limited to this. Any icon that users can easily understand may be used.

As described above, according to the present embodiment, the file name can be edited easily even when the user's designation of the selection area is inaccurate.

<< Embodiment 3 >>
In Embodiment 1, an example was described in which layout analysis is performed first and character recognition processing is performed later, when the user touches the preview area. This is because running layout analysis and character recognition back to back on the entire scanned image lengthens the processing time and the user's wait, which can degrade operability. With the processing described in Embodiment 1, the time the user waits after the scan image data is generated before being able to start setting the file name is reduced. However, a wait then occurs when the user performs an operation, because character recognition processing is performed on the target area only once it has been touched. This embodiment describes a form that reduces the wait at the time of user operation. Specifically, layout analysis processing and character recognition processing are performed in succession. Because character recognition is performed before the user starts the operation of setting the file name, the per-character coordinates have already been detected by the time the user operates. Embodiment 3 uses these previously detected character coordinates. The following description focuses mainly on the differences from Embodiment 1.

In this embodiment, the analysis processing of the scan image data in step S505 of FIG. 5 performs character recognition on the entire scanned image following layout analysis and also detects the character coordinates of each character. The rest is the same as in Embodiment 1, so its description is omitted.

FIG. 17 is a flowchart showing the file name generation processing in Embodiment 3. The processing in steps S1701 to S1703 is the same as that in steps S1001 to S1003 of FIG. 10.

In step S1704, the metadata generation unit 422 holds the file name as it is at the time of the touch. As described later, in this embodiment the file name in the file name input field 701 changes in real time as the user traces the preview area. For example, while a first selection area is being selected (that is, while the selection area is being enlarged or reduced), the file name in the file name input field 701 changes in real time. When the user then lifts the finger from the preview area, the part of the file name corresponding to the first selection area is fixed at that point. While a second selection area is subsequently being selected, the part of the file name in the file name input field 701 corresponding to the second selection area changes in real time. In this way, the file name determined up to that point is fixed, and only the characters corresponding to the selection area being edited are updated in real time. This is why the current file name is held in step S1704.

In step S1705, the metadata generation unit 422 detects a change in the coordinates the user is touching. If the touched coordinates have not changed, monitoring continues until they do. If they have changed, the process proceeds to step S1706, where the changed coordinates are held as the second operation coordinates.

In step S1707, the metadata generation unit 422 displays, superimposed on the preview image, the rectangular area whose vertices are the first operation coordinates and the second operation coordinates as the selection area. In step S1708, the metadata generation unit 422 detects the character areas that overlap the selection area. Because the character coordinates have already been acquired in this embodiment, the areas of the characters whose character coordinates at least partially overlap the selection area are detected. In step S1709, the metadata generation unit 422 detects the character string obtained by joining the characters in the detected areas as the file-name characters.

The processing in steps S1710 and S1711 is the same as that in steps S1014 and S1015, so its description is omitted.

In step S1712, the metadata generation unit 422 appends the characters detected as the file-name characters to the file name held in step S1704. In step S1713, it is then detected whether the user has lifted the finger from the touch panel. If so, the process returns to step S1701; if not, the process returns to step S1705.

As described above, with the processing of this embodiment, the selection area continues to be detected while the user is operating on the preview area. As more characters overlap the selection area, more characters appear in the file name input field 701, and as fewer characters overlap it, fewer appear. The characters in the file name input field 701 are thus set in real time in response to the user's operation.
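The real-time behavior of FIG. 17 can be approximated by recomputing the displayed name from the held file name on every touch move. The following is a rough sketch, not the embodiment's implementation: the generator `live_file_names`, the tuple-based boxes, and the sample coordinates are all hypothetical, and the first-edit reset of steps S1710 and S1711 is assumed to have been applied to `held_name` already.

```python
from typing import Iterable, Iterator, List, Tuple

Box = Tuple[float, float, float, float]  # left, top, right, bottom
Point = Tuple[float, float]


def overlaps(a: Box, b: Box) -> bool:
    # True when the two boxes share some extent on both axes.
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def live_file_names(held_name: str, first: Point, moves: Iterable[Point],
                    chars: List[Tuple[str, Box]]) -> Iterator[str]:
    """Yield the text to show in the file name field after each touch move."""
    for x, y in moves:
        # S1706/S1707: the latest touch position is the second operation
        # coordinate; the selection is the rectangle spanned by both points.
        sel = (min(first[0], x), min(first[1], y),
               max(first[0], x), max(first[1], y))
        # S1708/S1709: join the pre-recognized characters overlapping the selection.
        picked = "".join(ch for ch, box in chars if overlaps(box, sel))
        # S1712: always start again from the name held at touch start (S1704).
        yield held_name + picked


# Hypothetical pre-recognized line of eight 10x10 character boxes.
chars = [(ch, (10 * i, 0, 10 * i + 10, 10))
         for i, ch in enumerate("東京株式会社御中")]
moves = [(15, 6), (55, 6), (75, 6), (55, 6)]  # drag right, overshoot, come back
shown = list(live_file_names("見積書_", (3, 5), moves, chars))
```

Because each update starts from the held name instead of the previously displayed one, shrinking the selection removes characters as naturally as growing it adds them.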

In this embodiment, an example was shown in which character recognition processing is performed in a batch following layout analysis processing, before the user starts the operation of setting the file name. However, if character recognition is sufficiently fast, character recognition may instead be performed on the selection area each time the touch coordinates change.

Alternatively, the configuration of Embodiment 1 may be used as the basis, with character recognition processing performed in the background between the time the preview screen is displayed and the time the user operates it, for example in order starting from the uppermost character string area. The processing may then be switched to that of Embodiment 3 for areas where character recognition has finished.
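The hybrid arrangement above could be organized as a cache that is filled from the top of the page while the device is idle and falls back to on-demand recognition otherwise. The sketch below only illustrates that idea: the `BackgroundRecognizer` class, its methods, and the stand-in `fake_ocr` function are hypothetical, and real scheduling (threads, cancellation) is left out.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # left, top, right, bottom


class BackgroundRecognizer:
    """Recognizes string areas top to bottom, with an on-demand fallback."""

    def __init__(self, areas: Dict[int, Box], recognize: Callable[[Box], str]):
        self._areas = areas
        self._recognize = recognize
        # Pending area ids, uppermost area (smallest top coordinate) first.
        self._pending: List[int] = sorted(areas, key=lambda i: areas[i][1])
        self._done: Dict[int, str] = {}

    def step(self) -> bool:
        """Recognize one more area while idle; return False when none are left."""
        if not self._pending:
            return False
        area_id = self._pending.pop(0)
        self._done[area_id] = self._recognize(self._areas[area_id])
        return True

    def is_ready(self, area_id: int) -> bool:
        """True when the Embodiment 3 style processing can be used for this area."""
        return area_id in self._done

    def text_for(self, area_id: int) -> str:
        """Use the cached result if present, otherwise recognize on demand."""
        if area_id not in self._done:
            self._pending.remove(area_id)
            self._done[area_id] = self._recognize(self._areas[area_id])
        return self._done[area_id]


calls: List[Box] = []


def fake_ocr(box: Box) -> str:
    # Stand-in for a real character recognition call; records each invocation.
    calls.append(box)
    return f"text@{box[1]}"


areas = {1: (0, 50, 80, 60), 2: (0, 10, 80, 20), 3: (0, 90, 80, 100)}
bg = BackgroundRecognizer(areas, fake_ocr)
bg.step()  # idle time before the first touch: recognizes area 2, the uppermost
```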

If the touch panel of the operation unit 220 of the MFP supports multi-touch, the file name generation processing may use the coordinates of the first touch as the first operation coordinates and the coordinates of the second touch as the second operation coordinates. In that case, changes in both touch coordinates may be detected and the selection area changed accordingly, so that both sides of the selection area can be widened or narrowed.

<< Other Embodiments >>
The embodiments described above mainly use file upload as an example, but the invention is not limited to this. The file of the scan image data may instead be stored on the HDD in the MFP 110.

The embodiments above were also described using the example of setting a file name from the character recognition result of a character string area in the scanned image, but the invention is not limited to this form. The metadata may be used, for example, for setting a data transfer destination such as the upload destination of the scanned image, or for setting the destination of a FAX or e-mail transmission. In that case, for example, the upload setting screen 900 of FIG. 9 described earlier may display the character recognition results of character string areas in the scanned image as selectable folder path candidates, so that the path name can be set according to the user's selection. Likewise, a destination setting screen (not shown) may display the character recognition results of character string areas in the scanned image as selectable FAX number or e-mail address candidates, and the FAX number or e-mail address may be set according to the user's selection. The metadata may also be attribute information of the scan image data. For example, information on the character string selected by the user may be added to the attribute information of the file.

The embodiments above were described using the example of scanning a document in a format where characters are written from left to right, but the order of the characters is not limited to this. Nor are the positional relationship between the first and second operation coordinates and the order of the characters limited to the forms described above. For example, when characters are written from left to right, the first operation coordinates may be relatively to the right and the second operation coordinates relatively to the left; that is, the selection area may be designated in the reverse of the character order. The characters may also be arranged in the vertical direction (Y direction).

The embodiments above described a form in which the character coordinates of each character are obtained as a result of character recognition processing, but the invention is not limited to this. Some types of layout analysis processing yield not only the character string areas but also the character coordinates of each character within them. In that case, the desired character recognition target area can be obtained by the determination, described for step S1009, of whether a character string area overlaps the selection area. In step S1013, the result of performing character recognition processing on the character recognition target area may then be used as the file-name character string as it is.

The embodiments above described a form in which the selection area is specified by the user tracing the area of a character string on the preview screen with a finger (a swipe or flick operation), but the invention is not limited to this. The area may be traced with a stylus pen or the like instead of a finger, or designated by another input means such as a mouse cursor.

The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Claims

An apparatus for setting information related to scanned image data obtained by scanning a document, comprising:
Display control means for displaying a preview screen of a scanned image represented by the scanned image data;
Extracting means for extracting a character string area assumed to be a character string in the scan image;
Detection means for detecting, as a target area, a character string area at least partially overlapping a selected area based on the first coordinate and the second coordinate designated in the preview screen among the extracted character string areas;
Recognition means for performing character recognition processing of the target area;
Setting means for setting the information using characters obtained as a result of the character recognition processing.

The apparatus according to claim 1, wherein the detection means detects an area including the selection area and a character string area at least partially overlapping the selection area as the target area.

3. The apparatus according to claim 2, wherein the detection means detects an area of a circumscribed rectangle including the selection area and a character string area at least partially overlapping the selection area as the target area.

The device described in the section, wherein the setting means sets the information using, among the characters obtained as a result of the character recognition processing, the characters that at least partially overlap the selection area.

The apparatus according to any one of claims 1 to 4, wherein, when a plurality of rows of character string areas at least partially overlap the selection area, the detection means detects, as the target area, the row having the larger overlap with the selection area in a second direction intersecting a first direction in which the characters are arranged in the character string area.

The apparatus according to any one of claims 1 to 5, wherein, when there is no character string area at least partially overlapping the selection area, the detection means detects, as the target area, a character string area at least partially overlapping an area obtained by shifting the selection area by a predetermined amount.

The apparatus according to any one of claims 1 to 6, wherein the display control means displays the set information at a predetermined position on the preview screen and displays, in the vicinity of the character string area including the characters that are the source of the information, an icon for instructing addition or deletion of a character constituting the information.

The apparatus according to claim 7, wherein the display control means changes the information displayed at the predetermined position when the icon is selected.

9. The apparatus according to claim 7, wherein the display control means highlights a character string area including a character that is the source of the information on the preview screen.

An apparatus for setting information related to scanned image data obtained by scanning a document, comprising:
Display control means for displaying a preview screen of a scanned image represented by the scanned image data;
Setting means for setting the information using characters obtained as a result of performing character recognition processing on a character string area at least partially overlapping a selected area based on a first coordinate and a second coordinate designated in the preview screen,
wherein the display control means displays the set information at a predetermined position on the preview screen and displays, in the vicinity of the character string area including the characters that are the source of the information, an icon for instructing addition or deletion of a character constituting the information.

The apparatus according to claim 10, wherein the display control means changes the information displayed at the predetermined position when the icon is selected.

12. The apparatus according to claim 10, wherein the display control means highlights a character string area including characters as a source of the information on the preview screen.

An apparatus for setting information related to scanned image data obtained by scanning a document, comprising:
Display control means for displaying a preview screen of a scanned image represented by the scanned image data;
Extracting means for extracting a character string area assumed to be a character string in the scan image;
Recognition means for performing character recognition processing of the extracted character string area;
Detection means for detecting a change in first and second coordinates designated in the preview screen;
A determination unit configured to determine a selected area based on the first coordinate and the second coordinate each time a change by the detection unit is detected;
Setting means for displaying, at a predetermined position on the preview screen, those characters obtained by the character recognition processing that at least partially overlap the determined selected area, and for setting the information, when the coordinates are no longer detected, using the characters that at least partially overlap the selected area.

The apparatus according to any one of claims 1 to 13, wherein the first coordinate is the coordinate at which a touch by the user is detected in the preview screen, and the second coordinate is the coordinate at the time the touched state is released.

The apparatus according to any one of the preceding claims, wherein the first coordinate is the coordinate at which a first touch by the user is detected in the preview screen, and the second coordinate is the coordinate at which a second touch by the user is detected in the preview screen after the first touch is released.

The apparatus, wherein the first coordinate and the second coordinate are the coordinates at which a plurality of simultaneous touches by the user are detected in the preview screen.

The apparatus according to any one of claims 1 to 16, wherein the information is a file name given to the scan image data.

A method for setting information related to scanned image data obtained by scanning a document, comprising:
Displaying a preview screen of a scanned image represented by the scanned image data;
Extracting a character string area assumed to be a character string in the scanned image;
Detecting, as a target area, a character string area at least partially overlapping a selected area based on a first coordinate and a second coordinate designated in the preview screen among the extracted character string areas;
Performing character recognition processing of the target area;
Setting the information using characters obtained as a result of the character recognition process.

A method for setting information related to scanned image data obtained by scanning a document, comprising:
Displaying a preview screen of a scanned image represented by the scanned image data;
Setting the information using characters obtained as a result of performing character recognition processing on a character string area at least partially overlapping a selected area based on a first coordinate and a second coordinate designated in the preview screen;
Displaying the set information at a predetermined position of the preview screen;
Displaying an icon for instructing addition or deletion of characters constituting the information in the vicinity of the character string area including the character as a source of the information.

A method for setting information related to scanned image data obtained by scanning a document, comprising:
Displaying a preview screen of a scanned image represented by the scanned image data;
Extracting a character string area assumed to be a character string in the scanned image;
Performing character recognition processing of the extracted character string area;
Detecting changes in first and second coordinates designated in the preview screen;
Determining a selected region based on the first coordinate and the second coordinate each time the change is detected;
Displaying, at a predetermined position on the preview screen, those characters obtained by the character recognition processing that at least partially overlap the determined selected area, and setting the information, when the coordinates are no longer detected, using the characters that at least partially overlap the selected area.

A program for causing a computer to function as each means of the apparatus according to any one of claims 1 to 17.