WO2000075869A1

WO2000075869A1 - Image processing method

Info

Publication number: WO2000075869A1
Application number: PCT/JP2000/003641
Authority: WO
Inventors: Kazuo Toraichi; Kouichi Wada; Kouichi Mori
Original assignee: Fluency Research & Development Co., Ltd.
Priority date: 1999-06-03
Filing date: 2000-06-05
Publication date: 2000-12-14

Abstract

An image processing method for converting the contents of a paper document into a high-resolution digital image and displaying and printing the image. After a scanner or digital camera captures a paper document and outputs a binarized pixel image to an encoder, the encoder traces the edges of black pixels to form closed contour point sequences, subjects all the closed contour point sequences to function approximation, and outputs functionalized image data on the basis of information on the approximation functions. The viewer determines the approximation functions from the functionalized image data and uses them to generate the closed contours of each color area. The viewer then fills all the closed areas with color and outputs the image for display or printing.

Description

Image processing method

Technical field

The present invention relates to an image processing method for converting an image printed on paper or the like into a high-definition digital image, and storing, displaying, and transmitting the image.

Background art

With the development of computer networks and personal computers in recent years, digital documents have come into wide use. Their advantages are that they can be transferred instantly over a network, take up little space, and can easily be pasted into another document after operations such as enlargement, reduction, and trimming. Digital documents can be created easily with a text editor or word processor, but many documents still exist on paper, such as newspapers, magazines, maps, and drawings. A file on a computer also becomes a paper document once it is printed. There is a growing demand for systems that digitize these paper documents so that the advantages of digital documents can be exploited. Examples of such systems include facsimile for document transmission and filing systems for the storage and reuse of documents. In these systems, paper documents are read by a scanner or a CCD (Charge Coupled Device) camera and handled as pixel-format image files.

Many existing systems do not take changes in the resolution of the output device into account, although in practice that resolution varies with the usage. For example, it must be assumed that a transmitted document will be enlarged or reduced for viewing on the receiving side, or that it will be output to devices of various resolutions, from low-resolution displays to high-resolution printers. However, keeping a separate image file for each resolution makes the data difficult to manage, and always saving at the highest possible resolution wastes resources. Considering the characteristics of document images, most of them differ from natural images in that they can be expressed with a small number of colors and the boundaries between colors are clear. An image can therefore be regarded as a set of closed contours, one for each color region. Image representations based on such contour lines have conventionally been used mainly for relatively simple images, such as outline fonts, figures, and drawings, and have rarely been applied to document images. The reason is that the quality of an image converted automatically from a pixel image is not high, so that manual correction is required, which is difficult to apply to complex document images with a large number of contour lines. One cause of the reduced quality of automatically converted images is that the contour is expressed using only straight-line approximation or only spline approximation.

Disclosure of the invention

The present invention has been made in view of the above points, and an object of the present invention is to provide an image processing method capable of converting the contents of a paper document into a high-definition digital image for display and printing.

According to the image processing method of the present invention, in a first step, the contents drawn on a paper document are read to create a binarized pixel image; in a second step, a contour point sequence included in the pixel image is extracted; and in a third step, functionalized image data is generated by approximating the shape of the contour point sequence by a function. Further, in a fourth step, an approximation function is obtained based on the functionalized image data, and a closed contour of a color region is generated using the approximation function; in a fifth step, the closed contour is filled to generate drawing data; and in a sixth step, display processing or print processing is performed based on the drawing data. Since the contours are approximated by functions, high-resolution digital images can be generated for display and printing.

In particular, it is desirable that the generation of the closed contour in the fourth step described above is performed by generating a sequence of contour points at intervals corresponding to the resolution of the display processing and the print processing in the sixth step. Since the contour point sequence is generated at intervals according to the resolution of the display screen and the resolution of printing, a high-definition digital image suitable for display and printing can be obtained.

In addition, when a change in display magnification is instructed during the display processing in the sixth step described above, it is preferable to generate the closed contour in the fourth step by generating a sequence of contour points at intervals corresponding to the display magnification. Since a display screen generally has a low resolution, the user can often specify the display magnification when the reconstructed image is displayed. In such a case, a high-definition digital image can be displayed at any display magnification by generating the contour point sequence at intervals according to the display magnification.

Further, it is desirable that the function approximation in the third step be performed by dividing the contour point sequence extracted in the second step into a plurality of pieces and determining an optimum approximation function for each of the divided portions. By dividing and processing the contour point sequence, it is possible to reduce the amount of data processing at the time of function approximation. In particular, it is desirable that the division position of the contour point sequence be determined by an optimization method based on dynamic programming. Since the optimal function approximation is performed for each divided part, a more accurate function approximation is possible.

In addition, it is desirable that each of the divided portions described above be associated with an approximation function in the order of a straight line, an arc, and a quadratic B-spline curve. By associating the approximation functions in this order, the amount of data after function approximation can be reduced.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a diagram for explaining an outline of an image processing system according to an embodiment to which the image processing method of the present invention is applied,

FIG. 2 is a diagram illustrating a function approximation expression of a contour,

FIG. 3 is a diagram for explaining the section division of the contour line and the approximation function determination process,

FIG. 4 is a diagram illustrating the approximation errors of a line segment and of an arc,

FIG. 5 is a diagram illustrating a contour reconstruction method in an arc section,

FIG. 6 is a diagram illustrating a contour reconstruction method in a free curve section,

FIG. 7 is a diagram illustrating an experimental system,

FIG. 8 is a diagram showing document images (CCITT test images or parts of them) prepared to evaluate the performance of the implemented encoder and viewer,

FIG. 9 is a diagram showing a document image (a CCITT test image or part of it) prepared to evaluate the performance of the implemented encoder and viewer,

FIG. 10 is a diagram showing document images (paper documents scanned by a scanner) prepared to evaluate the performance of the implemented encoder and viewer,

FIG. 11 is a diagram showing the result of comparing the size of the functionalized image files output with the images shown in FIGS. 8 to 10 as input with the size of GIF files of the same images,

FIG. 12 is a diagram showing an example of an execution screen using a Java application,

FIG. 13 is a diagram showing an example of an execution screen using the Java application,

FIG. 14 is a diagram showing an example of display on a web browser using a Java applet,

FIG. 15 is an enlarged view of the original pixel image, and

FIG. 16 is a diagram showing experimental results regarding the processing speed.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, an image processing system according to an embodiment to which the image processing method of the present invention is applied will be described in detail with reference to the drawings. The following four items are important specifications for constructing the image processing system of the present embodiment.

(1) Able to output images of various resolutions with high quality using one data file.

(2) The file size is small.

(3) Sufficient image quality can be obtained by completely automatic file generation.

(4) Reconstruction and display must be fast in order to enhance the interactive nature.

In the present embodiment, an image processing system based on these specifications is provided. The image processing system according to the present embodiment consists of two main parts (see Fig. 1): an encoder that takes the contents drawn on a paper document as input and outputs functionalized image data, and a viewer that reconstructs the document image from the image data and displays it. It is assumed that the encoder and the viewer run on personal systems, and that image files are transmitted using network services such as mail services and web services so that images can be displayed and printed on a remote system.

In the present embodiment, a document or figure drawn on a paper document is captured by a scanner or a digital camera, and the binarized pixel image is used as the input image. The encoder traces the edges of black pixels to obtain closed contour point sequences. Function approximation is performed on all closed contour point sequences, and functionalized image data is output based on the information on the approximation functions. The viewer obtains the approximation functions from the functionalized image data and uses them to generate the closed contours of the color regions. For display or printing, all closed regions are filled and output.

Next, a method of expressing a document image as a functionalized image will be described. The target document image has clear outlines, so once it is binarized, closed contours can be obtained easily by following its edges. Here, a previously proposed contour extraction method that is fast and memory-efficient is used. The obtained contour point sequence is denoted by $P_i = (x_i, y_i)$, $i = 1, \ldots, L$, and the distance between consecutive points is 1 or less.
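The contour extraction method referred to above is not reproduced in this document. Purely as an illustration, the following Python sketch traces the unit edges between black and white pixels and chains them into closed point sequences, so that consecutive points are exactly 1 apart. The function name `trace_closed_contours`, the corner-coordinate convention, and the edge-chaining strategy are assumptions made for this sketch and are not taken from the patent.

```python
from collections import defaultdict


def trace_closed_contours(img):
    """Return closed loops of pixel-corner points bounding the black (1) pixels.

    img is a list of rows of 0/1 values. Pixel (x, y) covers the unit square
    with corners (x, y) and (x + 1, y + 1). Hypothetical helper for illustration.
    """
    h, w = len(img), len(img[0])

    def is_black(x, y):
        return 0 <= x < w and 0 <= y < h and img[y][x] == 1

    # Directed unit edges along every black/white boundary. Each black pixel
    # contributes the sides of its square that face a white (or outside) pixel,
    # always walking its own boundary in the same rotational sense.
    out = defaultdict(list)
    for y in range(h):
        for x in range(w):
            if not is_black(x, y):
                continue
            if not is_black(x, y - 1):
                out[(x, y)].append((x + 1, y))          # top side
            if not is_black(x + 1, y):
                out[(x + 1, y)].append((x + 1, y + 1))  # right side
            if not is_black(x, y + 1):
                out[(x + 1, y + 1)].append((x, y + 1))  # bottom side
            if not is_black(x - 1, y):
                out[(x, y + 1)].append((x, y))          # left side

    # Chain edges into closed loops. Every corner has as many incoming as
    # outgoing edges, so a walk can only run out of edges back at its start.
    loops = []
    for start in sorted(out):
        while out[start]:
            loop, p = [start], out[start].pop()
            while p != start:
                loop.append(p)
                p = out[p].pop()
            loops.append(loop)
    return loops
```

For a ring-shaped region this yields two loops, one for the outer boundary and one for the hole, which is the kind of closed contour point sequence that the function approximation step takes as input.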

The obtained contour point sequence is discrete, but if it is approximated by a continuous function, a contour line can be output at any resolution. Many existing contour approximation methods approximate the entire closed contour by a combination of straight lines or by spline functions (Bézier functions). In the present method, the contour line is instead divided into sections and each section is approximated using the optimal function. Because functions of several classes are available, an appropriate approximation can be made for each section, so that the amount of approximation data can be reduced. In the image processing system of the present embodiment, the contour is represented by a combination of straight lines, quadratic B-spline curves, and arcs. In the example of Fig. 2, the contour is divided into three sections, which are approximated by a B-spline, an arc, and a straight line, respectively.

Next, a method for determining the approximation functions will be described. If the section division is known, the contour can be approximated using multiple classes of continuous functions as described above. If the approximation function chosen for a section is not appropriate, however, the data required for the approximation increases. The section division and the determination of the approximation functions are therefore important processes that greatly affect both the approximation accuracy and the amount of data. If a human supplied hints about how to divide the contour lines, the required fully automatic processing could not be achieved. Many existing methods try to obtain an appropriate division through multi-stage processing, such as corner extraction using curvature, tentative determination of approximation functions by local fitting, and post-processing of the resulting sections. The processing is performed in multiple stages because division by curvature and local fitting cannot by themselves produce a section division that is consistent over the whole contour. DP (dynamic programming), one of the optimization methods, is effective for segmenting such one-dimensional signal sequences. In the image processing system of this embodiment, the fitting error between the approximation function and each divided section is determined, and the resulting cost values are summed over the sections to determine the optimal fit for the entire contour line. The section division of the contour line using DP and the approximation function determination processing are described below.

Let the contour point sequence be $P_i = (x_i, y_i)$ ($1 \le i \le L$, where $L$ is the contour length), and consider dividing it into $H$ sections at the points $i = i_0, i_1, \ldots, i_h, \ldots, i_H$.

Let $d(i_{h-1}, i_h)$ be the cost of approximating the section from $i_{h-1}$ to $i_h$ with a function, and express the overall cost as the sum of the approximation costs of the sections:

$$g(H) = \sum_{h=1}^{H} d(i_{h-1}, i_h), \qquad i_h - i_{h-1} \ge l_{min} \tag{1}$$

where $l_{min}$ is the minimum section length. As an example, Fig. 3 shows a case where the contour is divided into three sections by two appropriately chosen points.

The approximation cost function for each section is designed not only to reduce the approximation error but also to make each approximated section as long as possible. If the only purpose were to reduce the approximation error, the fitting error could be used directly as the cost function. However, the error of a spline curve can always be reduced by increasing the number of its coefficients, so if the decision were based on the fitting error alone, every section would become a free curve section. Therefore, in the image processing system of the present embodiment, straight lines and circular arcs are selected preferentially, and the cost function is set so that a section that is neither of them becomes a free curve section. The cost function used here is shown in equation (2) below.

$$d(i_{h-1}, i_h) = \begin{cases} \;\ldots \\ C_{spl}, & \text{otherwise} \end{cases} \tag{2}$$

Here $L_i = |i_h - i_{h-1}|$ is the section length, $\varepsilon_{arc}^2$ is the arc approximation error, $\varepsilon_{seg}^2$ is the line segment approximation error, and $r$ is the radius of the approximating arc. $d_{thd}$ is a constant indicating the maximum allowable cost; if the cost of an approximation function exceeds it, that function is not used. At present, $d_{thd} = 0.018$ is set experimentally. $C_{spl}$ is the cost of spline curve approximation, and is set to a relatively large constant, $C_{spl} = 0.02$, so that a spline is not selected in preference to a straight line or an arc. The error $\varepsilon_{seg}^2$ of approximation by a line segment is the mean square error of the points $P_k$ of the section with respect to the straight line connecting the end points $P_{i_{h-1}}$ and $P_{i_h}$. If this straight line is $ax + by + c = 0$, the value is

$$\varepsilon_{seg}^2 = \frac{1}{L_i} \sum_{k} \frac{(a x_k + b y_k + c)^2}{a^2 + b^2} \tag{3}$$

(see Fig. 4 (a)). The arc approximation error $\varepsilon_{arc}^2$ is the mean square error of the partial point sequence with respect to the approximating circle. If the end points of the section to be approximated are located symmetrically on the X axis as shown in Fig. 4 (b), the center coordinates $(b, 0)$ of the approximating circle can be obtained by the following equation.

$$\ldots \tag{4}$$

Here $\varepsilon_{arc}^2$ is the mean square of the distance from the center of the arc to each contour point $P_k$ minus the radius, so that

$$\varepsilon_{arc}^2 = \frac{1}{L_i} \sum_{k} \left( \sqrt{(x_k - b)^2 + y_k^2} - r \right)^2 \tag{5}$$

However, in an actual contour point sequence the end points are very rarely in this position on the X axis, so before the calculation the point sequence must be translated and rotated so that the end points have this positional relationship.
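The two error measures described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the mean is taken over all points of the section including both end points, and `arc_mse` takes an already known center and radius as arguments rather than deriving the center from the fitting equation.

```python
import math


def segment_mse(points):
    """Mean square distance of the points from the line through the end points.

    The line is written as a*x + b*y + c = 0, built from the first and last
    point of the section (which must be distinct).
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    a, b = y1 - y0, x0 - x1
    c = x1 * y0 - x0 * y1
    norm = a * a + b * b
    total = sum((a * x + b * y + c) ** 2 for x, y in points)
    return total / (norm * len(points))


def arc_mse(points, center, radius):
    """Mean square of (distance from the center to each point) minus the radius."""
    cx, cy = center
    total = sum((math.hypot(x - cx, y - cy) - radius) ** 2 for x, y in points)
    return total / len(points)
```

Because `arc_mse` accepts an arbitrary center, it does not need the point sequence to be translated and rotated first; that normalization matters only when the center is computed in the canonical position described above.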

The optimal section division is obtained by minimizing the total cost $g(H)$ under the condition $i_H = L$. This minimization can be expressed as a recurrence formula with an initial cost of zero.

As shown above, the cost of a section is calculated from the approximation error and the section length of that section alone, independently of the state of the preceding and following sections. Therefore, this optimization can be performed efficiently using DP. A feature of DP is that once the optimal division into $H$ sections has been calculated, the optimal divisions into fewer than $H$ sections have been calculated at the same time. Because the minimum section length is $l_{min}$, the maximum value of $H$ is the largest integer less than or equal to $L / l_{min}$. Finally, the total costs for each $H$ are compared, the $H$ with the least cost is selected, and the data is output using the section division and approximation functions for that $H$.
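The dynamic programming search over section divisions can be sketched as follows. This is an illustrative Python sketch under stated assumptions: points are indexed from 0, neighbouring sections share their boundary point, and the table `g[h][i]` (least cost of covering points 0 to i with exactly h sections) is one common way of writing such a recurrence, not necessarily the one used in the patent. The toy cost `chord_cost` (mean square distance to the chord plus a fixed per-section penalty) is a stand-in chosen for the example and is not the cost function of equation (2).

```python
import math


def optimal_division(n_points, cost, l_min):
    """Return (total_cost, cut_indices) minimising the sum of cost(j, i).

    cost(j, i) is the cost of approximating points j..i as one section.
    Requires n_points - 1 >= l_min so that at least one section fits.
    """
    last = n_points - 1
    max_h = last // l_min                      # largest possible section count
    g = [[math.inf] * n_points for _ in range(max_h + 1)]
    back = [[-1] * n_points for _ in range(max_h + 1)]
    g[0][0] = 0.0                              # nothing covered, zero cost
    for h in range(1, max_h + 1):
        for i in range(h * l_min, n_points):
            for j in range((h - 1) * l_min, i - l_min + 1):
                if g[h - 1][j] == math.inf:
                    continue
                c = g[h - 1][j] + cost(j, i)
                if c < g[h][i]:
                    g[h][i], back[h][i] = c, j
    # Compare the totals for every section count H and keep the cheapest.
    best_h = min(range(1, max_h + 1), key=lambda h: g[h][last])
    cuts, i = [last], last
    for h in range(best_h, 0, -1):
        i = back[h][i]
        cuts.append(i)
    return g[best_h][last], cuts[::-1]


def chord_cost(points, penalty=0.1):
    """Toy cost: mean square distance to the chord plus a per-section penalty."""
    def cost(j, i):
        (x0, y0), (x1, y1) = points[j], points[i]
        a, b, c = y1 - y0, x0 - x1, x1 * y0 - x0 * y1
        err = sum((a * x + b * y + c) ** 2 for x, y in points[j:i + 1])
        return err / ((a * a + b * b) * (i - j + 1)) + penalty
    return cost


# An L-shaped point sequence with its corner at index 5.
corner = [(k, 0) for k in range(6)] + [(5, k) for k in range(1, 6)]
total, cuts = optimal_division(len(corner), chord_cost(corner), l_min=2)
```

The per-section penalty plays the role of favouring long sections: without it, splitting into more sections would never increase the total cost. The triple loop also makes the quadratic growth of the DP calculation with contour length visible.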

Thus, with the approximation function determination method using DP, the section division and the determination of the approximation functions can be carried out on a theoretical basis as an optimization, but the amount of DP calculation increases in proportion to the square of the contour length. In addition, the amount of calculation for fitting increases in proportion to the length, so applying DP as it is to a long contour is problematic in practice. To reduce the amount of calculation, the image processing system of the present embodiment divides long contour lines and processes the parts separately.

Next, the structure of the functionalized image file and the playback / display method will be described. First, the structure of the functionalized image file will be described. The outline point sequence is divided into sections by the above-described method, and the class of the function that approximates each section is determined. As an image file, the minimum data that can reproduce the approximate contour line based on the information of the approximate function should be stored. As information common to the entire image, the number of contour lines constituting the image is required. On the other hand, information on each contour includes information on the entire contour and information on an approximation function for each approximation section. Information on the entire contour requires the color of the area, the number of sections, and the start point coordinates. Information about the approximation function for each approximation interval differs depending on the class of the approximation function, and the following information is stored for each.

(1) Straight section

Although both end points are required to reproduce a straight line, the start point is the end point of the previous approximation section, so only the end point coordinates of the straight line need to be stored.

(2) Arc section

If the start point, end point, center coordinates, and drawing direction (clockwise or counterclockwise) are known, the arc can be reproduced. As with a straight section, the start point is the end point of the previous section, so there is no need to store it. Only the center coordinates, the end point, and a drawing direction flag need to be stored. The center coordinates are already known from the fitting result when the stored data is generated, so that result is used.

(3) Free curve section

In a free curve section, the contour is approximated by a B-spline function. To do so, the number of knots must be determined. Since this cannot be determined analytically, fitting is performed stepwise to find the minimum number of knots for which the error is within an allowable range (here, within 1 pixel). The data to be stored are the spline coefficients from the fit that determined the number of knots; the number of coefficients is the same as the number of knots.
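The stored information listed above can be pictured as a small set of record types. The following Python sketch is only an illustration of that layout: every class and field name is hypothetical, and the patent does not specify a byte-level file format, so none is implied here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, float]


@dataclass
class LineSection:
    end: Point                      # start is the previous section's end point


@dataclass
class ArcSection:
    center: Point                   # taken from the fitting result
    end: Point
    clockwise: bool                 # drawing direction flag


@dataclass
class SplineSection:
    coefficients: List[Point]       # B-spline coefficients from the fit


Section = Union[LineSection, ArcSection, SplineSection]


@dataclass
class Contour:
    color: int                      # color of the enclosed region
    start: Point                    # start point coordinates
    sections: List[Section] = field(default_factory=list)

    @property
    def section_count(self) -> int:
        return len(self.sections)


@dataclass
class FunctionalizedImage:
    contours: List[Contour] = field(default_factory=list)

    @property
    def contour_count(self) -> int:
        return len(self.contours)
```

In this sketch the per-image contour count and the per-contour section count are derived from the lists rather than stored separately; a serialized file would need to write them explicitly so that a reader knows how many records follow.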

Next, a method for reconstructing a functionalized image will be described. To output a functionalized image, the contours must be reconstructed from the data stored in the file, and since most output devices are designed to output pixel images, the result must ultimately be converted to a pixel image. Here, each closed contour point sequence is reconstructed directly from the approximation function data as a highly accurate polygon, and the image is output by filling its interior. If the contour point sequence is reconstructed at a density corresponding to the resolution of the output device, the resolution of the output device can be fully utilized. Since the process of reconstructing a polygon differs depending on the approximation function, the reconstruction method for each approximation function is described below.

(1) Straight section:

In the straight section, the end point may be added as the vertex of the polygon.

(2) Arc section:

In the arc section, the center and the end point are known from the data, so the start point is added to this, and the radius, start angle, and end angle of the arc are calculated from that, and the point sequence on the arc is calculated. At this time, the density of contour points can be changed by adjusting the step angle (see Fig. 5).

(3) Free curve section:

In the free curve section, the approximate contour is obtained by calculating the convolution of the spline coefficients and the B-spline in the x and y directions, respectively. By changing the sampling density of the B-spline when calculating the convolution, contour points with different densities can be constructed (see Fig. 6).
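The arc and free curve reconstruction steps above can be sketched in Python as follows. Both functions are illustrative assumptions rather than the patent's code: `arc_points` chooses the step angle from a requested point spacing, with "clockwise" meaning decreasing mathematical angle (y axis pointing up), and `quadratic_bspline_points` assumes a uniform quadratic B-spline whose coefficients are 2D points, sampling each span a fixed number of times.

```python
import math


def arc_points(start, end, center, clockwise, spacing):
    """Points along the arc from start to end, roughly `spacing` apart.

    The start point itself is omitted because it is the previous section's end.
    """
    cx, cy = center
    r = math.hypot(start[0] - cx, start[1] - cy)
    a0 = math.atan2(start[1] - cy, start[0] - cx)
    a1 = math.atan2(end[1] - cy, end[0] - cx)
    sweep = (a1 - a0) % (2 * math.pi)          # counterclockwise sweep
    if clockwise:
        sweep -= 2 * math.pi                   # go the other way round
    n = max(1, math.ceil(abs(sweep) * r / spacing))
    return [(cx + r * math.cos(a0 + sweep * k / n),
             cy + r * math.sin(a0 + sweep * k / n)) for k in range(1, n + 1)]


def quadratic_bspline_points(coeffs, samples_per_span):
    """Sample a uniform quadratic B-spline defined by 2D coefficients."""
    pts = []
    for k in range(len(coeffs) - 2):
        (x0, y0), (x1, y1), (x2, y2) = coeffs[k:k + 3]
        for s in range(samples_per_span):
            t = s / samples_per_span
            b0 = 0.5 * (1 - t) ** 2            # the three basis weights
            b1 = 0.5 * (-2 * t * t + 2 * t + 1)  # always sum to 1
            b2 = 0.5 * t * t
            pts.append((b0 * x0 + b1 * x1 + b2 * x2,
                        b0 * y0 + b1 * y1 + b2 * y2))
    return pts
```

Halving `spacing` or doubling `samples_per_span` roughly doubles the number of polygon vertices, which is how a single stored contour can be rebuilt at a density suited to a low-resolution display or a high-resolution printer.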

Next, the functions to be implemented in the viewer will be described. Since the specifications that take priority for the viewer differ depending on the type of output device, image output processing matched to the output device is required. Here, the specifications that take priority in implementation are considered for the printer and for the display.

(1) Printer

For a printer, an output result that makes full use of the printer's resolution is required, even if output takes some time. Therefore, a high-precision polygon is generated and output according to the resolution of the device.

(2) Display

Since high display quality is not expected of a low-resolution display, the main purpose is a preview-like display. Interactive performance is required more than display quality, so drawing must be performed at high speed. Unlike a printer, a display application must also respond to user instructions to change the display magnification, enlarging the image to show its details and reducing it to show an overview.
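Filling the reconstructed closed contours at a chosen magnification can be sketched as follows. This Python example is an assumption-laden illustration: it fills with the even-odd rule by sampling pixel centers along each scanline, whereas the text only states that the closed regions are filled, and the `zoom` parameter is a hypothetical stand-in for the display magnification or for the ratio of device resolution to source resolution.

```python
def fill_even_odd(loops, width, height, zoom=1.0):
    """Rasterise closed loops into a 0/1 grid using the even-odd rule.

    loops is a list of closed polygons (lists of (x, y) points in source
    coordinates). Output pixel (col, row) samples the source at its center
    divided by zoom, so zoom=2.0 doubles the output size of the same shapes.
    """
    grid = [[0] * width for _ in range(height)]
    for row in range(height):
        y = (row + 0.5) / zoom
        crossings = []
        for loop in loops:
            for (x0, y0), (x1, y1) in zip(loop, loop[1:] + loop[:1]):
                if (y0 <= y) != (y1 <= y):     # edge crosses this scanline
                    crossings.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Between the 1st and 2nd crossing we are inside, between the 2nd
        # and 3rd outside, and so on.
        for xa, xb in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if xa <= (col + 0.5) / zoom < xb:
                    grid[row][col] = 1
    return grid


# A 3x3 square with a 1x1 hole, drawn at the original size and enlarged 2x.
ring = [[(0, 0), (3, 0), (3, 3), (0, 3)], [(1, 1), (1, 2), (2, 2), (2, 1)]]
small = fill_even_odd(ring, 3, 3)
large = fill_even_odd(ring, 6, 6, zoom=2.0)
```

The same contour data produces both grids; only the sampling density changes. For a printer the grid would simply be larger, at the cost of more computation, while a display would favor a coarser and faster pass.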

Next, the result of evaluating the image processing system of the present embodiment will be described. Two experiments were performed: one on the encoder, which reads a paper document with a scanner and uses it as an input image to generate a functionalized image file, and one on the viewer, which reads a functionalized image file and displays and prints the image. The viewer allows images to be enlarged and reduced freely with simple mouse-only operations. For the implementation, C++ is used for the encoder and Java 2 is used for the functionalized image viewer. In addition to a viewer that displays and prints files that exist locally, a viewer was created as a Java applet running in a web browser, assuming a system operating over a network (see Fig. 7). Five document images, shown in Figs. 8 to 10, were prepared to evaluate the performance of the implemented encoder and viewer. Figures 8 (a), 8 (b), and 9 are CCITT test images or parts of them, and Figures 10 (a) and 10 (b) are paper documents scanned by a scanner. Each image is a black-and-white binary image with a resolution of about 300 dpi.

First, experimental results on the encoder will be described. Since the required specifications for the encoder include the size of the image file, the experiment focused on file size. The size of the functionalized image file output for each of the images in Figs. 8 to 10 was compared with the size of a GIF file of the same image. At the same time, the relationship between the number of regions in the image and the size of the functionalized image file was examined. The results of this experiment are shown in Fig. 11. The functionalized image files were about 1.7% to 19% of the size of the original bitmap images, and about 0.2 to 1.3 times the size of the GIF files. The image shown in Fig. 8 (b), whose functionalized file is larger than the GIF, contains many fine characters, so the number of fine and complex outlines increases, which leads to an increase in file size. For the other images, the file size can be said to be sufficiently small compared to GIF. The correlation between the number of regions and the file size is 0.97, which indicates that the number of regions has a large effect on the size of the functionalized image file.

Next, experimental results regarding the display application will be described. Here, experiments were performed on the accuracy of the output image, the contour reconstruction speed, the drawing speed, and so on, of the viewer, which is the functionalized image display application. Figs. 12 to 15 show example screens of the implementation together with a pixel image for comparison. Figs. 12 and 13 are example screens of the implementation using the Java application, and Fig. 14 is an example of display using the Java applet. Fig. 15 is an enlarged view of the original pixel image for comparison with the image processing system of the present embodiment. Figs. 12 to 15 show that in the image processing system of the present embodiment, both reduced and enlarged images are reproduced with high accuracy.

So far, the focus has been on display accuracy. Next, the results of experiments on processing speed are shown. Fig. 16 is a diagram showing experimental results regarding the processing speed. In the experiment of Fig. 16, the time required to read the functionalized image file and reconstruct the contours and the time required to draw all the contours were measured. As shown in Fig. 16, the time required for contour reconstruction is 0.16 to 1.47 seconds, and the drawing time is 0.04 to 1.32 seconds. At such speeds the responsiveness of the application can be said to be sufficient. In this experiment the entire image is displayed; when the image is enlarged, more contours fall outside the screen and need not be drawn, so the drawing time is greatly reduced. The viewers are implemented in Java 2, which runs on a virtual machine and is inferior in execution speed to a native execution environment. In addition, the graphics library used here is the standard one and is not optimized for the latest graphics hardware. Nevertheless, the results in Fig. 16 show that high-speed drawing is possible. If a library that takes advantage of the performance of graphics accelerators is introduced in the future, even faster drawing can be expected.

As described above, the image processing system according to the present embodiment can digitize a paper document with high definition by expressing the document image as a functionalized image. In the present embodiment, since the contour is approximated by continuous functions of a plurality of classes, a single image file can be output at an arbitrary resolution. In addition, the functionalized image file used in the present embodiment is built by approximating each part of the contour with an optimal function and extracting the minimum data necessary for reproduction, so the file size is small. Reconstruction and display are also very fast.

Industrial applicability

As described above, according to the present invention, since contour lines of figures and the like read from a paper document are approximated by functions, a high-definition digital image can be generated at the time of display or printing.

Claims

1. An image processing method comprising:

a first step of reading the contents drawn on a paper document and creating a binarized pixel image;

a second step of extracting a contour point sequence included in the pixel image;

a third step of generating functionalized image data by approximating the shape of the contour point sequence by a function;

a fourth step of obtaining an approximation function based on the functionalized image data generated in the third step, and generating a closed contour of a color region using the approximation function;

a fifth step of generating drawing data by filling the closed contour; and

a sixth step of performing display processing or print processing based on the drawing data.

2. The image processing method according to claim 1, wherein the generation of the closed contour in the fourth step is performed by generating a sequence of contour points at intervals corresponding to the resolution of the display processing or the print processing in the sixth step.

3. The image processing method according to claim 1, wherein, when the user instructs a change of display magnification during the display processing in the sixth step, the closed contour is generated in the fourth step by generating the contour point sequence at intervals corresponding to the display magnification.

4. The image processing method according to claim 1, wherein the function approximation in the third step is performed by dividing the contour point sequence extracted in the second step into a plurality of portions and determining an optimal approximation function for each divided portion.

5. The image processing method according to claim 4, wherein a division position of the outline point sequence is determined by an optimization method based on dynamic programming.

6. The image processing method according to claim 4, wherein the divided portions are associated with approximation functions in the order of a straight line, a circular arc, and a quadratic B-spline curve.