CN111179289A

CN111179289A - An image segmentation method suitable for long and wide webpage images

Info

Publication number: CN111179289A
Application number: CN201911423525.5A
Authority: CN
Inventors: Zeng Hao (曾浩); Gao Fan (高凡)
Original assignee: Chongqing University of Posts and Telecommunications
Current assignee: Chongqing University of Posts and Telecommunications
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-05-19
Anticipated expiration: 2039-12-31
Also published as: CN111179289B

Abstract

The invention relates to an image segmentation method suitable for long and wide webpage images and belongs to the field of image processing. The method comprises the following steps: automatically filling the image boundary by extrapolation and converting the image to grayscale; after filtering, performing gradient edge extraction and hysteresis-threshold binarization based on the Canny operator; applying morphological processing of closing, erosion and dilation; and finally performing contour extraction and correction, filtering and deduplicating according to a certain size rule, and outputting the segmented images. In terms of algorithm complexity, the invention has no complex model pre-training process and does not need to allocate massive system resources to the weight parameters of a convolutional network model. It completes image segmentation of long and wide webpage images under limited CPU usage without calling GPU resources, in line with the three principles of image preprocessing: simplicity, speed and effectiveness.

Description

Image segmentation method suitable for long and wide webpage images

Technical Field

The invention belongs to the field of image processing and relates to an image segmentation method suitable for long and wide webpage images.

Background

Image segmentation is an important branch of image preprocessing in the field of computer vision and an indispensable preprocessing technique in large-scale content-based image retrieval. The quality of image segmentation has a decisive influence on the results of engineering projects in computer vision.

Existing image segmentation techniques fall mainly into three groups: traditional segmentation based on image attributes (such as gray threshold segmentation, edge gradient segmentation and segmentation by a histogram averaging method); segmentation based on specific theories (such as feature clustering segmentation, graph theory-based segmentation and wavelet transform segmentation); and, more recently, semantic segmentation based on deep learning, which exploits the strong nonlinear capability of neural networks and has achieved good results in computer vision.

Although image segmentation algorithms have been studied for a long time, most of the above techniques operate at the semantic level on a single image of normal size. Long and wide webpage images are oversized and contain complex, dense layout blocks and text. For such images, traditional segmentation methods cannot achieve efficient, fast and good segmentation, and they can produce duplicated or missed images. Semantic segmentation algorithms based on deep learning are considerably more complex, which increases the time consumed by the engineering process and occupies memory and GPU resources, violating the three principles of image preprocessing: simplicity, speed and effectiveness. In addition, existing deep learning-based semantic segmentation algorithms cannot yet meet the requirements of segmenting images at the size of long and wide webpage images, and their segmentation quality and robustness are insufficient.

Disclosure of Invention

In view of the above, the present invention provides an image segmentation method suitable for long and wide webpage images.

In order to achieve this purpose, the invention provides the following technical solution:

An image segmentation method suitable for long and wide webpage images comprises the following steps: automatically filling the image boundary by extrapolation; converting the image to grayscale; after filtering, performing gradient edge extraction and hysteresis-threshold binarization based on the Canny operator; performing morphological processing of closing, erosion and dilation; performing contour extraction and correction; filtering and deduplicating according to a certain size rule; and outputting the segmented images.

The specific steps are as follows:

(1) Read in and save the long and wide webpage image;

(2) Fill the border of the original image: a border value parameter is given and used as a constant pixel value to automatically fill the image boundary by extrapolation, so that pictures close to the edge of the webpage can be identified in the subsequent contour detection;

(3) Convert the multi-channel picture to grayscale: convert the picture from the BGR three-color space to gray space, and output the converted single-channel picture;

(4) Perform gradient edge extraction and hysteresis-threshold binarization on the grayscale image based on the Canny operator;

(5) Perform morphological processing: establish an elliptical kernel and apply closing, erosion and dilation so that the image boundaries become clear;

(6) Perform the first contour generation, establishing contours in a hierarchical tree structure; compress the contour information in the horizontal, vertical and diagonal directions, keeping only the end-point coordinates in each direction, so that a rectangular contour is stored with only 4 points; according to the first contour information, suppress contour regions of a specific size in the binary image, that is, set to zero the repeatedly covered regions of the specific contours in the binary image; generate the contours again; and eliminate contours of unconventional size, such as those whose height or width is too large or too small, to complete the contour correction;

(7) Solve the inscribed matrix of each contour, determine the range of the contour's inscribed matrix in the original image, and save and output the picture of each contour region to complete the image segmentation.
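Steps (2) and (3) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the implementation of the invention: the helper names `pad_constant` and `bgr_to_gray`, the white fill value, the one-pixel border width and the 0.114/0.587/0.299 gray weights do not appear in the source and are chosen only for the example.

```python
def pad_constant(image, border, value):
    """Surround a 2-D grid of pixels with `border` rows and columns of `value`."""
    width = len(image[0]) + 2 * border
    top = [[value] * width for _ in range(border)]
    bottom = [[value] * width for _ in range(border)]
    middle = [[value] * border + list(row) + [value] * border for row in image]
    return top + middle + bottom


def bgr_to_gray(image):
    """Convert a grid of (B, G, R) tuples to single-channel gray values.

    The weights are the commonly used luma weights (an assumption here).
    """
    return [
        [int(round(0.114 * b + 0.587 * g + 0.299 * r)) for (b, g, r) in row]
        for row in image
    ]


WHITE = (255, 255, 255)  # assumed constant fill value

# A tiny 2x2 "page" in BGR order: black, white, pure red, pure blue.
page = [
    [(0, 0, 0), (255, 255, 255)],
    [(0, 0, 255), (255, 0, 0)],
]

padded = pad_constant(page, border=1, value=WHITE)  # 4x4 after padding
gray = bgr_to_gray(padded)                          # single channel
```

Padding before contour detection means a picture touching the page edge is still fully surrounded by background pixels, which is the reason the source gives for step (2).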

Optionally, step (4) specifically includes:

First, filter the noise in the image with a 5x5 Gaussian filter, and compute the first derivatives in the horizontal and vertical directions with the Sobel operator. Second, apply non-maximum suppression to the derivative values of the image: at each pixel, check whether the value is a local maximum within its neighborhood along the gradient direction, and set it to zero otherwise, thereby obtaining the image gradient values based on the Canny operator. Finally, binarize with the hysteresis threshold: given the upper limit maxVal and the lower limit minVal of the hysteresis threshold, edges are determined as follows: pixels with gradient > maxVal are definite edges and are retained; pixels with gradient < minVal are definite non-edges and are discarded; pixels between the two are retained if they are connected to a definite edge and discarded otherwise.
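The hysteresis-threshold rule described above can be illustrated with a short sketch. It assumes the gradient magnitudes have already been computed and thinned by non-maximum suppression, treats "connected" as 8-connectivity, and uses a hypothetical function name `hysteresis_threshold`; only `minVal` and `maxVal` come from the source.

```python
from collections import deque


def hysteresis_threshold(grad, min_val, max_val):
    """Binarize a gradient-magnitude grid using two thresholds.

    > max_val           : definite edge, kept
    < min_val           : definite non-edge, discarded
    between the two     : kept only if 8-connected to a definite edge
    """
    rows, cols = len(grad), len(grad[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Definite edges seed the search.
    for r in range(rows):
        for c in range(cols):
            if grad[r][c] > max_val:
                edges[r][c] = 255
                queue.append((r, c))

    # Grow outward from definite edges through pixels that are not below min_val.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and edges[nr][nc] == 0
                        and grad[nr][nc] >= min_val):
                    edges[nr][nc] = 255
                    queue.append((nr, nc))
    return edges


# 200 is a definite edge; the first 120 touches it and is kept,
# the second 120 is isolated and is discarded.
grad = [
    [200, 120, 10, 120],
    [10, 10, 10, 10],
]
edges = hysteresis_threshold(grad, min_val=100, max_val=150)
```

The two-threshold rule keeps weak edge segments that continue a strong edge while dropping isolated weak responses such as noise or faint text strokes.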

The invention has the following beneficial effects:

The method targets long and wide webpage images, which are characterized by dense layout and complex content. It outputs standardized segmented sub-images without duplicated or missed images, and it filters out the text information and the non-target contours of unusual size in the original webpage.

In terms of algorithm complexity, compared with image semantic segmentation based on deep learning, the image segmentation technique provided by the invention has no complex model pre-training process and does not need to allocate massive system resources to the weight parameters of a convolutional network model (convolutional layers, pooling layers and fully connected layers). It completes image segmentation of long and wide webpage images under limited CPU usage without calling GPU resources, and accords with the three principles of image preprocessing: simplicity, speed and effectiveness.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.

Drawings

For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a flow chart of image segmentation for long and wide webpage images;

FIG. 2 is a flow chart of Canny edge detection and binarization;

FIG. 3 is an example of hysteresis-threshold binarization of a boundary;

FIG. 4 is a flow chart of contour extraction and correction.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.

The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.

The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.

The invention is implemented as follows: the image boundary is automatically filled by extrapolation and the image is converted to grayscale; after filtering, gradient edge extraction and hysteresis-threshold binarization are performed based on the Canny operator; morphological processing of closing, erosion and dilation is applied; finally, contour extraction and correction are performed, the contours are filtered and deduplicated according to a certain size rule, and the segmented images are output.

The method mainly comprises the following steps:

1. Read and save the long and wide webpage image.

2. Fill the border of the original image. A border value parameter is given and used as a constant pixel value to automatically fill the image boundary by extrapolation, so that pictures close to the edge of the webpage can be identified in the subsequent contour detection.

3. Convert the multi-channel picture to grayscale: convert the picture from the BGR three-color space to gray space, and output the converted single-channel picture.

4. Perform gradient edge extraction and hysteresis-threshold binarization on the grayscale image based on the Canny operator. First, filter the noise in the image with a 5x5 Gaussian filter, and compute the first derivatives in the horizontal and vertical directions with the Sobel operator. Second, apply non-maximum suppression to the derivative values of the image: at each pixel, check whether the value is a local maximum within its neighborhood along the gradient direction, and set it to zero otherwise, thereby obtaining the image gradient values based on the Canny operator. Finally, binarize with the hysteresis threshold: given the upper limit maxVal and the lower limit minVal of the hysteresis threshold, edges are determined as follows: pixels with gradient > maxVal are definite edges and are retained; pixels with gradient < minVal are definite non-edges and are discarded; pixels between the two are retained if they are connected to a definite edge and discarded otherwise.

5. Perform morphological processing: establish an elliptical kernel and apply closing, erosion and dilation so that the image boundaries become clear.

6. Perform the first contour generation, establishing contours in a hierarchical tree structure; compress the contour information in the horizontal, vertical and diagonal directions, keeping only the end-point coordinates in each direction, so that a rectangular contour is stored with only 4 points; according to the first contour information, suppress contour regions of a specific size in the binary image, that is, set to zero the repeatedly covered regions of the specific contours in the binary image; generate the contours again; and eliminate contours of unconventional size, such as those whose height or width is too large or too small, to complete the contour correction.

7. Solve the inscribed matrix of each contour, determine the range of the contour's inscribed matrix in the original image, and save and output the picture of each contour region to complete the image segmentation.
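The morphological closing in step 5 can be sketched on a small binary grid. This is an illustrative pure-Python version under stated assumptions: the 3x3 kernel size, the way the elliptical kernel is rasterized, the border handling and the function names (`elliptical_kernel`, `dilate`, `erode`, `close_binary`) are not specified in the source.

```python
def elliptical_kernel(width, height):
    """Offsets (dy, dx) lying inside an ellipse inscribed in a width x height box."""
    ry, rx = height // 2, width // 2
    return [
        (dy, dx)
        for dy in range(-ry, ry + 1)
        for dx in range(-rx, rx + 1)
        if (dx / max(rx, 1)) ** 2 + (dy / max(ry, 1)) ** 2 <= 1.0
    ]


def dilate(img, kernel):
    """A pixel becomes 1 if any in-bounds pixel under the kernel is 1."""
    rows, cols = len(img), len(img[0])
    return [
        [int(any(img[r + dy][c + dx]
                 for dy, dx in kernel
                 if 0 <= r + dy < rows and 0 <= c + dx < cols))
         for c in range(cols)]
        for r in range(rows)
    ]


def erode(img, kernel):
    """A pixel stays 1 only if every in-bounds pixel under the kernel is 1."""
    rows, cols = len(img), len(img[0])
    return [
        [int(all(img[r + dy][c + dx]
                 for dy, dx in kernel
                 if 0 <= r + dy < rows and 0 <= c + dx < cols))
         for c in range(cols)]
        for r in range(rows)
    ]


def close_binary(img, kernel):
    """Closing = dilation followed by erosion; fills small holes and gaps."""
    return erode(dilate(img, kernel), kernel)


# A 3x3 block of edge pixels with a one-pixel hole in its centre.
ring = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        ring[r][c] = 1
ring[3][3] = 0

kernel = elliptical_kernel(3, 3)
closed = close_binary(ring, kernel)  # the hole at (3, 3) is filled
```

Closing small holes and breaks in the edge map in this way helps the later contour step see one connected region per picture rather than several fragments.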

The implementation process of the image segmentation method suitable for long and wide webpage images is shown in FIG. 1: the long and wide webpage image is read and saved, border-filled and converted to grayscale, then processed by Canny edge detection and binarization, morphological processing, and contour extraction and correction, after which the image segmentation is complete and the images are output.

The specific process mainly comprises the following steps:

(1) Read and save the original image.

(2) Fill the border of the original image, and output a border-filled image with a white border of a given pixel width.

(3) Convert the original image to grayscale: convert the picture from the BGR three-color space to gray space, and output a single-channel grayscale image.

(4) Perform Canny-based edge gradient detection and hysteresis-threshold binarization on the grayscale image, as shown in FIG. 2; the binarization of a gradient boundary is illustrated in FIG. 3. Finally, output a gradient boundary image.

(5) Perform morphological processing on the gradient boundary image to make the boundaries more distinct, and output the morphologically processed image.

(6) Perform contour extraction and correction on the morphologically processed image, as shown in FIG. 4. First, generate the contours for the first time, draw them on the border-filled image, and output the first contour image; compress and extract the contour information, suppress contours according to a specific contour-size rule, and output the suppressed binary image. Second, generate contours a second time on the suppressed and corrected binary image, draw them on the border-filled image, and output the post-suppression contour image. Finally, eliminate the non-standard contours and retain the contour information of the specific size.

According to the contour information of the specific size, segment the border-filled image along the contours and output the set of sub-images, completing the image segmentation of the long and wide webpage image.
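The final size filtering, deduplication and cropping can be sketched as below. Several simplifications are assumed and none of them is taken from the source: each contour is represented by a rectangle `(x, y, w, h)`, the size limits are made-up values, and deduplication is done by a simple containment check, which stands in for (and is not the same as) the suppress-and-regenerate procedure described above.

```python
def filter_boxes(boxes, min_w, min_h, max_w, max_h):
    """Keep rectangles whose width and height fall inside the allowed range."""
    return [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if min_w <= w <= max_w and min_h <= h <= max_h
    ]


def contains(outer, inner):
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def deduplicate(boxes):
    """Drop exact duplicates and rectangles fully covered by a larger kept one."""
    ordered = sorted(set(boxes), key=lambda b: (-(b[2] * b[3]), b))
    kept = []
    for box in ordered:
        if not any(contains(k, box) for k in kept):
            kept.append(box)
    return kept


def crop(image, box):
    """Cut the rectangle (x, y, w, h) out of a 2-D grid of pixels."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


# A 6-wide, 4-high stand-in "page" whose pixel value encodes its position.
image = [[r * 10 + c for c in range(6)] for r in range(4)]

candidates = [
    (0, 0, 3, 2),  # target picture
    (0, 0, 3, 2),  # exact duplicate
    (1, 0, 2, 1),  # nested inside the target
    (0, 0, 6, 4),  # too large: the whole page
    (4, 2, 1, 1),  # too small: e.g. a text fragment
]

sized = filter_boxes(candidates, min_w=2, min_h=1, max_w=4, max_h=3)
kept = deduplicate(sized)
crops = [crop(image, box) for box in kept]
```

Filtering by size removes the page-level contour and tiny text-like contours, while the containment check keeps only one rectangle per picture, so each output sub-image appears once.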

Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims

1. An image segmentation method suitable for long and wide webpage images, characterized in that the method comprises the following steps: automatically filling the image boundary by extrapolation; converting the image to grayscale; after filtering, performing gradient edge extraction and hysteresis-threshold binarization based on the Canny operator; performing morphological processing of closing, erosion and dilation; performing contour extraction and correction; filtering and deduplicating according to a certain size rule; and outputting the segmented images;

The specific steps are:

(1) Read and save the long and wide webpage image;

(2) Fill the border of the original image: a border value parameter is given and used as a constant pixel value to automatically fill the image boundary by extrapolation, so that pictures close to the edge of the webpage can be identified in the subsequent contour detection;

(3) Convert the multi-channel picture to grayscale: convert the picture from the BGR three-color space to gray space, and output the converted single-channel picture;

(4) Perform gradient edge extraction and hysteresis-threshold binarization on the grayscale image based on the Canny operator;

(5) Perform morphological processing: establish an elliptical kernel and apply closing, erosion and dilation so that the image boundaries become clear;

(6) Perform the first contour generation, establishing contours in a hierarchical tree structure; compress the contour information in the horizontal, vertical and diagonal directions, keeping only the end-point coordinates in each direction, so that a rectangular contour is stored with only 4 points; according to the first contour information, suppress contour regions of a specific size in the binary image, that is, set to zero the repeatedly covered regions of the specific contours in the binary image; generate the contours again; and eliminate contours of unconventional size, such as those whose height or width is too large or too small, to complete the contour correction;

(7) Solve the inscribed matrix of each contour, determine the range of the contour's inscribed matrix in the original image, and save and output the picture of each contour region to complete the image segmentation.

2. The image segmentation method suitable for long and wide webpage images according to claim 1, characterized in that step (4) is specifically:

First, filter the noise in the image with a 5x5 Gaussian filter, and compute the first derivatives in the horizontal and vertical directions with the Sobel operator; second, apply non-maximum suppression to the derivative values of the image: at each pixel, check whether the value is a local maximum within its neighborhood along the gradient direction, and set it to zero otherwise, thereby obtaining the image gradient values based on the Canny operator; finally, binarize with the hysteresis threshold: given the upper limit maxVal and the lower limit minVal of the hysteresis threshold, edges are determined by the parameters maxVal and minVal: pixels with gradient > maxVal are definite edges and are retained; pixels with gradient < minVal are definite non-edges and are discarded; pixels between the two are retained if connected and discarded if not connected.