Image segmentation method for long and wide webpage images
Technical Field
The invention belongs to the field of image processing and relates to an image segmentation method for long and wide webpage images.
Background
Image segmentation is an important branch of image preprocessing in the field of computer vision, and is also an indispensable preprocessing step in large-scale content-based image retrieval. The quality of the segmentation has a decisive influence on the results of engineering projects in the field of computer vision.
Existing image segmentation techniques fall mainly into three groups: traditional segmentation based on image attributes (such as gray threshold segmentation, edge gradient segmentation, and segmentation by a histogram averaging method); segmentation based on specific theories (such as feature clustering segmentation, graph theory-based segmentation, and wavelet transform segmentation); and, more recently, deep learning-based image semantic segmentation, which exploits the strong nonlinear capability of neural networks and has achieved good results in the computer vision field.
Although image segmentation algorithms have long been studied, the techniques above mostly operate at the semantic level on a single image of normal size. A long and wide webpage image is oversized and contains complex, dense layout blocks and text. For such images, traditional segmentation methods cannot achieve an efficient, fast, and good segmentation result, and may produce duplicated images and missed images. Deep learning-based semantic segmentation algorithms are considerably more complex, which increases the time consumed by the project pipeline, occupies memory and GPU resources, and violates the three principles of image preprocessing: simplicity, speed, and effectiveness. In addition, existing deep learning-based semantic segmentation algorithms cannot at present meet the requirements of segmenting images at the scale of a long and wide webpage image, and their segmentation quality and robustness are insufficient.
Disclosure of Invention
In view of the above, the present invention provides an image segmentation method for long and wide webpage images.
To achieve this purpose, the invention provides the following technical solution:
an image segmentation method for long and wide webpage images, comprising: automatically filling the image border by extrapolation; converting the image to grayscale; after filtering, performing gradient edge extraction and hysteresis-threshold binarization based on the Canny operator; applying the morphological operations of closing, erosion, and dilation; extracting and correcting contours; filtering and de-duplicating the contours according to given scale rules; and outputting the segmented images;
the method comprises the following specific steps:
(1) reading in the long and wide webpage image and saving a copy of it;
(2) filling a border around the original image: a border-value parameter is given and used as a constant pixel value to automatically fill the image border by extrapolation, so that pictures close to the edge of the webpage can be identified in subsequent contour detection;
(3) converting the multi-channel picture to grayscale: the picture is converted from the three-channel BGR color space to the gray space, and the converted single-channel picture is output;
(4) performing gradient edge extraction and hysteresis-threshold binarization on the grayscale image based on the Canny operator;
(5) performing morphological processing: an elliptical structuring element (kernel) is created, and closing, erosion, and dilation are applied to make the image boundaries clear;
(6) generating contours a first time: contours are built as a hierarchical tree structure, and the contour information is compressed in the horizontal, vertical, and diagonal directions so that only the end-point coordinates of each direction are kept, that is, only the 4 corner points of a rectangular contour are stored; according to the first contour information, suppressing contour regions of specific sizes in the binary image, that is, setting to zero the regions of the binary image that are repeatedly covered by those contours; generating the contours again; and eliminating contours of abnormal size, such as those whose height or width is too large or too small, to complete the contour correction;
(7) computing the inscribed rectangle of each contour, determining the extent of that rectangle in the original image, and saving and outputting the picture of each contour region to complete the image segmentation.
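By way of non-limiting illustration, steps (2) and (3) can be sketched in plain Python as follows. The function names `pad_constant` and `bgr_to_gray`, the tiny 2x2 test image, and the luma weights (the common ITU-R BT.601 coefficients) are assumptions made for this sketch; the specification does not state which weights or which library are used.

```python
def pad_constant(image, border, value):
    """Surround a row-major image with `border` pixels of the constant `value`."""
    width = len(image[0])
    blank_row = [value] * (width + 2 * border)
    padded = [list(blank_row) for _ in range(border)]
    for row in image:
        padded.append([value] * border + list(row) + [value] * border)
    padded.extend(list(blank_row) for _ in range(border))
    return padded


def bgr_to_gray(image):
    """Convert rows of (b, g, r) tuples into rows of single gray values."""
    # Assumed weights: ITU-R BT.601 luma coefficients (not given in the source).
    return [[round(0.114 * b + 0.587 * g + 0.299 * r) for (b, g, r) in row]
            for row in image]


WHITE = (255, 255, 255)
# Hypothetical 2x2 BGR image: red, green / blue, black.
page = [[(0, 0, 255), (0, 255, 0)],
        [(255, 0, 0), (0, 0, 0)]]

padded = pad_constant(page, border=1, value=WHITE)  # step (2): white border
gray = bgr_to_gray(padded)                          # step (3): single channel
```

The white border separates pictures that touch the page edge from the image boundary, so that their outer edge still produces a gradient in the later edge detection step.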
Optionally, step (4) specifically comprises:
first, filtering noise in the image with a 5x5 Gaussian filter, and computing the first derivatives in the horizontal and vertical directions with the Sobel operator; second, performing non-maximum suppression on the gradient of the image, checking at each pixel whether its gradient value is a local maximum within its neighborhood along the gradient direction and setting it to zero otherwise, thereby obtaining the image gradient values based on the Canny operator; and finally, performing hysteresis-threshold binarization, in which an upper limit maxVal and a lower limit minVal of the hysteresis threshold are given and edges are determined from them as follows: a pixel with gradient > maxVal is definitely an edge and is retained; a pixel with gradient < minVal is definitely not an edge and is discarded; a pixel whose gradient lies between the two is retained if it is connected to a definite edge and discarded otherwise.
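By way of non-limiting illustration, the hysteresis-threshold binarization described above can be sketched in plain Python as follows. The sketch assumes that the gradient magnitudes have already been computed and non-maximum suppressed, that connectivity means 8-connectivity, and that a pixel exactly equal to minVal or maxVal is treated as an in-between pixel; the function name `hysteresis_threshold` and the sample values are hypothetical.

```python
from collections import deque


def hysteresis_threshold(gradient, min_val, max_val):
    """Return a binary edge map (255 = edge, 0 = non-edge)."""
    rows, cols = len(gradient), len(gradient[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Pixels above max_val are definite edges and seed the search.
    for r in range(rows):
        for c in range(cols):
            if gradient[r][c] > max_val:
                edges[r][c] = 255
                queue.append((r, c))
    # Grow from definite edges into 8-connected pixels not below min_val.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and edges[nr][nc] == 0
                        and gradient[nr][nc] >= min_val):
                    edges[nr][nc] = 255
                    queue.append((nr, nc))
    return edges


# Hypothetical gradient magnitudes: one strong pixel (200) with weak
# neighbours (120) on the top row, and one isolated weak pixel below.
gradient = [
    [10, 120, 200, 120, 10],
    [10,  10,  10,  10, 10],
    [10, 120,  10,  10, 10],
]
edge_map = hysteresis_threshold(gradient, min_val=100, max_val=150)
```

In this example the two in-between pixels adjacent to the strong pixel are retained, while the in-between pixel on the bottom row is discarded because it is not connected to any definite edge.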
The beneficial effects of the invention are as follows:
for long and wide webpage images, which are characterized by dense layout and complex content, the method outputs standardized segmented sub-images with no duplicated or missing images, and filters out the text information and the non-target contours of abnormal size in the original webpage.
In terms of algorithmic complexity, compared with deep learning-based image semantic segmentation, the image segmentation technique provided by the invention requires no complex model pre-training and does not need to allocate massive system resources to the weight parameters of a convolutional network model (convolutional layers, pooling layers, and fully connected layers). It completes the segmentation of long and wide webpage images under limited CPU occupancy without calling GPU resources, in line with the three principles of image preprocessing: simplicity, speed, and effectiveness.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flowchart of image segmentation for a long and wide webpage image;
FIG. 2 is a flowchart of Canny edge detection and binarization;
FIG. 3 is an example of hysteresis-threshold binarization of boundaries;
FIG. 4 is a flowchart of contour extraction and correction.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for the purpose of illustrating the invention only and are not intended to limit it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and their descriptions may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, it should be understood that terms indicating an orientation or positional relationship, such as "upper", "lower", "left", "right", "front", and "rear", are based on the orientation or positional relationship shown in the drawings and are used only for convenience and simplification of description. They do not indicate or suggest that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation. The terms describing positional relationships in the drawings are therefore used for illustrative purposes only and are not to be construed as limiting the present invention, and their specific meaning may be understood by those skilled in the art according to the specific situation.
The invention is implemented as follows: the image border is automatically filled by extrapolation; the image is converted to grayscale; after filtering, gradient edge extraction and hysteresis-threshold binarization are performed based on the Canny operator; the morphological operations of closing, erosion, and dilation are applied; finally, contours are extracted and corrected, filtered and de-duplicated according to given scale rules, and the segmented images are output.
The method mainly comprises the following steps:
1. Read in the long and wide webpage image and save a copy of it.
2. Fill a border around the original image. A border-value parameter is given and used as a constant pixel value to automatically fill the image border by extrapolation, so that pictures close to the edge of the webpage can be identified in subsequent contour detection.
3. Convert the multi-channel picture to grayscale: the picture is converted from the three-channel BGR color space to the gray space, and the converted single-channel picture is output.
4. Perform gradient edge extraction and hysteresis-threshold binarization on the grayscale image based on the Canny operator. First, noise in the image is filtered with a 5x5 Gaussian filter, and the first derivatives in the horizontal and vertical directions are computed with the Sobel operator. Second, non-maximum suppression is performed on the gradient of the image: at each pixel, the gradient value is checked to see whether it is a local maximum within its neighborhood along the gradient direction, and is set to zero otherwise, which yields the image gradient values based on the Canny operator. Finally, hysteresis-threshold binarization is performed: an upper limit maxVal and a lower limit minVal of the hysteresis threshold are given, and edges are determined from them as follows: a pixel with gradient > maxVal is definitely an edge and is retained; a pixel with gradient < minVal is definitely not an edge and is discarded; a pixel whose gradient lies between the two is retained if it is connected to a definite edge and discarded otherwise.
5. Perform morphological processing: an elliptical structuring element (kernel) is created, and closing, erosion, and dilation are applied to make the image boundaries clear.
6. Generate contours a first time: contours are built as a hierarchical tree structure, and the contour information is compressed in the horizontal, vertical, and diagonal directions so that only the end-point coordinates of each direction are kept, that is, only the 4 corner points of a rectangular contour are stored. According to the first contour information, contour regions of specific sizes in the binary image are suppressed, that is, the regions of the binary image that are repeatedly covered by those contours are set to zero. The contours are then generated again. Contours of abnormal size, such as those whose height or width is too large or too small, are eliminated, completing the contour correction.
7. Compute the inscribed rectangle of each contour, determine the extent of that rectangle in the original image, and save and output the picture of each contour region to complete the image segmentation.
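By way of non-limiting illustration, the morphological processing of step 5 can be sketched in plain Python on a small binary image. The helper names (`elliptical_offsets`, `dilate`, `erode`, `close_image`), the kernel radii, and the treatment of pixels outside the image as background are assumptions of this sketch; the specification gives neither the kernel size nor the border handling.

```python
def elliptical_offsets(radius_x, radius_y):
    """Offsets (dy, dx) covered by an axis-aligned elliptical kernel."""
    return [(dy, dx)
            for dy in range(-radius_y, radius_y + 1)
            for dx in range(-radius_x, radius_x + 1)
            if (dx / radius_x) ** 2 + (dy / radius_y) ** 2 <= 1.0]


def dilate(image, offsets):
    """A pixel becomes 1 if any pixel under the kernel is 1."""
    rows, cols = len(image), len(image[0])
    return [[int(any(0 <= r + dy < rows and 0 <= c + dx < cols
                     and image[r + dy][c + dx]
                     for dy, dx in offsets))
             for c in range(cols)] for r in range(rows)]


def erode(image, offsets):
    """A pixel stays 1 only if every pixel under the kernel is 1."""
    # Assumption: pixels outside the image count as background (0).
    rows, cols = len(image), len(image[0])
    return [[int(all(0 <= r + dy < rows and 0 <= c + dx < cols
                     and image[r + dy][c + dx]
                     for dy, dx in offsets))
             for c in range(cols)] for r in range(rows)]


def close_image(image, offsets):
    """Closing: dilation followed by erosion with the same kernel."""
    return erode(dilate(image, offsets), offsets)


# Hypothetical edge map: a square outline with a one-pixel hole.
edge_map = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
kernel = elliptical_offsets(1, 1)  # radius 1 degenerates to a 5-pixel cross
closed = close_image(edge_map, kernel)
```

Closing fills the hole inside the outline without enlarging the outline itself, which is why it helps turn broken edge fragments into solid regions before contour generation.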
The implementation process of the image segmentation method for long and wide webpage images is shown in FIG. 1: the long and wide webpage image is read in and saved, border-filled and converted to grayscale, and then subjected to Canny edge detection and binarization, morphological processing, and contour extraction and correction, after which the image segmentation is complete and the images are output.
The specific process mainly comprises the following steps:
(1) The original image is read in and a copy of it is saved.
(2) A border is filled around the original image, and a white border-filled image with the given pixel width is output.
(3) The original image is converted to grayscale: the picture is converted from the three-channel BGR color space to the gray space and output as a single-channel grayscale image.
(4) Edge gradient detection and hysteresis-threshold binarization based on the Canny operator are performed on the grayscale image as shown in FIG. 2; the binarization of a gradient boundary is shown in FIG. 3; finally, a gradient boundary map is output.
(5) Morphological processing is performed on the gradient boundary map to make the boundaries more distinct, and the morphologically processed image is output.
(6) Contours are extracted from the morphologically processed image and corrected, as shown in FIG. 4: first, the contours are generated a first time, drawn on the border-filled image, and output as the first contour image; the contour information is then compressed and extracted, contours matching specific size rules are suppressed, and the suppressed binary image is output; second, contours are generated again from the suppressed and corrected binary image, drawn on the border-filled image, and output as the post-suppression contour image; finally, non-standard contours are eliminated and only the contour information of specific sizes is retained.
According to the retained contour information of specific sizes, the border-filled image is segmented along the contours and the set of sub-images is output, completing the image segmentation of the long and wide webpage image.
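By way of non-limiting illustration, the final filtering and cropping can be sketched in plain Python on rectangles given as (x, y, width, height). This sketch does not reproduce the suppress-and-regenerate step on the binary image: as a swapped-in simplification, rectangles lying inside another rectangle are simply dropped. All function names, size limits, and sample rectangles are hypothetical.

```python
def keep_plausible(rects, min_size, max_size):
    """Keep rectangles whose width and height lie within the given bounds."""
    (min_w, min_h), (max_w, max_h) = min_size, max_size
    return [r for r in rects
            if min_w <= r[2] <= max_w and min_h <= r[3] <= max_h]


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow and iy + ih <= oy + oh)


def drop_nested(rects):
    """Drop exact repeats and rectangles lying inside another rectangle."""
    unique = list(dict.fromkeys(rects))  # removes repeats, keeps order
    return [r for r in unique
            if not any(o != r and contains(o, r) for o in unique)]


def crop(image, rect):
    """Cut the rectangle out of a row-major image."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


# Hypothetical 8x6 image whose pixel value encodes its position (row*10 + col).
image = [[r * 10 + c for c in range(8)] for r in range(6)]
candidates = [
    (1, 1, 4, 3),  # a picture
    (1, 1, 4, 3),  # the same picture detected twice
    (2, 2, 2, 2),  # a region nested inside the first picture
    (0, 0, 8, 6),  # the whole page: too large
    (6, 4, 1, 1),  # a speck: too small
    (5, 4, 2, 2),  # a second picture
]
sized = keep_plausible(candidates, min_size=(2, 2), max_size=(6, 5))
kept = drop_nested(sized)
sub_images = [crop(image, rect) for rect in kept]
```

Size filtering is applied before the nesting check in this sketch because a page-sized rectangle would otherwise contain, and therefore eliminate, every real picture.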
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.