CN120219572A - A method for removing borders from electrical drawings based on direction search

Info

Publication number: CN120219572A
Application number: CN202510286674.0A
Authority: CN
Inventors: 陈中; 曹卫国; 钱晶
Original assignee: Nanjing Yunjie Power Technology Co ltd
Current assignee: Nanjing Yunjie Power Technology Co ltd
Priority date: 2025-03-12
Filing date: 2025-03-12
Publication date: 2025-06-27

Abstract

The invention discloses a method for removing the frame of an electrical drawing based on direction search and relates to the technical field of image processing. The method comprises the following steps: judging the type of the input drawing; when the input drawing is an electronic drawing, entering the electronic drawing processing flow, in which it is first judged whether the drawing has an outer frame and, if so, the center line of the frame is removed to separate the frame from the drawing content; connected domains are then extracted from the drawing with the frame center line removed, the inner contour boundary of the frame's connected domain is found by searching from the inside outward, and the effective area is cut out; when the input drawing is a scanned paper drawing, the drawing is first dilated to close the gaps in the frame, the ROI is then located and cut out on the binary image, and the effective area is obtained. The method accurately extracts the effective area of an electrical drawing, thereby improving the accuracy and efficiency of electrical drawing digitization.

Description

Electrical drawing frame removing method based on direction search

Technical Field

The invention relates to the technical field of image processing methods, and in particular to an electrical drawing frame removing method based on direction search.

Background

In application scenarios such as electrical drawing digitization, data analysis and information extraction, the position information of the primitives and characters in a drawing is usually extracted and the connection relations between primitives are searched. The frame of the drawing and the characters inside the frame interfere with the extraction of this information and of the logical relations, so the frame needs to be removed. The current state of the art in electrical drawing frame removal is as follows:

1) Methods based on edge detection (such as Laplacian edge detection) and Hough line detection search for the inner and outer contours of the drawing frame and remove the frame using the contour information. However, such methods have difficulty handling a frame that is not a closed rectangle, and they are not applicable to scanned paper drawings that contain heavy noise or are tilted.

2) Effective-area identification methods based on deep learning use models such as Faster R-CNN, YOLO and Mask R-CNN to detect the effective area of the drawing (that is, the area left after the frame is removed) and then crop the image according to the coordinates of the detection box. However, such methods require a large amount of annotated data for training, place high demands on hardware, and involve complex models that are difficult to deploy efficiently in resource-constrained environments. In addition, different types of electrical drawings differ considerably in style, which limits the generalization capability of the model.

Disclosure of Invention

The invention aims to provide an electrical drawing frame removing method that accurately extracts the effective area of an electrical drawing, thereby improving the accuracy and efficiency of electrical drawing digitization.

To solve the above technical problem, the invention adopts the following technical scheme: an electrical drawing frame removing method based on direction search, comprising the following steps:

S1, judging the type of the input drawing, namely whether it is an electronic drawing or a scanned paper drawing;

S2, when the input drawing is judged to be an electronic drawing, entering the electronic drawing processing flow: first judging whether the drawing has an outer frame; if so, proceeding to step S3; if not, returning the original drawing directly and ending the flow;

S3, removing the center line of the frame and separating the frame from the drawing content;

S4, extracting connected domains from the drawing with the frame center line removed, sorting them from large to small, finding the connected domain where the frame is located by Hough line detection, finding the inner contour boundary of that connected domain by searching from the inside outward, and cutting out the effective area;

S5, when the input drawing is judged to be a scanned paper drawing, first judging whether the scanned drawing has an outer frame; if so, performing a dilation operation on the scanned drawing to close the gaps in the frame;

S6, extracting connected domains from the scanned paper drawing and traversing them from large to small; on each connected domain image, searching for straight-line projections from the outside inward with the direction search method and calculating the area ratio of the outer frame to its circumscribed rectangle to determine the connected domain where the frame is located; on that connected domain, searching for the ROI from the inside outward, cutting out the ROI on the binary image, and obtaining the effective area.

The beneficial effects of this technical scheme are as follows. After the type of the electrical drawing is judged, the drawing is first preprocessed: a direction search from the outside inward removes the redundant white background outside the frame, and adhesion between the drawing content and the frame is handled. Connected domains are then extracted from the image and examined in descending order of pixel count; the direction search method identifies the connected domain where the frame is located, and that connected domain is searched outward from its centroid to find the coordinates of the inflection points of the frame's inner contour. The image is then cropped at those coordinates using the OpenCV library. The frame of the electrical drawing is thus removed quickly, which improves the accuracy and efficiency of data extraction and analysis. Because the method uses direction search to find the specific coordinates of the frame, it can handle unclosed frames, partially missing frames, tilted drawings and similar problems, and it is applicable to both electronic drawings and scanned paper drawings. Preprocessing the drawing before searching for the frame solves the problem of the frame center line adhering to the drawing content in electronic drawings and improves the robustness of frame removal.

Drawings

The invention will be described in further detail with reference to the drawings and the detailed description.

FIG. 1 is a flow chart of a method according to an embodiment of the invention;

FIG. 2 is an original drawing of an electronic drawing (adhesion exists between the bottom center line and the drawing content) in an embodiment of the invention;

FIG. 3 is an electronic drawing of an embodiment of the present invention with adhesion portions removed;

FIG. 4 is a connected domain of a frame extracted in an embodiment of the present invention;

FIG. 5 is a diagram of the straight line detection result for the connected domain where the frame is located in an embodiment of the invention;

FIG. 6 is a diagram of the result of removing the frame from the electronic drawing in an embodiment of the invention;

FIG. 7 is an original scanned paper drawing (with a broken frame and a tilted drawing) in an embodiment of the invention;

FIG. 8 is a diagram of the connected domain of the frame of the scanned paper drawing in an embodiment of the invention;

FIG. 9 is a diagram of the result of removing the frame from the scanned paper drawing in an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.

FIG. 1 is a flow chart of the invention. Electronic drawings and scanned paper drawings differ considerably: scanned paper drawings suffer from scanning tilt, creases, burrs, insufficient scanning resolution and similar problems. The invention therefore adopts two processing routes. The drawing type can be selected manually or determined by a classification model, and once the type is known the drawing is processed according to the corresponding flow.

Accordingly, as shown in FIG. 1, an embodiment of the invention discloses an electrical drawing frame removing method based on direction search, which comprises the following steps.

Step S1: judge the type of the input drawing. Electronic drawings are relatively regular and have little noise interference, whereas scanned paper drawings may have creases, tilt and similar problems, so the two types of drawing correspond to different processing flows.

Step S2: if the input drawing is an electronic drawing, the electronic drawing processing flow is entered. It is first judged whether the drawing (shown in FIG. 2) has an outer frame; if it does, processing proceeds to step S3; if it does not, the original drawing is returned directly and the flow ends. FIG. 2 is an electronic drawing with sensitive information removed, in which the center line of the frame at the bottom of the drawing adheres to the drawing content. The specific method for judging whether an outer frame exists is as follows:

(1) Expand the image boundary by adding a white border around the image, so that contours lying close to the image edge do not affect detection.

(2) Convert the original image into a grayscale image and binarize it.

(3) Extract all closed outer contours in the binary image based on the Canny edge detection algorithm and calculate their areas. Because a closed outer contour may be an irregular polygon, the area of the planar polygon is calculated from the coordinates of its vertices, using the following formula:

$$A = \frac{1}{2}\left|\sum_{i=1}^{n}\left(x_i y_{i+1} - x_{i+1} y_i\right)\right|, \qquad (x_{n+1}, y_{n+1}) = (x_1, y_1)$$

where A represents the area of the polygon, n represents the number of vertices of the polygon, and (x_i, y_i) represents the two-dimensional coordinates of the i-th vertex, the vertices being ordered clockwise. The contour with the largest area and the coordinates of its circumscribed rectangle are saved.

(4) Scan the rows and columns of the binarized image from the outside inward in the four directions up, down, left and right to find the black-pixel boundary that meets the requirement, and define the minimum circumscribed rectangle containing these pixels:

where x_column and y_row denote the number of black pixels in a column vector and in a row vector respectively, h is the height of the image (total number of rows), w is the width of the image (total number of columns), and 1[condition] is the indicator function, equal to 1 if the condition is satisfied and 0 otherwise.

(5) Calculate the ratio of the area of the circumscribed rectangle of the largest contour to the area of the black-pixel boundary rectangle, and use this ratio to judge whether the input drawing has a frame.
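Sub-steps (3) to (5) above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the image is a list of rows with black = 0 and white = 255, the function names are hypothetical, and both `min_count` (how many black pixels make a row or column count as a boundary) and `ratio_threshold` are assumed parameters whose values the text does not specify. The row and column counts follow the verbal definition of y_row and x_column, since the original formula is not reproduced in this text.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its ordered (x, y) vertices (shoelace formula)."""
    n = len(vertices)
    total = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return abs(total) / 2  # abs() makes the result independent of orientation


def black_pixel_bbox(binary, min_count=1):
    """Return (top, bottom, left, right) of the outermost rows and columns
    that contain at least min_count black pixels, or None if there are none.
    Taking the first and last qualifying index is equivalent to scanning
    from the outside inward from each of the four sides."""
    h, w = len(binary), len(binary[0])
    y_row = [sum(1 for v in row if v == 0) for row in binary]
    x_column = [sum(1 for r in range(h) if binary[r][c] == 0) for c in range(w)]
    rows = [r for r in range(h) if y_row[r] >= min_count]
    cols = [c for c in range(w) if x_column[c] >= min_count]
    if not rows or not cols:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def rect_area(rect):
    """Pixel area of an inclusive (top, bottom, left, right) rectangle."""
    top, bottom, left, right = rect
    return (bottom - top + 1) * (right - left + 1)


def has_outer_frame(contour_rect, black_rect, ratio_threshold=0.9):
    """Assumed decision rule: the circumscribed rectangle of the largest
    contour must cover most of the black-pixel boundary rectangle."""
    return rect_area(contour_rect) / rect_area(black_rect) >= ratio_threshold
```

If the largest contour is the frame itself, its circumscribed rectangle nearly coincides with the black-pixel boundary rectangle and the ratio is close to 1; if the drawing has no frame, the largest contour is typically a single primitive and the ratio is small.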

Step S3: remove the center line of the frame. Owing to the morphology of the frame, its inner edge is generally not flat: the center line of the frame protrudes inward and may adhere to the drawing content. In FIG. 2 the center line of the frame at the bottom of the drawing adheres to the drawing content; if connected domains were extracted directly, the adhering drawing content would be extracted together with the frame, so the two must be separated first. FIG. 3 shows the electronic drawing after the frame center line has been removed by the separation algorithm of the invention. The specific method is as follows:

(1) Step S2 has ensured that the drawing is located within the image. The boundary coordinates of the frame are searched from the outside inward, from which the approximate coordinates of the center line are calculated, and a range rectangle is drawn according to these coordinates. The number of black pixels in each row or in each column of the range rectangle is counted. Because the adhesion area and the non-adhesion area differ in pixel count within the range rectangle, the rows or columns are divided into two classes (adhesion and non-adhesion) by K-Means clustering with k = 2, that is, with 2 centroids. The distance d between each data point and each centroid is calculated as:

$$d = \sqrt{\sum_{i=1}^{n}\left(x_i - c_i\right)^2}$$

where x_i is the i-th feature of the data point, c_i is the i-th feature of the centroid, and n is the number of features. In the invention n = 1, because there is only one feature, namely the number of black pixels per row or column. Each data point is assigned to the cluster whose centroid is closest. After all data points have been assigned to clusters, the centroid of each cluster is recalculated: assuming the k-th cluster contains n_k data points x_i and has centroid c_k = (c_k1), the updated centroid c_k1 is:

$$c_{k1} = \frac{1}{n_k}\sum_{i=1}^{n_k} x_i$$

By iterating these steps, the range rectangle is divided into two classes (adhesion and non-adhesion), and the adhesion area is set to the background color of the drawing. FIG. 4 shows the image of the connected domain where the frame is located; the position of the frame in the connected domain image corresponds to its position in the original image. To remove the frame, its coordinates must be obtained accurately: first the connected domain where the frame is located is identified among all extracted connected domains, and then the coordinates of the frame are found on that connected domain.
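The clustering in step S3 can be illustrated with a one-feature K-Means (k = 2) over the black-pixel counts. The sketch below is an assumption-laden example rather than the patented code: the function names are hypothetical, the centroids are initialised at the minimum and maximum count (the text does not state the initialisation), and which of the two clusters corresponds to the adhesion area is left to the caller because the text does not say.

```python
def kmeans_1d_two_clusters(values, max_iter=100):
    """Split scalar values into two clusters with K-Means (k=2, one feature).
    Returns (labels, centroids); label 1 is the cluster that started from
    the maximum value."""
    centroids = [float(min(values)), float(max(values))]  # assumed initialisation
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: with a single feature the Euclidean distance is |x - c|.
        new_labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, new_labels) if lab == k]
            new_centroids.append(sum(members) / len(members) if members else centroids[k])
        if new_labels == labels and new_centroids == centroids:
            break  # converged: neither labels nor centroids changed
        labels, centroids = new_labels, new_centroids
    return labels, centroids


def erase_rows(binary, rows, left, right, background=255):
    """Paint columns left..right (inclusive) of the given rows with the background color."""
    for r in rows:
        for c in range(left, right + 1):
            binary[r][c] = background
```

For example, with per-row counts `[2, 2, 3, 2, 40, 42, 41, 2]` inside the range rectangle, the rows with counts 40, 42 and 41 fall into one cluster and the remaining rows into the other; the rows of whichever cluster represents adhesion can then be passed to `erase_rows`.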

Step S4: extract connected domains from the drawing with the frame center line removed, sort them from large to small, and find the connected domain where the frame is located by Hough line detection. FIG. 5 shows the result of applying Hough line detection to the extracted connected domain images to find the one containing the frame. Electronic drawings are relatively neat and free of tilt, burrs, creases and resolution problems, and the frame has certain morphological characteristics; if the detected straight lines match these characteristics, the connected domain is considered to contain the frame. The inner contour boundary of that connected domain is then found by searching from the inside outward, and the effective area is cut out. The method is as follows:

(1) Pixels that have the same pixel value and are 8-connected to one another are grouped into one set. A pixel p(x, y) is 8-connected to any of its 8 surrounding pixels p(x-1, y), p(x+1, y), p(x, y-1), p(x, y+1), p(x-1, y-1), p(x-1, y+1), p(x+1, y-1) and p(x+1, y+1) that has the same value. The image is scanned from its top-left corner, left to right and top to bottom. When an unlabeled foreground pixel is encountered, it is assigned a new label (an integer) that marks the start of a new connected domain. A depth-first search (DFS) is then used to find all other foreground pixels connected to this pixel: the current pixel is pushed onto a stack; the pixel at the top of the stack is then popped, and each of its 8-connected neighbors is checked to see whether it is an unlabeled foreground pixel. If so, the neighbor is given the label of the current connected domain and pushed onto the stack. This process is repeated until the stack is empty, at which point one connected domain has been fully labeled.

(2) The redundant white background of the electronic drawing has already been removed in step S2. Because electronic drawings are relatively regular, the invention applies Hough line detection to the extracted connected domains in order from large to small. When a connected domain is found to contain a horizontal or vertical line whose length reaches a certain threshold multiple of the length or width of the whole image, it is considered to be the connected domain where the frame is located, as shown in FIG. 5. After this connected domain is determined, white pixels are searched for from the midpoint of the connected domain toward the periphery; when white pixels are found, the current coordinates are recorded, and the image is cropped according to these coordinates to obtain the final result for the electronic drawing. FIG. 6 shows this final result; compared with the original drawing in FIG. 2, the frame has been removed.
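The labeling in sub-step (1) and the inside-out search in sub-step (2) can be sketched as follows. The sketch makes several assumptions that are not in the text: black (0) is the foreground, a connected domain is represented as a list of (row, column) pixels, the search starts from the image center rather than from a separately computed midpoint, and the Hough-based selection of the frame's connected domain is not reproduced (the caller is assumed to have chosen the component already). All function names are hypothetical.

```python
def label_components(binary, foreground=0):
    """Group 8-connected foreground pixels using an explicit DFS stack.
    Returns a list of components, each a list of (row, col), largest first."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]  # 0 means "not labeled yet"
    components = []
    for r in range(h):            # top to bottom
        for c in range(w):        # left to right
            if binary[r][c] != foreground or labels[r][c]:
                continue
            label = len(components) + 1   # new integer label
            labels[r][c] = label
            stack, pixels = [(r, c)], []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == foreground
                                and not labels[ny][nx]):
                            labels[ny][nx] = label
                            stack.append((ny, nx))
            components.append(pixels)
    components.sort(key=len, reverse=True)
    return components


def inner_boundary(component, h, w):
    """Walk from the image center up, down, left and right until a pixel of
    the component is hit; return the inclusive crop box
    (top, bottom, left, right) lying just inside those hits.
    If a direction never hits the component, the image edge is used."""
    mask = set(component)
    cy, cx = h // 2, w // 2
    top = next((y for y in range(cy, -1, -1) if (y, cx) in mask), -1)
    bottom = next((y for y in range(cy, h) if (y, cx) in mask), h)
    left = next((x for x in range(cx, -1, -1) if (cy, x) in mask), -1)
    right = next((x for x in range(cx, w) if (cy, x) in mask), w)
    return top + 1, bottom - 1, left + 1, right - 1
```

Because the search runs only over the pixels of the frame's own connected domain, drawing content lying between the center and the frame does not stop the walk early.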

Step S5: if a scanned paper drawing is input, as shown in FIG. 7, the frame of the scanned drawing may be broken because of creases, scanning conditions and the like, so after the outer frame is detected the drawing is first dilated to close the gaps in the frame. In terms of flow, the remaining steps are the same as for the electronic drawing (extracting connected domains, finding the connected domain image where the frame is located, locating the frame coordinates on that image, and cropping the effective area from the original image), but the methods used within these steps differ. FIG. 7 is an original scanned paper drawing with sensitive information removed; compared with an electronic drawing, it exhibits tilt, a broken frame, burrs and creases, so it is processed by the frame removing flow for scanned paper drawings.
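The dilation used to close gaps in a broken frame can be illustrated with a plain 3x3 binary dilation. This is a sketch under assumptions: black (0) is the foreground to be grown, and the 3x3 structuring element and the `iterations` parameter are illustrative choices, since the text does not specify the kernel or the number of passes.

```python
def dilate(binary, iterations=1, foreground=0):
    """Grow foreground pixels by one pixel in all 8 directions per iteration
    (3x3 binary dilation). Returns a new image; the input is not modified."""
    h, w = len(binary), len(binary[0])
    current = [row[:] for row in binary]
    for _ in range(iterations):
        grown = [row[:] for row in current]
        for y in range(h):
            for x in range(w):
                if current[y][x] != foreground:
                    continue
                # Paint the 3x3 neighborhood of every foreground pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            grown[ny][nx] = foreground
        current = grown
    return current
```

A gap of up to two pixels in a frame line is bridged by a single iteration, because each side of the gap grows by one pixel. In a library such as OpenCV the same effect would typically be obtained with `cv2.dilate` on an inverted image, since that function grows bright regions.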

Step S6: extract the connected domains of the scanned paper drawing and sort them from large to small. Because a scanned paper drawing may be tilted, the method used for electronic drawings to find the connected domain where the frame is located is not applicable. Instead, each connected domain is searched from the outside inward with the direction search method. When white pixels are found, the current coordinates are recorded and a projection range rectangle is established whose extent is the picture width (y axis) and height (x axis); the white pixels within this range are projected onto the x axis or the y axis and the projection length is calculated. If the projection length reaches a certain proportion of the length or width of the whole drawing, the connected domain is considered to be the one where the frame is located. FIG. 8 shows the connected domain of the frame obtained by the scanned-paper frame removing flow. The effective area is then searched for from the inside outward in the same way as for the electronic drawing: the connected domain where the frame is located is searched outward from its centroid with the direction search method to find the coordinates of each inflection point of the frame's inner contour, and the image is cropped at these coordinates using the OpenCV library. FIG. 9 shows the scanned paper drawing after the frame has been removed.

The method can remove the frames of both electronic drawings and scanned paper drawings, and it also handles unclosed frames, partially missing frames, tilted drawings, and adhesion between part of the drawing content and the frame. The effective area of the electrical drawing is thus extracted accurately, improving the accuracy and efficiency of electrical drawing digitization.
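The projection test of step S6 can be sketched as follows. A slightly tilted frame line spreads over several rows, but its projection onto the x axis still covers almost the full width, which is why projection tolerates tilt. The sketch is one possible reading of the text: the band thickness (`band_ratio`), the proportion threshold (`min_ratio`) and the requirement that at least `min_sides` sides pass are all assumed values, and the function names are hypothetical. A connected domain is again a list of (row, column) pixels.

```python
def side_projection_ratios(component, h, w, band_ratio=0.1):
    """For each side, take the component pixels inside a thin band starting at
    the outermost component pixel on that side, project them onto the axis
    parallel to the side, and return the covered fraction of the image
    width (top/bottom) or height (left/right)."""
    ys = [y for y, _ in component]
    xs = [x for _, x in component]
    band_h = max(1, int(h * band_ratio))
    band_w = max(1, int(w * band_ratio))
    top, bottom, left, right = min(ys), max(ys), min(xs), max(xs)

    def covered(positions, total):
        # Projection length = number of distinct positions hit on the axis.
        return len(set(positions)) / total

    return {
        "top": covered([x for y, x in component if y < top + band_h], w),
        "bottom": covered([x for y, x in component if y > bottom - band_h], w),
        "left": covered([y for y, x in component if x < left + band_w], h),
        "right": covered([y for y, x in component if x > right - band_w], h),
    }


def looks_like_frame(component, h, w, min_ratio=0.8, min_sides=2):
    """Assumed decision rule: enough sides must have a long projection."""
    ratios = side_projection_ratios(component, h, w)
    return sum(r >= min_ratio for r in ratios.values()) >= min_sides
```

Requiring only some of the four sides to pass keeps the test usable when part of the frame is missing, which is one of the cases the method is meant to handle.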

Claims

1. The electric drawing frame removing method based on the direction search is characterized by comprising the following steps of:

2. The method for removing frames from an electrical drawing based on a direction search according to claim 1, wherein the method for judging whether the drawing has an outer frame in S2 comprises the steps of:

1) Expanding the boundary of the image and adding a white frame for the image;

2) Converting the original image into a gray image and binarizing the gray image;

3) All the closed outer contours in the binary image are extracted based on a Canny edge detection algorithm and their areas are calculated, the area of the planar polygon being calculated from the coordinates of its vertices by the following formula:

$$A = \frac{1}{2}\left|\sum_{i=1}^{n}\left(x_i y_{i+1} - x_{i+1} y_i\right)\right|, \qquad (x_{n+1}, y_{n+1}) = (x_1, y_1)$$

wherein A represents the area of the polygon, n represents the number of vertices of the polygon, (x_i, y_i) represents the two-dimensional coordinates of the i-th vertex of the polygon, the vertices are ordered clockwise, and the contour with the largest area and the coordinates of its circumscribed rectangle are stored;

4) Performing row scanning and column scanning on the binarized image from the outside inward in four directions, finding the black-pixel boundary that meets the requirement, and defining the minimum circumscribed rectangle containing these pixels:

wherein x_column and y_row represent the number of black pixels in a column vector and in a row vector respectively, h is the height of the image, w is the width of the image, and 1[condition] is the indicator function, equal to 1 if the condition is satisfied and 0 otherwise;

5) Calculating the ratio of the area of the circumscribed rectangle of the largest contour to the area of the black-pixel boundary rectangle, and judging from this ratio whether the input drawing has a frame.

3. The method for removing frames from an electrical drawing based on direction search according to claim 1, wherein the specific method for removing the central axis of the frame in S3 comprises the following steps:

Step S2 having ensured that the drawing is located within the image, the boundary coordinates of the frame are searched from the outside inward, from which the approximate coordinates of the center line are calculated; a range rectangle is drawn according to these coordinates, and the number of black pixels in each row or each column of the range rectangle is counted; because the adhesion area and the non-adhesion area differ in pixel count within the range rectangle, the rows or columns are divided into two classes by K-Means clustering with 2 centroids, and the distance d between each data point and each centroid is calculated:

$$d = \sqrt{\sum_{i=1}^{n}\left(x_i - c_i\right)^2}$$

wherein x_i is the i-th feature of the data point, c_i is the i-th feature of the centroid, and n is the number of features;

After all data points have been assigned to clusters, the centroid of each cluster is recalculated: assuming the k-th cluster contains n_k data points x_i and has centroid c_k = (c_k1), the updated centroid c_k1 is:

$$c_{k1} = \frac{1}{n_k}\sum_{i=1}^{n_k} x_i$$

By continuously iterating the steps, the area in the range rectangle is divided into two types, namely an adhesion area and a non-adhesion area, and the adhesion area is set as the background color of the drawing.

4. The method for removing frames from an electrical drawing based on a direction search according to claim 1, wherein the method for finding an inner contour boundary for the connected domain in S4 by using a method for searching from inside to outside, and the method for clipping out an effective area comprises the following steps:

A pixel p(x, y) is 8-connected to any of its 8 surrounding pixels p(x-1, y), p(x+1, y), p(x, y-1), p(x, y+1), p(x-1, y-1), p(x-1, y+1), p(x+1, y-1) and p(x+1, y+1) that has the same value; the image is scanned from its top-left corner, from left to right and from top to bottom, and when an unlabeled foreground pixel is encountered it is assigned a new label, the label being used to identify the start of a new connected domain;

Detecting the extracted connected domain from large to small by using Hough straight line detection, when detecting that a certain connected domain simultaneously has a transverse line or a vertical line reaching a certain threshold multiple of the length or the width of the whole image, recognizing the connected domain as the connected domain where the frame is located, searching white pixels from the middle point to the periphery of the connected domain after determining the connected domain where the frame is located, recording the current coordinates after searching the white pixels, and cutting according to the coordinates to obtain the final result of the electronic drawing.

5. The electrical drawing frame removing method based on the direction search according to claim 1, wherein in the step S6:

Searching from outside to inside by using a direction searching method on the connected domain, when white pixels are searched, recording current coordinates, establishing a projection range rectangle with the length of the picture width and the height, projecting the white pixels in the range to an x axis or a y axis, calculating projection length, and if the projection length meets a certain proportion of the whole drawing length or width, considering the connected domain as the connected domain where the frame is located, and subsequently searching the effective area of the drawing from inside to outside by the same method.