
CN119007002A - Building object extraction method, system, equipment and storage medium of remote sensing image - Google Patents


Info

Publication number
CN119007002A
Authority
CN
China
Prior art keywords
contour
vertex
remote sensing
section
sensing image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410992709.8A
Other languages
Chinese (zh)
Inventor
李�荣
陶留锋
陈溪
黄颖
陈波
黄胜辉
陈小佩
潘明敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUHAN ZONDY CYBER TECHNOLOGY CO LTD
Original Assignee
WUHAN ZONDY CYBER TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUHAN ZONDY CYBER TECHNOLOGY CO LTD
Priority to CN202410992709.8A
Publication of CN119007002A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/176 Urban or other man-made structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, system, device and storage medium for extracting building objects from a remote sensing image. The method comprises: inputting the remote sensing image into a target detection network for building detection to obtain multiple sections of initial contour; inputting the contour feature vector of each section of initial contour into a contour optimization network for vertex prediction to obtain multiple vertex offsets for each section of optimized contour; processing the remote sensing image feature map with a contour attention module to obtain a weight coefficient for each vertex; obtaining multiple vertex target coordinates for each section of optimized contour through a coordinate iteration formula from the original vertex coordinates, the vertex offsets and the vertex weight coefficients; and extracting the building objects from the remote sensing image based on the vertex target coordinates. The invention splits the initial contour into multiple sections according to the detection frame and introduces an edge attention module in the contour optimization stage to adjust the weight of each vertex of each section of initial contour during regression, so that building objects are extracted accurately from remote sensing images.

Description

Building object extraction method, system, equipment and storage medium of remote sensing image
Technical Field
The invention relates to the technical field of remote sensing image information extraction, in particular to a building object extraction method, a system, equipment and a storage medium of a remote sensing image.
Background
Buildings are the principal component of a city: they shape its structure and form and convey important cognitive information and structural knowledge. Accurate building extraction is therefore of great significance for optimizing urban structure and for urban planning.
Most current research treats building extraction as a semantic segmentation task. Semantic segmentation can achieve good accuracy in specific scenes, but its results cannot be applied directly to subsequent tasks. Some studies post-process the segmentation results to extract instance objects, but this approach depends on the accuracy of the segmentation results and may fail to separate every object in complex scenes. Methods that predict building vertices directly are faster than semantic segmentation, but the quality of the extracted buildings is poor, and despite later improvements to such methods the following problems remain: 1. manually designed initial contours generalize poorly and are inaccurate across different scenes and complex building conditions; 2. an inaccurate initial contour causes errors or self-intersections when the contour vertices move; 3. feature extraction for the building contour is limited, so the contour vertices do not move accurately enough and the final extracted contour has limited precision.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the invention is to provide a building object extraction method, system, equipment and storage medium for remote sensing images, so as to solve the technical problem of how to accurately extract building objects from a remote sensing image.
In order to achieve the above object, the present invention provides a building object extraction method for remote sensing images, the method comprising:
Inputting a remote sensing image into a Backbone module in a target detection network for feature extraction to obtain a remote sensing image feature map, and performing building detection on the remote sensing image feature map to obtain a multi-section initial contour of a building object;
Determining contour feature vectors and a plurality of original vertex coordinates corresponding to the initial contours of all the sections;
Inputting a plurality of contour feature vectors into a contour optimization network respectively for vertex prediction to obtain a plurality of vertex offsets corresponding to each section of optimized contour, wherein the contour optimization network is constructed from a Backbone module, a Fusion module and a Prediction module;
Processing the remote sensing image feature map through a contour attention module to obtain a contour boundary attention map, and performing bilinear interpolation on the contour boundary attention map according to a plurality of original vertex coordinates corresponding to each section of initial contour to obtain a plurality of vertex weight coefficients corresponding to each section of initial contour;
Obtaining a plurality of vertex target coordinates corresponding to each section of optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of initial contour, a plurality of vertex offsets corresponding to each section of optimized contour and a plurality of vertex weight coefficients corresponding to each section of initial contour;
And extracting the building object from the remote sensing image based on a plurality of vertex target coordinates corresponding to each section of optimized contour.
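The bilinear interpolation step, which reads a vertex weight coefficient from the contour boundary attention map at each original vertex coordinate, can be illustrated with a minimal sketch. It assumes the attention map is a single-channel 2D grid and that vertex coordinates are already expressed in the pixel units of that map; the function names `bilinear_sample` and `vertex_weights` are hypothetical and do not come from the patent.

```python
import math

def bilinear_sample(attention_map, x, y):
    """Sample a 2D map (list of rows) at the continuous position (x, y)."""
    height, width = len(attention_map), len(attention_map[0])
    # Clamp so that vertices on or beyond the border stay inside the map.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = attention_map[y0][x0] * (1 - fx) + attention_map[y0][x1] * fx
    bottom = attention_map[y1][x0] * (1 - fx) + attention_map[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def vertex_weights(attention_map, vertices):
    """Return one weight coefficient per (x, y) vertex."""
    return [bilinear_sample(attention_map, x, y) for x, y in vertices]

# Toy 2x2 attention map and three vertices of one contour section.
attention = [[0.0, 1.0],
             [1.0, 0.0]]
weights = vertex_weights(attention, [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0)])
```

In a trained system the attention map would come from the contour attention module; here it is a hand-written toy grid used only to show the sampling arithmetic.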
Optionally, the step of performing building detection on the remote sensing image feature map to obtain a multi-segment initial contour of the building object includes:
downsampling the remote sensing image feature map to obtain a predicted heat map, a predicted bias map and a predicted attribute map;
determining geometric center coordinates of the building object through a heatmap head module according to the predicted heat map;
Determining the center coordinates of the detection frame corresponding to the building object according to the geometric center coordinates;
Determining the center coordinate offset of the detection frame through an offset head module according to the predicted bias map;
Determining detection frame attribute information through a wh head module according to the predicted attribute map;
generating a detection frame of the building object according to the center coordinates of the detection frame, the center coordinate offset and the detection frame attribute information;
And obtaining a multi-section initial contour of the building object through vertex sampling based on the detection frame of the building object.
Optionally, the step of inputting the plurality of contour feature vectors into the contour optimization network to perform vertex prediction to obtain a plurality of vertex offsets corresponding to each section of optimized contour includes:
Respectively inputting a plurality of contour feature vectors into a Backbone module in the contour optimization network to obtain contour feature information of different levels corresponding to each section of initial contour;
processing the contour feature information of different levels corresponding to each section of initial contour by a Fusion module in the contour optimization network respectively to obtain pooled contour feature information for each section of initial contour;
And obtaining a plurality of vertex offsets corresponding to each section of optimized contour through a Prediction module in the contour optimization network according to the pooled contour feature information of each section of initial contour.
Optionally, the step of processing the remote sensing image feature map by the contour attention module to obtain a contour boundary attention map includes:
Performing stacked convolution processing on the remote sensing image feature map through the contour attention module to obtain a contour boundary convolution map;
processing the contour boundary convolution map through a Sigmoid activation layer to obtain a contour boundary prediction map;
Carrying out multi-layer convolution processing on the contour boundary prediction map to obtain a contour prediction convolution map;
And generating a contour boundary attention map through the Sigmoid activation layer according to the contour prediction convolution map.
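The order of operations in the contour attention module (stacked convolution, Sigmoid, multi-layer convolution, Sigmoid) can be sketched as follows. This is a toy illustration under strong assumptions: the feature map is reduced to a single channel, every convolution is a fixed 3×3 filter with zero padding, and the kernels are hand-picked placeholders, whereas a real module would use learned multi-channel weights. The names `conv3x3`, `sigmoid_map` and `contour_attention` are hypothetical.

```python
import math

def conv3x3(image, kernel, bias=0.0):
    """Single-channel 3x3 cross-correlation with zero padding."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = bias
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        acc += image[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc
    return out

def sigmoid_map(image):
    """Apply the Sigmoid activation to every element of a 2D map."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in image]

def contour_attention(feature_map, first_kernels, second_kernels):
    """Stacked conv -> Sigmoid -> multi-layer conv -> Sigmoid."""
    x = feature_map
    for kernel in first_kernels:          # contour boundary convolution map
        x = conv3x3(x, kernel)
    boundary_prediction = sigmoid_map(x)  # contour boundary prediction map
    y = boundary_prediction
    for kernel in second_kernels:         # contour prediction convolution map
        y = conv3x3(y, kernel)
    attention = sigmoid_map(y)            # contour boundary attention map
    return boundary_prediction, attention

# Placeholder kernels (assumptions): an edge filter, then two smoothing filters.
edge_kernel = [[-1.0, -1.0, -1.0], [-1.0, 8.0, -1.0], [-1.0, -1.0, -1.0]]
smooth_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
feature_map = [[0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0]]
boundary_prediction, attention = contour_attention(
    feature_map, [edge_kernel], [smooth_kernel, smooth_kernel])
```

With the placeholder edge filter, the corner of the toy square receives a higher boundary score than its interior, which is the qualitative behaviour an edge attention map is meant to have.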
Optionally, the step of obtaining a plurality of vertex target coordinates corresponding to each segment of the optimized contour according to a plurality of original vertex coordinates corresponding to each segment of the initial contour, a plurality of vertex offsets corresponding to each segment of the optimized contour, and a plurality of vertex weight coefficients corresponding to each segment of the initial contour through a coordinate iteration formula includes:
Obtaining a plurality of vertex target offsets corresponding to each section of optimized contour through a vertex offset formula according to a plurality of vertex offsets corresponding to each section of optimized contour and a plurality of vertex weight coefficients corresponding to each section of initial contour;
The vertex offset formula is:
$$(\Delta x'_k, \Delta y'_k) = \mathrm{Atten}(x_k, y_k) \cdot (\Delta x_k, \Delta y_k)$$
where $k$ is the number of iterations, $\mathrm{Atten}(x_k, y_k)$ is the $k$-th vertex weight coefficient, $(\Delta x_k, \Delta y_k)$ is the $k$-th vertex offset, and $(\Delta x'_k, \Delta y'_k)$ is the $k$-th vertex target offset;
obtaining a plurality of vertex target coordinates corresponding to each section of optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of initial contour and a plurality of vertex target offsets corresponding to each section of optimized contour;
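The coordinate iteration formula is:
$$(x_i^{k}, y_i^{k}) = (x_i^{k-1}, y_i^{k-1}) + (\Delta x'_{k-1}, \Delta y'_{k-1})$$
where $(\Delta x'_{k-1}, \Delta y'_{k-1})$ is the target vertex offset of point $i$ at iteration $k-1$, $(x_i^{k-1}, y_i^{k-1})$ are the original vertex coordinates of point $i$ at iteration $k-1$, and $(x_i^{k}, y_i^{k})$ are the vertex target coordinates of point $i$ at iteration $k$.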
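The vertex offset formula and the coordinate iteration can be illustrated with a short sketch: each predicted offset is scaled by its vertex weight coefficient and added to the current vertex coordinates, and the step is repeated for a number of iterations. The function names and the two callables `predict_offsets` and `sample_weights` are hypothetical stand-ins for the contour optimization network and for the bilinear sampling of the attention map; the iteration count of 3 is likewise an assumption.

```python
def apply_weighted_offsets(vertices, offsets, weights):
    """One update: new vertex = old vertex + weight * predicted offset."""
    return [(x + w * dx, y + w * dy)
            for (x, y), (dx, dy), w in zip(vertices, offsets, weights)]

def iterate_contour(vertices, predict_offsets, sample_weights, iterations=3):
    """Repeat the weighted update; both callables are hypothetical stand-ins."""
    for _ in range(iterations):
        offsets = predict_offsets(vertices)  # stand-in for the optimization network
        weights = sample_weights(vertices)   # stand-in for attention sampling
        vertices = apply_weighted_offsets(vertices, offsets, weights)
    return vertices

# Single update with hand-written numbers.
one_step = apply_weighted_offsets(
    [(10.0, 10.0), (20.0, 10.0)],   # original vertex coordinates
    [(2.0, -4.0), (1.0, 1.0)],      # predicted vertex offsets
    [0.5, 1.0])                     # vertex weight coefficients

# Toy iteration: offsets point at a fixed target and every weight is 0.5,
# so the remaining distance halves at each iteration.
target = (8.0, 8.0)
refined = iterate_contour(
    [(0.0, 0.0)],
    predict_offsets=lambda vs: [(target[0] - x, target[1] - y) for x, y in vs],
    sample_weights=lambda vs: [0.5 for _ in vs])
```

Because the weight multiplies the offset, a vertex with a larger weight coefficient moves further in a single iteration than one with a smaller coefficient given the same predicted offset.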
In addition, in order to achieve the above object, the present invention further provides a building object extraction system for remote sensing images, where the building object extraction system for remote sensing images includes:
The initial module is used for inputting the remote sensing image into a Backbone module in the target detection network for feature extraction to obtain a remote sensing image feature map, and carrying out building detection on the remote sensing image feature map to obtain a multi-section initial contour of a building object;
the determining module is used for determining contour feature vectors and a plurality of original vertex coordinates corresponding to the initial contours of the sections;
The optimization module is used for respectively inputting a plurality of contour feature vectors into a contour optimization network to perform vertex prediction, and obtaining a plurality of vertex offsets corresponding to each section of optimized contour, wherein the contour optimization network is constructed from a Backbone module, a Fusion module and a Prediction module;
The optimization module is further used for processing the remote sensing image feature map through the contour attention module to obtain a contour boundary attention map, and performing bilinear interpolation on the contour boundary attention map according to a plurality of original vertex coordinates corresponding to each section of initial contour to obtain a plurality of vertex weight coefficients corresponding to each section of initial contour;
The optimization module is further used for obtaining a plurality of vertex target coordinates corresponding to each section of the optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of the initial contour, a plurality of vertex offsets corresponding to each section of the optimized contour and a plurality of vertex weight coefficients corresponding to each section of the initial contour;
And the extraction module is used for extracting the building object from the remote sensing image based on a plurality of vertex target coordinates corresponding to each section of optimized contour.
In addition, in order to achieve the above object, the present invention also provides a building object extraction device for remote sensing images, the device comprising: a memory, a processor, and a building object extraction program for remote sensing images that is stored in the memory and executable on the processor, wherein the program is configured to implement the steps of the building object extraction method for remote sensing images described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon a building object extraction program of a remote sensing image, which when executed by a processor, implements the steps of the building object extraction method of a remote sensing image as described above.
In the invention, a remote sensing image is first input into the Backbone module of a target detection network for feature extraction to obtain a remote sensing image feature map, and building detection is performed on the feature map to obtain a multi-section initial contour of a building object. The contour feature vector and the plurality of original vertex coordinates corresponding to each section of initial contour are then determined. The contour feature vectors are respectively input into a contour optimization network, constructed from a Backbone module, a Fusion module and a Prediction module, for vertex prediction, giving a plurality of vertex offsets corresponding to each section of optimized contour. The remote sensing image feature map is processed by a contour attention module to obtain a contour boundary attention map, and bilinear interpolation is performed on this map at the original vertex coordinates of each section of initial contour to obtain the corresponding vertex weight coefficients. Finally, a plurality of vertex target coordinates corresponding to each section of optimized contour are obtained through a coordinate iteration formula from the original vertex coordinates, the vertex offsets and the vertex weight coefficients, and the building object is extracted from the remote sensing image based on these vertex target coordinates.
The invention divides the initial contour into a plurality of sections according to the target detection frame and inputs each section independently into the network for regression, which avoids matching errors between predicted vertices and real vertices. A new Backbone network is proposed for the contour optimization stage; compared with previous networks, it extracts deeper contour features, so that the predicted contour boundary lies closer to the real boundary. An edge attention module is also introduced in the contour optimization stage to adjust the weight of each vertex during regression, forcing vertices far from the real contour to move faster. Finally, the vertex target coordinates of each section of optimized contour are obtained through the coordinate iteration formula, so that building objects can be extracted accurately from the remote sensing image.
Drawings
FIG. 1 is a schematic structural diagram of a building object extraction device for remote sensing images in the hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a first embodiment of a method for extracting a building object from a remote sensing image according to the present invention;
FIG. 3 is a diagram illustrating the overall CenterNet network architecture of a first embodiment of a method for extracting building objects from remote sensing images according to the present invention;
FIG. 4 is a schematic diagram of a detection frame of a first embodiment of a method for extracting a building object from a remote sensing image according to the present invention;
FIG. 5 is a schematic diagram of contour segmentation of a first embodiment of a method for extracting a building object from a remote sensing image according to the present invention;
FIG. 6 is a diagram showing the overall structure of a contour optimization network according to a first embodiment of a method for extracting building objects from remote sensing images according to the present invention;
FIG. 7 is a block diagram of the Backbone module in the contour optimization network according to a first embodiment of a method for extracting building objects from remote sensing images of the present invention;
FIG. 8 is a flowchart of the contour attention module according to a first embodiment of a method for extracting building objects from remote sensing images of the present invention;
FIG. 9 is a block diagram of a first embodiment of a system for extracting a building object from a remote sensing image according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a building object extraction device for remote sensing image of a hardware operation environment according to an embodiment of the present invention.
As shown in FIG. 1, the building object extraction device for remote sensing images may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
It will be appreciated by those skilled in the art that the structure shown in fig. 1 does not constitute a limitation of the remote sensing image building object extraction apparatus, and may include more or fewer components than shown, or may combine certain components, or may have a different arrangement of components.
As shown in fig. 1, the memory 1005 as one type of storage medium may include an operating system, a network communication module, a user interface module, and a building object extraction program for remote sensing images.
In the building object extraction device for remote sensing images shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 may be disposed in the device, which invokes the building object extraction program for remote sensing images stored in the memory 1005 through the processor 1001 and executes the building object extraction method for remote sensing images provided by the embodiments of the present invention.
The embodiment of the invention provides a building object extraction method of a remote sensing image, and referring to fig. 2, fig. 2 is a flow chart of a first embodiment of the building object extraction method of the remote sensing image.
In this embodiment, the method for extracting a building object from a remote sensing image includes the following steps:
Step S10: inputting the remote sensing image into a Backbone module in a target detection network for feature extraction to obtain a remote sensing image feature map, and performing building detection on the remote sensing image feature map to obtain a multi-section initial contour of a building object.
It is to be understood that the execution subject of the present embodiment may be a building object extraction system of remote sensing images with functions of data processing, network communication, program running, etc., or may be other computer devices with similar functions, etc., and the present embodiment is not limited thereto.
It should be noted that, in order to obtain building objects with different positions and sizes, a building needs to be first located in the remote sensing image and an initial contour of the building object is constructed.
It should be further understood that, referring to FIG. 3, FIG. 3 is an overall network structure diagram of CenterNet in the first embodiment of the building object extraction method for remote sensing images of the present invention. The target detection network in FIG. 3 uses CenterNet as the network for the building identification and positioning stage, to detect potential building objects in the remote sensing image.
Further, building detection is performed on the remote sensing image feature map to obtain the multi-section initial contour of the building object as follows: downsampling the remote sensing image feature map to obtain a predicted heat map, a predicted bias map and a predicted attribute map; determining geometric center coordinates of the building object through a heatmap head module according to the predicted heat map; determining the center coordinates of the detection frame corresponding to the building object according to the geometric center coordinates; determining the center coordinate offset of the detection frame through an offset head module according to the predicted bias map; determining detection frame attribute information through a wh head module according to the predicted attribute map; generating a detection frame of the building object according to the center coordinates of the detection frame, the center coordinate offset and the detection frame attribute information; and obtaining a multi-section initial contour of the building object by vertex sampling based on the detection frame of the building object.
In a specific implementation, features are extracted from the remote sensing image by a Backbone module composed of CNNs in the CenterNet network, and the extracted feature map (i.e., the remote sensing image feature map) is sampled to obtain three maps for prediction: an offset map (i.e., the predicted bias map), a wh map (i.e., the predicted attribute map) and a heatmap (i.e., the predicted heat map).
It should also be noted that the heatmap head module determines the geometric center coordinates of several building objects from the heatmap using a peak search method. The method traverses every point of the heatmap with 3×3 max pooling, checking whether the value (confidence) of the point is greater than or equal to that of its eight neighboring points; a certain number of candidate points above a threshold are obtained and then screened by confidence to give the final center points. The position of each center point corresponds to the center coordinates of the detection frame of a building object in the heatmap. The offset of the detection frame center coordinates is then determined by the offset head module, and the detection frame attribute information, which comprises the length and width of the detection frame, is determined by the wh head module.
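The peak search can be sketched as follows: a point is kept when its confidence exceeds a threshold and is at least as large as every value in its 3×3 neighborhood, and the surviving points are then ranked by confidence. The threshold of 0.3, the `top_k` limit and the function name `find_peaks` are illustrative assumptions; the patent does not give these values.

```python
def find_peaks(heatmap, threshold=0.3, top_k=100):
    """Return (x, y, confidence) for local maxima of a 2D heatmap."""
    height, width = len(heatmap), len(heatmap[0])
    candidates = []
    for row in range(height):
        for col in range(width):
            value = heatmap[row][col]
            if value <= threshold:
                continue
            # Maximum of the 3x3 window, clipped at the map border; this is
            # what a 3x3 max pooling with stride 1 would output at this point.
            window_max = max(
                heatmap[r][c]
                for r in range(max(row - 1, 0), min(row + 2, height))
                for c in range(max(col - 1, 0), min(col + 2, width))
            )
            if value >= window_max:
                candidates.append((col, row, value))
    # Screen by confidence: strongest peaks first, keep at most top_k.
    candidates.sort(key=lambda item: item[2], reverse=True)
    return candidates[:top_k]

heatmap = [[0.1, 0.2, 0.1, 0.0],
           [0.2, 0.9, 0.3, 0.1],
           [0.1, 0.3, 0.2, 0.6],
           [0.0, 0.1, 0.4, 0.5]]
peaks = find_peaks(heatmap)
# peaks == [(1, 1, 0.9), (3, 2, 0.6)]
```

The values 0.4 and 0.5 in the bottom row exceed the threshold but are suppressed because the neighboring 0.6 is larger, which is how the pooling comparison keeps only one center per building.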
In this embodiment, CenterNet obtains the center point and the length and width of each detected object through the three prediction heads (i.e., the heatmap head module, the offset head module and the wh head module), thereby obtaining the potential building positions in the remote sensing image, and generates a detection box for the building object at each position. Referring to FIG. 4, FIG. 4 is a schematic diagram of the detection box of the first embodiment of the building object extraction method for remote sensing images of the present invention.
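Combining the three head outputs into a detection box can be sketched as below. The sketch assumes a common CenterNet-style convention that the source does not state explicitly: the center index, offset and size are all expressed in feature-map cells, and they are scaled back to image pixels by the downsampling stride (4 here). The function name `decode_box` and the stride value are assumptions.

```python
def decode_box(center_x, center_y, offset, size, stride=4):
    """Build (x_min, y_min, x_max, y_max) in image pixels for one center point.

    center_x, center_y: integer peak position on the heatmap
    offset: (dx, dy) sub-cell correction read from the offset map
    size:   (width, height) read from the wh map, in feature-map cells
    stride: downsampling factor between image and heatmap (assumed)
    """
    dx, dy = offset
    width, height = size
    cx = (center_x + dx) * stride
    cy = (center_y + dy) * stride
    half_w = width * stride / 2.0
    half_h = height * stride / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# A peak at heatmap cell (10, 20) with a small offset and a 6x4-cell extent.
box = decode_box(10, 20, offset=(0.5, 0.25), size=(6, 4))
# box == (30.0, 73.0, 54.0, 89.0)
```

If the wh head were trained to predict sizes directly in image pixels, the multiplication of `width` and `height` by `stride` would be dropped.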
It should also be appreciated that one inspection box is assigned to each building. The detection frame is usually represented by the coordinates of the upper left corner and the coordinates of the lower right corner, and its shape is usually a matrix. The rectangle is directly and uniformly sampled, the rectangle is sampled from a rectangle represented by only 2 or 4 vertexes to N vertexes, N is generally set to 128, the new contour is taken as an initial contour of a building, then the initial contour of the building is divided into a plurality of sections of initial contours (such as eight sections of initial contours) by using four vertexes of a target detection frame (namely the initial contour of the building) and four intersection points of a building object and the detection frame, the sections of initial contours are resampled, and the contour is optimized locally in each contour section, and referring to fig. 5, fig. 5 is a contour segmentation schematic diagram of a first embodiment of a building object extraction method of the remote sensing image of the invention.
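The two sampling steps described above can be sketched as follows: the rectangle is first sampled uniformly along its perimeter into N vertices, and the perimeter is then cut at the four corners and four touch points into eight sections that are each resampled. How the four intersection points between the building and the detection frame are obtained is not specified here, so the sketch takes them as given inputs; the function names, the clockwise ordering and the 16 points per section are assumptions.

```python
def sample_rectangle(x_min, y_min, x_max, y_max, n=128):
    """Uniformly sample n vertices clockwise along the rectangle perimeter."""
    w, h = x_max - x_min, y_max - y_min
    perimeter = 2.0 * (w + h)
    points = []
    for i in range(n):
        d = perimeter * i / n              # distance walked from the top-left corner
        if d < w:                          # top edge, left to right
            points.append((x_min + d, y_min))
        elif d < w + h:                    # right edge, top to bottom
            points.append((x_max, y_min + (d - w)))
        elif d < 2 * w + h:                # bottom edge, right to left
            points.append((x_max - (d - w - h), y_max))
        else:                              # left edge, bottom to top
            points.append((x_min, y_max - (d - 2 * w - h)))
    return points

def resample_line(start, end, m):
    """Return m evenly spaced points from start to end (both included)."""
    (x0, y0), (x1, y1) = start, end
    return [(x0 + (x1 - x0) * t / (m - 1), y0 + (y1 - y0) * t / (m - 1))
            for t in range(m)]

def split_into_sections(box, top_x, right_y, bottom_x, left_y, m=16):
    """Cut the box perimeter at 4 corners and 4 touch points into 8 sections."""
    x_min, y_min, x_max, y_max = box
    breakpoints = [(x_min, y_min), (top_x, y_min), (x_max, y_min), (x_max, right_y),
                   (x_max, y_max), (bottom_x, y_max), (x_min, y_max), (x_min, left_y)]
    return [resample_line(breakpoints[i], breakpoints[(i + 1) % 8], m)
            for i in range(8)]

box = (0, 0, 40, 24)
initial_contour = sample_rectangle(*box, n=128)
# Touch-point positions are hypothetical inputs for this example.
sections = split_into_sections(box, top_x=10, right_y=12, bottom_x=30, left_y=6)
```

Each of the eight sections can then be refined independently, which is the local optimization the paragraph above describes.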
Step S20: determining the contour feature vector and the plurality of original vertex coordinates corresponding to each segment of the initial contour.
It should also be noted that although the building initial contour already represents the building object to some extent, it is still not accurate enough compared with the actual contour of the building; it is therefore necessary to segment the building initial contour so that it can approximate the actual contour more closely.
It should be appreciated that, since each segment of the initial contour has a corresponding plurality of original vertices, the contour feature vector and the corresponding plurality of original vertex coordinates need to be determined in advance for each segment of the initial contour.
Step S30: inputting the plurality of contour feature vectors respectively into a contour optimization network for vertex prediction to obtain a plurality of vertex offsets corresponding to each segment of the optimized contour, wherein the contour optimization network is constructed by combining a Backbone module, a Fusion module and a Prediction module.
Further, the plurality of contour feature vectors are respectively input into the contour optimization network for vertex prediction to obtain the plurality of vertex offsets corresponding to each segment of the optimized contour as follows: the plurality of contour feature vectors are respectively input into the Backbone module in the contour optimization network to obtain contour feature information of different levels corresponding to each segment of the initial contour; the contour feature information of different levels corresponding to each segment of the initial contour is processed by the Fusion module in the contour optimization network to obtain the pooled contour feature information of each segment of the initial contour; and the plurality of vertex offsets corresponding to each segment of the optimized contour are obtained through the Prediction module in the contour optimization network according to the pooled contour feature information of each segment of the initial contour.
In a specific implementation, referring to fig. 6, fig. 6 is an overall structure diagram of the contour optimization network of the first embodiment of the building object extraction method for remote sensing images according to the present invention. The input of the contour optimization network is the contour feature vector corresponding to each segment of the initial contour of the building object, and its output is the plurality of vertex offsets corresponding to each segment of the optimized contour. The overall contour optimization network is divided into a Backbone part, a Fusion part and a Prediction part.
The plurality of vertex offsets corresponding to each segment of the optimized contour may be understood as the plurality of vertex offsets after each segment of the initial contour is optimized.
In a specific implementation, for the Backbone module:
A Backbone model imitating the UNet++ network is designed. Its network structure is shown in fig. 7, which is a structure diagram of the Backbone model in the contour optimization network of the first embodiment of the building object extraction method for remote sensing images of the present invention. Compared with the UNet++ network structure, this Backbone reduces the depth of the network and increases its length. The Backbone involves three operations, namely downsampling, upsampling, and the base convolution block, where the base convolution block consists of two layers of 1-dimensional convolution, ReLU, and BatchNorm. Each $X^{0,i}$ of the first layer is obtained by adding the result of passing $X^{0,i-1}$ through the base convolution block to the up-sampling result of $X^{1,i-1}$. For the second layer, $X^{1,0}$ is the result of downsampling $X^{0,0}$, and every other $X^{1,i}$ is obtained from the results of the previous layer after the base convolution block and calculation (namely, the contour feature information of different levels corresponding to each segment of the initial contour). Every $X^{0,i}$ of the first layer has the same dimensions, and each $X^{0,i}$ is stored in State.
Fusion module:
The Fusion part is used for fusing the features of the preceding levels: the State output by the Backbone module is fused, passed through a 1 × 1 convolution layer, and then max-pooled, so that the pooled contour feature information of each segment of the initial contour is obtained.
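As a hedged illustration of the Fusion step, the sketch below treats a 1 × 1 convolution over a contour as the same linear map applied independently to every vertex, followed by max pooling. It assumes that the stored states are fused by channel concatenation and that pooling runs over the vertex axis; neither detail is stated in the embodiment, and the names `conv1x1` and `fuse_states` are hypothetical.

```python
def conv1x1(features, weight, bias):
    """1x1 convolution over a contour: features[c][n] -> out[o][n].

    Every vertex n is mapped independently by the same weight matrix.
    """
    n_vertices = len(features[0])
    return [
        [
            sum(w * features[c][n] for c, w in enumerate(row)) + b
            for n in range(n_vertices)
        ]
        for row, b in zip(weight, bias)
    ]


def fuse_states(states, weight, bias):
    """Concatenate states along channels (assumed), apply a 1x1
    convolution, then max-pool each channel over the vertices (assumed)."""
    stacked = [channel for state in states for channel in state]
    mixed = conv1x1(stacked, weight, bias)
    return [max(channel) for channel in mixed]


# Two toy states, each with 1 channel and 3 vertices.
states = [[[1.0, 2.0, 3.0]], [[0.0, -1.0, 4.0]]]
weight = [[1.0, 1.0], [1.0, -1.0]]  # 2 output channels x 2 input channels
bias = [0.0, 0.5]
pooled = fuse_states(states, weight, bias)
print(pooled)  # [7.0, 3.5]
```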
Prediction module:
The Prediction part consists of N layers of 1 × 1 convolution and a ReLU function. The plurality of vertex offsets corresponding to each segment of the optimized contour are obtained through the Prediction module in the contour optimization network according to the pooled contour feature information of each segment of the initial contour.
Step S40: processing the remote sensing image feature map through a contour attention module to obtain a contour boundary attention map, and performing bilinear interpolation on the contour boundary attention map according to the plurality of original vertex coordinates corresponding to each segment of the initial contour to obtain a plurality of vertex weight coefficients corresponding to each segment of the initial contour.
Further, the remote sensing image feature map is processed through the contour attention module to obtain the contour boundary attention map as follows: stacked convolution processing is performed on the remote sensing image feature map by the contour attention module to obtain a contour boundary convolution map; the contour boundary convolution map is processed through a Sigmoid activation layer to obtain a contour boundary prediction map; multi-layer convolution processing is performed on the contour boundary prediction map to obtain a contour prediction convolution map; and the contour boundary attention map is generated from the contour prediction convolution map through a Sigmoid activation layer.
In a specific implementation, referring to fig. 8, fig. 8 is a flow chart of the contour attention module of the first embodiment of the building object extraction method for remote sensing images according to the present invention. An additional contour attention module is introduced; this module automatically learns the weight of each vertex in the contour and gives greater weight to vertices far from the real contour, so that they move faster and the learning efficiency of the network is increased.
In this embodiment, N convolution layers are successively stacked on the feature map obtained for the picture after contour initialization, that is, the remote sensing image feature map (feature map), and the result is processed by a Sigmoid activation layer to obtain a prediction map of the contour boundary, that is, the contour boundary prediction map (contour map). A loss is computed directly between the contour boundary prediction map and the true contour boundary of the picture.
The contour boundary prediction map is likewise passed through N convolution layers, and finally an attention map (Attention map) is generated using a Sigmoid activation layer. Bilinear interpolation is performed on the attention map according to the position of the contour (i.e., the plurality of original vertex coordinates corresponding to each segment of the initial contour) to obtain the weight coefficient Atten of each vertex (i.e., the plurality of vertex weight coefficients corresponding to each segment of the initial contour).
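The bilinear interpolation of the attention map at fractional vertex positions can be sketched as follows. The function name `bilinear_sample` and the rule of clamping coordinates to the border of the map are assumptions made for this example.

```python
import math


def bilinear_sample(grid, x, y):
    """Sample a 2-D map at a fractional (x, y) position.

    grid is indexed as grid[y][x]; coordinates outside the map are
    clamped to its border (an assumed boundary rule).
    """
    rows, cols = len(grid), len(grid[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two surrounding rows, then along y.
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


attention = [
    [0.0, 1.0],
    [2.0, 3.0],
]
vertices = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.25)]
weights = [bilinear_sample(attention, x, y) for x, y in vertices]
print(weights)  # [0.0, 1.5, 1.5]
```

Each vertex thus receives one scalar weight even when its coordinates do not fall on a grid point of the attention map.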
Step S50: obtaining a plurality of vertex target coordinates corresponding to each segment of the optimized contour through a coordinate iteration formula according to the plurality of original vertex coordinates corresponding to each segment of the initial contour, the plurality of vertex offsets corresponding to each segment of the optimized contour and the plurality of vertex weight coefficients corresponding to each segment of the initial contour.
In this embodiment, a plurality of vertex target offsets corresponding to each segment of the optimized contour are obtained through a vertex offset formula according to the plurality of vertex offsets corresponding to each segment of the optimized contour and the plurality of vertex weight coefficients corresponding to each segment of the initial contour;
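The vertex offset formula is:
$$(\Delta x'_k, \Delta y'_k) = \mathrm{Atten}(x_k, y_k) \cdot (\Delta x_k, \Delta y_k)$$
Where $k$ is the iteration number, $\mathrm{Atten}(x_k, y_k)$ is the vertex weight coefficient of the $k$-th iteration, $(\Delta x_k, \Delta y_k)$ is the vertex offset of the $k$-th iteration, and $(\Delta x'_k, \Delta y'_k)$ is the vertex target offset of the $k$-th iteration;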
obtaining a plurality of vertex target coordinates corresponding to each section of optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of initial contour and a plurality of vertex target offsets corresponding to each section of optimized contour;
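The coordinate iteration formula is:
$$(x_i^{k}, y_i^{k}) = (x_i^{k-1}, y_i^{k-1}) + (\Delta x_i'^{\,k-1}, \Delta y_i'^{\,k-1})$$
In the formula, $(\Delta x_i'^{\,k-1}, \Delta y_i'^{\,k-1})$ is the vertex target offset of point $i$ at the $(k-1)$-th iteration, that is, the vertex target offset $(\Delta x'_{k-1}, \Delta y'_{k-1})$ of the $(k-1)$-th iteration taken at that point; $(x_i^{k-1}, y_i^{k-1})$ is the original vertex coordinates of point $i$ at the $(k-1)$-th iteration; and $(x_i^{k}, y_i^{k})$ is the vertex target coordinates of point $i$ at the $k$-th iteration.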
It should be noted that the vertex coordinates are updated iteratively: the original vertex coordinates $(x_i^{k-1}, y_i^{k-1})$ and the vertex target offset $(\Delta x_i'^{\,k-1}, \Delta y_i'^{\,k-1})$ are added to give the new vertex coordinates $(x_i^{k}, y_i^{k})$, and these new coordinates are then passed into the contour optimization network as before to output a new offset. The number of iterations K can be set to 3.
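The iterative update can be sketched as the loop below. It is a minimal illustration under stated assumptions: `predict_offsets` and `attention_at` are hypothetical callables standing in for the trained contour optimization network and the interpolated attention map, and the toy stand-ins used in the example are not part of the embodiment.

```python
def refine_contour(vertices, predict_offsets, attention_at, iterations=3):
    """Iteratively move contour vertices.

    predict_offsets(vertices) -> list of (dx, dy), one per vertex
    attention_at(x, y)        -> scalar weight for that position
    Both callables are placeholders for the trained networks.
    """
    for _ in range(iterations):
        offsets = predict_offsets(vertices)
        updated = []
        for (x, y), (dx, dy) in zip(vertices, offsets):
            w = attention_at(x, y)
            # Weighted (target) offset added to the current coordinates.
            updated.append((x + w * dx, y + w * dy))
        vertices = updated
    return vertices


# Toy stand-ins: every offset points at (8, 8) and every weight is 0.5.
target = (8.0, 8.0)


def toy_offsets(vs):
    return [(target[0] - x, target[1] - y) for x, y in vs]


def toy_attention(x, y):
    return 0.5


result = refine_contour([(0.0, 0.0), (16.0, 8.0)], toy_offsets, toy_attention)
print(result)  # [(7.0, 7.0), (9.0, 8.0)]
```

With a constant weight of 0.5 each vertex covers half of its remaining distance per iteration, so after three iterations one eighth of the initial distance remains.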
Step S60: extracting the building object from the remote sensing image based on the plurality of vertex target coordinates corresponding to each segment of the optimized contour.
It should be understood that, by obtaining the target coordinates of each vertex after each section of the optimized contour in the above manner, the contour after optimizing the building object can be determined according to the target coordinates of each vertex after each section of the optimized contour, and then the building object can be extracted from the remote sensing image according to the contour after optimizing the building object.
In this embodiment, a remote sensing image is first input to the Backbone module in a target detection network for feature extraction to obtain a remote sensing image feature map, and building detection is performed on the remote sensing image feature map to obtain a multi-segment initial contour of a building object. The contour feature vector and the plurality of original vertex coordinates corresponding to each segment of the initial contour are then determined, and the plurality of contour feature vectors are respectively input to a contour optimization network for vertex prediction to obtain a plurality of vertex offsets corresponding to each segment of the optimized contour, the contour optimization network being constructed by combining a Backbone module, a Fusion module and a Prediction module. Next, the remote sensing image feature map is processed by a contour attention module to obtain a contour boundary attention map, and bilinear interpolation is performed on the contour boundary attention map according to the plurality of original vertex coordinates corresponding to each segment of the initial contour to obtain a plurality of vertex weight coefficients corresponding to each segment of the initial contour. Finally, a plurality of vertex target coordinates corresponding to each segment of the optimized contour are obtained through a coordinate iteration formula according to the original vertex coordinates, the vertex offsets and the vertex weight coefficients, and the building object is extracted from the remote sensing image based on the vertex target coordinates corresponding to each segment of the optimized contour.
In this embodiment, the initial contour is divided into multiple segments according to the target detection frame, and each segment of the initial contour is input into the network separately for regression, which avoids matching errors between predicted vertices and real vertices. A new Backbone network is provided in the contour optimization stage; compared with previous networks, it can extract deeper features of the contour, so that the predicted contour boundary is closer to the real boundary. An edge attention module is also introduced in the contour optimization stage to adjust the weight of each vertex during regression, forcing vertices far from the real contour to move faster. Finally, a plurality of vertex target coordinates corresponding to each segment of the optimized contour are obtained through the coordinate iteration formula, so that the building object can be accurately extracted from the remote sensing image.
Referring to fig. 9, fig. 9 is a block diagram of the first embodiment of the building object extraction system for remote sensing images of the present invention.
As shown in fig. 9, a building object extraction system for remote sensing images according to an embodiment of the present invention includes:
The initial module 9001 is configured to input a remote sensing image to a Backbone module in a target detection network to perform feature extraction, obtain a remote sensing image feature map, and perform building detection on the remote sensing image feature map to obtain a multi-segment initial contour of a building object;
A determining module 9002, configured to determine a contour feature vector and a plurality of original vertex coordinates corresponding to each segment of the initial contour;
The optimizing module 9003 is configured to input a plurality of contour feature vectors into a contour optimization network respectively to perform vertex prediction, and obtain a plurality of vertex offsets corresponding to each segment of optimized contour, where the contour optimization network is constructed by combining a Backbone module, a Fusion module and a Prediction module;
the optimizing module 9003 is further configured to process the remote sensing image feature map through a contour attention module to obtain a contour boundary attention map, and perform bilinear interpolation on the contour boundary attention map according to a plurality of original vertex coordinates corresponding to each segment of the initial contour to obtain a plurality of vertex weight coefficients corresponding to each segment of the initial contour;
the optimizing module 9003 is further configured to obtain a plurality of vertex target coordinates corresponding to each segment of the optimized contour according to a plurality of original vertex coordinates corresponding to each segment of the initial contour, a plurality of vertex offsets corresponding to each segment of the optimized contour, and a plurality of vertex weight coefficients corresponding to each segment of the initial contour through a coordinate iteration formula;
and the extracting module 9004 is configured to extract the building object from the remote sensing image based on a plurality of vertex target coordinates corresponding to each segment of the optimized contour.
Other embodiments or specific implementation manners of the building object extraction system for remote sensing images of the present invention may refer to the above method embodiments, and will not be described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. read-only memory/random-access memory, magnetic disk, optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (8)

1. The building object extraction method of the remote sensing image is characterized by comprising the following steps of:
Inputting a remote sensing image into a Backbone module in a target detection network for feature extraction to obtain a remote sensing image feature map, and performing building detection on the remote sensing image feature map to obtain a multi-section initial contour of a building object;
Determining contour feature vectors and a plurality of original vertex coordinates corresponding to the initial contours of all the sections;
inputting a plurality of contour feature vectors into a contour optimization network respectively for vertex prediction to obtain a plurality of vertex offsets corresponding to each section of optimized contour, wherein the contour optimization network is constructed by a combination of a Backbone module, a Fusion module and a Prediction module;
Processing the remote sensing image feature map through a contour attention module to obtain a contour boundary attention map, and performing bilinear interpolation on the contour boundary attention map according to a plurality of original vertex coordinates corresponding to each section of initial contour to obtain a plurality of vertex weight coefficients corresponding to each section of initial contour;
Obtaining a plurality of vertex target coordinates corresponding to each section of optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of initial contour, a plurality of vertex offsets corresponding to each section of optimized contour and a plurality of vertex weight coefficients corresponding to each section of initial contour;
And extracting the building object from the remote sensing image based on a plurality of vertex target coordinates corresponding to each section of optimized contour.
2. The method of claim 1, wherein the step of performing building detection on the remote sensing image feature map to obtain a multi-segment initial contour of a building object comprises:
downsampling the remote sensing image feature map to obtain a predicted heat map, a predicted bias map and a predicted attribute map;
determining geometric center coordinates of the building object through a heatmap head module according to the predicted heat map;
Determining the center coordinates of the detection frame corresponding to the building object according to the geometric center coordinates;
Determining the central coordinate offset of the detection frame through an offset head module according to the prediction bias diagram;
Determining detection frame attribute information through a wh head module according to the prediction attribute map;
generating a detection frame of the building object according to the center coordinates of the detection frame, the center coordinate offset and the detection frame attribute information;
And obtaining a multi-section initial contour of the building object through vertex sampling based on the detection frame of the building object.
3. The method of claim 2, wherein the step of inputting the plurality of contour feature vectors into the contour optimization network for vertex prediction to obtain a plurality of vertex offsets corresponding to each segment of the optimized contour comprises:
Respectively inputting a plurality of contour feature vectors into a Backbone module in a contour optimization network to obtain contour feature information of different levels corresponding to each section of initial contour;
processing contour feature information of different levels corresponding to each section of initial contour by a Fusion module in the contour optimization network respectively to obtain pooled contour feature information of each section of initial contour;
And obtaining a plurality of vertex offsets corresponding to each section of optimized contour through a Prediction module in the contour optimization network according to the pooled contour feature information of each section of initial contour.
4. The method of any of claims 1-3, wherein the step of processing the remote sensing image feature map by a contour attention module to obtain a contour boundary attention map comprises:
Performing superposition convolution processing on the remote sensing image feature map through a contour attention module to obtain a contour boundary convolution map;
processing the outline boundary convolution graph through a Sigmoid activation layer to obtain an outline boundary prediction graph;
Carrying out multi-layer convolution processing on the contour boundary prediction graph to obtain a contour prediction convolution graph;
And generating a contour boundary attention map through the Sigmoid activation layer according to the contour prediction convolution map.
5. The method of claim 4, wherein the step of obtaining the plurality of vertex target coordinates corresponding to each segment of the optimized contour according to the plurality of original vertex coordinates corresponding to each segment of the initial contour, the plurality of vertex offsets corresponding to each segment of the optimized contour, and the plurality of vertex weight coefficients corresponding to each segment of the initial contour through a coordinate iteration formula comprises:
Obtaining a plurality of vertex target offsets corresponding to each section of optimized contour through a vertex offset formula according to a plurality of vertex offsets corresponding to each section of optimized contour and a plurality of vertex weight coefficients corresponding to each section of initial contour;
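The vertex offset formula is:
$$(\Delta x'_k, \Delta y'_k) = \mathrm{Atten}(x_k, y_k) \cdot (\Delta x_k, \Delta y_k)$$
Where $k$ is the iteration number, $\mathrm{Atten}(x_k, y_k)$ is the vertex weight coefficient of the $k$-th iteration, $(\Delta x_k, \Delta y_k)$ is the vertex offset of the $k$-th iteration, and $(\Delta x'_k, \Delta y'_k)$ is the vertex target offset of the $k$-th iteration;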
obtaining a plurality of vertex target coordinates corresponding to each section of optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of initial contour and a plurality of vertex target offsets corresponding to each section of optimized contour;
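The coordinate iteration formula is:
$$(x_i^{k}, y_i^{k}) = (x_i^{k-1}, y_i^{k-1}) + (\Delta x_i'^{\,k-1}, \Delta y_i'^{\,k-1})$$
In the formula, $(\Delta x_i'^{\,k-1}, \Delta y_i'^{\,k-1})$ is the vertex target offset of point $i$ at the $(k-1)$-th iteration, that is, the vertex target offset $(\Delta x'_{k-1}, \Delta y'_{k-1})$ of the $(k-1)$-th iteration taken at that point; $(x_i^{k-1}, y_i^{k-1})$ is the original vertex coordinates of point $i$ at the $(k-1)$-th iteration; and $(x_i^{k}, y_i^{k})$ is the vertex target coordinates of point $i$ at the $k$-th iteration.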
6. A building object extraction system of remote sensing images, characterized in that the building object extraction system of remote sensing images comprises:
The initial module is used for inputting the remote sensing image into a Backbone module in the target detection network for feature extraction to obtain a remote sensing image feature map, and carrying out building detection on the remote sensing image feature map to obtain a multi-section initial contour of a building object;
the determining module is used for determining contour feature vectors and a plurality of original vertex coordinates corresponding to the initial contours of the sections;
The optimization module is used for respectively inputting a plurality of contour feature vectors into a contour optimization network to conduct vertex prediction, and obtaining a plurality of vertex offsets corresponding to each section of optimized contour, wherein the contour optimization network is constructed by a Backbone module, a Fusion module and a Prediction module in a combined way;
The optimization module is further used for processing the remote sensing image feature map through the contour attention module to obtain a contour boundary attention map, and performing bilinear interpolation on the contour boundary attention map according to a plurality of original vertex coordinates corresponding to each section of initial contour to obtain a plurality of vertex weight coefficients corresponding to each section of initial contour;
The optimization module is further used for obtaining a plurality of vertex target coordinates corresponding to each section of the optimized contour through a coordinate iteration formula according to a plurality of original vertex coordinates corresponding to each section of the initial contour, a plurality of vertex offsets corresponding to each section of the optimized contour and a plurality of vertex weight coefficients corresponding to each section of the initial contour;
And the extraction module is used for extracting the building object from the remote sensing image based on a plurality of vertex target coordinates corresponding to each section of optimized contour.
7. A building object extraction apparatus for remote sensing images, the apparatus comprising: a memory, a processor and a remote sensing image building object extraction program stored on the memory and executable on the processor, the remote sensing image building object extraction program being configured to implement the steps of the remote sensing image building object extraction method of any one of claims 1 to 5.
8. A storage medium, wherein a building object extraction program of a remote sensing image is stored on the storage medium, and the building object extraction program of the remote sensing image, when executed by a processor, implements the steps of the building object extraction method of a remote sensing image according to any one of claims 1 to 5.
CN202410992709.8A 2024-07-23 2024-07-23 Building object extraction method, system, equipment and storage medium of remote sensing image Pending CN119007002A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410992709.8A CN119007002A (en) 2024-07-23 2024-07-23 Building object extraction method, system, equipment and storage medium of remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410992709.8A CN119007002A (en) 2024-07-23 2024-07-23 Building object extraction method, system, equipment and storage medium of remote sensing image

Publications (1)

Publication Number Publication Date
CN119007002A true CN119007002A (en) 2024-11-22

Family

ID=93477207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410992709.8A Pending CN119007002A (en) 2024-07-23 2024-07-23 Building object extraction method, system, equipment and storage medium of remote sensing image

Country Status (1)

Country Link
CN (1) CN119007002A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119991997A (en) * 2025-04-17 2025-05-13 中国测绘科学研究院 A method and system for end-to-end extraction of building vectors based on state update

Similar Documents

Publication Publication Date Title
CN111902825B (en) Polygonal object annotation system and method and method for training object annotation system
JP7040278B2 (en) Training method and training device for image processing device for face recognition
CN112991447A (en) Visual positioning and static map construction method and system in dynamic environment
CN108229488A (en) Method, apparatus and electronic device for detecting object key points
CN111091567B (en) Medical image registration method, medical device and storage medium
CN113177592B (en) Image segmentation method and device, computer equipment and storage medium
CN111292377B (en) Target detection method, device, computer equipment and storage medium
JP6880618B2 (en) Image processing program, image processing device, and image processing method
KR102352942B1 (en) Method and device for annotating object boundary information
TW202011266A (en) Neural network system for image matching and location determination, method, and device
CN119007002A (en) Building object extraction method, system, equipment and storage medium of remote sensing image
CN116977187A (en) A depth point set resampling method based on gradient field
CN118644546A (en) Three-dimensional posture optimization method and device
US11741611B2 (en) Cyclical object segmentation neural networks
CN118470539B (en) Building base extraction method, device and equipment based on non-orthographic remote sensing image
KR20230080804A (en) Apparatus and method for estimating human pose based AI
US11210551B2 (en) Iterative multi-directional image search supporting large template matching
CN116468761B (en) Registration method, equipment and storage medium based on probability distribution distance feature description
CN118229585A (en) Deep image restoration method, device, computer equipment and storage medium
CN117994307A (en) Point cloud registration method, system, storage medium and device based on diffusion model
CN114120055B (en) Training method of instance segmentation model, instance segmentation method, device and medium
CN117475265A (en) Teaching device, teaching method and storage medium
CN114639013B (en) Remote sensing image plane target detection and identification method based on improved Oriented RCNN model
CN110728359A (en) Method, apparatus, device and storage medium for searching model structure
CN117115646A (en) Rapid regularization methods, devices, equipment and storage media for remote sensing building interpretation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination