Disclosure of Invention
To overcome the above technical defects, the present application provides an incremental map optimization method based on three-dimensional Gaussians, together with a storage medium and a device.
To achieve the above purpose, the application is implemented according to the following technical solution:
In a first aspect, the present application provides an incremental map optimization method based on three-dimensional Gaussians. The method is applied to an incremental map building sub-module, wherein the incremental map building sub-module and a multi-sensor visual odometer sub-module together form a positioning and mapping module, and the method includes:
S1, acquiring a current frame and an estimated pose, wherein the estimated pose is output by the multi-sensor visual odometer sub-module;
S2, judging whether the current frame is a key frame;
if yes, executing step S3;
S3, rendering a contour image based on the estimated pose;
S4, acquiring a historical three-dimensional Gaussian map;
determining pixels corresponding to a newly added region based on the contour image and the historical three-dimensional Gaussian map;
S5, optimizing the historical three-dimensional Gaussian map based on the pixels corresponding to the newly added region to obtain a first-precision three-dimensional Gaussian map corresponding to the newly added region;
S6, acquiring historical key frames, wherein the historical key frames comprise an initial key frame and a previous key frame;
performing loss optimization on the first-precision three-dimensional Gaussian map based on the historical key frames to obtain a second-precision three-dimensional Gaussian map corresponding to the newly added region, wherein the first precision is lower than the second precision.
Optionally, judging whether the current frame is a key frame specifically includes:
judging whether the pixel similarity between the current frame and the previous key frame is greater than a first threshold;
if yes, taking the current frame as a key frame.
Optionally, rendering the contour image based on the estimated pose includes:
determining, from the key frame, the current camera view angle corresponding to the key frame;
rendering, based on the estimated pose, from the current camera view angle to obtain the contour image.
Optionally, determining the pixels corresponding to the newly added region based on the contour image and the historical three-dimensional Gaussian map includes:
generating a mask based on the contour image;
matching the mask against the historical three-dimensional Gaussian map to determine the pixels corresponding to the newly added region.
Optionally, optimizing the historical three-dimensional Gaussian map based on the pixels corresponding to the newly added region to obtain the first-precision three-dimensional Gaussian map corresponding to the newly added region includes:
acquiring the radar points newly registered in the time interval between the current frame and the previous key frame;
randomly sampling one half of the newly registered radar points;
determining optimizable radar points based on the pixels corresponding to the newly added region and the randomly sampled half of the newly registered radar points;
determining optimizable Gaussian points based on the optimizable radar points;
optimizing the historical three-dimensional Gaussian map based on the optimizable Gaussian points to obtain the first-precision three-dimensional Gaussian map corresponding to the newly added region.
Optionally, determining the optimizable radar points based on the pixels corresponding to the newly added region and the randomly sampled half of the newly registered radar points includes:
projecting the randomly sampled half of the newly registered radar points onto the pixels corresponding to the newly added region;
judging whether each sampled radar point can be projected into the pixels corresponding to the newly added region;
if yes, taking the radar points that can be projected into the pixels corresponding to the newly added region as the optimizable radar points.
Optionally, performing loss optimization on the first-precision three-dimensional Gaussian map based on the historical key frames to obtain the second-precision three-dimensional Gaussian map corresponding to the newly added region includes:
step S701, randomly shuffling the historical key frames, and iteratively minimizing the photometric rendering loss on the first-precision three-dimensional Gaussian map by using the following formula to obtain an optimized three-dimensional Gaussian rendering depth map;
wherein I and Î are, respectively, the observed (captured) image corresponding to a historical key frame and the image rendered from the Gaussian representation for that historical key frame, L_D-SSIM is a structural similarity term that constrains structural consistency, λ is a manually set weight, and L is the loss;
step S702, judging whether the number of calculations of the photometric rendering loss minimization is greater than a first preset number of calculations;
if yes, ending the iterative processing, and taking the optimized three-dimensional Gaussian rendering depth map as the second-precision three-dimensional Gaussian map corresponding to the newly added region;
if not, judging whether the number of calculations of the photometric rendering loss minimization is equal to a second preset number of calculations, wherein the second preset number of calculations is obtained by dividing the first preset number of calculations according to a preset gradient;
if yes, executing step S703;
if not, taking the optimized three-dimensional Gaussian rendering depth map as the first-precision three-dimensional Gaussian map for the next round of iterative processing, and repeating step S701;
step S703, obtaining the optimized three-dimensional Gaussian rendering depth map;
calculating the accumulated opacity of each pixel in the optimized three-dimensional Gaussian rendering depth map;
if the accumulated opacity of any pixel is greater than a second threshold, or the difference between the rendered depth of that pixel and the laser point cloud depth is greater than a third threshold, marking the pixel as an unstable pixel;
determining, based on the unstable pixels, the three-dimensional Gaussian map corresponding to the unstable pixels;
processing the three-dimensional Gaussian map corresponding to the unstable pixels with an adaptive densification and pruning strategy to obtain a processed first loss-optimized Gaussian map;
taking the processed first loss-optimized Gaussian map as the first-precision three-dimensional Gaussian map for the next round of photometric rendering loss minimization, and repeating steps S701 to S703.
Optionally, the estimated pose being output by the multi-sensor visual odometer sub-module includes:
inputting a radar point cloud map, the current frame and inertial navigation data into the multi-sensor visual odometer sub-module;
determining the estimated pose based on the radar point cloud map, the current frame and the inertial navigation data;
outputting the estimated pose through the multi-sensor visual odometer sub-module.
In a second aspect, the present application provides a computer device comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set or an instruction set that is loaded and executed by the processor to implement the incremental map optimization method based on three-dimensional Gaussians according to any of claims 1 to 8.
In a third aspect, the present application provides a computer readable storage medium storing at least one instruction, at least one program, a code set or an instruction set that is loaded and executed by a processor to implement the incremental map optimization method based on three-dimensional Gaussians according to any of claims 1 to 8.
The application has the following beneficial effects:
By judging whether the current frame should be registered as a key frame, the application reduces the number of images that must be processed during map updating. By introducing a three-dimensional Gaussian representation as the basis for real-time localization and mapping of large-scale scenes, it achieves high-fidelity modeling of distant, small objects with complex structures and improves scene mapping quality. The incremental map optimization method runs in parallel with the multi-sensor odometer sub-module and uses its pose tracking results to optimize the map step by step, which reduces redundancy, improves efficiency and accuracy, and achieves high-fidelity, efficient rendering.
In addition to the objects, features and advantages described above, the present application has other objects, features and advantages. The application will be described in further detail with reference to the accompanying drawings.
Detailed Description
Embodiments of the application are described in detail below with reference to the attached drawings, but the application can be implemented in a number of different ways, which are defined and covered by the claims.
Because the present application refers to several different maps and concepts, brief definitions are given here to avoid misunderstanding:
Global lidar map: a comprehensive description of the environment built from lidar data through point cloud registration, feature extraction and similar processing. It offers high precision and rich three-dimensional information, and has important applications in fields such as autonomous driving and robot navigation.
Three-dimensional Gaussian map: a map defined in three-dimensional space that represents data features with Gaussian distributions. It uses Gaussian functions to describe information such as the shape, position and uncertainty of objects in three-dimensional space, and is widely used in many fields.
It should be noted that the method of the present application is mainly applied to the real-time construction of three-dimensional scenes in large-scale water surface environments. As shown in fig. 1, real-time three-dimensional scene construction relies mainly on a joint calibration module and a positioning and real-time mapping module, whose functions are briefly described below:
The joint calibration module performs high-precision joint calibration of the extrinsic parameters of the lidar, the inertial measurement unit and the camera on the hardware platform. It makes full use of the high-resolution visual information provided by the camera, the high-precision range measurements provided by the lidar, and the dynamic information such as acceleration and angular velocity provided by the inertial measurement unit, and eliminates the potential influence of each sensor's errors and drift on system performance. Joint calibration aligns the sensor data in a common coordinate system, which improves the accuracy of data fusion and reduces fusion errors. This improves the overall stability and reliability of the system and supports the subsequent perception, positioning, navigation and decision tasks.
The positioning and real-time mapping module uses multi-sensor information to efficiently determine the position of the unmanned platform in a large-scale water surface environment and to build, in real time, a dense three-dimensional map of the water surface scene with visually realistic content, providing accurate position information and high-quality dense map information for subsequent tasks such as path planning and autonomous control.
Because the method of the present application is mainly implemented in the positioning and real-time mapping module, that module is described in detail here:
The positioning and real-time mapping module comprises a multi-sensor visual odometer sub-module and an incremental map building sub-module.
The multi-sensor odometer sub-module is a robust pose tracking system that combines camera, inertial navigation and radar. The specific flow is shown in fig. 2. The inputs are a radar point cloud map, the current image frame and inertial navigation data. Point cloud SURF feature points are first extracted, and optical flow tracking is used to project the map points in the previous image frame onto the current frame. Scan matching between the radar point cloud and the map is then performed, the radar, camera and inertial navigation residuals are calculated, and the continuous-time trajectory represented by a non-uniform B-spline is optimized. Map points are then marginalized and updated, sensor data for the new time step is input, and the above steps are repeated. Finally, the jointly optimized continuous-time trajectory is output.
Specifically, a global map is built from the radar point cloud, and three-dimensional map points are associated with pixels in the camera image. This removes the need to optimize visual feature depths and allows heterogeneous radar, inertial and camera data to be fused efficiently within a short sliding-window optimization. IMU factors and bias factors are constructed from the raw IMU measurements and a random walk process. A non-uniform B-spline is then used for continuous-time optimization based on the point-to-plane error from the radar point cloud to the map, the reprojection error from the camera image to the map, and the IMU and bias factors, and the estimated pose for the radar point cloud map is obtained as output. During optimization, changes in motion dynamics are taken into account, and control points are added dynamically and adaptively to improve accuracy and real-time performance.
That is, the multi-sensor odometer sub-module processes the input radar point cloud map, current image frame and inertial navigation data, and outputs the estimated pose.
The incremental map building sub-module is based on a three-dimensional Gaussian scene representation. It takes the current-frame pose estimated by the multi-sensor odometer sub-module as input, and gradually builds a complete, high-fidelity and dense three-dimensional scene map through step-by-step optimization.
To solve the problems set forth in the background art, as shown in fig. 3 and fig. 4, the present application provides an incremental map optimization method based on three-dimensional Gaussians. The method is applied to an incremental map building sub-module, wherein the incremental map building sub-module and a multi-sensor visual odometer sub-module together form a positioning and mapping module, and the method comprises the following steps:
S1, acquiring a current frame and an estimated pose, wherein the estimated pose is output by the multi-sensor visual odometer sub-module;
The current frame is the data frame corresponding to the sensor data acquired by the system at the current time, for example a point cloud data frame scanned by the lidar at a given moment or an image data frame captured by the camera at the same moment. The current frame provides real-time environment information that the system uses for decision-making and control.
Pose estimation refers to the process of estimating the position and orientation of an object in space. Data are acquired through various sensors, and a specific algorithm is applied to determine information such as the bearing and angle of the object relative to a reference frame. It has important applications in fields such as robotics, autonomous driving and water surface scene construction. The estimated pose is output by the multi-sensor visual odometer sub-module and input into the incremental map building sub-module for subsequent processing. The way the multi-sensor visual odometer sub-module determines the estimated pose has been described above and is not repeated here.
In the present application, a set of control strategies is carefully designed so that three-dimensional Gaussian splatting is better suited to incremental mapping. At the first time step, the information input from the multiple sensors, for example the first image frame, the estimated pose and the lidar point cloud corresponding to the first frame, serves as the starting point for constructing the three-dimensional Gaussian map and forms the original three-dimensional Gaussian map. The first input image frame is treated as a key frame by default. Therefore, at the first time step, the three-dimensional Gaussians are initialized with all radar points in the lidar point cloud corresponding to the first image frame. Each Gaussian center is placed at the position of the corresponding radar point, and its zero-degree spherical harmonics are initialized with the color attribute of the radar point. Smaller scales are assigned to nearby Gaussians and larger scales to Gaussians far from the imaging plane in order to reduce potential aliasing artifacts, and each Gaussian point is projected onto the image plane with a radius of 2 pixels:
where e is a 3×1 unit vector, d is the depth of the radar point in the image coordinate system, f is the camera focal length, and S is the image plane.
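As a rough check on the 2-pixel projection radius, assume a pinhole camera: a Gaussian with world-space scale s at depth d projects to about f·s/d pixels. Setting that to 2 pixels gives the sketch below. The symbol s (the scale assigned to each axis of the Gaussian) is introduced here for illustration only and is not taken from the text.

```latex
\frac{f\,s}{d} = 2
\quad\Longrightarrow\quad
\mathbf{s} = \frac{2d}{f}\,\mathbf{e}
```

Under this assumption the scale grows linearly with depth, which agrees with the statement that nearby Gaussians receive smaller scales and distant ones larger scales.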
It should be noted that the above steps only generate the original three-dimensional Gaussian map and do not involve any other Gaussian processing.
After the original three-dimensional Gaussian map is generated, it is continuously and incrementally updated and optimized according to the information input from the multiple sensors. It should be noted that each incremental update and optimization of the three-dimensional Gaussian map builds on the result of the previous incremental update and optimization, rather than starting from the original three-dimensional Gaussian map each time.
Step S2, judging whether the current frame is a key frame, and if so, executing step S3;
For an online incremental mapping system, optimizing with every received image is computationally infeasible if map construction is to remain efficient and real-time. At the same time, to obtain a complete map of the environment, the Gaussian scene representation must be able to model the geometry and appearance of newly observed regions. Therefore, the three-dimensional scene representation is optimized only on key frames. Rather than simply using a fixed time step as the key frame criterion, the method takes into account the uncertainty in the motion and steering of the unmanned platform and sets key frames adaptively according to the pixel similarity between the current frame and the last key frame. Specifically:
It is judged whether the pixel similarity between the current frame and the previous key frame is greater than a first threshold. If so, the current frame is registered as a key frame; if not, the current frame is treated as an ordinary input frame and undergoes only simple subsequent processing. The first threshold indicates that the current frame is highly likely to contain a newly added region relative to the previous key frame, so setting the first threshold allows the current frame to be registered as a key frame in time and the incremental map to be updated from it. The specific value of the first threshold may be set according to factors such as the actual site conditions, the sensors or the data processing mode, and is not limited here.
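The key frame decision can be sketched as follows. The text does not define how pixel similarity is measured, so `pixel_similarity` and its tolerance `tol` are hypothetical stand-ins; the comparison direction (similarity greater than the first threshold) follows the wording above.

```python
from typing import Sequence

# A grayscale image as rows of pixel intensities (illustrative representation).
Image = Sequence[Sequence[float]]


def pixel_similarity(a: Image, b: Image, tol: float = 10.0) -> float:
    """Hypothetical score: fraction of pixels whose intensities differ by at most tol."""
    total = 0
    close = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if abs(pa - pb) <= tol:
                close += 1
    return close / total if total else 0.0


def is_key_frame(current: Image, last_key: Image, first_threshold: float) -> bool:
    """Apply the rule as worded in the text: similarity > first threshold."""
    return pixel_similarity(current, last_key) > first_threshold
```

A frame that fails the test would be handled as an ordinary input frame; the first threshold itself is left to the implementer, as the text states.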
Step S3, rendering a contour image based on the estimated pose;
The geometry and appearance of newly observed regions are typically captured in each frame received after the first frame. However, lidar points from different frames may contain duplicate or very similar information. To avoid redundancy, after the current frame is determined to be a key frame, the current camera view angle corresponding to the key frame is first determined from the key frame, and a contour image V is then rendered from the current camera view angle using the estimated pose.
S4, acquiring a historical three-dimensional Gaussian map;
determining pixels corresponding to the newly added region based on the contour image and the historical three-dimensional Gaussian map;
The historical three-dimensional Gaussian map is the three-dimensional Gaussian map obtained by applying the method of the present application to the historical key frames preceding the current key frame. For example, if the current frame is the 5th key frame, the historical three-dimensional Gaussian map is the three-dimensional Gaussian map obtained by applying the method to the first four key frames.
After the contour image V is obtained, a mask M is generated from the contour image V and matched against the three-dimensional Gaussian map constructed so far. Within the mask, the pixels that are unreliable and tend to observe a new region are selected as the pixels corresponding to the newly added region for use in subsequent calculations.
In this matching calculation between the mask and the three-dimensional Gaussian map constructed so far, the pixels whose value in the mask is 0 are selected as the pixels that are unreliable and tend to observe a new region.
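A minimal sketch of the mask step, under two assumptions that the text does not state: the contour value of a pixel is taken to be the front-to-back accumulated opacity of the Gaussians covering it, and the mask is obtained by thresholding that value at a hypothetical `threshold`. Only the final selection of pixels whose mask value is 0 comes directly from the text.

```python
def render_contour_pixel(alphas):
    """Assumed contour value: sum_i alpha_i * prod_{j<i} (1 - alpha_j), front to back."""
    accumulated = 0.0
    transmittance = 1.0
    for a in alphas:
        accumulated += a * transmittance
        transmittance *= 1.0 - a
    return accumulated


def build_mask(contour, threshold=0.99):
    """Mask M: 1 where the existing map already covers the pixel, 0 elsewhere.

    The threshold value is illustrative, not taken from the text.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in contour]


def new_region_pixels(mask):
    """Return (row, col) of pixels whose mask value is 0, as the text specifies."""
    return [(r, c) for r, row in enumerate(mask) for c, m in enumerate(row) if m == 0]
```

For example, a pixel covered by two Gaussians with opacity contributions 0.5 and 0.5 has a contour value of 0.75 under this assumption and would be treated as part of the newly added region at the default threshold.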
Step S5, optimizing the historical three-dimensional Gaussian map based on the pixels corresponding to the newly added region to obtain a first-precision three-dimensional Gaussian map corresponding to the newly added region;
Because this step uses the global lidar point cloud and the radar points newly registered in it, these are described in detail here:
Specifically, in order to use the radar point cloud, which carries a more accurate geometric prior, as the guiding point set for three-dimensional Gaussian splatting instead of a general SfM method, the system of the present application maintains a global lidar map stored in voxels with a resolution of ten meters, which makes the radar point cloud easy to manipulate and access. After receiving an input frame and the corresponding radar point cloud map, the system randomly samples one quarter of the radar points, transforms them into world coordinates and registers them in voxels to improve efficiency, ensuring that the distance between radar points within each voxel is greater than five centimeters to reduce redundancy. Each successfully registered radar point is mapped onto the nearest image, and the pixel value is queried to obtain its color attribute. It should be noted that the input frame mentioned here is not the key frame mentioned above; it can be understood as the current frame at the time the method of the present application is applied for three-dimensional Gaussian optimization.
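The voxel registration described above can be sketched as follows. The ten-meter voxel size, the five-centimeter minimum spacing and the one-quarter sampling ratio come from the text; the class name `VoxelMap`, the dictionary layout and the assumption that points are already in world coordinates are illustrative choices.

```python
import math
import random

VOXEL_SIZE = 10.0    # metres, voxel resolution stated in the text
MIN_SPACING = 0.05   # metres, minimum distance between points in one voxel


def voxel_key(point, size=VOXEL_SIZE):
    """Integer voxel index of a world-coordinate point (x, y, z)."""
    return tuple(math.floor(c / size) for c in point)


class VoxelMap:
    """Hypothetical global lidar map: voxel index -> list of registered points."""

    def __init__(self):
        self.voxels = {}

    def register(self, point):
        """Register a point unless a point in the same voxel lies within MIN_SPACING.

        Returns True if the point was registered.
        """
        bucket = self.voxels.setdefault(voxel_key(point), [])
        for q in bucket:
            if math.dist(point, q) <= MIN_SPACING:
                return False
        bucket.append(point)
        return True


def register_scan(vmap, world_points, fraction=0.25, rng=random):
    """Randomly sample a fraction of the scan and return the newly registered points."""
    k = int(len(world_points) * fraction)
    sampled = rng.sample(world_points, k)
    return [p for p in sampled if vmap.register(p)]
```

The spacing check is limited to points within the same voxel, matching the wording of the text; looking up the color of each registered point in the nearest image is omitted from this sketch.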
After the pixels corresponding to the newly added region are determined, the historical three-dimensional Gaussian map must be optimized, that is, Gaussians corresponding to the newly added region must be added to it. To this end, the pixels are projected into the maintained global lidar point cloud, and the radar points newly registered in the time interval between the current frame (the new key frame) and the previous key frame are accessed. Considering the sparsity of water surface targets, one half of these radar points are randomly sampled to reduce unnecessary memory consumption. Optimizable radar points are determined from the sampled half of the newly registered radar points and the pixels corresponding to the newly added region. Optimizable Gaussian points are then determined from the optimizable radar points, and the historical three-dimensional Gaussian map is optimized according to the optimizable Gaussian points to obtain the first-precision three-dimensional Gaussian map corresponding to the newly added region.
The first-precision three-dimensional Gaussian map corresponding to the newly added region can be understood as a low-precision three-dimensional Gaussian map for that region: because the historical three-dimensional Gaussian map has only been preliminarily optimized, the precision of the map obtained at this point is relatively low. The numerical range of the first precision may be set according to the actual site conditions and is not specifically limited here. For example, a low-precision numerical range may be determined from the pixel resolution of the camera.
The optimizable radar points are determined from the randomly sampled half of the newly registered radar points and the pixels corresponding to the newly added region, specifically as follows:
The randomly sampled half of the newly registered radar points is projected onto the pixels corresponding to the newly added region, and it is judged whether each sampled point can be projected into those pixels. Only the radar points that project onto the selected pixels are used for the subsequent Gaussian initialization, as shown in formula (2).
where V is the contour image, M is the mask, and α is the contribution of each Gaussian point to a given pixel, calculated as the product of the opacity and the probability density function of the three-dimensional Gaussian projected onto the imaging plane.
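The selection of optimizable radar points can be sketched as follows, assuming a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and radar points already transformed into the camera frame with the estimated pose. Neither the camera model nor the function names come from the text.

```python
import math


def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to (row, col), or None if behind the camera."""
    x, y, z = point_cam
    if z <= 0.0:
        return None
    col = math.floor(fx * x / z + cx)
    row = math.floor(fy * y / z + cy)
    return row, col


def select_optimizable_points(points_cam, mask, fx, fy, cx, cy):
    """Keep the radar points that project onto pixels whose mask value is 0."""
    height, width = len(mask), len(mask[0])
    kept = []
    for p in points_cam:
        pixel = project_to_pixel(p, fx, fy, cx, cy)
        if pixel is None:
            continue
        row, col = pixel
        if 0 <= row < height and 0 <= col < width and mask[row][col] == 0:
            kept.append(p)
    return kept
```

Points that fall outside the image, lie behind the camera, or land on pixels already covered by the existing map are discarded, so only the remaining points go on to initialize new Gaussians.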
After the optimizable radar points are determined, they are projected into the world coordinate system to determine the optimizable Gaussian points. The historical three-dimensional Gaussian map is then optimized in the world coordinate system according to the optimizable Gaussian points, which completes the addition of the newly added region and yields the first-precision three-dimensional Gaussian map corresponding to the newly added region. It should be noted that the projection and optimization calculations mentioned in this paragraph are common methods in the field, so they are not described in detail.
S6, acquiring historical key frames, wherein the historical key frames comprise an initial key frame and a previous key frame;
performing loss optimization on the first-precision three-dimensional Gaussian map based on the historical key frames to obtain a second-precision three-dimensional Gaussian map corresponding to the newly added region, wherein the first precision is lower than the second precision. The historical key frames are stored in the multi-sensor visual odometer sub-module.
The second-precision three-dimensional Gaussian map corresponding to the newly added region can be understood as a high-precision three-dimensional Gaussian map for that region: because loss optimization is applied again to the first-precision three-dimensional Gaussian map, the precision of the map obtained at this point is higher. The numerical range of the second precision may be set according to the actual site conditions and is not specifically limited here. For example, a high-precision numerical range may be determined from the presentation requirements of the Gaussian map.
To further bound the computational complexity, improving efficiency while raising the precision of the three-dimensional Gaussians corresponding to the newly added region, K historical key frames, including the initial key frame and the previous key frame, are selected during the mapping of each key frame to optimize the Gaussian map. This avoids catastrophic forgetting and maintains the geometric consistency of the global map. The specific steps are as follows:
Step S701, randomly shuffling the historical key frames, and iteratively minimizing the photometric rendering loss on the first-precision three-dimensional Gaussian map by using the following formula (3) to obtain an optimized three-dimensional Gaussian rendering depth map:
where I and Î are, respectively, the observed (captured) image corresponding to a historical key frame and the image rendered from the Gaussian representation for that historical key frame, L_D-SSIM is a structural similarity term that constrains structural consistency, λ is a manually set weight, and L is the loss;
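Formula (3) itself does not appear in this text. One form consistent with the symbols defined above, assuming the loss follows the weighting commonly used with three-dimensional Gaussian splatting, is sketched below. The L1 term and the (1 − λ) weighting are assumptions and are not named in the text.

```latex
\mathcal{L} = (1-\lambda)\,\mathcal{L}_{1}\bigl(I,\hat{I}\bigr)
            + \lambda\,\mathcal{L}_{\text{D-SSIM}}\bigl(I,\hat{I}\bigr)
```

Here the first term penalizes per-pixel photometric differences between the observed and rendered images, and the second term enforces the structural consistency described above.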
The purpose of iteratively minimizing the photometric rendering loss on the first-precision Gaussian map is to improve rendering quality, optimize the model parameters, enhance robustness and achieve accurate photometric estimation. It reduces the difference between the rendering of the first-precision Gaussian map and the real image, makes the generated image more realistic and natural, and lets the model fit the data better and adapt to a variety of scenes.
Step S702, judging whether the number of calculations of the photometric rendering loss minimization is greater than a first preset number of calculations;
if yes, ending the iterative processing, and taking the optimized three-dimensional Gaussian rendering depth map as the second-precision three-dimensional Gaussian map corresponding to the newly added region;
if not, judging whether the number of calculations of the photometric rendering loss minimization is equal to a second preset number of calculations, wherein the second preset number of calculations is obtained by dividing the first preset number of calculations according to a preset gradient;
if yes, executing step S703;
if not, taking the optimized three-dimensional Gaussian rendering depth map as the first-precision three-dimensional Gaussian map for the next round of iterative processing, and repeating step S701;
Because erroneous displacement or deformation may occur while the first-precision three-dimensional Gaussian map is iteratively optimized through the photometric rendering loss, the accumulated opacity of each pixel is rendered and the pixels are divided into stable and unstable classes. An adaptive densification and pruning strategy is applied to the three-dimensional Gaussians corresponding to the unstable pixels. This promotes finer mapping and compensates for the lidar's inability to cover the whole scene, particularly in unbounded, large-scale water surface environments, and also corrects three-dimensional Gaussians that may have been initialized incorrectly. In this way, continuous-time three-dimensional scene map construction in large water surface scenes is finally achieved.
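The stable/unstable split can be sketched with the two tests given for step S703: a pixel is unstable if its accumulated opacity exceeds the second threshold, or if its rendered depth differs from the laser point cloud depth by more than the third threshold. The per-pixel list layout and the use of `None` for pixels without a lidar depth are illustrative assumptions.

```python
def mark_unstable_pixels(acc_opacity, rendered_depth, lidar_depth,
                         second_threshold, third_threshold):
    """Return (row, col) of unstable pixels, following the two tests stated in the text.

    lidar_depth may hold None where no laser point projects onto the pixel
    (an assumption of this sketch); the depth test is skipped for such pixels.
    """
    unstable = []
    rows = zip(acc_opacity, rendered_depth, lidar_depth)
    for r, (o_row, d_row, l_row) in enumerate(rows):
        for c, (opacity, depth, lidar) in enumerate(zip(o_row, d_row, l_row)):
            depth_error = lidar is not None and abs(depth - lidar) > third_threshold
            if opacity > second_threshold or depth_error:
                unstable.append((r, c))
    return unstable
```

The Gaussians that contribute to the returned pixels are the ones that would then be passed to the adaptive densification and pruning strategy, which is not sketched here.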
The iterative processing that minimizes the photometric rendering loss on the first-precision three-dimensional Gaussian map only finishes after a certain number of calculations. Applying the corresponding processing to the optimized three-dimensional Gaussian rendering depth map immediately after every single iteration is impractical, because it would increase the calculation time and reduce efficiency. Therefore, the number of times this processing is performed is reduced by setting a first preset number of calculations and a second preset number of calculations, wherein the second preset number of calculations is obtained by dividing the first preset number of calculations according to a preset gradient. For example, if the first preset number of calculations is 3000 and the gradient is 100, the second preset number of calculations takes the values 100, 200, 300, and so on in turn.
The specific processing applied to the optimized three-dimensional Gaussian rendering depth map is determined by the first preset number of calculations and the second preset number of calculations, as described in turn below:
If the number of calculations of the minimized photometric rendering loss iterative processing (hereinafter abbreviated as the number of iterative processing calculations) is greater than the first preset number of calculations, the iterative processing is complete. The optimized three-dimensional Gaussian rendering depth map is then taken as the second-precision three-dimensional Gaussian map, which completes the optimization of the three-dimensional Gaussian region corresponding to the newly added region.
If the number of iterative processing calculations is less than or equal to the first preset number of calculations, the optimization of the first-precision three-dimensional Gaussian map is not yet complete. In this case, it is further judged whether the number of iterative processing calculations equals the second preset number of calculations, in order to determine the subsequent processing of the first-precision three-dimensional Gaussian map.
If the number of iterative processing calculations is equal to the second preset number of calculations, enough iterations have accumulated that the probability of erroneous displacement or deformation in the optimized three-dimensional Gaussian rendering depth map has increased considerably, and step S703 is executed.
If the number of iterative processing calculations is not equal to the second preset number of calculations, the iterative processing continues, and the photometric rendering loss on the first-precision three-dimensional Gaussian map is again minimized using formula (3).
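The branch logic of steps S701 to S703 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: `optimize_step` and `refine_unstable` are hypothetical callables standing in for one iteration of step S701 and for step S703, and the second preset numbers of calculations are assumed to be the multiples of the gradient up to the first preset number of calculations.

```python
def run_incremental_optimization(first_preset, gradient, optimize_step, refine_unstable):
    """Run the S701-S703 loop and return the final number of calculations.

    optimize_step and refine_unstable are hypothetical placeholders for one
    photometric-loss iteration (S701) and the unstable-pixel processing (S703).
    """
    count = 0
    while True:
        optimize_step()              # S701: one minimization iteration
        count += 1
        if count > first_preset:     # S702: iteration budget exceeded, stop
            return count
        if count % gradient == 0:    # assumed form of the second preset number
            refine_unstable()        # S703: densify and trim unstable regions
        # otherwise fall through and repeat S701


# Usage with the example values from the text (3000 calculations, gradient 100).
calls = {"optimize": 0, "refine": 0}


def _count_optimize():
    calls["optimize"] += 1


def _count_refine():
    calls["refine"] += 1


total = run_incremental_optimization(3000, 100, _count_optimize, _count_refine)
```

With these values the unstable-pixel processing runs 30 times rather than after every one of the iterations, which is the saving the two preset numbers are meant to provide.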
Step S703, obtaining the optimized three-dimensional Gaussian rendering depth map;
Calculating the accumulated opacity for each pixel point in the optimized three-dimensional Gaussian rendering depth map;
If the accumulated opacity of any pixel point is smaller than a second threshold value, or the rendering depth corresponding to that pixel point differs from the laser point cloud depth, marking the pixel point as an unstable pixel;
Determining the three-dimensional Gaussian map corresponding to the unstable pixels based on the unstable pixels;
Processing the three-dimensional Gaussian map corresponding to the unstable pixels by using an adaptive densification and trimming strategy to obtain a processed first optimization loss Gaussian map;
Taking the processed first optimization loss Gaussian map as the first-precision three-dimensional Gaussian map for the next round of minimized photometric rendering loss iterative processing, and repeating steps S701-S703.
The region formed by the unstable pixels most likely captures new scene content, that is, part of the newly added region may have been missed. In addition, the adaptive control used during optimization with formula (3) can be unstable and may generate new Gaussians, which degrades mapping quality. Therefore, in order to improve the accuracy of the three-dimensional Gaussian map corresponding to the newly added region, the three-dimensional Gaussian region corresponding to the unstable pixels needs to be optimized. The procedure is as follows:
After the optimized three-dimensional Gaussian rendering depth map is obtained, the accumulated opacity is first calculated for each pixel point in it; in form, the accumulated opacity can be regarded as equivalent to the contour image. The accumulated opacity of each pixel point is then judged. If the accumulated opacity of any pixel point is smaller than the second threshold value, or the difference between the rendering depth corresponding to that pixel point and the laser point cloud depth is greater than a third threshold value, the pixel point is marked as an unstable pixel. The unstable pixels are back-projected into the world coordinate system to generate the three-dimensional Gaussian map corresponding to the unstable pixels, and this map is further processed by using the adaptive densification and trimming strategy to obtain the processed first optimization loss Gaussian map. The specific ranges of the second threshold value and the third threshold value are not particularly limited herein; they depend on the conditions on site or are set by the on-site technician.
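The unstable-pixel test and the back-projection can be sketched with NumPy as below. This is only an illustration under assumptions that are not in the original text: the opacity condition follows step S703 (accumulated opacity below the second threshold value), pixels without a laser return are encoded as NaN and skipped in the depth test, the camera is a pinhole model with intrinsic matrix `K`, and `T_wc` is a 4x4 camera-to-world pose. The function and parameter names are hypothetical.

```python
import numpy as np


def unstable_pixel_mask(acc_opacity, render_depth, lidar_depth,
                        opacity_thresh, depth_thresh):
    """Boolean mask of unstable pixels.

    opacity_thresh and depth_thresh play the role of the second and third
    threshold values. NaN in lidar_depth means "no laser return" (assumption).
    """
    valid = np.isfinite(lidar_depth)
    depth_err = np.zeros(render_depth.shape, dtype=float)
    depth_err[valid] = np.abs(render_depth[valid] - lidar_depth[valid])
    return (acc_opacity < opacity_thresh) | (depth_err > depth_thresh)


def backproject_pixels(mask, depth, K, T_wc):
    """Back-project masked pixels to world coordinates (pinhole model).

    Returns an (N, 3) array of candidate centers for new Gaussians.
    """
    v, u = np.nonzero(mask)                 # rows are v, columns are u
    d = depth[v, u]
    x = (u - K[0, 2]) / K[0, 0] * d         # camera-frame X
    y = (v - K[1, 2]) / K[1, 1] * d         # camera-frame Y
    pts_cam = np.stack([x, y, d], axis=1)
    return pts_cam @ T_wc[:3, :3].T + T_wc[:3, 3]


# Small usage example with made-up values.
acc = np.array([[0.99, 0.40], [0.95, 0.97]])
rendered = np.array([[2.0, 2.0], [5.0, 3.0]])
lidar = np.array([[2.05, np.nan], [3.0, np.nan]])
mask = unstable_pixel_mask(acc, rendered, lidar, opacity_thresh=0.9, depth_thresh=0.5)
```

Which depth value is used for back-projection (rendered depth or laser depth) is left to the caller here, since the text does not fix it.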
The adaptive densification and trimming strategy comprises a densification process and a trimming process. The densification process densifies the three-dimensional Gaussian map corresponding to the unstable pixels. The trimming process is performed after the new Gaussians have been added: all Gaussian points in the current three-dimensional Gaussian map that are observable from the current camera view frustum are checked, and the opacity of Gaussian points far from the scene surface is explicitly reduced, so that these points are automatically removed at regular intervals in the subsequent optimization. The processed first optimization loss Gaussian map is thereby obtained.
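The trimming process can be illustrated with the following NumPy sketch. Several choices here are assumptions rather than details from the text: distance from the scene surface is measured as the absolute difference between a Gaussian's camera-frame depth and a reference depth at its projected pixel, `T_cw` is a 4x4 world-to-camera pose, the camera is a pinhole model, occlusion is ignored, and the opacity is reduced by a fixed factor `decay`. All names are hypothetical.

```python
import numpy as np


def decay_far_gaussians(centers_w, opacities, T_cw, K, surface_depth,
                        margin, decay=0.5):
    """Reduce the opacity of Gaussians inside the view frustum that lie far
    from the reference surface depth; all other Gaussians are left unchanged.
    """
    h, w = surface_depth.shape
    pts_c = centers_w @ T_cw[:3, :3].T + T_cw[:3, 3]   # world -> camera
    z = pts_c[:, 2]
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)                # avoid division by zero
    u = np.round(K[0, 0] * pts_c[:, 0] / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_c[:, 1] / safe_z + K[1, 2]).astype(int)
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    ref = surface_depth[v[visible], u[visible]]        # depth at projected pixel
    far = np.abs(z[visible] - ref) > margin

    new_opacities = opacities.astype(float).copy()
    new_opacities[np.flatnonzero(visible)[far]] *= decay
    return new_opacities


# Usage example with made-up values: only the second Gaussian is visible
# and far from the surface, so only its opacity is reduced.
K_demo = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 2.0], [0.0, 0.0, 1.0]])
centers_demo = np.array([[0.0, 0.0, 2.05],    # on the surface
                         [0.0, 0.0, 1.0],     # floating in front of it
                         [0.0, 0.0, -1.0],    # behind the camera
                         [10.0, 0.0, 2.0]])   # outside the image
trimmed = decay_far_gaussians(centers_demo, np.full(4, 0.8), np.eye(4),
                              K_demo, np.full((4, 4), 2.0), margin=0.2)
```

Lowering the opacity instead of deleting the Gaussians outright matches the description above: the actual removal is left to the regular removal of low-opacity Gaussians in the subsequent optimization.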
After the processed first optimization loss Gaussian map is obtained, the iterative processing is still not complete. The processed first optimization loss Gaussian map is therefore taken as the first-precision three-dimensional Gaussian map for the next round of minimized photometric rendering loss iterative processing, and steps S701-S703 are repeated. The iterative processing thus continues, and the first-precision three-dimensional Gaussian map is optimized periodically.
In summary, the present application reduces the amount of image computation during map updating by judging whether the current frame is a key frame and registering it as one. It improves scene map construction quality by introducing a three-dimensional Gaussian representation as the basis of real-time positioning and mapping for large-scale scene modeling, targeting high-fidelity modeling of distant, small objects with complex structures. Running in parallel with the multi-sensor visual odometer sub-module and using its pose tracking result, the incremental map optimization method optimizes the map step by step, which reduces redundancy, improves efficiency and accuracy, and achieves high-fidelity, efficient rendering.
In some embodiments, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to implement the incremental map optimization method based on three-dimensional Gaussian provided in the first aspect.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and apparatus disclosed above, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have a plurality of functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer-readable storage media (or non-transitory media) and communication media (or transitory media).
The term computer-readable storage medium includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data, as known to those skilled in the art. Computer-readable storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.
The computer-readable storage medium may be an internal storage unit of the network management device according to the foregoing embodiment, for example, a hard disk or a memory of the network management device. The computer-readable storage medium may also be an external storage device of the network management device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, or the like provided on the network management device.
In some embodiments, an apparatus is provided that includes a processor and a memory. The memory is used for storing a computer program, and the processor is used for executing the computer program and, when executing it, implementing the incremental map optimization method based on three-dimensional Gaussian provided in the first aspect of the application.
It should be appreciated that the processor may be a Central Processing Unit (CPU), or it may be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The above description covers only the preferred embodiments of the present application and is not intended to limit the present application; those skilled in the art can make various modifications and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application shall be included in the protection scope of the present application.