Background
Texture mapping is an important component of three-dimensional model reconstruction. The quality and resolution of the texture have a critical impact on the perceived realism of the three-dimensional model, and different three-dimensional point cloud data acquisition methods require correspondingly different texture mapping methods. Existing methods for acquiring three-dimensional point cloud data of an object are divided into two types according to whether the instrument contacts the measured object: contact methods and non-contact methods. A contact method uses a probe that directly touches the target object and obtains the three-dimensional coordinates of the object from the information returned by the probe; it focuses on obtaining the three-dimensional shape of the target object and does not capture surface texture information. Non-contact methods comprise the laser scanning method, the structured light measurement method, the binocular stereo vision method and the like; they obtain three-dimensional point cloud data on the basis of optical technology, and the realism of the model can be improved through texture mapping.
The laser scanning method is an active depth measurement method that uses infrared light and calculates depth from the detected time of flight. The three-dimensional reconstruction model obtained by this method carries no texture information, so texture images must be acquired independently for texture mapping. Texture mapping is realized by matching high-contrast point or line features of the texture image with edge features of the three-dimensional reconstruction model.
The structured light measurement method uses structured light field projection and calculates depth by detecting changes in phase and light intensity. Texture mapping of such a three-dimensional reconstruction model still relies on a color camera to shoot a texture image, and the color camera is usually calibrated to find the matching relationship between the texture image and the three-dimensional reconstruction model. Because the structured light measurement method projects light spots onto the surface of the object, the color camera synchronously shoots the object with the projected light spots, so that the pose of the color camera relative to the target object can be calculated and texture mapping realized through the camera model. The binocular stereo vision method adopts two cameras to simulate human eyes, and depth information is obtained through algorithmic reconstruction. Compared with the other two methods, it has the characteristics of low cost, high efficiency, strong flexibility and simple required equipment, and has high practical value in the field of three-dimensional reconstruction.
The binocular stereo vision method recovers three-dimensional information from images, and the reconstructed three-dimensional model uses the binocular camera images for texture mapping. The mapping from the texture image to the three-dimensional reconstruction model can be realized by selecting feature points of the binocular image and applying the correspondence formed during reconstruction.
The binocular stereo vision method can thus realize texture mapping by selecting feature points of the binocular image and applying the correspondence formed in the reconstruction process. However, this requires manual selection of feature points in the binocular image and is limited by the resolution of the binocular camera. When the framework of the three-dimensional model is sparse but high local detail is required, a binocular image mapping method can hardly achieve the required resolution.
Disclosure of Invention
The invention aims to provide a multi-view high-resolution texture image and binocular three-dimensional point cloud mapping method that addresses the above technical defects in the prior art. A high-resolution texture camera independently collects multi-view texture images, the multi-view high-resolution texture images are fused with the three-dimensional point cloud data, and the visual resolution of the point cloud model reconstructed by binocular stereoscopic vision is thereby improved through the mapping of the multi-view high-resolution texture images.
The technical scheme adopted for realizing the purpose of the invention is as follows:
a multi-view high-resolution texture image and binocular three-dimensional point cloud mapping method comprises the following steps:
performing three-dimensional reconstruction by using the obtained stereo image pair and the calibration parameters of the stereoscopic vision pair to obtain a block matching disparity map, reconstructing a three-dimensional point cloud based on the block matching disparity map, and encapsulating the three-dimensional point cloud into a triangular patch model, wherein the vertices of the triangular patch model are the three-dimensional points of the three-dimensional point cloud data;
determining the overlapping parts of the multi-view texture images according to the correspondence between texture image pixel coordinates and the three-dimensional points of the triangular patch model, and introducing a guideline into the overlapping parts so that each three-dimensional point selects a unique texture image for its assigned mapping, realizing a one-to-one mapping of three-dimensional points;
and mapping the multi-view high-resolution texture images onto the triangular patch model through the partition formed by the boundary line and the obtained correspondence between texture image pixel coordinates and the three-dimensional points of the triangular patch model.
1. Three-dimensional point cloud model and multi-view high-resolution texture image acquisition
A three-camera system is built. The left and right cameras shoot images of the object for binocular stereoscopic vision three-dimensional reconstruction.
Calibrating the binocular system finally outputs the intrinsic parameter matrices M1 and M2 of the binocular cameras, the distortion vectors D1 and D2, and the rotation matrix R and translation vector T between the binocular cameras. Wherein the intrinsic matrix has the form

M = [ fx   0   cx
       0  fy   cy
       0   0    1 ]

cx and cy characterize the offset of the camera optical axis in the image coordinate system, in pixels. fx and fy are the focal lengths in the x and y directions, i.e. the magnifications in the x and y directions.
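As an illustration, the pinhole projection encoded by these intrinsic parameters can be sketched as follows; the numeric values are placeholders, not the calibration results of this system:

```python
def project(point_cam, fx, fy, cx, cy):
    """Project a 3-D point given in camera coordinates to pixel
    coordinates via the pinhole model encoded by fx, fy, cx, cy."""
    X, Y, Z = point_cam
    return fx * X / Z + cx, fy * Y / Z + cy

# Illustrative values: a point 2 m in front of the camera, 0.1 m to the right.
x, y = project((0.1, 0.0, 2.0), fx=7000.0, fy=7000.0, cx=960.0, cy=540.0)
```

A point on the optical axis maps to (cx, cy); off-axis points are scaled by the focal length in pixels.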
D = [ k1  k2  p1  p2  k3 ]

k1, k2 and k3 characterize the radial distortion of the image caused by camera lens manufacturing errors; p1 and p2 characterize the tangential distortion of the imaging caused by lens mounting errors.
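The effect of the distortion vector can be sketched with the standard Brown radial-tangential model; this is a minimal illustration and the coefficient values used below are invented:

```python
def distort(x_n, y_n, k1, k2, p1, p2, k3):
    """Apply the distortion model D = [k1 k2 p1 p2 k3] to normalized
    image coordinates (x_n, y_n): k1..k3 are radial terms, p1, p2 tangential."""
    r2 = x_n * x_n + y_n * y_n
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x_n * radial + 2 * p1 * x_n * y_n + p2 * (r2 + 2 * x_n * x_n)
    y_d = y_n * radial + p1 * (r2 + 2 * y_n * y_n) + 2 * p2 * x_n * y_n
    return x_d, y_d

# With all coefficients zero the model reduces to the identity:
assert distort(0.3, -0.2, 0, 0, 0, 0, 0) == (0.3, -0.2)
```

A negative k1 (barrel distortion) pulls off-axis points toward the image center.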
The rotation matrix R describes the relative rotation between the two cameras, the rotation angles about the three axes being ψ, φ and θ respectively.

T = [ Tx  Ty  Tz ]
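A minimal sketch of composing R from the three rotation angles; the composition order Rz·Ry·Rx and the axis assignment (ψ, φ, θ about x, y, z) are assumptions, since the text only names the angles:

```python
import math

def rotation_matrix(psi, phi, theta):
    """Relative rotation between the two cameras as Rz(theta) @ Ry(phi) @ Rx(psi).
    The composition order is an assumption for illustration."""
    cx_, sx = math.cos(psi), math.sin(psi)
    cy_, sy = math.cos(phi), math.sin(phi)
    cz, sz = math.cos(theta), math.sin(theta)
    Rx = [[1, 0, 0], [0, cx_, -sx], [0, sx, cx_]]
    Ry = [[cy_, 0, sy], [0, 1, 0], [-sy, 0, cy_]]
    Rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    return matmul(Rz, matmul(Ry, Rx))

R = rotation_matrix(0.0, 0.0, math.pi / 2)  # 90 degrees about the z axis
```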
Three-dimensional reconstruction is then performed using the obtained binocular images and system parameters.
Firstly, the binocular images are stereo-rectified, and stereo matching based on block matching is applied to the rectified image pair to obtain a block matching disparity map with pixel precision. For convenience of explanation, the block matching disparity map is taken to be based on the binocular left image; that is, the coordinates of any pixel in the block matching disparity map are the coordinates of that pixel in the binocular left image, and the gray value of the pixel is the disparity value d of that point between the binocular left and right images. The same reasoning applies to a block matching disparity map based on the binocular right image.
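The idea of block matching along a rectified row can be sketched as follows; this is a toy one-dimensional SAD matcher on synthetic data, not the production stereo matcher:

```python
def block_match_row(left, right, x, half, max_disp):
    """Find the disparity of pixel x in a rectified row pair by minimizing
    the sum of absolute differences (SAD) over a window of 2*half+1 pixels."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - half - d < 0:      # window would leave the right image
            break
        cost = sum(abs(left[x + i] - right[x + i - d])
                   for i in range(-half, half + 1))
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

# Synthetic rectified rows: the right row is the left row shifted left by 3 px.
left = [0, 0, 0, 9, 7, 5, 0, 0, 0, 0, 0, 0]
right = [9, 7, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0]
d = block_match_row(left, right, 4, 1, 5)
```

Repeating this for every pixel of the left image yields the pixel-precision disparity map.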
Given the two-dimensional coordinates (x, y) of a pixel in the binocular left image and the disparity d associated with it, this point can be projected into three dimensions through the reprojection matrix Q:

Q · [x  y  d  1]ᵀ = [X  Y  Z  W]ᵀ      (1)

wherein:

Q = [ 1   0   0      -cx
      0   1   0      -cy
      0   0   0        f
      0   0  -1/Tx     0 ]

A three-dimensional point cloud is reconstructed according to the Q matrix and the block matching disparity map; the coordinates of a three-dimensional point are (X/W, Y/W, Z/W), where W is independent of the two-dimensional coordinates (x, y). Here f is the focal length of the rectified left camera in pixels, (cx, cy) is its principal point, and Tx is the x component of the translation vector pointing from the right camera to the left camera. To perform texture mapping, the three-dimensional point cloud must be encapsulated into a triangular patch model, whose vertices are the three-dimensional points of the three-dimensional point cloud data.
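One simple way to encapsulate an organized point cloud into a triangular patch model is to split each quad of neighbouring disparity-map pixels into two triangles; the patent does not fix the triangulation scheme, so this is only an illustrative sketch:

```python
def triangulate_grid(rows, cols, valid):
    """Encapsulate an organized point cloud (one 3-D point per disparity-map
    pixel) into triangular patches: each quad of four neighbouring valid
    points is split into two triangles whose entries are point indices."""
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            a, b = r * cols + c, r * cols + c + 1          # top edge of quad
            cc, dd = (r + 1) * cols + c, (r + 1) * cols + c + 1  # bottom edge
            if valid[a] and valid[b] and valid[cc] and valid[dd]:
                triangles.append((a, b, cc))
                triangles.append((b, dd, cc))
    return triangles

# A fully valid 3 x 3 grid has 4 quads, hence 8 triangular patches.
tris = triangulate_grid(3, 3, [True] * 9)
```

Pixels with no valid disparity simply leave holes in the patch model.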
The multi-view texture shooting principle is shown schematically in fig. 2. The left and right cameras are the binocular cameras, the texture camera is positioned between them, and the three cameras are at the same distance from the object. The binocular cameras shoot the whole object, while the texture camera, using a relatively long focal length, shoots local high-resolution images of the object. As shown in fig. 2, the texture camera shoots four local images of the object: its small field of view guarantees the high resolution of each local image, and the local images are captured by rotating and pitching the texture camera up, down, left and right, finally yielding the multi-view high-resolution texture images.
2. Finding corresponding relation between high-resolution texture image and three-dimensional model
To map the texture image onto the three-dimensional point cloud model, the key is to find the correspondence between the two. Firstly, the correspondence between the texture image and the binocular left image is sought. The two images are matched as follows: extract image feature points, match the feature points, solve the homography matrix, and then transform into the same coordinate system. The homography matrix H describes the transformation between two images taken from different viewing angles; with the pixel coordinates of the two images denoted (x, y) and (x', y') respectively, the correspondence is:

s · [x'  y'  1]ᵀ = H · [x  y  1]ᵀ      (2)

wherein:

H = [ h11  h12  h13
      h21  h22  h23
      h31  h32  h33 ]

The homography matrix H consists of 9 elements with 8 degrees of freedom (s is a scale factor). After feature point extraction and matching are completed, H can be solved from 4 pairs of matching points. Once the homography matrix is solved, the correspondence between the texture image and the binocular left image is obtained through formula (2), and the data matching relation between the binocular left image and the point cloud model is given by formula (1). Let the coordinates of a three-dimensional point of the point cloud model in the world coordinate system be (X, Y, Z) and the coordinates of the corresponding pixel point in the texture image be (x', y'); the correspondence is then:

s · [x'  y'  1]ᵀ = H · [ f·X/Z + cx   f·Y/Z + cy   1 ]ᵀ      (3)

Formula (3) describes the correspondence between texture image pixel coordinates and the three-dimensional points of the point cloud model, wherein f is the focal length of the left camera after stereo correction, in pixels.
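Solving H from 4 matching point pairs can be sketched with the direct linear transformation, fixing the last element of H to 1; this is an illustrative implementation, not necessarily the one used in the experiments:

```python
import numpy as np

def solve_homography(src, dst):
    """Solve the 8-DOF homography H (h33 fixed to 1) from four point
    correspondences by direct linear transformation."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b += [xp, yp]
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, x, y):
    v = H @ np.array([x, y, 1.0])
    return v[0] / v[2], v[1] / v[2]   # divide out the scale factor s

# Four corners of a unit square mapped to a translated, scaled square:
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 20), (12, 20), (12, 22), (10, 22)]
H = solve_homography(src, dst)
xp, yp = apply_homography(H, 0.5, 0.5)
```

With more than four (noisy) matches one would use a least-squares or RANSAC fit instead.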
3. Mapping of the multi-view high-resolution texture images onto the point cloud model
The multi-view texture images are shot with overlapping parts in order to cover the whole field of view. Since each three-dimensional point of the point cloud model can correspond to only one image coordinate in the texture mapping, the overlapping parts of the images must be fused. The correspondence between each texture image and the three-dimensional model is found one by one using the method of Section 2. After the multi-view texture images have each been feature-matched with the binocular left image, their positions on the binocular left image are determined, and the overlapping parts of the multi-view texture images are determined accordingly.
Guidelines are introduced in the overlapping parts of the multi-view texture images; a guideline on the two-dimensional image appears as a boundary line on the three-dimensional point cloud model, as shown in fig. 4. The boundary line ensures that each three-dimensional point selects a unique texture image for its assigned mapping: three-dimensional points on the left side of the boundary map to one texture image, and three-dimensional points on the right side map to the other texture image containing the overlapped part, realizing a one-to-one mapping of three-dimensional points.
And mapping the multi-view high-resolution texture image to the point cloud model through the partition of the boundary and the obtained corresponding relation between the pixel coordinates of the texture image and the three-dimensional points of the point cloud model.
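The partition induced by the boundary can be sketched as follows, assuming for simplicity a vertical guideline at a fixed x coordinate on the left image; the `boundary_x` value is hypothetical and purely illustrative:

```python
def assign_texture(points_px, boundary_x):
    """Partition three-dimensional points by the guideline: a point whose
    projection onto the left image lies left of x = boundary_x maps to
    texture image 0, otherwise to texture image 1."""
    return [0 if x < boundary_x else 1 for (x, y) in points_px]

# Projections of five model points onto the left image (pixel coordinates):
pts = [(100, 50), (300, 80), (449, 10), (450, 10), (600, 40)]
labels = assign_texture(pts, boundary_x=450)
```

A general guideline would be a polyline, with the side test replaced by a point-in-region check.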
Compared with the prior art, the invention has the following advantages:
(1) Texture mapping is carried out on the three-dimensional model using high-resolution texture images, which can effectively improve the visual effect of the three-dimensional model when its skeleton is sparse. In addition, when high local resolution of the model is required, the high-resolution images compensate for the detail information of the model by fusing the high-resolution image data with the three-dimensional model data.
(2) Two-dimensional feature matching is combined with binocular stereo vision three-dimensional reconstruction to provide a new texture mapping method. The principle of the method is simple and clear, and compared with the conventional mapping method based on 3D-2D matching it is more convenient and stable.
(3) In mapping the multi-view texture images with the new texture mapping method, a guideline partition mapping method is provided for the data redundancy in the overlapping parts of the multi-view texture images; the point cloud data are mapped by partition, and the proposed texture mapping method is thereby successfully extended to larger scenes.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The method uses the correspondence between the point cloud model reconstructed by binocular stereoscopic vision and the binocular images as a bridge: two-dimensional feature matching between the texture image and the binocular image yields the correspondence between the texture image and the point cloud model. This correspondence is found for each of the multi-view high-resolution texture images, and texture fusion of the image overlapping parts is completed by the guideline partition mapping method, thereby realizing multi-view high-resolution image texture mapping.
The method specifically comprises: obtaining the three-dimensional point cloud model and the multi-view high-resolution texture images, finding the correspondence between the high-resolution texture images and the three-dimensional model, and mapping the multi-view high-resolution texture images onto the point cloud model.
As shown in fig. 1, the system of the present invention includes a third camera 101, a first camera 201 and a second camera 202. The first camera 201 and the second camera 202 form a stereoscopic vision pair and capture view 1 and view 2, which form a stereo image pair; the third camera 101 shoots local texture images of the object using a long focal length, four views 3 being obtained in the simulation experiment. View 3 has a higher resolution than view 1. Specifically, the first and second cameras are Canon 5D Mark III bodies with the lens focal length set to 70 mm and a binocular baseline T of 560 mm; the third camera is a Canon 5D Mark IV with the lens focal length set to 135 mm; the shooting distance is 2500 mm.
The specific implementation comprises the following specific steps:
and S1, calibrating system parameters of the stereoscopic vision pair, and calibrating the camera by adopting a two-dimensional calibration method proposed by Zhangzhen.
And S2, after calibration is finished, performing three-dimensional reconstruction by using the obtained stereo image pair and the obtained system parameters. And carrying out stereo correction on the stereo image pair by using a Bouguet algorithm, and carrying out stereo matching on the corrected image by using an SGBM algorithm to obtain a block matching disparity map with pixel precision. For convenience of explanation, the block matching disparity map is considered here based on view 1, but the principles of the present invention are equally applicable to the block matching disparity map based on view 2.
S3, given the two-dimensional coordinates (x, y) of view 1 and the disparity d associated with them, the point is projected into three dimensions through the Q matrix; the three-dimensional point cloud is reconstructed according to the Q matrix and the block matching disparity map and encapsulated into a point cloud model. The Q matrix is as follows:

Q = [ 1   0   0      -cx
      0   1   0      -cy
      0   0   0        f
      0   0  -1/Tx     0 ]
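The reprojection of step S3 can be sketched numerically; the parameter values below are illustrative rather than the calibration results of this system, and the sign convention for Tx follows one common choice:

```python
import numpy as np

# Illustrative rectified parameters: focal length and principal point in
# pixels, Tx as the x component of the right-to-left translation, in mm.
f, cx, cy, Tx = 7000.0, 960.0, 540.0, -560.0

Q = np.array([[1, 0, 0, -cx],
              [0, 1, 0, -cy],
              [0, 0, 0, f],
              [0, 0, -1 / Tx, 0]])

def reproject(x, y, d):
    """Project a disparity-map pixel (x, y, d) to 3-D via the Q matrix."""
    X, Y, Z, W = Q @ np.array([x, y, d, 1.0])
    return X / W, Y / W, Z / W

# The principal point at disparity 100 px lands on the optical axis,
# at depth f * baseline / d.
X, Y, Z = reproject(960.0, 540.0, 100.0)
```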
s4, searching a corresponding relation between the point cloud model and the view 3;
(1) the view 1 and the point cloud model have a relationship shown in formula (1), and the view 1 is used as a matching intermediate bridge.
(2) The correspondence between view 1 and view 3 is found by two-dimensional feature matching. According to the characteristics of the pictures of view 1 and view 3, SIFT feature points are selected for matching, the homography matrix H is solved, and the pixels are brought into the same coordinate system by perspective transformation.
(3) The matching relation between view 3 and view 1 is obtained from formula (2), and the data matching relation between view 1 and the point cloud model is given by formula (1). Let the coordinates of a three-dimensional point of the point cloud model in the world coordinate system be (X, Y, Z) and the coordinates of the corresponding pixel point in view 3 be (x', y'); the correspondence is then:

s · [x'  y'  1]ᵀ = H · [ f·X/Z + cx   f·Y/Z + cy   1 ]ᵀ
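Composing the two relations, the mapping from a model point to a view-3 pixel can be sketched as follows; the values are illustrative, and the identity homography is used only to make the example checkable:

```python
def model_point_to_texture(X, Y, Z, H, f, cx, cy):
    """Map a point-cloud point to view-3 pixel coordinates: project it onto
    the rectified view 1 with focal length f (pixels), then warp through the
    homography H found by feature matching."""
    u = f * X / Z + cx            # projection onto view 1
    v = f * Y / Z + cy
    w = [H[i][0] * u + H[i][1] * v + H[i][2] for i in range(3)]
    return w[0] / w[2], w[1] / w[2]   # texture pixel after the homography

# With the identity homography the texture pixel equals the view-1 pixel:
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
xp, yp = model_point_to_texture(0.1, 0.0, 2.0, I, 7000.0, 960.0, 540.0)
```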
and S5, repeating the step 6 for the four views 3 shot by the camera 101, and finding the corresponding relation between the pixel coordinates of the four views 3 and the point cloud model. In S4(2), four homography matrices are recorded as follows:
Finally, the overlapping parts of the four views 3 are obtained, and guidelines are suitably selected on view 1 according to the positional relation between the multiple views 3 and view 1; under the action of the reprojection matrix Q, the guidelines partition the three-dimensional points of the point cloud model. The guidelines are selected in the overlapping parts of the multi-view images according to the result of the four image matchings, and they are placed on view 1 because view 1 can be linked directly to the point cloud model through the reprojection matrix Q.
And mapping the multi-view high-resolution texture image to the point cloud model according to the partition information obtained in the step S5 and the corresponding relation between the pixel coordinates of the four views 3 and the point cloud model. The visual resolution of the final point cloud model is higher than that of a point cloud model which only adopts a binocular left image for texture mapping.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements shall also fall within the protection scope of the present invention.