
CN106570903A - Visual identification and positioning method based on RGB-D camera - Google Patents

Visual identification and positioning method based on RGB-D camera

Info

Publication number
CN106570903A
Authority
CN
China
Prior art keywords
point
plane
potential
point cloud
rgb
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610894251.8A
Other languages
Chinese (zh)
Other versions
CN106570903B (en)
Inventor
张智军
张文康
黄永前
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201610894251.8A priority Critical patent/CN106570903B/en
Publication of CN106570903A publication Critical patent/CN106570903A/en
Application granted granted Critical
Publication of CN106570903B publication Critical patent/CN106570903B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides a visual recognition and positioning method based on an RGB-D camera, comprising the following steps: 1) acquire color and depth images with a Microsoft Kinect camera sensor, convert them into a three-dimensional point cloud, and extract the planes in the scene; 2) after the planes from step 1) are extracted and removed, extract and segment objects from the remaining point cloud; 3) recognize and match each object point cloud obtained in step 2); 4) compute the position of each object from the point clouds obtained in step 2). The method performs object recognition and positioning on three-dimensional point cloud images captured by Microsoft's RGB-D sensor Kinect II, and involves no complex operations such as matching between multiple images during positioning, which greatly improves computational efficiency; it is highly real-time and well suited to the complex environments of daily life.

Description

A visual recognition and positioning method based on an RGB-D camera

Technical Field

The invention relates to the field of machine vision recognition and positioning, and in particular to a visual recognition and positioning method based on an RGB-D camera.

Background Art

At present, most existing object recognition and positioning systems based on multi-camera color imaging obtain the spatial position of each pixel by stereo-matching images collected by different sensors, and suffer from high cost, slow operation, and system complexity.

Object edge segmentation is mostly implemented by extracting convex hulls from color-camera images. This approach must take the object's surface color into account and easily misjudges when the background color resembles the object; convex hull extraction also produces erroneous hull outlines and hulls that include parts of the background.

Compared with existing object recognition and positioning methods based on multi-camera color imaging, using an RGB-D sensor for object recognition and positioning has many advantages:

First, the computational load is small, the operation is fast and highly real-time, and object positioning is inexpensive. Microsoft's RGB-D sensor Kinect II reduces the cost of 3D scanning by directly providing users with high-resolution color, depth, and point cloud images. With a single RGB-D sensor, the position of each pixel in the camera coordinate system is obtained directly, without stereo-matching images collected by the different sensors of a multi-camera system;

Second, accuracy and robustness are improved. Based on the depth and point cloud images provided by the RGB-D camera, plane extraction and object segmentation and positioning can be performed directly, effectively avoiding the influence of the object's own appearance and the background color, reducing misjudgments and improving the accuracy and stability of the system.

Summary of the Invention

The purpose of the present invention is to address the above shortcomings of the prior art by providing a visual recognition and positioning method based on an RGB-D camera that requires little computation, is highly real-time, and adapts to everyday scenes.

The purpose of the present invention can be achieved through the following technical solution:

A visual recognition and positioning method based on an RGB-D camera, the method comprising the following steps:

1) Acquire color and depth images of the objects with a Microsoft Kinect camera sensor and convert them into a three-dimensional point cloud image;

2) Compute the corresponding normal vector for each point of the three-dimensional point cloud image obtained in step 1);

3) Apply a region growing algorithm to the set of normal vectors obtained in step 2) to extract the background plane on which the objects are placed;

4) Remove the points of the background plane extracted in step 3), and perform object point cloud extraction and convex hull extraction on the remaining point cloud;

5) Combine each object point cloud extracted in step 4) with its corresponding convex hull and perform a second region growing to segment the complete contour of each object and extract its complete point set;

6) According to the complete contour of each object obtained in step 5), extract the corresponding color image regions and perform feature extraction and matching recognition on each;

7) Average the points within the complete contour of each object obtained in step 5) to obtain the position of each object in the camera coordinate system;

8) Transform the position of each object in the camera coordinate system obtained in step 7) into the world coordinate system to realize the positioning of each object.

Preferably, in step 2), the normal vector is calculated as follows:

Let P_k be the point whose surface normal is sought. First find the four points P_1, P_2, P_3 and P_4 adjacent to P_k in the image (left, up, right and down). P_1 and P_3 form the vector v_1, and P_2 and P_4 form the vector v_2. The surface normal v_p of point P_k is then obtained by the cross product, specifically:

v_p = v_2 × v_1

The normal vector of every point of the three-dimensional point cloud image is calculated by the above formula.

Preferably, in step 3), the normal vectors of the points of the three-dimensional point cloud image are scanned sequentially. When an approximately vertical normal vector is encountered, nearby points whose normal vectors are also approximately vertical are sought and added to a potential plane point set. If the number of points in the potential plane point set exceeds a set threshold, the set is considered a plane point set and is added to the plane collection; otherwise the remaining normal vectors continue to be scanned. When the scan ends, the plane collection is obtained and plane extraction is complete.

Preferably, in step 5), segmenting an object's complete contour and extracting its complete point set comprises the following steps:

a) Input the three-dimensional point cloud, the plane point set, and the points within the plane's convex hull;

b) Sequentially scan the points within the plane's convex hull, looking for points that lie inside the hull but do not belong to the plane;

c) For a non-planar point found within the hull in step b), continue to find all nearby non-planar points inside the hull and add them to a potential object point set;

d) If the number of points in the potential object point set of step c) is less than the set threshold, return to step b) and continue scanning the remaining hull points;

e) If the number of points in the potential object point set of step c) is greater than or equal to the set threshold, consider the potential object point set an object point set;

f) Continue searching near the potential object point set of step e) for points on the hull boundary and add them to the set, so that object points outside the hull are also found and included;

g) Add the potential object point set obtained in step f) to the object collection;

h) If points in the point cloud remain unscanned, return to step b) to look for new object point sets;

i) If all points in the point cloud have been scanned, the algorithm ends and the object collection is obtained.

Preferably, step 6) is specifically:

First, obtain the region of the color image corresponding to the object from its three-dimensional point cloud, crop that image region, and use OpenCV's FeatureDetector::detect() function to obtain SURF feature points;

Second, use OpenCV's Feature2D::compute() function to obtain the SURF feature vectors. An object's SURF feature vectors are added to the recognition library as that object's recognition features, or used directly as the feature vectors for recognition matching. When matching an object's SURF feature vectors against the corresponding vectors in the library, the open-source nearest-neighbor library FLANN is used;

Finally, the feature vector matching with FLANN is implemented with OpenCV's DescriptorMatcher::match() function.

Compared with the prior art, the present invention has the following advantages and beneficial effects:

1. The present invention uses an RGB-D sensor. Compared with existing object recognition and positioning systems based on multi-camera color imaging, it involves no complex operations such as multi-image matching during positioning, and offers low computational load, high computational efficiency, fast operation, strong real-time performance, low positioning cost, high accuracy, and strong robustness, enabling accurate, fast, and stable object recognition and positioning;

2. The present invention applies a region growing algorithm to extract the background plane from the three-dimensional point cloud. With little computation and accurate segmentation, it quickly separates and extracts the background plane, improving the prospects for precise object extraction, positioning, and recognition;

3. The present invention combines each object's point cloud with its corresponding convex hull and applies a second region growing algorithm to remove points inside the hull that belong to the background and add points outside the hull that belong to the object, extracting the object's complete contour and improving positioning accuracy.

Brief Description of the Drawings

Fig. 1 is a flowchart of the method of the present invention.

Fig. 2 is a schematic diagram of the cross product used to compute a surface normal in the present invention.

Fig. 3 is a flowchart of the plane region growing algorithm of the present invention.

Fig. 4 is a flowchart of the segmentation of an object's complete contour and the extraction of its complete point set in the present invention.

Fig. 5(a) is a schematic diagram of the Kinect camera coordinate system; Fig. 5(b) is a schematic diagram of the actual world coordinate system.

Detailed Description

The present invention is described in further detail below with reference to an embodiment and the accompanying drawings, but embodiments of the present invention are not limited thereto.

Embodiment:

This embodiment provides a visual recognition and positioning method based on an RGB-D camera. As shown in Fig. 1, it mainly consists of three-dimensional point cloud acquisition, plane extraction, object segmentation, object feature extraction and matching, and object positioning. The method specifically comprises the following steps:

Step 1: Acquire color and depth images of the objects with a Microsoft Kinect camera sensor and convert them into a three-dimensional point cloud image.

In this step, Microsoft's Kinect sensor collects RGB-D images; the three-dimensional point cloud image can be obtained through the built-in API functions or through third-party libraries such as Open Natural Interaction (OpenNI) and the Point Cloud Library (PCL).
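As an illustration of this conversion step, the following sketch back-projects a depth image into an organized point cloud using a standard pinhole model. This is a minimal numpy sketch, not the Kinect SDK/OpenNI/PCL code path the embodiment uses; the intrinsic parameters fx, fy, cx, cy are placeholder values standing in for the Kinect's calibrated intrinsics.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an organized 3D point
    cloud with the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.dstack((x, y, depth))  # shape (h, w, 3), organized like the image

# Toy 4x4 depth image at a constant 2 m; fx/fy/cx/cy are hypothetical intrinsics.
cloud = depth_to_point_cloud(np.full((4, 4), 2.0), fx=365.0, fy=365.0, cx=2.0, cy=2.0)
```

Because the cloud stays organized (one 3D point per pixel), the neighbor lookups used in the normal computation of Step 2 remain simple array indexing.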

Step 2: Compute the corresponding normal vector for each point of the three-dimensional point cloud image obtained in Step 1.

Fig. 2 is a schematic diagram of the cross product used to compute a surface normal. P_k is the point whose surface normal is sought. First find the four points P_1, P_2, P_3 and P_4 adjacent to P_k in the image (left, up, right and down). P_1 and P_3 form the vector v_1, and P_2 and P_4 form the vector v_2. The surface normal v_p of point P_k is then obtained by the cross product, specifically:

v_p = v_2 × v_1

The normal vector of every point of the three-dimensional point cloud image can be calculated by the above formula.
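The cross-product formula can be applied directly on an organized point cloud. The sketch below is an illustrative numpy version; a one-pixel neighbor offset and the v_p = v_2 × v_1 orientation from the formula are assumed, and border pixels or invalid depth readings are not handled.

```python
import numpy as np

def point_normal(cloud, r, c):
    """Surface normal at pixel (r, c) of an organized cloud via the formula
    v_p = v2 x v1: v1 joins the left/right neighbors P1, P3 and v2 joins the
    up/down neighbors P2, P4 (one-pixel offsets assumed)."""
    p1, p3 = cloud[r, c - 1], cloud[r, c + 1]
    p2, p4 = cloud[r - 1, c], cloud[r + 1, c]
    v1 = p3 - p1
    v2 = p4 - p2
    n = np.cross(v2, v1)
    return n / np.linalg.norm(n)

# On a flat z = 1 sheet the normal must point along the Z axis.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(5.0))
sheet = np.dstack((xs, ys, np.ones((5, 5))))
n = point_normal(sheet, 2, 2)
```

In practice one would also skip pixels whose neighbors have missing depth, since the Kinect leaves holes in the depth image.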

Step 3: Apply a region growing algorithm to the set of normal vectors obtained in Step 2 to extract the background plane on which the objects are placed.

In this step, as shown in Fig. 3, the normal vectors of the points of the three-dimensional point cloud image are scanned sequentially. When an approximately vertical normal vector is encountered, nearby points whose normal vectors are also approximately vertical are sought and added to a potential plane point set S. If the number of points N_s in S exceeds the set threshold N, S is considered a plane point set and is added to the plane collection C; otherwise the remaining normal vectors continue to be scanned. When the scan ends, the plane collection C is obtained and plane extraction is complete.
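The plane region growing described in this step can be sketched as a flood fill over an organized normal map. The "up" direction, cosine threshold, and minimum region size below are illustrative assumptions, not values taken from the patent.

```python
from collections import deque
import numpy as np

def grow_planes(normals, valid, up=np.array([0.0, 0.0, 1.0]),
                cos_thresh=0.95, min_points=50):
    """Region-growing plane extraction sketch: scan for points whose normal is
    near the assumed 'up' direction, flood-fill over 4-connected neighbors with
    similarly vertical normals, and keep regions with at least min_points."""
    h, w = valid.shape
    seen = np.zeros((h, w), bool)
    planes = []
    for r in range(h):
        for c in range(w):
            if seen[r, c] or not valid[r, c]:
                continue
            if abs(normals[r, c] @ up) < cos_thresh:
                continue  # seed must have a near-vertical normal
            region, queue = [], deque([(r, c)])
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny, nx]
                            and valid[ny, nx]
                            and abs(normals[ny, nx] @ up) >= cos_thresh):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_points:  # threshold N in the patent's notation
                planes.append(region)
    return planes

# A 10x10 patch of uniformly vertical normals grows into a single plane region.
flat = np.tile(np.array([0.0, 0.0, 1.0]), (10, 10, 1))
planes = grow_planes(flat, np.ones((10, 10), bool))
```

The flood fill visits each point at most once, so the extraction stays linear in the number of points, consistent with the low computational load claimed for the method.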

Step 4: Remove the points of the background plane extracted in Step 3, and perform object point cloud extraction and convex hull extraction on the remaining point cloud.

Step 5: Combine each object point cloud extracted in Step 4 with its corresponding convex hull and perform a second region growing to segment the complete contour of each object and extract its complete point set.

Fig. 4 is a flowchart of the segmentation of an object's complete contour and the extraction of its complete point set. After the convex hull operation is performed on the plane set obtained in Step 3, the resulting plane convex hull contains both plane points and object points. To extract object contours completely and improve positioning accuracy, the present invention combines each object's point cloud with its corresponding convex hull and applies a second region growing algorithm, removing points inside the hull that belong to the background and adding points outside the hull that belong to the object. Segmenting an object's complete contour and extracting its complete point set comprises the following steps:

a) Input the three-dimensional point cloud, the plane point set, and the points within the plane's convex hull;

b) Sequentially scan the points within the plane's convex hull, looking for points that lie inside the hull but do not belong to the plane;

c) For a non-planar point found within the hull in step b), continue to find all nearby non-planar points inside the hull and add them to a potential object point set S';

d) If the number of points N_s in the potential object point set S' of step c) is less than the set threshold N, return to step b) and continue scanning the remaining hull points;

e) If the number of points N_s in the potential object point set S' of step c) is greater than or equal to the set threshold N, consider S' an object point set;

f) Continue searching near the potential object point set S' of step e) for points on the hull boundary and add them to S', so that object points outside the hull are also found and included;

g) Add the potential object point set S' obtained in step f) to the object collection C';

h) If points in the point cloud remain unscanned, return to step b) to look for new object point sets;

i) If all points in the point cloud have been scanned, the algorithm ends and the object collection C' is obtained.
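Steps a)-i) can be sketched as a second flood fill over image-aligned masks. This simplification assumes the hull and plane memberships are given as boolean images; step f)'s growth across the hull boundary is approximated by letting the fill cross pixels that lie outside the hull but are not plane points, which is an assumption of this sketch rather than the patent's exact procedure.

```python
from collections import deque
import numpy as np

def extract_objects(in_hull, is_plane, min_points=20):
    """Second region growing sketch: seed on pixels inside the plane's convex
    hull that are not plane points (steps b-c), flood-fill 4-connected
    non-plane neighbors (which may extend past the hull, approximating step f),
    and keep candidate sets with at least min_points as objects (steps d-e, g)."""
    h, w = in_hull.shape
    seen = np.zeros((h, w), bool)
    objects = []
    for r in range(h):
        for c in range(w):
            if seen[r, c] or not in_hull[r, c] or is_plane[r, c]:
                continue
            region, queue = [], deque([(r, c)])
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny, nx] and not is_plane[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_points:  # threshold N
                objects.append(region)
    return objects

# A 5x5 block of non-plane pixels inside the hull becomes one object.
in_hull = np.ones((10, 10), bool)
is_plane = np.ones((10, 10), bool)
is_plane[2:7, 2:7] = False
objs = extract_objects(in_hull, is_plane, min_points=20)
```

Growing from hull-interior seeds while allowing the fill to leave the hull is what lets the sketch both discard background points inside the hull and recover object points outside it, mirroring the stated purpose of the second region growing.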

Step 6: According to the complete contour of each object obtained in Step 5, extract the corresponding color image regions and perform feature extraction and matching recognition on each.

In this step, first obtain the region of the color image corresponding to the object from its three-dimensional point cloud, crop that image region, and use OpenCV's FeatureDetector::detect() function to obtain SURF feature points; then use OpenCV's Feature2D::compute() function to obtain the SURF feature vectors. An object's SURF feature vectors can be added to the recognition library as that object's recognition features, or used directly as the feature vectors for recognition matching. When matching an object's SURF feature vectors against the corresponding vectors in the library, the open-source nearest-neighbor library FLANN is used; the matching is implemented with OpenCV's DescriptorMatcher::match() function.
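The embodiment relies on OpenCV's SURF detector/descriptor and a FLANN-backed DescriptorMatcher::match(). The core of that match call is a nearest-neighbor search over descriptor vectors, which can be illustrated in plain numpy; the brute-force search below is a stand-in for FLANN's approximate search, and the tiny 2-D "descriptors" are toy values (real SURF descriptors are 64- or 128-dimensional).

```python
import numpy as np

def match_descriptors(query, train):
    """Pair each query descriptor with its nearest train descriptor by L2
    distance -- the essential operation behind DescriptorMatcher::match().
    Brute force is used here where the patent uses FLANN."""
    query = np.asarray(query, dtype=float)
    train = np.asarray(train, dtype=float)
    # Pairwise L2 distances, shape (len(query), len(train)).
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    return [(qi, int(ti), float(dists[qi, ti])) for qi, ti in enumerate(nearest)]

matches = match_descriptors([[0.0, 0.0], [5.0, 5.0]],
                            [[5.0, 5.0], [0.0, 1.0]])
```

Each returned triple is (query index, matched train index, distance), analogous to OpenCV's DMatch (queryIdx, trainIdx, distance) fields.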

Step 7: Average the points within the complete contour of each object obtained in Step 5 to obtain the position of each object in the camera coordinate system.
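The averaging of Step 7 is a centroid over the object's segmented point cloud; a minimal sketch with hypothetical point values:

```python
import numpy as np

def object_position(points):
    """Step 7 sketch: the object's position in the camera coordinate system
    is the mean of the 3D points inside its complete contour."""
    return np.asarray(points, dtype=float).mean(axis=0)

# Illustrative (hypothetical) object points in the camera frame, in meters.
pos = object_position([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 3.0)])
```

Because Step 5 has already stripped background and plane points, a plain mean is a reasonable position estimate; otherwise outlier points would pull the centroid off the object.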

Step 8: Transform the position of each object in the camera coordinate system obtained in Step 7 into the world coordinate system to realize the positioning of each object.

Fig. 5(a) and Fig. 5(b) are schematic diagrams of the Kinect camera coordinate system and the actual world coordinate system, respectively. Both are right-handed coordinate systems. To position an object in the world coordinate system, the camera must be calibrated; the following relationship holds between the camera coordinate system and the world coordinate system:

[X_C, Y_C, Z_C]^T = R · [X_W, Y_W, Z_W]^T + T

where X_C, Y_C and Z_C are the components of the object's position in the camera coordinate system, X_W, Y_W and Z_W are the components of its position in the world coordinate system, and R and T are the camera's rotation matrix and offset (translation) matrix, respectively, both extrinsic parameters of the camera.

In the coordinate transformation, the rotation matrix R and offset matrix T are needed to compute the object's world coordinates. These two matrices are obtained by camera calibration, carried out with the Camera Calibration Toolbox for Matlab using the chessboard method. Since the Kinect's default camera coordinate system is located at the infrared camera, as shown in Fig. 5, and the images collected by the Kinect are mirrored, the Kinect's infrared camera is used to collect infrared images for calibration, and each collected image is flipped left-right before being input to the toolbox. During calibration, the camera first collects more than twenty images of the chessboard at different angles and distances to compute the intrinsic parameters; finally the chessboard is fixed at the target position and one image is collected to compute the extrinsic parameters.

The extrinsic parameter matrices obtained from calibration and the computed coordinates of the object in the camera coordinate system are combined in the following operation:

[X_W, Y_W, Z_W]^T = R^T · ([X_C, Y_C, Z_C]^T − T)

which yields the object's spatial coordinates in the world coordinate system.
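The two extrinsic equations invert each other; the camera-to-world step can be sketched as below, assuming R is a proper rotation so that its inverse is its transpose. The extrinsics used in the example are hypothetical values, not the calibration result of the embodiment.

```python
import numpy as np

def camera_to_world(p_cam, R, T):
    """Invert the extrinsic model p_cam = R @ p_world + T to recover world
    coordinates; assumes R is a rotation matrix, so R^-1 = R^T."""
    return R.T @ (np.asarray(p_cam, dtype=float) - np.asarray(T, dtype=float))

# Hypothetical extrinsics: a 90-degree rotation about Z and a 1 m X offset.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
T = np.array([1.0, 0.0, 0.0])
p_world = np.array([0.0, 1.0, 0.0])
p_cam = R @ p_world + T              # forward model: world -> camera
recovered = camera_to_world(p_cam, R, T)
```

Round-tripping a point through the forward model and back recovers the original world coordinates, which is a quick sanity check on any calibrated (R, T) pair.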

The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent replacement or change made by a person skilled in the art, within the scope disclosed by the present invention and according to its technical solution and inventive concept, falls within the scope of protection of the present invention.

Claims (5)

1. A visual recognition and positioning method based on an RGB-D camera, characterized in that the method comprises the following steps:
1) acquiring color and depth images of objects with a Microsoft Kinect camera sensor and converting them into a three-dimensional point cloud image;
2) computing the corresponding normal vector for each point of the three-dimensional point cloud image obtained in step 1);
3) applying a region growing algorithm to the set of normal vectors obtained in step 2) to extract the background plane on which the objects are placed;
4) removing the points of the background plane extracted in step 3), and performing object point cloud extraction and convex hull extraction on the remaining point cloud;
5) combining each object point cloud extracted in step 4) with its corresponding convex hull and performing a second region growing to segment the complete contour of each object and extract its complete point set;
6) according to the complete contour of each object obtained in step 5), extracting the corresponding color image regions and performing feature extraction and matching recognition on each;
7) averaging the points within the complete contour of each object obtained in step 5) to obtain the position of each object in the camera coordinate system;
8) transforming the position of each object in the camera coordinate system obtained in step 7) into the world coordinate system to realize the positioning of each object.
2. The visual recognition and positioning method based on an RGB-D camera according to claim 1, characterized in that in step 2), the normal vector is calculated as follows:
Let P_k be the point whose surface normal is sought. First find the four points P_1, P_2, P_3 and P_4 adjacent to P_k in the image (left, up, right and down). P_1 and P_3 form the vector v_1, and P_2 and P_4 form the vector v_2. The surface normal v_p of point P_k is then obtained by the cross product, specifically:
v_p = v_2 × v_1
The normal vector of each point of the three-dimensional point cloud image is calculated by the above formula.
3. The visual recognition and positioning method based on an RGB-D camera according to claim 1, characterized in that in step 3), the normal vectors of the points of the three-dimensional point cloud image are scanned sequentially; when an approximately vertical normal vector is encountered, nearby points whose normal vectors are also approximately vertical are sought and added to a potential plane point set; if the number of points in the potential plane point set exceeds a set threshold, the potential plane point set is considered a plane point set and is added to the plane collection, otherwise the remaining normal vectors continue to be scanned; when the scan ends, the plane collection is obtained and plane extraction is complete.
4. The visual recognition and localization method based on an RGB-D camera according to claim 1, characterized in that: in step 5), the segmentation of the complete object contour and the extraction of the complete point set comprise the following steps:
a) input the points of the three-dimensional point cloud, the plane point set, and the convex hull region of the plane;
b) scan the points within the plane convex hull region in sequence, looking for points that lie inside the convex hull but do not belong to the plane;
c) once a non-plane point inside the convex hull is found in step b), continue to search near that point for all non-plane points within the convex hull, and add them to the potential object point set;
d) if the number of points in the potential object point set of step c) is less than the set threshold, return to step b) and continue scanning the remaining points of the plane convex hull;
e) if the number of points in the potential object point set of step c) is greater than or equal to the set threshold, the potential object point set is taken to be an object point set;
f) continue to search for points on the boundary of the convex hull near the potential object point set of step e) and add them to the potential object point set; object points outside the convex hull are likewise found and added to the potential object point set;
g) add the potential object point set obtained in step f) to the object collection;
h) if unscanned points remain in the point cloud, return to step b) and continue looking for new object point sets;
i) if all points in the point cloud have been scanned, the algorithm terminates and the object collection is obtained.
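The steps above amount to region growing over the non-plane points whose footprint falls inside the plane's convex hull. A minimal sketch under assumed parameters: the `in_hull` test, `radius`, and `min_points` are illustrative stand-ins, and the convex-hull boundary handling of step f) is folded into the same neighbour search:

```python
import numpy as np
from collections import deque

def segment_objects(points, plane_idx, in_hull, radius=0.06, min_points=30):
    """Cluster non-plane points lying inside the plane's convex region
    into object point sets. `in_hull(p)` tests whether point p projects
    into the plane's convex hull region."""
    plane = np.zeros(len(points), dtype=bool)
    plane[plane_idx] = True
    # step b): candidates lie in the convex region but not on the plane
    candidate = np.array([in_hull(p) for p in points]) & ~plane
    visited = np.zeros(len(points), dtype=bool)
    objects = []
    for seed in np.flatnonzero(candidate):
        if visited[seed]:
            continue
        queue, obj = deque([seed]), []
        visited[seed] = True
        while queue:                     # step c): grow nearby non-plane points
            i = queue.popleft()
            obj.append(i)
            near = np.linalg.norm(points - points[i], axis=1) < radius
            # step f): also pull in non-plane neighbours outside the hull
            for j in np.flatnonzero(near & ~plane & ~visited):
                visited[j] = True
                queue.append(j)
        if len(obj) >= min_points:       # steps d)/e): size threshold
            objects.append(np.array(obj))  # step g): add to object collection
    return objects
```

With a planar grid as the supporting surface and a small cluster hovering above it inside the hull, the cluster comes back as a single object point set.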
5. The visual recognition and localization method based on an RGB-D camera according to claim 1, characterized in that step 6) is specifically:
First, the region of the color image corresponding to the object is obtained from the object's three-dimensional point cloud set; after cropping the image region in which the object lies, SURF feature points are obtained with OpenCV's FeatureDetector::detect() function;
Secondly, SURF feature vectors are further obtained with OpenCV's Feature2D::compute() function; the object's SURF feature vectors are added to the recognition library as the object's identification features, or used directly as the feature vectors for matching in object recognition; when matching the object's SURF feature vectors against the corresponding SURF feature vectors in the library, the open-source nearest-neighbour library FLANN is used;
Finally, the matching of the feature vectors with the open-source nearest-neighbour library FLANN is specifically implemented with OpenCV's DescriptorMatcher::match() function.
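The matching step of the claim reduces to nearest-neighbour search over descriptor vectors (in current OpenCV this would typically be `cv2.FlannBasedMatcher`, with SURF itself in the non-free `xfeatures2d` contrib module). Below is a self-contained brute-force stand-in for that search, augmented with Lowe's ratio test, which is a common refinement not stated in the claim:

```python
import numpy as np

def match_descriptors(query, train, ratio=0.75):
    """Nearest-neighbour descriptor matching with Lowe's ratio test --
    a brute-force stand-in for the FLANN matching step of the claim.
    `train` must contain at least two descriptors for the ratio test.
    Returns (query_index, train_index) pairs for accepted matches."""
    matches = []
    for qi, q in enumerate(query):
        # Euclidean distance from this query descriptor to every train one
        d = np.linalg.norm(train - q, axis=1)
        order = np.argsort(d)
        best, second = order[0], order[1]
        # accept only if the best match is clearly better than the runner-up
        if d[best] < ratio * d[second]:
            matches.append((qi, int(best)))
    return matches
```

A query descriptor close to exactly one library descriptor passes the ratio test and is matched; an ambiguous one (near two library descriptors) would be rejected.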
CN201610894251.8A 2016-10-13 2016-10-13 A visual recognition and localization method based on RGB-D camera Active CN106570903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610894251.8A CN106570903B (en) 2016-10-13 2016-10-13 A visual recognition and localization method based on RGB-D camera

Publications (2)

Publication Number Publication Date
CN106570903A true CN106570903A (en) 2017-04-19
CN106570903B CN106570903B (en) 2019-06-18

Family

ID=58532076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610894251.8A Active CN106570903B (en) 2016-10-13 2016-10-13 A visual recognition and localization method based on RGB-D camera

Country Status (1)

Country Link
CN (1) CN106570903B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564536B (en) * 2017-12-22 2020-11-24 洛阳中科众创空间科技有限公司 A global optimization method for depth maps

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20140141174A (en) * 2013-05-31 2014-12-10 한국과학기술원 Method and apparatus for recognition and segmentation object for 3d object recognition
CN104240297A (en) * 2014-09-02 2014-12-24 东南大学 Rescue robot three-dimensional environment map real-time construction method
CN105046710A (en) * 2015-07-23 2015-11-11 北京林业大学 Depth image partitioning and agent geometry based virtual and real collision interaction method and apparatus
CN105913489A (en) * 2016-04-19 2016-08-31 东北大学 Indoor three-dimensional scene reconstruction method employing plane characteristics

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUNGONG HAN et al.: "Enhanced Computer Vision With Microsoft Kinect Sensor: A Review", IEEE Transactions on Cybernetics *
YANG XIAO et al.: "Human–Robot Interaction by Understanding Upper Body Gestures", Presence *
DONG GUOSHENG: "Research on Kinect-based three-dimensional color human body surface reconstruction technology", China Masters' Theses Full-text Database, Information Science and Technology series *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107564059A (en) * 2017-07-11 2018-01-09 北京联合大学 Object positioning method, device and NI Vision Builder for Automated Inspection based on RGB D information
CN107480603A (en) * 2017-07-27 2017-12-15 大连和创懒人科技有限公司 Synchronous Mapping and Object Segmentation Method Based on SLAM and Depth Camera
CN107480603B (en) * 2017-07-27 2020-09-18 和创懒人(大连)科技有限公司 Synchronous mapping and object segmentation method based on SLAM and depth camera
CN107610176A (en) * 2017-09-15 2018-01-19 斯坦德机器人(深圳)有限公司 A kind of pallet Dynamic Recognition based on Kinect and localization method, system and medium
CN107609520A (en) * 2017-09-15 2018-01-19 四川大学 Obstacle recognition method, device and electronic equipment
CN107610176B (en) * 2017-09-15 2020-06-26 斯坦德机器人(深圳)有限公司 Pallet dynamic identification and positioning method, system and medium based on Kinect
CN107609520B (en) * 2017-09-15 2020-07-03 四川大学 Obstacle identification method, device and electronic device
CN109870983A (en) * 2017-12-04 2019-06-11 北京京东尚科信息技术有限公司 Method and device for processing pallet stacking images and system for warehouse picking
CN109870983B (en) * 2017-12-04 2022-01-04 北京京东尚科信息技术有限公司 Method and device for processing tray stack image and system for warehousing goods picking
CN108247635B (en) * 2018-01-15 2021-03-26 北京化工大学 A method of deep vision robot grasping objects
CN108247635A (en) * 2018-01-15 2018-07-06 北京化工大学 A kind of method of the robot crawl object of deep vision
CN108716324A (en) * 2018-03-26 2018-10-30 江苏大学 A kind of enabling anti-collision system and method suitable for autonomous driving vehicle
CN108830150B (en) * 2018-05-07 2019-05-28 山东师范大学 One kind being based on 3 D human body Attitude estimation method and device
CN108830150A (en) * 2018-05-07 2018-11-16 山东师范大学 One kind being based on 3 D human body Attitude estimation method and device
US10452947B1 (en) 2018-06-08 2019-10-22 Microsoft Technology Licensing, Llc Object recognition using depth and multi-spectral camera
CN109101967A (en) * 2018-08-02 2018-12-28 苏州中德睿博智能科技有限公司 The recongnition of objects and localization method, terminal and storage medium of view-based access control model
CN109146961A (en) * 2018-09-05 2019-01-04 天目爱视(北京)科技有限公司 A 3D measurement and acquisition device based on virtual matrix
CN109211210A (en) * 2018-09-25 2019-01-15 深圳市超准视觉科技有限公司 A kind of the identification locating measurement method and device of target object
CN109801309A (en) * 2019-01-07 2019-05-24 华南理工大学 A kind of method for barrier perception based on RGB-D camera
US11245875B2 (en) 2019-01-15 2022-02-08 Microsoft Technology Licensing, Llc Monitoring activity with depth and multi-spectral camera
CN109974707A (en) * 2019-03-19 2019-07-05 重庆邮电大学 A Visual Navigation Method for Indoor Mobile Robots Based on Improved Point Cloud Matching Algorithm
CN110223297A (en) * 2019-04-16 2019-09-10 广东康云科技有限公司 Segmentation and recognition methods, system and storage medium based on scanning point cloud data
CN110136211A (en) * 2019-04-18 2019-08-16 中国地质大学(武汉) A workpiece positioning method and system based on active binocular vision technology
CN110342252A (en) * 2019-07-01 2019-10-18 芜湖启迪睿视信息技术有限公司 A kind of article automatically grabs method and automatic grabbing device
CN110342252B (en) * 2019-07-01 2024-06-04 河南启迪睿视智能科技有限公司 Automatic article grabbing method and automatic grabbing device
CN110349225B (en) * 2019-07-12 2023-02-28 四川易利数字城市科技有限公司 BIM model external contour rapid extraction method
CN110349225A (en) * 2019-07-12 2019-10-18 四川易利数字城市科技有限公司 A kind of BIM model exterior contour rapid extracting method
CN110553628A (en) * 2019-08-28 2019-12-10 华南理工大学 A method for capturing flying objects based on depth camera
CN111476841A (en) * 2020-03-04 2020-07-31 哈尔滨工业大学 A method and system for recognition and positioning based on point cloud and image
WO2021223124A1 (en) * 2020-05-06 2021-11-11 深圳市大疆创新科技有限公司 Position information obtaining method and device, and storage medium
CN114274139A (en) * 2020-09-27 2022-04-05 西门子股份公司 Automatic spraying method, device, system and storage medium
CN114274139B (en) * 2020-09-27 2024-04-19 西门子股份公司 Automatic spraying method, device, system and storage medium
CN116843631A (en) * 2023-06-20 2023-10-03 安徽工布智造工业科技有限公司 3D visual material separating method for non-standard part stacking in light steel industry
CN116843631B (en) * 2023-06-20 2024-04-02 安徽工布智造工业科技有限公司 3D visual material separating method for non-standard part stacking in light steel industry

Also Published As

Publication number Publication date
CN106570903B (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN106570903A (en) Visual identification and positioning method based on RGB-D camera
CN107563438B (en) A Fast and Robust Multimodal Remote Sensing Image Matching Method and System
CN102313536B (en) Method for barrier perception based on airborne binocular vision
CN103606170B (en) Streetscape image feature based on colored Scale invariant detects and matching process
US20170301144A1 (en) Silhouette-based object and texture alignment, systems and methods
CN107993258B (en) Image registration method and device
CN106683137B (en) Monocular multi-target recognition and localization method based on artificial markers
Gao et al. A stable and accurate marker-less augmented reality registration method
CN101924953B (en) Simple matching method based on datum point
CN104463108A (en) Monocular real-time target recognition and pose measurement method
CN106056625B (en) A kind of Airborne IR moving target detecting method based on geographical same place registration
WO2018171008A1 (en) Specular highlight area restoration method based on light field image
CN103727930A (en) Edge-matching-based relative pose calibration method of laser range finder and camera
CN107560592B (en) Precise distance measurement method for photoelectric tracker linkage target
CN103971378A (en) Three-dimensional reconstruction method of panoramic image in mixed vision system
CN101996407A (en) Colour calibration method for multiple cameras
WO2020063987A1 (en) Three-dimensional scanning method and apparatus and storage medium and processor
CN108229416A (en) Robot SLAM methods based on semantic segmentation technology
US10055845B2 (en) Method and image processing system for determining parameters of a camera
CN109523528A (en) A kind of transmission line of electricity extracting method based on unmanned plane binocular vision SGC algorithm
CN103281513B (en) Pedestrian recognition method in the supervisory control system of a kind of zero lap territory
CN110378995B (en) Method for three-dimensional space modeling by using projection characteristics
Bellavia et al. Image orientation with a hybrid pipeline robust to rotations and wide-baselines
Wang et al. Multi-features visual odometry for indoor mapping of UAV
CN114419148B (en) Touch detection method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant