CN102800126A - Method for recovering real-time three-dimensional body posture based on multimodal fusion - Google Patents
- Publication number: CN102800126A
- Legal status: Pending (assumed; Google has not performed a legal analysis)
Abstract
The invention discloses a method for real-time recovery of three-dimensional human posture based on multimodal fusion. Depth-map analysis, color recognition, face detection, and other techniques are combined to obtain, in real time, the real-world coordinates of the main joints of the human body, and thereby recover its three-dimensional skeleton. Based on the scene depth image and scene color image acquired synchronously at each moment, the position of the head is obtained by face detection, and the positions of the limb extremities, which wear colored markers, are obtained by color recognition. From the extremity positions, the elbow and knee positions are computed using the mapping between the color image and the depth map. Finally, temporal information is used to smooth the recovered skeleton, reconstructing the motion of the human body in real time. Compared with traditional near-infrared motion-capture technology, the invention improves the stability of the recovery and makes the capture process simpler.
Description
Technical Field
The invention relates to a method for real-time recovery of three-dimensional human posture, and in particular to a method that recovers the three-dimensional posture of a human body in real time using a depth map and colored markers.
Background
Three-dimensional human posture recovery refers to capturing the motion data of a real human body with a device, including the three-dimensional coordinates of the main joints, and then computing and rendering these data to establish the motion of a character in a virtual scene. With this technology, the motion of a real human body can be bound to that of a virtual character, so that the person's movements drive the character. Three-dimensional posture recovery is now widely used in film, animation, and game production; compared with traditional computer-animation modelling, it is more efficient and can run in real time.
Many technologies exist for three-dimensional human posture recovery, broadly divided into optical and non-optical systems. Non-optical devices generally capture motion data with accelerometers or auxiliary mechanical equipment and are not widely used. Current optical systems are mostly based on near-infrared equipment: several infrared cameras identify the positions of markers made of highly reflective material, and a calibration algorithm converts the marker coordinates into coordinates in three-dimensional space. The advantage of this approach is that the recovered posture is accurate and the system is robust; the disadvantages are a complicated capture workflow and high cost.
Others use several ordinary cameras to provide multi-view information, extract features from the silhouette in each view, and then look up similar postures in a database. The advantage is low hardware cost, but a specific data set is required, and the range of motions that can be captured is limited.
With Microsoft's introduction of the Kinect, a new generation of interactive device, three-dimensional human posture recovery made a new breakthrough. The Kinect can capture a depth map of the scene: each pixel corresponds to a position in the scene and holds a value representing the distance from a reference position to that scene position (in other words, the depth map has the form of an image whose pixel values describe the geometry of objects in the scene rather than their brightness or color). Jamie Shotton et al., in their paper "Real-Time Human Pose Recognition in Parts from Single Depth Images", describe a machine-learning approach to recovering human posture. PrimeSense, an Israeli company, has developed a heuristic technique that recovers the three-dimensional skeleton by performing background subtraction and scene reconstruction on the depth map. With these methods, a single Kinect device recovers the motion of the human body in real time without any markers, a great improvement over traditional optical systems. They also greatly reduce the cost of three-dimensional posture recovery, allowing the technology to enter home entertainment.
However, the above methods still fall short of traditional optical equipment in stability, and they are difficult to implement.
Summary of the Invention
The purpose of the present invention is to provide a method for real-time recovery of three-dimensional human posture based on multimodal fusion.
The method for real-time recovery of three-dimensional human posture based on multimodal fusion comprises the following steps:
1) Synchronously receive, at a frame rate of no less than 25 frames per second, a sequence of scene depth maps containing the human body and a sequence of scene color images. Each frame of the depth-map sequence consists of a pixel matrix in which the value of each pixel represents the distance from the corresponding scene position to a reference position, i.e. the pixel's depth value. Each frame of the color-image sequence consists of a pixel matrix in which the value of each pixel represents the color at the corresponding scene position, expressed as an RGB value.
2) Segment the background and foreground pixels of the scene depth map to obtain the region representing the human body, i.e. the foreground pixels.
3) Process the foreground pixels of the scene depth map and label the pixels representing the torso, head, and limbs.
4) Detect the face position in the scene color image; obtain the projective coordinates of the head in the scene depth map through the mapping between the color image and the depth map, and convert them into real-world three-dimensional coordinates. A projective coordinate is a three-dimensional vector (X, Y, Z), where (X, Y) addresses a pixel of the scene depth map and Z is that pixel's depth value.
5) From the projective coordinates of the head in the scene depth map, compute the projective coordinates of the neck and shoulders, and convert them into real-world three-dimensional coordinates.
6) Using the colored markers worn on the limb extremities, obtain the projective coordinates of the hands and feet in the scene depth map, and convert them into real-world three-dimensional coordinates.
7) From the three-dimensional coordinates of the hands and shoulders, compute the projective coordinates of the elbow joints in the scene depth map, and convert them into real-world three-dimensional coordinates.
8) From the three-dimensional coordinates of the feet and hips, compute the projective coordinates of the knee joints in the scene depth map, and convert them into real-world three-dimensional coordinates.
9) Apply steps 2) to 8) to each frame of the depth-map and color-image sequences; assemble the captured three-dimensional coordinates of the body parts into a skeleton model according to the structure of the human body and output it. A constraint space and a confidence value are set for the three-dimensional coordinates of each captured body part, and the skeleton model is smoothed; the constraint space represents the maximum displacement allowed for each captured body part between two adjacent frames.
The calculation method in step 5) is:
a) After obtaining the three-dimensional coordinates of the head, the actual neck length L_Real_neck is computed from the preset neck reference length L_neck, the neck reference depth D_neck, and the actual head depth D_head:

L_Real_neck = D_head × L_neck / D_neck

On the line segment connecting the head and the torso, the position of the neck is obtained according to the actual neck length L_Real_neck.
b) After obtaining the three-dimensional coordinates of the neck, the method computes the actual shoulder width W_Real_shoulder from the preset shoulder reference width W_shoulder, the shoulder reference depth D_shoulder, and the actual neck depth R_neck:

W_Real_shoulder = R_neck × W_shoulder / D_shoulder

On the horizontal line segment through the neck, the positions of the left and right shoulders are obtained according to the actual shoulder width W_Real_shoulder.
c) When computing the shoulder positions, note that the user may turn sideways; in that case the projected width of the shoulders must be adjusted. This step computes the depths D_left and D_right at the left and right shoulder positions; with the preset shoulder width W_shoulder, the adjusted shoulder width W_Projected is:

W_Projected = √(W_shoulder² − (D_left − D_right)²)

Using W_Projected, the positions of the left and right shoulders are computed as in step b).
d) A local search ensures that the shoulder coordinates obtained in steps a), b), and c) lie on foreground pixels. Taking the left shoulder as an example: if the estimated left-shoulder pixel (x, y) falls on a background pixel, the pixels (x+t, y+t) to the right of the estimate are searched, where t is a search-range threshold, and the foreground pixel closest to the estimate within that range is chosen. If no foreground pixel is found, t is increased progressively to enlarge the search range until the nearest foreground pixel is found.
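As a sketch, the scaling in steps a) to c) reduces to a few proportional formulas. The reference lengths and depths below are hypothetical calibration constants, not values specified by the patent:

```python
import math

# Hypothetical calibration constants: reference lengths measured at a reference depth.
L_NECK, D_NECK = 0.10, 2.0          # neck reference length (m) at reference depth (m)
W_SHOULDER, D_SHOULDER = 0.40, 2.0  # shoulder reference width (m) at reference depth (m)

def neck_length(d_head):
    """Step a): scale the reference neck length by the actual head depth."""
    return d_head * L_NECK / D_NECK

def shoulder_width(r_neck):
    """Step b): scale the reference shoulder width by the actual neck depth."""
    return r_neck * W_SHOULDER / D_SHOULDER

def projected_shoulder_width(d_left, d_right):
    """Step c): when the user turns sideways the two shoulders differ in depth,
    and the width seen in the image plane shrinks accordingly."""
    delta = d_left - d_right
    return math.sqrt(max(W_SHOULDER ** 2 - delta ** 2, 0.0))
```

When the user faces the camera squarely, d_left equals d_right and the projected width equals the full shoulder width; as the body turns, the projected width shrinks toward zero.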
The acquisition method in step 6) is:
a) The user wears colored markers on the hands and feet to assist in locating them; the colors of the markers must be distinguishable from the colors of the rest of the user's body.
b) The color image is converted from the RGB color space to the HSV color space, and the HSV features of the hand and foot markers are extracted as thresholds. Each frame of the scene color image is filtered with these thresholds, removing the pixels that do not match the color features, to obtain a color-threshold map; noise is then removed from the map with image erosion and dilation operations.
c) The above processing yields a binary image in which the hand and foot markers appear as blobs. The center of each blob is taken as the position of the corresponding extremity, and coordinate conversion yields the real-world three-dimensional coordinates of the hands and feet.
The calculation method in step 7) is:
a) The pixels belonging to the arms must first be labelled among the foreground pixels of the scene depth map: the pixels representing the torso are labelled from the positions of the left and right shoulders, and the remaining regions connected to the torso are labelled as limbs and head. When an arm occludes the torso from directly in front, the depth of the torso "centroid" must be computed, and the depth of each torso pixel compared with it; if the depth difference exceeds a threshold, the pixel is labelled as arm, otherwise it belongs to the torso. The "centroid" of a region is its average depth; it can be obtained by computing the histogram of the region's depth values and taking the most frequent depth value, or the average of the two or more most frequent depth values, as the centroid depth.
b) After the arm pixels have been labelled, all pixels labelled as arm are traversed starting from the hand; a pixel whose distance to the hand satisfies the forearm-length constraint is marked as a potential elbow. Then, starting from the shoulder, the arm is searched again for pixels whose distance to the shoulder satisfies the upper-arm-length constraint. The intersection of these points with the previously marked elbow candidates gives the estimated elbow region, and the midpoint of these points is taken as the elbow position.
The calculation method in step 8) is:
a) The pixels belonging to the legs must first be labelled among the foreground pixels of the scene depth map: the pixels representing the torso are labelled from the positions of the left and right shoulders, and the remaining regions connected to the torso are labelled as limbs and head.
b) After the leg pixels have been labelled, all pixels labelled as leg are traversed starting from the foot; a pixel whose distance to the foot satisfies the lower-leg-length constraint is marked as a potential knee. Then, starting from the hip, the leg is searched again for pixels whose distance to the hip satisfies the thigh-length constraint. The intersection of these points with the previously marked knee candidates gives the estimated knee region, and the midpoint of these points is taken as the knee position.
The processing method in step 9) is:
a. A constraint length D and a confidence C are defined for each body part. D describes the constraint space: a sphere centered on the body part with radius D, representing the maximum displacement allowed for that part between two adjacent frames. Different body parts have constraint spaces of different sizes; the hand's is larger than the shoulder's.
b. The confidence expresses how accurate the part's current coordinates are; the higher C, the more accurate the position. Initially every body part's confidence is set to 0. In a new frame, if the part's new position lies inside its constraint space from the previous frame, its confidence is increased by one, unless it has already reached the maximum, in which case it is left unchanged. Conversely, if the new position lies outside the previous frame's constraint space, the part is moved only a distance of Length/C toward the new position, where Length is the length of the segment between the old and new positions, and the part's confidence is decreased by one.
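A minimal sketch of the constraint-sphere smoothing in steps a and b, for a single body part; the radius and confidence ceiling are illustrative values, not parameters given by the patent:

```python
import numpy as np

class JointSmoother:
    """Temporal smoothing for one body part (step 9): accept positions inside the
    constraint sphere and grow confidence; damp jumps outside it by Length / C."""

    def __init__(self, d=0.15, c_max=10):
        self.d, self.c_max = d, c_max   # constraint radius (m) and confidence ceiling
        self.c = 0                      # confidence starts at zero
        self.pos = None                 # last accepted position

    def update(self, new_pos):
        new_pos = np.asarray(new_pos, dtype=float)
        if self.pos is None:
            self.pos, self.c = new_pos, 1
        elif np.linalg.norm(new_pos - self.pos) <= self.d:
            # inside the constraint sphere: accept the measurement, raise confidence
            self.pos = new_pos
            self.c = min(self.c + 1, self.c_max)
        else:
            # outside: move only Length / C toward the new position, lower confidence
            self.pos = self.pos + (new_pos - self.pos) / max(self.c, 1)
            self.c = max(self.c - 1, 1)
        return self.pos
```

A sudden one-metre jump of a joint that had confidence 2 is thus damped to half a metre, while small motions inside the sphere pass through unchanged.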
The invention uses depth-map analysis, color recognition, face detection, and other techniques to obtain, in real time, the real-world coordinates of the main joints of the human body and thereby recover its three-dimensional skeleton. Compared with traditional near-infrared motion-capture technology, it improves the stability of the recovery, lowers the cost of use, makes the capture process simpler, and offers a new solution for bringing three-dimensional human posture recovery into home entertainment.
Description of the Drawings
The present invention is further described below with reference to the accompanying drawings and specific embodiments.
Figure 1 is a flow chart of the method for real-time recovery of three-dimensional human posture based on multimodal fusion;
Figure 2 is a scene depth map used by the present invention;
Figure 3 shows the three-dimensional human skeleton recovered by the present invention.
Detailed Description of the Embodiments
With reference to Figure 1, the method for real-time recovery of three-dimensional human posture based on multimodal fusion comprises the following steps:
1) Acquire the scene depth image and scene color image
The method acquires, at a frame rate of no less than 25 frames per second, sequences of 640 × 480-pixel scene depth images and scene color images. Each frame of the depth-map sequence (as shown in Figure 2) consists of a pixel matrix in which the value of each pixel represents the distance from the corresponding scene position to a reference position, i.e. the pixel's depth value. The scene depth map and scene color image are synchronized at every moment, and their pixels are aligned with each other.
2) Background subtraction on the depth image
Segment the background and foreground pixels of the scene depth map to obtain the region representing the human body, i.e. the foreground pixels. For posture tracking only the foreground pixels (the user) are of interest, so the background pixels must be removed. Many background-removal methods exist in practice. The method implemented in this invention first identifies one blob in the depth map as the subject's body and then removes the other blobs whose depth values differ markedly from it. This requires first identifying a blob of some minimum size, which involves converting between the real-world coordinate system and the projective coordinate system. The depth map yields the projective coordinates of an object; to determine its actual coordinates, the (x, y, depth) coordinates are converted into "real-world" coordinates (X_r, Y_r, depth) with the following formulas:

X_r = (X − fov_x/2) × pixel size × depth / reference depth

Y_r = (Y − fov_y/2) × pixel size × depth / reference depth
Here, fov_x and fov_y are the fields of view of the depth map (in pixels) in the x and y directions, and the pixel size is the length subtended by a pixel at a given distance (the reference depth) from the camera. With real-world coordinates, Euclidean distances can be computed directly, avoiding the perspective error by which near objects appear large and far objects small. Once blob sizes are determined in real-world coordinates, the largest blob closest to the camera is selected and set as the human body region.
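The projective-to-real-world conversion can be sketched as follows; the field-of-view, pixel-size, and reference-depth defaults are placeholders for device calibration data, not values specified by the patent:

```python
def to_real_world(x, y, depth, fovx=640.0, fovy=480.0,
                  pixel_size=0.0001, ref_depth=1.0):
    """Convert a projective coordinate (x, y, depth) into real-world
    (X_r, Y_r, depth) using the formulas above. fovx / fovy are the fields of
    view in pixels; pixel_size is the length subtended by one pixel at the
    reference depth (all defaults are hypothetical calibration values)."""
    xr = (x - fovx / 2) * pixel_size * depth / ref_depth
    yr = (y - fovy / 2) * pixel_size * depth / ref_depth
    return xr, yr, depth
```

A pixel at the image centre maps to X_r = Y_r = 0, and off-centre pixels scale linearly with both their image offset and their depth.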
A simpler alternative is to set thresholds heuristically: all pixels whose depth exceeds a threshold, and all blobs whose size falls below a threshold, are set to background. This is easier to implement but less accurate.
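A sketch of this simpler thresholding variant, assuming a depth map in millimetres; both thresholds are illustrative, not values from the patent:

```python
import numpy as np

def remove_background(depth, max_depth=3000, min_blob=50):
    """Heuristic background removal: pixels deeper than max_depth (or with no
    reading) become background, and 4-connected foreground blobs smaller than
    min_blob pixels are discarded. Returns a boolean foreground mask."""
    fg = (depth > 0) & (depth < max_depth)
    seen = np.zeros_like(fg, dtype=bool)
    keep = np.zeros_like(fg, dtype=bool)
    h, w = fg.shape
    for sy in range(h):
        for sx in range(w):
            if fg[sy, sx] and not seen[sy, sx]:
                stack, blob = [(sy, sx)], []
                seen[sy, sx] = True
                while stack:                         # flood-fill one blob
                    y, x = stack.pop()
                    blob.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and fg[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                if len(blob) >= min_blob:            # keep only large blobs
                    for y, x in blob:
                        keep[y, x] = True
    return keep
```

In a production pipeline the flood fill would be replaced by a library connected-components routine; the pure-Python loop keeps the sketch self-contained.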
3) Label the body regions in the scene depth map
Process the foreground pixels of the scene depth map and label the regions representing the torso, head, and limbs.
4) Compute the head position by face detection
In this step, the invention uses the Haar Cascade Classifier provided by OpenCV for face detection, obtaining in real time the pixel of the user's head from the scene color image. The mapping between the color image and the depth map then gives the projective coordinates of the head in the scene depth map, which are converted into real-world three-dimensional coordinates with the coordinate conversion described in step 2. The projective coordinate is a three-dimensional vector (X, Y, Z), where (X, Y) addresses a pixel of the scene depth map and Z is that pixel's depth value.
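A sketch of this step using OpenCV's stock frontal-face Haar cascade. The function names here (`detect_head`, `head_projective`) are introduced for illustration; only the OpenCV calls themselves are the library's real API:

```python
import numpy as np

def detect_head(color_bgr, depth,
                cascade_file="haarcascade_frontalface_default.xml"):
    """Detect the largest face with OpenCV's Haar cascade and return the head's
    projective coordinate (X, Y, Z), with Z read from the aligned depth map,
    or None if no face is found."""
    import cv2  # imported lazily so the pure helper below stays usable without OpenCV
    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + cascade_file)
    gray = cv2.cvtColor(color_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # keep the largest face
    return head_projective(x, y, w, h, depth)

def head_projective(x, y, w, h, depth):
    """Map the centre of the face rectangle onto the aligned depth map."""
    cx, cy = x + w // 2, y + h // 2
    return cx, cy, int(depth[cy, cx])
```

The resulting (X, Y, Z) vector is then fed through the coordinate conversion of step 2 to obtain the head's real-world position.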
5) Compute the shoulder coordinates from the head position
a) After obtaining the three-dimensional coordinates of the head, the method computes the actual neck length L_Real_neck from the preset neck reference length L_neck, the neck reference depth D_neck, and the actual head depth D_head:

L_Real_neck = D_head × L_neck / D_neck

On the line segment connecting the head and the torso, the position of the neck is obtained according to the actual neck length L_Real_neck.
b) After obtaining the three-dimensional coordinates of the neck, the method computes the actual shoulder width W_Real_shoulder from the preset shoulder reference width W_shoulder, the shoulder reference depth D_shoulder, and the actual neck depth R_neck:

W_Real_shoulder = R_neck × W_shoulder / D_shoulder

On the horizontal line segment through the neck, the positions of the left and right shoulders are obtained according to the actual shoulder width W_Real_shoulder.
c) When computing the shoulder positions, note that the user may turn sideways; in that case the projected width of the shoulders must be adjusted. This step computes the depths D_left and D_right at the left and right shoulder positions; with the preset shoulder width W_shoulder, the adjusted shoulder width W_Projected is:

W_Projected = √(W_shoulder² − (D_left − D_right)²)

Using W_Projected, the positions of the left and right shoulders are computed as in step b).
d) A local search ensures that the shoulder coordinates obtained in steps a), b), and c) lie on foreground pixels. Taking the left shoulder as an example: if the estimated left-shoulder pixel (x, y) falls on a background pixel, the pixels (x+t, y+t) to the right of the estimate are searched, where t is a search-range threshold, and the foreground pixel closest to the estimate within that range is chosen. If no foreground pixel is found, t is increased progressively to enlarge the search range until the nearest foreground pixel is found.
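The local search can be sketched as follows. The sketch generalises the text's rightward scan to a square window of radius t around the estimate, and the cut-off t_max is illustrative:

```python
def snap_to_foreground(x, y, fg, t_max=40):
    """If the estimated shoulder pixel (x, y) lands on the background, scan a
    window of radius t around it for the nearest foreground pixel, growing t
    progressively until one is found (or t_max is exceeded)."""
    h, w = len(fg), len(fg[0])
    if fg[y][x]:
        return x, y                      # already on the foreground
    t = 1
    while t <= t_max:
        best, best_d2 = None, None
        for dy in range(-t, t + 1):
            for dx in range(-t, t + 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h and fg[ny][nx]:
                    d2 = dx * dx + dy * dy
                    if best_d2 is None or d2 < best_d2:
                        best, best_d2 = (nx, ny), d2
        if best is not None:
            return best                  # nearest foreground pixel in this window
        t += 1                           # progressively enlarge the search range
    return None
```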
6) Using the colored markers worn on the limb extremities, obtain the projective coordinates of the hands and feet in the scene depth map and convert them into real-world three-dimensional coordinates:
a) In the present invention, the user wears colored markers on the hands and feet to assist in locating them; the colors of the markers must be distinguishable from the colors of the rest of the user's body.
b) The color image is converted from the RGB color space to the HSV color space, and the HSV features of the hand and foot markers are extracted as thresholds. Each frame of the scene color image is filtered with these thresholds, removing the pixels that do not match the color features, to obtain a color-threshold map; noise is then removed from the map with image erosion and dilation operations.
c) The above processing yields a binary image in which the hand and foot markers appear as blobs. The center of each blob is taken as the position of the corresponding extremity, and coordinate conversion yields the real-world three-dimensional coordinates of the hands and feet.
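Steps a) to c) can be sketched with OpenCV's HSV thresholding and morphology operations. The HSV bounds are calibration values the user would measure for their particular markers, not values given by the patent:

```python
import numpy as np

def marker_position(color_bgr, lower_hsv, upper_hsv):
    """Filter one frame with the marker's HSV range, clean the mask with
    erosion and dilation, and return the blob's centre (steps a-c)."""
    import cv2  # imported lazily so the pure helper below stays usable without OpenCV
    hsv = cv2.cvtColor(color_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.erode(mask, kernel)       # remove speckle noise
    mask = cv2.dilate(mask, kernel)      # restore the surviving blob
    return blob_centre(mask > 0)

def blob_centre(mask):
    """Centre of mass of the thresholded pixels (the blob's centre point)."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None
    return int(xs.mean()), int(ys.mean())
```

The returned (X, Y) pixel is then combined with the aligned depth value and converted to real-world coordinates as in step 2.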
7) From the three-dimensional coordinates of the hands and shoulders, compute the projected coordinates of the elbow joints in the scene depth map and convert them into real-world three-dimensional coordinates:
a) The computation first labels, among the foreground pixels of the scene depth map, the pixels belonging to the arms. The pixels representing the torso are labeled using the positions of the left and right shoulders, and the remaining regions connected to the torso are labeled as pixels representing the limbs and head. Note the case where an arm occludes the torso from directly in front: to decide whether a pixel in front of the torso represents torso or arm, compute the depth value of the torso "centroid" and compare the pixel's depth against it; if the depth difference exceeds a threshold, label the pixel as belonging to the arm region, otherwise as belonging to the torso region. The "centroid" of a region here refers to the region's representative average depth, which can be obtained by computing a histogram of the region's depth values and taking the depth value with the highest frequency (or the average of the two or more most frequent depth values) as the centroid depth;
b) Once the arm pixels are labeled, start from the hand and traverse all pixels labeled as arm in the depth map; any pixel whose distance from the hand satisfies the forearm-length constraint is marked as a potential elbow point. Then, starting from the shoulder, search the arm again for pixels whose distance from the shoulder satisfies the upper-arm-length constraint. Intersecting these points with the previously marked elbow candidates yields the estimated elbow region, and the midpoint of the points in this region is taken as the elbow position.
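Steps a) and b) above can be sketched as follows: the histogram-mode "centroid" depth of a region, and the elbow as the midpoint of the intersection of the two bone-length constraint sets. This is an illustrative sketch; `bin_width` and the relative tolerance `tol` are assumed parameters not given in the patent.

```python
import numpy as np

def region_centroid_depth(depths, bin_width=10):
    """Representative ("centroid") depth of a region, per step a): the
    center of the most frequent bin of the region's depth histogram.
    bin_width is in depth units (e.g. mm). A pixel occluding the torso is
    then labeled arm when its depth differs from this value by more than
    a threshold."""
    depths = np.asarray(depths)
    bins = np.arange(depths.min(), depths.max() + 2 * bin_width, bin_width)
    hist, edges = np.histogram(depths, bins=bins)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])

def locate_elbow(arm_pixels, hand, shoulder, forearm_len, upper_arm_len, tol=0.15):
    """Elbow estimate, per step b): intersect the arm pixels lying at
    roughly forearm_len from the hand with those at roughly upper_arm_len
    from the shoulder, and return the midpoint (mean) of the
    intersection, or None if the two sets are disjoint."""
    p = np.asarray(arm_pixels, dtype=float)
    d_hand = np.linalg.norm(p - np.asarray(hand, float), axis=1)
    d_shld = np.linalg.norm(p - np.asarray(shoulder, float), axis=1)
    cand = p[(np.abs(d_hand - forearm_len) <= tol * forearm_len)
             & (np.abs(d_shld - upper_arm_len) <= tol * upper_arm_len)]
    return tuple(cand.mean(axis=0)) if cand.size else None
```

The same routine, fed the foot and hip positions with the lower-leg and thigh lengths, yields the knee localization of step 8).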
8) From the three-dimensional coordinates of the feet and hips, compute the projected coordinates of the knee joints in the scene depth map and convert them into real-world three-dimensional coordinates:
a) The computation first labels, among the foreground pixels of the scene depth map, the pixels belonging to the legs: the pixels representing the torso are labeled using the positions of the left and right shoulders, and the remaining regions connected to the torso are labeled as pixels representing the limbs and head;
b) Once the leg pixels are labeled, start from the foot and traverse all pixels labeled as leg in the depth map; any pixel whose distance from the foot satisfies the lower-leg-length constraint is marked as a potential knee point. Then, starting from the hip, search the leg again for pixels whose distance from the hip satisfies the thigh-length constraint. Intersecting these points with the previously marked knee candidates yields the estimated knee region, and the midpoint of the points in this region is taken as the knee position.
9) Process every frame of the scene depth map sequence and scene color map sequence according to steps 2) to 8), assemble the three-dimensional coordinates of the body parts captured in each frame into the human skeleton model shown in Figure 3 according to the human body structure, and output it. Smooth the skeleton model by assigning each captured body part a constraint space and a confidence value for its three-dimensional coordinates; the constraint space defines the maximum displacement allowed for that body part between two adjacent frames:
a) For each body part, define a constraint length D and a confidence C. D describes the constraint space: a sphere centered on the body part with radius D, representing the maximum displacement allowed for that part between two adjacent frames. The constraint space differs between body parts; for example, the hand's constraint space is larger than the shoulder's;
b) The confidence C indicates how accurate the body part's current coordinates are; the higher C, the more accurate the position. Initially, every body part's confidence is set to 0. In a new frame, if the part's new position lies within its constraint space from the previous frame, its confidence is increased by one step (unless it has already reached the maximum, in which case it is left unchanged). Conversely, if the new position lies outside the previous frame's constraint space, the part is moved toward the new position only by a distance of Length/C, where Length is the length of the line segment between the part's previous position and the new position, and its confidence is then decreased by one step.
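The constraint-space/confidence smoothing of steps a) and b) can be sketched per joint as below. This is a hedged illustration: the class and method names, the confidence cap `c_max`, and the unit increment/decrement are assumptions not fixed by the patent.

```python
import math

class SmoothedJoint:
    """Temporal smoothing of one tracked body part, per step 9).

    D is the constraint-sphere radius (maximum displacement allowed
    between adjacent frames); C is the confidence, starting at 0 and
    capped at c_max. A measurement inside the sphere is accepted and
    raises C; an outlier moves the joint only Length/C toward the new
    measurement and lowers C.
    """
    def __init__(self, pos, D, c_max=10):
        self.pos, self.D, self.C, self.c_max = list(pos), D, 0, c_max

    def update(self, new_pos):
        length = math.dist(self.pos, new_pos)
        if length <= self.D:               # inside constraint space: accept
            self.pos = list(new_pos)
            self.C = min(self.C + 1, self.c_max)
        else:                              # outlier: damped move of Length/C
            step = length / max(self.C, 1)
            f = min(step / length, 1.0)
            self.pos = [a + f * (b - a) for a, b in zip(self.pos, new_pos)]
            self.C = max(self.C - 1, 0)
        return tuple(self.pos)
```

Note how a joint that has tracked consistently (high C) barely reacts to a single outlier frame, while a low-confidence joint follows new measurements more readily.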
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2012102308982A CN102800126A (en) | 2012-07-04 | 2012-07-04 | Method for recovering real-time three-dimensional body posture based on multimodal fusion |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN102800126A (en) | 2012-11-28 |
Family
ID=47199222
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN2012102308982A Pending CN102800126A (en) | 2012-07-04 | 2012-07-04 | Method for recovering real-time three-dimensional body posture based on multimodal fusion |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN102800126A (en) |
Cited By (37)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103336953A (en) * | 2013-07-05 | 2013-10-02 | 深圳市中视典数字科技有限公司 | Movement judgment method based on body sensing equipment |
| CN103745226A (en) * | 2013-12-31 | 2014-04-23 | 国家电网公司 | Dressing safety detection method for worker on working site of electric power facility |
| CN104167016A (en) * | 2014-06-16 | 2014-11-26 | 西安工业大学 | Three-dimensional motion reconstruction method based on RGB color and depth image |
| CN104573612A (en) * | 2013-10-16 | 2015-04-29 | 北京三星通信技术研究有限公司 | Equipment and method for estimating postures of multiple overlapped human body objects in range image |
| CN105407774A (en) * | 2013-07-29 | 2016-03-16 | 三星电子株式会社 | Auto-cleaning system, cleaning robot and method of controlling the cleaning robot |
| CN105574525A (en) * | 2015-12-18 | 2016-05-11 | 天津中科智能识别产业技术研究院有限公司 | Method and device for obtaining complex scene multi-mode biology characteristic image |
| CN106535759A (en) * | 2014-04-09 | 2017-03-22 | 拜耳消费者保健股份公司 | Method, apparatus, and computer-readable medium for generating a set of recommended orthotic products |
| CN106846324A (en) * | 2017-01-16 | 2017-06-13 | 河海大学常州校区 | A kind of irregular object height measurement method based on Kinect |
| CN107169262A (en) * | 2017-03-31 | 2017-09-15 | 百度在线网络技术(北京)有限公司 | Recommend method, device, equipment and the computer-readable storage medium of body shaping scheme |
| CN107230226A (en) * | 2017-05-15 | 2017-10-03 | 深圳奥比中光科技有限公司 | Determination methods, device and the storage device of human body incidence relation |
| CN107481286A (en) * | 2017-07-11 | 2017-12-15 | 厦门博尔利信息技术有限公司 | Dynamic 3 D schematic capture algorithm based on passive infrared reflection |
| CN107808128A (en) * | 2017-10-16 | 2018-03-16 | 深圳市云之梦科技有限公司 | A kind of virtual image rebuilds the method and system of human body face measurement |
| CN108295469A (en) * | 2017-12-04 | 2018-07-20 | 成都思悟革科技有限公司 | Game visual angle effect method based on motion capture technology |
| CN108542021A (en) * | 2018-03-18 | 2018-09-18 | 江苏特力威信息系统有限公司 | A kind of gym suit and limbs measurement method and device based on vitta identification |
| CN109353907A (en) * | 2017-09-05 | 2019-02-19 | 日立楼宇技术(广州)有限公司 | Safety prompting method and system for elevator operation |
| CN110342252A (en) * | 2019-07-01 | 2019-10-18 | 芜湖启迪睿视信息技术有限公司 | A kind of article automatically grabs method and automatic grabbing device |
| CN110781820A (en) * | 2019-10-25 | 2020-02-11 | 网易(杭州)网络有限公司 | Game character action generating method, game character action generating device, computer device and storage medium |
| CN110909580A (en) * | 2018-09-18 | 2020-03-24 | 北京市商汤科技开发有限公司 | Data processing method and device, electronic equipment and storage medium |
| CN111144207A (en) * | 2019-11-21 | 2020-05-12 | 东南大学 | Human body detection and tracking method based on multi-mode information perception |
| CN111640176A (en) * | 2018-06-21 | 2020-09-08 | 华为技术有限公司 | A kind of object modeling motion method, device and equipment |
| CN111680670A (en) * | 2020-08-12 | 2020-09-18 | 长沙小钴科技有限公司 | Cross-mode human head detection method and device |
| CN112037319A (en) * | 2020-08-19 | 2020-12-04 | 上海佑久健康科技有限公司 | Human body measuring method, system and computer readable storage medium |
| CN112150448A (en) * | 2020-09-28 | 2020-12-29 | 杭州海康威视数字技术股份有限公司 | Image processing method, device and equipment and storage medium |
| CN112184898A (en) * | 2020-10-21 | 2021-01-05 | 安徽动感智能科技有限公司 | Digital human body modeling method based on motion recognition |
| CN112241477A (en) * | 2019-07-18 | 2021-01-19 | 国网河北省电力有限公司邢台供电分公司 | Multidimensional data visualization method for assisting transformer maintenance operation site |
| CN113065532A (en) * | 2021-05-19 | 2021-07-02 | 南京大学 | Sitting posture geometric parameter detection method and system based on RGBD image |
| CN113158910A (en) * | 2021-04-25 | 2021-07-23 | 北京华捷艾米科技有限公司 | Human skeleton recognition method and device, computer equipment and storage medium |
| CN113343925A (en) * | 2021-07-02 | 2021-09-03 | 厦门美图之家科技有限公司 | Face three-dimensional reconstruction method and device, electronic equipment and storage medium |
| CN113384844A (en) * | 2021-06-17 | 2021-09-14 | 郑州万特电气股份有限公司 | Fire extinguishing action detection method based on binocular vision and fire extinguisher safety practical training system |
| CN113591726A (en) * | 2021-08-03 | 2021-11-02 | 电子科技大学 | Cross mode evaluation method for Taijiquan training action |
| WO2021253777A1 (en) * | 2020-06-19 | 2021-12-23 | 北京市商汤科技开发有限公司 | Attitude detection and video processing methods and apparatuses, electronic device, and storage medium |
| CN114926401A (en) * | 2022-04-21 | 2022-08-19 | 奥比中光科技集团股份有限公司 | Skeleton point shielding detection method, device, equipment and storage medium |
| CN115308768A (en) * | 2022-07-22 | 2022-11-08 | 万达信息股份有限公司 | Intelligent monitoring system under privacy environment |
| CN116563952A (en) * | 2023-07-07 | 2023-08-08 | 厦门医学院 | A Method for Restoring Missing Data in Motion Capture Combining Graph Neural Network and Bone Length Constraint |
| CN116901034A (en) * | 2022-04-20 | 2023-10-20 | 埃博茨股份有限公司 | Reliable robotic manipulation in cluttered environments |
| CN117953662A (en) * | 2024-03-26 | 2024-04-30 | 广东先知大数据股份有限公司 | A drowning warning method, device, equipment and storage medium for people on the bank of a pond |
| CN120183002A (en) * | 2025-05-22 | 2025-06-20 | 泰山体育产业集团有限公司 | A method for locating 3D human joints using monocular color video |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101253373A (en) * | 2005-08-30 | 2008-08-27 | Toshiba Carrier Corporation | Indoor unit of air conditioner |
| CN101657825A (en) * | 2006-05-11 | 2010-02-24 | Prime Sense Ltd. | Modeling Human Figures from Depth Maps |
| CN102350700A (en) * | 2011-09-19 | 2012-02-15 | South China University of Technology | Method for controlling robot based on visual sense |
Non-Patent Citations (2)
| Title |
|---|
| HIMANSHU PRAKASH JAIN ET AL: "Real-Time Upper-Body Human Pose Estimation Using a Depth Camera", 《COMPUTER VISION/COMPUTER GRAPHICS COLLABORATION TECHNIQUES LECTURE NOTES IN COMPUTER SCIENCE》 * |
| ZHOU JUAN ET AL.: "Multimodal face recognition based on intensity and depth images", 《Computer Engineering and Applications》 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
| WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20121128 |