
CN111222401A - Method and device for identifying three-dimensional coordinates of hand key points

Info

Publication number
CN111222401A
Authority
CN
China
Prior art keywords
hand
color image
model
key points
dimensional coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911112541.2A
Other languages
Chinese (zh)
Other versions
CN111222401B (en)
Inventor
李江
李骊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing HJIMI Technology Co Ltd
Original Assignee
Beijing HJIMI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing HJIMI Technology Co Ltd
Priority to CN201911112541.2A
Publication of CN111222401A
Application granted
Publication of CN111222401B
Expired - Fee Related (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method and a device for identifying three-dimensional coordinates of hand key points. A target hand frame color image is acquired, the target hand frame color image being a color image obtained after hand detection. The target hand frame color image is input into a hand key point three-dimensional coordinate recognition network model for processing to obtain the three-dimensional coordinates of the target hand key points. The invention thereby identifies the three-dimensional coordinates of hand key points based on color data.

Description

Method and device for identifying three-dimensional coordinates of hand key points
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a device for identifying three-dimensional coordinates of key points of a hand.
Background
3D gesture key point estimation is a key technology for 3D gesture control. A common current scheme for estimating hand key point coordinates based on depth images works as follows: a depth camera is used, directly or indirectly, to obtain an infrared image and a color image; the two-dimensional coordinates of the hand key points are identified in the image with a color-image algorithm operating in RGB space; the depth values at the corresponding positions in the registered depth image are then taken as the depth-direction values of the key points. Alternatively, the three-dimensional coordinates of the hand key points are identified directly in the depth image with a monocular depth-image algorithm.
However, hand key point estimation based on a monocular depth camera depends on the quality of the depth map. When the depth image is noisy, the depth values are inaccurate, the edge contours are not smooth, or the background depth values interfere strongly, the depth data of the hand foreground becomes inaccurate, which degrades the accuracy of the estimated hand key point coordinates. Moreover, few existing mobile devices such as mobile phones and tablet computers integrate a depth camera, and most products that do suffer from overheating and high power consumption, so the user experience of depth-camera-based hand key point coordinate estimation is poor.
Disclosure of Invention
In view of this, the present invention provides a method and an apparatus for recognizing three-dimensional coordinates of key points of a hand, so as to realize recognition of three-dimensional coordinates of key points of the hand based on color data.
In order to achieve the above purpose, the invention provides the following specific technical scheme:
A method for identifying three-dimensional coordinates of hand key points comprises the following steps:
acquiring a target hand frame color image, wherein the target hand frame color image is a color image obtained after hand detection;
and inputting the target hand frame color image into a three-dimensional coordinate recognition network model of the hand key point for processing to obtain the three-dimensional coordinate of the target hand key point.
Optionally, the method further includes:
acquiring training data of the three-dimensional coordinate recognition network model of the hand key points;
and training a preset neural network model by using the training data, and obtaining the three-dimensional coordinate recognition network model of the hand key point when the accuracy of the output result of the preset neural network model is greater than a threshold value.
Optionally, the acquiring training data of the three-dimensional coordinate recognition network model of the hand key point includes:
once the orientation of a camera and its distance from an artificial hand model have been set in a CG model, acquiring a color image of the artificial hand model with the camera;
acquiring three-dimensional coordinates of key points of the hand according to the artificial hand model;
fusing the color image of the artificial hand model with the real scene image to obtain a color image with a foreground artificial hand model and a real background;
according to internal parameters of a camera, cutting a hand area in a color image with a foreground artificial hand model and a real background to obtain a hand frame color image, and performing normalization processing on the hand frame color image and three-dimensional coordinates of hand key points to obtain training data of the hand key point three-dimensional coordinate recognition network model, wherein the training data comprises the normalized hand frame color image and the three-dimensional coordinates of the hand key points.
Optionally, the acquiring training data of the three-dimensional coordinate recognition network model of the hand key point includes:
acquiring a depth image and a color image which are acquired by a depth camera and are synchronized and registered with each other;
recognizing three-dimensional coordinates of the hand key points of the depth image by using a hand key point coordinate recognition model based on depth data;
cutting a hand area in the color image to obtain a hand frame color image corresponding to the depth image hand area;
and normalizing the three-dimensional coordinates of the hand key points according to the depth value of the center of the hand frame color image to obtain training data of the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.
Optionally, the three-dimensional coordinates of the target hand key point are position coordinates relative to the center of the target hand frame color image area, and the method further includes:
acquiring a depth value of the center of the target hand frame color image area;
and calculating the real three-dimensional coordinates of the key points of the target hand according to the depth value of the center of the color image area of the target hand frame.
A hand key point three-dimensional coordinate recognition device comprises:
a color image acquisition unit, configured to acquire a target hand frame color image, wherein the target hand frame color image is a color image obtained after hand detection;
and the three-dimensional coordinate identification unit is used for inputting the target hand frame color image into a hand key point three-dimensional coordinate identification network model for processing to obtain the three-dimensional coordinates of the target hand key point.
Optionally, the apparatus further comprises:
the training data acquisition unit is used for acquiring training data of the three-dimensional coordinate recognition network model of the hand key points;
and the recognition model training unit is used for training a preset neural network model by using the training data, and when the accuracy of the output result of the preset neural network model is greater than a threshold value, the three-dimensional coordinate recognition network model of the hand key point is obtained.
Optionally, the training data obtaining unit is specifically configured to:
once the orientation of a camera and its distance from an artificial hand model have been set in a CG model, acquiring a color image of the artificial hand model with the camera;
acquiring three-dimensional coordinates of key points of the hand according to the artificial hand model;
fusing the color image of the artificial hand model with the real scene image to obtain a color image with a foreground artificial hand model and a real background;
according to internal parameters of a camera, cutting a hand area in a color image with a foreground artificial hand model and a real background to obtain a hand frame color image, and performing normalization processing on the hand frame color image and three-dimensional coordinates of hand key points to obtain training data of the hand key point three-dimensional coordinate recognition network model, wherein the training data comprises the normalized hand frame color image and the three-dimensional coordinates of the hand key points.
Optionally, the training data obtaining unit is specifically configured to:
acquiring a depth image and a color image which are acquired by a depth camera and are synchronized and registered with each other;
recognizing three-dimensional coordinates of the hand key points of the depth image by using a hand key point coordinate recognition model based on depth data;
cutting a hand area in the color image to obtain a hand frame color image corresponding to the depth image hand area;
and normalizing the three-dimensional coordinates of the hand key points according to the depth value of the center of the hand frame color image to obtain training data of the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.
Optionally, the three-dimensional coordinates of the target hand key point are position coordinates relative to the center of the target hand frame color image area, and the apparatus further includes:
the three-dimensional coordinate conversion unit is used for acquiring the depth value of the center of the target hand frame color image area; and calculating the real three-dimensional coordinates of the key points of the target hand according to the depth value of the center of the color image area of the target hand frame.
Compared with the prior art, the invention has the following beneficial effects:
In the method for identifying three-dimensional coordinates of hand key points, the three-dimensional coordinates of the hand key points in the hand frame color image are recognized by a hand key point three-dimensional coordinate recognition network model obtained through pre-training. The three-dimensional coordinates of the hand key points can thus be recognized from the color image alone, without a depth image, which improves the user experience and suits a wide range of application scenarios.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flowchart of a method for identifying three-dimensional coordinates of hand key points according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method for training a hand key point three-dimensional coordinate recognition network model according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method for obtaining training data of a hand key point three-dimensional coordinate recognition network model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the relative positions of a camera and an artificial hand model in a CG model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an acquired color image of an artificial hand model according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an image obtained by fusing a color image of an artificial hand model with a real scene image according to an embodiment of the present invention;
FIG. 7 is a schematic flowchart of another method for obtaining training data of a hand key point three-dimensional coordinate recognition network model according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a hand key point three-dimensional coordinate recognition device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Compared with a depth camera, a color camera has low cost and low energy consumption and is widely used in current mobile devices. This embodiment discloses a method for identifying three-dimensional coordinates of hand key points, which is applied to a mobile device equipped with a color camera and identifies the three-dimensional coordinates of hand key points based on a color image. Referring to fig. 1, the method disclosed in this embodiment includes the following steps:
S101: acquiring a target hand frame color image, wherein the target hand frame color image is a color image obtained after hand detection;
After the color camera captures a color image containing a hand, hand detection is applied to obtain the target hand frame color image, that is, the hand frame color image on which three-dimensional coordinate identification of the hand key points is to be performed this time.
S102: and inputting the target hand frame color image into a three-dimensional coordinate recognition network model of the hand key point for processing to obtain the three-dimensional coordinate of the target hand key point.
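Steps S101 and S102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the hand box is assumed to come from some upstream hand detector, `dummy_model` is a hypothetical stand-in for the trained network, and the count of 21 key points is an assumption that the source does not specify.

```python
import numpy as np

def crop_hand_frame(image, box):
    """Cut the hand frame (x0, y0, x1, y1), end-exclusive, out of an H x W x 3 image."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]

def recognize_keypoints(image, box, model):
    """S101 + S102: crop the detected hand frame, scale pixels to [0, 1], run the model."""
    hand_frame = crop_hand_frame(image, box).astype(np.float32) / 255.0
    return model(hand_frame)

def dummy_model(hand_frame):
    """Hypothetical stand-in for the trained network: 21 key points, all zeros."""
    return np.zeros((21, 3), dtype=np.float32)

# Synthetic color image and a hand box assumed to come from a hand detector.
image = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
box = (200, 100, 328, 228)
coords = recognize_keypoints(image, box, dummy_model)
print(coords.shape)
```

Passing the model as a plain callable keeps the sketch independent of any particular deep learning framework.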
The three-dimensional coordinate recognition network model of the hand key point is obtained by pre-training, please refer to fig. 2, and the training method of the three-dimensional coordinate recognition network model of the hand key point is as follows:
S201: acquiring training data of a three-dimensional coordinate recognition network model of a hand key point;
This embodiment provides two methods for acquiring training data. In the first method, a color camera is used to acquire a color image of an artificial hand model in a CG model, and the training data is obtained through image fusion, hand region cutting, and normalization of the three-dimensional coordinates of the hand key points. In the second method, a depth image and a color image containing a hand are acquired with a depth camera, the three-dimensional coordinates of the hand key points in the depth image are recognized by a hand key point coordinate recognition model based on depth data, and the training data is obtained through hand region cutting and normalization of the three-dimensional coordinates of the hand key points.
The two training data acquisition methods specifically comprise the following steps:
Method one
Referring to fig. 3, the method for obtaining training data of the three-dimensional coordinate recognition network model of the hand key point includes the following steps:
S301: once the orientation of a camera and its distance from an artificial hand model have been set in a CG model, acquiring a color image of the artificial hand model with the camera;
The relative positions of the camera and the artificial hand model in the CG model are shown in fig. 4. The camera's intrinsic parameters have been set; they include the horizontal focal length, the vertical focal length, and the horizontal and vertical coordinates of the image center.
S302: acquiring three-dimensional coordinates of key points of the hand according to the artificial hand model;
As shown in fig. 5, the acquired color image of the artificial hand model corresponds to the hand key points. The three-dimensional coordinates of the hand key points include a depth direction and may be world coordinates or image uvd coordinates.
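The relationship between the two coordinate choices can be illustrated with the four intrinsic parameters listed above. The sketch below assumes a standard pinhole camera model and assumes the key points are already expressed in the camera's coordinate frame; neither assumption is stated in the source, and the function names are illustrative.

```python
import numpy as np

def camera_to_uvd(points_xyz, fx, fy, cx, cy):
    """Project N x 3 camera-space points (X, Y, Z) to image uvd coordinates."""
    p = np.asarray(points_xyz, dtype=np.float64)
    u = fx * p[:, 0] / p[:, 2] + cx   # horizontal pixel coordinate
    v = fy * p[:, 1] / p[:, 2] + cy   # vertical pixel coordinate
    return np.stack([u, v, p[:, 2]], axis=1)

def uvd_to_camera(points_uvd, fx, fy, cx, cy):
    """Inverse mapping: uvd coordinates back to camera-space (X, Y, Z)."""
    p = np.asarray(points_uvd, dtype=np.float64)
    x = (p[:, 0] - cx) * p[:, 2] / fx
    y = (p[:, 1] - cy) * p[:, 2] / fy
    return np.stack([x, y, p[:, 2]], axis=1)

# Assumed intrinsics: focal lengths in pixels, image center of a 640 x 480 image.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
keypoints_xyz = np.array([[0.0, 0.0, 500.0], [50.0, -25.0, 500.0]])
keypoints_uvd = camera_to_uvd(keypoints_xyz, fx, fy, cx, cy)
print(keypoints_uvd)
```

Because the mapping is invertible for positive depth, labels generated in either form can be converted to the other when the intrinsics are known.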
S303: fusing the color image of the artificial hand model with the real scene image to obtain a color image with a foreground artificial hand model and a real background;
Specifically, the color image of the artificial hand model is used as a mask on the real image: it is placed at the center of the real scene image or at another set position, and the original real-image data inside the contour of the artificial hand is replaced with the color image data of the artificial hand model, so that the color image of the artificial hand model is fused with the real scene image.
A color image with a foreground prosthetic hand model and a real background is shown in fig. 6.
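The mask-based fusion of step S303 can be sketched as below. The sketch assumes the rendered hand comes with a boolean mask marking the pixels inside the hand contour and that the render fits inside the background at the chosen position; the function names are illustrative, not from the source.

```python
import numpy as np

def centered_top_left(background_shape, hand_shape):
    """Top-left (row, col) that centers the hand render in the background."""
    return ((background_shape[0] - hand_shape[0]) // 2,
            (background_shape[1] - hand_shape[1]) // 2)

def fuse_hand_with_background(hand_rgb, hand_mask, background, top_left):
    """Replace background pixels inside the hand contour with rendered hand pixels."""
    fused = background.copy()
    h, w = hand_mask.shape
    y0, x0 = top_left
    region = fused[y0:y0 + h, x0:x0 + w]     # view into the fused image
    region[hand_mask] = hand_rgb[hand_mask]  # copy only pixels inside the contour
    return fused

# Toy data: a 4 x 4 white render whose central 2 x 2 block is the "hand".
background = np.zeros((8, 8, 3), dtype=np.uint8)
hand_rgb = np.full((4, 4, 3), 255, dtype=np.uint8)
hand_mask = np.zeros((4, 4), dtype=bool)
hand_mask[1:3, 1:3] = True
top_left = centered_top_left(background.shape, hand_mask.shape)
fused = fuse_hand_with_background(hand_rgb, hand_mask, background, top_left)
print(fused[:, :, 0])
```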
S304: according to internal parameters of a camera, cutting a hand area in a color image with a foreground artificial hand model and a real background to obtain a hand frame color image, and performing normalization processing on the hand frame color image and three-dimensional coordinates of hand key points to obtain training data of the hand key point three-dimensional coordinate recognition network model, wherein the training data comprises the normalized hand frame color image and the three-dimensional coordinates of the hand key points.
The color image of the hand frame obtained after cutting the hand area is the smallest surrounding frame image which can surround the hand area.
Normalizing the hand frame color image and the three-dimensional coordinates of the hand key points means the following. The pixels of the hand frame color image are normalized to the interval [-1, 1] or [0, 1]. The coordinates of the hand key points are normalized using, as the normalization reference point, the centroid coordinates calculated from the coordinates of the artificial hand frame; the normalization interval is [-1, 1] or [0, 1] and is consistent with the normalization range of the hand frame color image.
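A minimal sketch of the cutting and normalization in step S304 follows. Several details are assumptions made only for illustration: the hand area is given as a boolean mask, the centroid is taken as the mean of the key points (the source derives it from the hand frame coordinates), and `half_extent` is a hypothetical fixed scale that maps offsets from the centroid into [-1, 1].

```python
import numpy as np

def min_bounding_box(hand_mask):
    """Smallest box (x0, y0, x1, y1), end-exclusive, that surrounds the hand area."""
    ys, xs = np.nonzero(hand_mask)
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1

def normalize_pixels(hand_frame):
    """Map uint8 pixel values from [0, 255] to [-1, 1]."""
    return hand_frame.astype(np.float32) / 127.5 - 1.0

def normalize_keypoints(keypoints_xyz, half_extent):
    """Express key points relative to their centroid, scaled by half_extent."""
    kp = np.asarray(keypoints_xyz, dtype=np.float32)
    centroid = kp.mean(axis=0)   # assumed reference point
    return (kp - centroid) / half_extent, centroid

# Toy data: a 10 x 10 mask with a rectangular hand area and two key points.
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:8] = True
box = min_bounding_box(mask)
normalized, centroid = normalize_keypoints([[0, 0, 0], [2, 2, 2]], half_extent=2.0)
print(box, centroid, normalized)
```

Returning the centroid alongside the normalized coordinates keeps the information needed to undo the normalization later.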
Method two
Referring to fig. 7, the method for obtaining training data of the three-dimensional coordinate recognition network model of the hand key point includes the following steps:
S401: acquiring a depth image and a color image which are acquired by a depth camera and are synchronized and registered with each other;
Registration here means that, with frame-synchronized transmission guaranteed, the color image and the depth image have the same size and their pixels correspond one to one. One-to-one correspondence does not mean that the pixel values of the two images are equal; rather, if the center of the color image holds the RGB data of the palm of a hand, then the center of the corresponding depth image holds the depth value of that palm.
S402: recognizing three-dimensional coordinates of the hand key points of the depth image by using a hand key point coordinate recognition model based on depth data;
The hand key point coordinate recognition model based on depth data may be any existing model; its principle is known and is not described here.
S403: cutting a hand area in the color image to obtain a hand frame color image corresponding to the depth image hand area;
The color image of the hand frame obtained after cutting the hand area is the smallest surrounding frame image which can surround the hand area.
S404: and normalizing the three-dimensional coordinates of the hand key points according to the depth value of the center of the hand frame color image to obtain training data of the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.
Specifically, the depth value at the center of the hand frame color image is used as the reference point, and the coordinates of the hand key points after cutting are normalized to [-1, 1] or [0, 1]. The hand frame color image is normalized to the corresponding range at the same time, that is, its pixels are normalized to the interval [-1, 1] or [0, 1].
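The normalization of step S404 can be sketched as follows, assuming the key points are given in image uvd coordinates. The scaling is an assumption chosen for illustration: u and v are divided by half the frame width and height, and depth offsets are divided by a hypothetical `half_extent` depth range, so that points inside the frame fall in [-1, 1].

```python
import numpy as np

def normalize_by_center_depth(keypoints_uvd, depth_image, box, half_extent):
    """Normalize uvd key points relative to the hand frame center and its depth."""
    x0, y0, x1, y1 = box
    cu, cv = (x0 + x1) // 2, (y0 + y1) // 2
    # Reference point: frame center in the image plus the registered depth there.
    center = np.array([cu, cv, depth_image[cv, cu]], dtype=np.float32)
    scale = np.array([(x1 - x0) / 2, (y1 - y0) / 2, half_extent], dtype=np.float32)
    return (np.asarray(keypoints_uvd, dtype=np.float32) - center) / scale

# Toy data: a flat depth map at 500 (depth units assumed to be millimetres).
depth_image = np.full((100, 100), 500.0, dtype=np.float32)
box = (10, 20, 30, 60)
keypoints_uvd = [[20.0, 40.0, 500.0], [30.0, 20.0, 550.0]]
normalized = normalize_by_center_depth(keypoints_uvd, depth_image, box, half_extent=100.0)
print(normalized)
```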
After the normalized data is obtained, both training data acquisition methods can apply data augmentation such as translation, rotation, scaling, and mirroring. In addition, the artificial hand model can be driven flexibly through the degrees of freedom of its skeleton joints to pose different gestures, which is not described in detail here.
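As one example of such augmentation, mirroring must be applied to the image and to the labels together. The sketch below assumes the key points are already normalized relative to the frame center in [-1, 1], so that a horizontal flip simply negates the x component; with a [0, 1] normalization the x component would instead become 1 - x.

```python
import numpy as np

def mirror_sample(hand_frame, keypoints_norm):
    """Horizontally flip a hand frame together with its center-relative key points."""
    flipped_image = hand_frame[:, ::-1].copy()              # reverse the column order
    flipped_kp = np.array(keypoints_norm, dtype=np.float32)  # copy of the labels
    flipped_kp[:, 0] *= -1.0                                 # x is relative to the center
    return flipped_image, flipped_kp

# Toy data: a 2 x 3 image with distinct pixel values and one key point.
frame = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
keypoints = np.array([[0.5, 0.2, 0.1]], dtype=np.float32)
mirrored_frame, mirrored_kp = mirror_sample(frame, keypoints)
print(mirrored_kp)
```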
S202: and training a preset neural network model by using the training data, and obtaining a three-dimensional coordinate recognition network model of the hand key point when the accuracy of the output result of the preset neural network model is greater than a threshold value.
The input of the hand key point three-dimensional coordinate recognition network model is a hand frame color image, and its output is the three-dimensional coordinates of the hand key points in that hand frame color image.
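The stopping rule of step S202 can be sketched as a loop that trains until the measured accuracy exceeds the threshold. The source does not define how accuracy is computed; the sketch assumes it is the fraction of key points whose Euclidean error is below a tolerance. `train_step` and `evaluate` are hypothetical callables standing in for one training pass and one evaluation pass of a real network, and the toy "model" below only illustrates the control flow.

```python
import numpy as np

def keypoint_accuracy(pred, target, tol):
    """Fraction of key points whose Euclidean error is below tol (assumed metric)."""
    err = np.linalg.norm(np.asarray(pred) - np.asarray(target), axis=-1)
    return float((err < tol).mean())

def train_until_accurate(train_step, evaluate, threshold, max_epochs):
    """Repeat train_step() until evaluate() exceeds threshold or max_epochs is hit."""
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        train_step()
        accuracy = evaluate()
        if accuracy > threshold:
            return epoch, accuracy
    return max_epochs, accuracy

# Toy stand-in for a network: the prediction moves halfway to the target each step.
target = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
state = {"pred": np.zeros_like(target)}

def toy_step():
    state["pred"] += 0.5 * (target - state["pred"])

def toy_eval():
    return keypoint_accuracy(state["pred"], target, tol=0.5)

epochs, acc = train_until_accurate(toy_step, toy_eval, threshold=0.9, max_epochs=20)
print(epochs, acc)
```

The `max_epochs` cap is an addition for safety so that the loop terminates even if the threshold is never reached.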
The three-dimensional coordinates of the hand key points output by the hand key point three-dimensional coordinate recognition network model are position coordinates relative to the center of the target hand frame color image area. The depth value of the center of the target hand frame color image area is acquired by means such as SLAM, and the real three-dimensional coordinates of the target hand key points are then calculated from that depth value.
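The conversion from center-relative output to real coordinates can be sketched as the inverse of the normalization used for training. Everything about the scaling here is an assumption for illustration: the outputs are taken to be uvd offsets in [-1, 1], u and v are scaled by half the frame size, depth by a hypothetical `half_extent`, and a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` maps uvd to metric camera coordinates.

```python
import numpy as np

def to_real_coordinates(keypoints_norm, box, center_depth, half_extent, fx, fy, cx, cy):
    """Convert center-relative normalized key points to camera-space (X, Y, Z)."""
    x0, y0, x1, y1 = box
    center = np.array([(x0 + x1) / 2, (y0 + y1) / 2, center_depth])
    scale = np.array([(x1 - x0) / 2, (y1 - y0) / 2, half_extent])
    uvd = np.asarray(keypoints_norm, dtype=np.float64) * scale + center
    x = (uvd[:, 0] - cx) * uvd[:, 2] / fx   # back-project through the pinhole model
    y = (uvd[:, 1] - cy) * uvd[:, 2] / fy
    return np.stack([x, y, uvd[:, 2]], axis=1)

# Toy data: a hand frame centered on the optical axis at an assumed depth of 500.
box = (310, 220, 330, 260)
outputs = np.array([[0.0, 0.0, 0.0], [1.0, -1.0, 0.5]])
real = to_real_coordinates(outputs, box, center_depth=500.0, half_extent=100.0,
                           fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(real)
```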
Therefore, in the method for identifying three-dimensional coordinates of hand key points disclosed in this embodiment, the three-dimensional coordinates of the hand key points in the hand frame color image are recognized by a hand key point three-dimensional coordinate recognition network model obtained through pre-training. The three-dimensional coordinates of the hand key points can thus be recognized from the color image alone, without a depth image, which improves the user experience and suits a wide range of application scenarios.
Based on the method for identifying three-dimensional coordinates of key points of a hand disclosed in the above embodiments, the present embodiment correspondingly discloses a device for identifying three-dimensional coordinates of key points of a hand, please refer to fig. 8, and the device includes:
a color image acquisition unit 801, configured to acquire a color image of a target hand frame, where the color image of the target hand frame is obtained after hand detection;
and a three-dimensional coordinate recognition unit 802, configured to input the target hand frame color image into a hand key point three-dimensional coordinate recognition network model for processing, so as to obtain a three-dimensional coordinate of the target hand key point.
Optionally, the apparatus further comprises:
the training data acquisition unit is used for acquiring training data of the three-dimensional coordinate recognition network model of the hand key points;
and the recognition model training unit is used for training a preset neural network model by using the training data, and when the accuracy of the output result of the preset neural network model is greater than a threshold value, the three-dimensional coordinate recognition network model of the hand key point is obtained.
Optionally, the training data obtaining unit is specifically configured to:
once the orientation of a camera and its distance from an artificial hand model have been set in a CG model, acquiring a color image of the artificial hand model with the camera;
acquiring three-dimensional coordinates of key points of the hand according to the artificial hand model;
fusing the color image of the artificial hand model with the real scene image to obtain a color image with a foreground artificial hand model and a real background;
according to internal parameters of a camera, cutting a hand area in a color image with a foreground artificial hand model and a real background to obtain a hand frame color image, and performing normalization processing on the hand frame color image and three-dimensional coordinates of hand key points to obtain training data of the hand key point three-dimensional coordinate recognition network model, wherein the training data comprises the normalized hand frame color image and the three-dimensional coordinates of the hand key points.
Optionally, the training data obtaining unit is specifically configured to:
acquiring a depth image and a color image which are acquired by a depth camera and are synchronized and registered with each other;
recognizing three-dimensional coordinates of the hand key points of the depth image by using a hand key point coordinate recognition model based on depth data;
cutting a hand area in the color image to obtain a hand frame color image corresponding to the depth image hand area;
and normalizing the three-dimensional coordinates of the hand key points according to the depth value of the center of the hand frame color image to obtain training data of the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.
Optionally, the three-dimensional coordinates of the target hand key point are position coordinates relative to the center of the target hand frame color image area, and the apparatus further includes:
the three-dimensional coordinate conversion unit is used for acquiring the depth value of the center of the target hand frame color image area; and calculating the real three-dimensional coordinates of the key points of the target hand according to the depth value of the center of the color image area of the target hand frame.
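The unit structure of the device can be sketched as a small composition of callables. This is only an illustration of how the units fit together: the class and field names are hypothetical, each unit is modeled as a plain function, and the lambdas in the usage example are placeholders rather than real acquisition or recognition code.

```python
from dataclasses import dataclass
from typing import Callable, Optional

import numpy as np

@dataclass
class HandKeypointDevice:
    """Hypothetical composition of the units described for the device."""
    acquire_hand_frame: Callable[[], np.ndarray]                    # color image acquisition unit 801
    recognize: Callable[[np.ndarray], np.ndarray]                   # three-dimensional coordinate recognition unit 802
    convert: Optional[Callable[[np.ndarray], np.ndarray]] = None    # optional coordinate conversion unit

    def run(self) -> np.ndarray:
        frame = self.acquire_hand_frame()
        coords = self.recognize(frame)
        # Apply the conversion unit only when the device includes one.
        return self.convert(coords) if self.convert is not None else coords

# Placeholder units: a blank frame, a constant prediction, and a doubling conversion.
device = HandKeypointDevice(
    acquire_hand_frame=lambda: np.zeros((4, 4, 3), dtype=np.float32),
    recognize=lambda frame: np.ones((2, 3), dtype=np.float32),
    convert=lambda coords: coords * 2.0,
)
result = device.run()
print(result.shape)
```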
In the hand key point three-dimensional coordinate recognition device disclosed in this embodiment, the three-dimensional coordinates of the hand key points in the hand frame color image are recognized by a hand key point three-dimensional coordinate recognition network model obtained through pre-training. The three-dimensional coordinates of the hand key points can thus be recognized from the color image alone, without a depth image, which improves the user experience and suits a wide range of application scenarios.
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is brief, and the relevant details can be found in the description of the method.
It is further noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for identifying three-dimensional coordinates of hand key points, comprising:
acquiring a target hand frame color image, the target hand frame color image being a color image obtained after hand detection; and
inputting the target hand frame color image into a hand key point three-dimensional coordinate recognition network model for processing to obtain the three-dimensional coordinates of the target hand key points, the three-dimensional coordinates of the target hand key points being position coordinates relative to the center of the target hand frame color image region.

2. The method according to claim 1, further comprising:
acquiring training data for the hand key point three-dimensional coordinate recognition network model; and
training a preset neural network model with the training data, the hand key point three-dimensional coordinate recognition network model being obtained when the accuracy of the output of the preset neural network model is greater than a threshold.

3. The method according to claim 2, wherein acquiring the training data for the hand key point three-dimensional coordinate recognition network model comprises:
with the direction of the camera and the distance between the camera and the prosthetic hand model in the CG model set, capturing a color image of the prosthetic hand model with the camera;
obtaining the three-dimensional coordinates of the hand key points from the prosthetic hand model;
fusing the color image of the prosthetic hand model with a real scene image to obtain a color image having the prosthetic hand model as foreground and a real background; and
cropping, according to the intrinsic parameters of the camera, the hand region from the color image having the prosthetic hand model as foreground and a real background to obtain a hand frame color image, and normalizing the hand frame color image and the three-dimensional coordinates of the hand key points, to obtain training data for the hand key point three-dimensional coordinate recognition network model comprising the normalized hand frame color image and the normalized three-dimensional coordinates of the hand key points.

4. The method according to claim 2, wherein acquiring the training data for the hand key point three-dimensional coordinate recognition network model comprises:
acquiring a frame-synchronized and registered depth image and color image captured by a depth camera;
identifying the three-dimensional coordinates of the hand key points in the depth image by using a hand key point coordinate recognition model based on depth data;
cropping the hand region from the color image to obtain a hand frame color image corresponding to the hand region of the depth image; and
normalizing the three-dimensional coordinates of the hand key points according to the depth value at the center of the hand frame color image, to obtain training data for the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.

5. The method according to claim 1, further comprising:
acquiring the depth value at the center of the target hand frame color image region; and
calculating the real three-dimensional coordinates of the target hand key points according to the depth value at the center of the target hand frame color image region.

6. A device for identifying three-dimensional coordinates of hand key points, comprising:
a color image acquisition unit, configured to acquire a target hand frame color image, the target hand frame color image being a color image obtained after hand detection; and
a three-dimensional coordinate recognition unit, configured to input the target hand frame color image into a hand key point three-dimensional coordinate recognition network model for processing to obtain the three-dimensional coordinates of the target hand key points, the three-dimensional coordinates of the target hand key points being position coordinates relative to the center of the target hand frame color image region.

7. The device according to claim 6, further comprising:
a training data acquisition unit, configured to acquire training data for the hand key point three-dimensional coordinate recognition network model; and
a recognition model training unit, configured to train a preset neural network model with the training data, the hand key point three-dimensional coordinate recognition network model being obtained when the accuracy of the output of the preset neural network model is greater than a threshold.

8. The device according to claim 7, wherein the training data acquisition unit is specifically configured to:
with the direction of the camera and the distance between the camera and the prosthetic hand model in the CG model set, capture a color image of the prosthetic hand model with the camera;
obtain the three-dimensional coordinates of the hand key points from the prosthetic hand model;
fuse the color image of the prosthetic hand model with a real scene image to obtain a color image having the prosthetic hand model as foreground and a real background; and
crop, according to the intrinsic parameters of the camera, the hand region from the color image having the prosthetic hand model as foreground and a real background to obtain a hand frame color image, and normalize the hand frame color image and the three-dimensional coordinates of the hand key points, to obtain training data for the hand key point three-dimensional coordinate recognition network model comprising the normalized hand frame color image and the normalized three-dimensional coordinates of the hand key points.

9. The device according to claim 7, wherein the training data acquisition unit is specifically configured to:
acquire a frame-synchronized and registered depth image and color image captured by a depth camera;
identify the three-dimensional coordinates of the hand key points in the depth image by using a hand key point coordinate recognition model based on depth data;
crop the hand region from the color image to obtain a hand frame color image corresponding to the hand region of the depth image; and
normalize the three-dimensional coordinates of the hand key points according to the depth value at the center of the hand frame color image, to obtain training data for the hand key point three-dimensional coordinate recognition network model comprising the hand frame color image and the normalized three-dimensional coordinates of the hand key points.

10. The device according to claim 6, further comprising:
a three-dimensional coordinate conversion unit, configured to acquire the depth value at the center of the target hand frame color image region, and to calculate the real three-dimensional coordinates of the target hand key points according to the depth value at the center of the target hand frame color image region.
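The fusion step recited in claims 3 and 8 (combining the rendered hand image with a real scene image) is not tied to a particular blending method in the text. One common choice is alpha compositing, sketched below on plain nested lists. The per-pixel alpha mask, assumed here to come from the renderer, and all names in the sketch are hypothetical and not taken from the source.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def composite(foreground: Image, alpha: List[List[float]], background: Image) -> Image:
    """Blend a rendered hand image over a real background image.

    alpha[y][x] is 1.0 where the pixel belongs to the hand and 0.0 where it
    belongs to the background; fractional values soften the silhouette edge.
    All three inputs are assumed to have the same height and width.
    """
    if not (len(foreground) == len(alpha) == len(background)):
        raise ValueError("foreground, alpha and background must have the same height")
    fused: Image = []
    for fg_row, a_row, bg_row in zip(foreground, alpha, background):
        if not (len(fg_row) == len(a_row) == len(bg_row)):
            raise ValueError("foreground, alpha and background must have the same width")
        row: List[Pixel] = []
        for fg, a, bg in zip(fg_row, a_row, bg_row):
            # Standard "over" blend, applied per color channel.
            r, g, b = (round(a * f + (1.0 - a) * k) for f, k in zip(fg, bg))
            row.append((r, g, b))
        fused.append(row)
    return fused


if __name__ == "__main__":
    hand = [[(200, 100, 0), (200, 100, 0)]]
    mask = [[1.0, 0.0]]
    scene = [[(0, 0, 0), (30, 60, 90)]]
    print(composite(hand, mask, scene))
```

Because the key point coordinates come from the hand model itself, compositing changes only the image and leaves the labels untouched.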
CN201911112541.2A 2019-11-14 2019-11-14 Method and device for identifying three-dimensional coordinates of hand key points Expired - Fee Related CN111222401B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911112541.2A CN111222401B (en) 2019-11-14 2019-11-14 Method and device for identifying three-dimensional coordinates of hand key points

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911112541.2A CN111222401B (en) 2019-11-14 2019-11-14 Method and device for identifying three-dimensional coordinates of hand key points

Publications (2)

Publication Number Publication Date
CN111222401A true CN111222401A (en) 2020-06-02
CN111222401B CN111222401B (en) 2023-08-22

Family

ID=70829003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911112541.2A Expired - Fee Related CN111222401B (en) 2019-11-14 2019-11-14 Method and device for identifying three-dimensional coordinates of hand key points

Country Status (1)

Country Link
CN (1) CN111222401B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201208088D0 (en) * 2012-05-09 2012-06-20 Ncam Sollutions Ltd Ncam
CN108256504A (en) * 2018-02-11 2018-07-06 苏州笛卡测试技术有限公司 A kind of Three-Dimensional Dynamic gesture identification method based on deep learning
CN110163048A (en) * 2018-07-10 2019-08-23 腾讯科技(深圳)有限公司 Identification model training method, recognition methods and the equipment of hand key point
CN109308459A (en) * 2018-09-05 2019-02-05 南京大学 Gesture Estimation Method Based on Finger Attention Model and Keypoint Topology Model
CN110443205A (en) * 2019-08-07 2019-11-12 北京华捷艾米科技有限公司 A kind of hand images dividing method and device
CN110427917A (en) * 2019-08-14 2019-11-08 北京百度网讯科技有限公司 Method and device for detecting key points

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
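冯超等 (Feng Chao et al.): "一种多特征相结合的三维人脸关键点检测方法" (A three-dimensional facial key point detection method combining multiple features), 《液晶与显示》 (Chinese Journal of Liquid Crystals and Displays) *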

Also Published As

Publication number Publication date
CN111222401B (en) 2023-08-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20230822