CN119810268A - Posture transition animation generation method, device, electronic device and storage medium - Google Patents
- Publication number
- CN119810268A (application number CN202411854004.6A)
- Authority
- CN
- China
- Prior art keywords
- gesture
- key point
- state information
- key points
- common
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Processing Or Creating Images (AREA)
Abstract
The application provides a gesture transition animation generation method, a gesture transition animation generation device, electronic equipment, and a storage medium. The gesture transition animation generation method comprises the following steps: acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of the plurality of preset gesture key points in a second gesture; respectively determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information; generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points; respectively generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information; and generating a gesture transition animation according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture. The scheme is suitable for nonlinear motion, generates animation efficiently, and improves the smoothness and naturalness of the animation.
Description
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and apparatus for generating a gesture transition animation, an electronic device, and a storage medium.
Background
With the continuous development of computer graphics, artificial intelligence, and human-computer interaction technologies, gesture interpolation has become key to achieving smooth transitions and natural motion in fields such as animation production, video editing, and virtual reality.
Currently, gesture interpolation mainly relies on either simple linear interpolation or complex machine learning models to generate intermediate transition frames, which are then manually adjusted and edited to meet the expected requirements.
However, the simple linear interpolation method cannot handle complex nonlinear motion, while complex machine learning models can generate high-quality interpolation results but require large amounts of training data and computing resources, making them relatively complex and inefficient.
Disclosure of Invention
In view of the above, the embodiment of the application provides a gesture transition animation generation method, a gesture transition animation generation device, electronic equipment and a storage medium, which are suitable for nonlinear motion, have high animation generation efficiency and improve animation fluency and naturalness.
In a first aspect, an embodiment of the present application provides a method for generating a gesture transition animation, including:
Acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
According to the first state information and the second state information, common gesture key points and unique gesture key points of the first gesture and the second gesture are respectively determined;
Generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
Generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information;
and generating a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture.
In a second aspect, an embodiment of the present application further provides a gesture transition animation generating device, including:
the acquisition module is used for acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
the determining module is used for respectively determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information;
the generation module is used for generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
The generating module is further configured to generate a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information, respectively;
The generating module is further configured to generate a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture.
In a third aspect, an embodiment of the present application further provides an electronic device, including a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor and the memory communicate over the bus, and the processor executes the machine-readable instructions to perform the method of any one of the first aspects.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method of any of the first aspects.
The application provides a gesture transition animation generation method, a device, electronic equipment, and a storage medium. The method comprises: obtaining first state information of a plurality of preset gesture key points in a first gesture and second state information of the plurality of preset gesture key points in a second gesture; respectively determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information; generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points; respectively generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information; and generating a gesture transition animation according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture. The scheme is suitable for nonlinear motion, generates animation efficiently, and improves the smoothness and naturalness of the animation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating a method for generating a gesture transition animation according to an embodiment of the present application;
FIG. 2 is a second flow chart of a gesture transition animation generation method according to an embodiment of the present application;
FIG. 3 is a schematic diagram I of a gesture transition animation according to an embodiment of the present application;
FIG. 4 is a schematic diagram II of a gesture transition animation according to an embodiment of the present application;
FIG. 5 is a third flowchart of a gesture transition animation generation method according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a gesture transition animation generating device according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present application.
With the continuous development of computer graphics, artificial intelligence, and human-computer interaction technologies, gesture interpolation technology plays an important role in a growing number of fields. It is key to achieving smooth transitions and natural motion, and drives technical innovation and efficiency improvements across industries, as embodied in the following scenarios:
1. Animation industry
In conventional animation, an animator needs to manually draw each frame of pictures, which is a time-consuming and labor-intensive process. With the development of computer animation technology, key frame animation is the mainstream, and an animator only needs to create key frames, while intermediate transition frames are automatically generated by a computer.
2. Movie special effect industry
In movie special effects production, motion capture technology is widely used to create realistic character animations. However, motion capture data often requires post-processing and optimization. The present gesture interpolation technique can be used to smooth and interpolate motion capture data, helping special effects practitioners create smoother, more natural character animations.
3. Game development industry
In game development, the fluency of character animation directly affects the game experience. The present technique may be applied to action transitions of game characters, such as smooth transitions between states from standing to walking, walking to running, etc. This not only improves the naturalness of the action of the game character, but also reduces the storage requirements of animation resources.
4. Virtual Reality (VR) and Augmented Reality (AR)
In VR and AR applications, user interactions and the actions of virtual characters need to be generated and rendered in real time. Due to its high computational efficiency, the present technique is well suited to generating smooth action transitions in real time, improving the immersion and realism of the VR/AR experience.
5. Human-computer interaction and computer vision
In the field of human-computer interaction, accurate recognition and prediction of human body gestures are a key problem. The gesture interpolation technology can be used for optimizing the result of human gesture recognition, filling in the lack of gesture data caused by shielding or recognition errors, and improving the stability and accuracy of the human-computer interaction system.
6. Sports analysis and sports science and technology
In the fields of physical training and sports medicine, accurate motion trajectory analysis is of paramount importance. The present techniques may be used to process and optimize data obtained from motion capture devices, helping coaches and doctors analyze an athlete's movements more accurately, optimize training protocols, or provide rehabilitation guidance.
7. Robotics technology
In robot motion control, smooth trajectory planning is critical to achieving smooth and accurate robot motion. The technology can be applied to the generation of the motion trail of the robot, and helps the robot to realize more natural and efficient actions.
8. Digital twin technology
Digital twin technology is becoming increasingly important in industry and in the field of intelligent manufacturing. The gesture interpolation technique can be used to optimize human body or machine motion simulation in the digital twin model, improving simulation accuracy and realism.
9. Education and training
In the fields of distance education and vocational training, particularly courses requiring action demonstration (such as dance, sports, and crafts), the gesture interpolation technique can be used to generate detailed action-decomposition teaching videos to help learners better understand and imitate complex actions.
10. Medical rehabilitation
In physical therapy and rehabilitation training, accurate motion guidance and assessment are important. The gesture interpolation technique can be used to generate standard action demonstrations or to analyze a patient's movement, helping doctors formulate more accurate rehabilitation plans.
In the above application fields, existing gesture interpolation techniques mainly rely on either simple linear interpolation or complex machine learning models. The simple linear interpolation method is computationally fast but cannot handle complex nonlinear motion, while complex machine learning models can generate high-quality interpolation results but require large amounts of training data and computing resources, making them complex and inefficient.
It should be noted that, when generating intermediate transition frames between two motion gestures, if the gesture key points of the two gestures are consistent, the motion is linear motion; if the gesture key points of the two gestures are inconsistent, that is, gesture key points disappear or appear between the two gestures, the motion is nonlinear motion.
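This distinction can be sketched in a few lines. The data layout below (each gesture as a mapping from key-point name to an (x, y, confidence) tuple) and the 0.5 threshold are illustrative assumptions, not the patent's own implementation:

```python
def detectable(state, threshold=0.5):
    """Key points whose detection confidence exceeds the threshold."""
    return {name for name, (x, y, conf) in state.items() if conf > threshold}

def is_nonlinear(first_state, second_state, threshold=0.5):
    """Per the definition above, the motion is nonlinear when key points
    disappear or appear between the two gestures, i.e. when the sets of
    detectable key points differ."""
    return detectable(first_state, threshold) != detectable(second_state, threshold)

# Hands behind the back in gesture 1, a hand raised in gesture 2:
pose1 = {"neck": (0.5, 0.2, 1.0), "wrist": (0.0, 0.0, 0.0)}
pose2 = {"neck": (0.5, 0.2, 1.0), "wrist": (0.6, 0.1, 0.9)}
assert is_nonlinear(pose1, pose2)      # wrist key point appears
assert not is_nonlinear(pose1, pose1)  # identical key-point sets: linear
```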
On this basis, the application provides a gesture transition animation generation method based on linear interpolation. The key points of intermediate transition frames are generated rapidly through an optimized linear interpolation algorithm, balancing computational efficiency and animation quality, and different interpolation strategies are adopted for nonlinear motion (key points disappearing or appearing), improving interpolation accuracy. The method is therefore suitable for nonlinear motion, generates animation efficiently, and improves the smoothness and naturalness of the animation.
Fig. 1 is a schematic flow chart of a gesture transition animation generation method according to an embodiment of the present application, where an execution subject of the embodiment may be an electronic device, such as a terminal, a tablet computer, a desktop computer, or the like.
As shown in fig. 1, the method may include:
S101, acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture.
The first gesture and the second gesture may be gestures of the same type of object, where the object type may be a person or an animal; for example, the first gesture may be a gesture in which a person stands straight, and the second gesture may be a gesture in which the person takes a step.
In an alternative embodiment, the first gesture and the second gesture are gestures of the same class of objects, and may be gestures of the same object or gestures of different objects of that class. For example, if the object type is a person, the first gesture and the second gesture may be gesture 1 and gesture 2 of person 1, respectively; or the first gesture may be gesture 1 of person 1 and the second gesture may be gesture 2 of person 2.
The plurality of preset gesture key points are the gesture key points that can be detected on the object when it is not occluded. For example, the gesture key points may include head key points, trunk key points, and limb key points, where the head key points include the nose key point, neck key point, and the like; the trunk key points include the shoulder key points, hip key points, and the like; and the limb key points include the elbow, wrist, knee, and ankle key points, and the like. The number of preset gesture key points of a human body may be 18.
The first state information is used to indicate the state of each preset gesture key point of the object in the first gesture, which is either a detectable state or an undetectable state. For example, if a person places his or her hands behind the back in the first gesture, the wrist key points and elbow key points of the person cannot be detected in the first gesture, while the nose key point, neck key point, shoulder key points, knee key points, ankle key points, and the like can be detected; that is, the wrist and elbow key points are in the undetectable state, and the nose, neck, shoulder, knee, and ankle key points are in the detectable state.
Similarly, the second state information is used for indicating states of key points of each preset gesture of the object in the second gesture, including a detectable state and an undetectable state.
Preset gesture key point detection is performed on the first gesture to obtain the first state information of the plurality of preset gesture key points, and on the second gesture to obtain the second state information of the plurality of preset gesture key points.
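As a concrete illustration of the state information in S101, each gesture can be represented as the full preset key-point list with a confidence per key point. The 18-key-point list (a common COCO-style naming) and the 'get_state' helper below are assumptions for this sketch, not the patent's own definitions:

```python
# 18 preset gesture key points of a human body, as mentioned above
# (COCO-style naming is an assumption for this sketch).
KEYPOINT_NAMES = [
    "nose", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
    "r_eye", "l_eye", "r_ear", "l_ear",
]

def get_state(detections):
    """Map raw detections {name: (x, y, confidence)} onto the full preset
    key-point list; undetected key points get confidence 0.0
    (the undetectable state)."""
    return {name: detections.get(name, (0.0, 0.0, 0.0))
            for name in KEYPOINT_NAMES}

# Hands behind the back: wrist and elbow key points are not detected.
first_state = get_state({"nose": (0.50, 0.10, 1.0),
                         "neck": (0.50, 0.20, 1.0),
                         "r_shoulder": (0.40, 0.22, 0.9)})
assert len(first_state) == 18
assert first_state["r_wrist"][2] == 0.0  # undetectable in the first gesture
```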
S102, respectively determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information.
The common gesture key point is an intersection between a gesture key point which can be detected by an object in a first gesture and a gesture key point which can be detected by an object in a second gesture.
For example, if person 1 places his or her hands behind the back in the first gesture, the gesture key points that can be detected for person 1 in the first gesture include the nose, neck, shoulder, knee, and ankle key points; if person 2 raises both hands in the second gesture, the gesture key points that can be detected for person 2 in the second gesture include the nose, neck, shoulder, elbow, wrist, knee, and ankle key points. The common gesture key points are then determined to be the nose, neck, shoulder, knee, and ankle key points.
A unique gesture key point may be a gesture key point unique to the object in the first gesture, that is, a gesture key point that can be detected in the first gesture but cannot be detected in the second gesture.
For example, if person 1 raises one hand in the first gesture, the gesture key points that can be detected for person 1 in the first gesture include the nose, neck, shoulder, elbow, wrist, knee, and ankle key points; if person 2 lowers the head and turns around in the second gesture, the gesture key points that can be detected for person 2 in the second gesture include the shoulder, elbow, wrist, knee, and ankle key points. The unique gesture key points are then determined to be the nose key point and the neck key point.
Alternatively, a unique gesture key point may be a gesture key point unique to the object in the second gesture, that is, a gesture key point that can be detected in the second gesture but cannot be detected in the first gesture.
For example, if person 2 raises both hands in the second gesture, the gesture key points that can be detected for person 2 in the second gesture include the nose, neck, shoulder, elbow, wrist, knee, and ankle key points; if person 1 places his or her hands behind the back in the first gesture, the gesture key points that can be detected for person 1 in the first gesture include the nose, neck, shoulder, knee, and ankle key points. The unique gesture key points are then determined to be the elbow key points and the wrist key points.
According to the first state information and the second state information, the gesture key points that can be detected in both the first gesture and the second gesture are determined to be common gesture key points.
If a gesture key point can be detected in the first gesture but cannot be detected in the second gesture, it is determined to be a unique gesture key point of the object in the first gesture.
If a gesture key point can be detected in the second gesture but cannot be detected in the first gesture, it is determined to be a unique gesture key point of the object in the second gesture.
The unique gesture key points of the object in the first gesture are key points that disappear in the transition to the second gesture, and the unique gesture key points of the object in the second gesture are key points that appear in the transition from the first gesture.
That is, the motion from the first gesture to the second gesture is a nonlinear motion, which results in the gesture key points of the first gesture and the second gesture being inconsistent.
The non-linear motion may be embodied, for example, in the following scenario:
In a character turning scene, when a person faces the camera, both hands can be seen; during the turn, the hand at the back is occluded by the body and its key points disappear, and as the person continues to turn, the previously hidden hand reappears.
In an object occlusion scene, when a person walks behind a desk, the lower body is occluded and the leg key points disappear; after the person walks out, these key points reappear.
In an entering-and-exiting-frame scene, when a person walks in from the edge of the frame, the gesture key points gradually appear, and when the person walks out of the frame, the gesture key points gradually disappear.
S103, generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points.
Linear interpolation is performed on the common gesture key points to obtain at least one interpolated position for each common gesture key point, and linear interpolation is performed on the unique gesture key points to obtain at least one interpolated position for each unique gesture key point. At least one frame of interpolation skeleton image between the first gesture and the second gesture is then generated from these interpolated positions, where each frame of interpolation skeleton image is an image generated by connecting the common gesture key points and the unique gesture key points at their respective interpolated positions.
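The interpolation step above can be sketched as follows. Common key points are linearly interpolated between their two positions; for the unique key points, the patent only states that a different strategy is used, so the fade-out/fade-in handling below is one plausible assumption, not the patent's actual strategy:

```python
def interpolate_frames(first, second, common, unique_first, unique_second,
                       num_frames=3):
    """Return num_frames key-point dicts {name: (x, y, conf)} between
    the first and second gestures."""
    frames = []
    for i in range(1, num_frames + 1):
        t = i / (num_frames + 1)  # interpolation ratio in (0, 1)
        frame = {}
        for name in common:       # linear interpolation of positions
            x0, y0, c0 = first[name]
            x1, y1, c1 = second[name]
            frame[name] = ((1 - t) * x0 + t * x1,
                           (1 - t) * y0 + t * y1,
                           min(c0, c1))
        for name in unique_first:   # disappearing key point: fade out
            x0, y0, c0 = first[name]
            frame[name] = (x0, y0, c0 * (1 - t))
        for name in unique_second:  # appearing key point: fade in
            x1, y1, c1 = second[name]
            frame[name] = (x1, y1, c1 * t)
        frames.append(frame)
    return frames

first = {"neck": (0.0, 0.0, 1.0)}
second = {"neck": (1.0, 1.0, 1.0), "wrist": (0.5, 0.5, 1.0)}
mid = interpolate_frames(first, second, ["neck"], [], ["wrist"], num_frames=1)[0]
assert mid["neck"][:2] == (0.5, 0.5)  # common key point halfway between
assert mid["wrist"][2] == 0.5         # appearing key point fading in
```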
S104, respectively generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information.
According to the first state information, the gesture key points that can be detected in the first gesture are determined from the plurality of preset gesture key points; according to the second state information, the gesture key points that can be detected in the second gesture are determined from the plurality of preset gesture key points. The detectable gesture key points of the first gesture are connected to generate the starting skeleton image of the first gesture, and the detectable gesture key points of the second gesture are connected to generate the ending skeleton image of the second gesture.
The starting skeleton image is thus an image obtained by connecting the gesture key points detectable in the first gesture, and the ending skeleton image is an image obtained by connecting the gesture key points detectable in the second gesture.
It should be noted that, based on the relative positional relationship of the gesture key points that can be detected in the first gesture, skeleton drawing may be performed on those key points to generate the starting skeleton image.
Similarly, based on the relative positional relationship of the gesture key points that can be detected in the second gesture, skeleton drawing is performed on those key points to generate the ending skeleton image.
The relative positional relationship of the gesture key points that can be detected in the first gesture indicates their relative positions on the object. For example, if the detectable gesture key points are the nose, neck, and shoulder key points, then based on their relative positions on the human body, the nose key point is connected to the neck key point and the neck key point is connected to the shoulder key points to draw the skeleton and generate the starting skeleton image.
Similarly, the relative positional relationship of the gesture key points that can be detected in the second gesture indicates their relative positions on the object.
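The connection rule described above can be sketched as follows: given a bone list (pairs of key-point names that are adjacent on the body), only bones whose two endpoints are both detectable are drawn. The bone list here is a small illustrative subset, an assumption rather than the patent's skeleton topology:

```python
# Illustrative subset of bones (adjacent key-point pairs on the body).
SKELETON_BONES = [("nose", "neck"), ("neck", "r_shoulder"),
                  ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist")]

def drawable_bones(state, bones=SKELETON_BONES, threshold=0.5):
    """Return the bones whose two endpoint key points are both detectable
    in this gesture; these are the line segments to rasterize when
    drawing the skeleton image."""
    return [(a, b) for a, b in bones
            if state[a][2] > threshold and state[b][2] > threshold]

# Hands behind the back: only nose, neck, and shoulder are detectable,
# so the arm bones are not drawn.
state = {"nose": (0.5, 0.1, 1.0), "neck": (0.5, 0.2, 1.0),
         "r_shoulder": (0.4, 0.22, 0.9),
         "r_elbow": (0.0, 0.0, 0.0), "r_wrist": (0.0, 0.0, 0.0)}
assert drawable_bones(state) == [("nose", "neck"), ("neck", "r_shoulder")]
```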
S105, generating a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture.
The starting skeleton image, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture are combined into an animation to generate the gesture transition animation from the first gesture to the second gesture; the gesture transition animation thus comprises the starting skeleton image, the at least one frame of interpolation skeleton image, and the ending skeleton image.
It should be noted that, in a specific implementation, the 'run' method of the 'Pose_inter' class may call the 'transform_keypoints' and 'gen_skeleton' functions to generate a series of skeleton images (each interpolation skeleton image, the starting skeleton image, and the ending skeleton image), that is, to generate the gesture transition animation. In addition, the series of skeleton images may be converted into a preset tensor format for subsequent processing or visualization.
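The conversion to a tensor format mentioned above can be sketched with NumPy. The (frames, height, width, channels) layout is an assumption for illustration, since the patent does not specify the preset tensor format:

```python
import numpy as np

def to_animation_tensor(skeleton_images):
    """Stack per-frame skeleton images (each an (H, W, 3) uint8 array:
    starting frame, interpolation frames, ending frame) into one
    (num_frames, H, W, 3) tensor for subsequent processing or
    visualization."""
    return np.stack(skeleton_images, axis=0)

# Five 64x64 RGB skeleton frames -> one (5, 64, 64, 3) tensor.
frames = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(5)]
animation = to_animation_tensor(frames)
assert animation.shape == (5, 64, 64, 3)
assert animation.dtype == np.uint8
```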
In this embodiment, the key points of the intermediate transition frames are generated rapidly through the optimized linear interpolation algorithm, balancing computational efficiency and animation quality and making the method suitable for real-time applications and lightweight devices; different interpolation strategies are adopted for nonlinear motion (key points disappearing or appearing), improving interpolation accuracy.
In an alternative embodiment, the first state information includes a first confidence coefficient and the second state information includes a second confidence coefficient, where the first confidence coefficient indicates the certainty of the corresponding preset gesture key point in the first gesture, and the second confidence coefficient indicates the certainty of the corresponding preset gesture key point in the second gesture.
The values of the first confidence coefficient and the second confidence coefficient may range from 0 to 1. If the first confidence coefficient is 0, the corresponding preset gesture key point cannot be detected in the first gesture (it may be occluded), that is, it is absent from the first gesture; if the second confidence coefficient is 0, the corresponding preset gesture key point cannot be detected in the second gesture, that is, it is absent from the second gesture.
If the first confidence coefficient is 1, the corresponding preset gesture key point can be detected with certainty in the first gesture; if the second confidence coefficient is 1, the corresponding preset gesture key point can be detected with certainty in the second gesture. Intermediate values between 0 and 1 represent different degrees of confidence.
For example, suppose you are looking at a person. When you can see the person's hand completely, the position of the hand is very certain, and the confidence is 1; if the hand is half blocked by a table, the specific position of the hand is less certain, and the confidence may be 0.5; if the hand is completely blocked from view, it is entirely unknown where the hand is, and the confidence is 0.
Fig. 2 is a schematic flow chart of a gesture transition animation generation method according to an embodiment of the present application, in an optional implementation manner, in step S102, common gesture key points and unique gesture key points of a first gesture and a second gesture are respectively determined according to first state information and second state information, which may include:
S201, according to the first confidence coefficient and the second confidence coefficient, the common gesture key points and the unique gesture key points are respectively determined.
If both the first confidence coefficient and the second confidence coefficient of a preset gesture key point exceed a preset confidence threshold, the preset gesture key point is determined to be a common gesture key point of the first gesture and the second gesture. The preset confidence threshold may be, for example, 0.5.
If the first confidence coefficient of a preset gesture key point exceeds the preset confidence threshold but its second confidence coefficient does not, the preset gesture key point is determined to be a gesture key point unique to the object in the first gesture, that is, a unique gesture key point between the first gesture and the second gesture.
If the first confidence coefficient of a preset gesture key point does not exceed the preset confidence threshold but its second confidence coefficient does, the preset gesture key point is determined to be a gesture key point unique to the object in the second gesture, that is, a unique gesture key point between the first gesture and the second gesture.
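The threshold classification above can be sketched in Python as follows; the function name and the state-information layout are illustrative assumptions, not the patent's reference implementation:

```python
# Illustrative sketch of the confidence-threshold classification described
# above; the function name and data layout are assumptions.

CONF_THRESHOLD = 0.5  # example preset confidence threshold

def classify_keypoints(first_state, second_state, threshold=CONF_THRESHOLD):
    """Split preset gesture key points into common and unique sets.

    Each state maps a key point name to an (x, y, confidence) triplet.
    """
    common, unique_first, unique_second = [], [], []
    for name in first_state:
        c1 = first_state[name][2]
        c2 = second_state[name][2]
        if c1 > threshold and c2 > threshold:
            common.append(name)           # detectable in both gestures
        elif c1 > threshold:
            unique_first.append(name)     # detectable only in the first gesture
        elif c2 > threshold:
            unique_second.append(name)    # detectable only in the second gesture
        # otherwise the key point is absent from both gestures
    return common, unique_first, unique_second
```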
In an alternative embodiment, the first state information further comprises a first position of the common gesture keypoint or a first position of the common gesture keypoint and a first position of the unique gesture keypoint, and the second state information further comprises a second position of the common gesture keypoint or a second position of the common gesture keypoint and a second position of the unique gesture keypoint.
If the unique gesture key point is the unique gesture key point of the object with the first gesture under the first gesture, the first state information further comprises a first position of the common gesture key point and a first position of the unique gesture key point, and the second state information comprises a second position of the common gesture key point.
If the unique gesture key point is the unique gesture key point of the object with the second gesture under the second gesture, the first state information further comprises a first position of the common gesture key point, and the second state information comprises a second position of the common gesture key point and a second position of the unique gesture key point.
Step S103, generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points, including:
S202, interpolation is carried out according to the first position of the common gesture key point and the second position of the common gesture key point, and at least one interpolation position of the common gesture key point is obtained.
Interpolation is performed according to the first position of the common gesture key point and the second position of the common gesture key point to obtain at least one interpolation position of the common gesture key point. The number of interpolation positions may be a default number; that is, linear interpolation is performed between the first position and the second position of the common gesture key point to obtain the at least one interpolation position.
In an optional embodiment, the step S202, interpolation is performed according to the first position of the common gesture key point and the second position of the common gesture key point to obtain at least one interpolation position of the common gesture key point, including:
and interpolating according to the first position of the common gesture key point, the second position of the common gesture key point and the preset interpolation frame number to obtain a plurality of interpolation positions of the preset interpolation frame number of the common gesture key point.
The preset interpolation frame number is a custom interpolation frame number; that is, a flexible number of interpolation frames can be set when interpolating. Interpolation is performed according to the first position of the common gesture key point, the second position of the common gesture key point, and the preset interpolation frame number to obtain interpolation positions of the common gesture key point, the number of which equals the preset interpolation frame number.
For example, when the first position and the second position of the common gesture key point are two-dimensional positions (x1, y1) and (x2, y2) in a preset two-dimensional coordinate system and the preset interpolation frame number is n, the two-dimensional position (x, y) of the common gesture key point in the 1st frame (1st interpolation skeleton image) is x = x1 + (x2 − x1) × 1/n, y = y1 + (y2 − y1) × 1/n, and its two-dimensional position in the 2nd frame (2nd interpolation skeleton image) is x = x1 + (x2 − x1) × 2/n, y = y1 + (y2 − y1) × 2/n.
Taking (x1 = 0.2, y1 = 0.3), (x2 = 0.8, y2 = 0.9), n = 3 as an example:
Frame 1: x = 0.2 + (0.8 − 0.2) × 1/3 = 0.4, y = 0.3 + (0.9 − 0.3) × 1/3 = 0.5
Frame 2: x = 0.2 + (0.8 − 0.2) × 2/3 = 0.6, y = 0.3 + (0.9 − 0.3) × 2/3 = 0.7
Frame 3: x = 0.2 + (0.8 − 0.2) × 3/3 = 0.8, y = 0.3 + (0.9 − 0.3) × 3/3 = 0.9.
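The worked example above can be reproduced with a short Python sketch (the function name is illustrative):

```python
def lerp_keypoint(p1, p2, n):
    """Return the n interpolation positions of a common gesture key point.

    Frame k (1 <= k <= n) lies at p1 + (p2 - p1) * k / n, so the n-th
    frame coincides with the key point's position in the second gesture.
    """
    (x1, y1), (x2, y2) = p1, p2
    return [(x1 + (x2 - x1) * k / n, y1 + (y2 - y1) * k / n)
            for k in range(1, n + 1)]

frames = lerp_keypoint((0.2, 0.3), (0.8, 0.9), 3)
# frames is approximately [(0.4, 0.5), (0.6, 0.7), (0.8, 0.9)]
```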
It should be noted that, in a specific implementation, a 'transform_keypoints' function may be used to interpolate the gesture key points, where the inputs of the function are the first state information, the second state information, and the preset interpolation frame number, and the output is the gesture key points (including the common gesture key points and the unique gesture key points) of each intermediate transition frame (interpolation skeleton image).
S203, generating each frame of interpolation skeleton image according to each interpolation position of the common gesture key points and the first position or the second position of the unique gesture key points.
If the unique gesture key point is unique to the object in the first gesture, skeleton drawing is performed according to each interpolation position of the common gesture key points and the first position of the unique gesture key point, so as to connect the common gesture key points and the unique gesture key point and generate each frame of interpolation skeleton image. That is, for each frame of interpolation skeleton image, the first position of the unique gesture key point is directly adopted as its interpolation position in that frame.
If the unique gesture key point is unique to the object in the second gesture, skeleton drawing is performed according to each interpolation position of the common gesture key points and the second position of the unique gesture key point, so as to connect the common gesture key points and the unique gesture key point and generate each frame of interpolation skeleton image. That is, for each frame of interpolation skeleton image, the second position of the unique gesture key point is directly adopted as its interpolation position in that frame.
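Combining the interpolation of common key points with the held positions of unique key points might look like the following simplified sketch (the function name and data layout are assumptions, not the patent's reference code):

```python
def transition_keypoints(first_state, second_state, n, threshold=0.5):
    """Sketch of the key points of each intermediate transition frame.

    Common key points are linearly interpolated between their first and
    second positions; unique key points keep the position from the only
    gesture in which they were detected.
    """
    frames = [dict() for _ in range(n)]
    for name in first_state:
        x1, y1, c1 = first_state[name]
        x2, y2, c2 = second_state[name]
        for k in range(1, n + 1):
            if c1 > threshold and c2 > threshold:   # common key point
                pos = (x1 + (x2 - x1) * k / n, y1 + (y2 - y1) * k / n)
            elif c1 > threshold:                    # unique to the first gesture
                pos = (x1, y1)
            elif c2 > threshold:                    # unique to the second gesture
                pos = (x2, y2)
            else:                                   # absent from both gestures
                continue
            frames[k - 1][name] = pos
    return frames
```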
It is worth noting that skeleton drawing is also performed according to the first positions of the gesture key points detectable for the object in the first gesture and the second positions of the gesture key points detectable for the object in the second gesture, so as to generate the starting skeleton image of the first gesture and the ending skeleton image of the second gesture.
In an optional embodiment, the step S203, generating each frame of interpolation skeleton image according to each interpolation position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point may include:
based on the relative position relation between the common gesture key points and the unique gesture key points, skeleton drawing is carried out according to each interpolation position of the common gesture key points and the first position of the unique gesture key points or the second position of the unique gesture key points, and each interpolation skeleton image is generated.
The relative positional relationship between the common gesture key points and the unique gesture key points indicates their relative positions. For example, suppose the common gesture key points are a nose key point, a neck key point, and a shoulder key point, and the unique gesture key points are an elbow key point and a wrist key point. Based on the relative positions of these key points on the human body, and according to the interpolation positions of the nose, neck, and shoulder key points and the first position or the second position of the elbow and wrist key points, the nose key point is connected to the neck key point, the neck key point to the shoulder key point, the shoulder key point to the elbow key point, and the elbow key point to the wrist key point for skeleton drawing, so as to generate each frame of interpolation skeleton image.
It should be noted that, in a specific implementation, the skeleton images may be generated using a 'gen_skeleton' function, where the inputs of the function are each interpolation position of the common gesture key points and the first position or the second position of the unique gesture key points, and the outputs are the interpolation skeleton images.
In the skeleton drawing process, a blank canvas may be created for each interpolation skeleton image; then, according to each interpolation position of the common gesture key points and the first position or the second position of the unique gesture key points, the common gesture key points and the unique gesture key points are drawn on the blank canvas as small dots using the cv2.circle function, and straight lines connecting the dots are drawn using the cv2.line function to perform the skeleton drawing.
In this embodiment, by customizing the interpolation frame number, different animation requirements are satisfied, and the user can control the smoothness and detail level of the animation by adjusting the interpolation frame number, so that the flexibility is higher, and by considering the situation that the gesture key points may disappear or appear, it is ensured that the gesture key points can be smoothly transited.
For example, in a first gesture image a first person kneels on a single knee, in a second gesture image a second person stands on tiptoe, and the preset interpolation frame number is 8. Fig. 3 is a first schematic diagram of a gesture transition animation provided by an embodiment of the present application; as shown in fig. 3, by connecting the gesture key points, 8 interpolation skeleton images, 1 starting skeleton image (the first image from left to right), and 1 ending skeleton image (the last image from left to right) are generated, and it can be seen that the skeleton images resemble stick figures.
For another example, a first person in a first gesture image stands waving a hand, a second person in a second gesture image stands on tiptoe, and the preset interpolation frame number is 8. Fig. 4 is a second schematic diagram of a gesture transition animation provided by an embodiment of the present application; as shown in fig. 4, by connecting the gesture key points, 8 interpolation skeleton images, 1 starting skeleton image (the first image from left to right), and 1 ending skeleton image (the last image from left to right) are generated, and it can be seen that the skeleton images resemble stick figures.
Fig. 5 is a third schematic flow chart of a gesture transition animation generation method according to an embodiment of the present application. As shown in fig. 5, in an optional implementation manner, step S101, obtaining first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture, may include:
S301, acquiring a first posture image corresponding to the first posture and a second posture image corresponding to the second posture.
S302, key point detection is carried out on the first posture image and the second posture image respectively, and first state information and second state information are obtained.
Acquiring a first posture image of an object in a first posture and a second posture image of an object in a second posture, detecting key points of the first posture image to obtain first state information of a plurality of preset posture key points, and detecting the key points of the second posture image to obtain second state information of the plurality of preset posture key points.
The first gesture and the second gesture are gestures of the same type of object, and the plurality of preset gesture key points are gesture key points of the same type of object; step S302, respectively performing key point detection on the first gesture image and the second gesture image to obtain first state information and second state information, including:
and adopting a gesture key point detection model of the same type of object to respectively detect key points of the first gesture image and the second gesture image to obtain first state information and second state information.
And detecting the key points of the first gesture image by adopting a gesture key point detection model of the same type of object to obtain first state information of a plurality of preset gesture key points, and detecting the key points of the second gesture image to obtain second state information of the plurality of preset gesture key points.
The gesture key point detection model may be, for example, OpenPose or DWPose. The detection result of the gesture key point detection model (i.e., the first state information and the second state information) may take the form of a triplet (x, y, confidence), where x and y respectively represent the horizontal and vertical positions of the gesture key point in the gesture image, each with a value range of 0 to 1, and confidence is the confidence coefficient, representing the certainty of the gesture key point predicted by the model in the corresponding gesture.
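For illustration, detection output in the triplet form described above might look as follows (the key point names and values are made up):

```python
# Hypothetical first-state detection result in (x, y, confidence) form.
first_state = {
    "nose":  (0.50, 0.10, 0.98),  # fully visible -> high confidence
    "wrist": (0.30, 0.60, 0.45),  # partially occluded -> medium confidence
    "ankle": (0.00, 0.00, 0.00),  # fully occluded -> not detected
}

def detectable(state, threshold=0.5):
    """Names of key points whose confidence exceeds the preset threshold."""
    return [name for name, (_, _, c) in state.items() if c > threshold]
```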
In this embodiment, a corresponding gesture key point detection model is adopted based on the object type, so gesture data of different objects can be processed; the method is suitable for different application scenarios and has high adaptability and flexibility.
In an alternative embodiment, the plurality of preset gesture keypoints are gesture keypoints of the same class of objects, and the method may further include:
and performing model rendering on the gesture transition animation according to the object types of the same class of objects to obtain a gesture transition model of the object types.
The object type may be, for example, a person or an animal. According to the object type, model rendering is performed on the gesture transition animation to obtain a gesture transition model of the same class of objects; that is, according to the object type of the same class of objects, model rendering is performed on each skeleton image in the gesture transition animation to generate the gesture transition model of the object type.
It should be noted that, depending on the actual application field, such as animation production, video editing, or virtual reality, after the gesture transition animation of the same class of objects is generated, model rendering can be performed on the gesture transition animation based on actual requirements to obtain a gesture transition model of the same class of objects; for the specific implementation of model rendering, reference is made to the related description of the prior art, which is not repeated herein.
In this embodiment, different gesture key point detection models are adopted for different object types, so as to support different types of gesture marks, and the method is applicable to different application fields.
Fig. 6 is a schematic structural diagram of a gesture transition animation generating device according to an embodiment of the present application, where the device may be integrated in an electronic device.
As shown in fig. 6, the apparatus may include:
an obtaining module 401, configured to obtain first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
a determining module 402, configured to determine, according to the first state information and the second state information, common gesture key points and unique gesture key points of the first gesture and the second gesture, respectively;
a generating module 403, configured to generate at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
The generating module 403 is further configured to generate a start skeleton image of the first gesture and an end skeleton image of the second gesture according to the first state information and the second state information, respectively;
The generating module 403 is further configured to generate a gesture transition animation from the first gesture to the second gesture according to the initial skeleton image of the first gesture, the at least one frame of interpolated skeleton image, and the end skeleton image of the second gesture.
In an alternative embodiment, the first state information comprises a first confidence coefficient and the second state information comprises a second confidence coefficient, where the first confidence coefficient indicates the certainty of the corresponding preset gesture key point in the first gesture, and the second confidence coefficient indicates the certainty of the corresponding preset gesture key point in the second gesture;
The determining module 402 is specifically configured to determine the common gesture key points and the unique gesture key points according to the first confidence coefficient and the second confidence coefficient, respectively.
In an alternative embodiment, the first state information further comprises a first position of the common gesture key point or a first position of the common gesture key point and a first position of the unique gesture key point, and the second state information further comprises a second position of the common gesture key point or a second position of the common gesture key point and a second position of the unique gesture key point;
the generating module 403 is specifically configured to:
Interpolation is carried out according to the first position of the common gesture key point and the second position of the common gesture key point, so that at least one interpolation position of the common gesture key point is obtained;
and generating each frame of interpolation skeleton image according to each interpolation position of the common gesture key points and the first position or the second position of the unique gesture key points.
In an alternative embodiment, the generating module 403 is specifically configured to:
and interpolating according to the first position of the common gesture key point, the second position of the common gesture key point and the preset interpolation frame number to obtain a plurality of interpolation positions of the preset interpolation frame number of the common gesture key point.
In an alternative embodiment, the generating module 403 is specifically configured to:
based on the relative position relation between the common gesture key points and the unique gesture key points, skeleton drawing is carried out according to each interpolation position of the common gesture key points and the first position of the unique gesture key points or the second position of the unique gesture key points, and each interpolation skeleton image is generated.
In an alternative embodiment, the obtaining module 401 is specifically configured to:
Acquiring a first posture image corresponding to the first posture and a second posture image corresponding to the second posture;
And detecting key points of the first posture image and the second posture image respectively to obtain first state information and second state information.
In an alternative embodiment, the first gesture and the second gesture are gestures of the same type of object, the plurality of preset gesture key points are gesture key points of the same type of object, and the obtaining module is specifically configured to:
and adopting a gesture key point detection model of the same type of object to respectively detect key points of the first gesture image and the second gesture image to obtain first state information and second state information.
In an alternative embodiment, the plurality of preset gesture keypoints are gesture keypoints of the same class of objects, and the device further comprises:
And the rendering module 404 is configured to perform model rendering on the gesture transition animation according to the object type of the same class of object, so as to obtain a gesture transition model of the object type.
In this embodiment, the system includes an acquisition module configured to acquire first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture, a determination module configured to determine common gesture key points and unique gesture key points from the first gesture to the second gesture according to the first state information and the second state information, a generation module configured to generate at least one frame of interpolation skeleton image from the first gesture to the second gesture according to the common gesture key points and the unique gesture key points, and further configured to generate a start skeleton image of the first gesture and an end skeleton image of the second gesture according to the first state information and the second state information, and further configured to generate a gesture transition animation from the first gesture to the second gesture according to the start skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the end skeleton image of the second gesture. The scheme is suitable for nonlinear motion, has high animation generation efficiency, and improves the smoothness and naturalness of the animation.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 7, where the device may include a processor 501, a memory 502 and a bus 503, the memory 502 stores machine-readable instructions executable by the processor 501, and when the electronic device is running, the processor 501 communicates with the memory 502 through the bus 503, and the processor 501 executes the machine-readable instructions to perform the following steps:
acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
according to the first state information and the second state information, determining common gesture key points and unique gesture key points of the first gesture and the second gesture, respectively;
generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information;
And generating a gesture transition animation from the first gesture to the second gesture according to the initial skeleton image of the first gesture, the at least one frame of interpolation skeleton image and the end skeleton image of the second gesture.
In an alternative embodiment, the first state information comprises a first confidence coefficient and the second state information comprises a second confidence coefficient, where the first confidence coefficient indicates the certainty of the corresponding preset gesture key point in the first gesture, and the second confidence coefficient indicates the certainty of the corresponding preset gesture key point in the second gesture;
According to the first state information and the second state information, respectively determining a common gesture key point and a unique gesture key point from the first gesture to the second gesture, including:
And respectively determining the common gesture key points and the unique gesture key points according to the first confidence and the second confidence.
In an alternative embodiment, the first state information further includes a first position of the common gesture key point, or a first position of the common gesture key point and a first position of the unique gesture key point, and the second state information further includes a second position of the common gesture key point, or a second position of the common gesture key point and a second position of the unique gesture key point; generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points includes:
Interpolation is carried out according to the first position of the common gesture key point and the second position of the common gesture key point, so that at least one interpolation position of the common gesture key point is obtained;
and generating each frame of interpolation skeleton image according to each interpolation position of the common gesture key points and the first position or the second position of the unique gesture key points.
In an alternative embodiment, interpolation is performed according to the first position of the common gesture key point and the second position of the common gesture key point, so as to obtain at least one interpolation position of the common gesture key point, including:
and interpolating according to the first position of the common gesture key point, the second position of the common gesture key point and the preset interpolation frame number to obtain a plurality of interpolation positions of the preset interpolation frame number of the common gesture key point.
In an alternative embodiment, generating each frame of interpolated skeleton image according to each interpolation position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point includes:
based on the relative position relation between the common gesture key points and the unique gesture key points, skeleton drawing is carried out according to each interpolation position of the common gesture key points and the first position of the unique gesture key points or the second position of the unique gesture key points, and each interpolation skeleton image is generated.
In an alternative embodiment, acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture includes:
Acquiring a first posture image corresponding to the first posture and a second posture image corresponding to the second posture;
And detecting key points of the first posture image and the second posture image respectively to obtain first state information and second state information.
In an alternative embodiment, the first gesture and the second gesture are gestures of the same type of object, and the plurality of preset gesture key points are gesture key points of the same type of object; performing key point detection on the first gesture image and the second gesture image respectively to obtain the first state information and the second state information includes:
and adopting a gesture key point detection model of the same type of object to respectively detect key points of the first gesture image and the second gesture image to obtain first state information and second state information.
In an alternative embodiment, the plurality of preset gesture keypoints are gesture keypoints of the same class of objects, and the method further comprises:
and performing model rendering on the gesture transition animation according to the object types of the same class of objects to obtain a gesture transition model of the object types.
In this embodiment, the processor executes machine-readable instructions to perform obtaining first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture, determining common gesture key points and unique gesture key points of the first gesture and the second gesture respectively according to the first state information and the second state information, generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points, generating a start skeleton image of the first gesture and an end skeleton image of the second gesture respectively according to the first state information and the second state information, and generating a gesture transition animation according to the start skeleton image of the first gesture, the at least one frame of interpolation skeleton image and the end skeleton image of the second gesture. The scheme is suitable for nonlinear motion, has high animation generation efficiency, and improves the smoothness and naturalness of the animation.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, the computer program is executed by a processor, and the processor executes the following steps:
acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
according to the first state information and the second state information, determining common gesture key points and unique gesture key points of the first gesture and the second gesture, respectively;
generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information;
and generating a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolated skeleton image, and the ending skeleton image of the second gesture.
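The steps above can be sketched end to end in Python. The function name, the `pos`/`conf` field names, the 0.5 confidence threshold, and the use of linear interpolation below are all illustrative assumptions; the patent does not fix any of these choices.

```python
def generate_transition_animation(first_states, second_states, num_frames):
    """Illustrative sketch of the claimed pipeline; not the patent's implementation."""
    # Steps 1-2: classify keypoints as common when reliably detected in both
    # poses. The 0.5 confidence threshold is an assumption.
    common = {k for k in first_states if k in second_states
              and first_states[k]["conf"] >= 0.5 and second_states[k]["conf"] >= 0.5}
    # Step 3: interpolate common keypoints over num_frames in-between frames
    # (linear here purely for illustration).
    frames = []
    for i in range(1, num_frames + 1):
        t = i / (num_frames + 1)
        pts = {k: tuple(a + t * (b - a)
                        for a, b in zip(first_states[k]["pos"], second_states[k]["pos"]))
               for k in common}
        frames.append(pts)
    # Step 4: the start and end skeleton frames come directly from the state info.
    start = {k: s["pos"] for k, s in first_states.items()}
    end = {k: s["pos"] for k, s in second_states.items()}
    # Step 5: the transition animation is the ordered frame sequence.
    return [start] + frames + [end]
```

With one in-between frame, a keypoint moving from (0, 0) to (2, 2) passes through (1, 1), giving a three-frame sequence.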
In an alternative embodiment, the first state information comprises a first confidence and the second state information comprises a second confidence, where the first confidence indicates the confidence of the corresponding preset gesture key point in the first gesture and the second confidence indicates the confidence of the corresponding preset gesture key point in the second gesture;
determining the common gesture key points and the unique gesture key points of the first gesture and the second gesture respectively according to the first state information and the second state information includes:
determining the common gesture key points and the unique gesture key points respectively according to the first confidence and the second confidence.
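A minimal sketch of this confidence-based split, assuming a fixed 0.5 threshold; the patent does not specify how the confidences are compared, so the threshold and function name are hypothetical.

```python
# Assumed threshold: a keypoint counts as detected when confidence >= 0.5.
CONF_THRESHOLD = 0.5

def split_keypoints(first_conf: dict, second_conf: dict) -> tuple:
    """Return (common, unique) keypoint names.

    A keypoint is "common" when it is reliably detected in both gestures,
    and "unique" when it is reliably detected in exactly one of them.
    """
    common, unique = set(), set()
    for name in first_conf.keys() & second_conf.keys():
        in_first = first_conf[name] >= CONF_THRESHOLD
        in_second = second_conf[name] >= CONF_THRESHOLD
        if in_first and in_second:
            common.add(name)
        elif in_first or in_second:
            unique.add(name)
    return common, unique
```

For example, a hip visible in both gestures is common, while a hand occluded in the second gesture is unique to the first.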
In an alternative embodiment, the first state information further includes a first position of the common gesture key point, or a first position of the common gesture key point and a first position of the unique gesture key point; the second state information further includes a second position of the common gesture key point, or a second position of the common gesture key point and a second position of the unique gesture key point; and generating at least one frame of interpolated skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points includes:
interpolating according to the first position of the common gesture key point and the second position of the common gesture key point to obtain at least one interpolated position of the common gesture key point;
and generating each frame of interpolated skeleton image according to each interpolated position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point.
In an alternative embodiment, interpolating according to the first position of the common gesture key point and the second position of the common gesture key point to obtain at least one interpolated position of the common gesture key point includes:
interpolating according to the first position of the common gesture key point, the second position of the common gesture key point, and a preset number of interpolation frames, to obtain interpolated positions of the common gesture key point for the preset number of interpolation frames.
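As an illustration, the interpolation over a preset number of frames might look like the sketch below. Linear interpolation is assumed only for simplicity; the patent leaves the interpolation scheme open, and any monotone easing could be substituted.

```python
def interpolate_positions(p_first, p_second, num_frames):
    """Return `num_frames` interpolated (x, y) positions between two
    keypoint positions, excluding the endpoints themselves."""
    positions = []
    for i in range(1, num_frames + 1):
        t = i / (num_frames + 1)  # fraction of the way from the first pose to the second
        x = p_first[0] + t * (p_second[0] - p_first[0])
        y = p_first[1] + t * (p_second[1] - p_first[1])
        positions.append((x, y))
    return positions
```

With 3 preset interpolation frames, a keypoint moving from (0, 0) to (4, 8) yields positions at one quarter, one half, and three quarters of the way.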
In an alternative embodiment, generating each frame of interpolated skeleton image according to each interpolated position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point includes:
drawing a skeleton, based on the relative positional relationship between the common gesture key points and the unique gesture key points, according to each interpolated position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point, to generate each interpolated skeleton image.
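One way to read "based on the relative positional relationship" is that a unique keypoint keeps its original offset from a nearby common (anchor) keypoint and rides along with it through each interpolated frame. The anchor choice, offset model, and function names below are illustrative assumptions, not the patent's stated method.

```python
def place_unique_keypoint(anchor_orig, unique_orig, anchor_interp):
    """Carry a unique keypoint along with its anchor: preserve the original
    offset from the anchor and apply it at the anchor's interpolated position."""
    offset = (unique_orig[0] - anchor_orig[0], unique_orig[1] - anchor_orig[1])
    return (anchor_interp[0] + offset[0], anchor_interp[1] + offset[1])

def draw_skeleton_frame(points, bones):
    """Return the line segments ((x1, y1), (x2, y2)) for one skeleton frame,
    skipping any bone whose endpoints are not both available."""
    return [(points[a], points[b]) for a, b in bones if a in points and b in points]
```

For example, a hand detected only in the first gesture can be anchored to the elbow, so it follows the elbow's interpolated trajectory instead of vanishing mid-transition.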
In an alternative embodiment, acquiring the first state information of the plurality of preset gesture key points in the first gesture and the second state information of the plurality of preset gesture key points in the second gesture includes:
acquiring a first gesture image corresponding to the first gesture and a second gesture image corresponding to the second gesture;
and performing key point detection on the first gesture image and the second gesture image respectively to obtain the first state information and the second state information.
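The state information produced by such key point detection might be shaped as below: a position plus a confidence per preset keypoint. The detector here is a stub with fixed values to illustrate the data shape only; a real implementation would wrap a pose-estimation model, which the patent does not name.

```python
from dataclasses import dataclass

@dataclass
class KeypointState:
    x: float
    y: float
    confidence: float

def detect_keypoints(image):
    """Stub detector: returns per-keypoint state information for one gesture
    image. A real model would infer these values from `image`; the fixed
    numbers below are placeholders."""
    return {
        "hip": KeypointState(120.0, 200.0, 0.97),
        "left_hand": KeypointState(80.0, 150.0, 0.41),
    }
```

Running the detector on both gesture images yields the first and second state information consumed by the common/unique split and the interpolation steps.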
In an alternative embodiment, the first gesture and the second gesture are gestures of the same type of object, the plurality of preset gesture key points are gesture key points of the same type of object, and performing key point detection on the first gesture image and the second gesture image to obtain the first state information and the second state information includes:
using a gesture key point detection model for the same type of object to perform key point detection on the first gesture image and the second gesture image respectively, obtaining the first state information and the second state information.
In an alternative embodiment, the plurality of preset gesture keypoints are gesture keypoints of the same class of objects, and the method further comprises:
and performing model rendering on the gesture transition animation according to the object type of the same class of objects to obtain a gesture transition model of the object type.
In this embodiment, when the computer program is executed by the processor, the processor performs the following: obtaining first state information of a plurality of preset gesture key points in a first gesture and second state information of the plurality of preset gesture key points in a second gesture; determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information; generating at least one frame of interpolated skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points; generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information, respectively; and generating a gesture transition animation according to the starting skeleton image of the first gesture, the at least one frame of interpolated skeleton image, and the ending skeleton image of the second gesture. This scheme is suitable for nonlinear motion, generates animations efficiently, and improves their smoothness and naturalness.
In an embodiment of the present application, the computer program, when executed by a processor, may also perform the methods described in the other embodiments; for the specific implementation of the method steps and their principles, reference is made to the description of those embodiments, which is not repeated here.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division of the units is merely a logical functional division, and there may be other divisions in actual implementation; likewise, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments provided in the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a standalone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disc, or other media capable of storing program code.
It should be noted that like reference numerals and letters denote like items in the figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Furthermore, the terms "first," "second," "third," etc. are used merely to distinguish descriptions and are not to be construed as indicating or implying relative importance.
It should be noted that the foregoing embodiments are merely illustrative and not restrictive, and the scope of the application is not limited to them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications, variations, or substitutions of some of the technical features of the embodiments may still be made within the technical scope disclosed herein without departing from the spirit and scope of the technical solutions of the embodiments, and such modifications are intended to be encompassed within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (11)
1. A method for generating a gesture transition animation, comprising:
Acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
According to the first state information and the second state information, common gesture key points and unique gesture key points of the first gesture and the second gesture are respectively determined;
Generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
Generating a starting skeleton image of the first gesture and an ending skeleton image of the second gesture according to the first state information and the second state information;
and generating a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image and the ending skeleton image of the second gesture.
2. The method of claim 1, wherein the first state information comprises a first confidence level, and the second state information comprises a second confidence level, wherein the first confidence level is used for indicating the confidence level of the corresponding preset gesture key point in the first gesture, and the second confidence level is used for indicating the confidence level of the corresponding preset gesture key point in the second gesture;
The determining, according to the first state information and the second state information, a common pose key point and a unique pose key point of the first pose and the second pose, respectively, includes:
and respectively determining the common gesture key points and the unique gesture key points according to the first confidence and the second confidence.
3. The method of claim 1 or 2, wherein the first state information further comprises a first location of the common pose key point or a first location of the common pose key point and a first location of the unique pose key point, the second state information further comprises a second location of the common pose key point or a second location of the common pose key point and a second location of the unique pose key point, and the generating at least one frame of interpolated skeleton image between the first pose and the second pose based on the common pose key point and the unique pose key point comprises:
Interpolation is carried out according to the first position of the common gesture key point and the second position of the common gesture key point, so that at least one interpolation position of the common gesture key point is obtained;
And generating interpolation skeleton images of each frame according to each interpolation position of the common gesture key point and the first position of the unique gesture key point or the second position of the unique gesture key point.
4. A method according to claim 3, wherein said interpolating from the first position of the common pose keypoint and the second position of the common pose keypoint results in at least one interpolated position of the common pose keypoint, comprising:
And interpolating according to the first position of the common gesture key point, the second position of the common gesture key point and a preset interpolation frame number to obtain a plurality of interpolation positions of the preset interpolation frame number of the common gesture key point.
5. The method of claim 4, wherein generating each frame of interpolated skeleton image from each interpolated position of the common pose keypoint and the first position of the unique pose keypoint or the second position of the unique pose keypoint comprises:
And based on the relative position relation between the common gesture key points and the unique gesture key points, performing skeleton drawing according to each interpolation position of the common gesture key points and the first position of the unique gesture key points or the second position of the unique gesture key points, and generating each interpolation skeleton image.
6. The method of claim 1, wherein the obtaining the first state information of the plurality of preset gesture keypoints in the first gesture and the second state information of the plurality of preset gesture keypoints in the second gesture comprises:
acquiring a first posture image corresponding to the first posture and a second posture image corresponding to the second posture;
And detecting key points of the first posture image and the second posture image respectively to obtain the first state information and the second state information.
7. The method of claim 6, wherein the first gesture and the second gesture are gestures of the same type of object, the plurality of preset gesture keypoints are gesture keypoints of the same type of object, and the performing keypoint detection on the first gesture image and the second gesture image respectively to obtain the first state information and the second state information comprises:
And adopting a gesture key point detection model of the same type of object to detect key points of the first gesture image and the second gesture image respectively to obtain the first state information and the second state information.
8. The method of claim 1, wherein the plurality of preset gesture keypoints are gesture keypoints of the same class of objects, the method further comprising:
and carrying out model rendering on the gesture transition animation according to the object type of the same class of object to obtain a gesture transition model of the object type.
9. A gesture transition animation generation apparatus, comprising:
the acquisition module is used for acquiring first state information of a plurality of preset gesture key points in a first gesture and second state information of a plurality of preset gesture key points in a second gesture;
the determining module is used for respectively determining common gesture key points and unique gesture key points of the first gesture and the second gesture according to the first state information and the second state information;
the generation module is used for generating at least one frame of interpolation skeleton image between the first gesture and the second gesture according to the common gesture key points and the unique gesture key points;
The generating module is further configured to generate a start skeleton image of the first gesture and an end skeleton image of the second gesture according to the first state information and the second state information, respectively;
The generating module is further configured to generate a gesture transition animation from the first gesture to the second gesture according to the starting skeleton image of the first gesture, the at least one frame of interpolation skeleton image, and the ending skeleton image of the second gesture.
10. An electronic device comprising a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor in communication with the memory via the bus when the electronic device is in operation, the processor executing the machine-readable instructions to perform the method of any one of claims 1 to 8.
11. A computer-readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, performs the method of any of claims 1 to 8.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411854004.6A CN119810268A (en) | 2024-12-16 | 2024-12-16 | Posture transition animation generation method, device, electronic device and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411854004.6A CN119810268A (en) | 2024-12-16 | 2024-12-16 | Posture transition animation generation method, device, electronic device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119810268A true CN119810268A (en) | 2025-04-11 |
Family
ID=95261083
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411854004.6A Pending CN119810268A (en) | 2024-12-16 | 2024-12-16 | Posture transition animation generation method, device, electronic device and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119810268A (en) |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11610331B2 (en) | Method and apparatus for generating data for estimating three-dimensional (3D) pose of object included in input image, and prediction model for estimating 3D pose of object | |
| US11992768B2 (en) | Enhanced pose generation based on generative modeling | |
| US20240005582A1 (en) | Animation processing method and apparatus, computer storage medium, and electronic device | |
| CN108665492B (en) | Dance teaching data processing method and system based on virtual human | |
| CN110599573B (en) | Method for realizing real-time human face interactive animation based on monocular camera | |
| US12374014B2 (en) | Predicting facial expressions using character motion states | |
| WO2021169839A1 (en) | Action restoration method and device based on skeleton key points | |
| CN110660017A (en) | Dance music recording and demonstrating method based on three-dimensional gesture recognition | |
| US11721056B2 (en) | Motion model refinement based on contact analysis and optimization | |
| US11170553B1 (en) | Methods and systems for generating an animation control rig | |
| CN119729145A (en) | Digital human video generation method and device, electronic equipment and storage medium | |
| CN114356100B (en) | Body-building action guiding method, body-building action guiding device, electronic equipment and storage medium | |
| Peng et al. | 21‐2: exploring 3D interactive performance animation for VR/AR applications using low‐cost motion capture | |
| CN119810268A (en) | Posture transition animation generation method, device, electronic device and storage medium | |
| CN117274447A (en) | Digital human generation method, device, electronic equipment and storage medium | |
| CN116529766A (en) | Automatic mixing of human facial expressions and whole-body gestures for dynamic digital mannequin creation using integrated photo-video volume capture system and mesh tracking | |
| CN114201098A (en) | A method, device and equipment for generating medical teaching courseware based on 3D modeling | |
| CN115631516A (en) | Face image processing method, device, equipment and computer-readable storage medium | |
| CN117557699B (en) | Animation data generation method, device, computer equipment and storage medium | |
| CN119380394B (en) | A method for imitating facial expression recognition of virtual characters or robots | |
| CN119484953B (en) | Method and device for generating digital human video based on multimodal feature fusion based on temporal position coding | |
| CN111696183B (en) | Projection interaction method and system and electronic equipment | |
| Xie et al. | CoreUI: Interactive Core Training System with 3D Human Shape | |
| Lin et al. | A study of real-time operations by converting human skeleton coordinates to digital avatars | |
| Huang et al. | Interactive demonstration of pointing gestures for virtual trainers |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |