Background
With the development of internet technology, online shopping has become increasingly popular. Compared with shopping in physical stores, it offers a wider range of goods and greater convenience. However, buying goods online raises problems that are not easy to solve, the most important being that the buyer cannot inspect the goods in person. Among all commodity categories, the problem is most pronounced for clothing. In a physical shop a shopper can change clothes and check the effect in real time; online clothing shops can offer at best model photographs, and some offer no fitting pictures at all, so consumers cannot intuitively judge in real time how well a garment matches their own body shape. The result is a significant volume of returned shipments.
To address this, operators have attempted to provide consumers with a simulated fitting effect using virtual fitting techniques. Virtual fitting is also useful in other settings, such as online games, and the technology has therefore developed rapidly.
Virtual fitting is a technique that lets a user view a "changed clothes" effect on a terminal screen in real time without actually putting on the garment. The prior art comprises mainly planar (2D) fitting and three-dimensional virtual fitting. The former collects a picture of the user and a picture of the garment, stretches or compresses the garment to match the body, and then cuts and splices the two into a dressed image; because the image processing is simple and crude and the user's actual body shape is ignored entirely, the result looks unrealistic, and merely moving the clothes onto the user's photograph does not satisfy users. The latter usually collects three-dimensional information about the person with a 3D acquisition device and combines it with garment features, or generates a virtual three-dimensional body mesh according to certain rules from manually entered body measurements, and then combines the mesh with a garment map. Overall, three-dimensional virtual fitting requires large amounts of data acquisition or three-dimensional computation and costly hardware, so it is hard to popularize among ordinary users.
With the development of cloud computing, artificial intelligence, and the processing capability of intelligent terminals, two-dimensional virtual fitting technology has emerged. It mainly comprises three steps: (1) processing personal body information provided by the user to obtain a target human body model; (2) processing clothing information to obtain a clothing model; and (3) fusing the two models to generate a simulated image of the person wearing the garment.
Regarding step (1): because of the accumulation of uncertain factors such as pipeline design, model-parameter selection, and the neural-network training method, the quality of the final dressed image is often inferior to that of traditional three-dimensional virtual fitting. Building the mannequin is the foundational step, and the later dressing process is based on it; once the mannequin is generated inaccurately, problems such as a large gap between the mannequin's body shape and the fitter's, loss of skin texture, or missing body parts easily arise and degrade the final dressed image.
In the field of image processing, three-dimensional reconstruction is the establishment of a mathematical model of a three-dimensional object suitable for computer representation and processing. It is the basis for processing, manipulating, and analysing the properties of three-dimensional objects in a computer environment, a key technology for building virtual realities that express the objective world, and is widely applied in computer animation, virtual reality, industrial inspection, and other fields.
In the general field of computer vision, human body modeling has several starting points, generally falling into three main types: (1) omnidirectional scanning of a real human body with a 3D scanning device; (2) three-dimensional reconstruction based on multi-view photography; and (3) three-dimensional reconstruction combining a given image with a human body model. Multi-view reconstruction requires images of the body with mutually overlapping views, establishes the spatial transformation relationships between them, and stitches a 3D model from photographs taken by several groups of cameras; although the operation is relatively simple, the computational complexity remains high, and in most cases only people present at the scene can acquire multi-angle images. Models obtained by mapping and stitching multi-angle depth-camera photographs also carry no body-scale data and cannot provide a basis for 3D perception. The third approach needs only a single image: based on an intelligent three-dimensional body-curve generation method, a neural network is trained to obtain weights and thresholds that describe curves of the neck, chest, waist, hips, and so on, and the predicted body model is generated directly from cross-section size parameters such as girth, width, and thickness. However, because the input information is so limited, this method still consumes a large amount of computation, and the final model is often unsatisfactory.
Regarding step (2): several methods exist in the prior art for generating three-dimensional garment models. The traditional approach builds the garment model from two-dimensional pattern pieces and a sewing method. It requires clothing expertise to design the templates, which ordinary virtual-fitting users do not possess, and the sewing relationships between templates must be specified manually, which consumes a great deal of time. A newer approach is based on hand drawing, generating a simple garment model from line information drawn by the user; but it requires skilled personnel, has poor replicability and repeatability, and demands much time for detailed drawing, so it is difficult to deploy at e-commerce scale. Both methods lean toward designing new garments rather than three-dimensionally modeling existing garments for sale. A third approach starts from garment picture information and comprehensively applies image processing and graphics simulation to produce the virtual three-dimensional garment model: the garment's contour and size are obtained through contour detection and classification, machine learning finds the key points along the edges, sewing information is generated from the correspondence between key points, and finally a physical sewing simulation in three-dimensional space yields the realistic effect of the garment worn on a human body.
Regarding step (3): the common virtual fitting rooms currently on the market focus mainly on style matching and do not intuitively simulate the physical collision between the virtual character and the garment cloth, so realism is still lacking. More and more manufacturers now represent the user's pose visually with a virtual character, simulate the collision response between garment cloth and the human body in real time, and render in real time, tightening the bond between the virtual and real worlds, bringing more fun to virtual-fitting users, and making clothing purchases more convenient.
In summary, given internet technology and the network environment it operates in, the mode of producing the final dressed image directly from a single photograph of the human body is clearly preferable: it is the most convenient, and the user completes the entire virtual dressing process with just one photo, without visiting any site. The challenge is to ensure that the result is essentially equivalent to a real 3D dressing simulation. Within this, (1) how to obtain, from one photograph, a human body model closest to the person's real state, and (2) how to put the three-dimensional clothing model onto that target model, become the two most important and unavoidable problems of the virtual dressing method.
On the first point: prior-art methods for constructing the human body model generally fall into the following classes. (1) Regression-based reconstruction of a voxel representation through a convolutional neural network: the positions of the main joint points are first estimated from the input picture, and then, in a voxel grid of a given size, each unit voxel is classified as occupied or not according to the key-point positions, so the overall shape of the occupied voxels describes the reconstructed body. (2) Reconstruction of the body from a single picture. (3) Model-based fitting: the skeleton is represented by 23 skeleton nodes, the whole-body posture by the rotation of each node, and the body shape by 6890 vertex positions; given the skeleton-node positions, the shape and posture parameters are fitted simultaneously to reconstruct the three-dimensional body. Alternatively, a CNN predicts key points on the image, an SMPL model is fitted to obtain an initial body model, the fitted shape parameters are used to regress a bounding box for each body joint (represented by an axial length and a radius), and the initial model and the regressed bounding boxes are combined into the three-dimensional reconstruction. These methods suffer from slow modeling, insufficient accuracy, and a reconstruction quality that depends strongly on the body-and-pose database used.
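The voxel-occupancy idea in class (1) above can be sketched as follows: given estimated joint positions, each cell of a fixed-size voxel grid is marked occupied if it lies within a radius of some joint, and the union of occupied voxels approximates the body shape. The grid size, joint coordinates, and radius below are illustrative assumptions, not values from any of the cited methods.

```python
def voxelize_joints(joints, grid=16, radius=0.2):
    """Return a grid x grid x grid occupancy map (True = inside the body).

    joints: list of (x, y, z) in the unit cube; radius: occupancy radius per joint.
    """
    step = 1.0 / grid
    occ = [[[False] * grid for _ in range(grid)] for _ in range(grid)]
    for i in range(grid):
        for j in range(grid):
            for k in range(grid):
                # voxel centre in normalised coordinates
                cx, cy, cz = (i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step
                for (x, y, z) in joints:
                    if (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2 <= radius ** 2:
                        occ[i][j][k] = True
                        break
    return occ

# Two nearby "joints" produce a connected occupied blob around the body centre.
occupancy = voxelize_joints([(0.5, 0.5, 0.5), (0.5, 0.7, 0.5)])
```

A real reconstruction network would predict the per-voxel occupancy probabilities directly; this sketch only shows the representation being regressed.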
The prior art discloses a human modeling method based on body measurement data, as shown in figure 1. The method obtains body measurement data; performs linear regression on a pre-established human model through a pre-trained prediction model according to that data, fitting a predicted human model, where the pre-established model contains several groups of pre-defined landmark feature points with corresponding standard shape bases and the measurement data include a measurement for each group of landmark points; and then obtains a target human model, comprising the measurement data, target shape bases, and target shape coefficients, from the predicted model. However, this method demands a very large amount of measurement data, both lengths and girths, such as height, arm length, shoulder width, leg length, calf length, thigh length, foot length, head circumference, chest circumference, waist circumference, and thigh circumference, which must not only be measured but also processed. The method saves computation, but the user experience is very poor and the procedure tedious. Its mannequin training follows the training scheme of the SMPL model.
The SMPL model is a parameterized human model, a human modeling method proposed by the Max Planck Institute, capable of arbitrary human modeling and animation driving. Its biggest difference from traditional LBS is a method of shaping the body surface according to the human pose, which can simulate the bulging and hollowing of human muscles as the limbs move; surface distortion during motion is thereby avoided, and the stretching and contraction of muscles can be depicted accurately. In this method, beta and theta are the input parameters: beta holds 10 parameters describing body shape, such as how fat or thin the body is and the head-to-body proportion, and theta holds 75 parameters describing the overall motion pose and the relative angles of the 24 joints. However, this model obtains the relationship between body type and shape bases by accumulating a large amount of training data, and because the shape bases are strongly correlated they cannot be controlled independently and are hard to decouple; for example, the arm and the leg are correlated, so moving the arm theoretically moves the leg as well, and improving the body shape for individual features is difficult on the SMPL model.
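The shape side of an SMPL-style parameterization can be illustrated numerically: vertex positions are a template mesh plus a linear combination of shape blend-shape directions weighted by the beta coefficients. The tiny two-vertex "mesh" and the single made-up blend shape below are purely illustrative, not SMPL's actual data.

```python
def apply_shape(template, shape_dirs, beta):
    """template: list of (x, y, z); shape_dirs: [n_beta][n_verts][3]; beta: [n_beta]."""
    out = []
    for v_idx, (x, y, z) in enumerate(template):
        dx = dy = dz = 0.0
        for b_idx, b in enumerate(beta):       # sum of beta-weighted offsets
            sx, sy, sz = shape_dirs[b_idx][v_idx]
            dx += b * sx; dy += b * sy; dz += b * sz
        out.append((x + dx, y + dy, z + dz))
    return out

template = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]          # two template vertices
shape_dirs = [[(0.1, 0.0, 0.0), (0.1, 0.0, 0.0)]]      # one "wider body" direction
shaped = apply_shape(template, shape_dirs, beta=[2.0])  # beta scales the offset
```

In SMPL itself the template has 6890 vertices and beta has 10 components learned from scan data; the linear structure is what makes the shape bases correlated, as noted above.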
The second prior art discloses a 3D human modeling method based on a single photo: obtain a photo and analyze it; mark the key points of the human body in the photo and compute their spatial coordinates; obtain the distances between the skeleton points of a pre-established standard human model and the key points in the photo, and align them to generate a basic human model; obtain the base texture map of the standard model, compute the difference between the base map and the facial skin texture in the photo, and fuse them using an edge channel to generate base texture data; and generate the 3D human model from the basic model and the base texture data. 3D modeling is thus achieved from one photo, and because the model is supported by a skeletal and muscular system, it can produce expressions and actions. In essence, after matching the distances between the key points of the user's photo and those of the standard mannequin, the method adjusts them to reach the target body's pose, and the final model is obtained after difference calculation and fusion of the base map with the skin texture in the photo.
A third prior art describes a three-dimensional human body model generation method: obtain a two-dimensional human body image and input it into a three-dimensional human parameter model to obtain the corresponding three-dimensional parameters. The parameter model is trained by inputting training samples into a neural network: a standard two-dimensional body image is fed to the network to predict three-dimensional human parameters; a three-dimensional flexibly deformable model is adjusted according to the predicted parameters to obtain a predicted three-dimensional body model; and the joint points of the predicted model are back-projected to obtain the predicted joint positions in the standard two-dimensional image. In this modeling scheme only joint-point parameters govern the network's final output, after which the mature SMPL body shape is fine-tuned to match the target pose. Although this reduces computation, the input parameters are few, adjustment is only possible on top of the SMPL prediction model, and it is difficult to output an ideal body model highly consistent with the target person's pose.
On the second point: a prior-art virtual fitting scheme comprises obtaining a dressed reference mannequin and an undressed target mannequin; embedding skeletons of the same hierarchical structure into both and binding each skeleton to its skin; computing the rotation of each bone in the target skeleton and recursively adjusting all bones so that the target skeleton's posture matches the reference skeleton's; deforming the target mannequin's skin with an LBS skinning algorithm according to the bone rotations; and transferring the clothing model from the reference mannequin to the target mannequin on the basis of that skin deformation. Once the target model's posture matches the reference model's, migrating the clothing model from the reference body to the target body becomes easier, converting an inefficient non-rigid registration problem into an efficient rigid one, and thus achieving the migration. This solves the technical problem of automatic dressing across different bodies and postures while the garment's dimensions remain unchanged before and after fitting.
However, this pure skinning approach, which fixes the distance between clothing and skin, is fast but falls far short in garment fit and realism; it suits only occasions where the clothing merely needs to follow the motion of the skin mesh quickly and simply.
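The LBS skinning that the prior-art scheme above relies on can be sketched briefly: each vertex is moved by a weighted blend of its bones' rigid transforms. For brevity the bones here are reduced to 2D rotations about given pivots, and the weights are illustrative assumptions.

```python
import math

def lbs_vertex(v, bones, weights):
    """v: (x, y); bones: list of ((pivot_x, pivot_y), angle_radians); weights sum to 1."""
    x_out = y_out = 0.0
    for ((px, py), angle), w in zip(bones, weights):
        # rigid transform of v by this bone: rotate about the bone pivot
        c, s = math.cos(angle), math.sin(angle)
        dx, dy = v[0] - px, v[1] - py
        tx, ty = px + c * dx - s * dy, py + s * dx + c * dy
        x_out += w * tx   # blend the per-bone results by skinning weight
        y_out += w * ty
    return (x_out, y_out)

# A vertex weighted half to a fixed bone and half to one rotated 90 degrees
# lands halfway between the two rigid results.
moved = lbs_vertex((1.0, 0.0),
                   bones=[((0.0, 0.0), 0.0), ((0.0, 0.0), math.pi / 2)],
                   weights=[0.5, 0.5])
```

The blend of rigid transforms is exactly why pure LBS keeps cloth glued at a fixed offset from the skin: the garment vertex can only go where the weighted bones carry it.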
On the third point: a fourth prior art discloses a virtual fitting method comprising obtaining a dressed reference mannequin and an undressed target mannequin, embedding identically structured skeletons in both, binding skin to each skeleton, computing the bone rotations in the target skeleton, recursively adjusting all bones until the target posture matches the reference, deforming the target mannequin's skin with an LBS skinning algorithm according to the bone rotations, and transferring the clothing model from the reference to the target on that basis. The method improves the concrete implementation of skin driving, but in practice does not depart from the idea of completing the motion through bone-driven skinning, and the actual clothing simulation effect is not substantially improved.
To match the development trend of the internet industry, the field of virtual fitting constantly pursues three basic targets: minimal input information, minimal computation, and optimal effect. The present invention aims at the realism and fidelity of the clothing model after motion; on the premise of preserving the garment simulation effect, it appropriately speeds up simulation at certain parts and stages, seeks a better balance among the three targets, and provides a virtual fitting method with simple input, a computational load within the capacity of terminal equipment, and an effect close to real clothing.
Disclosure of Invention
In view of the above problems, the present invention provides a clothing model driving method, apparatus, and storage medium that overcome them.
The invention provides a clothing model driving method comprising the following steps: obtain a two-dimensional image of the garment and build a three-dimensional garment model from it; construct a three-dimensional standard human body model from a mathematical model and place it in an initial posture; fit the three-dimensional garment model onto the standard model in the initial posture; obtain a two-dimensional image of the target human body and compute three-dimensional target body model parameters through a two-stage neural network model; input the resulting posture and body-shape parameters into the standard model for fitting; and drive the body model from the initial posture to the target posture, obtaining a dressed target body model identical to the target person in posture and body shape.
Preferably, the posture of the target body is determined from the three-dimensional action-posture parameters, and the bones are driven from the initial posture to the target posture so that the three-dimensional body model essentially matches the target posture. The clothing model is moved by a combination of cloth simulation and skinning: parts that barely deform when the garment moves use skinning, and parts that deform strongly during the motion use cloth simulation. During driving, a relationship between the mesh vertices and the skeleton is established, the bone rotation parameters in the target posture are computed, and through the skinning-weight relationship each bone carries its associated vertices to the target position, driving the nearby vertices to complete the spatial movement. The motion from initial to target posture is computed frame by frame; in each frame the skin driving is completed and the cloth simulation is computed, and the next frame's state is computed only after the current frame finishes.
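The per-frame hybrid drive described above can be sketched as a loop: vertices tagged as "skinned" follow the bones directly, while vertices tagged as "cloth" are advanced by a physics step, and each frame completes before the next begins. The function names, the vertex tagging, and the toy step functions are all assumptions for illustration.

```python
def drive_garment(frames, verts, tags, skin_step, cloth_step):
    """frames: list of skeleton poses; verts: list of vertex positions;
    tags: per-vertex 'skinned' or 'cloth'; the steps are per-frame callables."""
    history = []
    for pose in frames:
        for i, tag in enumerate(tags):
            if tag == "skinned":
                verts[i] = skin_step(verts[i], pose)   # follow the bone rigidly
        verts = cloth_step(verts, tags, pose)           # simulate cloth vertices
        history.append(list(verts))                     # current frame finished
    return history

# Toy steps: skinned verts translate with the pose; cloth verts sag under gravity.
skin = lambda v, pose: (v[0] + pose, v[1])
cloth = lambda vs, tags, pose: [(x, y - 0.1) if t == "cloth" else (x, y)
                                for (x, y), t in zip(vs, tags)]
out = drive_garment([1.0, 1.0], [(0.0, 0.0), (0.0, 0.0)],
                    ["skinned", "cloth"], skin, cloth)
# after two frames: skinned vertex at x=2.0, cloth vertex at y=-0.2
```

The split lets the cheap skinning path handle the stable regions while the expensive simulation budget is spent only where the cloth actually deforms.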
The cloth simulation uses a collider system: the mannequin's vertex mesh is modeled as a rigid body and the garment mesh as a non-rigid body, and a physics engine simulates the collision relationship between them. During simulation, collisions between the rigid and non-rigid bodies are computed while the connecting forces between garment mesh cells are also taken into account, and the garment mesh state is computed frame by frame to reproduce the collision motion of the physical world. When the body model is driven to the target posture, several frames of gravity calculation must be applied to the garment cloth. A neural-network prediction model judges the continuity of the motion change and classifies the target posture as a candid shot captured mid-motion or a static posed shot: for a candid shot, inertia lets the garment keep its motion velocity so that the target state remains unsettled, and the number of gravity-calculation frames on reaching the target posture is reduced; for a posed shot, the number of gravity-calculation frames is increased to guarantee the fidelity of the garment cloth in the target posture.
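A minimal version of the cloth step above treats garment vertices as point masses connected by springs (the "connecting forces between mesh cells"), pulled by gravity and damped each frame until the cloth settles. The constants, the explicit-Euler integration, and the one-spring chain topology are illustrative assumptions only; a real engine would also resolve collisions against the body colliders.

```python
def settle_cloth(points, springs, rest_len, frames, k=50.0, g=-9.8,
                 dt=0.016, damping=0.9):
    """points: list of [x, y] (index 0 is pinned); springs: list of index pairs."""
    vel = [[0.0, 0.0] for _ in points]
    for _ in range(frames):
        force = [[0.0, g] for _ in points]              # gravity on every mass
        for a, b in springs:                            # spring force between a and b
            dx = points[b][0] - points[a][0]
            dy = points[b][1] - points[a][1]
            length = max((dx * dx + dy * dy) ** 0.5, 1e-9)
            f = k * (length - rest_len) / length        # Hooke's law, normalised
            force[a][0] += f * dx; force[a][1] += f * dy
            force[b][0] -= f * dx; force[b][1] -= f * dy
        for i in range(1, len(points)):                 # index 0 stays pinned
            vel[i][0] = (vel[i][0] + force[i][0] * dt) * damping
            vel[i][1] = (vel[i][1] + force[i][1] * dt) * damping
            points[i][0] += vel[i][0] * dt
            points[i][1] += vel[i][1] * dt
    return points

# A two-mass chain pinned at the origin settles hanging below the pin.
hung = settle_cloth([[0.0, 0.0], [1.0, 0.0]], [(0, 1)], rest_len=1.0, frames=400)
```

The `frames` argument plays the role of the gravity-calculation frame count in the text: fewer frames leave the cloth visibly in motion (the candid-shot case), while more frames let it settle into a posed rest state.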
Preferably, the fitting-motion step further comprises the following substeps: 1) obtain the position coordinates of the initial posture and the target posture; 2) generate an animation sequence moving from the initial posture to the target posture; 3) process the sequence by mesh frame interpolation while generating it; 4) set the interpolation speed to be slow near the initial and target points and fast during the middle of the motion; 5) hold the pose for several frames on reaching the final target posture, obtaining the complete animation sequence; and 6) complete the process of driving the bones from the initial posture to the target posture.
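The non-uniform interpolation of substeps 3) to 5) can be sketched with a smoothstep easing curve, which is slow near both endpoints and fast in the middle, followed by several held rest frames at the target. The easing curve and frame counts are illustrative choices, not values fixed by the method.

```python
def pose_sequence(start, target, n_frames, rest_frames=5):
    """Interpolate each pose value from start to target with ease-in-out timing."""
    frames = []
    for f in range(n_frames):
        t = f / (n_frames - 1)
        ease = t * t * (3.0 - 2.0 * t)          # smoothstep: slow-fast-slow
        frames.append([s + ease * (g - s) for s, g in zip(start, target)])
    frames.extend([list(target)] * rest_frames)  # hold the target pose at rest
    return frames

seq = pose_sequence([0.0, 0.0], [1.0, 2.0], n_frames=11)
# the first step is smaller than a mid-motion step, and the last 5 frames repeat the target
```

The held rest frames are what give the cloth simulation time to settle while the skeleton is already stationary.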
Preferably, the step of obtaining the body model parameters further comprises: feeding the two-dimensional human contour image into a first deep-learning neural network for joint-point regression, obtaining a joint-point map, a semantic segmentation map, body skeleton points, and key-point information of the target body; then feeding the generated body information into a second deep-learning neural network for regression of posture and body-shape parameters, obtaining three-dimensional human parameters comprising three-dimensional action-posture parameters and three-dimensional body-shape parameters. The three-dimensional body model carries a mathematical weight relationship between skeleton points and the body mesh, so determining the skeleton points determines the body model in the target posture.
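The two-stage regression above can be shown structurally: stage one maps the contour image to intermediate body information, and stage two regresses pose and shape parameters from it. The stub "networks" below are stand-ins for trained models; the intermediate keys, the 17-keypoint count, and the parameter sizes (75 pose values, 10 shape values, following the SMPL-style counts mentioned earlier in this document) are assumptions for illustration.

```python
def stage_one(contour_image):
    """Stand-in for the first network: contour image -> intermediate body info."""
    return {"joint_map": [[0.0]], "segmentation": [[0]],
            "keypoints": [(0.5, 0.5)] * 17}   # hypothetical intermediate outputs

def stage_two(body_info):
    """Stand-in for the second network: body info -> pose and shape parameters."""
    pose = [0.0] * 75     # three-dimensional action-posture parameters
    shape = [0.0] * 10    # three-dimensional body-shape parameters
    return pose, shape

def predict_params(contour_image):
    """Chain the two stages, as in the two-level model described above."""
    return stage_two(stage_one(contour_image))

pose, shape = predict_params([[0]])   # dummy 2-D contour image
```

Keeping the stages separate is what lets pose and shape be regressed from multiple intermediate cues rather than from raw pixels alone.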
Also disclosed is a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements any of the method steps described above.
Also disclosed is an electronic device comprising a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the bus; the memory stores a computer program, and the processor implements any of the above method steps when executing the program stored in the memory.
The beneficial effects of the invention are as follows:
1. The virtual garment is highly realistic. Realism here covers two aspects: the garment follows the body's state changes naturally, and the cloth texture of the garment is reproduced faithfully. First, the three-dimensional clothing model is put on the three-dimensional standard mannequin, and the body-shape parameters of the target body are input to obtain the target mannequin; by processing within the mathematical model, the clothing model changes along with the body shape as the standard mannequin becomes the target mannequin, so the change of clothing state looks very real. In addition, cloth simulation is used to approximate real cloth behaviour (a physical approximation, of course, not identical to reality); what is emphasised here is the high fidelity of the cloth-texture simulation, including the accuracy of simulated cloth prints. In particular, when the motion ends, several extra frames of cloth calculation are run on the garment so that its final texture state is fully reproduced and authenticity is guaranteed.
2. During cloth simulation, when the body model reaches the target posture, a neural network judges whether the input picture is static or in motion, and according to the case several frames of gravity calculation are applied to the garment cloth to simulate its final state under real motion, so that under the target posture the cloth shows a motion tendency consistent with the form of the photo, with good fidelity and texture.
3. Both texture and speed are achieved. On the premise that the cloth-simulation effect exceeds that of a typical model, parts that are not particularly important and do not deform severely are driven by bone-driven skinned meshes. The garment model is thus driven by a combination of cloth simulation and skinning: parts that barely deform after the garment moves, such as the upper body, use skinning to guarantee solving speed, while parts that deform during motion, such as the legs and the hem, use cloth simulation to guarantee fidelity and cloth texture.
4. User operation is simple. The invention analyses a whole-body photograph through a deep neural network to obtain accurate three-dimensional body model parameters, so the body can be modeled quickly from a single ordinary photo. Meanwhile, the three-dimensional clothing model is obtained by processing two-dimensional clothing pictures in advance; the user plays no part in this behind-the-scenes work and only needs to select a clothing style for virtual fitting, after which the system automatically matches the corresponding clothing model. The method fits the characteristics and trends of the internet age: simple and quick, requiring no preparation from the user, whose entire task is to upload one photo.
5. The bone-driving process is reasonable. To fit the target body's posture realistically, an optimized frame-interpolation method completes the transition from the initial posture to the target posture. Compared with traditional interpolation, the bone information of the target posture is obtained by model regression prediction, an animation sequence from the initial to the target posture is generated, and a bone-information time series is formed through interpolation modes such as linear and nearest-neighbour interpolation. While the sequence is generated, mesh frame interpolation is applied: the interpolation speed is slow near the initial and target points and fast in the middle of the motion, and in particular the model pauses for several frames on reaching the final target posture, yielding the complete animation sequence. Compared with uniform interpolation this is closer to the motion laws of the real physical world: the body model's posture stabilises, the clothing model's state stabilises during those frames, and the cloth simulation can be computed at the same time. The simulation of garment and body posture is better, and considerable processing time is saved.
6. Deep neural networks are used extensively. The invention fully exploits the advantages of deep learning networks and can restore the pose and body shape of the human body with high precision in a variety of complex scenes. Different neural networks are used for different purposes, with different input conditions and training modes, so that accurate contour separation of a human body against a complex background, semantic segmentation of the human body, and determination of key points and joint points are all realized; the influence of loose clothes and hairstyles is eliminated, and the real shape and form of the human body are approximated as closely as possible. The prior art also uses neural network models, but because the input conditions, input parameters and training modes differ, their functions and effects differ greatly.
7. The neural network model is more scientific and targeted. Some image-processing methods in the prior art are too eager to go straight from input to model without taking the time to polish the details: the mapping from a 2D picture to a 3D body model is completed purely by training on massive image data. Although the efficiency is very high, the processing flow is too simple; the three-dimensional body model is generated entirely by a neural network, the consistency of the body's proportions and detail parts is inferior to manual work, it does not help subsequent processing, and later steps may become difficult. In the invention, the outputs of the previous-stage neural network, the human body contour, the human body semantic segmentation, and the key points and joint points are all used as input items, so that the model parameters can be generated from multiple angles; the parameters output by the next-stage neural network fall into two categories, pose and shape, so that the action and the body type can be controlled separately, and, combined with the human body reference model, the pose and body type of the human body model can be copied accurately.
8. The human body model is accurate and controllable. The currently popular single-image human reconstruction methods mainly reconstruct a parameterized human model. The most commonly used parameterized model is the SMPL model of the Max Planck Institute, which contains 72 parameters describing body pose and 10 parameters describing body shape. For single-picture reconstruction, the two-dimensional joint positions are first estimated from the picture, and the SMPL parameters are then obtained through optimization by minimizing the projection distance between the three-dimensional joints and the two-dimensional plane joints, thereby obtaining the human body. However, the SMPL model is mainly obtained through deep learning and training on a large number of body examples; the relationship between the body shape and the shape basis is a global association that is very hard to decouple, the body part to be controlled cannot be controlled arbitrarily, and the generated model cannot achieve high consistency with the real body pose and shape. In addition, if the model is further applied to the subsequent dressing process, its ability to express the geometric details of the body surface is limited, and the detail texture of the clothing on the body surface cannot be reconstructed well. By contrast, the human body model of the invention is not obtained through training: its parameters have a correspondence based on a mathematical principle, that is, each group of parameters is independent, so the model is better interpretable during transformation and can better represent the shape change of a specific part of the body.
In popular terms, body types vary enormously from person to person; for many people the ratio of thigh to shank does not conform to any single precise ratio, and the model can control and adjust the lengths of the thighs and shanks separately through the input parameters, so as to determine the leg proportions accurately.
According to the invention, through a series of methods, such as combining skinning with cloth simulation, performing several frames of gravity calculation as appropriate when the movement ends, and driving the model's movement with variable-speed frame interpolation, the realism and fidelity of the three-dimensional clothing model are maintained while matching the clothing model to the human body model, a certain processing speed is guaranteed, and a good simulated dressing effect is obtained.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below. In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely intended to illustrate the invention and not to limit it. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the invention by showing examples of it.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising" does not exclude the presence of additional identical elements in a process, method, article, or apparatus that comprises the element.
The method for processing a human body image according to the embodiment of the present invention will be described in detail with reference to the accompanying drawings.
The invention provides a clothing model driving method, which comprises: obtaining a two-dimensional image of clothing; making a three-dimensional clothing model from the two-dimensional clothing image; constructing a three-dimensional standard human body model according to a mathematical model; fitting the three-dimensional clothing model onto the three-dimensional standard human body model in the initial pose; obtaining a two-dimensional image of a target human body and computing three-dimensional target human body model parameters through a two-stage neural network model; inputting the obtained pose and body-type parameters into the three-dimensional standard human body model for fitting; and driving the human body model to move from the initial pose to the target pose, so as to obtain a target human body model which is identical to the target human body in pose and body type and is wearing the changed clothing.
The method generally comprises several parts. Firstly, generating a three-dimensional clothing model; secondly, generating a standard human body model and putting the three-dimensional clothing model on it; thirdly, obtaining the parameters of the human body model in the target pose; and fourthly, making the body shape and pose of the standard human body model consistent with those of the target human body while simulating the realistic changes of the three-dimensional clothing model as the human body model changes.
The first part mainly generates the three-dimensional garment model. In the prior art there are several different approaches. The traditional way of building a three-dimensional garment model is based on two-dimensional garment pattern design and sewing; this approach requires some clothing expertise to design the patterns. Another, newer three-dimensional modeling approach is based on hand drawing, which can generate a simple garment model from line information drawn by a user. Yet another, after acquiring garment picture information, comprehensively applies image processing and graphics simulation to generate a virtual three-dimensional garment model: the outline and size of the garment in the picture are acquired through contour detection and classification, the edge-to-edge key points are found from the contour by machine learning, sewing information is generated from the correspondence of the key points, and finally physical sewing simulation is carried out on the garment in three-dimensional space to obtain the real effect of the garment worn on a human body. In addition, there are mapping methods, mathematical model simulation methods, and the like. The invention places no particular limitation on this part, but the three-dimensional garment model needs to be matched to a standard human model; the general requirement is that, starting from a garment model already matched to the standard human model, the garment model is matched to the human model in the target pose by cloth physical simulation while ensuring that the garment looks natural and reasonable.
Therefore, the garment model should generally meet some basic requirements, including but not limited to: a. fitting the initial pose of the standard mannequin completely without penetrating the body; b. outputting a uniform quadrilateral mesh; c. unwrapping, tiling, packing and aligning the model's UVs, with the map's UVs aligned manually in a tool such as Photoshop; d. merging duplicate vertices; e. reducing the faces of the output model evenly, with a reference standard of no more than 150,000 faces per garment; f. adjusting the material in mainstream garment design software, solving 10 frames of animation to observe the cloth effect, and saving the material parameters once the result meets expectations; g. adjusting the rendering material in the mainstream design software, previewing one render, and ensuring the material's Lambert properties are reasonable.
The second part is to design and model some standard models, namely basic mannequins, in advance according to our human modeling method, and to put the three-dimensional clothing model on the standard human model so as to fit our subsequent workflow. The main work is to construct a three-dimensional standard human body model, namely a basic mannequin, in combination with a mathematical model. The SMPL human model of the Max Planck Institute can avoid surface distortion of the human body during motion and can accurately describe the appearance of stretching and contracting muscles. In that method, beta and theta are the input parameters: beta represents 10 parameters of the human body such as fatness/thinness and head-to-body proportion, and theta represents 75 parameters of the overall motion pose of the human body and the relative angles of its 24 joints. The beta parameters are shape blend-shape parameters; the shape change of the human body can be controlled through 10 incremental templates, and the change of body form controlled by each parameter can be characterized in a diagram. By studying continuous animations of parameter changes, we can clearly see that every continuous change of a body-shape parameter causes local and even whole-body chain changes of the human model: in order to reflect the movement of human muscle tissue, every linear change of an SMPL parameter causes large-area mesh changes. In the figure, for example, when adjusting the beta-1 parameter, the model interprets the change as a whole-body change; you may only want to adjust the proportion of the waist, but the model will also force changes in the fatness of the legs, the chest and even the hands.
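The linear blend-shape mechanism described above can be sketched as follows. This is an illustrative toy, not the actual SMPL implementation: the function name, the two-vertex "mesh" and the single shape basis are all invented for demonstration, and the point is only that the offsets sum linearly, which is why one beta can move a large region of the mesh.

```python
import numpy as np

def apply_shape_blendshapes(template, shape_dirs, betas):
    """Deform a template mesh with linear shape blend shapes.

    template:   (V, 3) rest-pose vertex positions
    shape_dirs: (V, 3, B) per-vertex displacement basis, one slice per beta
    betas:      (B,) shape coefficients
    """
    # Each beta scales one displacement basis; offsets add linearly.
    return template + shape_dirs @ betas

# Toy example: 2 vertices, 1 shape basis that widens the body along x.
template = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
shape_dirs = np.array([[[0.5], [0.0], [0.0]],
                       [[-0.5], [0.0], [0.0]]])
widened = apply_shape_blendshapes(template, shape_dirs, np.array([1.0]))
```

With all betas at zero the template is returned unchanged; any nonzero beta displaces every vertex whose basis slice is nonzero, mimicking the coupled, large-area changes described in the text.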
Although this mode of operation can greatly simplify the workflow and increase efficiency, it is very inconvenient for a project that pursues modeling quality. Since the SMPL mannequin is trained on Western body photographs and measurement data and conforms to Western body types, the regular shape changes of the model essentially follow the general change curve of Western bodies; when it is applied to modeling Asian bodies, many problems occur, such as the proportions of the arms and legs, the waist, the neck, and the lengths of the legs and arms. Our study shows large differences in these respects, and if the SMPL mannequin is fitted by force, the final result cannot meet our requirements.
Therefore, the effect is improved by building a mannequin ourselves. The core is to construct our own human blendshape bases to achieve accurate independent control of the human body. Preferably, the three-dimensional standard mannequin (basic mannequin) is composed of 20 body base parameters and 170 bone parameters. The multiple bases form the whole human body model, and each body base is controlled and changed by its parameter independently, without affecting the others. On the one hand, the control parameters are increased: instead of the Max Planck Institute's ten beta control parameters, the adjustable parameters include not only overall fatness/thinness but also arm length, leg length, and the fatness of the waist, hips and chest; the bone parameters are more than doubled, so the range of adjustable parameters is greatly widened, providing a good foundation for fine design of the standard human model. By independent control, we mean that each base, such as the waist, legs, hands or head, is controlled independently, and each bone's length can be adjusted independently without physical linkage, so the mannequin can be fine-tuned well. Otherwise, the model would be unwieldy and could not be adjusted to a form that satisfies the designer.
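The independence property can be illustrated with a minimal sketch. It assumes, purely for demonstration, that each base owns a disjoint set of vertices; the function and the tiny "waist"/"legs" bases are hypothetical, but the sketch shows why one parameter cannot disturb another part when the bases do not overlap.

```python
import numpy as np

def apply_independent_bases(template, bases, params):
    """bases: list of (vertex_index_array, per-vertex offset array) pairs,
    with disjoint index sets so each parameter acts purely locally."""
    mesh = template.copy()
    for (idx, offset), p in zip(bases, params):
        mesh[idx] += p * offset   # only the vertices owned by this base move
    return mesh

# Toy mesh of 4 vertices: one base widens the "waist", one lengthens the "legs".
template = np.zeros((4, 3))
waist = (np.array([0, 1]), np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]))
legs  = (np.array([2, 3]), np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]))
out = apply_independent_bases(template, [waist, legs], [0.5, 2.0])
```

Adjusting the waist parameter leaves the leg vertices untouched and vice versa, unlike the globally coupled SMPL shape basis described above.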
Our model embodies a correspondence based on mathematical principle, which is in effect designing the model from both artistic aesthetics and statistical data analysis, so a model generated according to these design rules can be regarded as a correct model conforming to Asian body types. This differs markedly from the big-data-trained SMPL model: the parameter transformations are more interpretable, local body changes of the human model can be represented better, the changes rest on a mathematical principle, the parameters do not influence one another, and, for example, the arm and leg remain completely independent. In practice, so many different parameters are designed precisely to avoid the drawbacks of a human model trained on big data: the human body model is controlled precisely in more dimensions, not limited to a few indexes such as height, which greatly improves the modeling effect. This is only meaningful on the premise of building our own body bases, and setting up so many independent control parameters is indispensable for meeting the designers' requirements.
As for putting the three-dimensional clothing model on the standard human body model, this is conventional in the art; the invention does not limit it unduly, as long as the required effect is achieved.
The third part processes the acquired human body image to obtain the parameter information required for generating the human body model. In the past, skeletal key points were usually selected manually, but that is very inefficient and does not meet the fast pace of the Internet age, so today, with neural networks flourishing, key points are selected by a deep-learned neural network instead of by hand, which is the trend. How to use the neural network efficiently is a question requiring further study. In general, we adopt the idea of feeding "refined" data to a second-stage neural network to build our parameter acquisition system. As shown in figure 2, the parameters are generated by deep-learned neural networks, mainly comprising the following substeps:
1) acquiring a two-dimensional image of the target human body;
2) processing it to acquire a two-dimensional human body contour image of the target human body;
3) feeding the two-dimensional human body contour image into a first deep-learned neural network for joint-point regression;
4) acquiring the joint-point diagram of the target human body, together with semantic segmentation maps of the body parts, body key points and body skeleton points;
5) feeding the generated joint-point diagram, semantic segmentation map, body skeleton points and key-point information of the target human body into a second deep-learned neural network for regression of the human pose and body-type parameters; and
6) acquiring the output three-dimensional human body parameters, including three-dimensional human action pose parameters and three-dimensional human shape parameters.
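The data flow of the two-stage pipeline above can be sketched as follows. The network objects are hypothetical stand-ins (any framework could supply them); only the way the first stage's intermediate maps are "refined" into the second stage's inputs mirrors the substeps in the text, and the stub outputs are invented for demonstration.

```python
def estimate_body_parameters(photo, detector, stage1_net, stage2_net):
    contour = detector(photo)        # substep 2: contour/crop of the person
    maps = stage1_net(contour)       # substeps 3-4: joint map, segmentation, ...
    # "Refined" inputs for the second stage (assumed dict keys, illustrative).
    features = {
        "joint_map": maps["joint_map"],
        "segmentation": maps["segmentation"],
        "skeleton_points": maps["skeleton_points"],
        "keypoints": maps["keypoints"],
    }
    pose, shape = stage2_net(features)  # substeps 5-6: regress pose + shape
    return pose, shape

# Minimal stubs so the flow can be exercised end to end.
detector = lambda img: img
stage1 = lambda c: {"joint_map": 1, "segmentation": 2,
                    "skeleton_points": 3, "keypoints": 4}
stage2 = lambda f: ([0.0] * 75, [0.0] * 20)  # 75 pose + 20 shape parameters
pose, shape = estimate_body_parameters("photo.jpg", detector, stage1, stage2)
```

The parameter counts in the stub (75 pose, 20 shape) echo the figures given elsewhere in this description; real networks would of course replace the lambdas.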
The two-dimensional image of the target human body may be any two-dimensional image containing a human figure in any pose and any clothing. The two-dimensional human body contour image is acquired using a target detection algorithm, namely a fast region-proposal network based on a convolutional neural network.
Before the two-dimensional human body image is input into the first neural network model, the neural network must be trained; the training samples comprise standard two-dimensional human body images annotated with original joint positions, the original joint positions being marked on the two-dimensional human body images by hand with high accuracy. Here a target image is first acquired, and human body detection is performed on it using a target detection algorithm. Human detection does not mean measuring a real human body with an instrument; in the invention it actually means that, for any given image, the two-dimensional photograph contains sufficient information, such as the face, with the limbs and torso all included in the picture. The given image is then searched with a certain strategy to determine whether it contains a human body; if it does, parameters such as the position and size of the human body are given. In this embodiment, before acquiring the human body key points in the target image, human body detection is performed on the target image to obtain a bounding box marking the position of the human body. Because the input picture may be any picture, there will inevitably be some non-human background, such as tables, chairs, trees, cars or buildings, and these useless backgrounds are removed by mature algorithms.
Meanwhile, semantic segmentation, joint-point detection, skeleton detection and edge detection are also carried out; once this 1D point information and 2D surface information is collected, a good foundation is laid for generating the 3D human model later. The first-stage neural network is used to generate the joint-point diagram of the human body; optionally, the target detection algorithm can be a fast region-proposal network based on a convolutional neural network. The first neural network requires a large amount of training data: joint points are marked by hand on photographs collected from the Internet, which are then fed into the neural network for training. After deep learning, the network can produce, essentially instantly from an input photograph, a joint-point diagram with the same accuracy as manual annotation, while being tens or hundreds of times more efficient than manual work. Human joints are usually used as the key points of the human body.
In the invention, obtaining the joint positions of the human body in the photograph completes only the first step, yielding 1D point information; 2D surface information must also be generated from it, and this work can be completed through a neural network model and mature prior-art algorithms. The invention redesigns the workflow and the points at which the neural network models intervene, and reasonably designs the various conditions and parameters, so that parameter generation is more efficient and the degree of manual participation is reduced. This is very suitable for Internet application scenarios: for example, in a virtual dressing program, the user obtains the dressing result almost instantly without waiting, which plays a vital role in improving the program's attractiveness to users.
After the relevant 1D point information and 2D surface information is obtained, the joint-point diagram, semantic segmentation map, body skeleton points and key-point information of the target human body are taken as input items and fed into the second deep-learned neural network for regression of the human pose and body-type parameters. Through the regression calculation of the second neural network, several groups of three-dimensional human body parameters, including three-dimensional human action pose parameters and three-dimensional human shape parameters, can be output immediately. Preferably, this neural network is designed based on the three-dimensional standard mannequin (basic mannequin), a predicted three-dimensional mannequin, standard two-dimensional body images annotated with the original joint positions, and standard two-dimensional body images containing the predicted joint positions.
The fourth part, the most critical one, fits the parameters to the human body model and drives it, while ensuring that the state of the clothes after following the movement is as realistic as possible.
As shown in fig. 3, the moving process comprises the following substeps: mapping the obtained three-dimensional human pose and shape parameters to the multiple bases and bone parameters of the three-dimensional standard human body model, and inputting the obtained groups of base and bone parameters into the standard three-dimensional parametric human body model for fitting; the three-dimensional human body model has a mathematical weight relation between skeleton points and the model mesh, so determining the skeleton points relates the model to the human body model that determines the target pose. In this part, the two kinds of parameters generated in the previous part are substituted into the pre-designed mannequin to construct the 3D mannequin. These two kinds of parameters are similar in name to the parameters of the Max Planck Institute's SMPL human model, but their actual content differs greatly, because the bases of the models differ: the invention adopts a self-built three-dimensional standard human model (basic mannequin), each base of which is designed according to Asian body types and figure proportions and includes many parts unrelated to the SMPL model, whereas the SMPL model of the Max Planck Institute adopts a standard human model generated by big-data training. Their generation and computation modes differ, and although both are ultimately embodied as a generated 3D human model, their connotations are quite different. After this step, a preliminary 3D mannequin is obtained, including a mesh with skeleton position and length information.
In this part, after the three-dimensional garment model is put on the standard human body model, the body shape and pose of the standard human body model must be made consistent with those of the target human body, while the realistic changes of the three-dimensional garment model following the human body model are simulated. We use several methods to ensure that this goal is achieved.
Firstly, the pose of the target human body is determined through the three-dimensional human action pose parameters, and the bones are driven from the initial pose to the target pose, so that the three-dimensional human body model is basically consistent with the target human body pose.
In the animation field, it is common to use bones or joints as the source of the drive, together with skinning. Skinning establishes a relation between the model mesh vertices and the bones: during driving, the rotation parameters of the bones in the target pose are calculated, and then, through the skinning weights, the vertices near each bone are driven to complete the spatial movement, so that the associated vertices reach the target-pose position. For example, when a bone bends and rotates, the nearby bound mesh vertices are driven to complete the corresponding spatial movement. Likewise, since we have a self-built human model with a very large number of parameters representing human features, each small part of the human model can be expressed relatively finely. For example, after we obtain the rotation parameters of all 170 bones for the target pose, the mesh vertices of the entire mannequin can be driven to the target pose using the skinning weights. The benefit of skinning is speed: it is not necessary to interpolate every frame, and the garment model can be driven directly from the initial state to the target pose. However, its drawback is also obvious: there is no simulation of physical-world behavior, so the state after reaching the target position is quite unnatural and differs from the state after actual movement. The penetration is not obvious for a close-fitting garment model, but for a looser three-dimensional garment model the unnaturalness can be clearly seen.
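The skinning computation described above is classic linear blend skinning and can be sketched as follows; this is a generic textbook sketch, not the patent's specific implementation, and the one-bone toy example at the end is invented for demonstration.

```python
import numpy as np

def linear_blend_skinning(verts, weights, bone_transforms):
    """Linear blend skinning: each vertex follows a weighted blend of its
    bones' rigid transforms.

    verts:           (V, 3) rest-pose vertices
    weights:         (V, J) skinning weights, each row summing to 1
    bone_transforms: (J, 4, 4) homogeneous bone transforms for the pose
    """
    V = verts.shape[0]
    homo = np.hstack([verts, np.ones((V, 1))])                     # (V, 4)
    # Blend the transform matrices per vertex, then apply to the vertex.
    blended = np.einsum("vj,jab->vab", weights, bone_transforms)   # (V, 4, 4)
    out = np.einsum("vab,vb->va", blended, homo)
    return out[:, :3]

# Toy pose: one identity bone and one bone translated by +1 along x.
T = np.eye(4)
T[0, 3] = 1.0
transforms = np.stack([np.eye(4), T])
verts = np.array([[0.0, 0.0, 0.0]])
half = linear_blend_skinning(verts, np.array([[0.5, 0.5]]), transforms)
```

A vertex weighted half-and-half between the two bones moves halfway, which is exactly the "weighted drive of nearby vertices" the text describes.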
To solve these problems, the movement of the clothing model combines cloth simulation with skinning: the parts of the garment that basically do not deform, or deform very little, after the garment moves are handled by skinning, and the parts that deform considerably during the movement are handled by cloth simulation. Skinning is mainly used to improve the processing speed of the movement, while cloth simulation is mainly used to improve realism, that is, to simulate a cloth effect close to reality (of course, the simulated physics is not exactly the same as real physics), with high fidelity in the cloth texture simulation, including the accuracy of simulating cloth prints. Our approach provides both texture and speed. On the premise that the cloth simulation effect exceeds that of ordinary models, parts that are not particularly important and do not deform severely are driven by the bone-driven skinned mesh, and the garment model as a whole is driven by the combination of cloth simulation and skinning. Parts that basically do not deform after the garment moves, such as the upper half of a one-piece dress, yoga wear, tight clothing and other garments whose shape fits the pose closely, use skinning to ensure solving speed; parts that deform during movement use cloth simulation to ensure fidelity and cloth texture, such as the legs, hems, windbreakers, coats with hems, gauze garments and the like.
There is no clear-cut criterion for which parts use skinning and which use cloth simulation; this is actually judged from the final effect of the specific garment model generated. For example, if the upper half of a one-piece dress has parts such as water sleeves or lotus-leaf sleeves that change greatly with arm movement, cloth simulation is used there. The invention emphasizes that combining these two methods makes it possible to find a better balance between speed and quality.
Secondly, the moving state from the initial pose to the target pose is calculated frame by frame: in each frame, skinning is completed first and then the cloth simulation is computed, and the next frame's state is calculated only after the current frame's calculation is finished. This creates a sequence of intermediate states that maintains the high realism of frame-by-frame calculation. The cloth simulation uses a collision-body system: the vertex mesh of the human model is modeled as a rigid body and the mesh of the clothes model as a non-rigid body; a physics engine simulates the collision relation between them, the collisions between rigid and non-rigid bodies are calculated during the cloth simulation, and at the same time the connecting forces between the meshes of the clothes model are considered. The mesh state of the clothes model is calculated frame by frame, simulating the collision and motion processes of the physical world, so that the realistic moving state of the clothes model is maintained to the greatest extent. Frame-by-frame simulation reduces the simulation speed, but with the enhanced computing power of today's equipment a good balance is maintained.
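The per-frame loop above can be sketched as follows. The outer loop is the structure described in the text (skinned body first, then one cloth step against it); the "cloth" itself is a deliberately trivial stand-in, a single particle falling under gravity and projected out of a floor, standing in for rigid-body collision. All names are illustrative.

```python
def simulate_transition(body_frames, cloth_state, cloth_step):
    """body_frames: one skinned body state per interpolated frame.
    cloth_step: advances the cloth one step and resolves collisions."""
    for body in body_frames:              # skinning output for this frame
        cloth_state = cloth_step(cloth_state, body)
    return cloth_state

def toy_cloth_step(state, floor_y, dt=0.1, g=-9.8):
    """One explicit-integration step for a single cloth 'particle'."""
    pos, vel = state
    vel = vel + g * dt
    pos = pos + vel * dt
    if pos < floor_y:                     # collision: project out of the body
        pos, vel = floor_y, 0.0
    return pos, vel

# 50 frames against a static "body" floor at y = 0; the cloth settles on it.
final = simulate_transition([0.0] * 50, (1.0, 0.0),
                            lambda s, body: toy_cloth_step(s, body))
```

A real implementation would replace the particle with the garment mesh, the floor with the skinned rigid-body mesh, and the projection with a physics engine's collision response, but the frame ordering is the same.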
Thirdly, the number of gravity-calculation frames is adjusted when the human body model reaches the target pose. If the target pose is a snapshot of motion, the clothing model should keep its speed due to inertia, so the target-pose state is deliberately left unsettled and the number of gravity-calculation frames is reduced; if the target pose is static, the number of gravity-calculation frames is increased, so as to ensure the fidelity of the garment cloth in the target pose. In particular, when the movement ends, several frames of cloth simulation are calculated on the clothes, perfectly matched in time with the variable-speed frame interpolation that drives the model: this dwell process exists in the animation anyway and the model state is stable during it, so applying the cloth simulation to the clothes model in parallel costs essentially no extra time, yet fully reproduces the final texture state of the clothes and improves realism. In the cloth simulation process, when the human body model reaches the target pose, a neural network is used to judge whether the input picture shows a static or a moving state, and depending on the situation, several frames of gravity calculation are applied to the garment cloth to simulate the final state of the garment in the real moving state, so that in the target pose the garment cloth presents a movement trend consistent with the motion in the photograph, with good fidelity and texture.
In this step, the human model also completes the change from the initial pose to the target pose. Because only one photo is input, the pose of the target human body in the photo usually differs from that of the base human body model, and the change from the initial pose to the target pose must be completed in order to fit the pose of the target human body. In order to more realistically simulate the driving of the several sets of shape-basis and bone parameters in a standard three-dimensional parametric model of the human body, the method further comprises the following steps:
1) Obtain the position coordinates of the initial pose and the target pose. The initial pose parameters are determined by the initialization parameters of the standard human body model, and the bone information of the target pose is obtained by regression prediction with a neural network model.
2) Given the initial bone state and the state parameters of the target pose, the bone-information time sequence from the initial pose to the target pose is formed by frame interpolation, such as linear interpolation or nearest-neighbour interpolation. According to the number of bones driven in each frame, the driving process can be divided into modes such as global linear interpolation and driving parent nodes before their child nodes, taking the driving state in the simulated physical world into account.
3) In generating the animation sequence, mesh interpolation is adopted: after the bones are driven in each frame, the vertex and face information of the human model in the current state is computed from the skinning weight parameters of the standard human model, and the current mesh state of the human model is updated, recorded and stored.
4) The interpolation speed is set to be slow near the initial and target poses and fast in the middle of the motion; that is, this patent adopts a non-uniform interpolation speed in which the single-frame movement amplitude is small at the start and end of the motion and large in the middle. This simulates the acceleration stage at the start of a physical action in the real world, maintains a large inter-frame distance during the motion, and reduces the driving speed as the motion ends.
5) Several stationary frames are kept after driving to the final target pose, and the entire animation sequence is obtained. Compared with uniform interpolation, this is closer to the laws of motion in the real physical world, and the simulated effect is better.
6) The bones are driven to move from the initial pose to the target pose.
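Steps 2), 4) and 5) above can be sketched together as a single timeline builder: each bone parameter is interpolated from the initial pose to the target pose along a slow-fast-slow curve, and a few stationary frames are appended at the end. The smoothstep curve and all names here are illustrative stand-ins for the non-uniform interpolation the patent describes, not its actual formulas.

```python
def smoothstep(t):
    """Slow-fast-slow timing: small per-frame steps near the start and
    end of the motion, large steps in the middle (step 4)."""
    return t * t * (3.0 - 2.0 * t)

def bone_timeline(start, target, n_frames, hold_frames=3):
    """Interpolate every bone parameter from the initial pose to the
    target pose (step 2), then hold stationary frames (step 5)."""
    frames = []
    for f in range(n_frames):
        t = smoothstep(f / max(n_frames - 1, 1))
        frames.append([a + (b - a) * t for a, b in zip(start, target)])
    frames += [list(target)] * hold_frames   # stationary frames at the end
    return frames
```

Each frame in the returned sequence would then drive the skeleton and, via the skinning weights of step 3), update the mannequin mesh before the cloth simulation of that frame runs.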
The method of generating a three-dimensional human model according to the embodiments of the invention described in connection with Figs. 1 to 3 may be implemented by an apparatus for processing a human body image. Fig. 4 is a schematic diagram showing a hardware configuration 300 of an apparatus for processing a human body image according to an embodiment of the invention.
The invention also discloses a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program implements the steps of the clothing model driving method described above.
The electronic device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program, and the processor implements the steps of the clothing model driving method described above when executing the program stored in the memory.
As shown in Fig. 4, the apparatus 300 for implementing virtual fitting in this embodiment includes a processor 301, a memory 302, a communication interface 303, and a bus 310, where the processor 301, the memory 302, and the communication interface 303 are connected by the bus 310 and communicate with one another.
In particular, the processor 301 may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present invention.
Memory 302 may include mass storage for data or instructions. By way of example, and not limitation, memory 302 may comprise an HDD, a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 302 may include removable or non-removable (or fixed) media, where appropriate. The memory 302 may be internal or external to the apparatus 300 for processing a human body image, where appropriate. In a particular embodiment, the memory 302 is a non-volatile solid-state memory. In a particular embodiment, memory 302 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The communication interface 303 is mainly used to implement communication between each module, device, unit and/or apparatus in the embodiment of the present invention.
Bus 310 includes hardware, software, or both that couple the components of the apparatus 300 for processing a human body image to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus, or a combination of two or more of the above. Bus 310 may include one or more buses, where appropriate. Although embodiments of the invention have been described and illustrated with respect to a particular bus, the invention contemplates any suitable bus or interconnect.
That is, the apparatus 300 for processing a human body image shown in Fig. 4 may be implemented to include a processor 301, a memory 302, a communication interface 303, and a bus 310. The processor 301, the memory 302, and the communication interface 303 are connected by, and communicate with one another via, the bus 310. The memory 302 stores executable program code; the processor 301 reads this code and runs the corresponding program to execute the virtual fitting method of any of the embodiments of the invention, thereby implementing the method and apparatus for virtual fitting described in connection with Figs. 1 to 3.
The embodiment of the invention also provides a computer storage medium storing computer program instructions; when executed by a processor, the instructions implement the method for processing a human body image provided by the embodiments of the invention.
It should be understood that the invention is not limited to the particular arrangements and instrumentalities described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the invention are not limited to the specific steps described and shown; those skilled in the art may make various changes, modifications and additions, or change the order between steps, after appreciating the spirit of the invention.
The functional blocks shown in the above structural block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, they may be, for example, electronic circuits, application-specific integrated circuits (ASICs), suitable firmware, plug-ins, function cards, and so on. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber-optic media, radio frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the internet or an intranet.
It should also be noted that the exemplary embodiments mentioned in this disclosure describe some methods or systems based on a series of steps or devices. The present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, or may be performed in a different order from the order in the embodiments, or several steps may be performed simultaneously.
In the foregoing, only the specific embodiments of the present invention are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present invention is not limited thereto, and any equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present invention, and they should be included in the scope of the present invention.