
HK1246924A - Identifying method, device and storage medium for AU features - Google Patents

Identifying method, device and storage medium for AU features

Info

Publication number
HK1246924A
HK1246924A (application HK18106255.7A)
Authority
HK
Hong Kong
Prior art keywords
feature
image
face
real
time
Prior art date
Application number
HK18106255.7A
Other languages
Chinese (zh)
Other versions
HK1246924A1 (en)
HK1246924B (en)
Inventor
陈林 (Chen Lin)
张国辉 (Zhang Guohui)
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Priority to HK18106255.7A (patent HK1246924B)
Publication of HK1246924A1
Publication of HK1246924A
Publication of HK1246924B


Description

AU feature recognition method, device and storage medium
Technical Field
The present invention relates to the field of computer vision processing technologies, and in particular, to an AU feature recognition method, an AU feature recognition device, and a computer-readable storage medium.
Background
Facial emotion recognition is an important component of human-computer interaction and affective computing research. It draws on psychology, sociology, anthropology, life science, cognitive science, computer science, and other fields, and is of great significance for making human-computer interaction intelligent and natural.
The renowned psychologist Paul Ekman and his research partner W. V. Friesen conducted intensive research, delineating the correspondence between different facial muscle actions and different expressions through observation and biofeedback. The result of many years of this research was the Facial Action Coding System (FACS), created in 1976. According to the anatomical features of the human face, the face can be divided into a number of action units (AUs) that are both independent and interconnected; the motion characteristics of these action units and the main facial areas they control reflect facial expression.
At present, judging facial expression by recognizing AU features in a facial image is broadly applicable and can be highly accurate. However, most AU feature recognition methods in the industry collect a large number of AU samples, sort them into several classes, and train a single AU feature recognition model with a convolutional neural network to recognize AU features; the accuracy of this approach is low.
Disclosure of Invention
The present invention provides an AU feature recognition method, an AU feature recognition device, and a computer-readable storage medium. Its main aim is to recognize AU features in the feature regions of a real-time facial image through different AU classifiers, effectively improving the efficiency of AU feature recognition.
To achieve the above object, the present invention provides an electronic device, comprising a memory, a processor, and a camera device, wherein the memory includes an AU feature recognition program that implements the following steps when executed by the processor:
a real-time image capturing step: acquiring a real-time image captured by a camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
a facial feature point recognition step: inputting the real-time facial image into a pre-trained face average model, and recognizing t facial feature points from the real-time facial image by using the face average model;
a local feature extraction step: determining, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extracting local features from the feature region, and generating a plurality of feature vectors; and
an AU feature prediction step: inputting the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region.
Optionally, when executed by the processor, the AU feature recognition program further implements the steps of:
a judging step: judging whether the probability of each AU feature in the prediction result is greater than a preset threshold.
Optionally, the judging step further includes:
when an AU feature whose probability is greater than the preset threshold exists in the prediction result, prompting that the AU feature is recognized from the real-time facial image.
In addition, to achieve the above object, the present invention further provides an AU feature recognition method, including:
a real-time image capturing step: acquiring a real-time image captured by a camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
a facial feature point recognition step: inputting the real-time facial image into a pre-trained face average model, and recognizing t facial feature points from the real-time facial image by using the face average model;
a local feature extraction step: determining, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extracting local features from the feature region, and generating a plurality of feature vectors; and
an AU feature prediction step: inputting the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region.
Optionally, the method further comprises:
a judging step: judging whether the probability of each AU feature in the prediction result is greater than a preset threshold.
Optionally, the judging step further includes:
when an AU feature whose probability is greater than the preset threshold exists in the prediction result, prompting that the AU feature is recognized from the real-time facial image.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium including an AU feature recognition program therein, which when executed by a processor, implements any of the steps in the AU feature recognition method as described above.
With the AU feature recognition method, the electronic device, and the computer-readable storage medium provided by the invention, the feature region corresponding to each AU feature is cropped from the real-time facial image and input into the corresponding AU classifier to obtain a prediction result for that AU feature, thereby improving the accuracy of AU feature recognition.
Drawings
FIG. 1 is a diagram of an electronic device according to a preferred embodiment of the present invention;
FIG. 2 is a functional block diagram of an AU feature recognition program of FIG. 1;
FIG. 3 is a flowchart of an AU feature recognition method according to a first embodiment of the present invention;
FIG. 4 is a flowchart of an AU feature recognition method according to a second embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides an electronic device 1. Referring to fig. 1, a schematic diagram of an electronic device 1 according to a preferred embodiment of the invention is shown.
In the present embodiment, the electronic device 1 may be a terminal device having an arithmetic function, such as a server, a smart phone, a tablet computer, a portable computer, or a desktop computer.
The electronic device 1 includes a processor 12, a memory 11, a camera device 13, a network interface 14, and a communication bus 15. The camera device 13 is installed in a specific location, such as an office or a monitoring area, captures real-time images of targets entering that location, and transmits the captured real-time images to the processor 12 through a network. The network interface 14 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The communication bus 15 realizes connection and communication between these components.
The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory, and the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic apparatus 1, such as a hard disk of the electronic apparatus 1. In other embodiments, the readable storage medium may also be an external memory of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1.
In the present embodiment, the readable storage medium of the memory 11 is generally used for storing an Action Unit (AU) feature recognition program 10 installed in the electronic device 1, a face image sample library, a pre-trained face average model, an AU classifier, and the like. The memory 11 may also be used to temporarily store data that has been output or is to be output.
The processor 12 may, in some embodiments, be a central processing unit (CPU), a microprocessor, or another data processing chip, and is used for executing program code stored in the memory 11 or for processing data, for example executing the AU feature recognition program 10.
Fig. 1 only shows the electronic device 1 with components 11-15, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may alternatively be implemented.
Optionally, the electronic device 1 may further include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or other equipment with a voice recognition function, and a voice output device such as a loudspeaker or a headset. Optionally, the user interface may also include a standard wired interface and a wireless interface.
Optionally, the electronic device 1 may further comprise a display, which may also be appropriately referred to as a display screen or display unit. In some embodiments, the display device may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display is used for displaying information processed in the electronic apparatus 1 and for displaying a visualized user interface.
Optionally, the electronic device 1 further comprises a touch sensor. The area provided by the touch sensor for the user to perform touch operation is called a touch area. Further, the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like. The touch sensor may include not only a contact type touch sensor but also a proximity type touch sensor. Further, the touch sensor may be a single sensor, or may be a plurality of sensors arranged in an array, for example.
The area of the display of the electronic device 1 may be the same as or different from the area of the touch sensor. Optionally, a display is stacked with the touch sensor to form a touch display screen. The device detects touch operation triggered by a user based on the touch display screen.
Optionally, the electronic device 1 may further include a Radio Frequency (RF) circuit, a sensor, an audio circuit, and the like, which are not described herein again.
In the apparatus embodiment shown in fig. 1, the memory 11, which is a kind of computer storage medium, may include therein an operating system, and an AU feature recognition program 10; the processor 12, when executing the AU feature recognition program 10 stored in the memory 11, realizes the following steps:
acquiring a real-time image captured by the camera device 13, extracting a real-time facial image from the real-time image by using a face recognition algorithm, and cropping the feature region corresponding to each AU feature from the real-time facial image; the processor 12 then calls the pre-trained AU classifiers from the memory 11 and inputs the feature region corresponding to each AU feature into the corresponding AU classifier, obtaining a prediction result for each AU feature recognized from the real-time facial image, which facilitates the subsequent judgment of the emotion in the current facial image.
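As a concrete illustration of this flow, the following is a minimal Python sketch; extract_face and au_feature_vector are hypothetical helpers (sketched in the sections below), and per-AU scikit-learn classifiers with a predict_proba method are assumed:

```python
# Hedged orchestration sketch of the flow above; extract_face and
# au_feature_vector are hypothetical helpers sketched later in this document,
# and classifiers is assumed to map AU names to trained scikit-learn SVMs.
import cv2

def predict_aus(frame_bgr, region_boxes, classifiers):
    """Return {AU name: probability} for one real-time frame."""
    face = extract_face(frame_bgr)              # real-time facial image
    if face is None:
        return {}
    gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)
    probs = {}
    for au, box in region_boxes.items():        # one feature region per AU
        vec = au_feature_vector(gray, box)      # HOG feature vector
        probs[au] = classifiers[au].predict_proba([vec])[0, 1]
    return probs
```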
In other embodiments, the AU feature recognition program 10 may also be divided into one or more modules, which are stored in the memory 11 and executed by the processor 12 to implement the present invention. A module here is a series of computer program instruction segments capable of performing specified functions.
Referring to fig. 2, a functional block diagram of the AU feature recognition program 10 in fig. 1 is shown.
The AU feature recognition program 10 may be divided into: an acquisition module 110, a recognition module 120, a feature extraction module 130, a prediction module 140, a determination module 150, and a prompt module 160.
The acquiring module 110 is configured to acquire a real-time image captured by the camera device 13 and extract a real-time facial image from it using a face recognition algorithm. When the camera device 13 captures a real-time image, it sends the image to the processor 12. On receiving the image, the acquiring module 110 first obtains its size and creates a grayscale image of the same size; converts the color image into the grayscale image while allocating memory space; equalizes the histogram of the grayscale image to reduce its information content and speed up detection; then loads a pre-trained face detection library (such as the OpenCV cascade classifiers originally developed by Intel), detects the faces in the picture, returns an object containing face information, obtains the face position data, and records the number of faces; finally, it obtains and stores the face region, completing the real-time facial image extraction.
Alternatively, the face recognition algorithm for extracting the real-time facial image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, or the like.
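A minimal Python sketch of the extraction pipeline just described, assuming OpenCV (the Haar cascade file bundled with opencv-python is used purely for illustration):

```python
# Minimal sketch of the real-time facial image extraction described above.
# Assumes OpenCV; the bundled Haar cascade file is illustrative.
import cv2

def extract_face(frame_bgr):
    """Return the cropped face region of a BGR frame, or None if no face is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)  # color image -> grayscale
    gray = cv2.equalizeHist(gray)                       # histogram equalization
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                               # position data of the first face
    return frame_bgr[y:y + h, x:x + w]                  # store/return the face region
```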
The recognition module 120 is configured to input the real-time facial image into a pre-trained face average model and recognize t facial feature points from it using the face average model. Take t = 76; that is, the face average model contains 76 facial feature points. After the acquiring module 110 extracts the real-time facial image, the recognition module 120 calls the trained face average model of facial feature points from the memory 11, aligns the real-time facial image with the face average model, and then uses a feature extraction algorithm to search the real-time facial image for the 76 facial feature points matching the 76 facial feature points of the face average model. The face average model is constructed and trained in advance; a specific embodiment is described in the AU feature recognition method below.
In this embodiment, the feature extraction algorithm is the SIFT (scale-invariant feature transform) algorithm. The SIFT algorithm extracts the local feature of each facial feature point from the face average model, selects an eye or lip feature point as a reference feature point, and searches the real-time facial image for feature points whose local features are the same as or similar to those of the reference feature point, for example by checking whether the difference between the local features of the two feature points is within a preset range; if so, the feature point is taken as a facial feature point. This procedure is repeated until all the facial feature points are found in the real-time facial image. In other embodiments, the feature extraction algorithm may also be the SURF (Speeded-Up Robust Features) algorithm, the LBP (Local Binary Patterns) algorithm, the HOG (Histogram of Oriented Gradients) algorithm, etc.
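The descriptor matching just described can be sketched as follows; this is only an illustration of SIFT matching with a preset distance range, assuming OpenCV's SIFT implementation, not the full landmark search:

```python
# Hedged sketch of matching landmark descriptors between the face average model
# image and a live face image with SIFT; max_distance is the preset range.
import cv2

def match_landmarks(mean_face_gray, live_face_gray, max_distance=200.0):
    sift = cv2.SIFT_create()
    kp_ref, des_ref = sift.detectAndCompute(mean_face_gray, None)
    kp_live, des_live = sift.detectAndCompute(live_face_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).match(des_ref, des_live)
    good = [m for m in matches if m.distance < max_distance]  # same/similar local features
    return [(kp_ref[m.queryIdx].pt, kp_live[m.trainIdx].pt) for m in good]
```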
The feature extraction module 130 is configured to determine, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extract local features from the feature region, and generate a plurality of feature vectors. In one embodiment, according to the Facial Action Coding System (FACS) summarized by Ekman and Friesen, the human face has 39 main facial action units (AUs). Each AU codes the contraction of a small set of facial muscles, for example AU1 (raising the inner eyebrow corner), AU2 (raising the outer eyebrow corner), AU9 (wrinkling the nose), and AU22 (tightening the lips and turning them outward). For AU1 and AU2, the matching feature region is the eyebrow area: the feature extraction module 130 determines the forehead, eyebrow, and eye regions in the real-time facial image from the 76 facial feature points recognized by the recognition module 120, takes them as the feature regions matching AU1 and AU2, extracts the HOG features of the inner and outer eyebrow corners from these regions, and forms the feature vectors V1 and V2 of the AU1 and AU2 feature regions, respectively. For AU9 and AU22, the matching feature regions are the nose and lips: the feature extraction module 130 determines the nose and lip regions in the real-time facial image from the 76 facial feature points, takes them as the feature regions matching AU9 and AU22, extracts HOG features from the nose region and the lip region, and forms the feature vectors V9 and V22 of the AU9 and AU22 feature regions, respectively.
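A short sketch of forming one per-AU feature vector, assuming scikit-image's HOG implementation; the region box would be derived from the 76 landmarks:

```python
# Sketch of extracting the HOG feature vector of one AU feature region.
# Assumes scikit-image; region_box comes from the 76 facial feature points.
import cv2
from skimage.feature import hog

def au_feature_vector(face_gray, region_box, size=(64, 64)):
    x, y, w, h = region_box                        # e.g., brow/eye box for AU1, AU2
    patch = cv2.resize(face_gray[y:y + h, x:x + w], size)
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2))             # flattened feature vector (V1, V2, ...)
```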
The prediction module 140 is configured to input the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region. There are 39 pre-trained AU classifiers, one for each AU. The prediction module inputs the feature vectors V1, V2, V9, and V22 into the AU classifiers of AU1, AU2, AU9, and AU22, respectively, and each classifier outputs the probability that its AU is recognized from the corresponding feature region.
The determining module 150 is configured to judge whether, in the prediction result, the probability of the corresponding AU feature recognized from each feature region is greater than a preset threshold. Assume the probabilities with which the AU classifiers recognize AU1, AU2, AU9, and AU22 from the current real-time facial image are 0.45, 0.51, 0.60, and 0.65, respectively, and the preset threshold is 0.50; the determining module 150 compares the probability of each AU feature recognized from the real-time facial image against the preset threshold (0.50).
The prompting module 160 is configured to prompt that an AU feature is recognized from the real-time facial image when, in the prediction result, the probability of that AU feature recognized from its feature region is greater than the preset threshold. Since the probability of AU1 recognized from the current real-time facial image is smaller than the preset threshold while the probabilities of AU2, AU9, and AU22 are greater than it, the prompting module 160 prompts that AU2, AU9, and AU22 are recognized from the current real-time facial image.
The electronic device 1 provided in this embodiment extracts feature areas matched with each AU from the real-time image, and identifies corresponding AU features from the feature areas, thereby improving the accuracy of AU feature identification.
In addition, the invention also provides an AU characteristic identification method. Fig. 3 is a flowchart illustrating an AU feature recognition method according to a first embodiment of the present invention. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
In this embodiment, the AU feature recognition method includes: step S10-step S40.
In step S10, a real-time image captured by the camera device is acquired, and a real-time facial image is extracted from it using a face recognition algorithm. When the camera device 13 captures a real-time image, it sends the image to the processor 12. On receiving the image, the processor first obtains its size and creates a grayscale image of the same size; converts the color image into the grayscale image while allocating memory space; equalizes the histogram of the grayscale image to reduce its information content and speed up detection; then loads a pre-trained face detection library (such as the OpenCV cascade classifiers originally developed by Intel), detects the faces in the picture, returns an object containing face information, obtains the face position data, and records the number of faces; finally, the face region is obtained and stored, completing the real-time facial image extraction.
Alternatively, the face recognition algorithm for extracting the real-time facial image from the real-time image may be a geometric-feature-based method, a local feature analysis method, an eigenface method, an elastic-model-based method, a neural network method, or the like.
In step S20, the real-time face image is input to a face average model trained in advance, and t face feature points are recognized from the real-time face image by using the face average model.
Wherein the face average model is obtained by the following method:
a first sample library of n face images was created, and 76 feature points were manually marked at the positions of the eyes, eyebrows, nose, mouth, and outer contour of the face in each face image. The 76 feature points in each human face image form a shape feature vector S, and n shape feature vectors S of the face are obtained.
The t facial feature points are used to train a facial feature recognition model, yielding the face average model. The facial feature recognition model is an Ensemble of Regression Trees (ERT) algorithm, formulated as follows:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + \tau_t\left(I, \hat{S}^{(t)}\right)$$

where t denotes the cascade number and τ_t(·, ·) denotes the regressor at the current stage. Each regressor is composed of a number of regression trees obtained by training. Ŝ(t) is the shape estimate of the current model; each regressor τ_t(·, ·) predicts an increment τ_t(I, Ŝ(t)) from the input image I and the current shape estimate Ŝ(t), and this increment is added to the current shape estimate to improve the model. Each stage of the regressor performs its prediction according to the feature points. The training data set is (I1, S1), ..., (In, Sn), where Ii is a sample image and Si is the shape feature vector formed by the feature points of that sample image.
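The cascade update is easy to state in code; the following is a conceptual sketch only (the stage regressors and their training are assumed, not implemented):

```python
# Conceptual sketch of the ERT cascade update S(t+1) = S(t) + tau_t(I, S(t)).
# regressors is an assumed list of trained stage regressors, each mapping
# (image, current_shape) -> shape increment (a NumPy array of point offsets).
import numpy as np

def run_cascade(image, mean_shape, regressors):
    shape = np.array(mean_shape, dtype=float)   # initialize with the face average model
    for tau in regressors:
        shape = shape + tau(image, shape)       # add the predicted increment
    return shape                                # final estimate of the 76 feature points
```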
In the model training process, the number of face images in the first sample library is n, and each sample picture has 76 marked feature points forming its shape feature vector. A first regression tree is trained on a subset of the feature points of all sample pictures (for example, 50 points randomly taken from the 76 feature points of each sample picture); a second tree is then trained on the residual between the predictions of the first regression tree and the true values of those feature points (the weighted mean of the 50 points taken from each sample picture), and so on, until the residual between the predictions of the Nth tree and the true values is close to 0. All regression trees of the ERT algorithm are thereby obtained, the face average model (mean shape) is derived from these trees, and the model file and sample library are saved to the memory.
After the real-time facial image is extracted, the trained face average model of facial feature points is called from the memory, the real-time facial image is aligned with the face average model, and a feature extraction algorithm is then used to search the real-time facial image for the 76 facial feature points matching the 76 facial feature points of the face average model.
The feature extraction algorithm may be the SIFT (scale-invariant feature transform) algorithm, the SURF (Speeded-Up Robust Features) algorithm, the LBP (Local Binary Patterns) algorithm, the HOG (Histogram of Oriented Gradients) algorithm, etc.
Step S30 is to determine a feature region matching each AU in the real-time face image based on the position of the t face feature points, extract local features from the feature region, and generate a plurality of feature vectors.
According to the Facial Action Coding System (FACS) summarized by Ekman and Friesen, the human face has 39 main facial action units (AUs). Each AU codes the contraction of a small set of facial muscles, for example AU1 (raising the inner eyebrow corner), AU2 (raising the outer eyebrow corner), AU9 (wrinkling the nose), and AU22 (tightening the lips and turning them outward). For AU1 and AU2, the matching feature region is the eyebrow area: the forehead, eyebrow, and eye regions in the real-time facial image are determined from the 76 recognized facial feature points as the feature regions matching AU1 and AU2, the HOG features of the inner and outer eyebrow corners are extracted from these regions, and the feature vectors V1 and V2 of the AU1 and AU2 feature regions are formed, respectively. For AU9 and AU22, the matching feature regions are the nose and lips: the nose and lip regions in the real-time facial image are determined from the 76 facial feature points as the feature regions matching AU9 and AU22, HOG features are extracted from the nose region and the lip region, and the feature vectors V9 and V22 of the AU9 and AU22 feature regions are formed, respectively.
In step S40, the plurality of feature vectors are respectively input into pre-trained AU classifiers matched with the feature regions, obtaining a prediction result of the corresponding AU feature recognized from each feature region.
The number of the pre-trained AU classifiers is 39, which correspond to AU1, AU2, AU3, … and AU39 respectively, and the pre-trained AU classifiers are obtained by the following steps:
In the first sample library, an image region matching each AU (i.e., a region of the face image containing that AU) is cropped from each face sample image as a positive sample image of that AU, and a negative sample image is prepared for each AU, yielding positive and negative sample images for every AU. The image regions corresponding to different AUs may be the same; for example, AU1, AU2, and AU4 all relate to the region of the face image containing the eyebrows, eyes, and forehead, while AU9 and AU22 relate to the nose and lip regions. A picture region that does not contain the AU can serve as a negative sample image for that AU. The positive and negative sample images of each AU are normalized to the same size. Local features such as HOG features are extracted from each positive and negative sample image and stored as feature vectors; a support vector machine (SVM) classifier is then trained with the local features of the positive/negative sample images of each AU, yielding one classifier per AU.
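A minimal sketch of one per-AU classifier's training, assuming scikit-learn; X_pos and X_neg stand for the HOG vectors of that AU's positive and negative sample images:

```python
# Minimal per-AU SVM training sketch, assuming scikit-learn.
# X_pos / X_neg: HOG feature vectors of one AU's positive / negative samples.
import numpy as np
from sklearn.svm import SVC

def train_au_classifier(X_pos, X_neg):
    X = np.vstack([X_pos, X_neg])
    y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg))])
    clf = SVC(kernel="linear", probability=True)  # probability=True enables the
    clf.fit(X, y)                                 # per-AU probability output
    return clf
```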
The feature vectors V1, V2, V9, and V22 are respectively input into the AU classifiers of AU1, AU2, AU9, and AU22, and each classifier outputs the probability that the corresponding AU is recognized from its feature region.
In the AU feature recognition method provided in this embodiment, the feature region matching each AU is cropped from the real-time image, and the probability of recognizing the AU feature from that region is determined by the corresponding AU classifier. Because the AU features in the different feature regions of the real-time facial image are recognized by different AU classifiers, the efficiency of AU feature recognition is effectively improved.
A second embodiment of the AU feature recognition method is proposed based on the first embodiment. Fig. 4 is a flowchart illustrating an AU feature recognition method according to a second embodiment of the present invention. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
In this embodiment, the AU feature recognition method includes: step S10-step S70. The contents of steps S10-S40 are substantially the same as those in the first embodiment, and are not described herein again.
In step S50, it is judged whether the probability of each AU feature in the prediction result is greater than a preset threshold.
Assume the probabilities with which the AU classifiers recognize AU1, AU2, AU9, and AU22 from the current real-time facial image are 0.45, 0.51, 0.60, and 0.65, respectively, and the preset threshold is 0.50; the probability of each AU feature is compared against the preset threshold.
In step S60, when an AU feature whose probability is greater than the preset threshold exists in the prediction result, it is prompted that the AU feature is recognized from the real-time facial image. Since the probability of AU1 recognized from the current real-time facial image is smaller than the preset threshold while the probabilities of AU2, AU9, and AU22 are greater than it, it is judged that AU2, AU9, and AU22 are recognized from the current real-time facial image and AU1 is not.
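Steps S50 and S60 amount to a simple threshold filter; a sketch with illustrative names:

```python
# Sketch of steps S50/S60: keep the AUs whose predicted probability exceeds
# the preset threshold. The dictionary keys below are illustrative.
def recognized_aus(predictions, threshold=0.50):
    """predictions: mapping such as {'AU1': 0.45, 'AU2': 0.51, ...}."""
    return [au for au, p in predictions.items() if p > threshold]

# Example from the text:
# recognized_aus({'AU1': 0.45, 'AU2': 0.51, 'AU9': 0.60, 'AU22': 0.65})
# -> ['AU2', 'AU9', 'AU22']
```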
Compared with the first embodiment, the AU feature recognition method of this embodiment crops the feature region matched with each AU from the real-time image, determines the probability of recognizing the AU feature from that region through the corresponding AU classifier, recognizes the AU features in the feature regions of the real-time facial image through different AU classifiers, and additionally sets a threshold to filter the probability output by each AU classifier, which effectively improves the accuracy of AU feature recognition.
Furthermore, an embodiment of the present invention further provides a computer-readable storage medium, where an AU feature recognition program is included in the computer-readable storage medium, and the AU feature recognition program, when executed by a processor, implements the following operations:
a real-time image capturing step: acquiring a real-time image captured by a camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
a facial feature point recognition step: inputting the real-time facial image into a pre-trained face average model, and recognizing t facial feature points from the real-time facial image by using the face average model;
a local feature extraction step: determining, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extracting local features from the feature region, and generating a plurality of feature vectors; and
an AU feature prediction step: inputting the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region.
Optionally, when executed by the processor, the AU feature recognition program further implements the steps of:
a judging step: judging whether the probability of each AU feature in the prediction result is greater than a preset threshold.
Optionally, the judging step further includes:
when an AU feature whose probability is greater than the preset threshold exists in the prediction result, prompting that the AU feature is recognized from the real-time facial image.
The specific implementation of the computer readable storage medium of the present invention is substantially the same as the specific implementation of the AU feature recognition method described above, and will not be described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An electronic device, comprising: a memory, a processor, and a camera device, wherein the memory includes an action unit (AU) feature recognition program, and the AU feature recognition program implements the following steps when executed by the processor:
a real-time image capturing step: acquiring a real-time image captured by a camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
a facial feature point recognition step: inputting the real-time facial image into a pre-trained face average model, and recognizing t facial feature points from the real-time facial image by using the face average model;
a local feature extraction step: determining, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extracting local features from the feature region, and generating a plurality of feature vectors; and
an AU feature prediction step: inputting the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region.
2. The electronic device of claim 1, wherein the AU feature recognition program, when executed by the processor, further performs the steps of:
a judging step: judging whether the probability of each AU feature in the prediction result is greater than a preset threshold.
3. The electronic device of claim 2, wherein the judging step further comprises:
when an AU feature whose probability is greater than the preset threshold exists in the prediction result, prompting that the AU feature is recognized from the real-time facial image.
4. The electronic device of claim 1, wherein the step of training the AU classifier comprises:
a sample preparation step: collecting a face sample image, respectively intercepting an image area matched with each AU from the face sample image as a positive sample image of the AU, and preparing a negative sample image for each AU;
a local feature extraction step: extracting local features of the positive sample image and the negative sample image of each AU to generate corresponding feature vectors;
a model training step: training a support vector machine (SVM) classifier with the local features of the positive/negative sample images of each AU to obtain the corresponding AU classifier.
5. An AU feature recognition method, the method comprising:
a real-time image capturing step: acquiring a real-time image captured by a camera device, and extracting a real-time facial image from the real-time image by using a face recognition algorithm;
a facial feature point recognition step: inputting the real-time facial image into a pre-trained face average model, and recognizing t facial feature points from the real-time facial image by using the face average model;
a local feature extraction step: determining, according to the positions of the t facial feature points, a feature region in the real-time facial image that matches each AU, extracting local features from the feature region, and generating a plurality of feature vectors; and
an AU feature prediction step: inputting the plurality of feature vectors respectively into pre-trained AU classifiers matched with the feature regions, to obtain a prediction result of the corresponding AU feature recognized from each feature region.
6. The AU feature recognition method of claim 5, further comprising:
a judging step: judging whether the probability of each AU feature in the prediction result is greater than a preset threshold.
7. The AU feature recognition method of claim 5 or 6, wherein the judging step further comprises:
when an AU feature whose probability is greater than the preset threshold exists in the prediction result, prompting that the AU feature is recognized from the real-time facial image.
8. The AU feature recognition method of claim 5, wherein the face average model is obtained by training a facial feature recognition model, the facial feature recognition model being an ERT algorithm expressed by the following formula:

$$\hat{S}^{(t+1)} = \hat{S}^{(t)} + \tau_t\left(I, \hat{S}^{(t)}\right)$$

where t denotes the cascade number, τ_t(·, ·) denotes the regressor of the current stage, and Ŝ(t) is the shape estimate of the current model; each regressor τ_t(·, ·) predicts an increment τ_t(I, Ŝ(t)) from the input current image I and Ŝ(t), and this increment is added to the current shape estimate to improve the current model; in the model training process, a first regression tree is trained on partial feature points of all sample pictures, a second tree is trained on the residual between the predicted values of the first regression tree and the true values of the partial feature points, and so on, until the residual between the predicted values of the Nth tree and the true values of the partial feature points is close to 0, whereby all regression trees of the ERT algorithm are obtained and the face average model is derived from these regression trees.
9. The AU feature recognition method of claim 5, wherein the training of the AU classifier comprises:
a sample preparation step: collecting a face sample image, respectively intercepting an image area matched with each AU from the face sample image as a positive sample image of the AU, and preparing a negative sample image for each AU;
a local feature extraction step: extracting local features of the positive sample image and the negative sample image of each AU to generate corresponding feature vectors;
a model training step: training a support vector machine (SVM) classifier with the local features of the positive/negative sample images of each AU to obtain the corresponding AU classifier.
10. A computer-readable storage medium, characterized in that an AU feature recognition program is included therein, which when executed by a processor, implements the steps of an AU feature recognition method according to any one of claims 5 to 9.
HK18106255.7A 2018-05-15 2018-05-15 Identifying method, device and storage medium for AU features HK1246924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
HK18106255.7A HK1246924B (en) 2018-05-15 2018-05-15 Identifying method, device and storage medium for AU features


Publications (3)

Publication Number Publication Date
HK1246924A1 (en) 2018-09-14
HK1246924A true HK1246924A (en) 2018-09-14
HK1246924B HK1246924B (en) 2019-09-06

Family

ID=71144495

Family Applications (1)

Application Number Title Priority Date Filing Date
HK18106255.7A HK1246924B (en) 2018-05-15 2018-05-15 Identifying method, device and storage medium for AU features

Country Status (1)

Country Link
HK (1) HK1246924B (en)
