Disclosure of Invention
To solve the above problems, the invention provides a non-contact heart rate variability feature extraction method for real application scenarios. The method is based on the IPPG (imaging photoplethysmography) principle and integrates image processing, signal processing and feature extraction techniques, and the extracted features are consistent with the results of contact-based extraction.
To improve the extraction speed of heart rate variability features, the invention acquires the face region by combining face detection with face tracking. The face position is first obtained by face detection and then tracked. During tracking, face detection is repeated at a fixed time interval to relocate the face and prevent tracking drift, and the face offset caused by shaking is continuously corrected to prevent incomplete extraction of the face region, so that speed is improved without reducing the accuracy of face region acquisition. In addition, image acquisition from the camera and image processing run simultaneously in two threads that communicate through a shared queue, which further improves the extraction speed.
To reduce the influence of different illumination conditions on the extraction result, the invention combines channel separation, adaptive skin detection and EEMD filtering. First, the face region image is converted to the LUV color space, the L channel reflecting brightness change is separated out, and the U channel image reflecting chromaticity change is obtained. The face region image is also converted to the YCrCb color space, and skin detection is performed with a threshold determined adaptively from the luminance component Y and the Cb component under different illumination conditions, yielding the skin part of the face region image. The skin detection result is then combined with the U channel image to obtain the original heart rate signal, which preliminarily reduces the influence of illumination intensity changes. Finally, the EEMD method is applied to denoise the original heart rate signal, further reducing the influence of illumination changes.
To reduce the influence of the subject's shaking on the extraction result, the invention provides a peak point extraction strategy that corrects peak points affected by shaking. First, the signal peak points are calculated. Each extracted peak point is then judged to be abnormal or not, according to whether the slope between adjacent peak points and the distance between them fall within threshold ranges. Each abnormal peak point is corrected by averaging over the positions of all normal peak points that precede it, thereby obtaining relatively accurate signal peak points and further reducing the influence of shaking on feature extraction.
In order to achieve the above purpose, the acquisition process provided by the invention is as follows:
Step one, collecting an image containing a human face:
The subject faces the camera, and face images are collected at the camera's fixed frame rate of 30 FPS. Face images must be collected continuously for at least 30 s for relatively accurate heart rate variability features to be extracted.
The acquired face images are stored in a sub-thread; that is, image acquisition and the reading and processing of images in the program run simultaneously in two threads, and the two threads share one image queue.
Step two, acquiring a face region of the image:
The face region is extracted by combining face detection, using the open-source libfacedetection library, with face tracking, using the KLT (Kanade-Lucas-Tomasi) tracking method. The face position is obtained by face detection and then tracked; during tracking, face detection is repeated at a fixed time interval of 10 s to relocate the face, after which tracking continues.
During face tracking, a minimum bounding rectangle is determined from the four vertices of the tracked face region, and a face affected by shaking is corrected using the center point and deflection angle of this rectangle, which prevents incomplete extraction of the face region.
Step three, Euler amplification:
The Euler amplification method is used to enhance skin color changes in the face region, strengthening the information in the face image that is related to physiological signals.
Step four, channel separation:
The Euler-amplified face image obtained in step three is converted from the RGB color space to the LUV color space, so that the L channel reflecting brightness change is separated out and the U channel reflecting chromaticity change is used to extract the original heart rate signal.
Step five, self-adaptive threshold skin detection:
An adaptive threshold is provided for skin detection. The threshold is determined adaptively from the luminance component Y and the Cb component under different illumination conditions. Pixels within the threshold range are treated as skin pixels and set to 255 (white), and all remaining pixels are set to 0 (black). An AND operation is then performed between the skin detection image and the U channel, which removes the non-skin area and yields the U-channel face image with the non-skin area removed.
Step six, source signal extraction:
In one round of HRV feature extraction, the pixel mean of the U-channel face image with the non-skin area removed (obtained in step five) is calculated to give a series of pixel mean values, and this series is then standardized to form the original heart rate signal.
Step seven, EEMD denoising:
The original heart rate signal is denoised using the EEMD (Ensemble Empirical Mode Decomposition) method, further reducing the influence of different illumination conditions.
Step eight, five-point sliding:
A five-point sliding smoothing filter is applied to remove the high-frequency noise still contained in the signal.
Step nine, peak point extraction and correction:
The signal peak points are calculated, and peak points affected by shaking are identified and corrected by setting threshold ranges for the distance and the slope between two adjacent peak points, thereby obtaining relatively accurate signal peak points.
Step ten, HRV feature extraction:
RR intervals and R-point times are calculated from the corrected peak points obtained in step nine in order to extract HRV features; 27 HRV features are extracted, comprising time domain, frequency domain and nonlinear features.
The time domain features include: max, min, mean, SDNN, RMSSD, hr-mean, hr-sd, NN40, pNN40 and HRVti;
the frequency domain features are extracted by spectral analysis using the Lomb-Scargle periodogram and include: aVLF, aLF, aHF, aTotal, pVLF, pLF, pHF, nLF, nHF, LFHF, peakVLF, peakLF and peakHF;
the nonlinear features include: SD1, SD2 and SD1/SD2.
The invention has the following advantages and positive effects:
The invention provides a non-contact heart rate variability feature extraction method for real application scenarios. In particular, the extraction method embodies a strategy for increasing the speed of non-contact heart rate variability feature extraction, and strategies for overcoming the influence of shaking and of different illumination conditions so as to improve extraction accuracy. In a real application scenario, practical functions such as automatic switching between subjects, automatic detection and automatic calculation of HRV features can be built on this method, giving it strong practical significance.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention provides a non-contact heart rate variability feature extraction method for real application scenarios, which can quickly and accurately extract heart rate variability features in such scenarios. The extracted features can be used for emotion recognition to reflect the psychological stress level of the subject, and the method can be applied in real scenarios such as assisting customs officers in screening suspicious travelers at customs clearance.
Referring to fig. 2, the invention increases the feature extraction speed by combining face detection with face tracking to obtain the face region. The face position is obtained by face detection and then tracked. During tracking, the face is relocated by face detection at a fixed time interval to prevent tracking drift, and the face offset caused by shaking is continuously corrected to prevent incomplete extraction of the face region, so that speed is improved without reducing the accuracy of face region acquisition.
Referring to fig. 3, the invention mitigates the influence of different illumination conditions by combining channel separation, adaptive skin detection and EEMD filtering. First, the face region image is converted to the LUV color space, the L channel reflecting brightness change is separated out, and the U channel image reflecting chromaticity change is obtained. The face region image is also converted to the YCrCb color space, and skin detection is performed with a threshold determined adaptively from the luminance component Y and the Cb component under different illumination conditions, yielding the skin part of the face region image. The skin detection result is combined with the U channel image to obtain the original heart rate signal, which preliminarily reduces the influence of different illumination conditions. The EEMD method is then applied to denoise this source signal, further reducing the influence of different illumination conditions.
Referring to fig. 4, the invention provides a peak point extraction strategy that corrects peak points affected by shaking. First, the signal peak points are calculated. Each extracted peak point is then judged to be abnormal or not, according to whether the slope between adjacent peak points and the distance between them fall within threshold ranges. Each abnormal peak point is corrected by averaging over the positions of all normal peak points that precede it, thereby obtaining relatively accurate signal peak points and further reducing the influence of shaking on feature extraction.
Referring to fig. 1, the non-contact heart rate variability feature extraction method for real application scenarios, which combines the above strategies, consists of 4 major parts: data acquisition, image processing, signal processing and HRV feature extraction. These can be subdivided into 10 steps: collecting an image containing a human face, acquiring the face region of the image, Euler amplification, channel separation, adaptive threshold skin detection, source signal extraction, EEMD denoising, five-point sliding, peak point extraction and correction, and HRV feature extraction.
The method comprises the following specific steps:
Step one, collecting an image containing a human face:
The subject faces the camera, and face images are collected at the camera's fixed frame rate. Most common USB cameras on the market run at 30 FPS, so the invention assumes that face images are collected at 30 FPS. Face images must be collected continuously for at least 30 s for relatively accurate heart rate variability features to be extracted.
Because the invention targets real application scenarios, the usual workflow of first recording a face video and then extracting heart rate variability features from it is not adequate. When the method is applied, three threads therefore run simultaneously: one collects face images, one processes face images, and one performs signal processing and feature extraction.
Because one thread collects face images while another processes them, a shared space is needed: the acquisition thread stores the collected face images into the shared space in order, and the processing thread reads them from the shared space in the same order.
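The shared space between the acquisition thread and the processing thread can be sketched with a first-in, first-out queue. The following is a minimal illustration only: the names capture, process and run_demo, the stop marker and the queue size are hypothetical, integers stand in for camera frames, and the third thread for signal processing is omitted.

```python
import queue
import threading

_STOP = object()  # hypothetical end-of-stream marker, not part of the described method


def capture(frames, shared):
    """Acquisition thread: store frames into the shared queue in acquisition order."""
    for frame in frames:
        shared.put(frame)  # blocks while the bounded queue is full
    shared.put(_STOP)


def process(shared, results):
    """Processing thread: read frames from the shared queue in the same order."""
    while True:
        frame = shared.get()
        if frame is _STOP:
            break
        results.append(frame)  # placeholder for face-region extraction


def run_demo(n_frames):
    """Run both threads on n_frames dummy frames and return what was processed."""
    shared = queue.Queue(maxsize=64)  # assumed capacity
    results = []
    producer = threading.Thread(target=capture, args=(range(n_frames), shared))
    consumer = threading.Thread(target=process, args=(shared, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

Because queue.Queue is first-in, first-out and thread-safe, the processing thread sees the frames in exactly the order in which they were captured.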
Step two, acquiring a face region of the image:
The extraction speed of the traditional non-contact heart rate variability feature extraction process is relatively slow. Traditional non-contact feature extraction performs face detection on every frame; although this extracts the face region stably, it is slow and inefficient. Face region extraction is crucial to the overall HRV feature extraction process. Because it operates on images, it accounts for most of the program's running time compared with purely numerical computation such as signal processing and emotion classification, so optimizing this stage can greatly improve the overall running speed of the system.
To address this problem, the face region is extracted by combining face detection, using the open-source libfacedetection library, with face tracking, using the KLT (Kanade-Lucas-Tomasi) tracking method. This approach is fast, extracts the face region stably, and does not interfere with other functions (for example, key functions such as automatic switching between subjects must be considered in a real application scenario). In this approach, face detection can be regarded as providing the tracking template for face tracking. To prevent the tracked position from drifting away from the face, face detection is repeated at a fixed time interval during tracking to relocate the face, after which tracking continues.
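The alternation between detection and tracking can be sketched as a simple schedule. This is an illustration under stated assumptions: detect and track are hypothetical callables standing in for the libfacedetection detector and the KLT tracker (their real interfaces are not shown here), and the 30 FPS and 10 s defaults are taken from the values given elsewhere in the text.

```python
def extract_face_regions(frames, detect, track, fps=30, redetect_seconds=10):
    """Yield one face box per frame, detecting on a fixed schedule and tracking in between.

    detect(frame) -> box and track(frame, previous_box) -> box are hypothetical
    callables supplied by the caller.
    """
    redetect_every = fps * redetect_seconds  # frames between two detections
    box = None
    for index, frame in enumerate(frames):
        if box is None or index % redetect_every == 0:
            # First frame, lost face, or scheduled relocation: run the detector.
            box = detect(frame)
        else:
            # Otherwise follow the previous box with the cheaper tracker.
            box = track(frame, box)
        yield box
```

With these defaults the detector runs once every 300 frames, and every other frame is handled by the tracker.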
In a real application scenario, shaking must be taken into account. With face detection plus face tracking alone, the face region is extracted incompletely when the subject shakes, so face correction is required. Conventional face correction methods can correct the face position effectively, but they are slow, and placing them in the non-contact heart rate variability feature extraction flow would severely reduce the extraction speed. The invention therefore determines a minimum bounding rectangle from the four vertices of the tracked face region and performs the correction using the center point and deflection angle of this rectangle. Referring to fig. 5, the deflection angle of the rectangle is determined in a coordinate system. When the rectangle is tilted to the right, the angle is measured in the first quadrant, with the positive x-axis as the 0-degree reference. When the rectangle is tilted to the left, the angle is measured in the second quadrant, with the positive y-axis as the 0-degree reference. When the angle is between 0 and 45 degrees, the image is rotated clockwise by that angle; when the angle is between 45 and 90 degrees, the image is rotated counterclockwise by 90 degrees minus that angle. An affine transformation matrix is constructed from the center point and deflection angle of the rectangle, the whole image is transformed with this matrix, and the image is then cropped again according to the rectangle's center point, length and width to obtain the corrected face.
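The angle rule and the affine matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, an angle of exactly 45 degrees is treated as a clockwise rotation, negative values denote clockwise rotation, and the matrix is an ordinary 2x3 rotation about a point in image coordinates (x to the right, y downward).

```python
import math


def correction_rotation(angle_deg):
    """Map the rectangle's deflection angle (0..90) to a signed rotation in degrees.

    Negative means clockwise, positive means counterclockwise (assumed convention).
    """
    if not 0 <= angle_deg <= 90:
        raise ValueError("deflection angle must lie between 0 and 90 degrees")
    if angle_deg <= 45:
        return -angle_deg       # rotate clockwise by the angle itself
    return 90 - angle_deg       # rotate counterclockwise by 90 minus the angle


def rotation_matrix(center, rotation_deg):
    """Build a 2x3 affine matrix that rotates about `center` and keeps it fixed."""
    cx, cy = center
    a = math.cos(math.radians(rotation_deg))
    b = math.sin(math.radians(rotation_deg))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
```

Since the rectangle's center is the fixed point of the transform, cropping around the same center after the rotation returns the upright face region.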
Step three, Euler amplification:
The invention uses the Euler amplification method to enhance skin color changes in the face region, strengthening the information in the face image that is related to physiological signals.
The number of spatial decomposition layers in the Euler amplification method is 6, the frequency band of time domain filtering is 1-2Hz, and the image amplification factor is 200.
Step four, channel separation:
To reduce the influence of different illumination conditions on HRV feature extraction, the Euler-amplified face image obtained in step three is converted from the RGB color space to the LUV color space, so that the L channel reflecting brightness change is separated out and the U channel reflecting chromaticity change is used to extract the original heart rate signal.
Step five, self-adaptive threshold skin detection:
The face region image obtained through face detection and face tracking still contains non-skin parts of the face, which affect the accuracy of HRV feature extraction, so the non-skin regions of the face must be screened out. Because illumination conditions vary in a real application scenario, skin detection with a single fixed threshold performs poorly, and the threshold is therefore configured dynamically. An AND operation is then performed between the skin detection image and the U channel, which removes the non-skin area and yields the U-channel face image with the non-skin area removed.
The dynamic configuration rule is defined in the YCrCb color space as follows:
θ3 = 6; θ4 = -8
if (Y ≤ 128) θ1 = 6; θ2 = 12;
A pixel is a skin pixel if its Cr value satisfies the following conditions:
Cr ≥ -2(Cb + 24); Cr ≥ -(Cb + 17);
Cr ≥ -4(Cb + 32); Cr ≥ 2.5(Cb + θ1);
Cr ≥ θ3; Cr ≥ 0.5(θ4 - Cb);
where Y is the luminance component, Cb is the blue chrominance component, Cr is the red chrominance component, and θ1 to θ4 are intermediate variables.
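The listed inequalities can be evaluated per pixel as sketched below. This covers only what is written above: the function and constant names are hypothetical, the θ values are the ones quoted for Y ≤ 128, θ2 is not used because none of the six listed inequalities refers to it, and whether Cb and Cr are offset values is an open assumption that the code leaves to the caller.

```python
def satisfies_skin_conditions(cb, cr, theta1, theta3, theta4):
    """Return True if (cb, cr) meets all six inequalities listed in the text."""
    return (
        cr >= -2 * (cb + 24)
        and cr >= -(cb + 17)
        and cr >= -4 * (cb + 32)
        and cr >= 2.5 * (cb + theta1)
        and cr >= theta3
        and cr >= 0.5 * (theta4 - cb)
    )


# Values quoted in the text (theta1 is given for Y <= 128 only).
LISTED_THETAS = {"theta1": 6, "theta3": 6, "theta4": -8}


def skin_value(cb, cr):
    """Binary mask value: 255 (white) for a skin pixel, 0 (black) otherwise."""
    return 255 if satisfies_skin_conditions(cb, cr, **LISTED_THETAS) else 0
```

For example, skin_value(-10, 20) passes all six conditions, whereas skin_value(-10, 0) fails the condition Cr ≥ θ3.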
Step six, source signal extraction:
In the invention, for one round of HRV feature extraction, the pixel mean of the U-channel face image with the non-skin area removed (obtained in step five) is calculated to give a series of pixel mean values, and this series is then standardized to form the original heart rate signal.
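A minimal sketch of this step, with two assumptions that the text does not settle: the mean is taken over skin pixels only (mask value 255) rather than over the whole masked image, and "standardization" is read as a z-score (zero mean, unit variance). Images are represented as nested lists, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def masked_mean(u_channel, skin_mask):
    """Mean of the U-channel values at pixels whose mask value is 255 (skin)."""
    values = [u
              for u_row, m_row in zip(u_channel, skin_mask)
              for u, m in zip(u_row, m_row)
              if m == 255]
    if not values:
        raise ValueError("no skin pixels in this frame")
    return mean(values)


def standardize(samples):
    """Z-score the per-frame means to form the raw heart rate signal (assumed)."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        return [0.0 for _ in samples]
    return [(s - mu) / sigma for s in samples]
```

Calling masked_mean once per frame and passing the resulting list to standardize gives one signal sample per frame, that is, 30 samples per second at 30 FPS.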
Step seven, EEMD denoising:
In a real application scenario, different illumination conditions strongly affect HRV feature extraction; experimental results show that the lower the illumination, the larger the error of the extracted HRV features. To reduce the errors that different illumination conditions introduce into HRV extraction, the invention uses the EEMD (Ensemble Empirical Mode Decomposition) method for noise reduction. The EEMD method adaptively yields IMF (intrinsic mode function) components of different resolutions at each scale. The instantaneous frequency of each component is calculated by the Hilbert transform and used to distinguish noise-dominated IMF components from signal-dominated ones; the noise-dominated components are discarded and the signal-dominated components are retained to reconstruct the signal, which mitigates the influence of different illumination conditions on HRV feature extraction.
Step eight, five-point sliding:
Five-point moving average filtering is a form of low-pass filtering and can effectively remove the high-frequency noise still contained in the signal. It is applied to the EEMD-filtered signal from step seven to make the signal smoother.
The formula for calculating the i-th new data point by five-point moving average is as follows:
where N is the signal length, f(j) is a signal value within the five-point sliding window, and y(i) is the new signal value determined by the five-point moving average.
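A minimal sketch of five-point moving average filtering, assuming a centered window, y(i) = [f(i-2) + f(i-1) + f(i) + f(i+1) + f(i+2)]/5, and assuming that samples near either end are averaged over the neighbors that exist. Neither the window alignment nor the edge handling is specified in the text, and the function name is hypothetical.

```python
def five_point_moving_average(signal):
    """Smooth a sequence with a centered five-point window (edges use fewer points)."""
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - 2)        # clip the window at the start of the signal
        hi = min(n, i + 3)        # clip the window at the end of the signal
        window = signal[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

For the input [1, 2, 3, 4, 5, 6, 7] this returns [2.0, 2.5, 3.0, 4.0, 5.0, 5.5, 6.0]; interior samples of a linear ramp are unchanged, which is the expected behavior of a symmetric averaging window.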
Step nine, peak point extraction and correction:
When the subject's head shakes, the heart rate variability curve jitters, which makes extraction of its peak points inaccurate. The invention provides a peak point extraction strategy that corrects peak points affected by shaking. First, the signal peak points are calculated from the smoothed signal obtained in step eight. Each peak point is then judged to be abnormal or not, according to whether the slope between adjacent peak points and the distance between them fall within threshold ranges. Each abnormal peak point is corrected by averaging over the positions of all normal peak points that precede it, and all abnormal peak points are corrected in this way, yielding relatively accurate signal peak points.
The formula for judging whether the slope between adjacent peak points is in the threshold range in the invention is as follows:
wherein h_i denotes the height of the i-th (i = 1, 2, …, n) peak point and t_i denotes the time corresponding to the i-th (i = 1, 2, …, n) peak point; if the formula is not satisfied, the i-th peak point is an abnormal peak point.
The formula for determining whether the distance between adjacent peak points is within the threshold range in the present invention is as follows:
60/(HR+14) ≤ t_i - t_{i-1} < 60/(HR-14), (i = 2, 3, …, n)
wherein t_i denotes the time corresponding to the i-th (i = 1, 2, …, n) peak point and HR denotes the mean heart rate; if the formula is not satisfied, the i-th peak point is an abnormal peak point. The mean heart rate HR is calculated as follows:
wherein t_all denotes the total duration of the detection and count denotes the total number of peak points in the detection.
The calculation formula for correcting the detected abnormal peak point is as follows:
t_{F_new} = t_{F-1} + [(t_{F-1} - t_{F-2}) + … + (t_2 - t_1)]/(F-2)
wherein t_{F_new} is the corrected result and (t_{F-1}, …, t_1) are the normal or corrected peak points preceding the abnormal peak point to be corrected.
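The distance check and the correction formula can be sketched together. Several points are assumptions of this sketch rather than statements of the method: the mean heart rate is taken as HR = 60 × count / t_all with t_all in seconds, a peak interval is accepted when it lies between 60/(HR+14) and 60/(HR-14) seconds, each interval is measured from the previous peak after any correction, and the slope check is omitted because its threshold is not given. Function names are hypothetical.

```python
def mean_heart_rate(peak_times, total_seconds):
    """Assumed form of the mean heart rate in beats per minute."""
    return 60.0 * len(peak_times) / total_seconds


def correct_peak_times(peak_times, total_seconds, tolerance_bpm=14):
    """Replace peaks whose distance to the previous peak is outside the allowed range."""
    hr = mean_heart_rate(peak_times, total_seconds)
    if hr <= tolerance_bpm:
        raise ValueError("mean heart rate too low for the interval bounds")
    shortest = 60.0 / (hr + tolerance_bpm)   # lower bound on the peak interval
    longest = 60.0 / (hr - tolerance_bpm)    # upper bound on the peak interval
    corrected = []
    for t in peak_times:
        # The formula needs at least two earlier peaks (F >= 3) to form an average.
        if len(corrected) >= 2:
            interval = t - corrected[-1]
            if not (shortest <= interval < longest):
                # Mean of the F-2 earlier intervals; the sum telescopes to
                # t_{F-1} - t_1, so only the first and last peaks are needed.
                mean_interval = (corrected[-1] - corrected[0]) / (len(corrected) - 1)
                t = corrected[-1] + mean_interval
        corrected.append(t)
    return corrected
```

For peaks at 0.0, 0.8, 1.6, 2.9 and 3.2 s over a 4 s window, the mean heart rate is 75 beats per minute, the interval of 1.3 s before the fourth peak falls outside the allowed range, and that peak is moved to about 2.4 s.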
Step ten, HRV feature extraction:
RR intervals and R-point times are calculated from the corrected peak points obtained in step nine in order to extract HRV features; 27 HRV features are extracted, comprising time domain, frequency domain and nonlinear features.
The time domain features include: max, min, mean, SDNN, RMSSD, hr-mean, hr-sd, NN40, pNN40 and HRVti;
the frequency domain features are extracted by spectral analysis using the Lomb-Scargle periodogram and include: aVLF, aLF, aHF, aTotal, pVLF, pLF, pHF, nLF, nHF, LFHF, peakVLF, peakLF and peakHF;
the nonlinear features include: SD1, SD2 and SD1/SD2.
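Some of the time domain and nonlinear features can be computed from the RR intervals as sketched below. The definitions are the commonly used ones and are assumptions here, since the text only names the features: SDNN is the sample standard deviation of the RR intervals, RMSSD is the root mean square of successive differences, NN40 counts successive differences larger than 40 ms, pNN40 is that count as a percentage, and SD1 and SD2 are the Poincaré plot descriptors. HRVti and the frequency domain features are not covered, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def rr_intervals_ms(peak_times_s):
    """RR intervals in milliseconds from corrected peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(peak_times_s, peak_times_s[1:])]


def hrv_features(rr_ms):
    """Compute a subset of time domain and nonlinear HRV features (assumed definitions)."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    heart_rates = [60000.0 / rr for rr in rr_ms]
    sdnn = stdev(rr_ms)
    sdsd = stdev(diffs)
    nn40 = sum(1 for d in diffs if abs(d) > 40)
    sd1 = math.sqrt(0.5) * sdsd
    # Guard against a tiny negative value caused by rounding.
    sd2 = math.sqrt(max(0.0, 2.0 * sdnn ** 2 - 0.5 * sdsd ** 2))
    return {
        "max": max(rr_ms),
        "min": min(rr_ms),
        "mean": mean(rr_ms),
        "SDNN": sdnn,
        "RMSSD": math.sqrt(mean([d * d for d in diffs])),
        "hr-mean": mean(heart_rates),
        "hr-sd": stdev(heart_rates),
        "NN40": nn40,
        "pNN40": 100.0 * nn40 / len(diffs),
        "SD1": sd1,
        "SD2": sd2,
        "SD1/SD2": sd1 / sd2 if sd2 > 0 else math.nan,
    }
```

For the RR intervals 800, 810, 790, 850, 800 and 820 ms, the successive differences are 10, -20, 60, -50 and 20 ms, so NN40 is 2 and pNN40 is 40 percent.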
The above is a specific embodiment of the present invention.