
CN114998229B - A non-contact sleep monitoring method based on deep learning and multi-parameter fusion - Google Patents

A non-contact sleep monitoring method based on deep learning and multi-parameter fusion

Info

Publication number
CN114998229B
Authority
CN
China
Prior art keywords
sleep
video image
eye
heart rate
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210561402.3A
Other languages
Chinese (zh)
Other versions
CN114998229A (en)
Inventor
张静
晏博赟
贺涛
杜晓辉
王祥舟
孙海鑫
刘娟秀
刘霖
刘永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210561402.3A
Publication of CN114998229A
Application granted
Publication of CN114998229B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/02Detecting, measuring or recording for evaluating the cardiovascular system, e.g. pulse, heart rate, blood pressure or blood flow
    • A61B5/0205Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/48Other medical applications
    • A61B5/4806Sleep evaluation
    • A61B5/4809Sleep detection, i.e. determining whether a subject is asleep or not
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/48Other medical applications
    • A61B5/4806Sleep evaluation
    • A61B5/4812Detecting sleep stages or cycles
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/48Other medical applications
    • A61B5/4806Sleep evaluation
    • A61B5/4815Sleep quality
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • A61B5/7253Details of waveform analysis characterised by using transforms
    • A61B5/7257Details of waveform analysis characterised by using transforms using Fourier transforms
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • A61B5/7264Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/02Detecting, measuring or recording for evaluating the cardiovascular system, e.g. pulse, heart rate, blood pressure or blood flow
    • A61B5/024Measuring pulse rate or heart rate
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/16Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state
    • A61B5/163Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state by tracking eye movement, gaze, or pupil change
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Pathology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Physiology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Psychiatry (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Cardiology (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Anesthesiology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Radiology & Medical Imaging (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Pulmonology (AREA)
  • Fuzzy Systems (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention relates to a non-contact sleep monitoring method based on deep learning and multi-parameter fusion, and belongs to the fields of image processing and deep learning. The method first segments the acquired sleep video images, then builds a deep convolutional neural network that extracts and amplifies physiological signals. By setting different amplification factors, the network amplifies the heart-rate signal of the forehead region and the eye-movement frequency of the eye region, yielding a forehead-region video with the amplified heart-rate signal and an eye-region video with the amplified eye movement. A fast Fourier transform then extracts the corresponding spectra, and the frequency at each spectral peak is taken as the monitored heart rate and eye-movement frequency. For the three-view body video, a deep-learning sleeping-posture monitoring network is built; the posture features it extracts are fed to a fully connected layer for six-way classification into supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled postures, and turn-overs are counted from switches between these postures. Finally, the monitored physiological signals are combined to comprehensively evaluate sleep quality. The invention offers high comfort, multi-parameter fusion, and a high degree of automation, and monitors physiological parameters such as heart rate, respiratory rate, eye-movement frequency, sleeping posture, and number of turn-overs in a non-contact manner.

Description

A non-contact sleep monitoring method based on deep learning and multi-parameter fusion

Technical Field

The present invention belongs to the fields of image processing and deep learning, and specifically relates to a non-contact sleep monitoring system that combines video image processing with a deep convolutional network to achieve multi-parameter fusion.

Technical Background

During sleep, a series of functions of the human body change, including those of the brain, muscles, eyes, heart, and breathing; monitoring these changes helps in judging sleep quality. Sleep disorders usually refer to abnormalities in the quantity or quality of sleep, or to clinical symptoms that occur during sleep, such as reduced or excessive sleep, sleep-disordered breathing, and rapid eye movement sleep behavior disorder. It has been medically established that long-term sleep disorders can induce a variety of diseases, so timely diagnosis and treatment of sleep disorders is of great significance to human health.

Polysomnography is regarded as the gold standard for diagnosing and treating sleep disorders. It monitors physiological signals across channels such as the electroencephalogram, electrocardiogram, electrooculogram, oronasal airflow, and blood oxygen saturation, and a diagnosis is made from the collected signals. Polysomnography requires attaching multiple sensors to the subject, which causes great discomfort. Moreover, even though many parameters are monitored, the physician still draws on the subject's medical history and subjective impressions during the monitoring period as part of the evaluation, so the interpretation carries strong subjectivity. With the development of deep learning, many miniaturized sleep monitoring devices have emerged, such as smart pillows, mattresses, and wristbands. Smart pillows and mattresses use pressure sensors to monitor pressure changes during sleep and count how often the subject turns over; a smart wristband can monitor heart rate during sleep when worn. Although these devices reduce the discomfort of wearing sensors, each monitors only a single physiological parameter, so the resulting sleep-quality assessment is inaccurate and incomplete.

To address these problems in sleep monitoring, we designed a non-contact sleep monitoring system based on deep learning and multi-parameter fusion. For heart rate and eye-movement frequency, a deep convolutional neural network extracts the minute physiological signals from the sleep video and amplifies them while suppressing artifacts, and the amplified signals are then analyzed in the frequency domain. For sleeping posture and turn-over count, a convolutional neural network automatically extracts posture features from the frames captured by the three-view cameras and classifies them into six postures; turn-overs during sleep are counted from switches between postures.

Summary of the Invention

Aiming at the discomfort caused by the contact-based monitoring of polysomnography, the subjectivity of manual interpretation, and the fact that other sleep monitoring devices monitor only a single physiological parameter, the present invention designs a non-contact sleep monitoring system based on deep learning and multi-parameter fusion to monitor multiple physiological parameters such as heart rate, eye-movement frequency, sleeping posture, and number of turn-overs without contact.

The technical solution of the present invention is a non-contact sleep monitoring method based on deep learning and multi-parameter fusion, comprising the following steps:

Step 1: Build the sleep monitoring platform. Place three cameras above, to the left of, and to the right of the tester's body to capture video of the tester during sleep;

Step 2: Segment the video captured by the camera above the tester's body in step 1 to obtain video of the tester's forehead region and eye region;

Step 3: Build a deep convolutional neural network for physiological-signal extraction and amplification, and use it to extract and amplify the minute physiological signals in the video;

Step 4: Input the forehead-region and eye-region videos from step 2 into the deep convolutional neural network built in step 3, extract and amplify the heart-rate signal of the forehead region and the eye-movement signal of the eye region, and output the forehead-region video with the amplified heart-rate signal and the eye-region video with the amplified eye movement;

Step 5: For each frame of the amplified forehead-region video from step 4, separate the R, G, and B channels, average the pixel values within each channel, and stack the averages over time to obtain the pulse-wave signal;

Step 6: Apply a fast Fourier transform to the pulse-wave signal from step 5 to obtain the spectrum of the pulse-wave time series;

Step 7: Analyze the spectrum from step 6 and take the frequency at the spectral peak as the heart-rate monitoring result;

Step 8: Stack the amplified eye-region video frames from step 4 in time order and apply a fast Fourier transform to obtain the eye-region video spectrum;

Step 9: Take the frequency at the peak of the eye-region spectrum from step 8 as the eye-movement-frequency monitoring result;
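As an illustration of steps 5 through 9, the following minimal Python sketch (assuming OpenCV, NumPy, and a 30 fps recording; the function name and the 0.8-2.5 Hz search band are illustrative assumptions, not figures from the patent) averages each frame's color channels, stacks the means over time, and reads the heart rate off the spectral peak. The same peak picking applies to the stacked eye-region frames of steps 8 and 9.

```python
import cv2
import numpy as np

def estimate_heart_rate(video_path, fps=30.0):
    """Steps 5-7 in miniature: per-frame channel means -> time series ->
    FFT -> frequency at the spectral peak, reported in beats per minute."""
    cap = cv2.VideoCapture(video_path)
    means = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Mean pixel value of each of the three color channels (step 5)
        means.append(frame.reshape(-1, 3).mean(axis=0))
    cap.release()

    # The patent stacks the R, G, and B means; averaging them into a single
    # pulse-wave trace is a simplification used here.
    signal = np.asarray(means).mean(axis=1)
    signal = signal - signal.mean()              # remove the DC component

    spectrum = np.abs(np.fft.rfft(signal))       # step 6
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)

    band = (freqs >= 0.8) & (freqs <= 2.5)       # plausible pulse band
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0                           # step 7: Hz -> bpm
```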

Step 10: Build the deep-learning sleeping-posture monitoring network. If the network has not yet been trained, go to step 11; if it has been trained, go to step 13;

Step 11: In advance, collect more than 1,000 images of the sleeping tester from each of the three cameras placed above, to the left of, and to the right of the body, and manually annotate each image with the tester's sleeping posture, one of six classes: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 12: Feed the annotated image data from step 11 to the neural network for training, splitting it into training and validation sets at a ratio of 8:2, and train until the validation accuracy exceeds 95%, at which point training is complete;
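A minimal PyTorch training loop consistent with step 12 could look as follows; the batch size, optimizer, and learning rate are assumptions, and `dataset` stands for any `torch.utils.data.Dataset` of annotated posture images.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split

def train_until_95(model, dataset, max_epochs=100, device="cpu"):
    """Step 12 in miniature: 8:2 train/validation split, training until
    validation accuracy exceeds 95%."""
    n_train = int(0.8 * len(dataset))
    train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=32)

    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    for epoch in range(max_epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        model.eval()
        correct = 0
        with torch.no_grad():
            for images, labels in val_loader:
                preds = model(images.to(device)).argmax(dim=1)
                correct += (preds == labels.to(device)).sum().item()
        accuracy = correct / len(val_set)
        if accuracy >= 0.95:          # stopping criterion from step 12
            break
    return model
```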

Step 13: Input the three-view body video from the three cameras of step 1 into the trained network and produce a six-way classification corresponding to the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 14: When the tester switches between any two of the six postures of step 13, count one turn-over, except that switches between left-side straight and left-side curled, and between right-side straight and right-side curled, are not counted;
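The turn-over rule of step 14 reduces to a few lines of Python; the label strings below are hypothetical stand-ins for the step 13 classifier outputs.

```python
# Same-side straight <-> curled switches that do not count as turn-overs
SAME_SIDE = {
    frozenset({"left_straight", "left_curled"}),
    frozenset({"right_straight", "right_curled"}),
}

def count_turnovers(postures):
    """postures: the per-second posture labels produced in step 13."""
    count = 0
    for prev, curr in zip(postures, postures[1:]):
        if prev != curr and frozenset({prev, curr}) not in SAME_SIDE:
            count += 1
    return count

# Example: supine -> left_straight -> left_curled -> prone counts 2 turn-overs
```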

Step 15: Combine the heart rate from step 7, the eye-movement frequency from step 9, the sleeping posture from step 13, and the turn-over count from step 14 to comprehensively evaluate the tester's sleep quality.

Step 2 is specifically as follows:

Step 2.1: For the video captured by the camera above the tester's body, call the dlib library in Python to segment the region containing the face and extract the tester's face video;

Step 2.2: Run facial-landmark detection on the face video from step 2.1 using the dlib library in Python to obtain the positions of the 68 facial landmarks;

Step 2.3: Using the landmark positions from step 2.2, segment the video from step 2.1 again: from the landmarks at the centers of the left and right eyebrows and the upper boundary of the face detected by dlib, segment a rectangular forehead-region video;

Step 2.4: From the eye landmarks, find the points representing the left eye corner, the right eye corner, the top of the eye socket, and the bottom of the eye socket, and use these four points to segment a rectangular eye-region video;
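Steps 2.1-2.4 map directly onto dlib's standard 68-landmark pipeline. The sketch below is one possible implementation: the pretrained model file is dlib's published predictor, while the exact ROI arithmetic (which landmark indices bound each rectangle) is an assumption.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()                     # step 2.1
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_rois(frame):
    """Return (forehead, eyes) rectangular crops from one video frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None, None
    face = faces[0]
    shape = predictor(gray, face)                               # step 2.2
    pts = np.array([(p.x, p.y) for p in shape.parts()])

    # Step 2.3: forehead = between the brow landmarks (indices 17-26)
    # and the upper boundary of the detected face rectangle.
    brow_top = pts[17:27, 1].min()
    forehead = frame[face.top():brow_top, pts[17, 0]:pts[26, 0]]

    # Step 2.4: eyes = bounding box of the eye landmarks (indices 36-47),
    # covering the two eye corners and the top/bottom of the sockets.
    x0, y0 = pts[36:48].min(axis=0)
    x1, y1 = pts[36:48].max(axis=0)
    eyes = frame[y0:y1, x0:x1]
    return forehead, eyes
```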

Step 3 is specifically as follows:

Step 3.1: Build the encoder of the deep convolutional neural network for physiological-signal extraction and amplification: first 2 convolutional layers, each followed by a ReLU activation, then 3 residual blocks; a further convolutional layer with stride 2 then extracts the physiological signal from the sleep video, and 2 residual blocks are connected at the output;

Step 3.2: Build the modulation-amplification structure: a convolutional layer (ReLU activation) first convolves the difference of the physiological-signal features of two sleep frames; the result is multiplied by the amplification factor α, and a further convolutional layer and residual block apply a nonlinear mapping to yield the amplified signal-difference features;

Step 3.3: Build the decoder: superimpose the amplified signal-difference features onto the features of the initial sleep frame, then decode through upsampling and two convolutional layers to output the video with the physiological signal amplified.
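A PyTorch sketch of this encoder / modulation-amplifier / decoder arrangement follows. The layer sequence tracks steps 3.1-3.3; the channel widths, kernel sizes, and residual-block design are assumptions, since the patent does not fix them.

```python
import torch
from torch import nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class MagnifierNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # Step 3.1: two conv+ReLU layers, three residual blocks, a stride-2
        # conv that extracts the physiological signal, two residual blocks.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            ResBlock(ch), ResBlock(ch), ResBlock(ch),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU(),
            ResBlock(ch), ResBlock(ch))
        # Step 3.2: convolve the feature difference of two frames, scale it
        # by alpha, then apply a further conv + residual nonlinear mapping.
        self.diff_conv = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.post = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(), ResBlock(ch))
        # Step 3.3: superimpose and decode via upsampling + two conv layers.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, frame_a, frame_b, alpha):
        feat_a = self.encoder(frame_a)
        feat_b = self.encoder(frame_b)
        diff = self.post(alpha * self.diff_conv(feat_b - feat_a))
        return self.decoder(feat_a + diff)   # frame_b with signal magnified
```

In step 4 the same network is run twice, once with α = 15 on the forehead-region video and once with α = 30 on the eye-region video.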

Step 4 is specifically as follows:

Step 4.1: Input the forehead-region video from step 2 into the network built in step 3 with amplification factor α = 15 to extract and amplify the heart-rate signal, and output the forehead-region video with the amplified heart-rate signal;

Step 4.2: Input the eye-region video from step 2 into the network built in step 3 with amplification factor α = 30 to extract and amplify the eye movement, and output the eye-region video with the amplified eye movement;

Step 10 is specifically as follows:

Step 10.1: Build the network structure: 4 convolutional layers, 3 max-pooling layers, 1 fully connected layer, and 1 classifier;

Step 10.2: To keep the computation manageable, extract one key frame per second from each of the cameras above, to the left of, and to the right of the tester's body, and combine the three frames into three-channel image data that is input to the network of step 10.1;

Step 10.3: Pass the three-channel image data of step 10.2 through a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, and a final 10×10 convolutional layer to extract image features;

Step 10.4: Input the extracted features to the fully connected layer for six-way classification into the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled.
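One way to realize steps 10.1-10.4 in PyTorch is sketched below; the channel widths and the 224×224 input resolution are assumptions, chosen so that four 10×10 convolutions and three 2×2 pools leave a valid feature map.

```python
import torch
from torch import nn

class PostureNet(nn.Module):
    """Steps 10.1-10.4: four 10x10 conv layers interleaved with three 2x2
    max-pools, then one fully connected layer over six posture classes."""
    def __init__(self, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 10), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 10), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 10), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 64, 10), nn.ReLU())
        # Step 10.4: fully connected six-way classifier; LazyLinear infers
        # the flattened feature size on first use.
        self.classifier = nn.Sequential(nn.Flatten(), nn.LazyLinear(num_classes))

    def forward(self, x):
        # x: one frame per camera (above/left/right) stacked as the three
        # input channels, as in step 10.2.
        return self.classifier(self.features(x))

# logits = PostureNet()(torch.randn(1, 3, 224, 224))  # -> shape (1, 6)
```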

Step 15 is specifically as follows:

Step 15.1: The heart rate during normal sleep is 60-100 beats per minute and can drop to 50 beats per minute in deep sleep. Monitor the tester's heart rate during sleep: when it drops markedly, the tester is considered to have entered deep sleep, and when it gradually rises, to have left deep sleep. Finally, compute the proportion of sleep spent in deep sleep; the larger the proportion, the higher the sleep quality;

Step 15.2: During the rapid-eye-movement (REM) stage of sleep, the eyeballs move rapidly. Monitor the tester's eye movement: if the eye-movement frequency rises markedly and, per step 15.1, the heart rate rises as well, the tester is considered to have entered REM sleep, and the duration of the REM stage is recorded. A sudden interruption of REM sleep is often a signal of an attack of angina pectoris, asthma, or similar conditions;

Step 15.3: Lying supine is considered a relatively good sleeping posture, but it is unsuitable for people with respiratory diseases or habitual snoring, who should instead sleep on their side. Monitor the tester's posture; if the tester has a respiratory disease or snores, suggest a posture adjustment whenever they adopt a non-lateral posture;

Step 15.4: Monitor the tester's turn-over count; an excessive count suggests possible calcium deficiency or high mental stress, and poor sleep quality.
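Step 15 is a rule-based fusion of the monitored parameters; the sketch below shows one hedged reading of steps 15.1-15.4. The specific thresholds (55 bpm for deep sleep, a doubled eye-movement baseline for REM, 30 turn-overs per night) are illustrative assumptions, not figures from the patent.

```python
def evaluate_sleep(heart_rates, eye_freqs, turnovers):
    """heart_rates, eye_freqs: per-minute monitoring results from
    steps 7 and 9; turnovers: the step 14 count for the whole night."""
    minutes = len(heart_rates)

    # Step 15.1: minutes with a clearly lowered heart rate count as deep
    # sleep; a larger deep-sleep proportion means better sleep quality.
    deep_ratio = sum(1 for hr in heart_rates if hr <= 55) / minutes

    # Step 15.2: a markedly raised eye-movement frequency suggests REM
    # sleep (the patent cross-checks this against a rising heart rate).
    baseline = sum(eye_freqs) / len(eye_freqs)
    rem_minutes = sum(1 for f in eye_freqs if f > 2 * baseline)

    return {
        "deep_sleep_ratio": deep_ratio,
        "rem_minutes": rem_minutes,
        # Step 15.4: too many turn-overs hints at stress or poor sleep.
        "restless": turnovers > 30,
    }
```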

The present invention is a non-contact sleep monitoring system based on deep learning and multi-parameter fusion. The system first segments the acquired video into a forehead-region video, an eye-region video, and a three-view body video, and then builds a deep convolutional neural network for physiological-signal extraction and amplification. By setting different amplification factors, the heart-rate signal of the forehead region and the eye movement of the eye region are amplified, yielding a forehead-region video with the amplified heart-rate signal and an eye-region video with the amplified eye movement; a fast Fourier transform then extracts the corresponding spectra, and the frequency at each spectral peak is taken as the monitored heart rate and eye-movement frequency. For the three-view body video, a deep-learning sleeping-posture monitoring network is built; the posture features it extracts are fed to a fully connected layer for six-way classification into supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled postures, and turn-overs are counted from switches between these postures. Finally, the monitored physiological signals are combined to comprehensively evaluate sleep quality. The invention provides the tester with a highly comfortable, multi-parameter, highly automated sleep monitoring system that monitors physiological parameters such as heart rate, respiratory rate, eye-movement frequency, sleeping posture, and turn-over count without contact, improving the reliability of monitoring; it can play a key role in the clinical diagnosis of sleep quality and in the treatment and early intervention of patients or potential patients with sleep disorders.

Brief Description of the Drawings

Figure 1 is a diagram of the deep convolutional neural network for physiological-signal extraction and amplification

Figure 2 is the heart-rate monitoring flowchart

Figure 3 is the eye-movement monitoring flowchart

Figure 4 is the flowchart for monitoring sleeping posture and turn-over count

Figure 5 is a structural diagram of the sleeping-posture monitoring network

Detailed Description

The non-contact sleep monitoring system based on deep learning and multi-parameter fusion of the present invention is described in detail below with reference to the accompanying drawings:

Step 1: Build the sleep monitoring platform. Place three cameras above, to the left of, and to the right of the tester's body to capture video of the tester during sleep;

Step 2: Segment the video captured by the camera above the tester's body in step 1 to obtain video of the tester's forehead region and eye region;

Step 2.1: For the video captured by the camera above the tester's body, call the dlib library in Python to segment the region containing the face and extract the tester's face video;

Step 2.2: Run facial-landmark detection on the face video from step 2.1 using the dlib library in Python to obtain the positions of the 68 facial landmarks;

Step 2.3: Using the landmark positions from step 2.2, segment the video from step 2.1 again. From the landmarks at the centers of the left and right eyebrows and the upper boundary of the face detected by dlib, segment a rectangular forehead-region video;

Step 2.4: From the eye landmarks, find the points representing the left eye corner, the right eye corner, the top of the eye socket, and the bottom of the eye socket, and use these four points to segment a rectangular eye-region video.

Step 3: Build a deep convolutional neural network for physiological-signal extraction and amplification, and use it to extract and amplify the minute physiological signals in the video;

Step 3.1: Build the encoder of the network: it comprises 2 convolutional layers, each followed by a ReLU activation, and 3 residual blocks; a further convolutional layer with stride 2 then extracts the physiological signal from the sleep video, and 2 residual blocks are connected at the output;

Step 3.2: Build the modulation-amplification structure: a convolutional layer (ReLU activation) convolves the difference of the physiological-signal features of two sleep frames; the result is multiplied by the amplification factor α, and a convolutional layer and residual block then apply a nonlinear mapping to yield the amplified signal-difference features;

Step 3.3: Build the decoder: superimpose the amplified signal-difference features onto the features of the initial sleep frame, then decode through upsampling and two convolutional layers to output the video with the physiological signal amplified.

Step 4: Input the forehead-region and eye-region videos from step 2 into the deep convolutional neural network built in step 3, extract and amplify the heart-rate signal of the forehead region and the eye-movement signal of the eye region, and output the forehead-region video with the amplified heart-rate signal and the eye-region video with the amplified eye movement;

Step 4.1: Input the forehead-region video from step 2 into the network built in step 3 with amplification factor α = 15 to extract and amplify the heart-rate signal, and output the forehead-region video with the amplified heart-rate signal;

Step 4.2: Input the eye-region video from step 2 into the network built in step 3 with amplification factor α = 30 to extract and amplify the eye movement, and output the eye-region video with the amplified eye movement;

Step 5: For each frame of the amplified forehead-region video from step 4, separate the R, G, and B channels, average the pixel values within each channel, and stack the averages over time to obtain the pulse-wave signal;

Step 6: Apply a fast Fourier transform to the pulse-wave signal from step 5 to obtain the spectrum of the pulse-wave time series;

Step 7: Analyze the spectrum from step 6 and take the frequency at the spectral peak as the heart-rate monitoring result;

Step 8: Stack the amplified eye-region video frames from step 4 in time order and apply a fast Fourier transform to obtain the eye-region video spectrum;

Step 9: Take the frequency at the peak of the eye-region spectrum from step 8 as the eye-movement-frequency monitoring result;

Step 10: Build the deep-learning sleeping-posture monitoring network. If the network has not yet been trained, go to step 11; if it has been trained, go to step 13;

Step 10.1: Build the network structure: 4 convolutional layers, 3 max-pooling layers, 1 fully connected layer, and 1 classifier;

Step 10.2: To keep the computation manageable, extract one key frame per second from each of the cameras above, to the left of, and to the right of the tester's body, and combine the three frames into three-channel image data that is input to the network of step 10.1;

Step 10.3: Pass the three-channel image data of step 10.2 through a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, and a final 10×10 convolutional layer to extract image features;

Step 10.4: Input the extracted features to the fully connected layer for six-way classification into the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled.

Step 11: In advance, collect more than 1,000 images of the sleeping tester from each of the three cameras placed above, to the left of, and to the right of the body, and manually annotate each image with the tester's sleeping posture, one of six classes: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 12: Feed the annotated image data from step 11 to the neural network for training, splitting it into training and validation sets at a ratio of 8:2, and train until the validation accuracy exceeds 95%, at which point training is complete;

Step 13: Input the three-view body video from the three cameras of step 1 into the trained network and produce a six-way classification corresponding to the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 14: When the tester switches between any two of the six postures of step 13, count one turn-over, except that switches between left-side straight and left-side curled, and between right-side straight and right-side curled, are not counted;

Step 15: Combine the heart rate from step 7, the eye-movement frequency from step 9, the sleeping posture from step 13, and the turn-over count from step 14 to comprehensively evaluate the tester's sleep quality.

Step 15.1: The heart rate during normal sleep is 60-100 beats per minute and can drop to 50 beats per minute in deep sleep. The present invention monitors the tester's heart rate during sleep: when it drops markedly, the tester is considered to have entered deep sleep, and when it gradually rises, to have left deep sleep. Finally, the proportion of sleep spent in deep sleep is computed; the larger the proportion, the higher the sleep quality;

Step 15.2: During the rapid-eye-movement (REM) stage of sleep, the eyeballs move rapidly. The present invention monitors the tester's eye movement: if the eye-movement frequency rises markedly and, per step 15.1, the heart rate rises as well, the tester is considered to have entered REM sleep, and the duration of the REM stage is recorded. A sudden interruption of REM sleep is often a signal of an attack of angina pectoris, asthma, or similar conditions;

Step 15.3: Lying supine is considered a relatively good sleeping posture, but it is unsuitable for people with respiratory diseases or habitual snoring, who should instead sleep on their side. The present invention monitors the tester's posture; if the tester has a respiratory disease or snores, a posture adjustment is suggested whenever they adopt a non-lateral posture;

Step 15.4: The present invention monitors the tester's turn-over count; an excessive count suggests possible calcium deficiency or high mental stress, and poor sleep quality.

Claims (6)

1. A non-contact sleep monitoring method based on deep learning and multi-parameter fusion, comprising the following steps:

Step 1: Build the sleep monitoring platform; place three cameras above, to the left of, and to the right of the tester's body to capture video of the tester during sleep;

Step 2: Segment the video captured by the camera above the tester's body in step 1 to obtain video of the tester's forehead region and eye region;

Step 3: Build a deep convolutional neural network for physiological-signal extraction and amplification, and use it to extract and amplify the minute physiological signals in the video;

Step 4: Input the forehead-region and eye-region videos from step 2 into the deep convolutional neural network built in step 3, extract and amplify the heart-rate signal of the forehead region and the eye-movement signal of the eye region, and output the forehead-region video with the amplified heart-rate signal and the eye-region video with the amplified eye movement;

Step 5: For each frame of the amplified forehead-region video from step 4, separate the R, G, and B channels, average the pixel values within each channel, and stack the averages over time to obtain the pulse-wave signal;

Step 6: Apply a fast Fourier transform to the pulse-wave signal from step 5 to obtain the spectrum of the pulse-wave time series;

Step 7: Analyze the spectrum from step 6 and take the frequency at the spectral peak as the heart-rate monitoring result;

Step 8: Stack the amplified eye-region video frames from step 4 in time order and apply a fast Fourier transform to obtain the eye-region video spectrum;

Step 9: Take the frequency at the peak of the eye-region spectrum from step 8 as the eye-movement-frequency monitoring result;

Step 10: Build the deep-learning sleeping-posture monitoring network; if the network has not yet been trained, go to step 11; if it has been trained, go to step 13;

Step 11: In advance, collect more than 1,000 images of the sleeping tester from each of the three cameras placed above, to the left of, and to the right of the body, and manually annotate each image with the tester's sleeping posture, one of six classes: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 12: Feed the annotated image data from step 11 to the neural network for training, splitting it into training and validation sets at a ratio of 8:2, and train until the validation accuracy exceeds 95%, at which point training is complete;

Step 13: Input the three-view body video from the three cameras of step 1 into the trained network and produce a six-way classification corresponding to the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled;

Step 14: When the tester switches between any two of the six postures of step 13, count one turn-over, except that switches between left-side straight and left-side curled, and between right-side straight and right-side curled, are not counted;

Step 15: Combine the heart rate from step 7, the eye-movement frequency from step 9, the sleeping posture from step 13, and the turn-over count from step 14 to comprehensively evaluate the tester's sleep quality.

2. The non-contact sleep monitoring method based on deep learning and multi-parameter fusion of claim 1, characterized in that step 2 specifically comprises:

Step 2.1: For the video captured by the camera above the tester's body, call the dlib library in Python to segment the region containing the face and extract the tester's face video;

Step 2.2: Run facial-landmark detection on the face video from step 2.1 using the dlib library in Python to obtain the positions of the 68 facial landmarks;

Step 2.3: Using the landmark positions from step 2.2, segment the video from step 2.1 again; from the landmarks at the centers of the left and right eyebrows and the upper boundary of the face detected by dlib, segment a rectangular forehead-region video;

Step 2.4: From the eye landmarks, find the points representing the left eye corner, the right eye corner, the top of the eye socket, and the bottom of the eye socket, and use these four points to segment a rectangular eye-region video.

3. The non-contact sleep monitoring method based on deep learning and multi-parameter fusion of claim 1, characterized in that step 3 specifically comprises:

Step 3.1: Build the encoder of the network: it comprises 2 convolutional layers, each followed by a ReLU activation, and 3 residual blocks; a further convolutional layer with stride 2 then extracts the physiological signal from the sleep video, and 2 residual blocks are connected at the output;

Step 3.2: Build the modulation-amplification structure: a convolutional layer (ReLU activation) convolves the difference of the physiological-signal features of two sleep frames; the result is multiplied by the amplification factor α, and a convolutional layer and residual block then apply a nonlinear mapping to yield the amplified signal-difference features;

Step 3.3: Build the decoder: superimpose the amplified signal-difference features onto the features of the initial sleep frame, then decode through upsampling and two convolutional layers to output the video with the physiological signal amplified.

4. The non-contact sleep monitoring method based on deep learning and multi-parameter fusion of claim 1, characterized in that step 4 specifically comprises:

Step 4.1: Input the forehead-region video from step 2 into the network built in step 3 with amplification factor α = 15 to extract and amplify the heart-rate signal, and output the forehead-region video with the amplified heart-rate signal;

Step 4.2: Input the eye-region video from step 2 into the network built in step 3 with amplification factor α = 30 to extract and amplify the eye movement, and output the eye-region video with the amplified eye movement.

5. The non-contact sleep monitoring method based on deep learning and multi-parameter fusion of claim 1, characterized in that step 10 specifically comprises:

Step 10.1: Build the network structure: 4 convolutional layers, 3 max-pooling layers, 1 fully connected layer, and 1 classifier;

Step 10.2: To keep the computation manageable, extract one key frame per second from each of the cameras above, to the left of, and to the right of the tester's body, and combine the three frames into three-channel image data that is input to the network of step 10.1;

Step 10.3: Pass the three-channel image data of step 10.2 through a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, a 10×10 convolutional layer, a 2×2 max-pooling layer, and a final 10×10 convolutional layer to extract image features;

Step 10.4: Input the extracted features to the fully connected layer for six-way classification into the six postures: supine, prone, left-side straight, left-side curled, right-side straight, and right-side curled.

6. The non-contact sleep monitoring method based on deep learning and multi-parameter fusion of claim 1, characterized in that step 15 specifically comprises:

Step 15.1: The heart rate during normal sleep is 60-100 beats per minute and can drop to 50 beats per minute in deep sleep; monitor the tester's heart rate during sleep, considering the tester to have entered deep sleep when it drops markedly and to have left deep sleep when it gradually rises; finally, compute the proportion of sleep spent in deep sleep, where a larger proportion indicates higher sleep quality;

Step 15.2: During the rapid-eye-movement (REM) stage of sleep, the eyeballs move rapidly; monitor the tester's eye movement, and if the eye-movement frequency rises markedly and, per step 15.1, the heart rate rises as well, consider the tester to have entered REM sleep and record the duration of the REM stage; a sudden interruption of REM sleep is often a signal of an attack of angina pectoris, asthma, or similar conditions;

Step 15.3: Lying supine is considered a relatively good sleeping posture, but it is unsuitable for people with respiratory diseases or habitual snoring, who should instead sleep on their side; monitor the tester's posture, and if the tester has a respiratory disease or snores, suggest a posture adjustment whenever they adopt a non-lateral posture;

Step 15.4: Monitor the tester's turn-over count; an excessive count suggests possible calcium deficiency or high mental stress, and poor sleep quality.
CN202210561402.3A 2022-05-23 2022-05-23 A non-contact sleep monitoring method based on deep learning and multi-parameter fusion Active CN114998229B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210561402.3A CN114998229B (en) 2022-05-23 2022-05-23 A non-contact sleep monitoring method based on deep learning and multi-parameter fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210561402.3A CN114998229B (en) 2022-05-23 2022-05-23 A non-contact sleep monitoring method based on deep learning and multi-parameter fusion

Publications (2)

Publication Number Publication Date
CN114998229A CN114998229A (en) 2022-09-02
CN114998229B (en) 2024-04-12

Family

ID=83027622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210561402.3A Active CN114998229B (en) 2022-05-23 2022-05-23 A non-contact sleep monitoring method based on deep learning and multi-parameter fusion

Country Status (1)

Country Link
CN (1) CN114998229B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115482209B (en) * 2022-09-08 2025-07-18 平安科技(深圳)有限公司 Pulse data processing method, device, equipment and computer readable storage medium
CN116563887B (en) * 2023-04-21 2024-03-12 华北理工大学 A sleeping posture monitoring method based on lightweight convolutional neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10374863B2 (en) * 2012-12-05 2019-08-06 Origin Wireless, Inc. Apparatus, systems and methods for event recognition based on a wireless signal
US11439344B2 (en) * 2015-07-17 2022-09-13 Origin Wireless, Inc. Method, apparatus, and system for wireless sleep monitoring

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004187961A (en) * 2002-12-12 2004-07-08 Toshiba Corp Sleeping condition detector and sleeping condition management system
US9993166B1 (en) * 2013-06-21 2018-06-12 Fitbit, Inc. Monitoring device using radar and measuring motion with a non-contact device
CN105997004A (en) * 2016-06-17 2016-10-12 美的集团股份有限公司 Sleep reminding method and sleep monitoring device
CN108836269A * 2018-05-10 2018-11-20 电子科技大学 An automatic sleep staging method fusing heart rate, respiration, and body movement
CN109431681A * 2018-09-25 2019-03-08 吉林大学 An intelligent eye mask for detecting sleep quality and its detection method
CN110957030A (en) * 2019-12-04 2020-04-03 中国人民解放军第二军医大学 Sleep quality monitoring and interaction system
CN111248868A * 2020-02-20 2020-06-09 长沙湖湘医疗器械有限公司 Rapid eye movement sleep analysis method, system and equipment
CN112451834A (en) * 2020-11-24 2021-03-09 珠海格力电器股份有限公司 Sleep quality management method, device, system and storage medium
CN112806975A (en) * 2021-02-01 2021-05-18 深圳益卡思科技发展有限公司 Sleep monitoring device, method and medium based on millimeter wave radar

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Deep learning for automated sleep staging using instantaneous heart rate; Niranjan Sridhar et al.; npj Digital Medicine; 2020-08-20; full text *
A review of physiological parameter detection based on IPPG technology; Zhang Yu; Liu Baozhen; Shan Congmiao; Mou Kaiyu; Chinese Medical Equipment Journal (医疗卫生装备); 2020-02-15 (02); full text *
Research on sleep staging based on combined heart rate and respiration features; Feng Jingda; Jiao Xuejun; Li Qijie; Guo Yamei; Yang Hanjun; Chu Hongzuo; Space Medicine & Medical Engineering (航天医学与医学工程); 2020-04-15 (02); full text *

Also Published As

Publication number Publication date
CN114998229A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
Pouyan et al. A pressure map dataset for posture and subject analytics
Zhang et al. A real-time auto-adjustable smart pillow system for sleep apnea detection and treatment
CN104834946B (en) A kind of contactless sleep monitor method and system
CN114998229B (en) A non-contact sleep monitoring method based on deep learning and multi-parameter fusion
Hwang et al. Real-time automatic apneic event detection using nocturnal pulse oximetry
EP2709525B1 (en) Apnea and hypopnea detection using breath pattern recognition
US20140275829A1 (en) Sleep stage annotation device
Waltisberg et al. Detecting disordered breathing and limb movement using in-bed force sensors
CN116312951B (en) A method and system for evaluating motor function based on multimodal coupling analysis
WO2008135985A1 (en) Monitoring, predicting and treating clinical episodes
Townsend et al. Validation of unobtrusive pressure sensor array for central sleep apnea screening
Yeo et al. Respiratory event detection during sleep using electrocardiogram and respiratory related signals: Using polysomnogram and patch-type wearable device data
Kau et al. Pressure-sensor-based sleep status and quality evaluation system
Bennett et al. The detection of breathing behavior using Eulerian-enhanced thermal video
CN115607123A (en) An integrated device for cardiopulmonary function monitoring and ventilator closed-loop control
CN116807405A (en) Sleep state and sleep disease detection system based on human body pressure distribution image
Alic et al. Contactless Camera-Based Detection of Oxygen Desaturation Events and ODI Estimation During Sleep in SAS Patients.
CN120114037A A method for extracting chest and abdominal respiratory signals via optimal region selection based on human body pressure distribution
Wang et al. Noncontact in bed measurements of electrocardiogram using a capacitively coupled electrode array based on flexible circuit board
Vyas et al. Sleep Stage Classification Using Non-Invasive Bed Sensing and Deep Learning
Martinez et al. A vision-based system for breathing disorder identification: A deep learning perspective
CN114652274B (en) An intelligent sleep monitoring system with three-dimensional and multi-dimensional data
Kim et al. Continuous real-time detection and management of comprehensive mental states using wireless soft multifunctional bioelectronics
TWI748485B (en) Information processing system and method
Zhang et al. DeepWave: Non-contact acoustic receiver powered by deep learning to detect sleep apnea

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant