Disclosure of Invention
The invention aims to provide a method for segmenting characteristic waves of electrocardiosignals, so as to solve the problem that characteristic waves are difficult to identify effectively in the presence of disease.
Another object of the present invention is to provide an FS-Net model for segmenting electrocardiosignal characteristic waves, that is, a model that segments the characteristic waves of an electrocardiosignal effectively.
The first object of the present invention is achieved as follows:
A segmentation method of electrocardiosignal characteristic waves comprises the following steps:
S1, denoising an electrocardiosignal, namely correcting baseline drift of the electrocardiosignal by using a median filtering algorithm, and eliminating high-frequency noise in the electrocardiosignal by using a third-order Butterworth low-pass filter;
S2, electrocardiosignal peak value normalization, namely calibrating denoised electrocardiosignal R peak value information by using a Pan-Tompkins algorithm, calculating RR interval information, selecting data by using a sliding window, and carrying out normalization processing on the data in the sliding window;
S3, carrying out smoothing treatment on the electrocardiosignals after normalization treatment, then correcting QRS waves, P waves and T waves of the electrocardiosignals, and finally carrying out characteristic enhancement on the P waves and the T waves in the corrected electrocardiosignals;
S4, constructing a model, namely constructing an FS-Net model for segmenting characteristic waves of the electrocardiosignal;
S5, model training, namely inputting the electrocardiosignals processed in the steps S1-S3 into an FS-Net model for training, and realizing segmentation of P waves, QRS waves and T waves in the electrocardiosignals;
S6, post-processing, namely correcting wrongly segmented characteristic waves by using a post-processing algorithm;
S7, electrocardiosignal segmentation, namely segmenting the characteristic waves of a measured electrocardiosignal by using the trained FS-Net model.
Further, the low-pass cut-off frequency of the third-order Butterworth low-pass filter in step S1 is 0.5 Hz.
Further, the size of the sliding window in step S2 is 1.1 times the RR interval of the 5 s of electrocardiographic data nearest the current sampling point.
Further, the normalization formula is:
Wherein ECG_normalized represents the normalized electrocardiosignal over a period of time, time1 is the start time of each sampling of the sliding window, time2 is the end time of each sampling of the sliding window, max is the maximum value of the electrocardiosignal amplitude in the sliding window, and abs is the absolute value of the amplitude of the electrocardiosignal from time1 to time2.
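Read together, the definitions above are consistent with a peak normalization of the following form. This is an assumed reconstruction from those definitions, with ECG[time1:time2] denoting the samples inside the sliding window, and not a verbatim reproduction of the formula:

```latex
\[
\mathrm{ECG}_{\mathrm{normalized}}
  = \frac{\mathrm{ECG}[\mathrm{time}_1 : \mathrm{time}_2]}
         {\max\bigl(\operatorname{abs}(\mathrm{ECG}[\mathrm{time}_1 : \mathrm{time}_2])\bigr)}
\]
```

Under this reading, every sample in the window is divided by the largest absolute amplitude in the window, so the R peak of each window is scaled to an amplitude of 1.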
Further, in step S3, the specific way to perform smoothing processing on the electrocardiographic signal is as follows:
A sliding window with a length of 60 ms is slid over the electrocardiosignal one sampling point at a time, and the standard deviation of the data in each window is calculated. Points whose standard deviation is greater than a set threshold are regarded as characteristic waves, and the amplitude of the electrocardiosignal at points whose standard deviation is smaller than the threshold is set to 0, giving the corrected electrocardiosignal, wherein the set threshold is 1.2% of the maximum value of the standard deviation;
The calculation formula of the smoothing process is as follows:
Wherein ECG_std represents the standard deviation of the electrocardiosignal, ECG_amend(t) represents the corrected electrocardiosignal, ECG_std(t) is the standard deviation of the sliding window at time t, and max(ECG_std) is the maximum standard deviation of the sliding window over all times.
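The symbols defined above, together with the 1.2% threshold, suggest a thresholding rule of the following form. It is an assumed reconstruction: the window is taken to be centred on t (30 ms on each side), and std denotes the standard deviation of the samples in that window.

```latex
\[
\mathrm{ECG}_{\mathrm{std}}(t)
  = \operatorname{std}\bigl(\mathrm{ECG}[\,t - 30\,\mathrm{ms} : t + 30\,\mathrm{ms}\,]\bigr)
\]
\[
\mathrm{ECG}_{\mathrm{amend}}(t) =
\begin{cases}
\mathrm{ECG}(t), & \mathrm{ECG}_{\mathrm{std}}(t) > 0.012 \cdot \max(\mathrm{ECG}_{\mathrm{std}}) \\
0, & \text{otherwise}
\end{cases}
\]
```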
Further, the specific way to correct the QRS wave of the electrocardiographic signal in step S3 is as follows:
S3a-1, taking the absolute value of the amplitude of the electrocardiosignal, and determining the peak position of the QRS wave in the absolute-valued electrocardiosignal by using the Pan-Tompkins algorithm;
S3a-2, if the original amplitude at the peak position of the QRS wave is positive, keeping the original amplitude unchanged, and if the original amplitude at the peak position is negative, replacing the original amplitude at the peak position by its opposite number.
Further, in step S3, the specific way to correct the P-wave and the T-wave is as follows:
S3b-1, locating the T wave position and the P wave position respectively, and calculating the mean amplitude of the T wave and of the P wave respectively;
S3b-2, when the mean amplitude of the T wave is negative, the T wave is an inverted T wave, and when the mean amplitude of the P wave is negative, the P wave is an inverted P wave;
S3b-3, counting the two zero-crossing points of the T wave and determining an inverted T wave according to the amplitude and the deflection direction of the T wave, and counting the two zero-crossing points of the P wave and determining an inverted P wave according to the amplitude and the deflection direction of the P wave;
S3b-4, taking the absolute values of the amplitudes of the inverted T wave and the inverted P wave respectively.
Further, in step S3, the specific manner of feature enhancement for the P-wave and the T-wave is as follows:
S3c-1, shifting the corrected electrocardiosignal upwards by 1 mV, so that the baseline of the electrocardiosignal is at 1 mV;
S3c-2, performing a primary transformation on the up-shifted electrocardiosignal according to a logarithmic function, wherein the transformation formula is as follows:
Wherein ECG_log(t) is the amplitude of the electrocardiosignal after the primary transformation at time t;
S3c-3, enhancing the transformed electrocardiosignal, wherein the enhancement formula is as follows:
Wherein ECG_enhance(t) is the amplitude of the electrocardiosignal after enhancement at time t;
S3c-4, shifting the enhanced electrocardiosignal downwards by 1 mV, so that the baseline of the electrocardiosignal is at 0 mV, and processing the down-shifted electrocardiosignal with a Gaussian smoothing algorithm;
S3c-5, calibrating the R-peak information of the Gaussian-smoothed enhanced electrocardiosignal by using the Pan-Tompkins algorithm, calculating RR interval information, selecting data by using a sliding window, and normalizing the data in the sliding window;
S3c-6, sliding a window with a length of 60 ms over the electrocardiosignal one sampling point at a time, calculating the standard deviation of the data in each window, regarding points whose standard deviation is greater than a set threshold as characteristic waves, and setting the amplitude of the electrocardiosignal at points whose standard deviation is smaller than the threshold to 0, to obtain a corrected electrocardiosignal, wherein the set threshold is 3.5% of the maximum value of the standard deviation;
S3c-7, executing steps S3c-1 to S3c-6 cyclically 6 times, wherein the finally obtained electrocardiosignal is the feature-enhanced electrocardiosignal.
Further, the FS-Net model includes:
an encoder connected to the decoder for learning the dependency relationship between the characteristic waves, and
The decoder is connected with the encoder and is used for dividing characteristic waves of the electrocardiosignal;
The input of the encoder passes in sequence through 3 modules each consisting of a convolution layer and a normalization layer, then through 8 modules each consisting in sequence of a multi-head attention layer, an Add layer, a normalization layer, a feedforward layer, an Add layer and a normalization layer, and is output to the decoder;
The input of the decoder passes through 4 modules each consisting of a bidirectional long short-term memory network and a Dropout layer, then through 2 fully connected mapping networks based on time distribution, and finally enters a Softmax layer and is output, wherein each fully connected mapping network based on time distribution comprises a time distribution network and a feedforward layer.
The second object of the present invention is achieved as follows:
an FS-Net model for segmenting an electrocardiographic signal feature wave, comprising:
an encoder connected to the decoder for learning the dependency relationship between the characteristic waves, and
The decoder is connected with the encoder and is used for dividing characteristic waves of the electrocardiosignal;
The input of the encoder passes in sequence through 3 modules each consisting of a convolution layer and a normalization layer, then through 8 modules each consisting in sequence of a multi-head attention layer, an Add layer, a normalization layer, a feedforward layer, an Add layer and a normalization layer, and is output to the decoder;
The input of the decoder passes through 4 modules each consisting of a bidirectional long short-term memory network and a Dropout layer, then through 2 fully connected mapping networks based on time distribution, and finally enters a Softmax layer and is output, wherein each fully connected mapping network based on time distribution comprises a time distribution network and a feedforward layer.
Further, the fully connected mapping network based on time distribution maps a signal with time length m and feature length n at each time point into a signal with time length m and feature length n' at each time point, wherein n' is smaller than n.
Electrocardiogram characteristic wave recognition is a challenging task in electrocardiosignal analysis. The invention introduces an adaptive waveform correction filter to address the characteristic wave changes caused by myocardial ischemia; it effectively enhances the expression of P waves and T waves in the electrocardiogram and improves the accuracy of characteristic wave detection. The algorithm adaptively corrects the T-wave morphology changes caused by myocardial ischemia and enhances the amplitudes of P waves and T waves, so that the data and the algorithm are matched to each other and the performance of the neural network is improved. The invention also provides an interpretable end-to-end neural network model, FS-Net, which detects characteristic waves in the corrected and enhanced 12-lead electrocardiogram, provides visualization of the feature distribution, and allows the deep learning process to be traced back and displayed. In addition, a post-processing algorithm is provided to correct false detections and classify the characteristic waves accurately. The method thus improves the accuracy of electrocardiogram characteristic wave detection while remaining interpretable and practical. Future studies may apply these methods to a broader range of cardiac disease detection and optimize them further to improve the efficacy of electrocardiographic detection and the feasibility of clinical application.
Detailed Description
The present invention will be described in further detail below.
As shown in fig. 1, the invention provides a segmentation method of an electrocardiosignal characteristic wave, which specifically comprises the following steps:
First, the signal is preprocessed, mainly by removing noise and normalizing, so that a clean and stable electrocardiosignal is obtained. The data are then corrected: the characteristic wave deformation caused by disease is corrected and the features are enhanced, so that the FS-Net model can detect the electrocardiosignal more effectively. The corrected and enhanced signal is fed into the FS-Net model for analysis, and the result is corrected by a post-processing algorithm, which completes the segmentation of the characteristic waves.
S1, denoising the electrocardiosignal, namely correcting baseline drift of the electrocardiosignal by using a median filtering algorithm, and eliminating high-frequency noise in the electrocardiosignal by using a third-order Butterworth low-pass filter.
As shown in fig. 2, electrocardiosignals are highly susceptible to noise during acquisition, mainly power-line interference and myoelectric interference. Baseline drift of the electrocardiosignal is corrected by a median filtering algorithm, and high-frequency noise is eliminated by a third-order Butterworth low-pass filter with a low-pass cut-off frequency of 0.5 Hz, giving the denoised electrocardiosignal.
In fig. 2, the red line is the noisy electrocardiosignal and the blue line is the denoised electrocardiosignal; the left panel shows the signal of a diseased individual before and after denoising, and the right panel that of a healthy individual. In both cases the electrocardiosignal is smoother after denoising.
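The baseline-correction half of step S1 can be sketched as a running median that is subtracted from the signal. This is a minimal illustration only: the function name `remove_baseline` and the window length are assumptions (the text does not state the median filter length), and the Butterworth low-pass stage is not shown; in practice it would normally come from a signal-processing library such as `scipy.signal.butter`.

```python
from statistics import median


def remove_baseline(signal, window):
    """Subtract a running-median baseline estimate from the signal.

    `window` is an odd number of samples; its value is an assumption,
    since the text does not give the median filter length.
    """
    half = window // 2
    n = len(signal)
    corrected = []
    for i in range(n):
        lo = max(0, i - half)          # clip the window at the signal edges
        hi = min(n, i + half + 1)
        baseline = median(signal[lo:hi])
        corrected.append(signal[i] - baseline)
    return corrected


# A flat trace riding on a constant 0.5 mV offset with one narrow spike.
ecg = [0.5] * 10 + [1.5] + [0.5] * 10
clean = remove_baseline(ecg, window=5)
```

The median is insensitive to narrow spikes, so the offset is removed while the spike keeps its height above the baseline.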
S2, electrocardiosignal peak value normalization, namely calibrating denoised electrocardiosignal R peak value information by using a Pan-Tompkins algorithm, calculating RR interval information, selecting data by using a sliding window, and carrying out normalization processing on the data in the sliding window.
Acquisition of the electrocardiosignal is affected by the subject's respiratory rhythm, beat-to-beat differences and the acquisition equipment, so the R-peak heights of the same person in the same lead differ within a single test. The subsequent P-wave and T-wave enhancement uses statistical parameters of the electrocardiosignal such as the mean and variance, so the stability of the data is important for the enhancement effect. The invention provides an adaptive cyclic filtering scheme that resolves the differing R-peak heights at acquisition and stabilizes the electrocardiosignal.
Data are then selected with a sliding window and normalized within the window. The size of the sliding window is 1.1 times the RR interval of the most recent 5 seconds of electrocardiographic data, and the window advances 10 sampling points at a time. At each position the data in the window are normalized again, until the in-window normalization has been carried out for all sampling points. As shown in fig. 3, the R-peak heights have been corrected: they are relatively uniform, and the height difference between R peaks is small.
The normalization formula is:
Wherein ECG_normalized represents the normalized electrocardiosignal over a period of time, time1 represents the start time, time2 represents the end time, ECG[time1:time2] is the amplitude of the electrocardiosignal at each time point in that period, max is the maximum value of the electrocardiosignal amplitude in the sliding window, and abs is the absolute value of the amplitude of the electrocardiosignal from time1 to time2.
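A minimal sketch of the sliding-window peak normalization follows. It assumes each window is divided by its largest absolute amplitude, and it takes the RR interval as a fixed number of samples (`rr_samples`); in the method described above the RR interval would instead be re-estimated from the nearest 5 seconds of data via Pan-Tompkins R-peak detection, which is not implemented here. The function names are illustrative.

```python
def normalize_window(segment):
    """Divide a window by its largest absolute amplitude."""
    peak = max(abs(x) for x in segment)
    if peak == 0:
        return list(segment)  # an all-zero window is left untouched
    return [x / peak for x in segment]


def sliding_peak_normalize(signal, rr_samples, step=10):
    """Re-normalize overlapping windows of 1.1 x RR, advancing `step` samples.

    `rr_samples` is an assumed, fixed RR interval in samples.
    """
    window = max(1, int(round(1.1 * rr_samples)))
    out = list(signal)
    start = 0
    while start < len(out):
        stop = min(len(out), start + window)
        out[start:stop] = normalize_window(out[start:stop])
        start += step
    return out


# Two "R peaks" of different height, 20 samples apart.
ecg = [0.0] * 40
ecg[5], ecg[25] = 2.0, 0.5
flat = sliding_peak_normalize(ecg, rr_samples=20)
```

Because the window is only slightly longer than one RR interval, most windows contain a single R peak, so each peak is scaled towards the same height as the window slides.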
S3, carrying out smoothing treatment on the electrocardiosignals after normalization treatment, then correcting QRS waves, P waves and T waves of the electrocardiosignals, and finally carrying out characteristic enhancement on the P waves and the T waves in the corrected electrocardiosignals;
To enhance the characteristic wave bands of the electrocardiogram stably, the presence of a characteristic wave must first be determined. The morphology of characteristic waves varies widely, whereas in the absence of a characteristic wave the electrocardiosignal stays near the baseline and its amplitude changes slowly. The regions without characteristic waves are therefore calibrated first, and the regions with characteristic waves are then determined by exclusion. This calibration only needs to be coarse; the fine division is completed by the electrocardiogram characteristic wave segmentation model. Once the regions without characteristic waves have been determined, the electrocardiosignal in those regions is left unchanged by the subsequent characteristic wave enhancement.
As shown in fig. 4, the invention proposes a window-function-based smoothing method for non-characteristic wave bands, which calibrates the non-characteristic wave regions and the characteristic wave regions. Whether a characteristic wave occurs is judged by calculating the standard deviation of the electrocardiosignal amplitude in a sliding window 60 ms long that advances one sampling point at a time. The amplitude of a characteristic wave changes rapidly and produces a large standard deviation, while a non-wave band is relatively stable and produces a small one. A point whose standard deviation is greater than the set threshold is regarded as belonging to a characteristic wave, and a point whose standard deviation is less than or equal to the threshold is regarded as having no characteristic wave. Among the characteristic waves the P wave has the smallest amplitude, so the threshold is chosen to distinguish the P-wave band from the non-wave band; once the P wave can be distinguished, the other characteristic waves are distinguished from the non-wave band as well, although the specific type of characteristic wave cannot be determined at this stage. The set threshold is 1.2% of the maximum value of the standard deviation. The formulas for the standard deviation and the threshold are expressed as:
Wherein ECG_std represents the standard deviation of the signal, ECG_amend represents the corrected signal, ECG_std(t) is the standard deviation of the sliding window at time t, max(ECG_std) is the maximum standard deviation of the sliding window over all times, and max(ECG_std) × 0.012 is the set threshold.
For example, when the time corresponding to the sampling point is 1 s, time1 is 970 ms and time2 is 1030 ms.
In fig. 4, the left panel shows the electrocardiosignal of a diseased individual before and after the non-characteristic band smoothing, and the right panel that of a healthy individual. After the smoothing, the clutter around the T wave and the P wave is filtered out.
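The non-characteristic band smoothing can be sketched as follows. The sketch assumes a window centred on each sample (30 ms on either side, in line with the 970 ms to 1030 ms example) and uses the population standard deviation; the text does not say whether the population or the sample estimator is intended, and the function name and sampling rate are illustrative.

```python
from statistics import pstdev


def smooth_non_feature_bands(signal, fs, window_ms=60, ratio=0.012):
    """Zero out samples whose local standard deviation is below the threshold.

    fs        -- sampling rate in Hz (assumed known)
    window_ms -- window length in milliseconds (60 ms in the text)
    ratio     -- threshold as a fraction of the maximum standard deviation
    """
    half = max(1, int(round(fs * window_ms / 1000 / 2)))
    n = len(signal)
    stds = []
    for t in range(n):
        lo = max(0, t - half)            # centred window, clipped at the edges
        hi = min(n, t + half + 1)
        stds.append(pstdev(signal[lo:hi]))
    threshold = ratio * max(stds)
    return [x if s > threshold else 0.0 for x, s in zip(signal, stds)]


# Flat, slightly offset trace with one small bump in the middle.
ecg = [0.01] * 100 + [0.0, 0.5, 1.0, 0.5, 0.0] + [0.01] * 100
amended = smooth_non_feature_bands(ecg, fs=1000)
```

Flat stretches far from the bump have zero local standard deviation and are set to 0, while samples whose window overlaps the bump are kept unchanged.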
QRS wave direction correction of the electrocardiosignal. In the standard 12-lead system, QRS wave morphology may vary because of differences in sampling location and in the type of disease the subject may have. Even in disease-free cases the electrocardiographic morphology differs; for example, QRS band peaks normally point upward in leads I, II and III, but a shift of the cardiac electrical axis can change the peak direction. To meet the characteristic wave segmentation model's need for identically distributed training data, the invention provides an adaptive algorithm for correcting the peak direction of the QRS wave band.
As shown in fig. 5, only the peak direction of the QRS wave band needs to be corrected, so the R-peak position does not need to be located precisely. First, the absolute value of the amplitude at each point of the electrocardiosignal is taken, and the peak positions are then determined with the Pan-Tompkins algorithm. The peak direction is judged from the original data at each peak position: if the original value is positive the signal is kept unchanged, and if it is negative the electrocardiosignal is stored as its opposite number. In this way the peak direction of the QRS wave band is unified to point upward, and the uniformly distributed data help to optimize the training of the characteristic wave segmentation model.
In fig. 5, the left panel shows the electrocardiosignal of a diseased individual before and after the QRS wave direction correction, and the right panel that of a healthy individual. After the correction, the QRS peaks point uniformly upward.
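The sign check at each detected peak can be sketched as below. Two things are assumed for illustration: the peak indices are supplied by the caller (standing in for the output of Pan-Tompkins on the absolute-valued signal, which is not implemented here), and the sign flip is applied to a neighbourhood of `half_width` samples around each negative peak, a parameter the text does not specify.

```python
def correct_qrs_polarity(signal, peak_indices, half_width):
    """Flip the signal around every QRS peak whose original value is negative.

    peak_indices -- peak positions, assumed to come from a detector run on
                    the absolute-valued signal
    half_width   -- assumed size of the neighbourhood that is flipped
    """
    out = list(signal)
    for p in peak_indices:
        if signal[p] < 0:                          # downward-pointing peak
            lo = max(0, p - half_width)
            hi = min(len(out), p + half_width + 1)
            for i in range(lo, hi):
                out[i] = -signal[i]                # store the opposite number
    return out


ecg = [0.0, 0.1, -1.0, 0.1, 0.0, 0.0, -0.1, 1.0, -0.1, 0.0]
upright = correct_qrs_polarity(ecg, peak_indices=[2, 7], half_width=1)
```

The first beat points downward and is flipped; the second already points upward and is left as it was.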
P-wave and T-wave direction correction of the electrocardiosignal. Affected by lead differences and cardiac disease, the P wave and the T wave may be deformed. To better enhance the electrocardiographic characteristic waves, all P waves and T waves need to be corrected. First screening (inverted P wave and inverted T wave): the mean amplitudes of the T wave and of the P wave are calculated respectively; if the T-wave mean is negative the T wave is judged to be an inverted T wave, and if the P-wave mean is negative the P wave is judged to be an inverted P wave. Second screening: the two zero-crossing points of the T wave are counted and an inverted T wave is determined according to the amplitude and the deflection direction of the T wave; likewise, the two zero-crossing points of the P wave are counted and an inverted P wave is determined according to the amplitude and the deflection direction of the P wave. When the amplitude is negative and the deflection direction is downward, the wave is an inverted characteristic wave, and its absolute value is taken to correct it. When no zero-crossing information is available, the T wave direction is corrected directly by using the statistical information, namely that the P wave lies to the left of the QRS wave and the T wave lies to its right.
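The first screening reduces to a sign test on the mean amplitude of a located wave. The sketch below covers only that screening and assumes the wave boundaries (`start`, `stop`) have already been located by some other means; the zero-crossing based second screening is not shown.

```python
def correct_inverted_wave(signal, start, stop):
    """Rectify the wave in signal[start:stop] if its mean amplitude is negative.

    Returns the (possibly corrected) signal and a flag telling whether the
    wave was judged to be inverted. The wave bounds are assumed to be given.
    """
    segment = signal[start:stop]
    mean_amplitude = sum(segment) / len(segment)
    if mean_amplitude >= 0:
        return list(signal), False
    out = list(signal)
    out[start:stop] = [abs(x) for x in segment]   # take absolute values
    return out, True


ecg = [0.0, 0.0, -0.1, -0.3, -0.1, 0.0, 0.0]
fixed, was_inverted = correct_inverted_wave(ecg, start=2, stop=5)
```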
P-wave and T-wave enhancement of the electrocardiosignal. In a standard 12-lead electrocardiogram, the energy of the QRS wave band is generally higher than that of the P wave and the T wave, and the amplitude of the P wave is only 1/5 to 1/3 of that of the QRS wave band, so the P wave is easily masked in the electrocardiosignal. Characteristic wave segmentation is more sensitive to rapid gradient changes, so enhancing the energy of the P wave and the T wave helps characteristic wave detection. The electrocardiosignal is processed using the property that the growth rate of the logarithmic function decreases as its input increases: the P-wave and T-wave values in the original signal are small and are raised noticeably by the logarithmic transformation, whereas the QRS values are higher and their relative increase after the transformation is smaller.
The specific mode for carrying out characteristic enhancement on the P wave and the T wave is as follows:
S3c-1, moving the corrected electrocardiosignal upwards by 1mV, and setting the baseline of the electrocardiosignal to be 1mV.
S3c-2, performing primary transformation on the upward-moved electrocardiosignals according to a logarithmic function, wherein a transformation formula is as follows:
Wherein ECG_log(t) is the amplitude of the electrocardiosignal after the primary transformation at time t.
S3c-3, enhancing the transformed electrocardiosignal, wherein the enhancement formula is as follows:
Wherein ECG_enhance(t) is the amplitude of the electrocardiosignal after enhancement at time t.
S3c-4, downwards shifting the enhanced electrocardiosignal by 1mV, wherein the baseline of the electrocardiosignal is 0mV, and carrying out Gaussian smoothing algorithm processing on the downwards shifted electrocardiosignal.
S3c-5, calibrating the R-peak information of the down-shifted enhanced electrocardiosignal by using the Pan-Tompkins algorithm, calculating RR interval information, selecting data by using a sliding window, and normalizing the data in the window, wherein the normalization formula is formula (1).
S3c-6, sliding a window with a length of 60 ms over the electrocardiosignal one sampling point at a time, calculating the standard deviation of the data in each window, regarding points whose standard deviation is greater than a set threshold as characteristic waves, and setting the amplitude of the electrocardiosignal at points whose standard deviation is smaller than the threshold to 0, to obtain a corrected electrocardiosignal, wherein the set threshold is 3.5% of the maximum value of the standard deviation and the standard deviation is calculated according to formula (2).
Wherein ECG_amend1(t) is the corrected electrocardiosignal.
S3c-7, executing steps S3c-1 to S3c-6 cyclically 6 times, wherein the finally obtained electrocardiosignal is the feature-enhanced electrocardiosignal.
To obtain a pronounced effect, the enhancement cycle is carried out 6 times in total, finally giving the enhanced electrocardiosignal.
As shown in fig. 6, after characteristic wave enhancement the P wave and the T wave of the original electrocardiosignal are clearly enhanced, regardless of whether the T wave is inverted or reduced in amplitude.
After the enhancement of the electrocardiosignal, the data are segmented into individual heartbeat intervals to suit the FS-Net model. Each heartbeat consists mainly of a P wave, a QRS wave and a T wave, the remainder being non-wave bands. Heartbeats are divided at the midpoint between the end of the T wave of the current heartbeat and the start of the P wave of the next heartbeat (statistically about 58% of the RR interval). After segmentation, each heartbeat is uniformly adjusted to a length of 500 sampling points, and its original length is recorded for the final calculation.
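Adjusting a heartbeat to 500 sampling points can be sketched with linear interpolation. The interpolation method is an assumption, since the text only states the target length; the function name is illustrative. The original length is returned alongside the resampled beat so that predicted positions can later be mapped back.

```python
def resample_beat(beat, target_len=500):
    """Linearly resample one heartbeat to `target_len` samples.

    Returns (resampled_beat, original_length). Linear interpolation is an
    assumed choice; the text does not name the resampling method.
    """
    n = len(beat)
    if n == 1:
        return [beat[0]] * target_len, n
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)   # position on the original grid
        i = int(pos)
        frac = pos - i
        if i >= n - 1:
            out.append(beat[-1])
        else:
            out.append(beat[i] * (1 - frac) + beat[i + 1] * frac)
    return out, n


resampled, original_length = resample_beat([0.0, 1.0, 2.0])
```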
S4, constructing a model, namely constructing an FS-Net model for segmenting the characteristic waves of the electrocardiosignal.
As shown in fig. 7, the FS-Net model for segmenting electrocardiosignal characteristic waves in the present invention includes an encoder and a decoder, which together achieve the segmentation of the electrocardiosignal characteristic waves.
FS-Net encoder. Electrocardiosignals are time dependent: the reciprocating contraction of the atria and ventricles gives the P wave, QRS wave and T wave their periodicity. The encoder of the present invention therefore mainly employs self-attention modules to learn the dependency relationships between the characteristic waves. Before the signal is sent into the self-attention modules, its feature dimension is expanded by a convolution module so that the self-attention modules can learn the latent information in the electrocardiosignal more effectively.
As shown in table 1, the model input is first expanded to a dimension of 64 by a three-layer convolutional neural network. A feature channel of dimension 64 carries sufficient electrocardiographic semantic information. With the self-attention mechanism, the network learns the correlation between earlier and later information in the electrocardiogram, which gives it a global view and lets it fully learn the timing information before and after the P wave, QRS wave and T wave of the electrocardiosignal.
The input signal first passes through 3 modules, each consisting in sequence of a convolution layer and a first normalization layer, and then passes through 8 standard self-attention modules in sequence. Within each self-attention module the input is split into two paths: one passes through the multi-head attention network to the first superposition layer, and the other is connected directly to the first superposition layer. The sum from the first superposition layer enters the second normalization layer, whose output is again split into two paths: one passes through the feedforward layer to the second superposition layer, and the other is connected directly to the second superposition layer. The sum from the second superposition layer enters the third normalization layer. The output of the third normalization layer in each of the first to seventh self-attention modules is connected to the next self-attention module, and the output of the third normalization layer in the eighth self-attention module is connected to the decoder.
The expanded data are input into the eight layers of standard self-attention modules, and positional encoding information is added in the first layer so that the network better learns the temporal order of the data. The self-attention modules use 4 heads, followed by a nonlinear mapping of the features through the feedforward network to strengthen the modeling capability of the model. To prevent overfitting, a random disconnection (Dropout) layer with a coefficient of 0.1 is connected.
After passing through 3 modules consisting of convolution layer and normalization layer, the input of the encoder enters 8 modules consisting of multi-head attention layer, add layer, normalization layer, feedforward layer, add layer and normalization layer in sequence, and outputs to the decoder.
FS-Net decoder. The encoded signal still contains timing information, so a sequential decoder is used for decoding: the first four layers use bidirectional long short-term memory modules for dimension reduction, and the last two layers use fully connected mapping networks based on time distribution to complete the decoding, thereby calibrating the start and end points of the characteristic waves.
The input of the decoder passes through 4 modules each consisting in sequence of a bidirectional long short-term memory network and a Dropout layer, then through 2 fully connected mapping networks based on time distribution, and finally enters a Softmax layer and is output, wherein each fully connected mapping network based on time distribution comprises a time distribution network and a feedforward layer.
The decoder is responsible for the characteristic wave segmentation of the electrocardiosignal. It first reduces the dimension of the self-attention encoding result along the feature channel. Because the encoded result contains timing information, a bidirectional long short-term memory network (Bi-LSTM) is used for feature decoding, and four Bi-LSTM layers reduce the dimension to 16.
After Bi-LSTM decoding, the dimension, now 16, is further reduced to 2 to mark the presence or absence of a characteristic wave.
As shown in fig. 8, the fully connected mapping network based on time distribution performs the final decoding. Specifically, a fully connected network processes the features at each moment, that is, the same fully connected network is applied to every time step of the sequence. The time length of the signal is m and the feature length at each time point is n. After the per-time-step fully connected mapping, the time length is still m, but the feature length at each time point becomes n', where n' is less than n. Parameters are shared across time steps. The decoding process uses a two-layer network with dimensions 64 and 2, respectively. Finally, the output layer applies a softmax function to complete the decoding.
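The per-time-step mapping can be illustrated in plain Python. The sketch applies one shared weight matrix to every time step of an (m, n) sequence to obtain an (m, n') sequence and then a softmax over the last dimension. The weights and the sizes m = 3, n = 4, n' = 2 are made-up illustrative values, not trained FS-Net parameters.

```python
import math


def dense(vec, weights, bias):
    """Fully connected layer: `weights` holds n' rows of n coefficients."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def time_distributed(sequence, weights, bias):
    """Apply the same dense layer (shared parameters) to every time step."""
    return [dense(step, weights, bias) for step in sequence]


def softmax(vec):
    """Numerically stable softmax over one feature vector."""
    peak = max(vec)
    exps = [math.exp(v - peak) for v in vec]
    total = sum(exps)
    return [e / total for e in exps]


# m = 3 time steps, n = 4 features per step, mapped to n' = 2 features.
sequence = [[1.0, 2.0, 3.0, 4.0]] * 3
weights = [[0.1, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.1]]
bias = [0.0, 0.0]
mapped = time_distributed(sequence, weights, bias)
probabilities = [softmax(step) for step in mapped]
```

The time length is unchanged while the feature length shrinks from n to n', and each output step is a probability distribution over the two classes (characteristic wave present or absent).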
TABLE 1 Network model parameter description
| Layer | Type | Output shape | Quantity |
|---|---|---|---|
| 1 | Conv1DPack | (batch, 500, 64) | 3 |
| 2 | Encoder | (batch, 500, 64) | 8 |
| 3 | BiLSTM | (batch, 500, 32) | 4 |
| 4 | TimeDistributed | (batch, 500, 2) | 2 |
Because the electrocardiosignal is a time-series signal whose characteristic waves vary periodically, a neural network structure sensitive to time series is selected. The encoder mainly adopts a time-based self-attention network, and the decoder combines a long short-term memory network (LSTM) with a time-distributed module, so that the model can effectively capture the temporal dependencies between the characteristic waves.
S5, model training, namely inputting the electrocardiosignals processed in the steps S1-S3 into an FS-Net model for training, and realizing segmentation of P waves, QRS waves and T waves in the electrocardiosignals.
The electrocardiosignals processed in steps S1-S3 are input into the constructed FS-Net model for training. The invention uses focal loss as the loss function, so that edge information that is difficult to segment is classified more accurately.
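For reference, the commonly used form of focal loss is FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the true class. The sketch below implements this standard form for the two-class, per-time-step output. The values gamma = 2.0 and alpha = 0.25 are widely used defaults and are assumptions here, since the text does not state the parameters used by the invention.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss averaged over time steps.

    probs:  (m, 2) softmax outputs, column 1 = characteristic wave present.
    labels: (m,) integer labels in {0, 1}.
    gamma, alpha: assumed defaults, not values taken from the source.
    """
    probs = np.clip(probs, eps, 1.0 - eps)
    p_t = probs[np.arange(len(labels)), labels]          # probability of the true class
    alpha_t = np.where(labels == 1, alpha, 1.0 - alpha)  # class weighting
    # (1 - p_t) ** gamma down-weights samples that are already classified well,
    # so hard samples such as wave boundaries dominate the loss.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))
```

With gamma set to 0 the modulating factor disappears and the expression reduces to a class-weighted cross-entropy.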
To make the network converge quickly in the early stage of training, the initial learning rate is set to 0.001. When the accuracy exceeds 90%, the learning rate is reduced to 0.00005, and training continues until the accuracy on the training set no longer improves.
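The learning-rate rule can be sketched as two small helper functions. The function names, the use of training-set accuracy for the 90% threshold, and the `patience` value that decides when accuracy "no longer improves" are illustrative assumptions.

```python
def learning_rate(train_accuracy, initial_lr=0.001, reduced_lr=0.00005, threshold=0.90):
    # Use the larger rate for fast early convergence, then switch to the
    # smaller rate once accuracy exceeds the threshold.
    return reduced_lr if train_accuracy > threshold else initial_lr

def should_stop(accuracy_history, patience=5):
    # Stop when the last `patience` epochs did not beat the earlier best.
    # `patience` is a hypothetical parameter, not specified in the text.
    if len(accuracy_history) <= patience:
        return False
    return max(accuracy_history[-patience:]) <= max(accuracy_history[:-patience])
```

A training loop would call `learning_rate` at the start of each epoch and `should_stop` at the end of it.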
S6, a post-processing algorithm is used for correcting the characteristics of the segmentation errors.
The characteristic wave bands are segmented by the neural network. Under normal conditions, one heartbeat contains three characteristic wave bands, but disease may cause clutter to be erroneously detected as a characteristic wave. In the case of multiple detections, erroneous bands at the edges and short spurious bands are removed. In the case of a missed detection, the missing P wave or T wave is completed according to the position of the R peak.
Because each heartbeat is adjusted to a length of 500 sampling points before the electrocardiosignal is identified by the FS-Net model, the minimum band length is set to 4% of the heartbeat length, that is, 20 sampling points. All bands marked by the FS-Net model are traversed, and bands shorter than the minimum band length are filtered out.
The R peak is determined by using the Pan-Tompkins algorithm. The bands are then traversed again, the band containing the R peak is taken as the QRS band, and the index of the QRS band, namely the time corresponding to the QRS band, is recorded.
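The filtering and QRS-indexing steps can be sketched as follows, assuming the model output for one heartbeat has been reduced to a 0/1 mask of 500 samples and that the R peak position is already available (here a hypothetical `r_peak` sample index; the Pan-Tompkins detection itself is not shown).

```python
def extract_bands(mask):
    # Return (start, end) pairs, end exclusive, for every run of 1s in the mask.
    bands, start = [], None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i
        elif not value and start is not None:
            bands.append((start, i))
            start = None
    if start is not None:
        bands.append((start, len(mask)))
    return bands

def filter_short_bands(bands, beat_length=500, min_fraction=0.04):
    # 4% of a 500-sample heartbeat gives a minimum band length of 20 samples.
    min_length = round(beat_length * min_fraction)
    return [(s, e) for s, e in bands if e - s >= min_length]

def find_qrs_index(bands, r_peak):
    # The band that contains the R peak is treated as the QRS band.
    for index, (s, e) in enumerate(bands):
        if s <= r_peak < e:
            return index
    return None

# Example: four marked bands, one of which (10 samples) is too short.
mask = [0] * 500
for start, end in [(50, 110), (150, 160), (230, 270), (330, 420)]:
    mask[start:end] = [1] * (end - start)
bands = filter_short_bands(extract_bands(mask))
qrs_index = find_qrs_index(bands, r_peak=250)
```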
When the heartbeat contains 3 bands and the R wave is in the middle band, all three bands are retained. When the heartbeat contains 3 bands and the R wave is in the first band, the remaining bands are merged into the T band, that is, the start time of the second band is taken as the start time of the T band and the end time of the third band is taken as the end time of the T band. When the heartbeat contains 3 bands and the R wave is in the third band, the remaining bands are merged into the P band, that is, the start time of the first band is taken as the start time of the P band and the end time of the second band is taken as the end time of the P band.
When the heartbeat contains more than 3 bands, the bands to the left of the R wave are merged into the P band, and the bands to the right of the R wave are merged into the T band.
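The merging rules above reduce to one operation: everything to the left of the QRS band becomes the P band, and everything to the right becomes the T band. A minimal sketch, assuming time-ordered `(start, end)` bands and a known QRS band index (the function name and the dictionary layout are illustrative):

```python
def merge_bands(bands, qrs_index):
    # bands: time-ordered (start, end) pairs for one heartbeat.
    # qrs_index: index of the band that contains the R peak.
    left = bands[:qrs_index]
    right = bands[qrs_index + 1:]
    result = {"P": None, "QRS": bands[qrs_index], "T": None}
    if left:
        # Start of the first left band to the end of the last left band.
        result["P"] = (left[0][0], left[-1][1])
    if right:
        # Start of the first right band to the end of the last right band.
        result["T"] = (right[0][0], right[-1][1])
    return result
```

A `None` entry marks a band that was not detected and would have to be completed from the R peak position.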
When the number of bands in the heartbeat is smaller than 2, the missed band is supplemented, and whether the missed band is a P wave or a T wave is judged according to the position of the R wave. From the detected bands of the heartbeats, the distance between the P wave and the R wave is calculated, and its average over all heartbeats is taken as the reference value of the P-R distance. Likewise, the distance between the T wave and the R wave is calculated, and its average over all heartbeats is taken as the reference value of the T-R distance.
When there is no band to the left of the R wave, the P band is supplemented according to the reference value of the P-R distance. When there is no band to the right of the R wave, the T band is supplemented according to the reference value of the T-R distance.
Because the measured electrocardiosignals differ between individuals, both reference values are determined from all heartbeats of the same individual, so that the supplemented P wave and T wave are more accurate.
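The completion step can be sketched as follows. The text does not define how the "distance" between a wave and the R wave is measured, so this sketch assumes it is the pair of offsets from the R peak to the band onset and to the band offset, averaged over the heartbeats of one individual in which the band was detected. The data layout and function names are hypothetical.

```python
from statistics import mean

def reference_offsets(beats, wave):
    # beats: list of dicts such as {"r_peak": 250, "P": (100, 150), "T": None}.
    # Returns the mean (onset - r_peak, offset - r_peak) over beats where
    # the wave was detected, or None if it was never detected.
    found = [(b[wave][0] - b["r_peak"], b[wave][1] - b["r_peak"])
             for b in beats if b.get(wave) is not None]
    if not found:
        return None
    return (mean([d[0] for d in found]), mean([d[1] for d in found]))

def complete_missing(beat, references, beat_length=500):
    # Fill in a missing P or T band at the reference distance from the R peak.
    completed = dict(beat)
    for wave in ("P", "T"):
        if completed.get(wave) is None and references.get(wave) is not None:
            onset = round(beat["r_peak"] + references[wave][0])
            offset = round(beat["r_peak"] + references[wave][1])
            # Clamp to the limits of the 500-sample heartbeat.
            completed[wave] = (max(onset, 0), min(offset, beat_length))
    return completed

beats = [
    {"r_peak": 250, "P": (100, 150), "T": (350, 450)},
    {"r_peak": 260, "P": (120, 160), "T": (370, 450)},
    {"r_peak": 250, "P": None, "T": (340, 440)},  # P wave missed
]
references = {w: reference_offsets(beats, w) for w in ("P", "T")}
fixed = complete_missing(beats[2], references)
```

In this example the mean P offsets are -145 and -100 samples, so the missing P band is placed at samples 105 to 150.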
S7, electrocardiosignal segmentation, namely segmenting characteristic waves of the measured electrocardiosignal by using a trained FS-Net model.
For the measured electrocardiographic signals, the processing of steps S1-S3 is also needed before the segmentation by using the FS-Net model, and the processing of step S6 is needed after the segmentation.
S8, model evaluation, namely segmenting the LUDB characteristic wave bands by using the corrected FS-Net model under different tolerances, and evaluating the corrected FS-Net model according to the segmentation result.
The evaluation indexes of characteristic wave segmentation are the segmentation accuracy of the characteristic wave bands and the mean and variance of the error between the predicted start and end points of the characteristic waves and the labels. The mean reflects the average deviation of the model predictions, and the variance reflects the stability of the model.
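A tolerance-based evaluation of this kind can be sketched as follows. The sketch assumes that predicted and labelled boundary points have already been paired, that a missed boundary is encoded as NaN, and that the spread is reported as a standard deviation in milliseconds. The sampling frequency of 500 Hz matches LUDB; the function name and return layout are illustrative.

```python
import numpy as np

def evaluate_boundaries(predicted, reference, fs=500, tolerance_ms=150):
    # predicted, reference: paired boundary positions in samples.
    # A NaN in `predicted` marks a boundary the model did not detect.
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    error_ms = (predicted - reference) * 1000.0 / fs
    # Treat missed boundaries as infinitely far away so they never count as hits.
    abs_error = np.where(np.isnan(error_ms), np.inf, np.abs(error_ms))
    hit = abs_error <= tolerance_ms
    result = {"sensitivity": float(hit.sum() / len(reference)),
              "mean_ms": float("nan"), "std_ms": float("nan")}
    if hit.any():
        # Error statistics are computed over correctly detected boundaries only.
        result["mean_ms"] = float(error_ms[hit].mean())
        result["std_ms"] = float(error_ms[hit].std())
    return result

result = evaluate_boundaries([100, 205, float("nan"), 400], [102, 200, 300, 500])
```

Running the same function with `tolerance_ms` set to 40 or 70 gives the stricter evaluations.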
TABLE 2 LUDB characteristic band splitting Performance
LUDB contains standard 12-lead electrocardiogram signals from 200 subjects, each lasting 10 seconds, with a sampling frequency of 500 Hz. These signals were manually annotated by cardiologists. The present invention uses the data of 183 of these subjects, containing 14480 heartbeats in total. The other 17 subjects either suffer from atrial fibrillation or have no annotated P waves. The P wave disappears in the electrocardiogram of an atrial fibrillation patient, and that condition can be diagnosed from the heart rate. Because the present invention focuses on detecting the complete waveform of the characteristic waves, the data of these 17 subjects is not used.
Table 2 shows the characteristic wave segmentation results under tolerances of 40 ms, 70 ms, and 150 ms. The model performs better when it is trained on both the original electrocardiosignal and the electrocardiosignal processed in steps S1-S3. Under the standard tolerance of 150 milliseconds, the detection of the start and end points of each band in LUDB reaches a sensitivity close to 100%. When the original signal and the enhanced signal are input together during training, the electrocardiosignal characteristic wave segmentation model captures more effective information, the mean and variance of the characteristic wave detection error are well controlled, and the model shows excellent stability.
By comparing the invention with other schemes, the electrocardiosignal segmentation effect of the invention is evaluated.
TABLE 3 comparison of the results of the invention with other protocols
As shown in table 3, the invention is compared with other schemes on the LUDB dataset and the QTDB dataset, and on LUDB under a 150 ms tolerance it tends to yield better performance. Wang et al. use a residual network to analyze wave structure and position synchronously, reaching 100% sensitivity for the salient QRS complexes, but note a decrease in detection performance for the smaller-amplitude P waves. Wang et al., Sereda, and Chen et al. capture finer details with U-Net-based methods, improving the P-wave detection rate by 2% to 99.5%. However, because the electrocardiosignal is a time-series signal, it is important to understand the temporal relationship, and these methods employ LSTM networks for this purpose. The FS-Net of the present invention also uses temporal modeling: it enhances the characterization of the P and T waves by filtering, improves the understanding of temporal logic by using an attention mechanism, and prevents temporal overlap by using a time-distributed layer. It therefore exhibits excellent performance in terms of the mean and variance of the start and end point errors. Compared with the solution of Chen et al., the invention reduces the average error by 2 ms and the variance by 5 ms, further improving stability.
As can be seen from table 3, the detection sensitivity for P waves and QRS waves is similar on both datasets, exceeding 99%, but the segmentation error on LUDB is smaller than on QTDB. The test results on the QTDB dataset show that the deep learning methods of Jimenez, Peimankar, and Chen generally have higher delineation errors than the digital processing methods of Martinez, Bote, and Kalyakulina. Compared with the digital processing methods, the variance of the present invention is about 5 milliseconds for the P wave and the QRS complex and about 20 milliseconds for the T wave. The expert-annotated LUDB dataset contains 14480 heartbeats, approximately five times the 2845 heartbeats of the QTDB dataset. Digital processing methods for feature extraction capture the threshold information of the characteristic wave onset and offset more effectively, and therefore perform better on the QTDB dataset. On this dataset, the sensitivity of the invention to P waves and QRS complexes exceeds 99%, as with other deep learning methods. Because QTDB contains less data and models trained on it generalize poorly, the invention uses LUDB for pre-training and achieves a better result.
The main reason that deep learning models perform worse on QTDB than on LUDB is the quality and quantity of the annotation information. Deep learning models rely on data-driven training, and the abnormal annotations and inaccurate or inconsistent labels in the QTDB database can seriously affect their performance. For example, some T wave signals lack a starting point marker. Such abnormal annotations affect the convergence direction of the FS-Net model and become a key reason why the model cannot identify T waves effectively. Furthermore, the QTDB dataset provides information for only two random leads, resulting in data imbalance. This imbalance may bias the model toward common characteristic wave types during learning and prediction. Although most deep learning methods perform worse on QTDB than digital processing methods, they generally yield more accurate results when trained on LUDB, because LUDB has more accurate annotations and a larger data volume, especially when the lead information is fixed. A model trained on the LUDB dataset and then trained on the QTDB dataset can therefore achieve a better result.
S9, model characteristic wave characteristic extraction visualization.
As shown in fig. 9, the class activation mapping (CAM) method is used to display gradient visualization information of the model provided by the invention for characteristic waves with four different attributes, which respectively correspond to P-wave inversion, T-wave towering and overall waveform abnormality. These visualizations show that the focus of the model remains centered on the characteristic wave bands. Thanks to the correction and enhancement of the data by the filter proposed by the invention, the model pays the same degree of attention to different kinds of heart disease. In the visualization, the class activation appears red at the peaks of the P wave, the QRS wave and the T wave, indicating similar degrees of activation in these regions. This shows that the model maintains the same segmentation accuracy when dealing with different diseases.