WO2023118556A1

WO2023118556A1 - System and method for predicting an intracranial pressure of a patient

Info

Publication number: WO2023118556A1
Application number: PCT/EP2022/087699
Authority: WO
Inventors: Nils SCHWEINGRUBER
Original assignee: Universitatsklinikum Hamburg Eppendorf; Universitaet Hamburg
Current assignee: Universitatsklinikum Hamburg Eppendorf; Universitaet Hamburg
Priority date: 2021-12-23
Filing date: 2022-12-23
Publication date: 2023-06-29
Anticipated expiration: 2024-06-23

Abstract

The invention relates to a system for predicting a parameter indicative of an intracranial pressure of a patient, the system comprising a medical data providing unit configured for providing medical data associated with the patient, a model providing unit configured for providing a model trained by machine learning for predicting the parameter indicative of an intracranial pressure on the basis of said medical data, and a predicting unit configured for predicting the parameter indicative of an intracranial pressure using the provided trained model based on the provided medical data.

Description

System and method for predicting an intracranial pressure of a patient

The invention relates to a system and to a method for predicting a parameter indicative of an intracranial pressure of a patient. The invention also relates to a training unit and a training method for training a model by machine learning for predicting a parameter indicative of an intracranial pressure of a patient.

Treatment of patients suffering from neurological conditions such as intracerebral haemorrhage (ICH), subarachnoid haemorrhage (SAH), ischemic stroke, and traumatic brain injury (TBI) often includes admittance to an intensive care unit (ICU). Yet, many diseases treated in an intensive care unit can lead to an elevated intracranial pressure (ICP).

Particularly critical are long-sustained pressure phases of an elevated intracranial pressure. Such critical phases should be avoided to protect affected and non-affected surrounding brain tissue from secondary deterioration or entrapment.

Treatment of an elevated intracranial pressure should be initiated immediately in order to keep the periods of an elevated intracranial pressure as short as possible. Intensive care unit regimens mainly depend on the usage of pharmacological agents for either the treatment of a brain oedema or induction of a deep sedation for reduction of the brain metabolism. If these actions do not control the intracranial pressure, a decompressive craniectomy can be considered.

Today, clinical management of an elevated intracranial pressure often relies on invasive intracranial pressure measurement via external ventricular drainage or intraparenchymal probe. However, invasive intracranial pressure monitoring may bear several risks to the patient such as infections, brain tissue lesions, or haemorrhage.

Therefore, alternative approaches to an invasive intracranial pressure monitoring have been pursued, aiming for a non-invasive prediction of an intracranial pressure. For example, in the article "Intracranial pressure based decision making: Prediction of suspected increased intracranial pressure with machine learning. PLoS One. 2020;15(10 October):1-14. doi:10.1371/journal.pone.0240845", T. Miyagawa et al. suggest using supervised machine learning to predict suspected increased ICP based on computed tomography (CT) measurements.

Another way of non-invasive intracranial pressure monitoring is provided with the software package ICM+ of Cambridge Enterprise Ltd. With ICM+, it is possible to monitor intracranial pressure in real-time using transcranial Doppler ultrasonography (TCD).

The present invention is based on the objective of providing an improved system for predicting a parameter indicative of an intracranial pressure of a patient. A further objective is to provide an improved method for predicting a parameter indicative of an intracranial pressure of a patient. A further objective is to provide an improved training system and an improved training method for training a model by machine learning for predicting a parameter indicative of an intracranial pressure of a patient.

With regard to the system, the objective is achieved by a system implementing the features according to claim 1.

According to the invention, a system is proposed for predicting a parameter indicative of an intracranial pressure of a patient. The system comprises a medical data providing unit, a model providing unit and a predicting unit.

The medical data providing unit is configured for providing medical data associated with the patient. The model providing unit is configured for providing a model trained by machine learning for predicting the parameter indicative of an intracranial pressure on the basis of said medical data. The predicting unit is configured for predicting the parameter indicative of an intracranial pressure using the provided trained model based on the provided medical data.

The invention includes the recognition that severely ill patients admitted to an intensive care unit are at risk of deteriorating within a short period of time and therefore should be more closely monitored than any other patient in a hospital. The medical regimen often is adapted according to the alarms of a monitoring system if parameters outside a pre-set range are detected. Laboratory values, imaging data, and other information are interpreted by physicians and allow for further treatment decisions. Due to a great amount of constantly changing information, which is acquired in real time, not all discrete changes, particularly their combinations in all values, might be fully recognized by the medical staff. Setting individual alarm thresholds often is not standardized, leading to alarm fatigue, which jeopardizes alarm safety. Furthermore, patterns that are believed not to be associated with the current medical problem can go unnoticed.

In fact, the intracranial pressure, in particular, is affected by a multitude of factors, many of which are monitored on an intensive care unit, but the complexity of the resulting patterns often limits their clinical use. A further challenge comes from the vast amount of data generated, which limits the ability even of experienced specialists to fully process and interpret the monitoring results of an intensive care unit. In fact, sometimes patterns become too multidimensional and complex to reproducibly draw conclusions from them such that optimal decisions cannot be made anymore.

With the system according to the invention, it is possible to make a multitude of factors monitored on an intensive care unit and contained in recorded medical data available in such a way that they can be reliably employed for assisting clinical management of patients undergoing invasive intracranial pressure monitoring. This is accomplished by the system, since it has a model providing unit configured for providing a model trained by machine learning for predicting the parameter indicative of an intracranial pressure on the basis of said medical data. The trained model can be fed with medical data of a patient and, by using the system's predicting unit, the parameter indicative of an intracranial pressure can be predicted using the provided trained model. Since the model is trained by machine learning to predict the parameter indicative of an intracranial pressure, it may reliably assist medical staff in monitoring a patient's intracranial pressure on an intensive care unit. Moreover, the system may be capable of reliably predicting the parameter indicative of an intracranial pressure even from large amounts of highly complex data by performing, e.g., automated data processing on the medical data, looking for patterns in the medical data, and/or classifying the medical data. It is a further advantage of the system that reliably predicting a parameter indicative of an intracranial pressure may even be possible independently of the underlying disease.

In the framework of the present description, the term "predicting" particularly comprises that an intracranial pressure is determined that is expected to occur in the future, i.e., in the sense of a forecast. In particular, there is a prediction time point, at which the system provides a prediction about a parameter indicative of an intracranial pressure to be expected in the future. The prediction is based on the medical data associated with a patient for whom the parameter indicative of an intracranial pressure is predicted. Thus, for making a prediction, the model processes medical data that are fed into the model. "Predicting" may include a general statement that a certain intracranial pressure, e.g., above a predefined threshold value, is to be expected in the future. "Predicting" may also include a statement about when a certain intracranial pressure is to be expected in the future, i.e., by additionally providing a time interval after which a certain intracranial pressure is expected to occur. For example, a distance between a prediction time point and a predicted intracranial pressure may lie between 30 minutes and 24 hours.

The predicted parameter indicative of an intracranial pressure of a patient can represent directly an intracranial pressure or a value or quantity related to the intracranial pressure, e.g., behaving proportional or comparable to the intracranial pressure.

The medical data providing unit can be a storage medium on which medical data are stored, a measurement unit for recording and providing medical data, a monitoring unit such as an intensive care unit, or a receiver or a transceiver for receiving medical data.

The model providing unit may comprise, e.g., store, a computer program comprising the model trained by machine learning for predicting the parameter indicative of an intracranial pressure.

The predicting unit may comprise a processor for executing a computer program comprising the trained model and being stored in the model providing unit for predicting the parameter indicative of an intracranial pressure using the provided trained model based on the medical data provided by the medical data providing unit.

Medical data provided by the medical data providing unit can be raw data as generated by a measurement unit or device or can be pre-processed medical data. Medical data can represent or include, e.g., patient data such as age, weight, height. Alternatively or additionally, medical data can also represent or include a medical diagnosis, e.g., about the disease to be treated. Alternatively or additionally, medical data can also represent or include vital signs of a patient. Alternatively or additionally, medical data can also represent or include laboratory values such as a C-reactive protein (CRP) parameter or a white blood cell parameter of a patient. Alternatively or additionally, medical data can also represent or include a medication provided to a patient. Medical data can also represent or include blood gas samples or a blood gas analysis (BGA) of a patient. Alternatively or additionally, medical data can also represent or include imaging data, e.g., computed tomography data, recorded for the patient. Alternatively or additionally, medical data can also represent or include genetic information of the patient. Certain laboratory values, e.g., C-reactive protein and white blood cells, are typically obtained once a day, whereas a blood gas analysis is typically performed at least every four hours in patients under invasive ventilation. Vital parameters such as an intracranial pressure often are recorded many times during an hour, e.g., with a five-minute resolution. Alternatively or additionally, medical data can also represent or include a continuous medication, e.g., a continuous dosage of opioids.

Pre-processing of raw medical data may include an averaging, e.g., over one hour. The averaged medical data can be provided as medical data by the medical data providing unit and used as input for the trained model for predicting the parameter indicative of an intracranial pressure.
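By way of a non-limiting illustration, the averaging over one hour mentioned above may be sketched as follows. The function name hourly_means, the representation of raw data as (minutes since admission, value) pairs and the example readings are assumptions made for this sketch only and are not taken from the description.

```python
from collections import defaultdict
from statistics import mean


def hourly_means(samples):
    """Average (minutes_since_admission, value) samples per full hour.

    Returns a dict mapping the hour index to the mean of all values
    recorded within that hour. Hours without samples are simply absent.
    """
    buckets = defaultdict(list)
    for minute, value in samples:
        buckets[minute // 60].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical intracranial pressure readings in mmHg, five-minute resolution.
icp_samples = [(0, 12.0), (5, 14.0), (10, 13.0), (60, 20.0), (65, 22.0)]
print(hourly_means(icp_samples))
```

Hours for which no value was recorded do not appear in the result, so that a subsequent imputation step may decide how to treat them.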

Alternatively or additionally, a pre-processing of medical data may include normalization of the medical data, e.g., using Z-standardization, Yeo-Johnson transformation coupled with a Z-standardization, or min-max normalization.
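The normalization variants named above may, for example, be sketched as follows. The function names are chosen for illustration only. In particular, the Yeo-Johnson parameter lam is passed in as a fixed value here, which is an assumption of this sketch; in practice it is usually estimated from the data.

```python
import math
from statistics import mean, pstdev


def z_standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly so that the smallest value is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def yeo_johnson(y, lam):
    """Yeo-Johnson power transformation of a single value y."""
    if y >= 0:
        if lam == 0:
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -math.log1p(-y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)


# Yeo-Johnson transformation coupled with a Z-standardization.
raw = [4.0, 9.0, 15.0, 40.0]  # hypothetical, right-skewed laboratory values
print(z_standardize([yeo_johnson(v, 0.0) for v in raw]))
```

With lam equal to 1 the Yeo-Johnson transformation leaves the values unchanged, which offers a simple check of the implementation.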

Alternatively or additionally, a pre-processing of medical data may include imputation of missing data, e.g., by employing iterative imputation, mean imputation, median imputation, or filling with zeros or with minus one. Iterative imputation has been shown to work particularly well. Median imputation has the advantage of not introducing a strong bias and correlations into input features. A continuous medication may be assigned with 0 when not given.
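As a minimal sketch of two of the simpler imputation variants named above, median imputation and filling with a constant may be written as follows. Representing a missing value by None and the function names are assumptions of this sketch; iterative imputation is not shown.

```python
from statistics import median


def impute_median(column):
    """Replace missing entries (None) by the median of the observed entries."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot compute a median without observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


def impute_constant(column, fill=0.0):
    """Replace missing entries (None) by a constant such as 0 or -1."""
    return [fill if v is None else v for v in column]


# Hypothetical hourly values; None marks an hour without a measurement.
sodium = [138.0, None, 141.0, None, 145.0]
opioid_rate = [None, 2.5, 2.5, None, None]  # continuous medication, 0 when not given

print(impute_median(sodium))
print(impute_constant(opioid_rate, fill=0.0))
```

In a training setting the median would typically be computed once on the training data and reused for new patients; the sketch computes it on the given column only for brevity.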

Preferably, the predicting unit is configured for predicting an evolution of the parameter indicative of an intracranial pressure as a function of time based on the provided medical data. The evolution may represent an expected trend or trajectory of the intracranial pressure during a certain time interval. For example, the evolution may represent an extrapolation of the intracranial pressure that is based on the medical data recorded for the patient in the past and fed into the trained model as input for predicting the parameter indicative of the intracranial pressure. Furthermore, the evolution of the parameter may represent an extrapolation of the intracranial pressure as measured with an invasive or non-invasive intracranial pressure measurement.

Preferably, the system further comprises a critical phase determining unit configured for determining in advance, based on the predicted evolution of the parameter indicative of an intracranial pressure, an occurrence of a critical phase during which an intracranial pressure is expected to lie above a predefined threshold value over a predefined period of time.

A critical phase, preferably, is defined by a time span during which intracranial pressure lies above a predefined intracranial pressure threshold value. The threshold value may represent a transition into an intracranial pressure regime in which damaging of affected and non-affected surrounding brain tissue can be expected. Since for damaging of affected and non-affected surrounding brain tissue the intracranial pressure has to lie at least for a certain time above the predefined intracranial pressure threshold value, the predefined period of time is selected to represent a minimum time duration after which damaging of affected and non-affected surrounding brain tissue can be expected.

For example, a predefined intracranial pressure threshold value can be set at an intracranial pressure value of between 20 mmHg and 24 mmHg, e.g., at 22 mmHg. A critical phase may be defined by an intracranial pressure lying above the predefined threshold for 1 hour or more, e.g., 2 hours or more. It may be advantageous to differentiate short critical phases, e.g., being equal to or less than two hours, from long critical phases, e.g., exceeding two hours. With the system, it may thus be possible to predict a starting point of a critical phase as well as an end point of the critical phase, i.e., a start and a duration of a critical phase. Depending on whether a critical phase is considered a short critical phase or a long critical phase, treatment of a patient may be adapted accordingly.
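The definition of a critical phase given above may be illustrated by the following sketch, which operates on an hourly series of (predicted or measured) intracranial pressure values. The function name, the hourly resolution and the example series are assumptions of this sketch; the threshold of 22 mmHg and the two-hour boundary between short and long critical phases are the example values named above.

```python
def find_critical_phases(hourly_icp, threshold=22.0, min_hours=1, long_after_hours=2):
    """Return (start_hour, duration_hours, kind) for each run above threshold.

    A run counts as a critical phase if it lasts at least min_hours.
    It is "short" if it lasts at most long_after_hours, otherwise "long".
    """
    phases = []
    run_start = None
    # A trailing sentinel below any threshold closes a run at the series end.
    for hour, value in enumerate(list(hourly_icp) + [float("-inf")]):
        if value > threshold:
            if run_start is None:
                run_start = hour
        elif run_start is not None:
            duration = hour - run_start
            if duration >= min_hours:
                kind = "long" if duration > long_after_hours else "short"
                phases.append((run_start, duration, kind))
            run_start = None
    return phases


# Hypothetical hourly intracranial pressure values in mmHg.
icp = [18.0, 23.0, 24.0, 19.0, 25.0, 26.0, 27.0, 23.0, 20.0]
print(find_critical_phases(icp))
# prints [(1, 2, 'short'), (4, 4, 'long')]
```

A value exactly equal to the threshold is not counted as critical in this sketch, since the description refers to values lying above the threshold.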

Preferably, the critical phase determining unit is configured for determining a critical phase in advance, at a distance between the prediction time point and the upcoming critical phase that is equal to or larger than 0.5 hours, or equal to or larger than 1 hour, or equal to or larger than 2 hours, in particular, between 1 hour and 24 hours. With the critical phase determining unit it may thus be possible to predict an upcoming critical phase up to 24 hours in advance.

Furthermore, with the critical phase determining unit it may also be possible to predict a time duration over which a critical phase is expected to last. For example, a maximum of two consecutive critical hours may be defined as a short critical phase. Accordingly, more than two consecutive critical hours may be defined as a long critical phase. The critical phase determining unit may also be configured for providing critical phase data indicative of a critical phase to be expected. These critical phase data may be visualized on a display, e.g., to a clinician. In particular, for a clinician, two hours is typically enough time to prepare an anticipated reaction in clinical settings to an upcoming critical phase, like adapting sedative medication, addressing invasive breathing conditions, or considering other more invasive procedures.

Preferably, the provided trained model comprises a recurrent neural network unit comprising or being a recurrent neural network. In a recurrent neural network (RNN), preferably, the outputs of all neurons are connected to the inputs of all neurons. Since in a recurrent neural network the output is fed back into the input layer of the model during training, the recurrent neural network can learn time-dependent information particularly well. Recurrent neural networks have the advantage that they can use their internal state, i.e., an internal memory, to process variable length sequences of inputs. Recurrent neural networks can also handle input with missing data and thus may be robust to missing features and to raw datasets from various clinical sources. In addition to dealing with variable input length, the recurrent neural network unit has the advantage of being able to deal with bias-free imputation and individual per time step calculable gradient-based feature importance. In particular, details with regard to the feature importance will be provided further below in the present description. Using a trained model comprising a recurrent neural network unit that may deal with variable length sequences of inputs as well as with missing data may thus lead to a much more precise and anticipated treatment of an elevated intracranial pressure of a patient.

It is preferred that the recurrent neural network unit comprises or is a Long Short-Term Memory (LSTM) cell. A Long Short-Term Memory cell may operate on arbitrary sequence lengths and decide what information to remember or forget. A Long Short-Term Memory cell may have feedback connections and may be capable of processing not only single data points, but also entire sequences of data. A Long Short-Term Memory cell may be composed of a cell, an input gate, an output gate and a forget gate. The cell may be configured for remembering values over arbitrary time intervals and the three gates may be configured for regulating a flow of information into and out of the cell. Using a recurrent neural network unit comprising or being a Long Short-Term Memory cell has the advantage that medical data can be classified and processed to make predictions based on medical data recorded over certain time intervals.

Preferably, the provided trained model further comprises a first feedforward neural network unit configured for feeding information into the recurrent neural network unit and a second feedforward neural network unit configured for reading processed information out of the recurrent neural network unit and for providing the parameter indicative of an intracranial pressure. It is further preferred that at least one of the first and second feedforward neural network units comprises at least one multi-layer perceptron (MLP) layer. A multi-layer perceptron layer may consist of at least three layers of nodes that are an input layer, a hidden layer and an output layer. Except for the input nodes, each node may be a neuron that uses a nonlinear activation function.
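The interplay of the cell state with the input gate, the forget gate and the output gate of a Long Short-Term Memory cell as described above may be illustrated by the following sketch of a single time step. It follows the commonly used formulation of a Long Short-Term Memory cell without peephole connections; the parameter layout, the random initialization and all names are assumptions of this sketch and do not represent the trained model of the description.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Matrix-vector product plus bias, using plain lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One time step: returns the new hidden state h and cell state c."""
    z = list(x) + list(h_prev)  # current input concatenated with previous output
    i = [sigmoid(v) for v in affine(params["Wi"], z, params["bi"])]    # input gate
    f = [sigmoid(v) for v in affine(params["Wf"], z, params["bf"])]    # forget gate
    o = [sigmoid(v) for v in affine(params["Wo"], z, params["bo"])]    # output gate
    g = [math.tanh(v) for v in affine(params["Wg"], z, params["bg"])]  # candidate values
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (hypothetical initialization)."""
    rng = random.Random(seed)
    params = {}
    for gate in "ifog":
        params["W" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
                              for _ in range(n_hidden)]
        params["b" + gate] = [0.0] * n_hidden
    return params


def run_sequence(sequence, params, n_hidden):
    """Process a sequence of arbitrary length and return the last hidden state."""
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


# Three hypothetical time steps with two normalized features each.
print(run_sequence([[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]], init_params(2, 4), 4))
```

Because the same step function is applied at every time step, the sketch accepts sequences of any length, which corresponds to the ability to operate on arbitrary sequence lengths mentioned above.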

If the model has a first feedforward neural network unit that is or has at least one multilayer perceptron layer and subsequently a recurrent neural network that is or has a Long Short-Term Memory cell, the multi-layer perceptron layer may feed information into the Long Short-Term Memory cell.

If the second feedforward neural network likewise is or has at least one multi-layer perceptron layer, this multi-layer perceptron layer of the second feedforward neural network may read the information out of the Long Short-Term Memory cell to predict the parameter indicative of an intracranial pressure of a patient. Such a multi-layered approach may allow the extraction of higher-level features from medical data. Moreover, such a multi-layered approach may have beneficial properties with respect to model robustness to deal with arbitrary sequence lengths and various degrees of data missingness. An accordingly implemented model thus may reliably predict the parameter indicative of an intracranial pressure to assist in patient treatment.

Preferably, the first feedforward neural network unit comprises at least one hidden layer for extracting a number of features from the provided medical data. The features preferably are common to at least most of the provided medical data. Preferably, the first feedforward neural network unit is configured for feeding the extracted features as input into the recurrent neural network unit.

An extracted feature may be represented by feature data extracted or generated from the medical data.

For example, the first feedforward neural network may have a multi-layer perceptron layer that is configured for taking medical data as input and feeds the input into a hidden layer for extracting features contained in the medical data. Features that can be extracted from the medical data may be patient characteristics, a diagnosis, a vital sign, a blood gas analysis, a medication, or a laboratory value.

A special pre-selection of features is not required but may be beneficial for the performance of the trained model. In particular, vital parameters, for example blood pressure parameters, and some invasive breathing parameters have been shown to be particularly suitable for an improved performance of the model. The medical data may be used as raw data and/or a minimum of pre-processing may be done, preferably, by normalizing the raw data, for example with a Yeo-Johnson transformation, and optionally by applying an outlier detection and clipping of the data with a certain percentile. Features that may be important for an individual prediction may be selected by the model itself, in particular, by the first feedforward neural network. The first feedforward neural network unit, preferably, is as large as the number of features present in the provided medical data.

It is preferred that no reduction of the number of features and/or a selection of specific features is performed in advance during pre-processing. Important features should be identified by the models through the training process. A feature that has been of particular relevance in various studies is the intracranial pressure. The system might be able to predict an occurrence of a critical phase in advance solely based on the measured intracranial pressure. Further features that have been found to be of importance for a prediction are often a patient's age, weight, a mean arterial pressure, and a mean airway pressure. For finding those features of greater importance for a prediction a feature importance unit can be employed as described below in the present description.
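The optional clipping of the data with a certain percentile may, for example, be sketched as follows. The choice of the 1st and 99th percentile as defaults, the linear interpolation between neighbouring ranks and the function name are assumptions of this sketch.

```python
def clip_to_percentiles(values, lower=1.0, upper=99.0):
    """Clip values to the range spanned by the given lower and upper percentile."""
    ordered = sorted(values)

    def percentile(p):
        # Linear interpolation between the two closest ranks.
        position = (len(ordered) - 1) * p / 100.0
        lo = int(position)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (position - lo)

    low_value, high_value = percentile(lower), percentile(upper)
    return [min(max(v, low_value), high_value) for v in values]


# Hypothetical mean arterial pressure values in mmHg with one implausible reading.
readings = [78.0, 82.0, 80.0, 85.0, 300.0, 79.0, 81.0]
print(clip_to_percentiles(readings, lower=5.0, upper=95.0))
```

Values inside the percentile range are left unchanged, so that only extreme readings are affected by the clipping.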

The features extracted in the first feedforward neural network, preferably, are fed into the Long Short-Term Memory cell implementing the recurrent neural network.

The Long Short-Term Memory cell may comprise 1 to 4 stacked Long Short-Term Memory layers. It is preferred that layer normalization is applied on the output of the Long Short-Term Memory cell, followed by a dropout layer. Layer normalization may include a Yeo-Johnson transformation coupled with a Z-standardization. Alternatively, Z-standardization, only, may be employed for normalization. Another alternative way of normalization may be based on a min-max normalization. The output of the Long Short-Term Memory cell, preferably, is fed into a two-layered multilayer perceptron layer implementing the second feedforward neural network unit to predict a parameter indicative of an intracranial pressure. Based on the parameter indicative of an intracranial pressure and, in particular, based on an evolution of this parameter, it can be predicted whether a short or a long critical phase will appear during a predefined upcoming time span.
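Applying a normalization on the output of the Long Short-Term Memory cell, followed by a dropout layer, may be illustrated by the following sketch. It uses the commonly known form of layer normalization, i.e., a standardization across the entries of one output vector without learnable scale and shift, and so-called inverted dropout. Both choices, the dropout rate and all names are assumptions of this sketch and may differ from the normalization variants named above.

```python
import math
import random


def layer_norm(vector, eps=1e-5):
    """Standardize one output vector to zero mean and (almost) unit variance."""
    mu = sum(vector) / len(vector)
    variance = sum((v - mu) ** 2 for v in vector) / len(vector)
    return [(v - mu) / math.sqrt(variance + eps) for v in vector]


def dropout(vector, rate, rng, training=True):
    """Inverted dropout: zero entries with probability rate during training
    and rescale the kept entries so that the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(vector)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]


rng = random.Random(0)
lstm_output = [0.2, -0.4, 0.9, 0.1]  # hypothetical output of the last LSTM layer
normalized = layer_norm(lstm_output)
print(dropout(normalized, rate=0.2, rng=rng))
```

During prediction, the dropout layer would be called with training=False, so that the normalized output is passed on unchanged to the second feedforward neural network unit.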

Preferably, the system further comprises a feature importance unit configured for generating and providing feature importance data indicative of an importance of a certain extracted feature for the prediction of the parameter indicative of an intracranial pressure by the provided trained model.

A feature importance of a neural network model indicates a potential role of an input feature for a certain prediction made by the model. A feature importance may thus be indicative of why the model made a prediction of the parameter indicative of an intracranial pressure. This can be of interest since, in contrast to normal statistical models, neural networks can have a certain randomness of prediction, and that can also be true for their consideration of input features. With the feature importance unit it may also be possible to calculate a feature importance for a number of individual input time steps, e.g., for every input time step, and, preferably, over the whole individual past of a patient admitted to an intensive care unit. Based on a calculated feature importance it may be possible to improve clinical interpretability of a predicted parameter indicative of an intracranial pressure.

For example, from a calculated feature importance it may be possible to deduce that a critical phase has been predicted due to features representing, e.g., a measured intracranial pressure and a cerebral perfusion pressure. In another example, calculating a feature importance may indicate that features such as a higher intracranial pressure, a mean arterial pressure and a cerebral perfusion pressure may be important for a prediction of a critical phase. Further features that may be important for predicting a critical phase may be a blood gas analysis as well as sodium and bicarbonate. Feature importance may also show that continuous medication may have an influence on the model prediction. For example, a higher continuous dosage of opioids may be correlated with upcoming critical phases, while narcotics like Propofol may show an opposite effect. Feature importance may also show which laboratory values were important for predicting a critical phase, such as thrombocytes, erythrocytes, mean corpuscular haemoglobin (MCH) and blood urea nitrogen (BUN). However, feature importance may also be useful to find features of less importance for a prediction of a critical phase. Such features may for example be higher glucose and chloride.

Based on a calculated feature importance it may be possible for clinical staff to verify a prediction made by the system, e.g., as part of a verification method, e.g., by consulting and evaluating recorded medical data associated with the features found to be important for a prediction of an upcoming critical phase. It may even be possible as part of a treatment method to base a treatment decision on the calculated feature importance. For example, if certain features have been found important for an elevated intracranial pressure, treatment may be adapted based on these features. This may include providing medication that may influence these features. The calculated feature importance may also provide a guide for treating an elevated intracranial pressure. Thus, from the calculated feature importance clinical staff may know what features need to be changed to effectively reduce an elevated intracranial pressure.

Preferably, the feature importance unit is configured for using an integrated-gradient approach in combination with a gradient averaging to generate the feature importance. For example, for calculating a feature importance the method of integrated gradients as described by M. Sundararajan, et al. in "Axiomatic attribution for deep networks. In: 34th International Conference on Machine Learning, ICML 2017. Vol 7.; 2017:5109-5118." may be employed and/or for gradient averaging the SmoothGrad method as described by D. Smilkov, et al. in "SmoothGrad: Removing noise by adding noise. arXiv, 2017" can be employed. SmoothGrad creates noisy copies of an input and then averages gradients with respect to these copies. This often sharpens a resulting saliency map and removes irrelevant noisy regions.

The method of integrated gradients computes the gradient of the model's prediction output with respect to its input features and generally requires no modification to the original deep neural network.

For example, the feature importance unit may be configured to use the method of integrated gradients by taking the gradient of the model output with respect to the input for several discrete inputs and summing these gradients. These inputs may be interpolations on the path between a baseline and the input of the current patient. This baseline is supposed to be an input for which the model generates zero saliency. The neutral baseline may be constructed by taking the median over each feature and repeating these median values for as many time steps, each covering, for example, one hour, as are available for the patient for whom the parameter indicative of an intracranial pressure is to be predicted.
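The procedure described above, i.e., constructing a median baseline, interpolating between the baseline and the input of the patient, and summing the gradients at the interpolated inputs, may be sketched as follows. The function toy_model is a hypothetical stand-in for the trained model, the gradients are approximated by finite differences instead of backpropagation, and the reference data from which the medians are taken are invented; all of these are assumptions of this sketch.

```python
import math
from statistics import median


def toy_model(seq, weights=(0.08, -0.03)):
    """Hypothetical stand-in for the trained model: logistic score of the
    time-averaged weighted features. seq is a list of time steps, each a
    list of feature values."""
    z = sum(w * v for step in seq for w, v in zip(weights, step)) / len(seq)
    return 1.0 / (1.0 + math.exp(-z))


def numerical_gradient(f, seq, eps=1e-5):
    """Central-difference gradient of f with respect to every input value."""
    grads = []
    for t, step in enumerate(seq):
        row = []
        for j in range(len(step)):
            up = [list(s) for s in seq]
            down = [list(s) for s in seq]
            up[t][j] += eps
            down[t][j] -= eps
            row.append((f(up) - f(down)) / (2.0 * eps))
        grads.append(row)
    return grads


def median_baseline(reference_steps, n_steps):
    """Median over each feature, repeated for n_steps time steps."""
    n_features = len(reference_steps[0])
    medians = [median(s[j] for s in reference_steps) for j in range(n_features)]
    return [list(medians) for _ in range(n_steps)]


def integrated_gradients(f, seq, baseline, n_interp=50):
    """Sum gradients along the straight path from baseline to seq and scale
    them by the difference between input and baseline."""
    summed = [[0.0] * len(step) for step in seq]
    for k in range(n_interp):
        alpha = (k + 0.5) / n_interp  # midpoint of the k-th path segment
        point = [[b + alpha * (x - b) for x, b in zip(xs, bs)]
                 for xs, bs in zip(seq, baseline)]
        for t, row in enumerate(numerical_gradient(f, point)):
            for j, g in enumerate(row):
                summed[t][j] += g
    return [[(x - b) * s / n_interp for x, b, s in zip(xs, bs, ss)]
            for xs, bs, ss in zip(seq, baseline, summed)]


# Hypothetical hourly inputs: [intracranial pressure, mean arterial pressure].
patient = [[18.0, 85.0], [21.0, 80.0], [25.0, 74.0]]
reference = [[10.0, 90.0], [12.0, 88.0], [14.0, 86.0]]
baseline = median_baseline(reference, len(patient))
saliency = integrated_gradients(toy_model, patient, baseline)
print(saliency)
```

A useful plausibility check is that the saliencies of all features and time steps add up, to a good approximation, to the difference between the model output for the patient and the model output for the baseline.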

The feature importance unit may be further configured for augmenting the method of integrated gradients with smoothing by additionally applying the SmoothGrad method. Applying the SmoothGrad method may include repeating the method of integrated gradients N times, e.g., with N = 50, wherein each time newly generated Gaussian noise is added to the input. For example, the standard deviation of the Gaussian input noise may be between 0.0 and 0.25. The N resulting input saliencies may then be averaged.
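The repetition with newly generated Gaussian noise and the subsequent averaging may be sketched as follows. The attribution method is passed in as a function, so that the method of integrated gradients could be plugged in; the function toy_gradient used in the example is merely a hypothetical stand-in, and all names and values other than N = 50 and the noise range named above are assumptions of this sketch.

```python
import random


def smooth_attributions(attribute, x, n_repeats=50, noise_std=0.1, seed=0):
    """Average the attributions of n_repeats noisy copies of the input x.

    attribute maps a list of input values to a list of saliencies of the
    same length. Fresh Gaussian noise is drawn for every repetition.
    """
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_repeats):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        for j, a in enumerate(attribute(noisy)):
            total[j] += a
    return [t / n_repeats for t in total]


def toy_gradient(x):
    """Gradient of the hypothetical function f(x) = x[0]**2 + 3 * x[1]."""
    return [2.0 * x[0], 3.0]


print(smooth_attributions(toy_gradient, [1.0, 2.0], n_repeats=50, noise_std=0.25))
```

With a noise standard deviation of 0.0 the result coincides with the attribution of the unperturbed input, which corresponds to the lower end of the range named above.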

The feature importance unit may be further configured for conducting a feature ranking by dividing all saliencies of a patient by the total absolute sum. This way, every patient, independent of the length of intensive care unit stay, may have the same influence on the final saliency ranking. All saliencies may be grouped per hour and feature, before the sum may be calculated. Finally, the grouped sum may be divided by the total number of actual inputs of a certain feature that the provided trained model had at that time. This way, less frequent features like laboratory values can play a greater role than one-hot encoded features like the diagnosis.
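The three steps of the feature ranking described above, i.e., the normalization per patient, the grouping per hour and feature, and the division by the number of actual inputs, may be sketched as follows. Representing the saliencies of a patient as a list of (hour, feature, saliency) records that contains only actually observed inputs is an assumption of this sketch, as are the names and the example values.

```python
from collections import defaultdict


def rank_features(patients):
    """Return the mean normalized saliency per (hour, feature) group.

    patients is a list with one entry per patient; each entry is a list of
    (hour, feature, saliency) records for inputs that were actually present.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for records in patients:
        total = sum(abs(s) for _, _, s in records)
        if total == 0:
            continue  # a patient without any saliency cannot be normalized
        for hour, feature, s in records:
            sums[(hour, feature)] += s / total  # same weight for every patient
            counts[(hour, feature)] += 1        # number of actual inputs
    return {key: sums[key] / counts[key] for key in sums}


# Two hypothetical patients; the second has no mean arterial pressure input.
patients = [
    [(0, "icp", 2.0), (0, "map", -2.0)],
    [(0, "icp", 1.0)],
]
ranking = rank_features(patients)
for key, value in sorted(ranking.items(), key=lambda kv: -abs(kv[1])):
    print(key, value)
```

Sorting the groups by the absolute value of the result then yields a ranking in which rarely recorded features are not penalized for being present in only a few time steps.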

Preferably, the medical data represent at least one of a vital parameter, a laboratory value, imaging data, genetic information and medication information. For example, a vital parameter may be at least one of a measured intracranial pressure, a cerebral perfusion pressure and a mean arterial pressure. A laboratory value may represent at least one of thrombocytes, erythrocytes, mean corpuscular haemoglobin and blood urea nitrogen. Alternatively or additionally, medical data can also represent or include a continuous medication. Groups of drugs may, e.g., be defined through their active ingredients, e.g., narcotics with Propofol or Ketamine.

Preferably, for the medical data provided by the medical data providing unit such data are selected that have a comparatively low number of missing data points. Preferably, medical data comprise such data that have been recorded with an intensive care unit to which a patient has been admitted. For example, medical data may comprise vital parameters automatically stored with an intensive care unit. The medical data provided by the medical data providing unit and used for predicting a parameter indicative of an intracranial pressure of a patient are, in particular, medical data recorded for this particular patient.

Preferably, medical data related to frequently measured parameters such as an intracranial pressure include a trajectory or evolution of this measured parameter as a function of time. An evolution can also be represented by an interpolation between parameter values associated with, e.g., measured at, certain time steps. For example, medical data provided by the medical data providing unit may include medical data recorded during the time of admittance to an intensive care unit or a fraction of this time span.

With regard to the training system, a training system is proposed for training a model by machine learning. According to the invention, the training system comprises a medical training data providing unit, a model-to-be-trained providing unit and a training unit. The medical training data providing unit is configured for providing medical training data associated with a number of different patients. The model-to-be-trained providing unit is configured for providing a model that can be trained using machine learning. The training unit is configured for training the model-to-be-trained such that the trained model is configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of an individual patient. Preferably, the training unit is configured for training the model-to-be-trained employing supervised recurrent learning.

After training the model-to-be-trained, the trained model can be used as part of the system for implementing the trained model provided by the model providing unit as described above.

Medical training data used for training the model can be configured as described herein for the medical data associated with a patient for which a parameter indicative of an intracranial pressure is to be predicted with the herein-described system. In particular, medical training data should have common features with the medical data based on which a parameter indicative of an intracranial pressure is to be predicted by the system described herein. This may comprise that the medical training data and the medical data have a similar distribution of features. In particular, such features that are known to be of higher importance for the prediction of an intracranial pressure of a patient should be common to the medical training data and the medical data. One common feature should at least be a monitored, i.e., measured, intracranial pressure of respective patients. Medical training data may comprise the medical data associated with a patient for which a parameter indicative of an intracranial pressure is to be predicted with the above-described system. Medical training data, however, preferably comprise medical data recorded for various different patients, e.g., 1000 patients or more.

Preferably, medical training data are associated with patients admitted to an intensive care unit. It is particularly preferred that medical training data are associated with patients admitted to an intensive care unit in which an intracranial pressure of each of the respective patients is monitored. It may be possible that the medical training data associated with a number of different patients only have a monitored intracranial pressure as a common feature.

Preferably, medical training data are selected to have a comparatively low number of missing data points. It is further preferred that, additionally or alternatively to having a low number of missing data points, the medical training data have a comparatively high resolution of data points. A higher resolution of data points may result in a better performance of the trained model.

In particular, if the model-to-be-trained comprises a recurrent neural network cell, with the training system the model-to-be-trained may learn time dependent information since the output is fed back into the input of the model during training, as described by J. Schmidhuber in "Deep learning. Scholarpedia. 2015;10(11):32832".

A training data set comprising medical training data of various patients, e.g., one or more cohorts of patients, that were undergoing invasive intracranial pressure monitoring may be employed for training. For example, training data including medical training data of about 1000 patients have been shown to provide good training results. Preferably, further data sets including invasive intracranial pressure monitoring of another cohort of patients may be used for validation of the training result.

In one embodiment, with the medical training data providing unit an institutional cohort (ICP-ICU) of 1346 patients with invasive intracranial pressure monitoring was used to train a model comprising a recurrent neural network to predict the occurrence of intracranial pressure increases equal to or above 22 mmHg over a long time period of more than two hours within the upcoming hours. In this embodiment, external validation may be performed on patients undergoing invasive intracranial pressure measurement in two publicly available datasets, e.g., the Medical Information Mart for Intensive Care (MIMIC, number of patients = 998) and the eICU Collaborative Research Database (eICU, number of patients = 1634). In this embodiment, the training and tuning may take place on about 80% of the data from the ICP-ICU dataset used as the training set. The validation may take place on 20% of the ICP-ICU dataset used as a test set and on the whole external cohorts.

Preferably, the training system and in particular its training unit is configured to generate or provide a predefined period of time for a critical phase and a threshold value. For example, an hour may be defined as a critical phase if at least one intracranial pressure measurement was equal to or above 22 mmHg. The threshold may be chosen based on a distribution of all intracranial pressure measurements (both internal and external) in surviving patients. For example, the intracranial pressure measurements used in the above-mentioned embodiments indicated 21 mmHg as the 95th percentile. A maximum of two consecutive critical hours may be defined as a short critical phase. More than two consecutive critical hours may be defined as a long critical phase. Preferably, the training unit is configured for training the model-to-be-trained using a complete intracranial pressure trajectory of a patient.
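The definition of critical hours and of short and long critical phases may be sketched as follows. The input format (one maximum intracranial pressure value per hour, `None` where no measurement exists) and the function name are assumptions for illustration; an hour with at least one measurement at or above the threshold is equivalent to an hourly maximum at or above the threshold.

```python
def label_phases(hourly_max_icp, threshold=22.0):
    """Label each hour as 'none', 'short' or 'long'.

    hourly_max_icp: maximum ICP per hour in mmHg, None if nothing was measured
    (assumed input format). Runs of up to two consecutive critical hours are
    short critical phases, longer runs are long critical phases.
    """
    critical = [v is not None and v >= threshold for v in hourly_max_icp]
    labels = ["none"] * len(critical)
    start = None
    # The appended False acts as a sentinel that closes a run ending at the last hour.
    for i, flag in enumerate(critical + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            kind = "long" if i - start > 2 else "short"
            labels[start:i] = [kind] * (i - start)
            start = None
    return labels
```

For example, two consecutive critical hours are labelled as a short phase, whereas three consecutive critical hours are labelled as a long phase.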

The present invention also relates to a model trained by machine learning, the model being configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of a patient. The model can be configured in accordance with the provided trained model as provided by the model providing unit of the system described above. Accordingly, the model may comprise a recurrent neural network unit that, preferably, comprises or is a Long Short-Term Memory cell. The trained model may further comprise a first feedforward neural network unit configured for feeding information into the recurrent neural network and a second feedforward neural network for reading processed information out of the recurrent neural network and for providing the parameter indicative of an intracranial pressure. At least one of the first and second neural network units may comprise at least one multi-layer perceptron layer. The first feedforward neural network unit may comprise at least one hidden layer for extracting a number of features from provided medical data. The first feedforward neural network unit may be configured for feeding the extracted features as input into the recurrent neural network unit. The second feedforward neural network unit may be configured for predicting the parameter indicative of an intracranial pressure of a patient.

With regard to the method for predicting a parameter indicative of an intracranial pressure of a patient, a method is proposed comprising the steps of:

- providing with a medical data providing unit medical data associated with the patient;

- providing with a model providing unit a model trained by machine learning for predicting a parameter indicative of an intracranial pressure on the basis of said medical data; and

- predicting with a predicting unit the parameter indicative of an intracranial pressure using the provided trained model based on the provided medical data.

The method for predicting a parameter indicative of an intracranial pressure of a patient can be conducted employing the above-described system for predicting a parameter indicative of an intracranial pressure.

With regard to the training method, a method for training a model by machine learning is proposed, the method comprising the steps of:

- providing with a medical training data providing unit medical training data associated with a number of patients;

- providing with a model-to-be-trained providing unit a model that can be trained by machine learning; and

- training with a training unit the model to be trained by machine learning such that the trained model is configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of a patient.

The method for training a model by machine learning can be conducted using the above-described training system for training a model by machine learning for predicting a parameter indicative of an intracranial pressure of a patient.

Training the model-to-be-trained by machine learning with a training unit may include first applying Gaussian noise with a mean of 0 and a standard deviation up to 0.2 to simulate measurement errors and to make the model more robust towards perturbations, e.g., applied later during a saliency calculation. Gaussian noise with a mean of 0 and a standard deviation up to 0.2 may be chosen to equal 20% of the standard deviation of a Z-standardized input used for layer normalization.
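A minimal sketch of such a noise augmentation is given below. It assumes Z-standardized inputs and, as a further assumption not stated above, draws the standard deviation uniformly between 0 and the upper limit for each call; the function name is illustrative.

```python
import numpy as np

def add_input_noise(x, max_std=0.2, rng=None):
    """Add zero-mean Gaussian noise to Z-standardized inputs.

    Assumption: the standard deviation is drawn uniformly from [0, max_std]
    per call. Missing values encoded as NaN stay NaN after the addition.
    """
    rng = np.random.default_rng() if rng is None else rng
    std = rng.uniform(0.0, max_std)
    return x + rng.normal(0.0, std, size=x.shape)
```

Since a Z-standardized feature has a standard deviation of one, an upper limit of 0.2 corresponds to a perturbation of at most 20% of the feature's standard deviation.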

With the training unit, the loss between the targets and the predictions may be calculated as follows:

- applying a mask where targets are labelled NaN (not a number), i.e., where no intracranial pressure measurement is available,

- applying binary cross-entropy loss to phase predictions per phase and per patient,

- weighing the loss by the inverse distribution of the targets, e.g., weigh time steps with critical phases more than time steps with non-critical phases,

- dividing the loss per patient by the number of valid target points per patient such that every patient has the same influence on the final loss, and

- averaging the loss over all patients.

The above-described calculation of the loss between the targets and the predictions, preferably, is performed using a loss function employed for the training of the model-to-be-trained.
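The loss calculation steps listed above may be sketched as follows. The sketch assumes one array of predicted probabilities and one array of binary targets per patient, with NaN marking missing targets; the class weights are passed in as parameters, and all names are illustrative.

```python
import numpy as np

def masked_weighted_bce(preds, targets, pos_weight, neg_weight, eps=1e-7):
    """Masked, class-weighted binary cross-entropy averaged per patient.

    preds, targets: lists with one 1-D array per patient (assumed layout).
    pos_weight / neg_weight: weights for critical and non-critical time steps,
    e.g., derived from the inverse distribution of the targets.
    """
    patient_losses = []
    for p, t in zip(preds, targets):
        valid = ~np.isnan(t)  # mask: keep only time steps with a target
        if not valid.any():
            continue
        p = np.clip(p[valid], eps, 1.0 - eps)
        t = t[valid]
        bce = -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
        weights = np.where(t == 1, pos_weight, neg_weight)
        # Divide by the number of valid targets so every patient counts equally.
        patient_losses.append((weights * bce).sum() / valid.sum())
    return float(np.mean(patient_losses))
```

Normalizing per patient before averaging prevents patients with long stays from dominating the loss.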

The loss of the network may be optimized by using the ADAM optimizer as described by Diederik P. Kingma and Jimmy Ba in "Adam: A method for stochastic optimization. In: 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings. 2015" with a learning rate which may be reduced by a factor of 0.98 after every training epoch. For example, a batch size of 16 may be used for training for a total of 32 epochs, and clipping the gradient to 0.5. Training may be stopped early if a validation loss does not improve for, e.g., 5 epochs.
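The learning rate schedule and the early stopping rule may be illustrated independently of any deep learning framework. The helper names below are hypothetical, and the sketch covers only the schedule and the stopping criterion, not the optimizer or the gradient clipping.

```python
def learning_rates(initial_lr, epochs=32, decay=0.98):
    """Learning rate per epoch when it is multiplied by `decay` after every epoch."""
    return [initial_lr * decay ** epoch for epoch in range(epochs)]

def early_stop_epoch(val_losses, patience=5):
    """Index of the epoch at which training stops.

    Training stops once the validation loss has not improved for `patience`
    consecutive epochs; otherwise the last epoch index is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1
```

With a decay factor of 0.98, the learning rate after 32 epochs is still roughly half of its initial value, so the schedule acts as a gentle annealing.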

To find good hyperparameters to control the learning process of a model-to-be-trained with the training unit, the training unit may be configured for tuning several hyperparameters. For example, the training unit may be configured for employing the Tree-structured Parzen Estimator of the Optuna library and for pruning unpromising trials with the MedianPruner (cf. T. Akiba, et al. 2019. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19). Association for Computing Machinery, New York, NY, USA, 2623-2631. DOI: https://doi.org/10.1145/3292500.3330701). Unpromising trials may be trials that perform worse than the median performance of all previous trials at a given time step. For example, the training unit can be configured for tuning between 110 and 150 trials, e.g., 128 trials, per model.
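The median rule for unpromising trials may be sketched in plain Python. This is a simplified illustration of the rule as described in the text and not the implementation of the Optuna library; the function name and the data layout are assumptions.

```python
from statistics import median

def should_prune(current_value, step, previous_trials, lower_is_better=True):
    """Return True if the current trial is worse than the median of earlier trials.

    previous_trials: list of lists holding the intermediate values (e.g.,
    validation losses) of earlier trials, indexed by step (assumed layout).
    """
    reference = [trial[step] for trial in previous_trials if len(trial) > step]
    if not reference:
        return False  # nothing to compare against at this step
    m = median(reference)
    return current_value > m if lower_is_better else current_value < m
```

A trial is therefore only stopped when earlier trials provide a reference value at the same step.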

For training the model-to-be-trained, the training unit can be configured for tuning the model-to-be-trained by optimizing a learning rate, a hidden layer size, a dropout, a standard deviation of Gaussian noise on the input, a number of stacked Long Short-Term Memory cell layers, and/or a gradient accumulation. For example, the learning rate may be varied between 1e-8 and 1e-3, a hidden layer size may be chosen within the range of 32 to 512, e.g., as 32, 64, 128, 256, or 512, a dropout may be defined from 0.2 to 0.5, a standard deviation of Gaussian input noise may be between 0.0 and 0.25, a number of stacked Long Short-Term Memory cell layers may range from 1 to 4, and/or a gradient accumulation may be selected from 1 to 16, which may effectively be a multiplier of the batch size.
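The search space may be written down as a small sampling function. In this sketch, plain random sampling stands in for the Tree-structured Parzen Estimator, the learning rate is sampled on a logarithmic scale, and the dictionary keys are illustrative names; all three are assumptions. The batch size of 16 is taken from the training example given above.

```python
import random

def sample_hyperparameters(rng, batch_size=16):
    """Draw one configuration from the described ranges (random search stand-in)."""
    accumulation = rng.randint(1, 16)  # both bounds inclusive
    return {
        "learning_rate": 10 ** rng.uniform(-8, -3),  # assumed log-uniform sampling
        "hidden_size": rng.choice([32, 64, 128, 256, 512]),
        "dropout": rng.uniform(0.2, 0.5),
        "input_noise_std": rng.uniform(0.0, 0.25),
        "lstm_layers": rng.randint(1, 4),
        "grad_accumulation": accumulation,
        # Gradient accumulation multiplies the number of samples per optimizer step.
        "effective_batch_size": batch_size * accumulation,
    }
```

A call such as `sample_hyperparameters(random.Random(0))` yields one reproducible configuration.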

The invention also relates to a computer program for predicting a parameter indicative of an intracranial pressure, the computer program being configured for carrying out the method for predicting a parameter indicative of an intracranial pressure of a patient as described above, when run on a computer.

The invention further relates to a computer program for training a model by machine learning, the computer program being configured for carrying out the training method for training a model for predicting a parameter indicative of an intracranial pressure of a patient as described above, when run on a computer.

Preferred embodiments of the invention will now be explained with reference to the figures. Of these figures:

Fig. 1: schematically shows a system for predicting a parameter indicative of an intracranial pressure;

Fig. 2: shows a training system for training a model by machine learning for predicting a parameter indicative of an intracranial pressure;

Fig. 3: shows a flow diagram representing a method for predicting a parameter indicative of an intracranial pressure;

Fig. 4: shows a flow diagram representing a method for training a model by machine learning for predicting a parameter indicative of an intracranial pressure;

Fig. 5: shows a flow diagram representing an exemplary workflow of data acquisition, pre-processing, and training as well as an evolution of an intracranial pressure according to a diagnosis, to a dataset and to an outcome;

Fig. 6: shows curves representing a performance of models with different distances between prediction time point and upcoming critical phase;

Fig. 7: shows a feature importance of a prediction of long and short critical phases of intracranial pressure;

Fig. 8: shows boxplots of a sum of all saliencies per time step (hours) divided by the occurrence of the respective input feature, listed in decreasing order for long critical phases (A) and for short critical phases (B).

Fig. 1 schematically shows a system 100 for predicting a parameter 102 indicative of an intracranial pressure of a patient.

The system 100 comprises a medical data providing unit 104, a model providing unit 106 and a predicting unit 108.

The medical data providing unit 104 comprises medical data 110 associated with the patient admitted to an intensive care unit. The medical data 110 are stored in a storage medium of the medical data providing unit 104 and represent vital parameters, laboratory values, and continuous medication information recorded for the patient, e.g., by means of the intensive care unit. The medical data 110 contain several features 112 like laboratory values and vital parameters that have been measured over a previous period such as an invasively measured intracranial pressure, a cerebral perfusion pressure and a mean arterial pressure. The medical data 110 further contain information about continuous medication applied to the patient. Furthermore, the medical data 110 comprise patient specific information such as gender, age and weight. The medical data also represent a diagnosis associated with the patient.

The model providing unit 106 is configured for providing a model 114 trained by machine learning for predicting the parameter 102 indicative of an intracranial pressure on the basis of the medical data 110. The model 114 is comprised by a computer program that is stored in the model providing unit 106. The trained model 114 comprises a first feed forward neural network unit 116 comprising a multi-layer perceptron layer having an input layer, a hidden layer and an output layer with 64 nodes in each of the layers. The number of nodes in each layer may be adapted to the number of features presented in the medical data 110. The presented features are in particular the features common to the medical data. With the hidden layer, the multi-layer perceptron layer can extract a number of features 112 from the provided medical data 110. These extracted features 112 can be provided as input to a recurrent neural network 118 of the trained model 114. The recurrent neural network 118 of the trained model 114 is a Long-Short-Term-Memory cell comprising two hidden layers. Each hidden layer comprises 64 nodes. Alternatively, the Long-Short-Term-Memory cell may comprise a different number of hidden layers such as three or four hidden layers. The two hidden layers are followed by a dropout layer and another hidden layer with 64 nodes. The system 100 is configured for performing layer normalization on the output of the recurrent neural network 118, followed by a dropout layer.

The trained model 114 further comprises a second feed-forward neural network 120 comprising two multi-layer perceptron layers for reading processed information out of the recurrent neural network 118 and for providing the parameter 102 indicative of an intracranial pressure. Each of the two multi-layer perceptron layers comprises 64 nodes. If the number of nodes in the first feed forward neural network unit 116 changes, the number of nodes in the recurrent neural network 118 and the second feed-forward neural network 120 may also change accordingly, e.g., adapted to the number of features presented in the medical data 110. Preferably, a layer of the first feed forward neural network unit 116 may comprise 60 to 150 nodes, e.g., 85 to 150 nodes, depending on the number of features presented in the provided medical data 110.

For predicting the parameter 102 indicative of an intracranial pressure and in particular whether a short or a long critical phase will appear within the next couple of hours, the predicting unit 108 uses the trained model 114 provided by the model providing unit 106 and the medical data 110 provided by the medical data providing unit 104. The predicting unit 108 comprises a processor for executing the computer program storing the trained model. The predicting unit 108 is configured for using the trained model 114 for predicting an evolution of the parameter as a function of time based on the provided medical data. From such an evolution it may be possible to determine a critical phase during which the intracranial pressure lies above a predefined threshold value over a predefined period of time. To this end, the system 100 may comprise a critical phase determining unit. Such a critical phase determining unit may use a predicted evolution of the parameter 102 for determining in advance an occurrence of a critical phase. Based on a predicted critical phase, clinical staff may be instructed by the system on how to reduce an elevated intracranial pressure to prevent or at least reduce the impact of a critical phase.

The system 100 may further comprise a feature importance unit that is designed for generating and providing feature importance data indicative of an importance of a certain extracted feature for the prediction of the parameter indicative of an intracranial pressure by the provided trained model. Calculating a feature importance may ease the interpretability of a predicted parameter indicative of an intracranial pressure for clinical staff and may even provide a way of efficiently reducing an elevated intracranial pressure by varying exactly those features that have been found important for predicting a critical phase.

Fig. 2 shows a training system 200 for training a model by machine learning for predicting a parameter indicative of an intracranial pressure of a patient. The system 200 can be used for training a model such that the trained model can be used as the trained model 114 of system 100 as described with reference to Fig. 1.

The training system 200 comprises a medical training data providing unit 202, a model-to-be-trained providing unit 204, and a training unit 206.

The medical training data providing unit 202 is configured for providing medical training data 216 associated with a number of patients that were undergoing invasive intracranial pressure monitoring. In particular, all medical training data 216 include invasive intracranial pressure measurements and blood gas analysis of the patients as common features 218.

The medical training data 216 can be divided into two cohorts of patients. A first data set is a training data set 222 that is used for training a model-to-be-trained. A second data set is a validation data set 224 used for validating the training of the model-to-be-trained. Both, the training data set 222 and the validation data set 224 comprise common features 218, 218' that particularly include a measured intracranial pressure of a patient. The training data set 222 and the validation data set 224 include various data representing inter alia laboratory parameters, vital signs, and medication information as well as patient specific information of various different patients.

The model-to-be-trained providing unit 204 is configured for providing a model 208 that has a first feed forward neural network unit 210, a recurrent neural network 212 and a second feed-forward neural network unit 214 and that can be trained using machine learning.

The training unit 206 is configured for training the model 208 such that training output provided by the model is fed back into the input layer 210 of the model 208. Thereby, the recurrent neural network 212 can learn time dependent information. In particular, the training unit 206 is configured for training the model 208 to be trained such that the trained model is configured for predicting a parameter 220 indicative of an intracranial pressure on the basis of medical data of an individual patient. Preferably, the training unit 206 is configured for training the model 208 to be trained employing supervised recurrent learning.

Fig. 3 shows a flow diagram representing a method for predicting a parameter indicative of an intracranial pressure. The method can be conducted with the system 100 as described with reference to Fig. 1.

In the method, initially, a patient is admitted to an intensive care unit (step S1) and the patient's health status is closely monitored (step S2). In particular, laboratory values such as C-reactive protein and white blood cells are obtained once a day for that patient. Additionally, a blood gas analysis is performed at least every four hours. Furthermore, continuous medication, e.g., a continuous dosage of opioids, is recorded. In addition, vital parameters such as an intracranial pressure are recorded many times during an hour, e.g., with a five-minute resolution. In particular, the intracranial pressure is recorded using an intraparenchymal probe.

These data are recorded and transmitted to a medical data providing unit that stores the received data as medical data for this patient (step S3).

With a model providing unit, a model trained by machine learning for predicting a parameter indicative of an intracranial pressure on the basis of medical data is provided (step S4). The model has a recurrent neural network that comprises a Long-Short-Term-Memory cell unit. Thereby, the model is able to process variable length sequences of inputs and to handle input with missing data. Therefore, the model can learn long-term dependencies without applying heavy feature engineering. The model having a recurrent neural network is thus robust to missing features and to raw datasets from various clinical sources.

With a predicting unit, the parameter indicative of an intracranial pressure is predicted (step S5) using the provided trained model based on the provided medical data.

The above-described method can be carried out using a computer program for predicting a parameter indicative of an intracranial pressure, when run on a computer. The system as described with reference to Fig. 1 may be implemented by or as part of such a computer.

Fig. 4 shows a flow diagram representing a method for training a model by machine learning for predicting a parameter indicative of an intracranial pressure.

The training method can be conducted with the training system 200 as described with reference to Fig. 2. The training method described in the following may be carried out by a computer program for training a model by machine learning, when run on a computer. The training system described with reference to Fig. 2 may be implemented by or as part of such a computer.

In the method for training a model by machine learning, initially, medical training data associated with a number of different patients are provided with a medical training data providing unit (step T1).

Furthermore, with a model-to-be-trained providing unit, a model that can be trained by machine learning is provided (step T2). This model comprises a recurrent neural network cell.

The model is then trained with a training unit by machine learning such that the trained model is configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of an individual patient (step T3).

Training of the model particularly comprises applying Gaussian noise with a mean of 0 and a standard deviation up to 0.2 to simulate measurement errors and to make the model more robust towards perturbations applied during a saliency calculation. The loss between the targets and the predictions is calculated by applying a mask for targets where no intracranial pressure measurements are available. Afterwards, a binary cross-entropy loss is applied to phase predictions per phase and per patient. Subsequently, the loss is weighted by the inverse distribution of the targets and the loss per patient is divided by the number of valid target points per patient such that every patient has the same influence on the final loss. Eventually, the loss is averaged over all patients.

The training further comprises a tuning to find good hyperparameters for the model. In the training method, for tuning the network, the learning rate was varied from 1e-8 to 1e-3, the hidden layer size was selected as 32, 64, 128, 256, or 512, the dropout was selected between 0.2 and 0.5, the standard deviation of Gaussian input noise was varied from 0.0 to 0.25, the number of stacked Long-Short-Term-Memory layers was varied from 1 to 4 and the gradient accumulation was varied from 1 to 16.

Fig. 5 shows in (A) a flow diagram representing an exemplary workflow of data acquisition, pre-processing, and training as well as an evolution of an intracranial pressure according to a diagnosis in (B), to a dataset in (C) and to an outcome in (D).

In the workflow 500, three data sets 502, 504, 506 including medical data are initially provided (step W1).

Of these data sets 502, 504, 506, those medical data were chosen that are associated with patients that were treated in an intensive care unit and/or that were subject to invasive measurement of an intracranial pressure (step W2).

Thereby, from the initially provided data sets 502, 504, 506 three reduced data sets 508, 510, 512 are obtained (step W3). A first one of these data sets is a training data set 508 and the second and third data sets are validation data sets 510, 512.

The obtained data sets 508, 510, 512 are pre-processed to reduce noise in the data.

Further, from the data sets 508, 510, 512 common features are extracted (step W4). Such features may comprise vital signs, laboratory values, medication and blood gas analysis. Furthermore, descriptives such as age, weight, height and diagnosis may be extracted from the data sets 508, 510, 512. Blood gas analysis (BGA) and laboratory values are stored directly and automatically in a medical data providing unit of a system for predicting a parameter indicative of an intracranial pressure. Certain laboratory values, e.g., CRP and white blood cells, are obtained once a day, and a blood gas analysis is obtained at least every four hours in patients under invasive ventilation. Sampling may be performed more often when patients, e.g., suffer from critical invasive breathing situations or present an increase in the intracranial pressure.

Preferably, only continuous medication is considered for model training. Groups of drugs may be defined through their active ingredients, e.g., narcotics with Propofol or Ketamine.

The pre-processing (step W5) is evaluated through a dimensionality reduction to find the fewest differences between data sets 508, 510, 512. Pre-processing further includes that all values, when available, are averaged over one hour and that this average is used as input for the trained model provided by a model providing unit. Vital parameters like the intracranial pressure may sometimes be available one to two times in the first dataset 508 and on a five-minute resolution in the second and third dataset 510, 512.
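The hourly averaging may be sketched as follows, assuming that the measurements of one feature are given as pairs of a time stamp in minutes since admission and a value; this input format and the function name are illustrative assumptions.

```python
from collections import defaultdict

def hourly_means(samples):
    """Average all available values of one feature per hour.

    samples: iterable of (minutes_since_admission, value) pairs, where value
    is None if nothing was recorded (assumed format). Hours without any value
    are simply absent from the result.
    """
    buckets = defaultdict(list)
    for minute, value in samples:
        if value is not None:
            buckets[int(minute // 60)].append(value)
    return {hour: sum(values) / len(values) for hour, values in sorted(buckets.items())}
```

Averaging to a common one-hour grid makes data sets with different sampling rates, e.g., a five-minute resolution and one to two values, comparable as model input.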

Subsequently, intracranial hypertensive phases, also referred to as targets, are defined (step W6).

Targets, which are the variable to be predicted by the trained model, reflect critical intracranial pressure phases and may be defined based on hours with a critical intracranial pressure event. An hour may be defined as a critical phase if at least one intracranial pressure measurement is equal to or larger than, e.g., 22 mmHg, thus defining a threshold value. The threshold may be chosen based on a distribution of all intracranial pressure measurements of the data sets 508, 510, 512 in surviving patients, which in one embodiment may indicate 21 mmHg as the 95th percentile. A maximum of two consecutive critical hours may be defined as a short critical phase. More than two consecutive critical hours may be defined as a long critical phase, cf. Fig. 7. The targets may be defined according to the temporal proximity of the critical phase, e.g., 1 to 10 and 24 hours. It is preferred that targets are only defined when intracranial pressure measurements are available. Nevertheless, the complete intensive care unit trajectory of a patient may be used for training purposes.
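One possible way of deriving targets for a given temporal proximity is sketched below. It assumes per-hour phase labels ('none', 'short', 'long', or `None` where no intracranial pressure measurement exists) and interprets the temporal proximity as a look-ahead window of a given number of hours; both the label format and this interpretation are assumptions.

```python
def horizon_targets(labels, horizon, kind="long"):
    """Binary target per hour: does an hour labelled `kind` occur within the next `horizon` hours?

    labels: per-hour phase labels, None where no ICP measurement is available
    (assumed format). The target is None if the look-ahead window contains no
    measurement, so that such time steps can be masked during training.
    """
    targets = []
    for t in range(len(labels)):
        window = [l for l in labels[t + 1 : t + 1 + horizon] if l is not None]
        if not window:
            targets.append(None)
        else:
            targets.append(1.0 if kind in window else 0.0)
    return targets
```

Calling the function once per horizon, e.g., for 1 to 10 and 24 hours, would yield one target sequence per prediction distance.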

Afterwards, a model-to-be-trained is provided by a model-to-be-trained providing unit and trained by a training unit, preferably, by means of supervised recurrent learning (step W7). A common approach to deal with sequential data in the medical domain is to break sequences into fixed-size blocks. A known model such as gradient boosted trees or even simple linear regressions can then operate over all time steps at once. Sequences that are shorter than the block size can be padded to match, but very long sequences require a prediction for every part of it. This hinders the known model from learning long-term dependencies unless heavy feature engineering is applied, such as adding the long-term variance of features from the time-steps before the block. Therefore, in the workflow 500 a model trained by machine learning for predicting a parameter indicative of an intracranial pressure is employed. Preferably, the trained model includes a recurrent neural network and is robust to missing features and to raw datasets from various clinical sources. Preferably, the recurrent neural network is a Long Short-Term Memory (LSTM) that can operate on arbitrary sequence lengths and decide what information to remember or forget.

The training and tuning may be performed on 80% of the data from the training data set 508. The validation may be performed on 20% of the data from the training data set 508 and the second and third data sets 510, 512.

To obtain the Receiver Operating Characteristic (ROC) and Precision Recall (PR) curves, the corresponding Sensitivity (Recall), Specificity, and Precision may be calculated. Predictions of deep learning models range continuously between zero and one, and a threshold should be set to classify a prediction as true or false. To visualize the performance of machine learning (ML) models independently of a set threshold, ROC and PR curves are used. ROC curves illustrate the trade-off between specificity and sensitivity; a perfect classifier would have an area under the ROC curve (AUC-ROC) of one. PR curves demonstrate the trade-off between Precision (Positive Predictive Value) and Recall (Sensitivity): typically, the higher the sensitivity, the lower the positive predictive value will be. A perfect classifier would also have an area under the PR curve (AUC-PR) of one.
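How a single threshold turns continuous predictions into the quantities named above may be illustrated by the following minimal sketch. The function name and the example labels and scores are hypothetical; returning a precision of 1.0 when nothing is predicted positive is a convention assumed here, not taken from the text.

```python
def threshold_metrics(y_true, y_score, threshold):
    """Return (sensitivity, specificity, precision) at a given threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Assumed convention: precision is 1.0 if no positive prediction is made.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    return sensitivity, specificity, precision


# Hypothetical labels (1 = critical phase) and model outputs between 0 and 1.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

at_half = threshold_metrics(y_true, y_score, threshold=0.5)
print(at_half)
```

Sweeping the threshold over all observed scores and plotting the resulting pairs yields the ROC curve (sensitivity against 1 - specificity) and the PR curve (precision against recall).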

To calculate a possible accuracy of the model predicting long critical phases, the optimal threshold may be chosen as the threshold that yields the highest value when the false positive rate (FPR or 1 - Specificity) is subtracted from the true positive rate (TPR or Sensitivity).
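Maximizing the difference between TPR and FPR corresponds to what is commonly known as Youden's J statistic. A minimal sketch of such a threshold search is given below; the function names and the example data are hypothetical, and the sketch assumes that both classes are present in the labels.

```python
def tpr_fpr(y_true, y_score, threshold):
    """Return (true positive rate, false positive rate) at a threshold."""
    positives = sum(1 for truth in y_true if truth)
    negatives = len(y_true) - positives
    tp = sum(1 for t, s in zip(y_true, y_score) if t and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, y_score) if not t and s >= threshold)
    return tp / positives, fp / negatives


def best_threshold(y_true, y_score):
    """Return the candidate threshold maximizing TPR - FPR and that value."""
    best = None
    best_j = float("-inf")
    for candidate in sorted(set(y_score)):
        tpr, fpr = tpr_fpr(y_true, y_score, candidate)
        if tpr - fpr > best_j:
            best, best_j = candidate, tpr - fpr
    return best, best_j


# Hypothetical labels (1 = long critical phase) and model outputs.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

threshold, j_value = best_threshold(y_true, y_score)
print(threshold, j_value)
```

Only the observed scores are tried as candidate thresholds, since TPR and FPR can only change at those values.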

The Area under the Curve (AUC) may be calculated. The mean and the standard deviation may be calculated based on the prediction of, e.g., five independent models.
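As an illustration, the area under a ROC curve may be approximated with the trapezoidal rule and then summarized over several models. The sketch below uses hypothetical labels and scores, and the five AUC values are placeholder numbers chosen for the example, not results of the described models. The use of the sample standard deviation is likewise an assumption.

```python
from statistics import mean, stdev


def roc_points(y_true, y_score):
    """Return ROC points (FPR, TPR), starting at (0, 0), for all thresholds."""
    positives = sum(1 for truth in y_true if truth)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if t and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, y_score) if not t and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_auc(points):
    """Integrate a curve given as (x, y) points with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels and scores of a single model.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]
auc_roc = trapezoid_auc(roc_points(y_true, y_score))
print(auc_roc)

# Placeholder AUC values of five independently trained models (made up).
auc_per_model = [0.80, 0.82, 0.78, 0.81, 0.79]
auc_mean = mean(auc_per_model)
auc_std = stdev(auc_per_model)  # sample standard deviation (assumption)
print(auc_mean, auc_std)
```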

In a subsequent step (step W8) a feature importance is generated using a feature importance unit. Neural networks can have complex architectures, e.g., multiple different layers and certain randomness in classification, e.g., in the bias. In essence and practical terms, feature importance of models including a recurrent neural network indicates a potential role of an input feature for a certain prediction. In contrast to normal statistical models, neural networks can have a certain randomness of prediction and that is also true for their consideration of input features. A feature importance method normally repeats its calculation several times to build an average importance of each feature.

In the workflow 500, a feature importance may be calculated for every individual input time step and over the whole individual past of a patient. This individual feature importance may be an important factor as to why the present study was based on a sequence-to-sequence Long Short-Term Memory (LSTM) model. To calculate the feature importance, the SmoothGrad method and the method of integrated gradients may be used. In one embodiment, feature importance for the prediction of long and short phases by five independent models is calculated on the test data set 508 and the validation data sets 510, 512.
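The two attribution methods named above may be illustrated on a deliberately simple stand-in model. The following sketch uses a toy logistic model with made-up weights instead of the recurrent network, an all-zero baseline and arbitrarily chosen step and sample counts; all of these are assumptions for illustration only.

```python
import math
import random

WEIGHTS = [0.8, -0.5, 0.3]  # hypothetical weights of a toy logistic model
BIAS = -0.2


def model(x):
    """Toy model: logistic function of a weighted sum of the inputs."""
    z = BIAS + sum(w * v for w, v in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the toy model output with respect to the inputs."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Average gradients along the straight path from baseline to x."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule for the path integral
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, totals)]


def smoothgrad(x, noise_std=0.1, samples=50, seed=0):
    """Average gradients over noisy copies of the input."""
    rng = random.Random(seed)
    totals = [0.0] * len(x)
    for _ in range(samples):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        for i, g in enumerate(gradient(noisy)):
            totals[i] += g
    return [t / samples for t in totals]


x = [1.0, 2.0, 0.5]         # hypothetical input features
baseline = [0.0, 0.0, 0.0]  # assumed reference input
ig = integrated_gradients(x, baseline)
sg = smoothgrad(x)
print(ig)
print(sg)
```

For integrated gradients, the attributions sum approximately to the difference between the model output at the input and at the baseline, which provides a simple sanity check. Positive attributions push the prediction up, negative ones push it down.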

In (B), (C) and (D), intracranial pressure values (in mmHg) of a patient over the first 15 days in an intensive care unit are depicted by way of example according to the diagnosis (B), to the dataset (C), and to the outcome (D). Values are shown as a generalized additive model with standard deviation in grey.

Fig. 6 shows curves representing a performance of models with different distances between prediction time point and upcoming critical phase. Five independent models were trained on different splits of training data from dataset 508 to predict critical phases up to 24 hours in advance. The area under the curve (AUC) of the ROC curves is depicted in (A) and that of the Precision Recall curves in (B), each with the corresponding standard deviation of the five independent models (ribbon), for each trained hour (1 h to 10 h and 24 h). The performance on the underlying test data set 508 (red, top curve), on the validation data set 510 (green, middle curve) and on the validation data set 512 (bottom curve) is depicted separately.

Fig. 7 shows a feature importance of a prediction of long and short critical phases of intracranial pressure.

In (A), a representative intracranial pressure trajectory of an individual patient with invasive intracranial pressure monitoring is shown, having a long critical phase in the beginning and several shorter critical phases at the end. The individual intracranial pressure course is depicted over time (in hours). The horizontal dashed line 700 represents the predefined threshold value for the definition of a critical phase, here an intracranial pressure equal to or above 22 mmHg. The corresponding individual feature importance is shown in (B) as a heatmap to provide an overview of all important features that accounted for a given prediction. In particular, in (B), gradient-based saliencies were calculated from five independent models based on the prediction 2 hours in advance of the critical phases. All other features, which had a low influence, are not shown for that trajectory. The lines 702 connecting the saliency and intracranial pressure plot demonstrate the predictive horizon.

The prediction takes place two hours in advance, and the important features for that prediction at that time are shown. The colour scale is continuous between -1 (blue) and +1 (red). Positive values are red (to be considered as bad) because higher values of the respective feature are positively correlated with the prediction of critical phases. Negative values (blue) represent features that are negatively correlated with the positive prediction.

In (C), to give a broader view of the top features over all validation datasets, the sum of all saliencies per time step was calculated. The top (red) and bottom (blue) two features are shown for each group (Descript. = patient characteristic, diagnosis, vital signs, BGA, medication, laboratory value) and for each target, i.e., long (left) and short (right) critical phases. The lower and upper hinges of the boxplots correspond to the first and third quartiles (the 25th and 75th percentiles), the middle line to the median. The upper and the lower whiskers extend from the hinge to the largest and smallest value, respectively, that is no further than 1.5 times the interquartile range (IQR) from the hinge.
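The boxplot convention described above may be reproduced with a few lines of Python. The sketch assumes quartiles obtained by linear interpolation (the "inclusive" method of the standard library); other quartile definitions give slightly different hinges. The function name and the example numbers are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values, whisker_factor=1.5):
    """Return hinges, median, whisker ends and outliers of a boxplot."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical values, e.g. summed saliencies of one feature.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In the example, the value 30 lies more than 1.5 times the IQR above the upper hinge, so the upper whisker ends at 9 and 30 is treated as an outlier.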

From (C) it can be deduced that the intracranial pressure, the mean arterial pressure (MAP) and the CPP were the most important dynamic predictors for critical long phases. Accordingly, a higher ICP, MAP or CPP correlates with a critical phase in two hours for both long and short phases. For BGA, sodium and bicarbonate correlate with the occurrence of long critical phases, whereas higher glucose and chloride were negatively correlated with the prediction of long phases. Continuous medication also demonstrated an influence on the model prediction. A higher continuous dosage of opioids correlated with upcoming critical phases, while narcotics like Propofol showed the opposite effect. Laboratory values are less frequently acquired but can influence the model prediction. The most important values were thrombocytes, erythrocytes, mean corpuscular haemoglobin (MCH) and blood urea nitrogen (BUN). To visualize the overall feature importance over time, a heatmap can be drawn to show possible time-dependent changes in feature importance across all datasets, cf. Fig. 8.

Fig. 8 shows boxplots of the sum of all saliencies per time step (hours) divided by the number of occurrences of the respective input feature, listed in decreasing order for long critical phases (A) and for short critical phases (B).

The lower and upper hinges of the boxplots correspond to the first and third quartiles, i.e., the 25th and 75th percentiles, the middle line to the median. The upper and the lower whiskers extend from the hinge to the largest and smallest value, respectively, that is no further than 1.5 times the IQR from the hinge (A and B). A value can be negatively or positively correlated with the predicted target. If a feature is positively correlated, it is shown in red; if it is negatively correlated, it is shown in blue (A and B). This could be used intuitively in the clinic since red reflects a more negative and blue a more positive influence on the predicted critical phase. A histogram of every hour of the input sequence is shown for the first 360 hours, wherein each tile represents one hour, for long critical phases (C) and short critical phases (D). Patient descriptives and diagnoses are one-hot encoded input features. All other features are downsampled to an hourly input and can be dynamic according to their nature. Medications, i.e., only continuous medications, are shown in groups of substances.
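The two kinds of input preparation mentioned above, one-hot encoding of categorical features and downsampling of dynamic features to an hourly input, may be sketched as follows. The text does not state how values within one hour are aggregated; the hourly mean used here is an assumption, as are the function names, the category list and the example values.

```python
from collections import defaultdict
from statistics import mean


def one_hot(value, categories):
    """Encode a categorical value as a list with a single 1."""
    return [1 if value == category else 0 for category in categories]


def downsample_hourly(samples):
    """Aggregate (time_in_minutes, value) samples to one mean value per hour."""
    buckets = defaultdict(list)
    for minutes, value in samples:
        buckets[minutes // 60].append(value)
    # Assumption: the hourly value is the mean of all samples in that hour.
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical diagnosis categories and a hypothetical vital sign recording.
diagnosis_vector = one_hot("SAH", ["ICH", "SAH", "TBI"])
hourly_map = downsample_hourly([(0, 80.0), (30, 90.0), (75, 100.0)])
print(diagnosis_vector)
print(hourly_map)
```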

Claims

1. A system for predicting a parameter indicative of an intracranial pressure of a patient, the system comprising a medical data providing unit configured for providing medical data associated with the patient; a model providing unit configured for providing a model trained by machine learning for predicting the parameter indicative of an intracranial pressure on the basis of said medical data; and a predicting unit configured for predicting the parameter indicative of an intracranial pressure using the provided trained model based on the provided medical data.

2. The system of claim 1, wherein the predicting unit is configured for predicting evolution of the parameter indicative of an intracranial pressure as a function of time based on the provided medical data.

3. The system of claim 2, further comprising a critical phase determining unit configured for determining based on the predicted evolution of the parameter indicative of an intracranial pressure in advance an occurrence of a critical phase during which an intracranial pressure is to be expected to lie above a predefined threshold value for a predefined period of time.

4. The system of at least one of the preceding claims, wherein the provided trained model comprises a recurrent neural network unit.

5. The system of claim 4, wherein the recurrent neural network unit comprises or is a Long Short-Term Memory cell.

6. The system of claim 4 or 5, wherein the provided trained model further comprises a first feedforward neural network unit configured for feeding information into the recurrent neural network and a second feedforward neural network for reading processed information out of the recurrent neural network and for providing the parameter indicative of an intracranial pressure.

7. The system of claim 6, wherein at least one of the first and second neural network units comprises at least one multi-layer perceptron layer.

8. The system of at least one of claims 6 or 7, wherein the first feedforward neural network unit comprises at least one hidden layer for extracting a number of features from the provided medical data, said features being common to at least most of the provided medical data, wherein the first feedforward neural network unit is configured for feeding the extracted features as input into the recurrent neural network unit.

9. The system of claim 8, further comprising a feature importance unit configured for generating and providing feature importance data indicative of an importance of a certain extracted feature for the prediction of the parameter indicative of an intracranial pressure by the provided trained model.

10. The system of claim 9, wherein the feature importance unit is configured for using an integrated-gradient approach in combination with a gradient averaging to generate the importance feature.

11. The system of at least one of the preceding claims, wherein the medical data represent at least one of a vital parameter, a laboratory value, imaging data, genetic information and medication information.

12. The system of claim 11, wherein a vital parameter is at least one of a measured intracranial pressure, a cerebral perfusion pressure and a mean arterial pressure and/or wherein a laboratory value represents at least one of thrombocytes, erythrocytes, mean corpuscular haemoglobin and blood urea nitrogen.

13. A training system for training a model by machine learning, the training system comprising a medical training data providing unit configured for providing medical training data associated with a number of patients; a model-to-be-trained providing unit configured for providing a model that can be trained using machine learning; and a training unit configured for training the model to be trained such that the trained model is configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of a patient.

The training system of claim 13, wherein the training unit is configured for training the model to be trained employing supervised recurrent learning.

A model trained by machine learning, the model being configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of a patient.

A method for predicting a parameter indicative of an intracranial pressure of a patient, the method comprising the steps of: providing with a medical data providing unit medical data associated with the patient; providing with a model providing unit a model trained by machine learning for predicting a parameter indicative of an intracranial pressure on the basis of said medical data; and predicting with a predicting unit the parameter indicative of an intracranial pressure using the provided trained model based on the provided medical data.

A method for training a model by machine learning, the method comprising the steps of: providing with a medical training data providing unit medical training data associated with a number of patients; providing with a model-to-be-trained providing unit a model that can be trained by machine learning; and training with a training unit the model to be trained by machine learning such that the trained model is configured for predicting a parameter indicative of an intracranial pressure on the basis of medical data of a patient.

A computer program for predicting a parameter indicative of an intracranial pressure, the computer program being configured for carrying out the method of claim 15, when run on a computer.

A computer program for training a model by machine learning, the computer program being configured for carrying out the method of claim 16, when run on a computer.