GB2639695A - Interactive image segmentation using attribute volumes - Google Patents
Info
- Publication number
- GB2639695A (application GB2412184.0A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- volume
- attribute
- segmentation
- voxels
- medical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/11 — Region-based segmentation
- G06T7/0012 — Biomedical image inspection
- G06F3/0484 — GUI interaction techniques for the control of specific functions or operations
- G06F3/04845 — GUI interaction techniques for image manipulation, e.g. dragging, rotation, expansion or change of colour
- G06F3/04883 — GUI interaction techniques using a touch-screen or digitiser for inputting data by handwriting, e.g. gesture or text
- G06N20/00 — Machine learning
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06T1/20 — Processor architectures; processor configuration, e.g. pipelining
- G06T12/30
- G06T19/00 — Manipulating 3D models or images for computer graphics
- G06T19/20 — Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06T7/10 — Segmentation; edge detection
- G06T7/136 — Segmentation involving thresholding
- G06T7/155 — Segmentation involving morphological operators
- A61B2576/023 — Medical imaging apparatus involving image processing or analysis, specially adapted for the heart
- G06T2200/24 — Indexing scheme involving graphical user interfaces [GUIs]
- G06T2207/10081 — Computed x-ray tomography [CT]
- G06T2207/10088 — Magnetic resonance imaging [MRI]
- G06T2207/10104 — Positron emission tomography [PET]
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20112 — Image segmentation details
- G06T2207/30004 — Biomedical image processing
- G06T2207/30048 — Heart; cardiac
- G06T2207/30056 — Liver; hepatic
- G06T2210/41 — Medical
- G06T2219/028 — Multiple view windows (top-side-front-sagittal-orthogonal)
Abstract
A computer-implemented method of segmenting a medical volume is disclosed. The method involves receiving a selection of an attribute volume to be used for segmentation, wherein the attribute volume defines attribute data to be derived from the medical volume for each of a plurality of voxels of the attribute volume. Attribute data of the selected attribute volume from the medical volume is computed in accordance with a predetermined attribute processing routine and a representation of the medical volume is displayed in a segmentation interface. A selection of a target location for a segmentation operation is received through interaction with the representation and the segmentation operation is applied to voxels of the medical volume in a region determined based on the target location. The segmentation operation selectively applies a segmentation classification to each of a plurality of voxels of the medical volume in dependence on attribute data of the attribute volume corresponding to the voxel.
Description
INTERACTIVE IMAGE SEGMENTATION USING ATTRIBUTE VOLUMES
BACKGROUND
The present invention relates to segmentation of medical images, for example computed tomography (CT) images, magnetic resonance imaging (MRI) images or positron emission tomography (PET) images.
In the CT example, a patient (whether human or animal) may be scanned by a computed tomography (CT) scanner. The scanner may use X-rays to obtain a series of two-dimensional (2D) images of the patient. In the MRI and PET examples, the acquisition method is instead based on magnetic fields or positron emission respectively. Each 2D image represents a slice through the patient. Each pixel of each 2D image represents tissue characteristics in a slice, such as radio density for CT or magnetic resonance (or a derived metric) for MRI. A stack (that is, a sequence or series) of two-dimensional CT images or other source images may be converted by a computer into a three-dimensional (3D) image comprising voxels (sometimes also referred to as a 3D reconstruction). Each voxel of the 3D image represents a small volume. The 3D image may be used for planning a surgical operation; for example, a patient may have a damaged or diseased liver or a damaged or diseased heart. Note that while the term "image" as used herein may refer to a 2D or 3D image, 3D images will also be referred to as volumes, medical volumes or reconstruction volumes for clarity, depending on context.
To facilitate use of a 3D image (comprising voxels), the 3D image may be segmented by a computer. Segmentation is a process in which a computer recognises various portions of a 3D image. For example, for liver tissue, the segmentation process may recognise distinct portions of a 3D image such as: (i) liver tissue, (ii) blood vessels in the liver, and (iii) holes within the parenchyma of the liver. The segmented 3D image may be used for planning a surgical operation, such as removing a tumour. A surgical procedure may be planned so that the procedure, for example, avoids large blood vessels.
Conventionally, it may take an expert (such as a radiologist) about 4 to 6 hours to manually segment an image. It would be desirable to reduce the time taken for segmentation to, for example, 1 hour or, even better, to a few minutes, or less. It would also be desirable to improve the quality of a segmentation, for example by reducing the proportion of a 3D image that is incorrectly segmented. It would also be desirable to allow several users (such as clinicians or radiologists) to collaborate in the planning of a surgical procedure. For example, two or more spatially or geographically separated users may wish to collaborate in planning a surgical procedure.
Segmentation may be performed automatically, for example using approaches based on machine learning, or manually. Automatic approaches may not be sufficiently accurate, at least when used on their own. However, manual approaches can be laborious, error prone, and time consuming.
BRIEF DESCRIPTION OF INVENTION
Aspects of the invention are set out in the independent claims. Certain preferred features are set out in the dependent claims.
In one example, a computer-implemented method of segmenting a medical volume is disclosed, the method comprising: receiving a selection of an attribute volume to be used for segmentation, wherein the attribute volume defines attribute data to be derived from the medical volume for each of a plurality of voxels of the attribute volume; computing attribute data of the selected attribute volume from the medical volume in accordance with a predetermined attribute processing routine; displaying a representation of the medical volume in a segmentation interface; receiving a selection of a target location for a segmentation operation through interaction with the representation; and applying the segmentation operation to voxels of the medical volume in a region determined based on the target location, wherein the segmentation operation selectively applies a segmentation classification to each of a plurality of voxels of the medical volume in dependence on attribute data of the attribute volume corresponding to the voxel.
Embodiments of the present invention provide a segmentation editor that may be implemented as a web-based segmentation application, thereby allowing clinicians to undertake a segmentation process. The segmentation editor may semi-automatically create 3D segmentations from stacks of CT images. The segmentation editor provides an attribute-based smart "paintbrush"; the paintbrush is an editing tool that helps clinicians to perform segmentation.
FIGURES
Figure 1 shows a system for training a convolutional neural network, according to an embodiment.
Figure 2 shows a system for performing segmentation based on medical images, according to an embodiment.
Figure 3 shows a system that provides an attribute-based smart paintbrush, according to an embodiment.
Figure 4 shows a chain of attribute processing, according to an embodiment.
Figure 5 shows a branched chain of operations, according to an embodiment.
Figure 6 shows a denoised brush system that uses a curvature flow filter for segmentation, according to an embodiment.
Figure 7 shows an example of (left) a CT image and (right) an attribute volume with a denoised image of the CT image, according to an embodiment.
Figure 8 shows an example of (left) a threshold segmentation performed on a CT image and (right) threshold segmentation performed on an attribute volume of the CT image, according to an embodiment.
Figure 9 shows a vessel segmentation method, according to an embodiment.
Figure 10 shows (left) a CT image and (right) vessel segmentation performed on a vessel likelihood attribute volume, according to an embodiment.
Figure 11 shows a bone brush tool for determining regions of an image that contain bone structures, according to an embodiment.
Figure 12 shows an output of a watershed filter, according to an embodiment.
Figure 13 shows voxel classes after discriminating based on average class voxel values, according to an embodiment.
Figure 14 shows a result of a paintbrush segmentation performed on a bone likelihood attribute, according to an embodiment.
Figure 15 shows a method for processing a discrete attribute volume consisting of classes, according to an embodiment.
Figure 16 shows an example of a discrete attribute volume with classes, according to an embodiment.
Figure 17 shows an example of segmentation of liver, gall bladder, pancreas, kidneys and spleen, according to an embodiment.
DESCRIPTION
When processing medical images, segmentation is the process of partitioning medical images or volumes into multiple regions (sets of pixels or voxels) that represent anatomical structures.
While segmentation may be performed in the 2D image domain, commonly the 2D images are used to create a 3D reconstruction of the imaged object (e.g. patient organ) and the resulting reconstruction volume is then partitioned (segmented) into regions comprising sets of voxels. A goal of segmentation is to simplify and/or change the representation of an image (or images). The image or images may be labelled and therefore easier to analyse. By undertaking segmentation on a set of medical images, the collection of 2D images can be converted to 3D volumes and graphically rendered for study or treatment planning.
Embodiments of the present invention may use an inference algorithm to produce a "likely" (that is, probable) organ segmentation, based on input data such as 2D CT images. Embodiments 20 may use machine learning techniques to produce a labelled file comprising data points within a 3D voxel space (x, y, z) that are representative of, for example, the liver.
A convolutional neural network that may be used by the present invention is the 3D U-Net. U-Net networks are fully convolutional networks that are useful for medical image segmentation. The 3D U-Net consists of multiple convolutional layers that look at progressive levels of detail. Each convolutional layer moves a filter across the volume to learn patterns and structures which result in a valid segmentation output, given enough training.
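To make the structure of such a network concrete, the following is a minimal PyTorch sketch (not the patent's implementation) of a single 3D U-Net style encoder stage; the channel counts, kernel sizes and pooling choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EncoderStage(nn.Module):
    """One encoder stage of a 3D U-Net style network: two 3x3x3
    convolutions, each moving a filter across the volume, followed by
    downsampling. A full 3D U-Net stacks several such stages and mirrors
    them with a decoder connected via skip connections."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.down = nn.MaxPool3d(kernel_size=2)

    def forward(self, x):
        features = self.block(x)   # kept for the decoder's skip connection
        return features, self.down(features)

stage = EncoderStage(1, 32)
skip, pooled = stage(torch.randn(1, 1, 64, 64, 64))  # one single-channel CT patch
```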
Embodiments of the invention further provide an interactive segmentation interface providing manual or semi-automated segmentation tools in the form of a set of segmentation "brushes". These may be used to perform a manual segmentation or may be used to refine an initial segmentation output by the machine learning model mentioned above.
Figure 1 shows a system 100 for training a neural network, and shows:
- 110 a machine learning model undergoing training,
- 115 an optional pre-processing step,
- 120 a loss function, as an input to the machine learning model,
- 130a-130d labelled slices (4 are shown) from a CT scanner,
- 140a a labelled region of slice 130a,
- 150 a parameter set, indicative of the trained neural network, and
- 160 a tissue-type input to the machine learning model.
Figure 1 shows a machine learning model 110 that is undergoing training. A series of labelled slices 130a-130d of image data are presented to the machine learning model 110 that is undergoing training. In some implementations, the 2D slices are converted to a 3D reconstruction volume in a preparatory step (not shown) and the 3D volume is then input to the machine learning model. The machine learning model 110 that is undergoing training may be a convolutional neural network (CNN). The CNN may be a 3D U-Net neural network.
An optional pre-processing step 115 is described below in connection with Figure 2 (which shows a pre-processing step 215 that is similar to the pre-processing step 115).
The loss function 120 acts (as those skilled in the art will appreciate) as a guide to the machine learning model 110 that is undergoing training. In the case of a 3D U-Net neural network, the loss function 120 measures how far off the neural network's predictions are from a target. (The target is a correct segmentation of the medical volume.) The machine learning model 110 that is undergoing training makes predictions on what the segmented volume should look like. The loss function 120 calculates how different these predictions are from a real segmented volume.
Then, using this difference information, internal parameters of the machine learning model 110 that is undergoing training are adjusted to make better predictions and thus minimise the difference between its predictions and the actual segmented volume.
For the liver, binary weighted cross entropy may be used as the loss function 120. This may be calculated by assessing the amount of entropy (that is, uncertainty) that exists between the known probability distribution of the ground truth, and the probability distribution of the inferred model.
For the heart, a Sørensen-Dice coefficient (also known as a Dice score) may be used as the loss function 120. The Dice score counts how many points that are present / missing within an inferred segmentation are also present / missing in a ground truth segmentation.
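As an illustration of the two loss styles described above, here is a minimal NumPy sketch; the positive-class weight and the smoothing epsilon are assumed values, not ones specified in the patent.

```python
import numpy as np

def weighted_binary_cross_entropy(pred, target, pos_weight=2.0, eps=1e-7):
    """Binary cross entropy with extra weight on the positive (organ)
    class, comparing predicted probabilities against ground truth."""
    pred = np.clip(pred, eps, 1.0 - eps)          # avoid log(0)
    loss = -(pos_weight * target * np.log(pred)
             + (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()

def dice_score(pred_mask, truth_mask, eps=1e-7):
    """Sorensen-Dice coefficient 2|A∩B| / (|A| + |B|); 1.0 = perfect
    overlap. A Dice-based loss is typically taken as 1 - dice_score."""
    intersection = np.logical_and(pred_mask, truth_mask).sum()
    return (2.0 * intersection + eps) / (pred_mask.sum() + truth_mask.sum() + eps)
```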
Figure 1 shows 4 slices 130a-130d but more, or fewer, slices may be presented to the machine learning model 110 (or used as basis for a reconstruction volume presented to the model) that is undergoing training.
Slice 130a is shown as being labelled with data 140a. (The other slices, such as slices 130b-130d, may also be labelled with respective label data.) The label data 140 may represent a segmentation that has previously been performed by a clinician, for example a radiologist. In other words, the label data 140 represents a 'ground truth' that is used to teach the machine learning model 110 that is undergoing training.
In cases where the machine learning model 110 that is undergoing training receives data (for example, from a CT reconstruction or an MRI machine) that is already in a 3D format, the voxels of the input volume may be labelled with label data 140. The label data 140 teaches the machine learning model 110 a segmentation that has previously been performed by a clinician.
In some embodiments, the label data 140 may be generated by a machine learning model.
The tissue-type input 160 may be used to inform the machine learning model 110 that is undergoing training what type of tissue is represented by the slices 130. For example, the slices 130a-130d may represent liver tissue. Other slices (not shown) may represent heart tissue. Each slice may be associated with respective data that indicates a tissue type of the slice. The inputs 160 may be numerical data (for example, heart tissue may have a value of "57" whereas liver tissue may have a value of "86") or text data, for example in the ASCII format. In some embodiments, some or all of the slices 130 may include a tissue-type input 160. For example, as those skilled in the art will appreciate, an example of a format that is used for medical images is the digital imaging and communications in medicine (DICOM) format. The DICOM format may include metadata, such as the tissue type of the image(s). Examples of other medical image formats are: Analyze, neuroimaging informatics technology initiative (NIfTI) and MINC.
When the machine learning model 110 has been trained, the trained machine learning model 110 may be outputted as parameter set 150. The parameter set 150 may be a set of numerical weights for various parameters of the trained machine learning model 110.
Figure 1 shows a single loss function 120. In some embodiments, two or more loss functions 120 may be used. For example, a first loss function 120 may be used for a first tissue type (for example, heart tissue), resulting in a first parameter set 150. A second loss function 120 may be used for a second tissue type (for example, liver tissue), resulting in a second parameter set 150. The training data (such as the slices 130 or MRI voxel data) may be split into two or more categories. For example, only heart training data may be provided to the first loss function 120, resulting in the first parameter set 150. For example, only liver training data may be provided to the second loss function 120, resulting in the second parameter set 150.
After a neural network has been trained, the trained neural network may be used to perform inference.
As those skilled in the art will appreciate, various hyper-parameters may be set to control the model structure and training procedure. For example, the following hyper-parameters may be used: (1) Learning rate details the network's rate of learning. This parameter may be set based on common standard values found in research and literature reviews.
(2) Batch size sets the amount of data in each batch processed by the neural network. This parameter may be set based on computational limits and efficiency.
(3) A decay parameter reduces learning rate over time to help the neural network learn finer details. This parameter may be set based on common standard values found in research and literature reviews.
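For illustration only, such hyper-parameters might be collected in a configuration like the following; the values are typical literature defaults, not values taken from the patent.

```python
# Illustrative training configuration (all values are assumptions).
hyper_parameters = {
    "learning_rate": 1e-4,   # the network's rate of learning
    "batch_size": 2,         # limited by GPU memory for 3D volumes
    "lr_decay": 0.99,        # multiplicative decay applied over time
}
```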
Figure 2 shows an embodiment 200 that may use a trained neural network to perform inference and thereby perform image segmentation, and shows:
- 150 a parameter set, indicative of a trained neural network,
- 160 an optional tissue type input,
- 210 medical image data that is to be segmented,
- 215 a pre-processing step, that may be performed by the computer server 220,
- 217 a post-processing step,
- 220 a computer server,
- 230 a processor of the computer server 220,
- 240 segmented voxel data, generated by the processor 230,
- 250 a remote computer, that is remote from the server 220,
- 260 commands from the remote computer 250 to the server 220,
- 270 a rendered image, representing the segmented voxel data 240.
Figure 2 shows an embodiment 200 that may be used by a user (such as a clinician, not shown) to segment medical images 210. Optional pre-processing 215 of the medical images will be discussed in more detail below. Optional post-processing 217 that may be performed on the segmented images generated by the computer server 220 will be discussed in more detail below.
Figure 2 shows an optional tissue type input 160 to the computer server 220. Based on the tissue type input 160, the server 220 may automatically select an appropriate parameter set 150. The tissue type input 160 may be metadata in a medical image 210. Alternatively, a clinician may arrange for the server 220 to be loaded with a parameter set 150 that is suitable for liver tissue, or with a different parameter set 150 that is suitable for heart tissue.
A computer server 220 is used to perform inference (using a trained neural network, based on a parameter set 150) and thereby produce voxel data 240 indicative of a segmented medical image. For example, for a liver, each voxel (the voxels are not shown) may indicate whether a voxel contains: (i) non-liver tissue, (ii) liver parenchyma, or (iii) a blood vessel.
Medical image data 210 is presented to the computer server 220. The computer server 220 comprises a processor 230 that processes the medical image data 210. As will be explained below in more detail, one function performed by the processor 230 is to convert slices of medical image data 210 into voxel data 240. The processor 230 then uses a trained neural network (which is represented by the parameter set 150) to infer and classify the voxels.
The processor 230 may comprise, for example, one or more central processing units (CPUs), and/or one or more graphics processing units (GPUs), and/or one or more field programmable gate arrays (FPGAs). For clarity, Figure 2 does not show random access memory (RAM) or storage (such as a hard disk drive, HDD, or a solid-state disk, SSD). The parameter set 150 may be stored in RAM of the server 220. In some embodiments, two or more parameter sets 150 may be stored in the server 220. For example, a first parameter set 150 may be stored for heart tissue. A second parameter set 150 may be stored for liver tissue.
The user may send commands 260 from the remote computer 250 to the server 220. As one example of a command 260, the user may inform the server 220 whether to use a first parameter set 150 or a second parameter set 150. As another example of a command 260, and as will be explained below in more detail, the user may use a digital "paintbrush" to tell the server 220 that a region of the voxel data 240 is a particular tissue type, for example blood vessel or parenchyma. The processor 230 may then use a trained neural network (as represented by the parameter set 150) to automatically segment a region (or regions) of the voxel data 240 as belonging to the same tissue type.
The processor 230 may use the parameter set 150 to automatically produce a "likely" (that is, plausible) segmentation of the voxel data 240. Each voxel may be labelled with data (not shown) to indicate a segmentation parameter (or classification) for that voxel. For example, each voxel may be labelled with data that indicates whether the voxel is a blood vessel or is parenchyma. The user may then review and, if necessary, amend the segmented voxel data 240 (for example using the segmentation brush tool described in more detail later).
As those skilled in the art will appreciate, processing voxel data can be challenging for a computer. For example, voxel data 240 that has a three-dimensional (3D) array of 1024 x 1024 x 1024 voxels has a total of 1,073,741,824 voxels. Thus, the computer server 220 may be a relatively high-performance computer device. The remote computer 250 may be a relatively low-performance computer device that communicates with the server 220 over a communications network, for example the Internet. The server 220 may render the voxel data 240 to form a two-dimensional (2D) image. The server 220 may then send the rendered image 270 to the remote computer 250 for display to the user. A benefit of rendering the 2D image on the server 220 is that it reduces the quantity of data 270 that is transferred to the remote computer 250. For example, the rendered image 270 may have 1920 x 1080 = 2,073,600 pixels, which is dramatically fewer than the 1,073,741,824 voxels. Alternatively, images may be rendered at the remote computer based on volume data sent to the remote computer from the server.
Figure 2 shows the computer server 220 and the remote computer 250 as separate devices; the server 220 may be a cloud-based device and the computer 250 may be a web browser running on a laptop. In an alternative embodiment (not shown), the user may directly communicate with the server 220. For example, the user may use a keyboard (not shown) and a mouse (not shown) of the server 220.
Figure 2 shows a single remote computer 250. In alternative embodiments, two or more remote computers 250 may receive the rendered image 270. This can allow users who are spatially or geographically separated to participate in the planning of a surgical procedure. For example, two or more clinicians may use respective laptop computers 250 to view the rendered image 270. As will be appreciated, the server 220 performs relatively intense mathematical processing, allowing relatively low performance laptops 250 (or desktop computers that are less capable than the server 220) to display the rendered image 270.
The pre-processing 215 (and the pre-processing 115) will now be explained in more detail. These preprocessing steps may generally be performed in the 3D domain (though 2D preprocessing prior to 3D reconstruction may also be performed depending on the nature of the preprocessing operation, and there may be a combination of 2D and 3D preprocessing steps).
For liver, prior to use either within an inference procedure (Figure 2) or for training a model (Figure 1), the images may be resampled and reoriented so that the voxel size and axis orientation are uniform. The images may then be normalized, with different methods for each organ model, and the segmentations may be binarized where applicable.
Some specific pre-processing steps for the liver are:

(1) Resampling: as different images can have different voxel sizes, the images may be resampled to give a uniform voxel size. The liver may be resampled to a voxel size of 1 in each dimension, although it is possible to resample to other voxel sizes.
(2) Reorientation: to cope with images potentially having different orientations, images may be re-oriented to specific axes codes before being fed into the neural network. Images typically contain information about their orientation in their image meta-data; this enables re-orienting images to a standard orientation. Liver images may be reorientated to axes codes R, A, S. This means that the first voxel axis is left to right, the second voxel axis is posterior to anterior and the third voxel axis is inferior to superior.
(3) Hounsfield thresholding and normalization: for liver, thresholding based on the intensity values in a CT image may be performed to clip intensity values that are far away from the values usually associated with liver intensity. For example, a CT image may be clipped so that, after clipping, the image has a minimum threshold of -50 units on the Hounsfield scale and a maximum threshold of 600 units on the Hounsfield scale. Intensity values under and above these thresholds are therefore clipped up or down to the range of -50 to 600. Values in the range of -50 to 600 may then be normalized to a new range of 0 to 1, respectively (see the sketch after this list).
(4) Label binarization: if the ground truth segmentations (for the training procedure of Figure 1) are divided into more than 1 class (excluding background), a label binarization procedure may be performed. This means that all the segmented classes are combined into a single class, allowing the neural network (of Figure 1) to train on segmenting only the background and a single segmentation class.
(5) Data augmentation: the CT images may optionally be augmented with additional data.
(6) Patch sampling: if the CT images are too large to fit in the memory of a computer (whether for Figure 1 or Figure 2), the images may need to be split up into smaller patches which are then sampled from, and fed into, the neural network. For inference, each patch outputted from the neural network is assembled to make up a resulting segmentation having the same resolution as the original input CT image / volume (patch tiling is also sketched after this list).
(7) Masking and/or cropping: if previous segmentation data is available, this can be used to mask and/or crop the input volume. This can reduce the processing time and increase the accuracy of the model.
(8) Padding: padding may be applied to keep volumes at a consistent size. The volumes may be padded with the value -1000, which is the Hounsfield value of air (background).
(9) Adaptive Histogram Equalization: histogram equalization may be performed to modify the contrast in an image. An adaptive histogram equalization image filter is a superset of two or more contrast-enhancing filters. This may be done for some images to highlight specific features, for example vessels (such as blood vessels).
(10) Curvature Flow Filter: curvature flow filtering is an anisotropic diffusion method that may be used for smoothing images while preserving edges. A curvature flow filter may be applied in situations where maintaining clear edges is necessary, in addition to the benefit of smoothing. For example, a curvature flow filter may be used for blood vessel models.
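A minimal NumPy sketch of three of the liver pre-processing steps above: Hounsfield clipping and normalization (step 3), padding with the Hounsfield value of air (step 8) and patch tiling with reassembly (step 6). The patch shape and the exact-multiple assumption are simplifications for illustration, not requirements from the patent.

```python
import numpy as np

def clip_and_normalize_hu(volume, lo=-50.0, hi=600.0):
    """Step (3): clip CT intensities to the liver Hounsfield window
    [-50, 600], then rescale linearly to [0, 1]."""
    clipped = np.clip(volume.astype(np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)

def pad_to_shape(volume, shape, air_value=-1000.0):
    """Step (8): pad a volume up to a consistent size using the
    Hounsfield value of air (-1000) as background."""
    pad = [(0, t - s) for s, t in zip(volume.shape, shape)]
    return np.pad(volume, pad, mode="constant", constant_values=air_value)

def iter_patches(volume, patch=(64, 64, 64)):
    """Step (6): yield (corner, patch) pairs tiling the volume; assumes
    the volume shape is an exact multiple of the patch shape."""
    px, py, pz = patch
    for x in range(0, volume.shape[0], px):
        for y in range(0, volume.shape[1], py):
            for z in range(0, volume.shape[2], pz):
                yield (x, y, z), volume[x:x + px, y:y + py, z:z + pz]

def reassemble(patches, shape):
    """Stitch per-patch outputs back to the original resolution."""
    out = np.zeros(shape, dtype=np.float32)
    for (x, y, z), p in patches:
        out[x:x + p.shape[0], y:y + p.shape[1], z:z + p.shape[2]] = p
    return out
```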
Some specific pre-processing steps for the heart are:

(1) Reorientation: to cope with images potentially having different orientations, images may be re-oriented to specific axes codes before being fed into the neural network. Heart images may be reorientated to axes codes R, A, S (right, anterior, superior).
(2) Brightness normalization: an image-level normalization for heart images may be performed based on the lowest voxel value in the image. For example, values below 0 may be clamped to 0. Values above 2048 may be clamped to 2048. The range of 0 to 2048 may then be normalised to 0 to 1.
(3) Binary labels: ground truth segmentations (that may be used for the training of Figure 1) contain only one class excluding background. This means the blood volume is labelled as one segmented class, making the neural network train on segmenting only the background and a single segmentation class.
(4) Patch sampling (binary balanced): if the images are too large to fit in memory (whether the memory of the computer of Figure 1 or Figure 2), the images may need to be split up into smaller patches which are then sampled from, and fed into, the neural network. For inference, each of the patches outputted from the neural network is assembled to make up a resulting segmentation having the same resolution as the original input file. This process can also ensure a balanced amount of ground truth versus background.
I 5 (5) Spatialwindow size: a pre-set spatial window ize may be applied to e sure the mages are of a fixed size spatial window may he used for accurate sampling and/or to ensure that the computer (of Figure 1 or Figure 2) does not use more memory than is avail Post-processing 217 may be performed on voxel images that have been segmented by the 20 server 220. Some examples of post-processing steps are: (1) Connectivity: unconnected masses may be removed using a connectivity algorithm. The connectivity algorithm may prioritise a largest collection of yokels. All unconnected islands may be assigned as noise. The algorithm used for this removal process may be the Connected Components 3D (CC3D) algorithm. As those skilled in the art appreciate, CC3D achieves this by checking if sets of voxels (islands) are connected to the largest collection of voxels and removing the islands if they are not connected to the largest collection of voxels. To mitigate excessive voxel removal, the CC3D algorithm may be supplemented with a failsafe mode: if the number of voxels removed is higher than a pre-set threshold then the CC3D post-processing -i5-process may be stopped, returning to the original segmentation (before the use of the CC3D algorithm was attempted)..
(2) Hole Filling Algorithm: in some cases, unidentifiable voxel volumes (holes) may appear within the largest collection of voxels (such as liver parenchyma). These holes in the parenchyma are often due to lesions. Identification and marking of lesions may be performed manually and may be the responsibility of a medical specialist. There is a risk that the segmentation algorithm may incorrectly classify holes as background. To compensate for this risk, holes within the parenchyma itself are filled using a filling algorithm and classified as parenchyma.
(3) Small Island Algorithm: where appropriate, a small island removal algorithm is applied. This may be for cases where the CC3D algorithm is not used, such as in a vessel model. This is due to vessels sometimes being disconnected due to large slice thickness or incorrect classification in other parts of a vessel tree. To compensate for this, the small island removal algorithm is applied to remove islands below a pre-set threshold.
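A sketch of the three post-processing steps under stated assumptions: the failsafe and island-size thresholds are illustrative, and `cc3d` refers to the open-source connected-components-3d package implementing the CC3D algorithm mentioned above.

```python
import numpy as np
import cc3d                      # the Connected Components 3D library
from scipy import ndimage

def keep_largest_component(mask, failsafe_voxels=50_000):
    """(1) Connectivity: keep the largest connected collection of voxels;
    if too many voxels would be removed, return the original mask."""
    labels = cc3d.connected_components(mask.astype(np.uint8))
    counts = np.bincount(labels.ravel())
    counts[0] = 0                            # ignore background
    cleaned = labels == counts.argmax()
    if int(mask.sum() - cleaned.sum()) > failsafe_voxels:
        return mask                          # failsafe: abort the removal
    return cleaned

def fill_holes(mask):
    """(2) Hole filling: classify enclosed holes (e.g. lesions within
    parenchyma) as part of the surrounding structure."""
    return ndimage.binary_fill_holes(mask)

def remove_small_islands(mask, min_voxels=100):
    """(3) Small island removal: drop components below a size threshold,
    for models (e.g. vessel trees) where keeping one component is wrong."""
    labels, _ = ndimage.label(mask)
    counts = np.bincount(labels.ravel())
    keep = np.flatnonzero(counts >= min_voxels)
    return np.isin(labels, keep[keep != 0])  # never keep the background
```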
Next, the segmentation brush functionality will be described with reference to Figures 3-17. As a prelude to the discussion, some terms will be defined: (1) Brush: an interactive segmentation tool based on a 2D or 3D shape (typically a circle or sphere) by which the user can interactively fill or erase voxels in a segmented volume.
(2) Medical volume: a medical 3D voxel image, produced for example by CT or MRI. For example, this may be a CT reconstruction in which each voxel is defined by an attenuation value.
(3) Attribute volume: a derived medical volume in the form of a 3D voxel volume in which attributes assigned to voxels are derived from the source medical images or source medical volume (e.g. CT reconstruction volume) by some process or function. This includes, but is not limited to, denoising, thresholded values, gradient magnitude, likelihood volumes and other metrics. An "attribute" in this context may be any data that can be derived from the underlying reconstruction volume (e.g. from the CT attenuation values of a CT reconstruction) and/or source medical images (and possibly from other available information). The attribute assigned to each voxel in an attribute volume may be a single value (e.g. for a denoised volume) or a multi-dimensional vector (e.g. for gradients).
(4) Segmented volume: a discrete voxel volume with classification labels. This may be based on a medical volume or attribute volume. The labels may specify which voxels define various tissue 10 types such as organ structure or pathology of interest.
Embodiments of the present invention can aid a user in more easily segmenting parts of a 3D model by using an attribute "brush" tool (that is, "paintbrush") provided within an interactive segmentation application. Tools according to the present invention can also reduce the number of errors in a segmentation by auto-generating the bulk of the segmentation for the user, thereby allowing a user to focus on key parts which may then be validated by a clinician.
Segmentation brushes can be applied directly to the underlying medical volume (e.g. to the attenuation values of a CT reconstruction) but in preferred embodiments are applied to attribute volumes derived from the reconstruction in various ways.
The segmentation brushes may be used to segment a 3D medical volume from scratch (i.e. no prior segmentation performed) or to refine a segmentation generated automatically by the machine learning model (as described above) or using another automated segmentation process.
Figure 3 shows a system that provides an attribute-based smart paintbrush, according to an embodiment. As those skilled in the art will appreciate, the "paintbrush" is a computer-implemented tool that may be used by a user to modify an image that has been generated by, for example, the computer server 220. The brush tool is provided as part of an interactive segmentation application which in present examples is implemented as a web application operating via a web browser on a client device, using medical volumes / attribute volumes provided by an application back-end implemented on a remote server or cloud platform.
The brush "paints" segmentation labels onto a medical volume or attribute volume. Specifically; if a brush is applied to a particular voxel in a medical / attribute volume, the segmentation label corresponding to the brush is applied to that voxel. The segmentation label and spatial extent of the brush may be selected by the user For example; the brush may be defined with a spherical shape and a user-configured brush radius (defined in relation to the voxel volume being segmented). This defines the region of the medical attribute volume within which the brush may apply classification labels to voxels.
The user selects a target location for the brush by selecting a location in a representation of the medical volume displayed in the application interface, for example by clicking on a pixel location of a displayed 2D slice of the medical volume with a mouse or other pointing device. Whilst the user typically interacts via a 2D interface, brushes are generally applied three-dimensionally to the voxel volume based on the selected target location. The system translates the selected target location to a corresponding target voxel which determines the centre voxel for application of the segmentation brush.
Brushes typically have a spherical shape and thus an application of the brush at a target location will result in segmentation labels being applied to voxels within a sphere of defined extent (e.g. radius) centred at the target location. In some cases, brushes may trigger for all voxels within their extent, but in other situations, whether the brush triggers (i.e. applies the relevant segmentation label) may depend on the value of the medical volume or attribute volume at that voxel. For example, the brush may define a threshold or range on the attribute value, and the segmentation label for the brush is applied to a voxel if the voxel value meets the threshold or is within the range.
In certain embodiments, the paintbrush may have 3 modes:

(1) Fill / erase sphere. This paints in 3D inside a sphere, regardless of the underlying CT / MRI data.
(2) Threshold-based paint / erase. This will paint or erase segmented voxels based on whether the underlying medical voxel intensity or derived attribute volume value is inside a user-defined interval.
(3) Threshold-based connected body paint. This behaves similarly to (2), but may perform a flood fill inside a sphere starting at a paintbrush location, only growing into regions that are spatially connected with the centre of the brush. Here, "spatially connected" means spatially adjacent voxels with voxel values within the user-defined interval. Thus, the flood fill will apply the classification to voxels starting at the centre of the brush and spreading out to adjacent voxels in all directions while the attribute values of those voxels remain within the specified interval, and will then stop. Other voxels that are within the brush extent and have attribute values within the defined interval but are not spatially connected to the brush centre in the above sense will not have the classification applied. (Modes (2) and (3) are sketched in code below.)
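The following sketch illustrates brush modes (2) and (3) on a NumPy attribute volume. The spherical extent, interval and labelling follow the description above, while the function names and the connectivity choice (face-adjacent neighbours, scipy's default) are assumptions.

```python
import numpy as np
from scipy import ndimage

def sphere_mask(shape, centre, radius):
    """Boolean mask of voxels within `radius` of `centre` (voxel units)."""
    grids = np.ogrid[tuple(slice(0, s) for s in shape)]
    dist2 = sum((g - c) ** 2 for g, c in zip(grids, centre))
    return dist2 <= radius ** 2

def threshold_brush(attr, seg, centre, radius, lo, hi, label=1):
    """Mode (2): label every voxel inside the sphere whose attribute
    value lies within the user-defined interval [lo, hi]."""
    region = sphere_mask(attr.shape, centre, radius) & (attr >= lo) & (attr <= hi)
    seg[region] = label
    return seg

def connected_brush(attr, seg, centre, radius, lo, hi, label=1):
    """Mode (3): as mode (2), but restricted to the connected body
    containing the brush centre (a flood fill from the centre voxel)."""
    candidate = sphere_mask(attr.shape, centre, radius) & (attr >= lo) & (attr <= hi)
    bodies, _ = ndimage.label(candidate)          # face-adjacent connectivity
    centre_body = bodies[tuple(centre)]
    if centre_body:                               # centre voxel is in range
        seg[bodies == centre_body] = label
    return seg
```

An erase mode can reuse the same logic, writing the background label instead of a tissue label.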
Different tool types may be implemented as separate "brushes" which the user can select in the segmentation application and/or may be provided as one or more configurable brushes. For example, the application could provide different selectable brushes for the above three modes or could provide a single brush for which thresholding and flood-fill vs. spatially connected fill modes are selectable, e.g. as check boxes. For tools that involve thresholding, an attribute range within which a particular segmentation class is to be applied can be configured in the interface via sliders or other suitable interface elements.
Attribute volumes may be calculated automatically on high-performance hardware (for example, the server 220) in the cloud and then downloaded to a client device 250 for interactive segmentation. The specific attribute volumes depend on the type of the organ structure. For example, a bone segmentation will require other attributes than a vessel segmentation.
The segmentation application allows the user to view the original CT image so that their understanding of the anatomy is not biased by the derived attribute volume. A benefit of using an attribute volume as a basis for the paintbrush is that the attribute volume is better suited for semi-automatic flood fill or threshold-based fill than the original CT volume as it better describes the boundaries of the structure of interest.
In some embodiments, the segmentation interface displays either the original CT images or slices of the CT reconstruction as the representation with which the user interacts. The user can apply the tool by clicking on locations in those images to specify the brush target location, but the segmentation is then performed with respect to the derived attribute volume.
Regions in the displayed 2D representation are preferably marked according to the assigned classification labels, e.g. by shading pixels in different colours assigned to different classifications, to provide a visual indication of the segmentation. The user can navigate the segmentation volume, for example by adjusting the 2D representation to show a different slice (e.g. moving up or down the stack of CT images in the z direction, or selecting vertical slices in either the x or y direction), allowing the user to change the centre plane for the brush.
A 3D visualization / projection of the volume (again coloured according to classification labels) may also be displayed in a separate region of the interface to give a complete view of the segmentation, and the user can rotate this for inspection, zoom in or out etc. Some implementations may also support an arbitrary orientation for the plane visualised in the displayed 2D representation. In such an approach, the user positions a virtual plane in 3D space, e.g. in relation to the 3D visualisation. The interface provides suitable interface controls (e.g. buttons, drag-to-rotate interactions etc.) allowing the user to alter the position and orientation of the plane, for example to move and rotate the plane around multiple axes. The pixels on the configured plane are then sampled from the CT reconstruction volume in accordance with the configured plane to provide an updated 2D representation. This can be beneficial when the organ or pathology of interest is not easily aligned with the axial, coronal or sagittal orientations. Once the user has positioned the plane as required, they can then click on the target location for segmentation in the 2D representation. The system maps this to the corresponding target voxel (based on the slice plane and selected 2D location within that plane) and the brush is then applied centred at that voxel.
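One way to implement this plane sampling is trilinear interpolation along a parametrised plane. The sketch below uses scipy's `map_coordinates`; the plane parametrisation, output size and air fill value are assumptions, not details specified in the patent.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def sample_plane(volume, origin, u, v, size=(256, 256), spacing=1.0):
    """Sample a 2D image from `volume` on the plane through `origin`
    spanned by direction vectors `u` and `v` (voxel coordinates)."""
    u = np.asarray(u, dtype=np.float32); u /= np.linalg.norm(u)
    v = np.asarray(v, dtype=np.float32); v /= np.linalg.norm(v)
    rows, cols = np.mgrid[0:size[0], 0:size[1]].astype(np.float32)
    rows = (rows - size[0] / 2) * spacing       # centre the plane on origin
    cols = (cols - size[1] / 2) * spacing
    coords = (np.asarray(origin, dtype=np.float32)[:, None, None]
              + u[:, None, None] * rows + v[:, None, None] * cols)
    # Trilinear interpolation; voxels outside the volume read as air.
    return map_coordinates(volume, coords, order=1, cval=-1000.0)
```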
A preview of the spherical application region of the brush may also be shown in the 3D visualisation (and in cross-section in the 2D visualisation) before the brush is applied, e.g. through shading.
As mentioned above, the segmentation brush may operate on the underlying CT reconstruction volume but in preferred implementations is applied based on attributes of an attribute volume that is itself derived from the underlying medical volume. The attribute volume may be derived from the underlying medical volume (e.g. CT reconstruction) using any suitable operations. In preferred embodiments, the system allows complex (possibly branching) chains of operations to be defined for deriving an attribute volume from the underlying medical volume.
By way of example, Figure 4 shows a chain of attribute processing, according to an embodiment. Figure 4 shows two sequentially chained operations and two attribute volumes but there may be more than two.
Figure 5 shows a branched chain of operations, according to an embodiment. As shown, operation 3 receives multiple (in this case, two) attribute volumes (1 and 2) as inputs, which were themselves derived from the original medical volume (e.g. CT volume) using different operations. These attribute volumes are combined in operation 3 to produce the final attribute volume (3) used for segmentation. Other embodiments may receive 3 or more inputs.
The system may provide a library of stored attribute processing routines that can be selected by the user. Each routine may specify a single operation, or a complex set of chained and/or branched operations to produce a particular attribute volume as a basis for segmentation. In one approach, the system may offer different predefined attribute volumes for selection, such as a denoised volume, blood vessel likelihood volume or bone likelihood volume as in the examples described in more detail below. Once the user selects a particular attribute volume, the system then runs the associated stored processing routines to generate the selected attribute volume. However, in another approach, the user may also combine existing attribute processing routines to create more complex derived attribute volumes from predefined ones.
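One possible way to represent such routines is as composable volume-to-volume functions; in this sketch, `chain` models the sequential case of Figure 4 and `branch` the branched case of Figure 5. The concrete operations (a Gaussian filter standing in for denoising, plus a gradient magnitude) are placeholders, not the patent's operations.

```python
import numpy as np
from scipy import ndimage

# Placeholder volume-to-volume operations.
def denoise(vol):
    return ndimage.gaussian_filter(vol, sigma=1.0)

def gradient_magnitude(vol):
    return ndimage.gaussian_gradient_magnitude(vol, sigma=1.0)

def chain(*ops):
    """Sequential chain (Figure 4): each attribute volume feeds the next op."""
    def run(vol):
        for op in ops:
            vol = op(vol)
        return vol
    return run

def branch(combine, *ops):
    """Branched chain (Figure 5): run ops on the same medical volume,
    then combine the resulting attribute volumes in a final operation."""
    return lambda vol: combine(*[op(vol) for op in ops])

# Example: two branches joined into a two-channel attribute volume,
# loosely in the shape of the bone-likelihood pipeline of Figure 11.
routine = branch(lambda d, g: np.stack([d, g]),
                 denoise,
                 chain(denoise, gradient_magnitude))
attribute_volume = routine(np.random.rand(32, 32, 32).astype(np.float32))
```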
Selection of suitable attribute processing routine(s) to generate a particular attribute volume, together with appropriate configuration of the segmentation brush tool, can allow various complex segmentation behaviours to be implemented, some examples of which are set out below. The attribute volume to be used for a given segmentation may be chosen in a number of ways:

- Based on the organ or pathology type. For example, segmentation of parenchyma may use one derived attribute, while segmentation of a tumour would use a different derived attribute. The organ (or other structure) or pathology type may be selected explicitly by the user via the user interface or may be specified in some other way (for example as metadata of the source images or automatically detected from the source images or reconstruction volume, e.g. using a classification model).

- As a setting in the paintbrush. For example, for a threshold-based paintbrush, the user might explicitly select a denoised mode in the interface, corresponding to use of a denoised volume for segmentation.
In some embodiments, the user may independently configure the attribute processing and brush to create a bespoke segmentation tool for a particular application. In other examples, different segmentation brushes may be defined that include the necessary attribute processing and segmentation brush behaviour, e.g. to create a segmentation brush for a specific organ, a bone segmentation brush etc. The user can then simply select the configured brush and the required attribute volume is generated automatically.
As a concrete example, Figure 6 shows a denoised brush system that uses a curvature flow filter for segmentation, according to an embodiment. A denoised brush may allow a user to do segmentation on a volume which has been denoised. A benefit is that random image noise will not cause issues for threshold-based segmentation. An optimal denoising parameter may be chosen for a specific organ type. A curvature flow filter may be used to reduce noise in the original volume. A curvature flow filter may apply the partial differential equation $I_t = \kappa \, |\nabla I|$, with the contour curvature $\kappa$, to every image channel $I$. This filter reduces random noise in the image, which is typically present in CT scans with thin slices. When segmenting e.g. liver vessels where there is a low signal / noise ratio, a plain threshold-based segmentation is problematic because the vessels will be equally noisy. When instead doing the threshold-based segmentation based on the denoised attribute volume, the vessels are smooth.
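Curvature flow filtering of this kind is available off the shelf; a sketch using SimpleITK is shown below, where the file name, time step and iteration count are illustrative rather than values prescribed by the patent.

```python
import SimpleITK as sitk

# Load a CT reconstruction volume (the path is illustrative).
image = sitk.ReadImage("ct_volume.nii.gz", sitk.sitkFloat32)

# Edge-preserving curvature flow denoising, evolving I_t = kappa * |grad I|.
denoised = sitk.CurvatureFlow(image1=image,
                              timeStep=0.125,
                              numberOfIterations=5)

sitk.WriteImage(denoised, "denoised_attribute_volume.nii.gz")
```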
Figure 7 shows an example of (left) a CT image and (right) a slice of a denoised attribute volume, according to an embodiment.
Figure 8 shows an example (left) of a threshold segmentation performed on a CT image, and (right) threshold segmentation performed on an attribute volume, according to an embodiment.
Figure 9 shows a vessel segmentation method, according to an embodiment. For vessel segmentation, a threshold flood fill paintbrush may be used on a vessel likelihood attribute. The vessel likelihood is an attribute where each voxel indicates the likelihood of that voxel representing a blood vessel. The paintbrush's threshold may be set so that the paintbrush paints only voxels that have a high likelihood of being a blood vessel.
The first attribute volume may be a denoised volume, as described above.
The local contrast attribute volume may be calculated, for each voxel, based on a biased mean computed in the voxel's neighbourhood.
The local contrast is then calculated as the difference between the voxel value at each voxel and the biased mean in its neighbourhood.
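A minimal sketch of this vessel likelihood pipeline, continuing from a denoised volume as above and assuming a Gaussian-weighted neighbourhood mean stands in for the "biased mean" (the exact weighting is not specified here); the seed point and thresholds are illustrative:

```python
import SimpleITK as sitk

def local_contrast(denoised: sitk.Image, sigma_mm: float = 3.0) -> sitk.Image:
    """Local contrast attribute: voxel value minus a smoothed neighbourhood
    mean; a Gaussian mean is one plausible choice of 'biased mean'."""
    neighbourhood_mean = sitk.DiscreteGaussian(denoised, variance=sigma_mm ** 2)
    return sitk.Subtract(denoised, neighbourhood_mean)

# Contrast-filled vessels are locally bright, so a threshold flood fill on the
# local contrast volume from a user-picked seed paints only vessel-like voxels.
contrast = local_contrast(denoised)
vessels = sitk.ConnectedThreshold(contrast, seedList=[(120, 140, 60)],
                                  lower=10.0, upper=1000.0, replaceValue=1)
```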
Figure 10 shows (left) a CT image and (right) a vessel segmentation performed on a vessel likelihood attribute volume, according to an embodiment.
Figure 11 shows a bone brush for determining regions of an image that contain bone structures, according to an embodiment. This attribute volume has two branches (see also Figure 5) which are eventually joined into one operation resulting in the bone likelihood attribute. One branch produces a denoised volume; the other branch calculates a gradient magnitude attribute volume. The denoised volume and the gradient magnitude attribute volume are used as inputs for a bone likelihood attribute calculation. Based on the gradient magnitude volume, a morphological watershed image filter may be applied, yielding a discrete voxel volume that is segmented into multiple classes based on the gradient magnitude.
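A sketch of the two branches and the watershed join, again using SimpleITK as a stand-in toolkit; the watershed `level` parameter is an assumed tuning value:

```python
import SimpleITK as sitk

ct = sitk.ReadImage("ct_volume.nii.gz", sitk.sitkFloat32)

# Branch 1: denoised volume via curvature flow.
denoised = sitk.CurvatureFlow(ct, timeStep=0.0625, numberOfIterations=5)

# Branch 2: gradient magnitude attribute volume.
grad_mag = sitk.GradientMagnitude(ct)

# Join: a morphological watershed on the gradient magnitude partitions the
# volume into discrete classes separated by high-gradient boundaries.
classes = sitk.MorphologicalWatershed(grad_mag, level=20.0,
                                      markWatershedLine=False,
                                      fullyConnected=False)
```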
Figure 12 shows an output of a watershed filter, according to an embodiment, with a colour assigned to each class.
Figure 13 shows voxel classes after discriminating based on average class voxel values, according to an embodiment. A discrimination of the classes, where classes that are not likely to represent bone structures are discarded, may be performed using the following steps (a compact sketch follows the steps): (1) For each class from the watershed volume, go through each voxel in that class and calculate the average voxel value from the same voxel index in the denoised CT image.
(2) Create a new voxel volume with the same resolution.
(3) For each class where the average voxel value is above a given threshold (e.g. where the Hounsfield value (radio opacity) is high), generate a new class identifier and assign this identifier to all voxels from the same class. For each class where the average voxel value is below this threshold, assign the value 0 to all voxels in that class.
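Steps (1) to (3) amount to computing a per-class mean intensity and relabelling the class volume. A compact sketch, under the assumption that SimpleITK stands in for the system's own filters and with the Hounsfield threshold as an example value:

```python
import SimpleITK as sitk

def discriminate_bone_classes(classes: sitk.Image, denoised: sitk.Image,
                              hu_threshold: float = 200.0) -> sitk.Image:
    """Keep only watershed classes whose mean denoised CT value is bone-like;
    relabel them 1..N and send all other classes to background (0)."""
    stats = sitk.LabelStatisticsImageFilter()
    stats.Execute(denoised, classes)          # per-class intensity statistics
    change_map, next_id = {}, 1
    for label in stats.GetLabels():
        if label != 0 and stats.GetMean(label) >= hu_threshold:
            change_map[label] = next_id       # new identifier for bone class
            next_id += 1
        else:
            change_map[label] = 0             # discard non-bone class
    return sitk.ChangeLabel(classes, changeMap=change_map)
```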
In some areas parts of the bone structure may be missing because of a low gradient magnitude.
Therefore, an additional "snap to bone edge" step may be performed, using the following steps (see the sketch after this list): (1) For each voxel that has a non-zero class from the previous step, do a connected body flood fill.
(2) If a neighbouring voxel has a voxel value from the denoised CT attribute which corresponds to a reasonable bone Hounsfield value and has not yet been assigned to a class, assign the class from (1) to that voxel and continue the search to the next neighbouring voxels.
(3) Continue (2) until all connected neighbours are visited.
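A minimal NumPy sketch of this snap-to-bone-edge growth; the Hounsfield bounds and the 26-connected neighbourhood are assumptions, and a real implementation would likely use a faster region-growing primitive:

```python
import numpy as np
from collections import deque

def snap_to_bone_edge(classes, denoised_hu, lower=200.0, upper=3000.0):
    """Grow each non-zero class into neighbouring unassigned voxels whose
    denoised CT value falls within a plausible bone Hounsfield range."""
    out = classes.copy()
    # Seed the breadth-first search with every voxel that already has a class.
    queue = deque(zip(*np.nonzero(out)))
    offsets = [(dz, dy, dx) for dz in (-1, 0, 1) for dy in (-1, 0, 1)
               for dx in (-1, 0, 1) if (dz, dy, dx) != (0, 0, 0)]
    shape = out.shape
    while queue:
        z, y, x = queue.popleft()
        cls = out[z, y, x]
        for dz, dy, dx in offsets:
            nz, ny, nx = z + dz, y + dy, x + dx
            if 0 <= nz < shape[0] and 0 <= ny < shape[1] and 0 <= nx < shape[2]:
                if out[nz, ny, nx] == 0 and lower <= denoised_hu[nz, ny, nx] <= upper:
                    out[nz, ny, nx] = cls       # adopt the neighbour's class
                    queue.append((nz, ny, nx))  # continue the search from here
    return out
```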
Figure 14 shows a result of a paintbrush segmentation performed on the bone likelihood attribute, according to an embodiment.
Figure 15 shows a method for processing a discrete attribute volume consisting of classes, according to an embodiment.
Organs in the abdomen often each have a uniform radio opacity, but one that is similar from organ to organ. A traditional intensity-based flood fill segmentation will therefore typically result in over-segmentation. In a generic abdominal organ brush, an attribute volume is generated which delineates organ structures based on gradients. This provides a paintbrush which limits itself to areas that are likely to belong to the organ of interest.
As shown by Figure 15, a discrete attribute volume consisting of classes may be generated as follows: (1) A curvature flow filter is used to generate a denoised volume. (2) A gradient magnitude volume is generated.
(3) The watershed volume is generated.
Figure 16 shows an example of a discrete attribute volume with classes, according to an embodiment. Note that each organ is assigned a low number of classes, making it efficient and easy to do semi-automatic segmentation with a paintbrush.
Figure 17 shows an example of segmentation of liver, gall bladder, pancreas, kidneys and spleen, according to an embodiment. A user may interactively paint on the attribute volume.
With a flood fill based on the attribute classes, a segmentation of multiple organs can be achieved in minutes.
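Class-based painting reduces to adopting, for each brush stroke, the whole attribute classes touched by the brush. A small sketch, assuming NumPy arrays for the class volume and the segmentation label volume:

```python
import numpy as np

def paint_classes(labels: np.ndarray, classes: np.ndarray,
                  brush_voxels, organ_label: int) -> None:
    """Assign organ_label to every voxel of each attribute class touched by
    the brush, so one stroke fills whole watershed regions at once."""
    touched = {int(classes[v]) for v in brush_voxels if classes[v] != 0}
    for cls in touched:
        labels[classes == cls] = organ_label
```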
Once the user has completed the segmentation using the segmentation brush tool, the system stores and/or outputs the resulting segmented medical volume. For example, the segmentation labels applied to voxels may be stored in a data representation of the medical volume in a database, and the segmented volume may be retrieved for visualisation at a later date, transmitted to other users or systems, and/or accessed by other users through the segmentation application (e.g. for collaborative segmentation).
In addition to the 2D visualisation displayed in the interface of the segmentation application, the system may also output a 3D visualisation of the segmented medical volume via a 3D display system. Like the 2D visualisation, such a 3D visualisation can be displayed with voxels / regions coloured, shaded or otherwise marked according to segmentation classes. In some implementations, the system outputs a stereoscopic 3D visualisation of the segmented volume to an augmented reality headset, such as a Microsoft HoloLens (RTM) or Apple Vision Pro (RTM) device, or a virtual reality headset such as a Meta Quest 1/2/3, or a similar device. Holographic displays may also be supported.
The reference numerals in the claims are for the convenience of the relevant searching authority and are not to be regarded as limiting the scope of the claims.
The abstract as filed is hereby included by reference.
Claims (24)
CLAIMS
- 1. A computer-implemented method of segmenting a medical volume, the method comprising: receiving a selection of an attribute volume to be used for segmentation, wherein the attribute volume defines attribute data to be derived from the medical volume for each of a plurality of voxels of the attribute volume; computing attribute data of the selected attribute volume from the medical volume in accordance with a predetermined attribute processing routine; displaying a representation of the medical volume in a segmentation interface; receiving a selection of a target location for a segmentation operation through interaction with the representation; and applying the segmentation operation to voxels of the medical volume in a region determined based on the target location, wherein the segmentation operation selectively applies a segmentation classification to each of a plurality of voxels of the medical volume in dependence on attribute data of the attribute volume corresponding to the voxel.
- 2. A method according to claim 1, wherein applying the segmentation operation comprises: determining an application region encompassing a set of voxels of the medical volume; evaluating attribute data of the attribute volume for the voxels in the application region with respect to a classification criterion, and assigning a specified segmentation classification to voxels in dependence on the evaluation.
- 3. A method according to claim 2, wherein the segmentation operation comprises a flood fill operation which applies the segmentation classification to any voxels having attribute data meeting the classification criterion.
- 4. A method according to claim 3, wherein the segmentation operation comprises a spatially connected flood fill operation which applies the segmentation classification to voxels meeting the classification criterion which are spatially connected to a centre voxel at the target location by way of other voxels also meeting the classification criterion.
- 5. A method according to any of claims 2 to 4, wherein the classification criterion specifies a threshold or range, the segmentation operation assigning the segmentation classification to voxels in dependence on whether values of the attribute volume computed for the voxels meet the specified threshold or fall within the specified range.
- 6. A method according to any of the preceding claims, wherein the segmentation operation comprises application of a segmentation brush tool, and wherein applying the segmentation operation comprises applying a segmentation classification to voxels of the medical volume in a region determined based on the target voxel identified for the brush tool.
- 7. A method according to claim 6, wherein the segmentation classification is applied in dependence on one or more brush characteristics of the segmentation brush tool.
- 8. A method according to claim 7, wherein the brush characteristics comprise one or more of: a brush shape; a brush spatial extent, optionally a radius for a spherical brush; a fill mode; a segmentation classification to be applied; and a classification criterion.
- 9. A method according to claim 8, wherein one or more of the brush characteristics are user-configurable in the user interface.
- 10. A method according to any of the preceding claims, wherein the attribute data for each voxel of the attribute volume comprises at least one attribute data value, optionally a single data value or a vector of data values.
- 11. A method according to any of the preceding claims, wherein the medical volume comprises a reconstruction volume based on a three-dimensional reconstruction derived from a set of source medical images and wherein the attribute volume is computed based on the reconstruction volume and/or source medical images.
- 12. A method according to claim 11, wherein the reconstruction volume comprises a CT (computed tomography) reconstruction based on x-ray images.
- 13. A method according to claim 11 or 12, wherein the attribute data is derived from the voxel values of the reconstruction volume, the voxel values of the reconstruction volume preferably comprising attenuation values.
- 14. A method according to any of claims 11 to 13, wherein the displayed representation is based on the reconstruction volume and/or source images and wherein the segmentation operation is applied based on the derived attribute volume.
- 15. A method according to claim 14, wherein the displayed representation comprises an image derived from a two-dimensional slice of the reconstruction volume.
- 16. A method according to any of the preceding claims, comprising receiving a user selection of the attribute processing routine from a library of stored attribute processing routines and generating the attribute volume using the selected attribute processing routine.
- 17. A method according to any of the preceding claims, wherein the attribute processing routine comprises a plurality of operations for deriving the attribute volume from the reconstruction volume and/or source medical images, optionally via one or more intermediate attribute volumes, wherein the attribute processing routine optionally comprises a chained and/or branching set of operations.
- 18. A method according to any of the preceding claims, wherein the attribute processing routine is operable to use at least one of: a denoising filter, a curvature flow filter, a gradient magnitude filter, a local contrast filter and a morphological watershed filter.
- 19. A method according to any of the preceding claims, wherein the attribute volume corresponds to a bone likelihood attribute, defining a bone likelihood attribute value for each of a plurality of voxels of the attribute volume indicating a likelihood that a voxel corresponds to bone.
- 20. A method according to claim 19, wherein the attribute processing routine: derives a denoised volume from a CT reconstruction volume using a curvature flow filter; derives a gradient magnitude attribute volume from the CT reconstruction volume using a gradient magnitude filter; and performs a bone likelihood calculation based on the denoised volume and the gradient magnitude attribute volume and outputs a bone likelihood attribute volume based on the bone likelihood calculation.
- 21. A method according to any of claims 1 to 18, wherein the attribute processing routine: derives a denoised volume from a CT reconstruction volume using a curvature flow filter; derives a local contrast attribute volume from the denoised volume using a local contrast filter; and performs a vessel likelihood calculation based on the local contrast attribute volume and outputs a vessel likelihood attribute volume indicating for respective voxels a likelihood that the voxel represents a blood vessel.
- 22. A method according to any of the preceding claims, comprising one or more of: outputting a segmented medical volume based on the segmentation operation; displaying a visualisation of the segmented medical volume in the segmentation interface; outputting a 3D visualisation of the segmented medical volume via a 3D display system, optionally an augmented or virtual reality headset.
- 23. A computer readable medium comprising software code adapted when executed on a data processing system to perform a method as set out in any of the preceding claims.
- 24. A system having means, optionally in the form of one or more processors with associated memory, for performing a method as set out in any of claims 1 to 22.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2025/052787 WO2025196619A1 (en) | 2024-03-18 | 2025-03-17 | Interactive image segmentation |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB2403860.6A GB2639595A (en) | 2024-03-18 | 2024-03-18 | Image segmentation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| GB2639695A true GB2639695A (en) | 2025-10-01 |
Family
ID=90826030
Family Applications (4)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| GB2403860.6A Pending GB2639595A (en) | 2024-03-18 | 2024-03-18 | Image segmentation |
| GB2412184.0A Pending GB2639695A (en) | 2024-03-18 | 2024-08-19 | Interactive image segmentation using attribute volumes |
| GB2412186.5A Pending GB2639696A (en) | 2024-03-18 | 2024-08-19 | Interactive image segmentation with modifiable segmentation plane |
| GB2412182.4A Pending GB2639694A (en) | 2024-03-18 | 2024-08-19 | Interactive image segmentation |
Family Applications Before (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| GB2403860.6A Pending GB2639595A (en) | 2024-03-18 | 2024-03-18 | Image segmentation |
Family Applications After (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| GB2412186.5A Pending GB2639696A (en) | 2024-03-18 | 2024-08-19 | Interactive image segmentation with modifiable segmentation plane |
| GB2412182.4A Pending GB2639694A (en) | 2024-03-18 | 2024-08-19 | Interactive image segmentation |
Country Status (1)
| Country | Link |
|---|---|
| GB (4) | GB2639595A (en) |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080292169A1 (en) * | 2007-05-21 | 2008-11-27 | Cornell University | Method for segmenting objects in images |
| US20130121549A1 (en) * | 2010-07-30 | 2013-05-16 | Koninklijke Philips Electronics N.V. | Organ-specific enhancement filter for robust segmentation of medical images |
| CN110599559A (en) * | 2018-06-13 | 2019-12-20 | 西门子医疗有限公司 | Multi-energy metal artifact reduction |
| CN112542230A (en) * | 2020-12-23 | 2021-03-23 | 南开大学 | Multi-mode interactive segmentation method and system for medical image |
| CN112927224A (en) * | 2021-03-30 | 2021-06-08 | 太原理工大学 | Heart nuclear magnetic image recognition method, device and equipment based on deep learning and random forest and storage medium |
| US11683438B2 (en) * | 2019-01-10 | 2023-06-20 | General Electric Company | Systems and methods to semi-automatically segment a 3D medical image using a real-time edge-aware brush |
| US20230306601A1 (en) * | 2022-03-23 | 2023-09-28 | GE Precision Healthcare LLC | Systems and methods for segmenting objects in medical images |
| US20240013391A1 (en) * | 2015-12-31 | 2024-01-11 | Shanghai United Imaging Healthcare Co., Ltd. | Systems and methods for image processing |
| WO2024016047A1 (en) * | 2022-07-20 | 2024-01-25 | HiSeis Pty Ltd | Method for the segmentation of seismic data of a subsurface location |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10932860B2 (en) * | 2017-04-28 | 2021-03-02 | The Brigham And Women's Hospital, Inc. | Systems, methods, and media for presenting medical imaging data in an interactive virtual reality environment |
| EP3714467A4 (en) * | 2017-11-22 | 2021-09-15 | Arterys Inc. | CONTENT-BASED IMAGE RECOVERY FOR LESION ANALYSIS |
| WO2021136965A2 (en) * | 2019-12-31 | 2021-07-08 | Novocure Gmbh | Methods, systems, and apparatuses for image segmentation |
| EP4053752A1 (en) * | 2021-03-02 | 2022-09-07 | Koninklijke Philips N.V. | Cnn-based image processing |
| US12198349B2 (en) * | 2022-01-25 | 2025-01-14 | GE Precision Healthcare LLC | Methods and systems for real-time image 3D segmentation regularization |
- 2024
- 2024-03-18 GB GB2403860.6A patent/GB2639595A/en active Pending
- 2024-08-19 GB GB2412184.0A patent/GB2639695A/en active Pending
- 2024-08-19 GB GB2412186.5A patent/GB2639696A/en active Pending
- 2024-08-19 GB GB2412182.4A patent/GB2639694A/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| GB2639696A (en) | 2025-10-01 |
| GB2639694A (en) | 2025-10-01 |
| GB202403860D0 (en) | 2024-05-01 |
| GB2639595A (en) | 2025-10-01 |