Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that, as used herein, the terms "first," "second," and the like may be used herein to describe various elements, but these elements are not limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another. For example, a first image may be referred to as a second image, and similarly, a second image may be referred to as a first image, without departing from the scope of the present application.
Fig. 1 is a diagram of an application environment of an image annotation method provided in some embodiments. As shown in fig. 1, the application environment includes a terminal 110 and a server 120. Taking the target part as a lung and the target object as a lung nodule as an example, the server 120 stores an image sequence, for example 300 images, obtained by CT (Computed Tomography) scanning of a person's lung. When a doctor needs to perform lung nodule labeling on the images, the terminal 110 may be used to view them. The terminal 110 may display one of the images (a first image) obtained by the CT scan. When the doctor finds that the first image includes a lung nodule, the doctor labels the lung nodule on the first image, for example by drawing a labeling box indicating the position of the lung nodule. The terminal 110 detects the lung nodule labeling operation, obtains position information of the lung nodule in the first image according to the labeling operation, and sends the position information of the lung nodule in the first image to the server 120.
Since one lung nodule generally spans multiple slices, the server 120 may execute the method provided in this embodiment of the present application: derive the position (second position information) of the doctor-labeled lung nodule in another image (second image) of the image sequence, perform recognition on the second image at the second position information, and, if the lung nodule is recognized there, determine that the lung nodule labeled by the doctor also exists in the second image. The terminal 110 or the server 120 may then add a labeling box to the image region corresponding to the second position information in the second image. In this way, after the doctor labels the position of a lung nodule in one image, its position in the other images can be derived and labeled automatically. When the doctor switches the viewed image to the second image, the terminal 110 displays the second image with the added labeling box and the identifier of the lung nodule, such as a labeling serial number, so that the doctor can view the lung nodule existing in the second image and can also determine that it is the same lung nodule as the one labeled on the first image.
It is understood that the image annotation method may be executed in the terminal 110, for example, the terminal 110 may acquire the image sequence in advance to execute the image annotation method provided in the embodiment of the present application.
The server 120 may be an independent physical server, a server cluster composed of a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud computing, a cloud database, cloud storage, or a CDN. The terminal 110 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, or a desktop computer. The terminal 110 and the server 120 may be connected through a network or another communication connection manner, which is not limited in the present application.
As shown in fig. 2, in some embodiments, an image annotation method is provided, and this embodiment is mainly illustrated by applying the method to the server 120 in fig. 1. The method specifically comprises the following steps:
Step S202, a target image set is obtained, where the target image set includes images corresponding to a plurality of layers obtained by performing image acquisition on a target part; the target image set includes a first image in which the position of a target object has been determined and a second image in which the position of the target object is to be determined; the target object is located on the target part, and the second image and the first image correspond to different layers of the target part.
In particular, the target image set may include a plurality of images, where "plurality" means at least two. The target part may be a part of a human or an animal, and may be any part requiring object recognition, such as the lung or the thyroid gland. The first image and the second image are acquired from the same target part by an imaging device, such as a CT device, but correspond to different layers. The images of the target image set may be medical images, such as images obtained by cross-sectional scanning with a CT imaging device. The CT imaging device can use an X-ray beam to scan a layer of a certain thickness of a human body part; a detector receives the X-rays penetrating the layer, the information obtained by scanning is used to calculate an X-ray attenuation coefficient or absorption coefficient, and the pixel value of each pixel point can be obtained from that coefficient, thereby obtaining a CT image. The target object may be any object, for example a nodule, a blood vessel, or a certain tissue; for instance, the target object may be a lung nodule or a thyroid nodule.
By level is meant a layer of a certain thickness in the target site. That is, when scanning the target region, the target region is divided into a plurality of layers, and each layer is scanned to obtain an image as an image corresponding to one layer, thereby forming an image sequence. The image sequence includes a plurality of images. The first image may be one or more images of a sequence of images. The second image is one or more images in the sequence of images. The first image is an image in which the position of the target object in the image has been determined, and for example, may be an image in which the position of the target object is artificially labeled, and the second image may be an image in which the position of the target object is not artificially labeled. It will be appreciated that since multiple target objects may be included in one image, for example, there may be multiple lung nodules in one image, the first image in which the location of the target object has been determined may be for a particular lung nodule or nodules. For example, if the position of the a nodule in the t1 image is determined, then for the a nodule, the t1 image is the first image for which the position of the a nodule has been determined. If the position of the B-nodule in the t2 image is determined but the position of the B-nodule in the t1 image is not determined, then for the B-nodule, the t2 image is the first image for which the position of the B-nodule has been determined and the t1 image may be the second image.
As shown in fig. 3, which is a schematic diagram of a CT apparatus performing tomography in some embodiments, the arrows indicate the directions of the X-rays. The CT apparatus scans layer by layer, each plane representing one layer, so that a multi-layer image sequence is obtained; the image of one layer may also be referred to as one slice.
In step S204, position information of the target object in the first image is acquired as reference position information.
Specifically, the position information is used to indicate the position of the object. It may be represented by coordinates or by a combination of coordinates and size. The size may be represented by a radius or a length, etc. For example, assuming that the labeling box corresponding to the target object is a rectangle, the position information may include the coordinates of the rectangle's center point and the length and width of the rectangle. Assuming that the labeling box corresponding to the target object is a circle, the position information may include the coordinates of the circle's center and its radius.
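As an illustrative sketch (the type names are hypothetical and not part of the embodiments), the two representations just described might be modeled as:

```python
from dataclasses import dataclass

@dataclass
class RectPosition:
    # Rectangular labeling box: center point coordinates plus length and width.
    cx: float
    cy: float
    length: float
    width: float

@dataclass
class CirclePosition:
    # Circular labeling box: center point coordinates plus radius.
    cx: float
    cy: float
    radius: float
```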
In some embodiments, the position information of the target object in the first image may be obtained in advance or may be obtained in real time. For example, manual labeling may be performed in advance to obtain and store the reference position information, or the position information may be obtained in real time according to an object labeling operation of a user.
In some embodiments, the terminal may display the first image, and when the terminal receives an object labeling operation on the first image, it obtains the labeled position information corresponding to the object labeling operation as the position information of the target object in the first image and triggers a position information import request. The server can thus receive the position information import request sent by the terminal; the request carries the position information of the target object in the first image, so the server can extract that position information from the request.
Specifically, the object labeling operation refers to an operation of labeling an object, and may be at least one of a touch operation, a voice operation, an operation performed through an input device such as a mouse, or a gesture operation. The marking position information is position information corresponding to the marking operation. For example, a start position and an end position of the user touch operation may be acquired. The start position and the end position are taken as two opposite corners in the rectangle, thereby obtaining the rectangle. The coordinates of the center point of the rectangle and the size of the rectangle are taken as reference position information. For another example, if the user's voice "200 th row, 400 th column, 50 pixels long and 100 pixels wide" is received, the coordinates (200,400), 50 pixels long and 100 pixels wide can be used as the reference position information.
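The corner-to-rectangle conversion described above can be sketched as follows; the function name and tuple layout are illustrative assumptions:

```python
def rect_from_corners(start, end):
    """Take the start and end positions of a labeling operation as two opposite
    corners of a rectangle and return the rectangle's center point coordinates
    and its size, as described for the reference position information."""
    (x1, y1), (x2, y2) = start, end
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    size = (abs(x2 - x1), abs(y2 - y1))  # (length, width)
    return center, size
```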
The position information import request is used to transmit the labeled position; it carries the position information of the target object in the first image, which can be extracted from the request.
In some embodiments, the terminal may obtain the first image from the server in response to an image display operation by the user, and display the first image. For example, when a doctor needs to view a medical image, an acquisition request for the medical image of a certain patient may be sent to the server, and the server, in response to the acquisition request, acquires an image sequence of the target part of the target user and returns the image sequence to the terminal. It is also possible to first return only part of the sequence, for example a single image. When the terminal receives an image switching operation from the user, it sends a switching request to the server, and the server returns the requested image in the image sequence.
By automatically labeling other images in response to the position information corresponding to the object labeling operation, a user such as a doctor can label any one of the images, and computer equipment such as a server can automatically acquire the position of the labeled target object in the other images according to the labeled position, so that the labeling time of the user can be reduced, and the labeling efficiency can be improved.
Step S206, a relative position relationship between a first layer corresponding to the first image and a second layer corresponding to the second image is obtained.
Specifically, the first image is an image obtained by image acquisition of a first layer of the target part, and the second image is an image obtained by image acquisition of a second layer of the target part. The relative positional relationship indicates the relative position between the layers and can be represented by the distance between them. For example, assuming that the first layer is the 450th layer, the second layer is the 449th layer, and the thickness of one layer is 1 mm, the relative positional relationship is 1 mm. Alternatively, a three-dimensional coordinate system may be established with a certain position point of the first layer as the origin, where the plane of the first layer is the plane of the x-axis and the y-axis; the z-axis coordinate value of the second layer can then be taken as the distance between the first layer and the second layer.
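A minimal sketch of the first way of representing the relative positional relationship, the distance between layers (function and parameter names are assumptions):

```python
def layer_distance(first_layer, second_layer, layer_thickness_mm):
    """Distance between two layers: the number of layers separating them
    multiplied by the thickness of one layer, as in the example above."""
    return abs(first_layer - second_layer) * layer_thickness_mm
```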
Step S208, second position information of the target object on the second image is determined according to the relative position relation and the reference position information.
Specifically, the second position information may be represented by coordinates. Assuming the labeling box is a rectangle, the leftmost, rightmost, uppermost, and lowermost coordinates of the rectangle may be acquired as the second position information. Alternatively, a combination of coordinates and size may be used; for example, the second position information may include the coordinates of a center point and a size.
Since the target object is in different layers, the range it occupies is generally different. For example, a nodule may gradually extend outward, so that it is larger in the middle and smaller toward the edges; in the middle layers it therefore occupies a larger range. The range of the target object in the second layer can be determined according to the relative positional relationship, and the size of the range may be expressed in terms of dimensions, such as length and width. The position of the target object on the second layer is obtained according to its position on the first image: for example, the x-axis and y-axis coordinates of the center point of the target object on the first image may be used as the coordinates of its center point on the second image. The second position information of the target object on the second image is then obtained according to the size of its range on the second layer and the coordinates of the center point.
Step S210, performing object recognition on the second image according to the second position information to obtain a recognition result of the target object in the second image, and performing object labeling on the second image according to the recognition result and the second position information.
Specifically, the recognition result may be either that the second image includes the target object or that it does not. Since the second position information is obtained from the relative positional relationship and the reference position information, that is, by an algorithm, whether the target object actually exists at the second position needs to be further determined. After the second position information of the target object is determined, the image region corresponding to the second position information in the second image may be acquired, and object recognition may be performed on this image region to determine whether the target object exists in it. The recognition may be performed using an artificial intelligence model, such as an object recognition model. Alternatively, the recognition result may be determined by performing a calculation based on the pixel values of the image region corresponding to the second position information.
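A minimal sketch of the recognition in step S210, assuming the image is a 2-D list of pixel values; a simple mean pixel-value threshold stands in for the object recognition model or pixel-value calculation mentioned above (the threshold and names are assumptions, not the method's actual algorithm):

```python
def recognize_in_region(image, center, size, threshold=0.5):
    """Crop the image region corresponding to the second position information
    (center point plus length/width) and decide whether the target object is
    present by comparing the region's mean pixel value against a threshold."""
    cx, cy = center
    length, width = size
    x0, x1 = max(int(cx - length / 2), 0), int(cx + length / 2)
    y0, y1 = max(int(cy - width / 2), 0), int(cy + width / 2)
    pixels = [v for row in image[y0:y1] for v in row[x0:x1]]
    if not pixels:  # region falls entirely outside the image
        return False
    return sum(pixels) / len(pixels) > threshold
```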
When the recognition result indicates that the second image includes the target object, the second image may be labeled according to the second position information, either by the server or by the terminal. For example, when the recognition result indicates that the second image includes the target object, the server sends the second image and the second position information to the terminal; the terminal labels the second image according to the second position information to obtain a labeled second image and displays it.
In some embodiments, the image annotation method further comprises: when the identification result is that the second image comprises the target object, acquiring a target object identifier corresponding to the target object; the server or the terminal may add the target object identifier on the second image, for example, the server may return the target object identifier to the terminal, and the terminal adds the target object identifier on the second image. Therefore, the second image after the labeling processing includes a labeling element for labeling the object and a target object identifier, and the labeling element is generated according to the second position information.
In particular, the object identification is used to identify the object. The object identifier can be obtained according to the sequence of the object marking operation or the object identifier input operation of the user. For example, each lung nodule may be named in turn according to the order of the objects labeled by the user, and if a doctor finds and labels one lung nodule when viewing the lung nodule image, the lung nodule is labeled with 1, and if a lung nodule exists in another layer according to the position information of the lung nodule of the label 1 and the relative position relationship between the layers, the lung nodule is also labeled with 1. For another example, when an object labeling operation of a user is received, the terminal may pop up an input box, the user may input a name of the target object in the input box, the terminal carries the name of the target object in the position import request, and when the target object exists in another layer according to the position and the relative position relationship of the target object labeled by the user, the identifier of the lung nodule is the name input by the user, so that when the user switches to the second image of the second layer, the target object and the target object in the first image may be confirmed to be the same object according to the identifier of the target object displayed on the second image.
In some embodiments, the annotation element in the second image after the annotation processing may be movable, so that a movement operation of the user on the annotation element may be received, and the annotation element is moved according to the movement operation, so that the user may adjust the position of the target object on the second image, thereby making the annotated position more accurate.
Fig. 4 is a schematic diagram of receiving an object labeling operation and performing labeling, provided in some embodiments. The terminal may display a lung nodule image (first image) of the 450th layer. When the user finds a lung nodule on the image, the user may perform an object labeling operation, such as clicking on the location of the lung nodule. After the terminal receives the click operation, labeling elements, such as labeling boxes and the object identifiers "3" and "4", can be generated at the clicked positions, representing the user's 3rd and 4th labeled lung nodules. Fig. 5 is a schematic diagram of the second image after the labeling processing in some embodiments. According to the method provided by the embodiment of the present application, it is calculated that lung nodule 3 exists at the 451st and 452nd layers and that lung nodule 4 exists at the 449th layer. Therefore, the labeling box and the object identifier can be added to the images of the 449th, 451st, and 452nd layers. The labeling elements of different layers may be the same or different.
According to the image labeling method, the target image set is an image which is acquired by carrying out image acquisition on a target part, the acquired images respectively correspond to a plurality of layers, the position information of the target object is used as reference position information in a first image of which the position of the target object is determined in the target image set, and the second position information of the target object on a second image is determined according to the reference position information and the relative position relationship between the first layer corresponding to the first image and the second layer corresponding to the second image, so that the object identification can be carried out according to the second position information, the identification result of the target object in the second image is obtained, and the object labeling is carried out on the second image according to the identification result and the second position information. The position information of the target object in the first image can reflect the relevance of the target object on different layers of the target part, and the relative position relation can reflect the difference of the target object on different layers, so that the position information of the target object in the second image can be accurately obtained based on the relative position relation and the reference position information, and the object labeling is performed on the second image based on the identification result and the second position information, so that the object labeling efficiency and accuracy can be improved.
In some embodiments, since the target image set may include a plurality of images, when the position information corresponding to one image (the first image) in the sequence is received, all other images in the target image set may be regarded as second images; alternatively, only part of the images in the target image set may be used as second images. For example, the reference position information includes a first size of the target object on the first image, and determining the second image of the position of the target object to be determined from the target image set includes: determining a second layer distance according to the first size, where the second layer distance and the first size form a positive correlation; determining a layer whose distance from the first layer is smaller than the second layer distance as the second layer; and acquiring an image corresponding to the second layer from the target image set as the second image of the position of the target object to be determined.
In particular, the layer distance refers to the distance between layers and may be expressed, for example, in terms of layer thickness. As a practical example, in tomography each slice has a certain thickness, so the slice thickness can be multiplied by the number of slices separating two slices to obtain the distance between them.
A positive correlation means that, with other parameters unchanged, the second layer distance increases as the first size increases. When the first size has a plurality of components, the second layer distance may be determined based on one or more of them. For example, if the first size includes a length and a width, the larger of the two may be obtained and the second layer distance derived from it, for example by multiplying it by, or adding to it, a preset value. Alternatively, the average of the length and the width may be obtained and multiplied by a preset value to obtain the second layer distance. The preset value may be set as needed; for example, it may be a coefficient smaller than 1, such as 0.5.
After the second layer distance is obtained, a layer whose distance from the first layer is less than the second layer distance can be used as the second layer, and the image corresponding to the second layer can then be obtained as the second image. As a practical example, assuming that the first size includes a length of 5 mm and a width of 3 mm, the larger value, 5 mm, may be multiplied by a preset coefficient of 1/2 to obtain 2.5 mm. Assuming that the first image is located at the 450th layer and that one layer has a thickness of 1 mm, the 448th, 449th, 451st, and 452nd layers, whose distance from the 450th layer is less than 2.5 mm, may be used as second layers. Since the extension of a target object such as a nodule across layers is not infinite, using only layers closer than the second layer distance as second layers reduces the amount of position calculation and improves calculation efficiency. Moreover, because the second layer distance forms a positive correlation with the first size, it can be adaptively adjusted according to the size of the target object, such as a nodule, in the first image, giving high flexibility and accuracy. For example, the larger the nodule marked by the doctor in the first image, the more layers the nodule extends across, and the larger the second layer distance should be, since the nodule may exist in a layer far away from the first layer.
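The layer selection just walked through can be sketched as follows, with the preset coefficient of 1/2 taken from the example (function and parameter names are assumptions):

```python
def select_second_layers(first_layer, first_size, layer_thickness_mm,
                         num_layers, coefficient=0.5):
    """Compute the second layer distance as the larger of the first size's
    length and width times a preset coefficient, then return every other
    layer whose distance from the first layer is less than that distance."""
    length, width = first_size
    second_layer_distance = max(length, width) * coefficient
    return [
        layer for layer in range(num_layers)
        if layer != first_layer
        and abs(layer - first_layer) * layer_thickness_mm < second_layer_distance
    ]
```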
In some embodiments, the reference position information includes a first size of the target object on the first image and a first coordinate of the target object on the first image, and as shown in fig. 6, determining the second position information of the target object on the second image according to the relative position relationship and the reference position information includes:
step S602, determining a second size of the target object on the second image according to the first size and the relative position relationship.
Specifically, the relative positional relationship may be a first layer distance, i.e., the distance between the first layer and the second layer. For example, assuming that one layer is 2 mm thick and the two layers are 2 layers apart, the layer distance is 4 mm. The change in the second size relative to the first size may be determined based on the first layer distance, and the second size is then obtained from the first size and this change; the change may be determined according to the characteristics of the specific target object. For example, a nodule does not keep a constant size but gradually shrinks, and the position marked by the doctor is typically where the nodule is most prominent; the nodule may therefore be considered to decrease in size in the images of other layers. The variation in size can thus be determined from the first layer distance: the greater the distance, the greater the variation in size.
In some embodiments, a size reduction parameter may be obtained according to the first layer distance, and the first size may be reduced according to the size reduction parameter to obtain the second size. The size reduction parameter controls the size reduction: the larger it is, the more the size is reduced. The size reduction parameter has a positive correlation with the first layer distance, so the larger the first layer distance, the larger the size reduction parameter, and thus the smaller the second size of the target object. The correspondence between the first layer distance and the size reduction parameter may be determined in advance according to how the target object varies. In the embodiment of the present application, when the position of the target object in the first image is marked by a user or identified by a model, it is generally a prominent position of the target object, that is, a position of larger size, so the size of the target object in the images of other layers can be considered to become smaller as the distance from the first layer grows. Making the size reduction parameter positively correlated with the distance from the first layer therefore improves the accuracy of the obtained second size.
In some embodiments, the size reduction parameter may also be obtained in combination with the first size; for example, the first size and the size reduction parameter may form a negative correlation. That is, with other parameters unchanged, the larger the first size, the smaller the size reduction parameter.
In some embodiments, the first size includes a first length and a first width, and obtaining the size reduction parameter according to the first layer distance includes: acquiring the larger value of the first length and the first width to obtain a reference size; and obtaining the size reduction parameter according to the first layer distance and the reference size, where the size reduction parameter and the reference size form a negative correlation.
Specifically, the first length and the first width respectively refer to the length and width of the target object on the first image. The larger value refers to the larger of the first length and the first width; for example, if the first length is greater than the first width, the first length is the reference size. The size reduction parameter may be obtained by combining the reference size and the first layer distance. For example, the ratio of the first layer distance divided by the reference size may be used as the size reduction parameter, or the square of that ratio may be used. In the embodiment of the present application, the size reduction parameter forms a negative correlation with the reference size, so the larger the reference size, the smaller the size reduction parameter, and thus the more slowly the nodule shrinks. That is, the larger the lung nodule, the more slowly its size is reduced; since the size reduction parameter takes the reference size into account, the resulting second size is more accurate.
After the size reduction parameter is obtained, the first size may be reduced using it. For example, the size reduction parameter may be subtracted from a preset parameter, and the resulting value used as a coefficient to multiply the first size to obtain the second size. Alternatively, the subtracted value may be squared first, and the square used as the coefficient. The preset parameter is, for example, 1.
For example, for a lung nodule, the second size may be calculated by formulas (1) and (2), where the second size includes a second length and a second width, i.e., the length and the width of the target object in the second image. Assuming that the second image is located at the (i-j)-th layer:

$$len_x^{i-j} = \left(1 - \frac{z_{i-j}}{c}\right)^2 \cdot len_x^{i} \quad (1)$$

$$len_y^{i-j} = \left(1 - \frac{z_{i-j}}{c}\right)^2 \cdot len_y^{i} \quad (2)$$

Here $len_x^{i-j}$ denotes the length of the target object in the image of the (i-j)-th layer (the second image), and $len_x^{i}$ denotes the length of the target object in the image of the i-th layer (the first image), x representing the x-axis. $len_y^{i-j}$ denotes the width of the target object in the second image, and $len_y^{i}$ the width of the target object in the first image, y representing the y-axis. $z_{i-j}$ denotes the distance between the (i-j)-th layer and the i-th layer; when the coordinate axes are established, the z-axis value of the first layer may be set to 0, so that $z_{i-j}$ is the z-axis coordinate of the (i-j)-th layer. $c = \max(len_x^{i}, len_y^{i})$ denotes the larger of the first length and the first width.
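Formulas (1) and (2) as reconstructed above can be expressed as a minimal sketch (the function name is hypothetical, and the squared form of the shrink coefficient is one of the two variants described in the text):

```python
def second_size(first_len, first_wid, z_dist):
    """Shrink the first-layer box size for a layer z_dist away, per
    formulas (1)-(2): coefficient (1 - z/c)^2, where the reference size c
    is the larger of the first length and first width."""
    c = max(first_len, first_wid)       # reference size
    coeff = (1.0 - z_dist / c) ** 2     # shrinks with distance; slower for larger c
    return first_len * coeff, first_wid * coeff
```

At zero distance the coefficient is 1 and the size is unchanged, matching the expectation that the labeled layer keeps its labeled size.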
Step S604, determining second position information of the target object on the second image according to the first coordinate and the second size.
Specifically, the first coordinate may be the coordinate of a preset point of the target object in the first image, for example the coordinate of the center point or of a point located at the edge, and may be set as needed. For a nodule, the coordinates of its center are considered to remain constant across layers, the nodule extending from the center line towards both sides. The first coordinate can therefore be taken as the coordinate of the target object on the second image. Since the first coordinate of the preset point has already been determined, the position information of the target object on the second image can be obtained from the first coordinate and the second size.
In some embodiments, edge coordinates of the target object on the second image are determined according to the first coordinates and the second size, the edge coordinates are coordinates corresponding to a plurality of preset edge positions; and taking each edge coordinate as second position information of the target object on the second image.
Specifically, a plurality means at least two. The edge coordinates are the coordinates corresponding to the edge positions of the target object. For example, assuming the target object's labeling frame is a rectangle, the edge coordinates may include coordinates corresponding to its four sides: the leftmost, rightmost, uppermost, and lowermost coordinates of the rectangle. A corresponding calculation formula can therefore be preset according to the position of the first coordinate within the target object, and the edge coordinates obtained by applying the formula to the first coordinate and the second size.
In some embodiments, assume the user performs labeling on the i-th layer of the lung nodule image sequence, denoted $N_i$. The x-axis coordinate of the center point of the rectangular frame labeled by the user in the i-th layer (first layer) is $x_c^{i}$, and its y-axis coordinate is $y_c^{i}$. For the (i-j)-th layer (second image layer), the leftmost (minimum) x-axis coordinate of the target object is denoted $x_{min}^{i-j}$, the rightmost (maximum) x-axis coordinate $x_{max}^{i-j}$, the uppermost (maximum) y-axis coordinate $y_{max}^{i-j}$, and the lowermost (minimum) y-axis coordinate $y_{min}^{i-j}$. The length of the nodule in the i-th layer along the x-axis is denoted $len_x^{i}$, and along the y-axis $len_y^{i}$. The z-axis coordinate of the i-th layer is 0, and that of the (i-j)-th layer is denoted $z_{i-j}$. The parameter $c$ denotes the larger of the length and the width of the labeling frame labeled by the user in the i-th layer. The edge coordinates of the second layer may then be calculated by formulas (3) to (6), where i is a positive integer and j may be a positive or negative integer:

$$x_{min}^{i-j} = x_c^{i} - \tfrac{1}{2}\,len_x^{i-j} \quad (3)$$

$$x_{max}^{i-j} = x_c^{i} + \tfrac{1}{2}\,len_x^{i-j} \quad (4)$$

$$y_{max}^{i-j} = y_c^{i} + \tfrac{1}{2}\,len_y^{i-j} \quad (5)$$

$$y_{min}^{i-j} = y_c^{i} - \tfrac{1}{2}\,len_y^{i-j} \quad (6)$$
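The center-preserving box construction of formulas (3) to (6) can be sketched as follows (the function and variable names are hypothetical; the center coordinates are held fixed and the box extends half the second size in each direction):

```python
def edge_coords(cx, cy, sec_len, sec_wid):
    """Edge coordinates of the propagated labeling box (formulas (3)-(6)):
    the center (cx, cy) from the first layer is kept, and the box spans
    half the second length/width on each side."""
    x_min = cx - sec_len / 2.0
    x_max = cx + sec_len / 2.0
    y_min = cy - sec_wid / 2.0
    y_max = cy + sec_wid / 2.0
    return x_min, x_max, y_min, y_max
```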
Fig. 7 is a schematic diagram illustrating positions of images of target objects on different layers in some embodiments. In fig. 7, the grid-filled areas indicate the areas occupied by the target objects. As shown in fig. 7, the coordinates of the center point of the nodule are consistent in each layer: they are the x-axis and y-axis coordinate values corresponding to the center position of the target object in the i-th layer, i.e., the positions the dotted lines pass through in fig. 7. Since the position of the target object in the (i-j)-th layer is obtained from its position in the i-th layer, and the size labeled by the user is generally the largest, the size of the target object in the i-th layer is larger than that in the (i-j)-th layer.
In some embodiments, performing object recognition on the second image according to the second position information, and obtaining a recognition result of the target object in the second image includes: acquiring an image area corresponding to the second position information from the second image as a target image area; and carrying out object recognition according to the image information of the target image area to obtain a recognition result of the target object in the second image.
Specifically, the image information may comprise the pixel values of the image. After the second position information is obtained, the image of the region corresponding to the second position information may be used as the image region to be recognized. The target image area may be input into a model for recognition, or the image region may be processed, for example binarized, and the identification result obtained from the binarization result. For example, as shown in fig. 8, if the region corresponding to the second position information in the second image is region A, then region A in the second image is taken as the target image region.
In some implementations, performing object recognition according to the image information of the target image region, and obtaining a recognition result of the target object in the second image includes: obtaining a pixel threshold value according to the pixel value of a pixel point in a target image area; converting pixel values smaller than a pixel threshold value into first pixel values and converting pixel values larger than the pixel threshold value into second pixel values in an initial pixel value set corresponding to the target image area to obtain a target pixel value set; carrying out statistical processing on pixel values in the target pixel value set to obtain a pixel statistical value; and obtaining the identification result of the target object in the second image according to the pixel statistic value.
Specifically, the pixel values in the initial pixel value set are the pixel values of the pixel points in the target image region, and may be gray values. The pixel threshold may be the median or average of the pixel values of the target image area. The pixel threshold may also be obtained from the difference between the maximum and minimum pixel values: for example, a pixel value adjustment value may be added to the minimum value so that the pixel threshold lies between the maximum and minimum values. Because the pixel value adjustment value is obtained from the difference between the maximum and minimum values in the target image area, the pixel threshold adapts to the area. For example, a preset coefficient may be multiplied by the difference between the maximum and minimum values to obtain the pixel value adjustment value; the preset coefficient may be, for example, 0.3.
For example, for the (i-j)-th image layer, the pixel values of the target image area may be collected into a pixel matrix $P_{i-j}$, in which the value in row a, column b is the pixel value of the pixel point in row a, column b of the target image area. A binarization threshold $t_{i-j}$ of the pixel values is then calculated from $P_{i-j}$, as shown in formula (7):

$$t_{i-j} = \min(P_{i-j}) + 0.3 \cdot \big(\max(P_{i-j}) - \min(P_{i-j})\big) \quad (7)$$

where $\min(P_{i-j})$ denotes the minimum pixel value in the target image area of the (i-j)-th image layer, and $\max(P_{i-j})$ denotes the maximum. The first pixel value and the second pixel value can be set as required, the second pixel value being larger than the first; for example, the first pixel value may be 0 and the second pixel value may be 1. According to the pixel threshold $t_{i-j}$, the pixel matrix $P_{i-j}$ formed from the initial set of pixel values is converted into a binary matrix $B_{i-j}$, the target pixel value set, as shown in formula (8):

$$B_{i-j}[a][b] = \begin{cases} 1, & P_{i-j}[a][b] > t_{i-j} \\ 0, & P_{i-j}[a][b] \le t_{i-j} \end{cases} \quad (8)$$

The pixel statistic may be an average value. For example, the pixel statistic may be obtained as shown in formula (9):

$$g_{i-j} = \frac{\operatorname{sum}(B_{i-j})}{len_x^{i-j} \cdot len_y^{i-j}} \quad (9)$$

where $len_x^{i-j}$ denotes the length of the target object in the image of the (i-j)-th layer (the second image), $len_y^{i-j}$ denotes its width, $g_{i-j}$ denotes the pixel statistic corresponding to the target image area in the (i-j)-th layer, and $\operatorname{sum}(B_{i-j})$ denotes the sum of the pixel values in the matrix $B_{i-j}$.
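Formulas (7) to (9), together with the presence decision, can be sketched as a single routine (the coefficient 0.3 and the range 0.2–0.94 are the example values given in the text; the function name and parameters are hypothetical):

```python
import numpy as np

def recognize(region, k=0.3, lo=0.2, hi=0.94):
    """Binarize the target image region and decide nodule presence.
    Formula (7): adaptive threshold from min/max; formula (8): 0/1 matrix;
    formula (9): mean of the binary matrix over the region size."""
    p_min, p_max = region.min(), region.max()
    threshold = p_min + k * (p_max - p_min)       # formula (7)
    binary = (region > threshold).astype(float)   # formula (8)
    g = binary.sum() / binary.size                # formula (9)
    return g, lo <= g <= hi                       # presence if statistic in range
```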
In some embodiments, it may be determined that the target object is included in the second image when the pixel statistic is within a preset pixel value range. The preset pixel value range may be set as desired, for example as being greater than a preset pixel threshold. For example, it may be determined experimentally that a lung nodule is present when the pixel statistic is between 0.2 and 0.94. Since the gray value of a target object such as a nodule is generally large in an image obtained by CT scanning, pixel values larger than the pixel threshold are converted into the second pixel value and pixel values smaller than the threshold into the first pixel value. The overall magnitude of the pixel values in the target image area can therefore be determined through the pixel statistic, and the identification result accurately obtained from it.
The method provided by the embodiments of the present application is described below taking as an example a target object that is a lung nodule: CT imaging is performed on the lung of a patient to obtain images of multiple slices forming an image sequence, and the other images of the sequence are automatically labeled according to the doctor's label. The method includes the following steps:
1. and acquiring an image file corresponding to the lung nodule.
Specifically, the examination images of a lung nodule are generally acquired in an imaging mode that scans the lung from top to bottom, and on average one examination may generate about 300 image files, called DICOM (Digital Imaging and Communications in Medicine) files. One DICOM file corresponds to one image layer of the lung nodule, so a lung nodule spanning K layers corresponds to K DICOM files, and thus K DICOM files may be acquired.
2. And processing the image file to obtain an array file.
Specifically, the DICOM files can be read and loaded through a third-party library to generate a matrix of K × 512 × 512 together with the thickness between layers. Each 512 × 512 matrix corresponds to one layer of the lung nodule sequence, the image being represented as 512 × 512 pixels, and the matrix values may be the gray values of the image. The K × 512 × 512 matrix and the thickness may be saved as an npy (NumPy) file, also called an array file. The npy file may be encrypted by an encryption algorithm, and the array file named with the ID (i.e., identifier) of the image file.
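A sketch of step 2 under stated assumptions: the per-slice pixel matrices are assumed to have been loaded already (e.g. with a third-party DICOM reader such as pydicom's `dcmread`), and the encryption step is omitted; only the stacking into a K × 512 × 512 array and the npy saving are shown. The function name is hypothetical.

```python
import numpy as np

def build_array_file(slices, out_path):
    """Stack per-slice pixel matrices into one K x H x W array and save it
    as an .npy "array file" (named with the sequence ID in the text)."""
    volume = np.stack(slices, axis=0)   # shape (K, 512, 512) for real CT slices
    np.save(out_path, volume)
    return volume.shape
```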
Steps 1 and 2 may also be referred to as a data preprocessing process, which is performed offline and is responsible for processing the image files of the lung nodule sequence to generate the matrix, which may be stored in an encrypted file named with the sequence ID. The lung nodule sequence represents the slice sequence of the lung nodule, i.e., the lung nodule is scanned in multiple slices.
3. And generating a first image according to the array file, and displaying the first image on the terminal.
Specifically, when a request to view the image sequence corresponding to a sequence ID is received, the encrypted npy file may be obtained and decrypted to recover the K × 512 × 512 matrix and the layer thickness; when the doctor needs to view the first image, the first image may be sent to the terminal.
4. Position information of the target object in the first image is acquired as reference position information.
Specifically, after the doctor views a certain image, if a pulmonary nodule is found, the position of the nodule can be marked on the image. The labeling platform then receives the position of the labeling frame as the reference position, and transmits the sequence ID, the identifier of the image layer, and the reference position information to the corresponding server.
5. A second image is acquired.
Specifically, the second image layers may be determined according to the size of the labeling frame labeled by the doctor. For example, if the length of the labeling frame is greater than 1 mm, layers within 3 layers of the first layer may be used as second layers; if the length of the labeling frame is less than 1 mm, layers within 2 layers may be used as second layers. After the second image layers are obtained, the images corresponding to the second image layers are acquired as second images.
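The layer-selection rule of step 5 can be sketched as follows (the function name, the symmetric up/down propagation, and the clamping to the sequence bounds are assumptions of this sketch):

```python
def second_layers(first_layer, box_len_mm, total_layers):
    """Pick the layers to propagate the label to (step 5): labeling frames
    longer than 1 mm propagate to layers within 3 layers, smaller ones
    within 2 layers; the first layer itself is excluded."""
    span = 3 if box_len_mm > 1.0 else 2
    return [first_layer + d
            for d in range(-span, span + 1)
            if d != 0 and 0 <= first_layer + d < total_layers]
```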
6. And acquiring the relative position relation between a first layer corresponding to the first image and a second layer corresponding to the second image.
Specifically, the layer thickness may be obtained, and the distance between the first layer and the second layer calculated from the layer thickness and the number of layers between them.
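Step 6 reduces to a single multiplication; a sketch (names hypothetical):

```python
def layer_distance(first_layer, second_layer, slice_thickness_mm):
    """Distance between two layers (step 6): layers apart times slice thickness."""
    return abs(second_layer - first_layer) * slice_thickness_mm
```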
7. And obtaining second position information of the target object on the second image according to the relative position relation and the reference position information.
Specifically, the reference position information may include the coordinates of the center point of the target object in the first image together with its length and width. The coordinates of the center point are therefore taken as the coordinates of the center point of the target object on the second image, and the length and width of the target object in the second image are calculated from its length and width in the first image and the distance between the first and second layers. Assuming the labeling frame of the lung nodule is rectangular, the maximum and minimum values of the lung nodule on the x-axis and on the y-axis, i.e., the edge coordinates, can then be obtained from the center point coordinates and the length and width of the lung nodule on the second image.
8. And carrying out object recognition on the second image according to the second position information to obtain a recognition result of the target object in the second image.
Specifically, the pixel value of the image region corresponding to the rectangle may be acquired, the binarization processing may be performed on the target image region, and if the average value of the gray-scale values obtained by the binarization processing is 0.6 and the preset pixel value range is 0.2 to 0.94, it is determined that the lung nodule exists in the target image region corresponding to the second position information.
9. And adding an annotation frame on the target image area on the second image. And when a request for displaying the second image is received, returning the second image added with the annotation box.
Specifically, after the first image is labeled, the doctor may switch to the next image in the image sequence as the image currently to be displayed. If that image is a second image and a lung nodule exists in it, a labeling frame and an identifier of the lung nodule, for example "3", are added at the second position information of the image. In this way, the doctor can quickly determine that a lung nodule exists at the second position information and that it is the same lung nodule as on the first image.
The method provided by the embodiments of the present application can be applied to the labeling of lung nodules, which has long troubled labeling doctors. The examination images of lung nodules are generally acquired in an imaging mode that scans from top to bottom, with one image per 1 mm slice; on average, more than 300 images are generated per examination, and the doctor needs to find the nodules among these images and label them. Nodules are generally three-dimensional and pass through a plurality of image layers, so manually labeling every layer is very labor-intensive for the physician. With the method provided by the embodiments of the present application, a mathematical model of the nodule position calculation is built, and the positions of the nodule in the upper and lower layers are automatically deduced from the position marked by the doctor, realizing three-dimensional labeling of pulmonary nodules. The doctor therefore only needs to circle the position of the nodule on any one layer, and the algorithm automatically identifies and labels the positions of the lung nodule in the layers above and below. According to statistics, with the method provided by the embodiments of the present application, the average time for a doctor to label one examination sequence is reduced from 30 min (minutes) to 18 min, improving labeling efficiency.
The image sequence provided by the embodiment of the application can be presented in a 3D (three-dimensional) manner, so that a stereoscopic lung nodule image is presented, and when the stereoscopic lung nodule image is presented, a nodule can be marked in the stereoscopic lung nodule image according to position information of the nodule.
As shown in fig. 9, in some embodiments, an image annotation apparatus is provided, which may be integrated in the server 120, and specifically may include a target image set acquisition module 902, a reference position information acquisition module 904, a relative position relationship acquisition module 906, a second position information determination module 908, and a recognition result obtaining module 910.
A target image set obtaining module 902, configured to obtain a target image set, where the target image set includes images corresponding to multiple layers obtained by performing image acquisition on a target portion, the target image set includes a first image in which a position of a target object is determined and a second image in which a position of the target object is to be determined, the target object is located on the target portion, and the second image and the first image correspond to different layers of the target portion.
A reference position information obtaining module 904, configured to obtain position information of the target object in the first image as reference position information.
A relative position relationship obtaining module 906, configured to obtain a relative position relationship between a first layer corresponding to the first image and a second layer corresponding to the second image;
a second position information determining module 908, configured to determine second position information of the target object on the second image according to the relative position relationship and the reference position information.
The recognition result obtaining module 910 is configured to perform object recognition on the second image according to the second position information, obtain a recognition result of the target object in the second image, and perform object labeling on the second image according to the recognition result and the second position information.
In some embodiments, the reference position information includes a first size of the target object on the first image and a first coordinate of the target object on the first image, and the second position information determining module 908 includes: a second size determination unit for determining a second size of the target object on the second image according to the first size and the relative positional relationship; and the second position information determining unit is used for determining second position information of the target object on the second image according to the first coordinates and the second size.
In some embodiments, the relative positional relationship comprises a first layer distance between the first layer and the second layer, and the second size determination unit is configured to: obtain a size reduction parameter according to the first layer distance, wherein the size reduction parameter is in a positive correlation with the first layer distance; and perform reduction processing on the first size according to the size reduction parameter to obtain the second size.
In some embodiments, the first size comprises a first length and a first width, and the second size determination unit is to: acquiring a larger value of the first length and the first width to obtain a reference size; and obtaining a size reduction parameter according to the first layer distance and the reference size, wherein the size reduction parameter and the reference size form a negative correlation relationship.
In some embodiments, the second position information determining unit is configured to: determine edge coordinates of the target object on the second image according to the first coordinates and the second size, wherein the edge coordinates are coordinates corresponding to a plurality of preset edge positions; and take each edge coordinate as second position information of the target object on the second image.
In some embodiments, the recognition result obtaining module is configured to: acquiring an image area corresponding to the second position information from the second image as a target image area; and carrying out object recognition according to the image information of the target image area to obtain a recognition result of the target object in the second image.
In some embodiments, the recognition result obtaining module includes: the pixel threshold obtaining unit is used for obtaining a pixel threshold according to the pixel value of the pixel point in the target image area; the conversion unit is used for converting pixel values smaller than a pixel threshold value into first pixel values and converting pixel values larger than the pixel threshold value into second pixel values in an initial pixel value set corresponding to the target image area to obtain a target pixel value set; the statistical unit is used for performing statistical processing on pixel values in the target pixel value set to obtain pixel statistical values; and the identification result obtaining unit is used for obtaining the identification result of the target object in the second image according to the pixel statistic value.
In some embodiments, the recognition result obtaining unit is configured to: and when the pixel statistic value is determined to be in the preset pixel value range, determining that the target object is included in the second image.
In some embodiments, the obtaining of the location information of the target object in the first image is configured to: receiving a position incoming request sent by a terminal; extracting position information of the target object in the first image from the position incoming request; when the terminal receives the object marking operation on the first image, the marking position information corresponding to the object marking operation is obtained and used as the position information of the target object in the first image, and the position information transmitting request is triggered.
In some embodiments, the apparatus further includes a sending module, configured to send the second image and the second location information to the terminal when the recognition result is that the second image includes the target object, so that the terminal labels the second image according to the second location information, and obtains the second image after the labeling processing.
In some embodiments, the reference position information includes a first size of the target object on the first image, and the apparatus further includes a second image acquisition module configured to: determine a second layer distance from the first size, the second layer distance being in a positive correlation with the first size; determine a layer whose distance from the first layer is smaller than the second layer distance as a second layer; and acquire the image corresponding to the second layer from the target image set as a second image whose target object position is to be determined.
FIG. 10 is a diagram illustrating an internal structure of a computer device in some embodiments. The computer device may specifically be the server 120 in fig. 1. As shown in fig. 10, the computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the image annotation method. The internal memory may also have a computer program stored therein, which when executed by the processor, causes the processor to perform the image annotation process.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In some embodiments, the image annotation device provided by the present application can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in fig. 10. The memory of the computer device may store various program modules constituting the image annotation apparatus, such as a target image set acquisition module 902, a reference position information acquisition module 904, a relative position relationship acquisition module 906, a second position information determination module 908, and a recognition result acquisition module 910 shown in fig. 9.
The computer program constituted by the respective program modules causes the processor to execute the steps in the image labeling method of the embodiments of the present application described in the present specification.
For example, the computer device shown in fig. 10 may acquire, through the target image set acquisition module 902 in the image annotation apparatus shown in fig. 9, a target image set including images corresponding to a plurality of layers obtained by image-capturing the target portion, where the target image set includes a first image in which the position of the target object has been determined and a second image in which the position of the target object is to be determined, the target object is located on the target portion, and the second image and the first image correspond to different layers of the target portion. The position information of the target object in the first image is acquired as reference position information by the reference position information acquisition module 904. The relative position relationship between the first layer corresponding to the first image and the second layer corresponding to the second image is acquired by the relative position relationship acquisition module 906. Second position information of the target object on the second image is determined by the second position information determination module 908 according to the relative position relationship and the reference position information. Object recognition is performed on the second image according to the second position information through the recognition result obtaining module 910 to obtain a recognition result of the target object in the second image, and object labeling is performed on the second image according to the recognition result and the second position information.
In some embodiments, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the image annotation method described above. The steps of the image annotation method herein may be the steps in the image annotation methods of the above embodiments.
In some embodiments, a computer readable storage medium is provided, storing a computer program, which when executed by a processor, causes the processor to perform the steps of the image annotation method described above. The steps of the image annotation method herein may be the steps in the image annotation methods of the above embodiments.
It should be understood that, although the steps in the flowcharts of the embodiments of the present application are shown in sequence as indicated by the arrows, the steps are not necessarily performed in sequence as indicated by the arrows. The steps are not performed in the exact order shown and described, and may be performed in other orders, unless explicitly stated otherwise. Moreover, at least a portion of the steps in various embodiments may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performance of the sub-steps or stages is not necessarily sequential, but may be performed in turn or alternately with other steps or at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.