CN114897915A - Image segmentation method, device, electronic device and storage medium - Google Patents
Image segmentation method, device, electronic device and storage medium
- Publication number
- CN114897915A (application number CN202210380537.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- area
- mask
- interactive operation
- rectangular area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Editing Of Facsimile Originals (AREA)
Abstract
Description
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to an interactive image segmentation method, apparatus, electronic device and storage medium.
Background
Image segmentation is an important computer vision task with many applications in image retrieval, picture editing, and film and television production. As a subtask of image segmentation, interactive segmentation aims to separate the object of interest from the background with the least user input and inference time while achieving the best segmentation accuracy. Because user input can take many forms (for example, clicks, scribbles, and bounding boxes), interactive segmentation methods give users great flexibility and allow the current segmentation result to be adjusted effectively under user guidance.
Interactive segmentation in the related art relies mainly on two kinds of interactive operation, click and scribble, and performs segmentation on the whole image in every interaction. The drawback is that operating on the whole image reduces the effective resolution of the interactive model network, which severely limits the segmentation of fine details.
Summary of the Invention
According to a first aspect of the present disclosure, there is provided an image segmentation method, comprising: receiving an interactive operation on an image; cropping a corresponding area image from the image according to the range of the interactive operation; obtaining a result mask of the interactive operation according to a first image mask obtained from the interactive operation preceding it, the area image, and a mask of the interactive operation; and updating the first image mask to a second image mask using the result mask of the interactive operation, and obtaining a segmented image based on the second image mask.
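The four steps of the first aspect can be sketched as a single interaction round. This is only an illustrative sketch: `segment_step`, the `predict` callable (a stand-in for the segmentation network), the cropping of the previous mask to the same rectangle, and the paste-back update are all assumptions and are not specified by the disclosure.

```python
import numpy as np

def segment_step(image, prev_mask, interaction_mask, crop_box, predict):
    """Run one interaction round: crop, infer on the crop, update the full mask.

    crop_box is (top, left, bottom, right), end-exclusive (assumed layout).
    predict is a hypothetical callable standing in for the segmentation network.
    """
    top, left, bottom, right = crop_box
    # Crop the area image and the matching parts of both masks (assumption).
    area_image = image[top:bottom, left:right]
    area_prev = prev_mask[top:bottom, left:right]
    area_inter = interaction_mask[top:bottom, left:right]
    # Result mask of this interactive operation, same size as the crop.
    result_mask = predict(area_image, area_prev, area_inter)
    # Update the first image mask into the second one by pasting the result back.
    new_mask = prev_mask.copy()
    new_mask[top:bottom, left:right] = result_mask
    # Segmented image: keep pixels where the updated mask is set.
    segmented = image * (new_mask > 0)[..., None]
    return new_mask, segmented
```

Because only the cropped rectangle is overwritten, a poor result in one round stays local and can be corrected by a later interaction on the same area.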
According to the first aspect of the present disclosure, the interactive operation includes a scribble operation and/or a click operation on the image.
According to the first aspect of the present disclosure, cropping the corresponding area image from the image according to the range of the interactive operation includes: determining an interaction area corresponding to the trajectory of the interactive operation; and cropping the area image corresponding to the interaction area from the image.
According to the first aspect of the present disclosure, the interaction area is a circular area, and cropping the area image corresponding to the interaction area from the image includes: cropping, from the image, the area image corresponding to a rectangular area enclosing the circular area; or cropping, from the image, a partial image corresponding to the circular area and processing the partial image into a rectangular image to obtain the area image.
According to the first aspect of the present disclosure, the interaction area is a rectangular area, and cropping the area image corresponding to the interaction area from the image includes: cropping the area image corresponding to the rectangular area from the image.
According to the first aspect of the present disclosure, cropping the area image corresponding to the rectangular area from the image includes: expanding the rectangular area by a predetermined ratio, and cropping the area image corresponding to the expanded rectangular area from the image.
According to the first aspect of the present disclosure, the predetermined ratio is negatively correlated with the size of the rectangular area, and/or the predetermined ratio is negatively correlated with the degree to which the image is magnified.
According to the first aspect of the present disclosure, expanding the rectangular area by the predetermined ratio includes: enlarging the length of the rectangular area by a first ratio and enlarging the width of the rectangular area by a second ratio.
According to the first aspect of the present disclosure, when the ratio of the size of the rectangular area to the size of the image is smaller than a first threshold, the predetermined ratio is determined to be a first value; when the ratio of the size of the rectangular area to the size of the image is greater than the first threshold, the predetermined ratio is determined to be a second value, wherein the first value is greater than the second value, and the first value and the second value are negatively correlated with the factor by which the image is magnified.
According to the first aspect of the present disclosure, when the ratio of the size of the area image to the size of the image is smaller than a second threshold, the area image is expanded into an area image of a first preset size, wherein the second threshold is smaller than the first threshold and is negatively correlated with the factor by which the image is magnified.
According to the first aspect of the present disclosure, the method further includes: resizing the area image to a second preset size, and obtaining the result mask of the interactive operation based on the resized area image.
According to a second aspect of the present disclosure, there is provided an image segmentation apparatus, comprising: an interaction unit configured to receive an interactive operation on an image; a cropping unit configured to crop a corresponding area image from the image according to the range of the interactive operation; an inference unit configured to obtain a result mask of the interactive operation according to a first image mask obtained from the interactive operation preceding it, the area image, and a mask of the interactive operation; and a mask updating unit configured to update the first image mask to a second image mask using the result mask of the interactive operation, and to obtain a segmented image based on the second image mask.
According to the second aspect of the present disclosure, the interactive operation includes a scribble operation and/or a click operation on the image.
According to the second aspect of the present disclosure, the cropping unit is configured to: determine an interaction area corresponding to the trajectory of the interactive operation; and crop the area image corresponding to the interaction area from the image.
According to the second aspect of the present disclosure, the interaction area is a circular area, and the cropping unit is configured to: crop, from the image, the area image corresponding to a rectangular area enclosing the circular area; or crop, from the image, a partial image corresponding to the circular area and process the partial image into a rectangular image to obtain the area image.
According to the second aspect of the present disclosure, the interaction area is a rectangular area, and the cropping unit is configured to: crop the area image corresponding to the rectangular area from the image.
According to the second aspect of the present disclosure, the cropping unit is configured to: expand the rectangular area by a predetermined ratio, and crop the area image corresponding to the expanded rectangular area from the image.
According to the second aspect of the present disclosure, the predetermined ratio is negatively correlated with the size of the rectangular area, and/or the predetermined ratio is negatively correlated with the degree to which the image is magnified.
According to the second aspect of the present disclosure, the cropping unit is configured to: enlarge the length of the rectangular area by a first ratio and enlarge the width of the rectangular area by a second ratio.
According to the second aspect of the present disclosure, the cropping unit is configured to: determine the predetermined ratio to be a first value when the ratio of the size of the rectangular area to the size of the image is smaller than a first threshold; and determine the predetermined ratio to be a second value when the ratio of the size of the rectangular area to the size of the image is greater than the first threshold, wherein the first value is greater than the second value, and the first value and the second value are negatively correlated with the factor by which the image is magnified.
According to the second aspect of the present disclosure, the cropping unit is configured to: when the ratio of the size of the area image to the size of the image is smaller than a second threshold, expand the area image into an area image of a first preset size, wherein the second threshold is smaller than the first threshold and is negatively correlated with the factor by which the image is magnified.
According to the second aspect of the present disclosure, the inference unit is configured to: resize the area image to a second preset size, and obtain the result mask of the interactive operation based on the resized area image.
According to a third aspect of the present disclosure, there is provided an electronic device, comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the image segmentation method described above.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by at least one processor, enable the at least one processor to perform the image segmentation method described above.
According to a fifth aspect of the present disclosure, there is provided a computer program product, wherein instructions in the computer program product are executed by at least one processor to perform the image segmentation method described above.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects. The image segmentation method according to the exemplary embodiments of the present disclosure can adaptively crop the image locally in each interaction according to the range of the user's operation, which reduces the interactive operations input to the interactive segmentation network. The method is therefore simple: even if some erroneous user operations make the segmentation result of a particular interaction unsatisfactory, the subsequent prediction process is not affected, so the method is highly fault tolerant and makes it much easier for users to segment and extract objects of interest. In addition, cropping the area image according to the user's interactive operation yields a finer local region, which helps the user extract a high-precision mask of the object of interest and thus obtain a more accurate segmented image.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure; they do not unduly limit the present disclosure.
FIG. 1 is a flowchart illustrating an image segmentation method according to an exemplary embodiment of the present disclosure.
FIG. 2 is a schematic diagram illustrating an interactive operation in an image segmentation method according to an exemplary embodiment of the present disclosure.
FIG. 3 is a schematic diagram illustrating a local cropping process in an image segmentation method according to an exemplary embodiment of the present disclosure.
FIG. 4 is a block diagram illustrating an image segmentation apparatus according to an exemplary embodiment of the present disclosure.
FIG. 5 is a schematic diagram illustrating an electronic device for image segmentation according to an exemplary embodiment of the present disclosure.
FIG. 6 is a schematic diagram illustrating an electronic device for image segmentation according to another exemplary embodiment of the present disclosure.
Detailed Description
To help those of ordinary skill in the art better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings.
It should be noted that the terms "first", "second" and the like in the description and claims of the present disclosure and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present disclosure described herein can be implemented in orders other than those illustrated or described herein. The implementations described in the following embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should be noted here that "at least one of several items" in the present disclosure covers three parallel cases: "any one of the items", "a combination of any number of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing at least one of step 1 and step 2" covers the following three parallel cases: (1) performing step 1; (2) performing step 2; (3) performing both step 1 and step 2.
FIG. 1 is a flowchart of an image segmentation method according to an exemplary embodiment of the present disclosure. It should be understood that the image segmentation scheme according to the exemplary embodiments of the present disclosure can be implemented in any device with image processing capability. For example, the method can be performed in a terminal device with image processing capability. Here, the terminal device may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, a netbook, a personal digital assistant (PDA), or an augmented reality (AR) or virtual reality (VR) device. An application can run on the terminal device, and the image segmentation processing can be implemented in that application. The image segmentation method may also be performed on a server, for example. The exemplary embodiments of the present disclosure place no limitation on this.
As shown in FIG. 1, in step S110, an interactive operation on an image is received.
Here, the image may be an image in a predetermined format (for example, an RGB picture) displayed to the user on the screen of an electronic device. The image contains a target area that the user is interested in and wants to segment, as well as a background area other than the target; the user can segment the image of the target area out of the image through multiple interactive operations. According to an exemplary embodiment of the present disclosure, the processing described with reference to FIG. 1 may be performed for each user interactive operation.
According to an exemplary embodiment of the present disclosure, the interactive operation may include a scribble operation and/or a click operation. For example, the user may perform a scribble operation by moving an input means (for example, a stylus or a finger) along some trajectory over the image displayed on a touch screen, and the electronic device may receive the user's interactive operation by processing the signals received by its sensors. Generally, a scribble performed in the target area of interest is called a positive scribble, and a scribble performed in the background area is called a negative scribble. In a single interactive task, several rounds of positive and negative interaction, back and forth, are usually needed to obtain a result the user is satisfied with.
FIG. 2 is a schematic diagram of an interactive operation according to an exemplary embodiment of the present disclosure. As shown on the left of FIG. 2, the user wants to segment the target area of interest, a tennis player, from the image, so the user draws a trajectory, shown as a line, on the player's body to perform a positive scribble. In the interaction heat map on the right of FIG. 2, the pixel value at the scribbled positions may be set to 255 (white in FIG. 2) and the pixel value at all other positions to 0 (everything outside the line is black), which yields the mask of the user's scribble trajectory. The width of the line may be adjusted according to the application form of the specific product.
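The construction of the scribble mask (255 on the stroke, 0 elsewhere) can be sketched as follows. This is a minimal illustration, not the disclosure's implementation: the function name `scribble_to_mask`, the `(x, y)` point format, the linear interpolation between sampled points, and the square brush controlled by `brush_radius` are all assumptions.

```python
import numpy as np

def scribble_to_mask(points, height, width, brush_radius=1):
    """Rasterize a scribble trajectory into a uint8 mask.

    points: sequence of (x, y) pixel coordinates sampled along the trajectory.
    brush_radius: hypothetical half-width of the stroke, in pixels.
    Returns an array with 255 on the stroke and 0 everywhere else.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    pts = np.asarray(points, dtype=float)
    if len(pts) == 1:
        # A single click is treated as a zero-length stroke.
        pts = np.vstack([pts, pts])
    for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]):
        # Sample at least one point per pixel along the segment.
        steps = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
        for t in np.linspace(0.0, 1.0, steps + 1):
            x = int(round(x0 + t * (x1 - x0)))
            y = int(round(y0 + t * (y1 - y0)))
            # Stamp a square brush, clamped to the image bounds.
            top, bottom = np.clip([y - brush_radius, y + brush_radius + 1], 0, height)
            left, right = np.clip([x - brush_radius, x + brush_radius + 1], 0, width)
            mask[top:bottom, left:right] = 255
    return mask
```

Changing `brush_radius` corresponds to adjusting the line width for a given product.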
In step S120, a corresponding area image is cropped from the image according to the range of the interactive operation.
According to an exemplary embodiment of the present disclosure, cropping the corresponding area image from the image according to the range of the interactive operation may include: determining an interaction area corresponding to the trajectory of the interactive operation; and cropping the area image corresponding to the interaction area from the image.
According to an exemplary embodiment of the present disclosure, the interaction area may be a circular area. For example, the area of the minimum enclosing circle surrounding the interaction trajectory may be determined as the interaction area. In one example, the maximum length of a line connecting coordinates of the trajectory may be determined, and the minimum enclosing circle may be obtained with that maximum length as its diameter. Cropping the area image corresponding to the interaction area from the image may include: cropping, from the image, the area image corresponding to a rectangular area enclosing the circular area. In one example, pixels of value 0 or of the average value may be padded around the circular area to obtain the minimum rectangular area enclosing the circular area, and the area image corresponding to that minimum rectangular area is then cropped from the image. Alternatively, a partial image corresponding to the circular area may be cropped from the image and processed into a rectangular image to obtain the area image. For example, pixels of value 0 or of the average value may be added around the area image of the circular area to obtain an image of a rectangular area.
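One way to turn a circular interaction area into a rectangular area image is sketched below, assuming the circle's center and radius are already known. The function name `crop_circle_area`, the `(cx, cy)` convention, and the returned box layout are assumptions made for illustration.

```python
import numpy as np

def crop_circle_area(image, center, radius, fill=0):
    """Crop the square enclosing a circle and pad pixels outside the circle.

    center is (cx, cy) in pixels; fill is the padding value
    (0 here; an average pixel value could be passed instead).
    Returns the area image and its box (top, left, bottom, right), end-exclusive.
    """
    cx, cy = center
    h, w = image.shape[:2]
    # Bounding square of the circle, clamped to the image bounds.
    top, bottom = max(int(cy - radius), 0), min(int(cy + radius) + 1, h)
    left, right = max(int(cx - radius), 0), min(int(cx + radius) + 1, w)
    area = image[top:bottom, left:right].copy()
    # Pixels of the crop that lie outside the circle receive the fill value.
    ys, xs = np.mgrid[top:bottom, left:right]
    outside = (xs - cx) ** 2 + (ys - cy) ** 2 > radius ** 2
    area[outside] = fill
    return area, (top, left, bottom, right)
```

The returned box records where the crop came from, so a result mask predicted on the crop can later be placed back into the full-size mask.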
According to an exemplary embodiment of the present disclosure, the interaction area may be a rectangular area, and cropping the area image corresponding to the interaction area from the image includes: cropping the area image corresponding to the rectangular area from the image.
It should be understood that the shape of the interaction area is not limited here; an area of any shape that covers the trajectory of the user's interactive operation may be used, and the interaction area may be processed into a rectangular image in various ways for the subsequent segmentation processing.
According to an exemplary embodiment of the present disclosure, the range of the interactive operation may be the minimum rectangular area covering the trajectory of the user's interactive operation; that is, the minimum rectangular area is the interaction area corresponding to the trajectory of the interactive operation. For example, in FIG. 3, the rectangular area 310 is the minimum rectangular area covering the user's scribble trajectory. The minimum rectangular area can be determined from the maximum and minimum values of the trajectory coordinates along the screen coordinate axes. In addition, according to an exemplary embodiment of the present disclosure, the area image obtained from the user's interactive operation may be expanded to capture better context information, so that the target area the user is interested in can be determined accurately. According to an exemplary embodiment of the present disclosure, in step S120 the rectangular area obtained from the range of the user's interactive operation may be expanded by a predetermined ratio. As shown in FIG. 3, the rectangular area 310 covered by the user's interaction may be adaptively expanded according to a predetermined expansion ratio, yielding a more accurate segmentation area 320. Here, the expansion ratio is the ratio of the length of the added portion to the original length.
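The minimum rectangle and its expansion can be sketched as below. This is a hedged illustration: the function names, the inclusive `(left, top, right, bottom)` layout, splitting the added length equally between the two sides, and clamping to the image bounds are assumptions not stated in the disclosure.

```python
import numpy as np

def min_bounding_rect(points):
    """Minimum axis-aligned rectangle covering (x, y) trajectory points.

    Returns (left, top, right, bottom) with inclusive pixel coordinates.
    """
    pts = np.asarray(points)
    left, top = pts.min(axis=0)
    right, bottom = pts.max(axis=0)
    return int(left), int(top), int(right), int(bottom)

def expand_rect(rect, ratio_w, ratio_h, image_w, image_h):
    """Expand a rectangle so the added length equals ratio * original length.

    The added length is split equally between both sides (assumption),
    and the result is clamped to the image bounds (assumption).
    """
    left, top, right, bottom = rect
    dx = (right - left) * ratio_w / 2.0
    dy = (bottom - top) * ratio_h / 2.0
    return (
        max(int(round(left - dx)), 0),
        max(int(round(top - dy)), 0),
        min(int(round(right + dx)), image_w - 1),
        min(int(round(bottom + dy)), image_h - 1),
    )
```

Passing different values for `ratio_w` and `ratio_h` corresponds to expanding the length and the width by different ratios.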
According to an exemplary embodiment of the present disclosure, the predetermined ratio used for the expansion is negatively correlated with the size of the rectangular area covered by the user's interactive operation, and/or the predetermined ratio is negatively correlated with the degree to which the image is magnified. That is, the expansion ratio according to the exemplary embodiment of the present disclosure depends on two factors: the proportion of the original image occupied by the rectangular area covered by the user's interactive operation, and the factor by which the user has magnified the original image during the interaction. The larger the proportion of the original image occupied by the rectangular area, the smaller the expansion ratio may be; and the larger the factor by which the user has magnified the original image, the smaller the expansion ratio may be. This ensures that a reasonably suitable cropping area is obtained for the user's different interactive operations: one that contains enough information for the subsequent target segmentation, yet not so much that processing speed suffers.
According to an exemplary embodiment of the present disclosure, expanding the rectangular area by the predetermined ratio includes: enlarging the length of the rectangular area by a first ratio and enlarging the width of the rectangular area by a second ratio. The first ratio and the second ratio may be the same or different. That is, the length and width of the rectangular area may be expanded by the same ratio or by different ratios, or the first ratio and the second ratio may have a preset relationship; no limitation is imposed here.
In addition, according to an exemplary embodiment of the present disclosure, the cropped area image may be determined from the minimum rectangular area in a piecewise manner. When the ratio of the size of the rectangular area to the size of the image is less than a first threshold, the predetermined ratio is set to a first value; when that ratio is greater than the first threshold, the predetermined ratio is set to a second value, where the first value is greater than the second value and both values are negatively correlated with the factor by which the user has magnified the image. For example, when the ratio of the size of the rectangular area to the size of the image is less than the first threshold, the predetermined ratio ER is set to ER=α/z; when it is greater than the first threshold, ER is set to ER=β/z, where α>β, z denotes the user's magnification factor for the image, and α and β are preset according to the business scenario, for example α=2.0 and β=1.5. The first threshold may be a value between 0 and 1, for example 0.3. The piecewise function can therefore be written as Equation 1:
ER = α/z, if Area(R_min)/Area(I_RGB) < first threshold
ER = β/z, if Area(R_min)/Area(I_RGB) > first threshold    (1)
where Area(R_min) denotes the size of the minimum rectangular area and Area(I_RGB) denotes the size of the original image.
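As a hedged illustration of the piecewise rule, the sketch below computes ER from the two area values and the magnification factor, using the example values α=2.0, β=1.5 and a first threshold of 0.3 as defaults. The function and parameter names are hypothetical, and assigning the exact-equality case to the β branch is an assumption, since the text only covers "less than" and "greater than".

```python
def expansion_ratio(rect_area, image_area, zoom,
                    alpha=2.0, beta=1.5, first_threshold=0.3):
    """Return the expansion ratio ER for the minimum rectangle.

    rect_area and image_area are Area(R_min) and Area(I_RGB); zoom is z.
    """
    if zoom <= 0 or image_area <= 0:
        raise ValueError("zoom and image_area must be positive")
    share = rect_area / image_area
    # Assumption: a share exactly equal to the threshold uses beta.
    base = alpha if share < first_threshold else beta
    return base / zoom
```

Note that for large z the ratio can drop below 1 (for example β/z with z=2 gives 0.75), so a caller may want to clamp the result to at least 1 if the crop should never be smaller than the interaction rectangle; the source does not address this case.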
According to an exemplary embodiment of the present disclosure, when the ratio of the size of the cropped area image to the size of the image is less than a second threshold, the cropped area image is expanded into an area of a preset size, where the second threshold is smaller than the first threshold and is negatively correlated with the factor by which the user magnifies the image. Here, the area image of the preset size is, for example, an area image whose area is a predetermined proportion (for example, 0.2) of the original image; this proportion can be adjusted as needed. In other words, even after the minimum rectangular area has been expanded and cropped as described above, the cropped area image may still be too small for the target of interest to be segmented quickly. A lower bound can therefore be placed on the crop area Area(R_expand): when Area(R_expand)/Area(I_RGB) is less than a hyperparameter threshold τ (that is, the second threshold), the R_expand area can be expanded further. τ is defined as follows:
τ=γ/z
where τ decreases as z increases. According to an exemplary embodiment of the present disclosure, γ may default to 0.34. With the piecewise expansion and the lower bound described above, the local area image can be cropped more finely.
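The lower-bound check can be sketched as below, using the example values γ=0.34 and a preset area share of 0.2. This is only one possible reading: the function names are hypothetical, and growing the crop about its centre while keeping its aspect ratio and shifting it to stay inside the image are assumptions, since the text only says the crop is expanded to an area of a preset size.

```python
import math


def lower_bound(zoom, gamma=0.34):
    """Second threshold tau = gamma / z."""
    return gamma / zoom


def enforce_min_crop(rect, image_w, image_h, zoom,
                     gamma=0.34, preset_share=0.2):
    """Grow rect to roughly preset_share of the image if it is below tau.

    rect is (left, top, right, bottom), right/bottom exclusive.
    """
    left, top, right, bottom = rect
    w, h = right - left, bottom - top
    image_area = image_w * image_h
    if (w * h) / image_area >= lower_bound(zoom, gamma):
        return rect  # crop is already large enough
    # Assumption: keep the aspect ratio and scale both sides equally.
    scale = math.sqrt(preset_share * image_area / (w * h))
    new_w = min(image_w, math.ceil(w * scale))
    new_h = min(image_h, math.ceil(h * scale))
    cx, cy = (left + right) // 2, (top + bottom) // 2
    # Assumption: shift the grown crop so it stays inside the image.
    new_left = min(max(0, cx - new_w // 2), image_w - new_w)
    new_top = min(max(0, cy - new_h // 2), image_h - new_h)
    return new_left, new_top, new_left + new_w, new_top + new_h
```

Because τ shrinks as z grows, the same small crop that is enlarged at z=1 may be left untouched at a higher magnification.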
It should be understood that the above ways of expanding the range of the user's interactive operation are merely illustrative, and those skilled in the art may make various changes according to the needs of the actual business scenario.
In other embodiments of the present disclosure, the interaction area may also have a geometric shape other than a circle or rectangle, such as an ellipse or the contour of the target at which the interactive operation is directed. The area image can be cropped in several ways:
In one example, the area image corresponding to a rectangular area enclosing the interaction area is cropped from the image. For example, pixels with a value of 0 or the average value may be filled in around the interaction area to obtain the minimum rectangular area enclosing it, and the area image corresponding to that minimum rectangular area is then cropped from the image. It should be understood that the minimum rectangular area is only an example; the enclosing rectangular area need not be the minimum one, and those skilled in the art may make various changes according to the needs of the actual business scenario.
In another example, a partial image corresponding to the interaction area is cropped from the image and processed into a rectangular image to obtain the area image. For example, pixels with a value of 0 or the average value may be added around the partial image of the interaction area to obtain an image of a rectangular area.
As can be seen, the area image obtained is a rectangular image.
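The second example, cropping only the interaction area and padding it into a rectangle, might look like the sketch below. The representation is an assumption: the image and the interaction area are plain nested lists of equal size, with truthy cells of `region` marking the interaction area, and the helper names are hypothetical.

```python
def bounding_rect(region):
    """Smallest (left, top, right, bottom) box around all truthy cells.

    right/bottom are exclusive; returns None for an empty region.
    """
    rows = [y for y, row in enumerate(region) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(region[0])) if any(row[x] for row in region)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def crop_and_pad(image, region, fill=0):
    """Cut out the interaction area and pad it to a rectangle with `fill`."""
    rect = bounding_rect(region)
    if rect is None:
        return []
    left, top, right, bottom = rect
    return [
        [image[y][x] if region[y][x] else fill for x in range(left, right)]
        for y in range(top, bottom)
    ]
```

To pad with the average value instead of 0, a caller could compute the mean of the pixels inside the region and pass it as `fill`.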
In step S130, a result mask of the interactive operation is obtained from the first image mask produced by the previous interactive operation, the area image, and the mask of the interactive operation. The result mask output here is the segmentation mask corresponding to the cropped area image.
As described above, each interactive operation uses the image mask obtained after the previous interaction, the area image cropped in step S120, and the mask of the current interactive operation. These are input into a trained interactive image segmentation inference model, which outputs the result mask of the current interaction. The inference model may be a convolutional neural network (CNN) model for interactive image segmentation obtained through mask-guided restorative interactive training; given the input picture, the user guidance, and the mask result of the previous interactive operation, the network produces a predicted mask for the input picture. It should be understood that the present application may also use other trained feed-forward inference models to perform inference and obtain the corresponding mask.
According to an exemplary embodiment of the present disclosure, the area cropped according to the interactive operation is resized to a predetermined size before being input into the interactive image segmentation inference model. This is because models typically require input of a fixed size, so the cropped area can be resized to a fixed size for the model's convenience.
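Resizing the crop to a fixed model input size can be illustrated with a simple nearest-neighbour resampler. The source does not name an interpolation method, so nearest-neighbour is an assumption chosen for brevity, and `resize_nearest` is a hypothetical helper operating on nested lists.

```python
def resize_nearest(grid, out_w, out_h):
    """Resize a 2-D grid to out_w x out_h by nearest-neighbour sampling."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Presumably the same kind of resampling would be applied in reverse to bring the model's output mask back to the crop's original size before it is written into the full-image mask, although the text does not spell this step out.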
In step S140, the first image mask is updated to a second image mask using the result mask of the interactive operation, and the segmented image is obtained based on the second image mask. Once the result mask for the crop area has been obtained, it is written into the corresponding area of the full-image mask left by the previous interactive operation, yielding the full-image mask after the current interaction. That is, the position of the result mask can be recorded, and the result mask can be written into the previous full-image mask at that position to obtain a new full-image mask. The new full-image mask can then be used to obtain the image of the target area the user is interested in, which is the final segmentation result. It should be understood that if there is no earlier interactive operation (that is, the current interactive operation is the first one), the first image mask is the initial image mask, all of whose pixel values may be 0.
For example, if the user's interactive input is a positive smear (that is, a smear on the object of interest), the output result mask may be written into the corresponding area of the full-image mask with a pixel value of 255. If the user's input is a negative smear (that is, a smear on the background area), the output result mask may be written into the corresponding area of the full-image mask with a pixel value of 0.
The image segmentation method according to exemplary embodiments of the present disclosure can, in each interaction, adaptively crop the image locally according to the range of the user's operation and then feed the crop into the interactive segmentation model to generate a segmentation mask for the user's target of interest. In the related art, by contrast, the results of all interactive operations before the current one must be input into the interactive segmentation model, so an error in a single interactive operation interferes with the final segmentation result and the tolerance for error is low. With the method of the present disclosure, fewer interactive operations are input into the interactive segmentation network, so the method is simple: even if some erroneous user operations lead to an unsatisfactory segmentation result for one interaction, the subsequent prediction process is not affected. The method therefore tolerates errors well and makes it much easier for the user to segment and extract the object of interest. At the same time, cropping the area image according to the user's interactive operation yields a finer local area, which helps the user extract a high-precision mask of the object of interest and thus obtain a more accurate segmented image.
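One way to read the positive/negative update rule is sketched below. The interpretation is an assumption: pixels that the result mask selects are written as 255 for a positive smear and as 0 for a negative smear, and all other pixels of the full-image mask are left unchanged. The function name and the `left`/`top` offset parameters are hypothetical.

```python
def update_full_mask(full_mask, result_mask, left, top, positive):
    """Write result_mask into a copy of full_mask at offset (left, top).

    Selected (truthy) result pixels become 255 for a positive smear and
    0 for a negative smear. The input full_mask is not modified.
    """
    value = 255 if positive else 0
    updated = [row[:] for row in full_mask]
    for dy, row in enumerate(result_mask):
        for dx, selected in enumerate(row):
            if selected:
                updated[top + dy][left + dx] = value
    return updated
```

Starting from an all-zero mask for the first interaction matches the initial image mask described in step S140.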
FIG. 4 is a block diagram illustrating an image segmentation apparatus according to an exemplary embodiment of the present disclosure. It should be understood that the image segmentation apparatus may be implemented in software, hardware, or a combination of software and hardware in an electronic device having an image processing function.
As shown in FIG. 4, the image segmentation apparatus 400 according to an exemplary embodiment of the present disclosure may include an interaction unit 410, a cropping unit 420, an inference unit 430, and a mask update unit 440.
The interaction unit 410 is configured to receive an interactive operation on an image. The cropping unit 420 is configured to crop a corresponding area image from the image according to the range of the interactive operation.
The inference unit 430 is configured to obtain a result mask of the interactive operation from the first image mask produced by the previous interactive operation, the area image, and the mask of the interactive operation.
The mask update unit 440 is configured to update the first image mask to a second image mask using the result mask of the interactive operation, and to obtain the segmented image based on the second image mask.
According to an exemplary embodiment of the present disclosure, the interactive operation includes a smear and/or click operation on the image.
According to an exemplary embodiment of the present disclosure, the cropping unit is configured to: determine an interaction area corresponding to the trajectory of the interactive operation; and crop the area image corresponding to the interaction area from the image.
According to an exemplary embodiment of the present disclosure, the interaction area is a circular area, and the cropping unit is configured to: crop from the image the area image corresponding to a rectangular area enclosing the circular area; or crop from the image a partial image corresponding to the circular area and process the partial image into a rectangular image to obtain the area image.
According to an exemplary embodiment of the present disclosure, the interaction area is a rectangular area, and the cropping unit is configured to crop the area image corresponding to the rectangular area from the image.
According to an exemplary embodiment of the present disclosure, the cropping unit 420 may be configured to expand the rectangular area by a predetermined ratio and crop from the image the area image corresponding to the expanded rectangular area.
According to an exemplary embodiment of the present disclosure, the predetermined ratio is negatively correlated with the size of the rectangular area, and/or the predetermined ratio is negatively correlated with the degree to which the image is magnified during the interactive operation.
According to an exemplary embodiment of the present disclosure, the cropping unit 420 may be configured to enlarge the length of the rectangular area by a first ratio and enlarge the width of the rectangular area by a second ratio. The first ratio and the second ratio may be independent of each other.
According to an exemplary embodiment of the present disclosure, the cropping unit 420 may be configured to: set the predetermined ratio to a first value when the ratio of the size of the rectangular area to the size of the image is less than a first threshold; and set the predetermined ratio to a second value when that ratio is greater than the first threshold, where the first value is greater than the second value and both values are negatively correlated with the factor by which the user has magnified the image.
According to an exemplary embodiment of the present disclosure, the cropping unit 420 may be configured to expand the area image into an area of a first preset size when the ratio of the size of the area image to the size of the image is less than a second threshold, where the second threshold is smaller than the first threshold and is negatively correlated with the factor by which the image is magnified.
According to an exemplary embodiment of the present disclosure, the inference unit 430 may be configured to resize the area image to a second preset size and obtain the result mask of the interactive operation based on the resized area image.
The image segmentation method according to exemplary embodiments of the present disclosure has been described above with reference to FIG. 1. Its steps correspond one-to-one to the units of the image segmentation apparatus shown in FIG. 4, so the description is not repeated here.
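To show how the four units could fit together, the toy sketch below wires cropping, inference and mask updating into one class. Everything here is hypothetical: the class and method names, the nested-list image representation, and in particular `model`, which stands in for the trained inference model and is simply any callable taking the cropped image, the cropped previous mask and the cropped interaction mask. The update rule (selected pixels become 255 for a positive smear, 0 for a negative one) is one interpretation of the description, not a confirmed detail.

```python
class InteractiveSegmenter:
    """Toy counterpart of the interaction, cropping, inference and update units."""

    def __init__(self, image, model):
        self.image = image
        self.model = model  # hypothetical stand-in for the trained model
        # Initial full-image mask: all pixels 0.
        self.full_mask = [[0] * len(image[0]) for _ in image]

    def interact(self, interaction_mask, rect, positive=True):
        """Process one interaction restricted to rect (left, top, right, bottom)."""
        left, top, right, bottom = rect

        def crop(grid):
            return [row[left:right] for row in grid[top:bottom]]

        # Inference on the crop only, guided by the previous mask.
        result = self.model(crop(self.image), crop(self.full_mask),
                            crop(interaction_mask))
        # Write the result back into the full-image mask at the crop position.
        value = 255 if positive else 0
        for dy, row in enumerate(result):
            for dx, selected in enumerate(row):
                if selected:
                    self.full_mask[top + dy][left + dx] = value
        return self.full_mask


def echo_model(region_image, previous_mask, interaction_mask):
    """Placeholder model: selects exactly the pixels the user touched."""
    return interaction_mask
```

Because each call only sees the current crop and the previous full-image mask, a poor result in one interaction can be overwritten by a later one, which is the error-tolerance property the description emphasizes.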
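FIG. 5 is a structural block diagram illustrating an electronic device 500 for image segmentation and video processing according to an exemplary embodiment of the present disclosure. The electronic device 500 may be, for example, a smartphone, a tablet computer, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The electronic device 500 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
Generally, the electronic device 500 includes a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, for example a 4-core or 8-core processor. The processor 501 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA (Field Programmable Gate Array), and PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor. The main processor, also called a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In an exemplary embodiment of the present disclosure, the processor 501 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 502 may include one or more computer-readable storage media, which may be non-transitory. The memory 502 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 502 stores at least one instruction, which is executed by the processor 501 to implement the image segmentation method of the exemplary embodiments of the present disclosure.
In some embodiments, the electronic device 500 optionally further includes a peripheral device interface 503 and at least one peripheral device. The processor 501, the memory 502, and the peripheral device interface 503 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 503 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 504, a touch display screen 505, a camera 506, an audio circuit 507, a positioning component 508, and a power supply 509.
The peripheral device interface 503 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 501 and the memory 502. In some embodiments, the processor 501, the memory 502, and the peripheral device interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of them may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 504 may communicate with other terminals through at least one wireless communication protocol, including but not limited to metropolitan area networks, the various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may also include circuits related to NFC (Near Field Communication), which is not limited in the present disclosure.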
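The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, it can also collect touch signals on or above its surface. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, disposed on the front panel of the electronic device 500; in other embodiments, there may be at least two display screens 505, disposed on different surfaces of the terminal 500 or in a folded design; in some embodiments, the display screen 505 may be a flexible display screen disposed on a curved or folding surface of the terminal 500. The display screen 505 may even be given a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 505 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal and the rear camera on its back. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be combined for background blurring, and the main camera and the wide-angle camera can be combined for panoramic shooting, VR (Virtual Reality) shooting, or other combined shooting functions. In some embodiments, the camera assembly 506 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment and converts them into electrical signals, which are input to the processor 501 for processing or to the radio frequency circuit 504 for voice communication. For stereo capture or noise reduction, there may be multiple microphones disposed at different parts of the terminal 500. The microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a conventional membrane speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to locate the current geographic position of the electronic device 500 for navigation or LBS (Location Based Service). The positioning component 508 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 509 supplies power to the components of the electronic device 500. The power supply 509 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the battery may support wired or wireless charging and may also support fast charging technology.
In some embodiments, the electronic device 500 further includes one or more sensors 510, including but not limited to an acceleration sensor 511, a gyroscope sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515, and a proximity sensor 516.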
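The acceleration sensor 511 can detect the magnitude of acceleration along the three axes of the coordinate system established with respect to the terminal 500. For example, the acceleration sensor 511 can be used to detect the components of gravitational acceleration along the three axes. Based on the gravitational acceleration signal collected by the acceleration sensor 511, the processor 501 can control the touch display screen 505 to display the user interface in landscape or portrait view. The acceleration sensor 511 can also be used to collect motion data for games or for the user.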
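The gyroscope sensor 512 can detect the body orientation and rotation angle of the terminal 500, and can work with the acceleration sensor 511 to capture the user's 3D actions on the terminal 500. Based on the data collected by the gyroscope sensor 512, the processor 501 can implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 513 may be disposed on the side frame of the terminal 500 and/or beneath the touch display screen 505. When the pressure sensor 513 is disposed on the side frame, it can detect the user's grip on the terminal 500, and the processor 501 performs left/right hand recognition or shortcut operations based on the grip signal collected by the pressure sensor 513. When the pressure sensor 513 is disposed beneath the touch display screen 505, the processor 501 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 505. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 514 is used to collect the user's fingerprint. Either the processor 501 identifies the user from the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 itself identifies the user from the collected fingerprint. When the user's identity is recognized as trusted, the processor 501 authorizes the user to perform sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 514 may be disposed on the front, back, or side of the electronic device 500. When the electronic device 500 has a physical button or a manufacturer logo, the fingerprint sensor 514 may be integrated with it.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 may control the display brightness of the touch display screen 505 according to the ambient light intensity collected by the optical sensor 515: when the ambient light intensity is high, the display brightness is increased; when it is low, the display brightness is decreased. In another embodiment, the processor 501 may also dynamically adjust the shooting parameters of the camera assembly 506 according to the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also called a distance sensor, is usually disposed on the front panel of the electronic device 500 and is used to measure the distance between the user and the front of the electronic device 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 is gradually decreasing, the processor 501 switches the touch display screen 505 from the screen-on state to the screen-off state; when the proximity sensor 516 detects that this distance is gradually increasing, the processor 501 switches the touch display screen 505 from the screen-off state to the screen-on state.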
Those skilled in the art will understand that the structure shown in FIG. 5 does not limit the electronic device 500, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
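FIG. 6 is a structural block diagram of another electronic device 600. For example, the electronic device 600 may be provided as a server. Referring to FIG. 6, the electronic device 600 includes one or more processors 610 and a memory 620. The memory 620 may include one or more programs for executing the image segmentation method described above. The electronic device 600 may also include a power supply component 630 configured to perform power management of the electronic device 600, a wired or wireless network interface 640 configured to connect the electronic device 600 to a network, and an input/output (I/O) interface 650. The electronic device 600 may operate based on an operating system stored in the memory 620, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.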
According to an embodiment of the present disclosure, a computer-readable storage medium storing instructions may also be provided, where the instructions, when executed by at least one processor, cause the at least one processor to perform the image segmentation method according to the present disclosure. Examples of the computer-readable storage medium include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid state drive (SSD), card memory (such as a multimedia card, Secure Digital (SD) card, or Extreme Digital (XD) card), magnetic tape, floppy disk, magneto-optical data storage device, optical data storage device, hard disk, solid state disk, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide them to a processor or computer so that the processor or computer can execute the computer program. The computer program in the computer-readable storage medium may run in an environment deployed on computer equipment such as a client, host, proxy device, or server. In addition, in one example, the computer program and any associated data, data files, and data structures are distributed over networked computer systems, so that they are stored, accessed, and executed in a distributed manner by one or more processors or computers.
According to an embodiment of the present disclosure, a computer program product may also be provided, the instructions of which can be executed by a processor of a computer device to carry out the image segmentation method described above.
The image segmentation method according to exemplary embodiments of the present disclosure can adaptively crop the image locally in each interaction according to the range of the user's operation. Fewer interactive operations are input into the interactive segmentation network, so the method tolerates errors well. A finer local area can be obtained, and with it a more accurate segmented image.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the solutions disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210380537.XA CN114897915B (en) | 2022-04-08 | 2022-04-08 | Image segmentation method, device, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114897915A true CN114897915A (en) | 2022-08-12 |
CN114897915B CN114897915B (en) | 2025-04-11 |
Family
ID=82717447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210380537.XA Active CN114897915B (en) | 2022-04-08 | 2022-04-08 | Image segmentation method, device, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114897915B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116797624A (en) * | 2023-05-30 | 2023-09-22 | 杭州易现先进科技有限公司 | Method and device for determining occlusion relationship |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210225002A1 (en) * | 2021-01-28 | 2021-07-22 | Intel Corporation | Techniques for Interactive Image Segmentation Networks |
US20210227152A1 (en) * | 2020-01-20 | 2021-07-22 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for generating image |
WO2021150017A1 (en) * | 2020-01-23 | 2021-07-29 | Samsung Electronics Co., Ltd. | Method for interactive segmenting an object on an image and electronic computing device implementing the same |
CN114092498A (en) * | 2020-07-30 | 2022-02-25 | 达索系统公司 | Method for segmenting an object in an image |
Non-Patent Citations (1)
Title |
---|
文韬;章义来;彭永康;: "基于闭合解抠图算法的图像边缘分割改进研究", 信息与电脑(理论版), no. 14, 25 July 2020 (2020-07-25) * |
Also Published As
Publication number | Publication date |
---|---|
CN114897915B (en) | 2025-04-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |