
WO2025052642A1 - Training information creation assistance device, image classification device, and program - Google Patents


Info

Publication number
WO2025052642A1
Authority
WO
WIPO (PCT)
Prior art keywords
category
score
image
uncertainty
correction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/JP2023/032758
Other languages
French (fr)
Japanese (ja)
Inventor
壮太 小松
昌義 石川
軍 陳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi High Tech Corp
Original Assignee
Hitachi High Tech Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi High Tech Corp filed Critical Hitachi High Tech Corp
Priority to PCT/JP2023/032758 (WO2025052642A1)
Priority to TW113128043 (TW202512113)
Publication of WO2025052642A1
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Definitions

  • The present invention relates to a teacher information creation support device, an image classification device, and a program for reducing the cost of creating teacher data in, for example, an image classification system with a learning function.
  • Creating a pixel-by-pixel image classification model using deep learning requires teacher information, but assigning that teacher information is costly.
  • As a countermeasure, there is interactive segmentation, in which the model learns in collaboration with the user: the user indicates which parts of the model's inference results to correct and how to correct them.
  • One way to select the parts to be corrected is to pick, from among the parts inferred as the focus category, those with a low inference probability. However, parts that are not inferred as the focus category are never selected, so false-negative parts are missed and accuracy does not improve.
  • Patent Document 1 exploits the tendency of the inference probability to drop, selecting areas with a low model inference probability as correction candidates; however, misclassified areas unrelated to the focus category may be selected, making it difficult to improve the accuracy of the focus category.
  • Patent Document 2 selects, from among the parts classified as the focus category, those with a low inference probability as correction candidates; however, parts not inferred as the focus category are never selected, so false-negative parts are missed and accuracy does not improve.
  • The purpose of this invention is therefore to present correction-required parts, including false negatives, that are expected to improve the accuracy of the focus category.
  • An example of the teacher information creation support device comprises: a category score calculation unit that, for an image consisting of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category; an inference image determination unit that calculates an inference image based on the category scores; an adjacent category calculation unit that receives an input of a focus category and, based on the inference image and the focus category, calculates an adjacent category that is likely to be adjacent to the focus category; and a correction-required score calculation unit that determines a correction-required score for at least one pixel based on the focus category, the adjacent category, and the category scores.
  • An example of the image classification device includes: an image category inference unit that, for an image consisting of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category, and calculates an inference image based on the category scores; and a teacher information creation GUI having a focus category setting unit.
  • The focus category setting unit receives an input of a focus category, and the image classification device calculates the parts to be corrected based on the focus category and the category scores.
  • The teacher information creation GUI outputs the parts that need correction and accepts the assignment of teacher information via a teacher information assignment unit.
  • One example of the program according to the present invention causes a computer to function as the teacher information creation support device described above.
  • Another example of the program according to the present invention causes a computer to function as the image classification device described above.
  • According to the present invention, correction-required parts can be presented that lead to improved accuracy against false negatives of the focus category.
  • FIG. 1 is an example of a functional configuration of a teacher information creation support device according to a first embodiment of the present invention.
  • FIG. 2 shows an example of processing by an adjacent category calculation unit.
  • FIG. 3 shows an example of a processing result of an adjacent category calculation unit.
  • FIG. 4 illustrates an example of a correction-required score calculation unit.
  • FIG. 5 shows another example of how to calculate the uncertainty.
  • FIG. 6 shows another example of how to calculate the correction-required score.
  • FIG. 7 shows yet another example of how to calculate the correction-required score.
  • FIG. 8 is an example of a functional configuration of an image classification device according to a second embodiment of the present invention.
  • FIG. 9 is an example of a functional configuration of an image classification device according to a third embodiment of the present invention.
  • FIG. 10 is another example of a functional configuration of an image classification device according to a third embodiment of the present invention.
  • FIG. 11 is an example of a teacher information creation GUI.
  • FIG. 12 is another example of a teacher information creation GUI.
  • [Embodiment 1] FIG. 1 shows an example of the functional configuration of a teacher information creation support device according to the first embodiment of the present invention.
  • The teacher information creation support device 100 receives an input image D1, a segmentation model D2, and a focus category D4 as inputs, and includes a category score calculation unit 101, an inference image determination unit 102, an adjacent category calculation unit 103, and a correction-required score calculation unit 104.
  • The category score calculation unit 101 uses the segmentation model D2 to calculate, for each combination of pixel and category in the input image D1 consisting of multiple pixels, a category score D3 indicating the degree of certainty that the pixel belongs to that category.
  • The category score D3 is, for each pixel of the input image D1, a vector whose values indicate the likelihood of each candidate category.
  • The inference image determination unit 102 calculates an inference image D5 based on the category scores D3.
  • The inference image D5 is an image in which a single category has been identified for each pixel through inference. Specifically, it can be calculated by taking, for each pixel, the category with the maximum category score D3 as the identified category.
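As a concrete illustration of the per-pixel maximum rule above, the inference image can be sketched as a single argmax over the category axis (a minimal NumPy sketch; the array shapes and score values are hypothetical, not taken from the patent):

```python
import numpy as np

# Hypothetical category scores D3 for a 2x2 image and 3 categories,
# shaped (categories, height, width).
scores = np.array([
    [[0.7, 0.2], [0.1, 0.5]],  # scores for category 0
    [[0.2, 0.6], [0.3, 0.4]],  # scores for category 1
    [[0.1, 0.2], [0.6, 0.1]],  # scores for category 2
])

# Inference image D5: for each pixel, the category with the maximum score.
inference_image = np.argmax(scores, axis=0)
print(inference_image)  # [[0 1]
                        #  [2 0]]
```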
  • The adjacent category calculation unit 103 receives the input of the focus category D4 and calculates the adjacent category D6 based on the inference image D5 and the focus category D4.
  • An adjacent category is a category that is likely to be adjacent to the focus category (i.e., a category with a high adjacency likelihood); a specific example is described later.
  • The correction-required score calculation unit 104 determines the correction-required score D7 based on the focus category D4, the adjacent category D6, and the category scores D3.
  • The correction-required score is calculated, for example, for parts that are difficult for the segmentation model D2 to identify in relation to the focus category; a specific example is described later.
  • The teacher information creation support device 100 can be configured, for example, using a computer.
  • The computer has the hardware configuration of a known computer and includes, for example, calculation means and storage means.
  • The calculation means includes, for example, a processor.
  • The storage means includes, for example, storage media such as semiconductor memory devices and magnetic disk devices. Some or all of the storage media may be non-transitory.
  • The computer may also be equipped with input/output means.
  • The input/output means may include, for example, input devices such as a keyboard and a mouse, output devices such as a display and a printer, and communication devices such as a network interface.
  • The storage means may store a program.
  • When the processor executes this program, the computer functions as the teacher information creation support device 100 according to this embodiment or as a device according to another embodiment.
  • Segmentation model D2 is a trained model that has undergone prior machine learning and has the ability to discriminate images. However, the prior training may be performed using a dataset that has similar characteristics to input image D1, or may be performed using a dataset that is unrelated to input image D1.
  • The adjacent category calculation unit 103 extracts, from the inference image D5, the pixels inferred to belong to the focus category. Then, for each such pixel, the pixels adjacent to it are extracted as adjacent pixels (pixels that themselves belong to the focus category are not extracted as adjacent pixels).
  • The group of pixels adjacent to the location inferred to be category 1 is calculated as adjacent pixel group 300.
  • The frequency of each category within adjacent pixel group 300 is then calculated, and category 2, the most frequent category, is determined to be the adjacent category.
  • In a semiconductor measurement image, where similar structures are repeated within the image, each category corresponds to a structure of the measurement target.
  • The number of categories adjacent to a particular category is therefore limited, so the above method can determine the most frequent category as the adjacent category.
  • In a road image, for example, the asphalt parts of the road frequently appear next to the focus category, and the above method will determine the asphalt parts to be an adjacent category.
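The neighbour-frequency rule described above can be sketched as follows (a hypothetical helper on a toy inference image, assuming 4-connectivity; the function name is illustrative, not from the patent):

```python
import numpy as np
from collections import Counter

def adjacent_category(inference_image, focus):
    """Most frequent category among non-focus neighbours of focus pixels."""
    h, w = inference_image.shape
    neighbours = []
    for y in range(h):
        for x in range(w):
            if inference_image[y, x] != focus:
                continue
            # Collect 4-neighbours that are NOT the focus category.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and inference_image[ny, nx] != focus:
                    neighbours.append(int(inference_image[ny, nx]))
    # The adjacent category is the one with the highest frequency.
    return Counter(neighbours).most_common(1)[0][0] if neighbours else None

img = np.array([[2, 2, 2, 2],
                [2, 1, 1, 2],
                [3, 1, 1, 2],
                [3, 3, 2, 2]])
print(adjacent_category(img, focus=1))  # -> 2
```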
  • FIG. 4 shows an example of the correction-required score calculation unit 104.
  • For each pixel inferred in the inference image D5 to belong to the focus category D4 or to the adjacent category D6, the correction-required score calculation unit 104 extracts the category score for the focus category D4 and the category score for the adjacent category D6.
  • The correction-required score calculation unit 104 then calculates, from the extracted category scores, an uncertainty indicating the degree to which the segmentation model is insufficiently trained (e.g., contains insufficiently learned features). For example, the uncertainty is calculated as the smallness of the category score for the designated category (the focus category D4 or the adjacent category D6).
  • Specifically, the uncertainty can be 1 minus the inference probability (the value obtained by subtracting the inference probability from 1).
  • The inference probability of a pixel whose features are sufficiently trained will be large, and that of a pixel whose features are insufficiently trained will be small, so areas where 1 minus the inference probability is large (areas with large uncertainty) can be regarded as areas where training is insufficient.
  • Processing step S403 identifies parts with a large uncertainty as parts that need correction, and calculates a correction-required score for the identified parts.
  • The parts requiring correction may be, for example, parts where the uncertainty is equal to or greater than a predetermined threshold.
  • The specific method of calculating the correction-required score may be designed as appropriate by a person skilled in the art; it may be, for example, the sum of the uncertainty for the focus category and the uncertainty for the adjacent category.
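Assuming the category scores are softmax probabilities, the 1-minus-probability uncertainty and the summed correction-required score can be sketched as follows (the function name and threshold value are hypothetical):

```python
import numpy as np

def correction_score(probs, inference_image, focus, adjacent, threshold):
    """probs: (C, H, W) softmax outputs; uncertainty = 1 - inference probability."""
    unc_focus = 1.0 - probs[focus]         # uncertainty w.r.t. the focus category
    unc_adjacent = 1.0 - probs[adjacent]   # uncertainty w.r.t. the adjacent category
    score = unc_focus + unc_adjacent       # e.g. the sum of the two uncertainties
    # Only pixels inferred as the focus or adjacent category are candidates.
    mask = (inference_image == focus) | (inference_image == adjacent)
    score = np.where(mask, score, 0.0)
    return score, score >= threshold       # parts at/above the threshold need correction

probs = np.array([[[0.9, 0.4]],    # category 0 (focus)
                  [[0.05, 0.35]],  # category 1 (adjacent)
                  [[0.05, 0.25]]])
inferred = np.argmax(probs, axis=0)        # both pixels inferred as category 0
score, needs_fix = correction_score(probs, inferred, focus=0, adjacent=1,
                                    threshold=1.2)
print(needs_fix)  # [[False  True]]
```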
  • FIG. 5 shows another example of a method for calculating the uncertainty.
  • As a method for calculating the uncertainty for the adjacent category, the magnitude of the category score for the focus category can be used as the uncertainty at locations inferred to belong to the adjacent category.
  • The correction-required score calculation unit 104 extracts the group of category scores of each pixel inferred in the inference image to belong to the adjacent category. Then, in processing step S502, an uncertainty (second uncertainty) is calculated from the extracted group based on the category score for the focus category.
  • The second uncertainty can be, for example, the magnitude of the category score for the focus category.
  • The correction-required score calculation unit 104 also calculates, for each pixel inferred in the inference image to belong to the focus category, an uncertainty (first uncertainty) based on the category score for the focus category.
  • The correction-required score calculation unit 104 then calculates the correction-required score based on the first uncertainty and the second uncertainty.
  • Even at a location inferred to belong to the adjacent category, the input image may contain features of the focus category, in which case the category score of the focus category may also be high. It is therefore possible to calculate the correction-required score only for those locations within the adjacent category that are more likely to be false negatives of the focus category.
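One way to combine the two uncertainties described above can be sketched as follows (hypothetical function name; the category scores are again assumed to be softmax probabilities):

```python
import numpy as np

def combined_uncertainty(probs, inference_image, focus, adjacent):
    # First uncertainty: low focus-category confidence INSIDE the focus region.
    first = np.where(inference_image == focus, 1.0 - probs[focus], 0.0)
    # Second uncertainty: the focus-category score INSIDE the adjacent region;
    # a high value there hints at a false negative of the focus category.
    second = np.where(inference_image == adjacent, probs[focus], 0.0)
    return first + second  # correction-required score from both uncertainties

probs = np.array([[[0.8, 0.4]],   # focus category 0
                  [[0.2, 0.6]]])  # adjacent category 1
inferred = np.argmax(probs, axis=0)          # pixel 0 -> focus, pixel 1 -> adjacent
u = combined_uncertainty(probs, inferred, focus=0, adjacent=1)
print(u)  # [[0.2 0.4]]
```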
  • With the method described above in Embodiment 1, it is possible to calculate a correction-required score that is expected to improve accuracy with respect to false negatives as well as false positives.
  • FIG. 6 shows another example of a method for calculating the correction-required score.
  • The correction-required score calculation unit 104 smooths the correction-required score D7 of each pixel over a predetermined range including that pixel, and determines the parts where the smoothed correction-required score D8 is large to be the parts that need correction.
  • The specific range and the specific smoothing calculation can be designed as appropriate by a person skilled in the art; for example, a two-dimensional convolution with a Gaussian function or a simple averaging process may be used.
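The simple-average variant of this smoothing can be sketched as follows (a hypothetical pure-NumPy stand-in; a Gaussian convolution would replace the boxcar mean, and the window size is arbitrary):

```python
import numpy as np

def smooth(score, size=3):
    """Smooth the correction-required score D7 with a size x size moving average."""
    pad = size // 2
    padded = np.pad(score, pad, mode="edge")
    out = np.zeros_like(score, dtype=float)
    h, w = score.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + size, x:x + size].mean()
    return out

score = np.zeros((5, 5))
score[2, 2] = 1.0          # one isolated high-score pixel
smoothed = smooth(score)   # D8: the peak is spread over its 3x3 neighbourhood
```

Smoothing spreads an isolated peak over its neighbourhood, so a single noisy pixel no longer dominates the choice of correction-required parts.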
  • FIG. 7 shows yet another example of a method for calculating the correction-required score.
  • In processing step S701, the correction-required score calculation unit 104 identifies parts with a high correction-required score as correction-required parts D9.
  • The definition of a part with a high correction-required score can be determined as appropriate by a person skilled in the art; it may be, for example, a part whose correction-required score is equal to or greater than a threshold value, or an area where the standard deviation of the correction-required score is high.
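The thresholding variant of this identification step can be sketched as follows (hypothetical score values and threshold):

```python
import numpy as np

# Hypothetical correction-required scores for a 2x2 image.
score = np.array([[0.1, 0.8],
                  [0.9, 0.3]])

# Threshold variant: pixels whose score is at or above the threshold
# become the correction-required parts D9.
needs_correction = score >= 0.5
print(needs_correction)  # [[False  True]
                         #  [ True False]]
```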
  • [Embodiment 2] In the second embodiment, a GUI is proposed for creating teacher information using the correction-required score calculated in the first embodiment. Descriptions of parts common to the first embodiment may be omitted.
  • FIG. 8 shows an example of the functional configuration of an image classification device according to the second embodiment of the present invention.
  • The image classification device 800 also functions as a teacher information creation support device similar to that of the first embodiment.
  • The image classification device 800 may further have a configuration similar to that of the teacher information creation support device 100 of FIG. 1.
  • The image classification device 800 receives an input image D801 and a segmentation model D802 as inputs, and is equipped with an image category inference unit 801 and a teacher information creation GUI 802.
  • The image category inference unit 801 uses the segmentation model D802 to calculate, for each combination of pixel and category in the input image D801 consisting of multiple pixels, a category score D804 indicating the degree of certainty that the pixel belongs to that category, and calculates an inference image D803 based on the category scores D804.
  • The inference image D803 and the category scores D804 are calculated in the same way as the inference image D5 and the category scores D3 of FIG. 1, respectively.
  • The teacher information creation GUI 802 is a GUI with which the user creates per-pixel teacher information D805 using the input image D801, the inference image D803, and the category scores D804.
  • The teacher information creation GUI 802 has a function of displaying the input image D801 and the inference image D803 simultaneously to make errors in the inference results easier to find, and a function of having the machine find the parts that need correction from the category scores D804, making the parts to be corrected easy to select.
  • FIG. 11 shows an example of the teacher information creation GUI 802.
  • The teacher information creation GUI has: an input image display unit that displays the input image D801; an inference image display unit that displays the inference image D803; a teacher information assignment unit with a function for assigning teacher information; a focus category setting unit that accepts input of a focus category; a correction-required part display unit that outputs (for example, displays) the parts needing correction, calculated from the input focus category, the input image, and the category scores in the manner of the teacher information creation support device 100 of FIG. 1; an uncertainty display unit that similarly displays the calculated uncertainty; and a model learning start unit that accepts an operation to start training the segmentation model.
  • The uncertainty display unit and the correction-required part display unit present reference information for assigning teacher information, and may be omitted.
  • The device may also have an automatic correction part setting unit that accepts an operation for calculating the correction parts automatically.
  • The image classification device 800 can calculate the parts that need correction based on the focus category and the category scores, for example in the same manner as the teacher information creation support device 100 of Embodiment 1.
  • The teacher information creation GUI 802 accepts the assignment of teacher information via the teacher information assignment unit.
  • The teacher information assignment unit can set a category for each pixel through click operations on a computer or with a pen tablet.
  • In this example, the input image is displayed as the background and the assigned teacher information is superimposed on it, but displaying the input image as the background is not essential.
  • The GUI may also have a function for assigning the same category to multiple correction-required areas with a single operation.
  • The uncertainty display unit displays information to be used as a reference when assigning teacher information. As in the teacher information creation support device 100 of Embodiment 1, the image classification device 800 calculates the uncertainty indicating the degree to which the segmentation model is insufficiently trained, and the uncertainty display unit displays this uncertainty. In the example of FIG. 11, white areas represent areas with high uncertainty.
  • The uncertainty display unit may also display the uncertainty for the focus category (first uncertainty) and the uncertainty for the adjacent category (second uncertainty) shown in FIG. 4 separately. Furthermore, if there are multiple adjacent categories, the second uncertainty may be displayed divided into multiple regions.
  • The correction-required part display unit displays highly uncertain areas as areas that need correction.
  • In this example, the areas that need correction are displayed as white areas.
  • The user can refer to the displayed areas when assigning teacher information. However, if the areas that need correction are selected automatically, the correction-required part display unit need not be shown.
  • FIG. 12 shows another example of the teacher information creation GUI 802.
  • Here, the GUI of FIG. 11 is applied to detecting defects in structures during the semiconductor manufacturing process.
  • Three categories are displayed: the categories "Structure 1" and "Structure 2" represent proper semiconductor structures, and the category "Defect" represents a structural defect.
  • Users of the GUI can efficiently create teacher information that can be used to detect defects in semiconductor structures.
  • [Embodiment 3] In the third embodiment, an image classification device is proposed for a segmentation model in the middle of training, in interactive segmentation in which training of the segmentation model and annotation by the user are performed alternately. Descriptions of parts common to the first and second embodiments may be omitted.
  • FIG. 9 shows an example of the functional configuration of an image classification device according to the third embodiment of the present invention.
  • The image classification device 900 first trains the segmentation model D802 using the teacher information D805 calculated by the method shown in the second embodiment, together with the input image.
  • The model learning unit 901 trains the segmentation model D802 using the input image D801 and the teacher information D805.
  • The segmentation model D802 used here has been trained in advance at the input stage.
  • For this prior training, a small amount of teacher information for the input image D801 may be used, or a public dataset or the like may be used.
  • After the model learning unit 901 trains the segmentation model D802, teacher information D805 is created again through processing by the image category inference unit 801 and so on, and the segmentation model D802 is trained using that teacher information D805. By repeating this process until no misclassifications remain in the inference image D803, a segmentation model D802 with good classification accuracy can be obtained.
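The train/infer/correct loop above can be sketched abstractly as follows. Everything here is a hypothetical stand-in: the dict-based "model" and the toy `train`/`infer` callbacks only simulate the cycle, and a ground-truth table stands in for the user, who in practice supplies the corrections interactively:

```python
def interactive_training(model, image, truth, train, infer, max_rounds=10):
    teacher = {}                          # teacher information D805, grows each round
    for _ in range(max_rounds):
        train(model, image, teacher)      # model learning unit 901
        inference = infer(model, image)   # image category inference unit 801
        wrong = {p for p, c in inference.items() if truth[p] != c}
        if not wrong:                     # stop once no misclassification remains
            return model, inference
        for p in wrong:                   # simulated user corrections
            teacher[p] = truth[p]
    return model, infer(model, image)

# Toy demo: the "model" is a pixel->category dict defaulting to category 0.
truth = {(0, 0): 1, (0, 1): 0}
model, result = interactive_training(
    {}, None, truth,
    train=lambda m, img, t: m.update(t),
    infer=lambda m, img: {p: m.get(p, 0) for p in truth},
)
print(result == truth)  # True
```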
  • FIG. 10 shows another example of the functional configuration of an image classification device according to the third embodiment of the present invention.
  • The image classification device 1000 calculates the uncertainty in the same manner as in the first and second embodiments, and performs training that is weighted by the per-pixel uncertainty.
  • The model learning unit 1001 trains the segmentation model based on the uncertainty information D1001 so that the loss of pixels with high uncertainty is larger than the loss of pixels with low uncertainty.
  • This type of training is expected to improve accuracy: for example, by increasing the per-pixel loss in proportion to the uncertainty and training the model according to that loss, the features of areas with high uncertainty can be learned more strongly.
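A minimal sketch of such uncertainty-proportional weighting of a per-pixel cross-entropy loss (hypothetical function; `probs` is assumed to be softmax output):

```python
import numpy as np

def weighted_pixel_loss(probs, target, uncertainty):
    """probs: (C, H, W) softmax outputs; target: (H, W) true categories."""
    h, w = target.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    p_true = probs[target, rows, cols]      # probability of the true category
    cross_entropy = -np.log(p_true)         # per-pixel cross-entropy
    weights = 1.0 + uncertainty             # larger loss where uncertainty is high
    return float(np.mean(weights * cross_entropy))

probs = np.array([[[0.8]], [[0.2]]])        # one pixel, two categories
target = np.array([[0]])
loss_flat = weighted_pixel_loss(probs, target, np.array([[0.0]]))
loss_weighted = weighted_pixel_loss(probs, target, np.array([[0.5]]))
print(loss_weighted > loss_flat)  # True: uncertain pixels weigh more in training
```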

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This training information creation assistance device comprises: a category score calculation unit that uses a segmentation model for an image which is composed of a plurality of pixels to calculate, for each combination of a pixel and a category, a category score indicating the degree of confidence that the pixel belongs to the category; an inference image determination unit that calculates an inference image on the basis of the category score; an adjacent category calculation unit that receives input of a category of interest and calculates, on the basis of the inference image and the category of interest, an adjacent category which is likely to be adjacent to the category of interest; and a correction-required score calculation unit that determines a correction-required score for at least one pixel on the basis of the category of interest, the adjacent category, and the category score.

Description

Teacher information creation support device, image classification device, and program

The present invention relates to a teacher information creation support device, an image classification device, and a program for reducing the cost of creating teacher data in, for example, an image classification system with a learning function.

Creating a pixel-by-pixel image classification model using deep learning requires teacher information, but assigning that teacher information is costly. As a countermeasure, there is interactive segmentation, in which the model learns in collaboration with the user: the user indicates which parts of the model's inference results to correct and how to correct them. One way to select the parts to be corrected is to pick, from among the parts inferred as the focus category, those with a low inference probability; however, parts that are not inferred as the focus category are never selected, so false-negative parts are missed and accuracy does not improve.

International Publication No. 2020/129235 (Patent Document 1)
International Publication No. 2020/189269 (Patent Document 2)

In Patent Document 1, the tendency of the inference probability to drop is exploited to select areas with a low model inference probability as correction candidates; however, misclassified areas unrelated to the focus category may be selected, making it difficult to improve the accuracy of the focus category.

In Patent Document 2, from among the parts classified as the focus category, those with a low inference probability are selected as correction candidates; however, parts not inferred as the focus category are never selected, so false-negative parts are missed and accuracy does not improve.

The purpose of this invention is therefore to present correction-required parts, including false negatives, that are expected to improve the accuracy of the focus category.

An example of the teacher information creation support device according to the present invention comprises: a category score calculation unit that, for an image consisting of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category; an inference image determination unit that calculates an inference image based on the category scores; an adjacent category calculation unit that receives an input of a focus category and, based on the inference image and the focus category, calculates an adjacent category that is likely to be adjacent to the focus category; and a correction-required score calculation unit that determines a correction-required score for at least one pixel based on the focus category, the adjacent category, and the category scores.

An example of the image classification device according to the present invention includes: an image category inference unit that, for an image consisting of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category, and calculates an inference image based on the category scores; and a teacher information creation GUI having a focus category setting unit. The focus category setting unit receives an input of a focus category, the image classification device calculates the parts to be corrected based on the focus category and the category scores, and the teacher information creation GUI outputs the parts that need correction and accepts the assignment of teacher information via a teacher information assignment unit.

One example of the program according to the present invention causes a computer to function as the teacher information creation support device described above.

Another example of the program according to the present invention causes a computer to function as the image classification device described above.

According to the present invention, correction-required parts can be presented that lead to improved accuracy against false negatives of the focus category.

FIG. 1 is an example of a functional configuration of a teacher information creation support device according to a first embodiment of the present invention.
FIG. 2 shows an example of processing by an adjacent category calculation unit.
FIG. 3 shows an example of a processing result of an adjacent category calculation unit.
FIG. 4 illustrates an example of a correction-required score calculation unit.
FIG. 5 shows another example of how to calculate the uncertainty.
FIG. 6 shows another example of how to calculate the correction-required score.
FIG. 7 shows yet another example of how to calculate the correction-required score.
FIG. 8 is an example of a functional configuration of an image classification device according to a second embodiment of the present invention.
FIG. 9 is an example of a functional configuration of an image classification device according to a third embodiment of the present invention.
FIG. 10 is another example of a functional configuration of an image classification device according to a third embodiment of the present invention.
FIG. 11 is an example of a teacher information creation GUI.
FIG. 12 is another example of a teacher information creation GUI.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[Example 1]
Fig. 1 shows an example of the functional configuration of a teacher information creation support device according to the first embodiment of the present invention. To outline the functional configuration of Fig. 1: the teacher information creation support device 100 receives an input image D1, a segmentation model D2, and a focus category D4 as inputs, and includes a category score calculation unit 101, an inference image determination unit 102, an adjacent category calculation unit 103, and a correction-required score calculation unit 104.

The category score calculation unit 101 uses the segmentation model D2 to calculate, for the input image D1 consisting of multiple pixels, a category score D3 for each combination of pixel and category, indicating the degree of certainty that the pixel belongs to that category. The category score D3 is, for each pixel of the input image D1, a vector whose elements indicate the likelihood of each category to be identified.

The inference image determination unit 102 calculates an inference image D5 based on the category scores D3. The inference image D5 is an image in which a single category has been identified for each pixel through inference. Specifically, it can be calculated by taking, for each pixel, the category with the maximum value in the category score D3 as the identified category.
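As a minimal sketch of this per-pixel argmax step (the H x W x C array layout and the toy score values are assumptions for illustration, not part of the embodiment):

```python
import numpy as np

# Hypothetical category scores D3 for a 2x2 image with 3 categories,
# laid out as H x W x C (an assumed convention for this sketch).
scores = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
])

# Inference image D5: for each pixel, take the category whose
# category score is maximal as the identified category.
inference_image = np.argmax(scores, axis=-1)

print(inference_image)
# [[0 1]
#  [2 0]]
```

The same operation applies unchanged whether the scores are softmax probabilities or raw logits, since argmax is order-preserving.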

The adjacent category calculation unit 103 receives the input of the focus category D4 and calculates an adjacent category D6 based on the inference image D5 and the focus category D4. An adjacent category is a category that is likely to be adjacent to the focus category (i.e., a category with a high likelihood of adjacency); a specific example is described later.

The correction-required score calculation unit 104 determines a correction-required score D7 based on the focus category D4, the adjacent category D6, and the category scores D3. The correction-required score is calculated, for example, for portions that are difficult for the segmentation model D2 to identify with respect to the focus category; a specific example is described later.

The teacher information creation support device 100 can be configured using, for example, a computer. The computer has the hardware configuration of a known computer and includes, for example, calculation means and storage means. The calculation means includes, for example, a processor, and the storage means includes storage media such as a semiconductor memory device and a magnetic disk device. Some or all of the storage media may be non-transitory storage media.

The computer may also include input/output means. The input/output means include, for example, input devices such as a keyboard and a mouse, output devices such as a display and a printer, and communication devices such as a network interface.

The storage means may store a program. When the processor executes this program, the computer functions as the teacher information creation support device 100 according to this embodiment or as a device according to another embodiment.

The details of each component in Fig. 1 are described below.

The input image D1 is the image for which the correction-required score is to be calculated. For at least one pixel (a pixel or a pixel group) in this image, a correction-required score that is expected to promote improvement of the accuracy of the segmentation model D2 is calculated.

The segmentation model D2 is a trained model on which machine learning has been performed in advance and which has discrimination performance for images. The prior training may be performed using a dataset with characteristics similar to those of the input image D1, or using a dataset unrelated to the input image D1.

The segmentation model D2 calculates a feature value for each pixel of the input image D1 and, from the feature value, calculates a confidence for classifying the pixel into each category. The confidence may be a value normalized per pixel by a softmax function to indicate an inference probability, or it may be the logits before softmax normalization; either is acceptable.

The focus category D4 indicates, among the categories identified by the segmentation model, the category whose accuracy should be improved with priority. For a semiconductor measurement image, for example, it is a category indicating the structure to be measured.

Fig. 2 shows an example of the processing of the adjacent category calculation unit. First, in processing step S201, the adjacent category calculation unit 103 extracts an adjacent pixel group.

Here, the adjacent category calculation unit 103 first extracts the pixels of the inference image D5 that belong to the focus category. Then, for each pixel inferred to belong to the focus category, the pixels adjacent to that pixel are extracted as adjacent pixels (however, if an adjacent pixel also belongs to the focus category, that pixel is not extracted as an adjacent pixel).

Next, in processing step S202, the adjacent category calculation unit 103 determines the adjacent category. Here, the adjacent category calculation unit 103 obtains the frequency of each category in the inference image D5 over the adjacent pixel group, and determines a high-frequency category as the adjacent category for the focus category D4. If there are multiple high-frequency categories, multiple adjacent categories may be determined. In this way, categories that are likely to be adjacent to the focus category D4 can be calculated.

Thus, in this embodiment, a category that is likely to be adjacent to the focus category D4 is a category that frequently appears as the category of pixels adjacent to pixels belonging to the focus category D4 in the inference image D5. The definition of "frequent" can be determined appropriately by those skilled in the art; for example, it may mean exceeding a predetermined threshold.

Fig. 3 shows an example of the processing result of the adjacent category calculation unit. Fig. 3 shows an example in which an inference image in which each pixel has been classified as category 1, 2, or 3 is input, and category 1 is the focus category.

First, in S201, the group of pixels adjacent to the locations inferred to be category 1 is calculated as the adjacent pixel group 300. Next, in S202, the frequency of the categories to which the adjacent pixel group 300 belongs is obtained, and category 2, being the high-frequency category, is determined as the adjacent category.
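Steps S201 and S202 can be sketched as follows. The 4-neighborhood definition of adjacency and the toy inference image are assumptions for illustration; the embodiment does not fix a particular neighborhood:

```python
import numpy as np
from collections import Counter

def adjacent_category(inference_image, focus_category):
    """Return the category most frequently adjacent to the focus category."""
    h, w = inference_image.shape
    neighbor_categories = []
    for y in range(h):
        for x in range(w):
            if inference_image[y, x] != focus_category:
                continue
            # S201: collect the adjacent pixel group (4-neighborhood assumed),
            # excluding pixels that themselves belong to the focus category.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    c = int(inference_image[ny, nx])
                    if c != focus_category:
                        neighbor_categories.append(c)
    # S202: the highest-frequency category among the adjacent pixels
    return Counter(neighbor_categories).most_common(1)[0][0]

# Toy inference image: a category-1 region bordered mostly by category 2.
img = np.array([
    [2, 2, 2, 3],
    [2, 1, 1, 3],
    [2, 1, 1, 3],
    [2, 2, 2, 3],
])
print(adjacent_category(img, focus_category=1))  # 2
```

Taking `most_common` with a count greater than one would yield multiple adjacent categories, as the embodiment allows.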

For example, for a category indicating the structure to be measured in a semiconductor measurement image, in which similar structures are repeated within the image, the categories adjacent to a particular category are limited, so the above method can determine the high-frequency category as the adjacent category.

As another example, if the white lines in an image of a road are set as the focus category, the asphalt part of the road frequently appears as an adjacent category, and the above method registers the asphalt part as the adjacent category.

Fig. 4 shows an example of the correction-required score calculation unit 104. First, in processing step S401, the correction-required score calculation unit 104 extracts, for each pixel inferred to belong to the focus category D4 in the inference image D5 and each pixel inferred to belong to the adjacent category D6 in the inference image D5, the category score for the focus category D4 and the category score for the adjacent category D6.

Then, based on the extracted category scores, the correction-required score calculation unit 104 calculates, in processing step S402, uncertainty as a value indicating the degree to which the segmentation model is insufficiently trained (for example, the degree to which the pixel contains insufficiently trained features). For example, the smallness of the category score for the specified categories (the focus category D4 and the adjacent category D6) among the extracted category scores is calculated as the uncertainty.

For example, if the category score is an inference probability, the uncertainty value is 1 minus the inference probability. In general, in a trained segmentation model, the inference probability of a pixel with sufficiently trained features is large and that of a pixel with insufficiently trained features is small, so locations where 1 minus the inference probability is large (locations with large uncertainty) can be said to be insufficiently trained.

By inputting the focus category and the adjacent category as the specified categories, the uncertainty of locations inferred to be the focus category and the uncertainty of locations inferred to be the adjacent category are obtained. Then, based on these uncertainties, processing step S403 identifies locations with large uncertainty as parts requiring correction and calculates the correction-required score for the identified parts.

The parts requiring correction can be, for example, locations where the uncertainty is equal to or greater than a predetermined threshold. The specific method of calculating the correction-required score can be designed appropriately by those skilled in the art; for example, it may be the sum of the uncertainty for the focus category and the uncertainty for the adjacent category.
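The uncertainty and thresholding steps above can be sketched as follows. The toy probabilities and the threshold value 0.3 are illustrative assumptions:

```python
import numpy as np

# Hypothetical per-pixel inference probabilities (after softmax) for a
# strip of 5 pixels; columns are category 0 (focus) and category 1 (adjacent).
probs = np.array([
    [0.90, 0.10],
    [0.60, 0.40],
    [0.20, 0.80],
    [0.45, 0.55],
    [0.10, 0.90],
])
pred = probs.argmax(axis=1)  # identified category per pixel

# Uncertainty = 1 - inference probability of the identified category
# (the "smallness of the category score" described above).
uncertainty = 1.0 - probs[np.arange(len(probs)), pred]

# Parts requiring correction: uncertainty at or above a threshold.
needs_correction = uncertainty >= 0.3
print(needs_correction)  # [False  True False  True False]
```

Pixels whose winning score barely exceeds the runner-up are exactly the ones flagged, matching the intuition that low-confidence locations are insufficiently trained.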

Locations with large uncertainty values are locations that are difficult for the segmentation model to identify, and training on them with priority can be expected to improve the model's identification accuracy. This makes it possible to calculate the correction-required score from candidates that include, in addition to locations inferred to be the focus category, where false positives may occur, locations inferred to be the adjacent category, where false negatives may occur.

Fig. 5 shows another example of a method of calculating uncertainty. As shown in Fig. 5, as a method of calculating the uncertainty for the adjacent category, the magnitude of the category score for the focus category at locations inferred to belong to the adjacent category may be taken as the uncertainty.

In the example of Fig. 5, in processing step S501, the correction-required score calculation unit 104 extracts the category scores for each pixel inferred to belong to the adjacent category in the inference image. Then, in processing step S502, it calculates an uncertainty (second uncertainty) based on the category score for the focus category among the extracted category scores. The second uncertainty can be, for example, the magnitude of the category score for the focus category.

For locations inferred to belong to the focus category, the same calculation method as in Fig. 4 can be used. That is, the correction-required score calculation unit 104 calculates an uncertainty (first uncertainty) for each pixel inferred to belong to the focus category in the inference image, based on the category score for the focus category.

In this case, the correction-required score calculation unit 104 calculates the correction-required score based on the first uncertainty and the second uncertainty.

At false-negative locations, the input image may contain features of the focus category; that is, at such locations, even if the category with the highest category score is the adjacent category, the category score of the focus category may also be high. Therefore, the correction-required score can be calculated limited to locations within the adjacent category where a false negative for the focus category is more likely.
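The first and second uncertainties can be sketched together as follows. The two-category layout and the toy probabilities are assumptions for illustration:

```python
import numpy as np

FOCUS, ADJACENT = 0, 1

# Hypothetical softmax probabilities for 4 pixels, columns = [FOCUS, ADJACENT].
probs = np.array([
    [0.05, 0.95],   # confidently adjacent
    [0.45, 0.55],   # adjacent, but the focus score is also high (possible FN)
    [0.90, 0.10],   # confidently focus
    [0.55, 0.45],   # focus with a small margin
])
pred = probs.argmax(axis=1)

# First uncertainty: in the region predicted as the focus category,
# the smallness of the focus-category score (Fig. 4 method).
first_uncertainty = np.where(pred == FOCUS, 1.0 - probs[:, FOCUS], 0.0)

# Second uncertainty: in the region predicted as the adjacent category,
# the *magnitude* of the focus-category score (Fig. 5 method),
# highlighting likely false negatives for the focus category.
second_uncertainty = np.where(pred == ADJACENT, probs[:, FOCUS], 0.0)

print(first_uncertainty)   # [0.   0.   0.1  0.45]
print(second_uncertainty)  # [0.05 0.45 0.   0.  ]
```

Pixel 1 receives a large second uncertainty even though the adjacent category won, which is precisely the false-negative candidate the text describes.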

In Example 1, the method described above makes it possible to calculate a correction-required score that is expected to improve accuracy with respect to false negatives in addition to false positives.

Fig. 6 shows another example of a method of calculating the correction-required score. In the example of Fig. 6, in processing step S601, the correction-required score calculation unit 104 smooths, for each pixel, the correction-required score D7 within a predetermined range including that pixel, and calculates locations where the value of the smoothed correction-required score D8 is large as parts requiring correction.

When a user makes corrections, it is more efficient to correct a large area at once than to correct a single isolated pixel, so calculating locations with a high correction-required score over a predetermined range enables efficient correction. The specific method of determining the predetermined range and the specific operation for smoothing within it can be designed appropriately by those skilled in the art; for example, a two-dimensional convolution with a Gaussian function or a simple averaging operation may be used.
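A minimal sketch of the smoothing step, using the simple averaging option mentioned above (the window radius and toy score map are assumptions; a Gaussian convolution would serve equally well):

```python
import numpy as np

def smooth_scores(score_map, radius=1):
    """Smooth a 2-D correction-required score map with a box average."""
    h, w = score_map.shape
    out = np.zeros_like(score_map, dtype=float)
    for y in range(h):
        for x in range(w):
            # Mean over a (2*radius+1)^2 window, clipped at the borders.
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            out[y, x] = score_map[y0:y1, x0:x1].mean()
    return out

# An isolated high score is damped by smoothing, while a cluster of
# high scores stays high, steering the user toward larger regions.
scores = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.9],
    [0.0, 0.0, 0.9, 0.9],
])
smoothed = smooth_scores(scores)
print(smoothed[1, 1] < smoothed[2, 2])  # True: the cluster outranks the lone pixel
```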

Fig. 7 shows yet another example of a method of calculating the correction-required score. In the example of Fig. 7, in processing step S701, the correction-required score calculation unit 104 identifies locations with a large correction-required score as parts requiring correction D9. The definition of a location with a large correction-required score can be determined appropriately by those skilled in the art; for example, it may be a location whose correction-required score is equal to or greater than a threshold, or a region where the standard score of the correction-required score is large.

By determining the parts requiring correction in this way, the user can efficiently grasp the locations with a large correction-required score.

[Example 2]
In the second embodiment, a GUI for creating teacher information using the correction-required score calculated in the first embodiment is proposed. Descriptions of parts common to the first embodiment may be omitted.

Fig. 8 shows an example of the functional configuration of an image classification device according to the second embodiment of the present invention. The image classification device 800 also functions as a teacher information creation support device similar to that of the first embodiment. For example, the image classification device 800 may further have a configuration similar to that of the teacher information creation support device 100 of Fig. 1.

To outline the functional configuration of Fig. 8: the image classification device 800 receives an input image D801 and a segmentation model D802 as inputs, and includes an image category inference unit 801 and a teacher information creation GUI 802.

The image category inference unit 801 uses the segmentation model D802 to calculate, for the input image D801 consisting of multiple pixels, a category score D804 for each combination of pixel and category, indicating the degree of certainty that the pixel belongs to that category, and calculates an inference image D803 based on the category scores D804. The inference image D803 and the category scores D804 are calculated in the same way as the inference image D5 and the category scores D3 of Fig. 1, respectively.

The teacher information creation GUI 802 is a GUI intended to let the user create per-pixel teacher information D805 using the input image D801, the inference image D803, and the category scores D804. The teacher information creation GUI 802 displays the input image D801 and the inference image D803 simultaneously to make it easier to find errors in the inference results, and has a function by which the machine obtains parts requiring correction from the category scores D804, making it easy to select the parts to correct.

Fig. 11 shows an example of the teacher information creation GUI 802. The teacher information creation GUI has: an input image display section that displays the input image D801; an inference image display section that displays the inference image D803; a teacher information assignment section with a function for assigning teacher information; a focus category setting section that accepts input of the focus category; a correction-required part display section that outputs (for example, displays) the parts requiring correction calculated, based on the teacher information creation support device 100 shown in Fig. 1, from the input focus category, the input image, and the category scores; an uncertainty display section that displays the similarly calculated uncertainty; and a model training start section that accepts an operation to start training the segmentation model.

The uncertainty display section and the correction-required part display section are displayed as reference information for assigning teacher information and may be omitted.

The GUI may also have an automatic correction-part setting section that accepts an operation for automatically calculating the parts to correct.

The image classification device 800 can calculate the parts requiring correction based on the focus category and the category scores, for example in the same manner as the teacher information creation support device 100 of Example 1.

The teacher information creation GUI 802 accepts the assignment of teacher information via the teacher information assignment section. The teacher information assignment section allows the category of each pixel to be set using mouse clicks on a computer or a pen tablet. In the example of Fig. 11, the input image is displayed as the background and the assigned teacher information is displayed superimposed on its foreground, but the input image need not be displayed as the background. The GUI may also have a function for correcting multiple parts requiring correction to the same category with a single operation.

The uncertainty display section displays information to be referred to when assigning teacher information. The image classification device 800 calculates, in the same manner as the teacher information creation support device 100 of Example 1, the uncertainty indicating the degree to which the segmentation model is insufficiently trained, and the uncertainty display section displays this uncertainty. In the example of Fig. 11, white areas represent areas of high uncertainty.

Locations that are not selected as parts requiring correction but have relatively large uncertainty are often locations where the inference is wrong, and even when it is not wrong, the segmentation model is often insufficiently trained there. Therefore, the user can improve the accuracy of the segmentation model by referring to the uncertainty display section and assigning teacher information to locations with relatively large uncertainty. However, when the parts requiring correction are selected automatically, the uncertainty display section need not be displayed.

The uncertainty display section may also display separately the uncertainty within the focus category (first uncertainty) and the uncertainty within the adjacent category (second uncertainty) shown in Fig. 4. Furthermore, when there are multiple adjacent categories, the second uncertainty may be displayed divided into multiple regions.

The correction-required part display section displays locations with large uncertainty as parts requiring correction. In the example of Fig. 11, the parts requiring correction are represented as white areas. The user can refer to the parts requiring correction when assigning teacher information. However, when the parts requiring correction are selected automatically, the correction-required part display section need not be displayed.

Fig. 12 shows another example of the teacher information creation GUI 802. This example applies the GUI of Fig. 11 to detecting structural defects in the semiconductor manufacturing process. As in Fig. 11, three categories are displayed; in particular, the categories "structure 1" and "structure 2" represent proper semiconductor structures, and the category "defect" represents a structural defect.

With such a GUI, the user can efficiently create teacher information usable for defect detection in semiconductor structures.

[Example 3]
In the third embodiment, an image classification device is proposed for the case of a segmentation model partway through training in interactive segmentation, in which training of the segmentation model and annotation by the user are performed alternately. Descriptions of parts common to the first and second embodiments may be omitted.

Fig. 9 shows an example of the functional configuration of an image classification device according to the third embodiment of the present invention. The image classification device 900 first trains the segmentation model D802 using the teacher information D805 calculated by the method shown in the second embodiment and the input image.

The model learning unit 901 trains the segmentation model D802 using the input image D801 and the teacher information D805.

The segmentation model D802 used at the input stage is one trained in advance. The prior training may use a small amount of teacher information for the input image D801, or may be performed using a public dataset or the like.

In the image classification device 900, the model learning unit 901 trains the segmentation model D802, after which the teacher information D805 is created again through the processing of the image category inference unit 801 and the other units, and the segmentation model D802 is trained using that teacher information D805. By repeating this process until there is no more misclassification in the inference image D803, a segmentation model D802 with good classification accuracy can be obtained.

Fig. 10 shows another example of the functional configuration of an image classification device according to the third embodiment of the present invention. The image classification device 1000 calculates the uncertainty in the same manner as in the first and second embodiments and performs emphasized training using the per-pixel uncertainty.

Locations with large uncertainty are locations where the segmentation model is insufficiently trained. Based on the uncertainty information D1001, the model learning unit 1001 trains the segmentation model so that the loss of pixels with large uncertainty becomes larger than the loss of pixels with small uncertainty.

Such training can be expected to improve accuracy. For example, by increasing the per-pixel loss in proportion to the uncertainty and training the model according to that loss, the features of locations with large uncertainty can be learned more strongly.
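This uncertainty-weighted loss can be sketched as follows. The exact weighting scheme, here `(1 + uncertainty)`, and the toy values are illustrative assumptions consistent with "loss proportional to uncertainty":

```python
import numpy as np

def weighted_pixel_loss(probs, targets, uncertainty):
    """Per-pixel cross-entropy scaled up by the pixel's uncertainty.

    High-uncertainty (insufficiently trained) pixels contribute more to
    the training signal, emphasizing their features during learning.
    """
    ce = -np.log(probs[np.arange(len(targets)), targets])
    return ce * (1.0 + uncertainty)

# Two pixels with the same target category 0; the second is less certain.
probs = np.array([
    [0.9, 0.1],
    [0.6, 0.4],
])
targets = np.array([0, 0])
uncertainty = np.array([0.1, 0.4])  # e.g., 1 - inference probability

losses = weighted_pixel_loss(probs, targets, uncertainty)
print(losses[1] > losses[0])  # True: the uncertain pixel is emphasized
```

In an actual training loop this weighting would be applied before summing the pixel losses into the batch loss.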

100: Teacher information creation support device
101: Category score calculation unit
102: Inference image determination unit
103: Adjacent category calculation unit
104: Correction-required score calculation unit
300: Adjacent pixel group
800: Image classification device
801: Image category inference unit
802: Teacher information creation GUI
900: Image classification device
901: Model learning unit
1000: Image classification device
1001: Model learning unit
D1: Input image
D2: Segmentation model
D3: Category score
D4: Focus category
D5: Inference image
D6: Adjacent category
D7: Correction-required score
D8: Correction-required score (smoothed)
D9: Parts requiring correction
D801: Input image
D802: Segmentation model
D803: Inference image
D804: Category score
D805: Teacher information
D1001: Uncertainty information
S201: Processing step
S202: Processing step
S401: Processing step
S402: Processing step
S403: Processing step
S501: Processing step
S502: Processing step
S601: Processing step
S701: Processing step

Claims (13)

1. A teacher information creation support device comprising:
 a category score calculation unit that, for an image made up of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category;
 an inference image determination unit that calculates an inference image based on the category scores;
 an adjacent category calculation unit that receives an input of a category of interest and, based on the inference image and the category of interest, calculates an adjacent category that is likely to be adjacent to the category of interest; and
 a correction-required score calculation unit that determines a correction-required score for at least one pixel based on the category of interest, the adjacent category, and the category scores.

2. The teacher information creation support device according to claim 1, wherein the correction-required score calculation unit:
 extracts, for each pixel inferred in the inference image to belong to the category of interest and for each pixel inferred in the inference image to belong to the adjacent category, the category score for the category of interest and the category score for the adjacent category;
 calculates, based on each extracted category score, an uncertainty as a value indicating the degree to which the segmentation model is insufficiently trained; and
 calculates the correction-required score based on the uncertainty.

3. The teacher information creation support device according to claim 1, wherein the correction-required score calculation unit:
 calculates a first uncertainty for each pixel inferred in the inference image to belong to the category of interest, based on the category score for the category of interest;
 calculates a second uncertainty for each pixel inferred in the inference image to belong to the adjacent category, based on the category score for the category of interest; and
 calculates the correction-required score based on the first uncertainty and the second uncertainty.

4. The teacher information creation support device according to claim 1, wherein the correction-required score calculation unit identifies locations where the correction-required score is large as correction-required locations.

5. The teacher information creation support device according to claim 1, wherein the correction-required score calculation unit smooths, for each pixel, the correction-required score within a predetermined range including that pixel, and identifies locations where the smoothed value is large as correction-required locations.

6. An image classification device comprising:
 an image category inference unit that, for an image made up of a plurality of pixels, uses a segmentation model to calculate, for each combination of a pixel and a category, a category score indicating the degree of certainty that the pixel belongs to that category, and calculates an inference image based on the category scores; and
 a teacher information creation GUI having a category-of-interest setting unit,
 wherein the category-of-interest setting unit receives an input of a category of interest,
 the image classification device calculates correction-required locations based on the category of interest and the category scores, and
 the teacher information creation GUI outputs the correction-required locations and accepts assignment of teacher information via a teacher information assignment unit.

7. The image classification device according to claim 6, comprising a model learning unit that trains the segmentation model using the teacher information.

8. The image classification device according to claim 7, wherein the image classification device calculates an uncertainty indicating the degree to which the segmentation model is insufficiently trained, and the model learning unit trains the segmentation model so that the loss of pixels with large uncertainty is smaller than the loss of pixels with small uncertainty.

9. The image classification device according to claim 6, wherein the image classification device calculates an uncertainty indicating the degree to which the segmentation model is insufficiently trained, and displays the uncertainty.

10. The image classification device according to claim 6, wherein the image classification device:
 calculates, based on the inference image and the category of interest, an adjacent category that is likely to be adjacent to the category of interest;
 calculates a first uncertainty for each pixel inferred in the inference image to belong to the category of interest, based on the category score for the category of interest; and
 calculates a second uncertainty for each pixel inferred in the inference image to belong to the adjacent category, based on the category score for the category of interest,
 and wherein the teacher information creation GUI displays the first uncertainty and the second uncertainty.

11. The image classification device according to claim 8, wherein the teacher information creation GUI displays locations where the uncertainty is large as the correction-required locations.

12. A program that causes a computer to function as the teacher information creation support device according to claim 1.

13. A program that causes a computer to function as the image classification device according to claim 6.
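For readers outside patent practice, the processing flow recited in claims 1 to 4 can be sketched roughly as follows. The 4-neighborhood adjacency count and the min-of-two-scores heuristic for the correction-required score are illustrative assumptions, not the claimed method, which leaves these computations open.

```python
import numpy as np

def inference_image(scores):
    """Assign each pixel the category with the highest score
    (scores: H x W x C)."""
    return np.argmax(scores, axis=-1)

def adjacent_category(pred, focus):
    """Category (other than focus) that most often borders
    focus-category pixels in the 4-neighborhood."""
    counts = {}
    h, w = pred.shape
    for y in range(h):
        for x in range(w):
            if pred[y, x] != focus:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and pred[ny, nx] != focus:
                    c = int(pred[ny, nx])
                    counts[c] = counts.get(c, 0) + 1
    return max(counts, key=counts.get) if counts else None

def correction_score(scores, pred, focus, adj):
    """High where the model hesitates between the category of interest
    and the adjacent category: the smaller of the two scores, masked to
    pixels predicted as either category."""
    mask = (pred == focus) | (pred == adj)
    return np.minimum(scores[..., focus], scores[..., adj]) * mask
```

Given category scores from a segmentation model, `inference_image` plays the role of the inference image determination unit, `adjacent_category` that of the adjacent category calculation unit, and `correction_score` that of the correction-required score calculation unit.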
PCT/JP2023/032758 2023-09-07 2023-09-07 Training information creation assistance device, image classification device, and program Pending WO2025052642A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2023/032758 WO2025052642A1 (en) 2023-09-07 2023-09-07 Training information creation assistance device, image classification device, and program
TW113128043A TW202512113A (en) 2023-09-07 2024-07-29 Instructional information generation support device, image classification device and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2023/032758 WO2025052642A1 (en) 2023-09-07 2023-09-07 Training information creation assistance device, image classification device, and program

Publications (1)

Publication Number Publication Date
WO2025052642A1 true WO2025052642A1 (en) 2025-03-13

Family

ID=94923801

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/032758 Pending WO2025052642A1 (en) 2023-09-07 2023-09-07 Training information creation assistance device, image classification device, and program

Country Status (2)

Country Link
TW (1) TW202512113A (en)
WO (1) WO2025052642A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150254529A1 (en) * 2014-03-10 2015-09-10 Canon Kabushiki Kaisha Image processing apparatus and image processing method
CN111428762A (en) * 2020-03-12 2020-07-17 武汉大学 Interpretable remote sensing image feature classification method combining deep data learning and ontology knowledge reasoning
JP2021022236A (en) * 2019-07-29 2021-02-18 セコム株式会社 Classification reliability calculation device, area division device, learning device, classification reliability calculation method, learning method, classification reliability calculation program, and learning program


Also Published As

Publication number Publication date
TW202512113A (en) 2025-03-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23951540

Country of ref document: EP

Kind code of ref document: A1