TWI794132B

TWI794132B - System for detecting misidentified objects

Info

Publication number: TWI794132B
Application number: TW111135290A
Authority: TW
Inventors: 郭王鼎志
Original assignee: 威盛電子股份有限公司
Priority date: 2022-09-19
Filing date: 2022-09-19
Publication date: 2023-02-21
Also published as: TW202414344A

Abstract

A system for detecting misidentified objects comprises a client device and a server device. The client device executes an elementary image comparison program to determine whether an image content contains a candidate image area that includes a preset image feature, and outputs the candidate image area once it is found. The server device is electrically coupled to the client device and receives the candidate image area output by it. The server device executes an advanced image comparison program to determine whether the received candidate image area includes the preset image feature and, when it does not, outputs the received candidate image area to indicate that a misidentified object has been found in it.

Description

System for Detecting Misidentified Objects

The present invention relates to a system capable of detecting misidentified objects, and in particular to a system that automatically detects misidentified objects based on a state-of-the-art model.

In artificial-intelligence object recognition, reducing the probability of recognition errors is an important problem. The earliest approach was to manually re-inspect the images used during recognition, compare the recognition results with the results of the manual inspection to find the differences, and then modify the model or the database used to train it according to those differences, thereby increasing recognition accuracy. However, because every scene must be viewed once to find the misidentifications, problems can no longer be found quickly once the number of images to inspect grows. This approach is therefore very time-consuming and labor-intensive.

To avoid the above problem, recent techniques adopt object recognition methods based on a state-of-the-art (SOTA) model to greatly increase recognition accuracy. Experiments show that such methods do increase accuracy, but the large number of weights used during recognition makes the computation time too long. For example, performing object recognition on a video may take tens of times the duration of the video before results are available. Using this technique for object recognition therefore still consumes a great deal of time.

To solve the above problems, an object of the present invention is to provide a system for detecting misidentified objects that can automatically find a considerable number of misidentified objects in a relatively short time, reducing the time spent searching for them.

From one perspective, the present invention provides a system for detecting misidentified objects, characterized in that it includes a client device and a server device. The client device is adapted to obtain image content, execute an elementary image comparison program to search the image content for a candidate region tile containing a preset image feature, and output the candidate region tile once a qualifying tile is found. The server device is electrically coupled to the client device to receive the candidate region tile output by the client device, executes an advanced image comparison program to determine whether the received candidate region tile contains the preset image feature, and, when the candidate region tile does not contain the preset image feature, outputs the candidate region tile to indicate that it caused an object misidentification. The accuracy of the advanced image comparison program in determining whether a candidate region tile contains the preset image feature is higher than the accuracy of the elementary image comparison program in determining whether the same candidate region tile contains the same preset image feature.

In one embodiment, the client device includes an image input device and obtains the image content through the image input device.

In one embodiment, the image input device is photographic equipment that generates the image content through a shooting operation.

In one embodiment, before executing the advanced image comparison program to determine whether the candidate region tile contains the preset image feature, the server device further executes a feature-enhancement comparison program that enlarges the candidate region tile and analyzes the enlarged tile to determine whether it contains the same preset image feature. Only candidate region tiles that the feature-enhancement comparison program judges not to contain the preset image feature are provided to the advanced image comparison program for subsequent operations.

In one embodiment, before outputting the candidate region tile, the client device further executes a feature-enhancement comparison program that enlarges the candidate region tile and analyzes the enlarged tile to determine whether it likewise contains the preset image feature. Only candidate region tiles that the feature-enhancement comparison program judges not to contain the preset image feature are output by the client device.

In one embodiment, when the server device determines that the received candidate region tile does not contain the preset image feature, the candidate region tile is used as a negative sample, that is, a sample that does not contain the preset image feature, to train the elementary image comparison program.

By adopting the above techniques, the system for detecting misidentified objects provided by the present invention can first perform preliminary, relatively simple object detection on the images and then perform a relatively complex and accurate advanced recognition operation on the images of the objects that were selected. This not only lightens the load on the advanced recognition operation, and therefore shortens the time it takes, but also automatically finds objects that are easily misidentified during recognition, based on the differing results of the successive recognition operations, reducing the manpower needed to find misidentified objects manually.

So that those skilled in the art can clearly understand the following, it should first be explained that the phrase "electrically coupled" as used in this application means that signals can be transmitted between the two electrically coupled objects. Unless otherwise specified, electronic signals may be transmitted by wired or wireless means, and the direction of transmission may be one-way or two-way.

Please refer to FIG. 1, which is a circuit block diagram of a system for detecting misidentified objects according to an embodiment of the present invention. As shown, the system 10 for detecting misidentified objects in this embodiment includes a client device 100 and a server device 150. The client device 100 may be electrically coupled to the server device 150 through any of various forms of electronic signal transmission, so that electronic signals can be transmitted between the client device 100 and the server device 150.

Further, the client device 100 in this embodiment includes a processor 102, a memory module 104, and an image input device 106. The processor 102 is electrically coupled to the memory module 104 and to the image input device 106, and the memory module 104 is also electrically coupled to the image input device 106. In some embodiments, the image input device 106 is photographic equipment that generates corresponding image content by shooting a specific area, such as a surveillance camera, a still camera, or a video camera. In these embodiments, the shooting range of the image input device 106 may be fixed, may be manually adjustable by the user, or may be adjustable by the image input device 106 under the control of the processor 102. Depending on the design, the image content generated by the image input device 106 when shooting a specific area may first be processed by the processor 102 and then stored in the memory module 104, or it may first be stored directly from the image input device 106 into the memory module 104 and later be retrieved from the memory module 104 by the processor 102 at an appropriate time for the required processing.

In other embodiments, the image input device 106 is a data transmission device that allows external data to be brought into the client device 100, such as a card reader, a Universal Serial Bus (USB) connector, or an RJ-45 connector. Through an external data source connected to the image input device 106, the client device 100 can obtain the image content to be used in the subsequent image analysis.

After obtaining the required image content through the image input device 106, the client device 100 executes an elementary image comparison program to search the image content for tiles that contain a preset image feature set in advance (hereinafter, candidate region tiles). In this embodiment, the elementary image comparison program may be stored in the memory module 104 in advance and, after the client device 100 is started, read from the memory module 104 and executed by the processor 102. The elementary image comparison program may be a computer program built on a deep neural network that extracts the image features of a target image and compares the extracted features with the preset image feature. Alternatively, as those skilled in the art will appreciate, it may be any other computer program, however constructed, that has the same ability to extract and compare image features.

The server device 150, for its part, mainly includes a processor 152 and a memory module 154 in this embodiment, and has an electronic signal transmission module through which, using a suitable communication protocol, it receives data from the client device 100 or transmits data to the client device 100. Depending on the design, data received from the client device 100 through the electronic signal transmission module may be sent directly to the processor 152 for the required processing, or may first be stored in the memory module 154 and later be retrieved from the memory module 154 and processed by the processor 152 at an appropriate time. The results obtained after processing by the processor 152 may be output through an output interface 156 so that the user can learn the relevant information. In one embodiment, the server device 150 may have greater computing power than the client device 100 so as to perform larger and more complex computations.

To make the technical content of the present invention easier for those skilled in the art to understand, a more detailed description is given below with reference to FIG. 1 and FIG. 2. FIG. 2 is an operational flow chart of the system for detecting misidentified objects according to an embodiment of the present invention. As shown in FIG. 2, the client device 100 first obtains image content (step S200). The client device 100 then processes the obtained image content by executing the elementary image comparison program, which includes dividing the image content into one or more image region tiles, extracting the image features of each image region tile in turn, and comparing the extracted image features with the preset image feature set in advance (step S202). If all image region tiles in the image content have been processed in step S202 and none of them has image features matching the preset image feature (that is, the determination in step S204 is no), the flow returns to step S200 to obtain the next image content to be processed. Conversely, if during the processing of step S202 some image region tile is found to have image features matching the preset image feature (that is, the determination in step S204 is yes), the client device 100 decides how to proceed according to whether the current image content has been completely processed.

When the determination in step S204 is yes, the flow proceeds to step S206 to check whether the currently obtained image content has been completely processed. If it has, the image region tile found in step S204 to match the preset image feature (that is, the candidate region tile) is transmitted from the client device 100 to the server device 150 (step S208), and the client device 100 returns to step S200 to obtain the next image content and process it accordingly. If the currently obtained image content has not yet been completely processed, the candidate region tile found in step S204 is likewise transmitted from the client device 100 to the server device 150 (step S208), and the client device 100 returns to step S202 to continue checking whether other candidate region tiles exist in the image content.
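The client-side flow of steps S200 to S208 can be illustrated with a minimal sketch. It is not taken from the specification: the tile representation (a 2-D list of grayscale values), the helper names `split_into_tiles`, `client_scan` and `has_bright_pixel`, and the "bright pixel" test standing in for the elementary image comparison program are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List

Tile = List[List[int]]  # assumed representation: a 2-D grid of grayscale pixel values


def split_into_tiles(image: Tile, size: int) -> Iterator[Tile]:
    """Cut one image into size x size tiles (first part of step S202)."""
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            yield [row[left:left + size] for row in image[top:top + size]]


def client_scan(
    images: Iterable[Tile],
    elementary_match: Callable[[Tile], bool],
    send: Callable[[Tile], None],
    tile_size: int = 2,
) -> int:
    """Run steps S200 to S208 over a stream of images; return the number of tiles sent."""
    sent = 0
    for image in images:                                  # S200: obtain image content
        for tile in split_into_tiles(image, tile_size):   # S202: extract and compare
            if elementary_match(tile):                    # S204: preset feature found?
                send(tile)                                # S208: output the candidate tile
                sent += 1
        # S206: this image is fully processed, so the loop moves to the next image
    return sent


# Hypothetical stand-in for the elementary program: "contains a bright pixel".
def has_bright_pixel(tile: Tile) -> bool:
    return any(pixel > 200 for row in tile for pixel in row)


outbox: List[Tile] = []  # stands in for the link to the server device
frame = [[0, 0, 255, 0],
         [0, 0, 0, 0]]
sent_count = client_scan([frame], has_bright_pixel, outbox.append)
```

The nested loops mirror the two return paths of FIG. 2: the inner loop corresponds to returning to step S202 while tiles remain, and the outer loop to returning to step S200 for the next image content.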

After step S208 is executed, the server device 150 receives the candidate region tile sent by the client device 100, and the processor 152 executes the advanced image comparison program at an appropriate time to perform the necessary processing on the received candidate region tile (step S210). Like the elementary image comparison program, the advanced image comparison program may be a computer program built on a deep neural network that extracts the image features of a target image and compares the extracted features with the preset image feature. Alternatively, as those skilled in the art will appreciate, it may be any other computer program, however constructed, that has the same ability to extract and compare image features. Unlike the elementary image comparison program, the advanced image comparison program should achieve higher accuracy when comparing image features, for example by having more weights, by using a neural network with more layers, or by being built on a state-of-the-art (SOTA) model. In one embodiment, because the advanced image comparison program consumes more computing resources than the elementary image comparison program, it might run too slowly on the client device 100, which has less computing power, but runs smoothly on the server device 150, which has more.

After the candidate region tile has been processed by the advanced image comparison program in step S210, the server device 150 uses the result to determine whether the candidate region tile contains the preset image feature (step S212). Because the advanced image comparison program is more accurate in comparing image features, some candidate region tiles that the elementary image comparison program judged to contain the preset image feature may now be judged not to contain it. Whenever the results of the elementary and advanced image comparison programs disagree in this way, the candidate region tile was very likely selected because the elementary image comparison program misidentified an object. Therefore, if step S212 finds that the advanced image comparison program judges the candidate region tile not to contain the preset image feature (that is, the determination in step S212 is no), the server device 150 outputs corresponding information through the output interface 156 to indicate that this candidate region tile very likely caused the elementary image comparison program to misidentify an object (step S214). Conversely, if step S212 finds that the advanced image comparison program also judges the candidate region tile to contain the preset image feature (that is, the determination in step S212 is yes), the server device 150 can conclude that the elementary image comparison program did not misidentify this candidate region tile, and ends the processing of this tile. In one embodiment, when the advanced image comparison program of the server device 150 determines that the received candidate region tile does not contain the preset image feature, the candidate region tile may further be used as a negative sample, one that does not contain the preset image feature, to train the deep neural network of the elementary image comparison program and improve its accuracy.
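The server-side decision of steps S210 to S214, together with the collection of negative samples, might be sketched as follows. The class name `ServerVerifier`, the label convention (0 meaning "feature absent"), and the "two bright pixels" test standing in for the advanced image comparison program are hypothetical; the specification does not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Tile = List[List[int]]  # assumed representation: a 2-D grid of grayscale pixel values


@dataclass
class ServerVerifier:
    """Re-checks candidate tiles with a more accurate (assumed) comparison function."""
    advanced_match: Callable[[Tile], bool]
    flagged: List[Tile] = field(default_factory=list)
    negative_samples: List[Tuple[Tile, int]] = field(default_factory=list)

    def handle(self, tile: Tile) -> bool:
        """Return True when the tile is reported as a likely misidentification."""
        if self.advanced_match(tile):             # S210/S212: advanced program agrees
            return False                          # no disagreement, processing ends
        self.flagged.append(tile)                 # S214: report the tile
        self.negative_samples.append((tile, 0))   # label 0 = preset feature absent
        return True


# Hypothetical stand-in for the advanced program: a stricter "two bright pixels" test.
def has_two_bright_pixels(tile: Tile) -> bool:
    return sum(pixel > 200 for row in tile for pixel in row) >= 2


verifier = ServerVerifier(has_two_bright_pixels)
received = [[[255, 0], [0, 0]], [[255, 255], [0, 0]]]
results = [verifier.handle(tile) for tile in received]
```

Here `flagged` plays the role of the information sent to the output interface 156, while `negative_samples` holds the tiles that could later be fed back when retraining the elementary program.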

With the above techniques, the system provided by the present invention still uses an advanced image comparison program that is highly accurate but slow (for example, but not limited to, an image recognition program based on a state-of-the-art model). However, the only image content the advanced image comparison program has to judge is the candidate region tiles already filtered by the elementary image comparison program, not the entire image content. The amount of data the advanced image comparison program must process is therefore much smaller than in the prior art, and the time required is correspondingly shorter. Using the system provided by the present invention thus reduces the time cost of finding the images that cause object misidentification. In addition, because the system points out the tiles for which the two image comparisons produced different results, the user can readily tell that these tiles have a very high probability of causing object misidentification. The finally selected tiles can be used directly to further train the corresponding neural network, and even if they must be screened again manually for precision, the labor cost is reduced because fewer items need to be screened.

In addition to the above techniques, the system 10 provided by the present invention may execute other image comparison programs between the elementary image comparison program and the advanced image comparison program to further reduce the amount of data the advanced image comparison program has to evaluate.

Please refer to FIG. 1, FIG. 2, and FIG. 3A together. FIG. 3A is a detailed flow chart of step S208 according to an embodiment of the present invention. As shown in FIG. 3A, after steps S204 and S206 are executed, the client device 100 further executes a feature-enhancement comparison program to process the candidate region tile found in step S204 (step S208). The feature-enhancement comparison program can first apply an enhancement to the candidate region tile that is targeted to the attributes of the image feature being compared. For example, when the object to be identified is a pedestrian, the feature-enhancement comparison program may enlarge the candidate region tile so that facial or body features are easier to recognize. When the object to be identified has a specific shape, it may instead increase the contrast between light and dark in the candidate region tile so that object edges are more distinct. In short, the feature-enhancement comparison program can be modified as required and is not limited to the types of image feature mentioned in the embodiments.

After the candidate region tile has been enhanced, the image features of the enhanced tile are compared with the preset image feature again. The client device 100 then decides how to handle the candidate region tile according to whether the enhanced tile contains the preset image feature (step S302). When the determination in step S302 is yes, that is, the feature-enhancement comparison program judges the candidate region tile to contain the preset image feature, the feature-enhancement comparison program removes the candidate region tile, so that it is not transmitted to the server device 150 (step S304). Conversely, when the determination in step S302 is no, that is, the feature-enhancement comparison program judges the candidate region tile not to contain the preset image feature, the feature-enhancement comparison program transmits the candidate region tile to the server device 150 (step S306).
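The enlarge-then-recheck filter of steps S302 to S306 can be sketched as below, taking enlargement as the enhancement. Nearest-neighbour upscaling and the pixel-sum test used as the comparison are assumptions chosen only to keep the example self-contained; the specification leaves the enhancement method open, and the names `upscale_nearest` and `filter_candidates` are hypothetical.

```python
from typing import Callable, Iterable, List

Tile = List[List[int]]  # assumed representation: a 2-D grid of grayscale pixel values


def upscale_nearest(tile: Tile, factor: int) -> Tile:
    """Enlarge a tile by repeating each pixel factor times in both directions."""
    enlarged: Tile = []
    for row in tile:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


def filter_candidates(
    tiles: Iterable[Tile],
    enhanced_match: Callable[[Tile], bool],
    factor: int = 2,
) -> List[Tile]:
    """Keep only tiles whose enlarged version does NOT show the preset feature."""
    forwarded: List[Tile] = []
    for tile in tiles:
        enlarged = upscale_nearest(tile, factor)
        if enhanced_match(enlarged):   # S302 yes: feature confirmed
            continue                   # S304: drop the tile, nothing is transmitted
        forwarded.append(tile)         # S306: forward the original tile
    return forwarded


# Hypothetical comparison on the enlarged tile: total brightness above a threshold.
def bright_enough(tile: Tile) -> bool:
    return sum(sum(row) for row in tile) > 10


kept = filter_candidates([[[1, 2]], [[0, 1]]], bright_enough)
```

Note the inverted sense of the filter: a tile that the enhanced comparison confirms is no longer suspicious and is discarded, while a tile it fails to confirm remains a candidate misidentification and moves on.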

The technique shown in FIG. 3A further reduces the number of candidate region tiles transmitted to the server device 150. Although running the feature-enhancement comparison program may add time, it reduces the amount of data the advanced image comparison program has to process and the time that processing takes.

It should be noted that although the technique of FIG. 3A reduces the amount of data transmitted from the client device 100 to the server device 150, the computing power of the client device 100 is generally lower than that of the server device 150. To lighten the computing load on the client device 100, the feature-enhancement comparison program may instead be run on the server device 150, as shown in FIG. 3B.

Please refer to FIG. 3B, which is a detailed flow chart of step S208 of FIG. 2 according to another embodiment of the present invention. As shown in FIG. 3B, after steps S204 and S206 are executed, the selected candidate region tile is first transmitted from the client device 100 to the server device 150 (step S330). After receiving the candidate region tile, the server device 150 executes the feature-enhancement comparison program at an appropriate time to process it (step S332), which includes enhancing the image features of the candidate region tile being processed and comparing the image features of the enhanced tile with the preset image feature set in advance.

Next, the server device 150 decides how to handle the candidate region tile according to whether the enhanced tile contains the preset image feature (step S334). When the determination in step S334 is yes, that is, the feature-enhancement comparison program judges the candidate region tile to contain the preset image feature, the feature-enhancement comparison program removes the candidate region tile, so that it does not become data the advanced image comparison program has to process (step S336). Conversely, when the determination in step S334 is no, that is, the feature-enhancement comparison program judges the candidate region tile not to contain the preset image feature, the feature-enhancement comparison program provides the candidate region tile to the advanced image comparison program so that the advanced image comparison program can perform the subsequent operations on it (step S210).

In addition to the above, as those skilled in the art will appreciate, further image comparison programs that achieve an effect similar to that of the feature enhancement comparison program described in the above embodiments may be inserted between the primary image comparison program and the advanced image comparison program. The number of such additional image comparison programs can be adjusted according to actual needs and is not limited by the details described in the above embodiments.
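One way to picture an adjustable number of intermediate programs is as a list of filter stages applied in sequence. The following sketch is only an illustration under that assumption; the function name `run_intermediate_stages` and the convention that a stage returns `True` when it judges that a tile contains the preset image feature are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

T = TypeVar("T")


def run_intermediate_stages(
    tiles: Iterable[T],
    stages: Sequence[Callable[[T], bool]],
) -> List[T]:
    """Apply each intermediate comparison stage in order.

    A stage returns True when it judges that the tile contains the preset
    image feature; such tiles are dropped, and only the remaining tiles
    reach the next stage (and finally the advanced comparison program).
    """
    remaining = list(tiles)
    for stage in stages:
        remaining = [t for t in remaining if not stage(t)]
    return remaining
```

Passing an empty `stages` list corresponds to the configuration without any intermediate program, and adding entries to the list corresponds to inserting more of them.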

In addition, experiments have shown that, by adopting the techniques of the above embodiments, the amount of data that ultimately needs to be processed by the advanced image comparison program can be reduced to approximately 15% to 25% of the data amount of the originally input image content, while the number of retained tiles containing misjudged objects is approximately 75% to 85% of the number of tiles containing misjudged objects that would be found manually. It can be seen that the technique provided by the present invention can indeed reduce the amount of data to be processed without manual work, while still retaining enough error samples for subsequent use.
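To make these percentages concrete, the short sketch below applies them to made-up quantities. The input size of 1000 units and the count of 200 manually found tiles are hypothetical figures chosen only for illustration; they do not come from the experiments.

```python
from typing import Tuple


def percent_range(total: int, low_pct: int, high_pct: int) -> Tuple[float, float]:
    """Return the (low, high) share of `total` for a percentage range."""
    return (total * low_pct / 100, total * high_pct / 100)


# Hypothetical workload: 1000 units of input image data.
data_for_advanced_program = percent_range(1000, 15, 25)

# Hypothetical ground truth: 200 misjudged tiles found by manual review.
retained_misjudged_tiles = percent_range(200, 75, 85)

print(data_for_advanced_program)  # (150.0, 250.0)
print(retained_misjudged_tiles)   # (150.0, 170.0)
```

Under these assumed figures, the advanced image comparison program would handle 150 to 250 units instead of 1000, and 150 to 170 of the 200 misjudged tiles would still be retained.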

Overall, although the above technique still requires an advanced image comparison program that is highly accurate but slow at recognition, the amount of image content that the advanced image comparison program must judge is greatly reduced, and the advanced image comparison program, which consumes more computing resources, is executed on the server device with its greater computing power. The time required is therefore much less than with the prior art, and the system provided by the present invention reduces the time cost of finding the images that cause object misjudgment. In addition, because the system points out the tiles that produce different results in the two image comparisons, the user can readily learn, without any manual processing, that these tiles are very likely to cause object misjudgment, and can directly use the finally selected tiles to reinforce the training of the corresponding neural network. Furthermore, even if the selected tiles need to be screened again manually for greater precision, the manual effort is reduced because far fewer objects remain to be screened.
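The selection rule summarized above, keeping the tiles on which the two comparisons disagree, can be sketched as follows. The function name and the two detector callbacks are hypothetical stand-ins for the primary and advanced image comparison programs, and the sketch ignores where each program runs (client device or server device).

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def find_misjudged_tiles(
    tiles: Iterable[T],
    primary_detects: Callable[[T], bool],
    advanced_detects: Callable[[T], bool],
) -> List[T]:
    """Return tiles the primary program accepts but the advanced one rejects."""
    # First comparison: the fast, less accurate program selects candidates.
    candidates = [t for t in tiles if primary_detects(t)]
    # Second comparison: the slower, more accurate program re-checks them.
    # A candidate it rejects is reported as a likely misjudgment.
    return [t for t in candidates if not advanced_detects(t)]
```

The returned tiles are the ones that could then be used as additional training samples, or handed to a reviewer for a final manual screening.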

10: system for detecting misjudged objects
100: client device
102, 152: processor
104, 154: memory module
106: image input device
150: server device
156: output interface
S200 to S214: steps performed according to an embodiment of the present invention
S300 to S306: steps performed when executing step S208 according to an embodiment of the present invention
S330 to S336: steps performed when executing step S208 according to another embodiment of the present invention

FIG. 1 is a circuit block diagram of a system for detecting misjudged objects according to an embodiment of the present invention.
FIG. 2 is a flowchart of the operation of the system for detecting misjudged objects according to an embodiment of the present invention.
FIG. 3A is a detailed flowchart of step S208 of FIG. 2 according to an embodiment of the present invention.
FIG. 3B is a detailed flowchart of step S208 of FIG. 2 according to another embodiment of the present invention.

Claims

A system for detecting misjudged objects, characterized by comprising: a client device adapted to obtain an image content, wherein the client device executes a primary image comparison program to determine whether a candidate region tile exists in the image content and, after finding the candidate region tile, outputs the candidate region tile, wherein the candidate region tile includes a preset image feature set in advance; and a server device electrically coupled to the client device to receive the candidate region tile output by the client device, wherein the server device executes an advanced image comparison program to determine whether the received candidate region tile contains the preset image feature and, when the candidate region tile does not contain the preset image feature, outputs the candidate region tile to indicate that the candidate region tile causes object misjudgment; wherein the accuracy of the advanced image comparison program in determining whether the candidate region tile contains the preset image feature is greater than the accuracy of the primary image comparison program in determining whether the candidate region tile contains the preset image feature.

The system as claimed in claim 1, wherein the client device includes an image input device and obtains the image content through the image input device.

The system as claimed in claim 2, wherein the image input device is photographic equipment, and the photographic equipment generates the image content through a shooting operation.

The system as claimed in claim 1, wherein, before executing the advanced image comparison program to determine whether the candidate region tile contains the preset image feature, the server device further executes a feature enhancement comparison program to enlarge the candidate region tile and analyze the enlarged candidate region tile to determine whether the enlarged candidate region tile contains the preset image feature, wherein only a candidate region tile judged by the feature enhancement comparison program not to contain the preset image feature is provided to the advanced image comparison program for subsequent operations.

The system as claimed in claim 1, wherein, before outputting the candidate region tile, the client device further executes a feature enhancement comparison program to enlarge the candidate region tile and analyze the enlarged candidate region tile to determine whether the enlarged candidate region tile contains the preset image feature, wherein only a candidate region tile judged by the feature enhancement comparison program not to contain the preset image feature is output from the client device.

The system as claimed in claim 1, wherein, when the server device determines that the received candidate region tile does not contain the preset image feature, the candidate region tile is used as a negative sample that does not contain the preset image feature to train the primary image comparison program.