TWI883556B - Re-identification device and method

Info

Publication number: TWI883556B
Application number: TW112136582A
Authority: TW
Inventors: 李牧柔; 謝鈞惟; 高志忠; 李潤容
Original assignee: 台達電子工業股份有限公司
Priority date: 2023-09-25
Filing date: 2023-09-25
Publication date: 2025-05-11
Also published as: TW202514552A

Abstract

A re-identification device includes a storage and a processor. The processor is configured to extract an object image corresponding to an object from an input image. The processor is further configured to calculate a reliability score corresponding to the object of the object image based on the object image. The processor is further configured to calculate visible scores corresponding to key points of the object image based on the object image. The processor is further configured to determine whether to store an image including the object image to the storage based on the reliability score and the visible scores of the object image. The processor is further configured to perform a re-identification step based on the image.

Description

Re-identification device and method

The present disclosure relates to an identification device and method, and more particularly to a re-identification device and method.

Re-identification technology classifies images containing objects, such as human bodies, according to the objects' appearance features, so as to group together the images that correspond to the same object (for example, the same person). Realizing re-identification requires collecting a large number of images containing objects as identification data.

For example, many re-identification technologies use images of people captured by surveillance cameras as identification data. However, factors such as camera position, shooting angle, shooting distance, imaging quality, and occluding objects often leave the person occluded, captured only from behind, too small in the frame, or blurred. The captured images then fail to show every part of the person clearly, which leads to poor re-identification results.

In view of this, how to select image data suitable for re-identification is a goal the industry urgently needs to pursue.

To solve the above problem, the present disclosure provides a re-identification device comprising a memory and a processor. The processor is coupled to the memory and is configured to perform the following operations: extracting a first object image corresponding to an object from a first input image; calculating, based on the first object image, a confidence score of the first object image corresponding to the object; calculating, based on the first object image, a plurality of visibility scores of the first object image corresponding to a plurality of key points; determining, based on the confidence score and the visibility scores of the first object image, whether to store a first image including the first object image in the memory as one of a plurality of images; and performing a re-identification step based on the images stored in the memory.

The present disclosure also provides a re-identification method applicable to an electronic device that includes a memory. The re-identification method includes: extracting a first object image corresponding to an object from a first input image; calculating, based on the first object image, a confidence score of the first object image corresponding to the object; calculating, based on the first object image, a plurality of visibility scores of the first object image corresponding to a plurality of key points; determining, based on the confidence score and the visibility scores of the first object image, whether to store the first object image in the memory as one of a plurality of images; and performing a re-identification step based on the images stored in the memory.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are intended to provide further explanation of the disclosure as claimed.

To make the description of the present disclosure more detailed and complete, reference may be made to the attached drawings and the various embodiments described below, in which the same numbers represent the same or similar elements.

Please refer to FIG. 1, which is a schematic diagram of a re-identification device 1 in the first embodiment of the present disclosure. As shown in FIG. 1, the re-identification device 1 includes a processor 12 and a memory 14, and the processor 12 is coupled to the memory 14. The re-identification device 1 selects images suitable as re-identification data according to the imaging quality of the objects in an input image. The specific operation of the re-identification device 1 is described below.

In some embodiments, the processor 12 may include a central processing unit (CPU), a graphics processing unit (GPU), a multiprocessor, a distributed processing system, an application-specific integrated circuit (ASIC), and/or another suitable computing unit.

In some embodiments, the memory 14 may include semiconductor or solid-state memory, magnetic tape, a removable computer disk, random access memory (RAM), read-only memory (ROM), a hard disk, and/or an optical disk.

First, referring to FIG. 2, the processor 12 of the re-identification device 1 extracts an object image OBJ corresponding to an object from an input image IMG. As shown in FIG. 2, after the processor 12 recognizes a person in the input image IMG, it can extract the object image OBJ containing the person.

Furthermore, the processor 12 calculates, based on the object image OBJ, a confidence score RS of the object image OBJ corresponding to the object, where the confidence score RS indicates the confidence level that the object image OBJ contains the object (i.e., a human body).

In some embodiments, the processor 12 may use an image recognition algorithm to identify whether the input image IMG contains an object so as to extract the object image OBJ, and, based on the object image OBJ, calculate a reliability score that the object image OBJ contains the object and use it as the confidence score RS. In some embodiments, the confidence score RS is a value between 0 and 1, and a higher confidence score RS indicates a higher probability that the object image OBJ contains the object. In some embodiments, the processor 12 may use the YOLOv7 (You Only Look Once version 7) algorithm to extract the object image OBJ and calculate the confidence score RS.
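As a minimal, non-limiting sketch of the extraction described above, the following Python fragment crops an object image out of an input image given one detection. The `Detection` record, its field names, and the list-of-rows image representation are assumptions made for illustration; the disclosure names YOLOv7 as one possible detector but does not specify any output format, and no detector is actually invoked here.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output. Field names are hypothetical, not from the disclosure."""
    x1: int
    y1: int
    x2: int
    y2: int
    confidence: float  # plays the role of the confidence score RS, in [0, 1]


def crop_object_image(image, det):
    """Return the part of `image` inside the detection box.

    `image` is a list of pixel rows; the box is clamped to the image bounds.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    top, bottom = max(0, det.y1), min(height, det.y2)
    left, right = max(0, det.x1), min(width, det.x2)
    return [row[left:right] for row in image[top:bottom]]


# A 4x6 dummy "image" whose pixel values encode (row, column).
img = [[(r, c) for c in range(6)] for r in range(4)]
det = Detection(x1=1, y1=1, x2=4, y2=3, confidence=0.93)
obj = crop_object_image(img, det)
print(len(obj), len(obj[0]), det.confidence)  # 2 3 0.93
```

In this sketch the cropped region stands in for the object image OBJ and `det.confidence` stands in for the confidence score RS.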

Next, referring to FIG. 3, the processor 12 calculates, based on the object image OBJ, the visibility scores of the object image OBJ corresponding to key points KP0-KP16. As shown in FIG. 3, the processor 12 marks the key points KP0-KP16 in the object image OBJ according to the parts of the object.

In this embodiment, the 5 key points KP0-KP4 are facial key points and the 12 key points KP5-KP16 are body key points. The facial key points include the nose key point KP0, the left eye key point KP1, the right eye key point KP2, the left ear key point KP3, and the right ear key point KP4. The body key points include the left shoulder key point KP5, the right shoulder key point KP6, the left elbow key point KP7, the right elbow key point KP8, the left wrist key point KP9, the right wrist key point KP10, the left waist key point KP11, the right waist key point KP12, the left knee key point KP13, the right knee key point KP14, the left ankle key point KP15, and the right ankle key point KP16.
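The key point numbering of this embodiment can be written down as a simple lookup table. The following Python sketch is illustrative only; the identifier names (`KEYPOINT_NAMES`, `keypoint_label`, and so on) are assumptions, while the order and the face/body split follow the enumeration above.

```python
# Index i in this list corresponds to key point KPi of this embodiment.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",   # KP0-KP4 (face)
    "left_shoulder", "right_shoulder",                           # KP5-KP6
    "left_elbow", "right_elbow",                                 # KP7-KP8
    "left_wrist", "right_wrist",                                 # KP9-KP10
    "left_waist", "right_waist",                                 # KP11-KP12
    "left_knee", "right_knee",                                   # KP13-KP14
    "left_ankle", "right_ankle",                                 # KP15-KP16
]

FACE_INDICES = range(0, 5)    # KP0-KP4
BODY_INDICES = range(5, 17)   # KP5-KP16


def keypoint_label(index):
    """Return a readable label such as 'KP14 (right_knee)'."""
    return f"KP{index} ({KEYPOINT_NAMES[index]})"


print(keypoint_label(14))  # KP14 (right_knee)
print(len(FACE_INDICES), len(BODY_INDICES))  # 5 12
```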

In addition, the processor 12 calculates the visibility scores corresponding to the key points KP0-KP16, where each visibility score indicates how visible the corresponding key point is in the object image OBJ.

In some embodiments, each visibility score is a value between 0 and 1, and a higher visibility score indicates a higher probability that the object image OBJ contains the corresponding key point. For example, if the right leg of the human body in the object image OBJ is occluded, the visibility scores of the right knee key point KP14 and the right ankle key point KP16 are 0; if the object image OBJ shows only the back of the human body, the visibility scores of the facial key points are 0.

In addition, if the human body in the object image OBJ is far from the capturing equipment (e.g., a surveillance camera or a camera), so that the resolution of the object image OBJ is low, or if poor imaging quality blurs the object image OBJ, the processor 12 may also calculate lower visibility scores for the key points KP0-KP16. Conversely, if the object image OBJ shows the front of the human body, with no part occluded and/or with high resolution, the processor 12 calculates higher visibility scores for the key points KP0-KP16.

In some embodiments, the visibility scores corresponding to the facial key points KP0-KP4 are facial visibility scores, and the visibility scores corresponding to the body key points KP5-KP16 are body visibility scores.
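To illustrate the grouping into facial and body visibility scores, the sketch below splits a list of 17 scores and reproduces the back-view example, in which the facial scores are 0. The function names and the sample values are hypothetical; only the KP0-KP4 / KP5-KP16 split comes from the text.

```python
FACE_SLICE = slice(0, 5)    # KP0-KP4
BODY_SLICE = slice(5, 17)   # KP5-KP16


def split_visibility(scores):
    """Split 17 per-key-point visibility scores into (facial, body) lists."""
    if len(scores) != 17:
        raise ValueError("expected 17 visibility scores (KP0-KP16)")
    return list(scores[FACE_SLICE]), list(scores[BODY_SLICE])


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical back view: facial key points invisible, body clearly visible.
back_view = [0.0] * 5 + [0.9] * 12
facial, body = split_visibility(back_view)
print(len(facial), len(body), mean(facial))  # 5 12 0.0
```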

In some embodiments, the processor 12 can calculate the visibility scores corresponding to the key points KP0-KP16 through the following operations: determining the position of each of the key points KP0-KP16 in the object image OBJ, where each key point corresponds to a part of the object; and calculating the visibility score of each key point based on a plurality of pixels at that key point's position in the object image OBJ. In other words, after marking the positions of the key points KP0-KP16 in the object image OBJ, the processor 12 uses the pixels around each position to calculate the corresponding visibility score.

In some embodiments, the processor 12 may use an image recognition algorithm to calculate, for the key points KP0-KP16, reliability scores that the object image OBJ contains the respective parts of the human body, and use them as the visibility scores of the key points KP0-KP16. In some embodiments, the processor 12 may use the YOLOv7-Pose algorithm to mark the key points KP0-KP16 and calculate their visibility scores.

Next, the processor 12 determines, based on the confidence score RS and the visibility scores of the object image OBJ, whether to store an image including the object image OBJ in the memory 14. Finally, the processor 12 performs the re-identification step based on the images stored in the memory 14.

In some embodiments, the processor 12 can determine whether the confidence score RS and the visibility scores are greater than a threshold (e.g., 0.8), and, if they are, store the image including the object image OBJ in the memory 14 for the re-identification step. What the image including the object image OBJ is can depend on actual needs, such as the flow of the re-identification step or the desired re-identification performance; it can be the object image OBJ itself, the input image IMG, and/or an image block containing the object image OBJ cropped from the input image IMG.
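The threshold test above can be sketched as follows. This is one possible reading, stated here as an assumption: the image is kept only if the confidence score and every one of the visibility scores exceed the threshold. The function name and sample values are illustrative.

```python
def passes_threshold(confidence, visibility_scores, threshold=0.8):
    """Return True if the confidence score and all visibility scores exceed the threshold.

    Requiring *every* visibility score to pass is an assumed interpretation;
    the text only says the scores are compared against a threshold.
    """
    return confidence > threshold and all(v > threshold for v in visibility_scores)


clear_front_view = [0.95] * 17
occluded_leg = [0.95] * 17
occluded_leg[14] = 0.0  # right knee (KP14) hidden
occluded_leg[16] = 0.0  # right ankle (KP16) hidden

print(passes_threshold(0.9, clear_front_view))  # True
print(passes_threshold(0.9, occluded_leg))      # False
```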

In some embodiments, the processor 12 may further calculate a first total score based on the confidence score RS and the visibility scores of the object image OBJ, and, in response to the first total score being greater than a threshold, store the object image OBJ in the memory 14. For example, the processor 12 sums the confidence score RS and the visibility scores and averages them to obtain the first total score, and stores the object image OBJ in the memory 14 when the first total score is greater than the threshold (e.g., 0.8).
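The averaging example above can be sketched as follows, assuming that the confidence score and each visibility score count equally, so that 18 values are averaged when there are 17 key points. The function names are hypothetical.

```python
def first_total_score(confidence, visibility_scores):
    """Average the confidence score together with all visibility scores.

    Equal weighting of all values is an assumption of this sketch.
    """
    values = [confidence, *visibility_scores]
    return sum(values) / len(values)


def should_store(confidence, visibility_scores, threshold=0.8):
    """Store the object image only if the total score exceeds the threshold."""
    return first_total_score(confidence, visibility_scores) > threshold


sharp = [0.9] * 17
blurry = [0.5] * 17
print(should_store(0.95, sharp), should_store(0.95, blurry))  # True False
```

Unlike a per-score threshold, an averaged total lets a few weak key points be offset by strong ones, which may or may not be desirable depending on the application.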

In some embodiments, the processor 12 may further calculate the first total score based on the confidence score RS of the object image OBJ, the visibility scores of the object image OBJ, and a plurality of weights corresponding to the confidence score RS and the visibility scores. Specifically, the confidence score RS and the visibility scores of the different parts of the object each represent a different attribute of the object image OBJ, so the re-identification device 1 can set the corresponding weights according to how much emphasis each attribute should receive.

For example, if facial features need to be recognized more reliably, the re-identification device 1 can set the weights of the facial key points' visibility scores to higher values. Facial visibility then counts for more than the confidence score RS and the visibility scores of the other parts of the object image OBJ, so the re-identification device 1 selects images with clearer facial features.
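A weighted total score of this kind might be computed as in the sketch below. Normalizing by the sum of the weights, and the particular weight values (3.0 for facial key points, 1.0 elsewhere), are assumptions chosen for illustration; the disclosure does not fix a formula.

```python
def weighted_total_score(confidence, visibility_scores,
                         confidence_weight, visibility_weights):
    """Weighted average of the confidence score and the visibility scores.

    Dividing by the weight sum keeps the result in [0, 1] when the inputs
    are in [0, 1]; this normalization is an assumption of the sketch.
    """
    if len(visibility_scores) != len(visibility_weights):
        raise ValueError("one weight is required per visibility score")
    weight_sum = confidence_weight + sum(visibility_weights)
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = confidence_weight * confidence + sum(
        w * v for w, v in zip(visibility_weights, visibility_scores)
    )
    return weighted / weight_sum


uniform = [1.0] * 17
face_heavy = [3.0] * 5 + [1.0] * 12   # emphasize KP0-KP4

# Hypothetical back view: face not visible, body fully visible.
back_view = [0.0] * 5 + [1.0] * 12
print(weighted_total_score(1.0, back_view, 1.0, uniform))
print(weighted_total_score(1.0, back_view, 1.0, face_heavy))
```

With the face-heavy weights, the back-view image scores lower than with uniform weights, so images lacking a visible face are less likely to pass a threshold.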

In some embodiments, the image data that the re-identification device 1 selects and stores in the memory 14 is used for the re-identification step, where the re-identification step includes: after receiving an image to be identified, grouping the image to be identified into a first category of a plurality of identification categories; and, based on the first category, outputting the identified images corresponding to the first category.

Specifically, the re-identification step can group the input images so that images corresponding to the same object (e.g., the same person) fall into the same category. Therefore, after the re-identification step receives and groups the image to be identified, it can further output the other images (i.e., the identified images) that are grouped into the same category as the image to be identified.
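One simple way to picture this grouping is a nearest-centroid lookup over feature vectors, sketched below. Everything here is an illustrative assumption: the disclosure does not specify how features are obtained or compared, so the two-dimensional vectors, the cosine similarity measure, and the names `gallery` and `reidentify` are hypothetical stand-ins.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def reidentify(query_feature, gallery):
    """Assign the query to the most similar category and return that
    category together with the ids of the images already grouped in it."""
    best = max(
        gallery,
        key=lambda cat: cosine(query_feature, centroid([f for _, f in gallery[cat]])),
    )
    return best, [image_id for image_id, _ in gallery[best]]


# Hypothetical gallery: category -> list of (image id, feature vector).
gallery = {
    "person_a": [("cam1_t0", [1.0, 0.0]), ("cam2_t5", [0.9, 0.1])],
    "person_b": [("cam3_t2", [0.0, 1.0])],
}
print(reidentify([0.8, 0.2], gallery))
```

Here the returned image ids play the role of the identified images output for the first category.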

In this way, the re-identification step can be used to track when and where the same object appears. For example, after the images from multiple surveillance cameras installed at different locations are grouped by the re-identification step, one can determine which camera's footage the same person appears in at each point in time, and thereby obtain the person's whereabouts.

In some embodiments, the re-identification step further includes training a re-identification model based on the images stored in the memory 14. Specifically, the re-identification device 1 can use the images in the memory 14 as training data to train the re-identification model. The re-identification model can be a clustering model. During training, the re-identification model can extract the features of the objects from the images in the training data, and the trained re-identification model can, like the re-identification step in the aforementioned embodiments, group the input images so that images corresponding to the same object (e.g., the same person) fall into the same category.

In some embodiments, the processor 12 of the re-identification device 1 further tracks the object in a second input image to extract a second object image corresponding to the object, where a capture time of the second input image is later than the capture time of the first input image; calculates, based on the second object image, the confidence score of the second object image corresponding to the object; calculates, based on the second object image, the visibility scores of the second object image corresponding to the key points; and determines, based on the confidence score and the visibility scores of the second object image, whether to store a second image including the second object image in the memory as one of the images.

Specifically, the re-identification device 1 can further receive an input image that follows the input image IMG (for example, the next frame of a video) and, using the same operations as those used to decide whether to store the object image OBJ as one of the images, determine whether to use the second object image extracted from the subsequent image for the re-identification step.

In some embodiments, the processor 12 of the re-identification device 1 can track the position of the same object across different frames of a video (i.e., the input image IMG and the subsequent images), and further determine whether to store the frame images in the memory 14. In some embodiments, the processor 12 can use an object tracking algorithm (e.g., the DeepSort algorithm) to track the position of the same object across multiple images.

In some embodiments, the re-identification device 1 can also calculate a first total score based on the confidence score RS and the visibility scores of the object image OBJ; calculate a second total score based on the confidence score RS and the visibility scores of the subsequent image; and decide, based on the first total score and the second total score, which images to store in the memory 14 for the re-identification step.

In the same way as the first total score is calculated above, the processor 12 of the re-identification device 1 can calculate the second total score corresponding to the subsequent image, and decide which images to store based on the first total score and the second total score, for example by storing both or either of the object image OBJ and the second object image in the memory 14.

For example, the processor 12 can extract multiple images (i.e., object images) containing the same person from multiple frames of a video, calculate a total score for each image from its confidence score and visibility scores, and finally select images according to their total scores. For instance, the processor 12 can sort the total scores and store the one or more images with the highest total scores for the re-identification step, or use a threshold (e.g., 0.8) to select the images whose total scores exceed the threshold for the re-identification step. In this way, the re-identification device 1 can extract images containing the same person from a video and select those with better imaging quality for the re-identification step.
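The two selection strategies just described (keeping the highest-scoring images, or keeping every image above a threshold) can be sketched as follows. The frame names and total scores are hypothetical sample data, and the function names are illustrative.

```python
def select_top_k(scored_images, k=1):
    """Return the names of the k images with the highest total scores."""
    ranked = sorted(scored_images, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def select_above(scored_images, threshold=0.8):
    """Return the names of all images whose total score exceeds the threshold."""
    return [name for name, score in scored_images if score > threshold]


# Hypothetical total scores for one tracked person across three frames.
track = [("frame_0", 0.62), ("frame_1", 0.91), ("frame_2", 0.85)]
print(select_top_k(track, k=1))   # ['frame_1']
print(select_above(track, 0.8))   # ['frame_1', 'frame_2']
```

Top-k selection bounds how many images are stored per tracked object, whereas threshold selection bounds their minimum quality; which is preferable depends on the storage budget and the needs of the re-identification step.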

It should be noted that the above embodiments are described using a single object image OBJ contained in the input image IMG as an example. In other embodiments, when the input image contains multiple objects, the re-identification device 1 can extract multiple object images through the same operations and determine, for each of them, whether to store it in the memory 14 for the re-identification step.

It should also be noted that although the above embodiments use the human body as an example of the object to illustrate the operation of the re-identification device 1, the technology proposed in the present disclosure is not limited thereto. The re-identification device 1 can apply the same operations to object images of other types of objects (e.g., animals).

In summary, the re-identification device 1 proposed in the present disclosure can extract an object image containing an object from an input image, and determine, based on the imaging quality of the object image, whether to use the object image for the re-identification step. In addition, the re-identification device 1 can set weights for different parts of the object so as to select images in which specific parts are imaged well. Furthermore, the re-identification device 1 can extract multiple object images corresponding to the same object from multiple frames of a video and select the object images with better imaging quality for the re-identification step.

The present disclosure also provides a re-identification method 200 applicable to an electronic device (e.g., the re-identification device 1), where the electronic device includes a memory (e.g., the memory 14). In some embodiments, the electronic device further includes a processor (e.g., the processor 12) for executing the re-identification method 200. As shown in FIG. 4, the re-identification method 200 includes steps S201 to S205.

In step S201, the electronic device extracts a first object image corresponding to an object from a first input image. In step S202, the electronic device calculates, based on the first object image, a confidence score of the first object image corresponding to the object. In step S203, the electronic device calculates, based on the first object image, a plurality of visibility scores of the first object image corresponding to a plurality of key points. In step S204, the electronic device determines, based on the confidence score and the visibility scores of the first object image, whether to store the first object image in the memory as one of a plurality of images. In step S205, the electronic device performs a re-identification step based on the images stored in the memory.
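The order of steps S201 to S205 can be summarized in a short Python sketch. Each step is passed in as a callable so that the sketch makes no claim about how any individual step is implemented; the function name, parameter names, and stub implementations below are all hypothetical.

```python
def run_method_200(input_image, extract, score_object, score_key_points,
                   should_store, reidentify, memory):
    """Run steps S201-S205 once for a single input image."""
    object_image = extract(input_image)              # S201: extract object image
    confidence = score_object(object_image)          # S202: confidence score
    visibility = score_key_points(object_image)      # S203: visibility scores
    if should_store(confidence, visibility):         # S204: storage decision
        memory.append(object_image)
    return reidentify(memory)                        # S205: re-identification step


# Stub steps for demonstration only; real steps would use actual models.
stubs = dict(
    extract=lambda img: img["crop"],
    score_object=lambda obj: obj["rs"],
    score_key_points=lambda obj: obj["vis"],
    should_store=lambda rs, vis: rs > 0.8 and min(vis) > 0.8,
    reidentify=lambda stored: len(stored),   # placeholder: count stored images
)

memory = []
good_frame = {"crop": {"rs": 0.95, "vis": [0.9] * 17}}
poor_frame = {"crop": {"rs": 0.30, "vis": [0.2] * 17}}
print(run_method_200(good_frame, memory=memory, **stubs))  # 1
print(run_method_200(poor_frame, memory=memory, **stubs))  # 1
```

The second call leaves the memory unchanged because the low-quality object image fails the storage decision in step S204.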

In some embodiments, step S204 further includes calculating a first total score based on the confidence score and the visibility scores of the first object image; and, in response to the first total score being greater than a threshold, storing the first object image in the memory.

In some embodiments, the step of calculating the first total score further includes calculating the first total score based on the confidence score of the first object image, the visibility scores of the first object image, and a plurality of weights corresponding to the confidence score and the visibility scores.

In some embodiments, the re-identification method 200 further includes tracking the object in a second input image to extract a second object image corresponding to the object, where a capture time of the second input image is later than the capture time of the first input image; calculating, based on the second object image, the confidence score of the second object image corresponding to the object; calculating, based on the second object image, the visibility scores of the second object image corresponding to the key points; and determining, based on the confidence score and the visibility scores of the second object image, whether to store a second image including the second object image in the memory as one of the images.

In some embodiments, step S204 further includes calculating a first total score based on the confidence score and the visibility scores of the first object image; calculating a second total score based on the confidence score and the visibility scores of the second object image; and determining, based on the first total score and the second total score, the content to be stored in the memory as the images.

In some embodiments, step S203 further includes determining a position of each of the key points in the first object image, where each of the key points corresponds to a part of the object; and calculating, based on a plurality of pixels at the position of each key point in the first object image, the one of the visibility scores that corresponds to that key point.

In some embodiments, the key points include a plurality of facial key points and a plurality of body key points, and the visibility scores include a plurality of facial visibility scores corresponding to the facial key points and a plurality of body visibility scores corresponding to the body key points.

In some embodiments, step S202 further includes calculating, based on the first object image, a reliability score that the first object image contains the object, and using it as the confidence score.

In some embodiments, the re-identification step further includes: receiving an image to be identified and grouping the image to be identified into a first category of a plurality of identification categories; and, based on the first category, outputting a plurality of identified images corresponding to the first category.

In summary, the re-identification method 200 proposed in the present disclosure can extract an object image containing an object from an input image, and determine, based on the imaging quality of the object image, whether to use the object image for the re-identification step. In addition, the re-identification device 1 can set weights for different parts of the object so as to select images in which specific parts are imaged well. Furthermore, the re-identification device 1 can extract multiple object images corresponding to the same object from multiple frames of a video and select the object images with better imaging quality for the re-identification step.

Although several embodiments are described in detail above as examples, the re-identification device and method proposed in the present disclosure can also be implemented by other systems, hardware, software, storage media, or combinations thereof. Therefore, the scope of protection of the present disclosure should not be limited to the specific implementations described in the embodiments, but should be defined by the appended claims.

It will be apparent to those of ordinary skill in the art to which the present disclosure pertains that various modifications and variations can be made to the structure of the present disclosure without departing from its scope or spirit. In view of the foregoing, the scope of protection of the present disclosure also covers modifications and variations that fall within the scope of the appended claims.

1: Re-identification device

12: Processor

14: Storage

IMG: Input image

RS: Confidence score

OBJ: Object image

KP0~KP16: Key points

200: Re-identification method

S201~S205: Steps

To make the above and other objects, features, advantages, and embodiments of the present disclosure easier to understand, the accompanying drawings are described as follows: Figure 1 is a schematic diagram of the re-identification device in the first embodiment of the present disclosure; Figure 2 is a schematic diagram of extracting an object image from an input image and calculating a confidence score in the first embodiment of the present disclosure; Figure 3 is a schematic diagram of the key points in an object image in the first embodiment of the present disclosure; and Figure 4 is a flow chart of the re-identification method in the second embodiment of the present disclosure.

Domestic deposit information (please note in the order of depositary institution, date, and number): None. Foreign deposit information (please note in the order of depositary country, institution, date, and number): None.

1: Re-identification device

12: Processor

14: Storage

Claims

A re-identification device, comprising: a memory; and a processor coupled to the memory, the processor being configured to perform the following operations: extracting a first object image corresponding to an object from a first input image; calculating, based on the first object image, a confidence score that the first object image corresponds to the object; calculating, based on the first object image, a plurality of visible scores corresponding to a plurality of key points of the first object image; determining, based on the confidence score and the visible scores of the first object image, whether to store a first image including the first object image in the memory as one of a plurality of images; and performing a re-identification step based on the images stored in the memory; wherein the operation of determining whether to store the first image in the memory as one of the images further includes: calculating a first total score based on the confidence score and the visible scores of the first object image; and storing the first object image in the memory in response to the first total score being greater than a threshold.

The re-identification device as described in claim 1, wherein the operation of calculating the first total score further includes: calculating the first total score based on the confidence score of the first object image, the visible scores of the first object image, and a plurality of weights corresponding to the confidence score and the visible scores.

The re-identification device as described in claim 1, wherein the processor is further configured to perform the following operations: tracking the object in a second input image to extract a second object image corresponding to the object, wherein a capture time of the second input image is later than a capture time of the first input image; calculating, based on the second object image, the confidence score that the second object image corresponds to the object; calculating, based on the second object image, the visible scores corresponding to the key points of the second object image; and determining, based on the confidence score and the visible scores of the second object image, whether to store a second image including the second object image in the memory as one of the images.

The re-identification device as described in claim 3, wherein the operation of determining whether to store in the memory as one of the images comprises: calculating a first total score based on the confidence score and the visible scores of the first object image; calculating a second total score based on the confidence score and the visible scores of the second object image; and determining, based on the first total score and the second total score, the content to be stored in the memory as one of the images.

The re-identification device as described in claim 1, wherein the operation of calculating the visible scores corresponding to the key points of the first object image further includes: determining a position of each of the key points in the first object image, wherein each of the key points corresponds to a part of the object; and calculating, based on a plurality of pixels at the position of each of the key points in the first object image, the one of the visible scores corresponding to that key point.

The re-identification device as described in claim 1, wherein the key points include a plurality of facial key points and a plurality of body key points, and the visible scores include a plurality of facial visible scores corresponding to the facial key points and a plurality of body visible scores corresponding to the body key points.

The re-identification device as described in claim 1, wherein the operation of calculating the confidence score that the first object image corresponds to the object further includes: calculating, based on the first object image, a confidence level that the first object image includes the object, and using the confidence level as the confidence score.

The re-identification device as described in claim 1, wherein the re-identification step further includes: receiving an image to be identified; grouping the image to be identified into a first category among a plurality of identification categories; and, based on the first category, outputting a plurality of identified images corresponding to the first category.

A re-identification method applicable to an electronic device, wherein the electronic device includes a memory, the re-identification method comprising: extracting a first object image corresponding to an object from a first input image; calculating, based on the first object image, a confidence score that the first object image corresponds to the object; calculating, based on the first object image, a plurality of visible scores corresponding to a plurality of key points of the first object image; determining, based on the confidence score and the visible scores of the first object image, whether to store a first image including the first object image in the memory as one of a plurality of images; and performing a re-identification step based on the images stored in the memory; wherein the step of determining whether to store the first image in the memory as one of the images further includes: calculating a first total score based on the confidence score and the visible scores of the first object image; and storing the first object image in the memory in response to the first total score being greater than a threshold.