Detailed Description
For a better understanding of the present disclosure, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only a part of the embodiments of the present disclosure, and not all of them. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without any creative effort, shall fall within the protection scope of the present disclosure.
As described above, current pedestrian re-identification usually matches the features of each pedestrian detected in an image or a video sequence against the features of known target pedestrians in a target pedestrian library constructed in advance, in order to determine whether the detected pedestrian corresponds to a known target pedestrian. However, existing pedestrian re-identification methods are prone to falsely recognizing a pedestrian for various reasons, such as a change in the pedestrian's posture. In particular, when the target pedestrian library holds only a single feature for matching, the same pedestrian may be erroneously recognized as another pedestrian once his or her posture and appearance change greatly. An exemplary pedestrian recognition error that may occur in pedestrian re-identification is described below in conjunction with Figs. 1(a) and 1(b).
Fig. 1(a) shows an exemplary pedestrian recognition error that occurs when the target feature of a target pedestrian is a feature associated with his or her half body, while the pedestrian appears in the frame as a whole body and a feature associated with the whole body is extracted for matching. As shown in the left diagram of Fig. 1(a), when a pedestrian has just walked into the field of view of the camera, he or she appears in the frame as a half body; at this time, a new ID is assigned to the pedestrian, and a feature associated with the half body of the pedestrian in the frame is taken as the target appearance feature for subsequent feature matching. Thereafter, as shown in the right diagram of Fig. 1(a), when the pedestrian has completely entered the field of view of the camera, he or she appears in the frame as a whole body. Since the feature associated with the whole body extracted from the frame differs greatly from the feature associated with the half body constructed in advance as described above, a matching error is likely to occur, and the pedestrian may be erroneously recognized as another pedestrian.
Fig. 1(b) shows an exemplary pedestrian recognition error that occurs when the target feature of a target pedestrian is a feature associated with his or her whole body, while the pedestrian appears in the frame as a half body and a feature associated with the half body is extracted for matching. As shown in the left diagram of Fig. 1(b), when a pedestrian appears in the frame as a whole body, a feature associated with the whole body of the pedestrian may be extracted from the frame as the target appearance feature for subsequent feature matching. Thereafter, as shown in the right diagram of Fig. 1(b), for example when the pedestrian is occluded or is leaving the field of view of the camera, the pedestrian appears in the frame only as a half body. Since the feature associated with the half body extracted from the frame differs greatly from the feature associated with the whole body constructed in advance as described above, the pedestrian is easily recognized erroneously as another pedestrian.
It can be seen that, in the above cases, when the posture of the pedestrian changes greatly, the pedestrian appearance corresponding to the feature extracted from the captured video frame and the pedestrian appearance corresponding to the pre-constructed feature of the target pedestrian may differ. For example, a feature associated with the half body of the pedestrian is extracted while the target feature is associated with the whole body, or a feature associated with the whole body is extracted while the target feature is associated with the half body. The difference between the two features to be matched is then large, and a pedestrian recognition error is likely to occur. In view of this, in the present disclosure, multiple appearance features associated with different appearances of a target pedestrian are considered together to construct the features of the target pedestrian, which are matched against the corresponding features of detected pedestrians, so that a sufficiently accurate recognition result can be provided even in scenes with large changes in pedestrian posture or in cross-camera recognition.
Next, a pedestrian re-identification apparatus according to an embodiment of the present disclosure will be described with reference to Fig. 2. Fig. 2 shows an exemplary structural block diagram of the pedestrian re-identification apparatus 100 according to an embodiment of the present disclosure. As shown in Fig. 2, the pedestrian re-identification apparatus 100 may include a pedestrian detection unit 110, a feature extraction unit 120, and a pedestrian recognition unit 130. The main functions of these units are described below.
The pedestrian detection unit 110 may perform pedestrian detection in each video frame of a video sequence. As mentioned above, unlike ordinary pedestrian tracking, the pedestrian re-identification technique according to the present disclosure can identify a specific target pedestrian in images or video sequences taken across different cameras, enabling long-term tracking and surveillance of that pedestrian. More specifically, the video sequence to be analyzed, i.e., the video sequence in which the presence of the target pedestrian needs to be identified, may be captured by a camera different from the one that previously captured the video frame from which the appearance features of the target pedestrian were extracted, or by the same camera at a different time.
In particular, the video sequence analyzed by the pedestrian detection unit 110 may include one or more video frames, which may be consecutive or non-consecutive video frames captured by a single camera or by multiple cameras with overlapping or non-overlapping fields of view. The pedestrian detection unit 110 may employ any suitable image detection technique in the art to detect pedestrians from the video frames, and the present disclosure is not limited in this regard. For example, the pedestrian detection unit 110 may perform foreground segmentation, edge extraction, motion detection, and the like on each video frame and determine the respective sub-image regions corresponding to the pedestrians appearing in the frame; for example, a sub-image region may be represented by a rectangular box circumscribing the body contour of the detected pedestrian. As another example, the pedestrian detection unit 110 may perform pedestrian detection on each video frame with a pre-trained pedestrian detection classifier based on a machine learning method such as a neural network or a support vector machine, to determine the position of each pedestrian appearing in the frame.
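As a concrete illustration only, the sketch below detects pedestrians in a frame and returns their circumscribing rectangles. OpenCV's stock HOG-plus-linear-SVM people detector stands in for whatever detector the pedestrian detection unit 110 actually employs, and the window stride, padding, and scale parameters are illustrative assumptions.

```python
# Minimal pedestrian detection sketch: returns (x, y, w, h) boxes that
# circumscribe the detected pedestrians, as described above. The stock
# OpenCV HOG people detector is only a stand-in for the unit's detector.
import cv2

def detect_pedestrians(frame):
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    boxes, _weights = hog.detectMultiScale(
        frame, winStride=(8, 8), padding=(8, 8), scale=1.05)
    return [tuple(box) for box in boxes]

# Usage (illustrative): boxes = detect_pedestrians(cv2.imread("frame.jpg"))
```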
The feature extraction unit 120 may extract the appearance features of each pedestrian detected in the video frame. In the present disclosure, the appearance features may include color features, texture features, shape features, facial features, and other features reflecting the appearance of the pedestrian, and the present disclosure is not limited in this regard. The methods for extracting such features are well known in the art and are not described in detail herein.
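For illustration, the following sketch computes one such appearance feature, a normalized color histogram over a pedestrian's sub-image region; it is a minimal example rather than the feature design of the disclosure, and texture, shape, or facial features could be extracted and concatenated in the same manner.

```python
# Sketch of a simple color appearance feature for one detected pedestrian.
import cv2
import numpy as np

def color_feature(frame, box, bins=8):
    x, y, w, h = box
    region = frame[y:y + h, x:x + w]            # pedestrian's sub-image region
    hist = cv2.calcHist([region], [0, 1, 2], None,
                        [bins] * 3, [0, 256] * 3)
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-8)           # normalize for size invariance
```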
The pedestrian recognition unit 130 may match the appearance features of the detected pedestrians against the target appearance features of a target pedestrian and recognize the target pedestrian from the video frame according to the matching result. Specifically, the pedestrian recognition unit 130 may match the appearance features of each detected pedestrian against the appearance features of the target pedestrian and determine, based on the similarity between the two, whether the pedestrian detected in the video frame corresponds to the target pedestrian, thereby performing pedestrian re-identification. Alternatively, there may be multiple target pedestrians; accordingly, the pedestrian recognition unit 130 may match the appearance features of each detected pedestrian against the constructed appearance features of the respective target pedestrians and determine, based on the similarities, whether the detected pedestrian corresponds to one of the target pedestrians. It will be appreciated that the target appearance features of a target pedestrian may be contained in a pre-constructed pedestrian appearance feature library, which may be stored locally at the pedestrian re-identification apparatus or at a server accessible to it. Alternatively, the target appearance features may be generated from a picture and input from the outside as needed, rather than stored in the library in advance. The present disclosure does not limit the manner in which the target appearance features of the target pedestrian are stored or retrieved.
In embodiments of the present disclosure, various feature matching methods may be employed to identify target pedestrians from video frames; an exemplary method is described below for completeness of illustration only. For example, the pedestrian recognition unit 130 may calculate a feature distance, such as a Manhattan distance, a Euclidean distance, or a Bhattacharyya distance, between the appearance feature of each detected pedestrian and the target appearance feature of a target pedestrian, and compare it with a preset threshold to determine whether the detected pedestrian corresponds to the target pedestrian. Alternatively, when there are multiple target pedestrians, the pedestrian recognition unit 130 may calculate, for a detected pedestrian, the feature distances to the respective target pedestrians, determine the closest feature distance among them, and, if the closest feature distance is below a preset threshold, recognize the detected pedestrian as the target pedestrian corresponding to that distance. The pedestrian recognition unit 130 may output a pedestrian re-identification result according to the matching result, for example, associating the detected pedestrian with the identity of a target pedestrian when the two correspond, and assigning a new identity to the detected pedestrian when it cannot be matched with any target pedestrian.
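A minimal sketch of this nearest-target rule follows, using a Euclidean feature distance; the dictionary layout mapping target identities to features and the threshold value are assumptions made for illustration.

```python
import numpy as np

def match_pedestrian(feature, target_features, threshold):
    """Return the identity of the closest target whose feature distance is
    below the threshold, or None if the detected pedestrian matches no one."""
    best_id, best_dist = None, float("inf")
    for target_id, target_feature in target_features.items():
        dist = float(np.linalg.norm(feature - target_feature))  # Euclidean
        if dist < best_dist:
            best_id, best_dist = target_id, dist
    return best_id if best_dist < threshold else None
```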
As described above, when the posture of a pedestrian changes greatly, the pedestrian appearance corresponding to the feature extracted from the captured video frame may differ from that corresponding to the pre-constructed feature of the target pedestrian, in which case the difference between the two features is large and an erroneous recognition result is likely. In the present disclosure, the target appearance features of the target pedestrian include at least a first appearance feature and a second appearance feature associated with the appearance of the target pedestrian, so that the multiple pre-constructed appearance features can sufficiently reflect the different characteristics associated with the different appearances of the target pedestrian, thereby improving the accuracy of pedestrian re-identification. It is to be understood that an appearance in the present disclosure may refer to the form in which the target pedestrian appears in the picture, such as appearing as a whole body, as a half body, from the front, from the side, or from the back, and the present disclosure does not limit this. Accordingly, the first and second appearance features are two appearance features associated with different appearances of the target pedestrian. The pedestrian recognition operation of the pedestrian re-identification apparatus 100 is described below for different examples of the first and second appearance features.
According to one embodiment of the present disclosure, the first appearance feature of the target pedestrian may be a feature associated with the whole body, and the second appearance feature may be a feature associated with the half body. As shown in Fig. 3(a), the first appearance feature of the target pedestrian may be a feature associated with the frontal whole body, and the second appearance feature may be a feature associated with the frontal half body. It is understood that the half body in the present disclosure is not limited to exactly half of the whole body; it may be any suitable proportion of the whole body, such as a quarter or a third, as desired. The pedestrian recognition operation of the pedestrian re-identification apparatus 100 in this case is described in detail below.
As described above, the pedestrian detection unit 110 may perform pedestrian detection in each video frame of the video sequence and determine the sub-image regions corresponding to the pedestrians appearing in the frame. The pedestrian detection unit 110 may also determine whether a pedestrian appears in the video frame as a whole body or a half body, for example based on the aspect ratio of the detected pedestrian in the frame (e.g., the aspect ratio of the pedestrian's circumscribing rectangular box), based on a whole-body/half-body classifier trained with a neural network, or by another suitable method. Accordingly, when the pedestrian detection unit 110 determines that the detected pedestrian appears in the video frame as a whole body, the feature extraction unit 120 may extract features associated with the whole body, for example appearance features such as color, texture, shape, and facial features extracted from the sub-image region in which the pedestrian is located. Meanwhile, the feature extraction unit 120 may cut out the half body of the detected pedestrian from the video frame and extract features associated with it, for example by cutting out the upper half of the sub-image region in which the pedestrian is located and extracting color, texture, shape, and facial features from the cut-out half-body sub-image region.
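A minimal sketch of the aspect-ratio heuristic and the half-body crop follows. The 1.5 height-to-width cutoff is an assumed value for illustration, not one taken from the disclosure; a trained whole-body/half-body classifier could serve instead.

```python
# Whole-body vs. half-body decision from the circumscribing box, plus the
# upper-half crop used to obtain the half-body sub-image region.
def appears_whole_body(box, ratio_cutoff=1.5):
    x, y, w, h = box
    return h / w > ratio_cutoff        # tall, narrow boxes suggest a whole body

def crop_half_body(frame, box):
    x, y, w, h = box
    return frame[y:y + h // 2, x:x + w]  # upper half of the sub-image region
```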
Next, the pedestrian recognition unit 130 may match the feature associated with the whole body of the detected pedestrian against the feature associated with the whole body of the target pedestrian, match the feature associated with the half body against the feature associated with the half body of the target pedestrian, and associate the pedestrian detected in the video frame with the identity of the target pedestrian when at least one of the two features matches the target pedestrian. That is, as long as either the feature associated with the detected pedestrian's whole body or the feature associated with the cut-out half body matches the corresponding feature of the target pedestrian, the pedestrian recognition unit 130 can recognize the pedestrian accurately rather than as another pedestrian. Alternatively, the pedestrian recognition unit 130 may first match one of the two features against the corresponding feature of the target pedestrian; when that matching succeeds, the matching operation for the remaining feature may be omitted, and only when it fails is the remaining feature matched, thereby reducing the computation of the matching process. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian, that is, neither the whole-body feature nor the half-body feature of the detected pedestrian matches the corresponding feature of the target pedestrian, a new identity may be assigned to the detected pedestrian.
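The whole-body-first strategy with the short-circuit just described might be sketched as follows; the keys "whole" and "half" in the target's feature record are illustrative assumptions.

```python
import numpy as np

def match_whole_then_half(whole_feat, half_feat, target, threshold):
    """Match the whole-body feature first; the half-body comparison is
    skipped whenever the first match succeeds, saving computation."""
    if target.get("whole") is not None and \
            np.linalg.norm(whole_feat - target["whole"]) < threshold:
        return True
    if target.get("half") is not None and \
            np.linalg.norm(half_feat - target["half"]) < threshold:
        return True
    return False
```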
Optionally, the pedestrian re-identification apparatus 100 may further include a feature updating unit (not shown). The feature updating unit may update the target appearance features of the target pedestrian according to the recognition result of the pedestrian recognition unit 130. Specifically, when a pedestrian detected in the video frame matches a target pedestrian, the feature updating unit may update the target appearance features of the target pedestrian using the feature associated with the whole body of the detected pedestrian and the feature associated with the half body. For example, when the pedestrian recognition unit 130 determines that a pedestrian detected in a video frame corresponds to a certain target pedestrian and associates the detected pedestrian with the identity of that target pedestrian: if the target pedestrian lacks a feature associated with the half body or a feature associated with the whole body, the feature updating unit may directly add the corresponding whole-body or half-body feature of the detected pedestrian to the part the target pedestrian lacks; for an existing whole-body feature of the target pedestrian, the feature updating unit may replace it with the whole-body feature of the detected pedestrian extracted from the video frame, or fuse the two as the updated whole-body feature of the target pedestrian; likewise, the feature updating unit may replace the half-body feature of the target pedestrian with the half-body feature extracted after cutting out the video frame, or fuse the two as the updated half-body feature of the target pedestrian. Alternatively, the feature updating unit may transmit the whole-body and half-body features of the detected pedestrian to a server that maintains the target appearance features of the target pedestrian, so that the server updates them based on the received features; in this way, the updated features can track at all times the characteristics corresponding to the different appearances of the target pedestrian. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian and assigns a new identity to it, the feature updating unit may construct target appearance features for the new identity, that is, take the whole-body feature of the detected pedestrian in the video frame as the whole-body feature of the target pedestrian, and take the half-body feature extracted after cutting out the video frame as the half-body feature of the target pedestrian, for subsequent feature matching.
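A sketch of this update rule follows: a missing appearance feature is added outright, while an existing one is either replaced or fused with the new observation. The weighted-average fusion and the weight alpha are assumptions; the disclosure leaves the fusion method open.

```python
import numpy as np

def update_feature(target, key, new_feature, alpha=0.5):
    """Update one appearance feature (e.g., key "whole" or "half") of a
    target's feature record after a successful match."""
    old = target.get(key)
    if old is None:
        target[key] = new_feature                    # fill the missing part
    else:
        target[key] = alpha * new_feature + (1.0 - alpha) * old  # fuse
```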
The operations of the feature extraction unit 120 and the pedestrian recognition unit 130 when the pedestrian detection unit 110 determines that the detected pedestrian appears in the video frame as a whole body have been described above. Their operations when the pedestrian detection unit 110 determines that the detected pedestrian appears in the video frame as a half body are described below.
Specifically, when the pedestrian detection unit 110 determines that the detected pedestrian appears in the video frame as a half body, the feature extraction unit 120 may extract features associated with the half body, for example appearance features such as color, texture, shape, and facial features extracted from the sub-image region in which the pedestrian is located. Accordingly, the pedestrian recognition unit 130 may match the feature associated with the detected pedestrian's half body against the feature associated with the target pedestrian's half body, and associate the pedestrian detected in the video frame with the identity of the target pedestrian when the half-body feature matches the target pedestrian. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian, that is, the half-body feature of the detected pedestrian does not match the half-body feature of the target pedestrian, a new identity may be assigned to the detected pedestrian.
Optionally, the feature updating unit may update the target appearance features of the target pedestrian according to the recognition result of the pedestrian recognition unit 130. Specifically, when the feature associated with the half body of the pedestrian detected in the video frame matches the feature associated with the half body of the target pedestrian, the feature updating unit may update the target appearance features of the target pedestrian with the half-body feature of the detected pedestrian. For example, the feature updating unit may replace the existing half-body feature of the target pedestrian with the half-body feature of the detected pedestrian extracted from the video frame, or fuse the extracted half-body feature with the existing half-body feature of the target pedestrian as the updated half-body feature. Alternatively, the feature updating unit may transmit the half-body feature of the detected pedestrian to a server maintaining the target appearance features of the target pedestrian, so that the server updates them based on the received feature, thereby enabling the updated features to track the characteristics corresponding to the different appearances of the target pedestrian. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian and assigns a new identity to it, the feature updating unit may construct a target appearance feature for the new identity, that is, take the half-body feature of the detected pedestrian in the video frame as the half-body feature of the target pedestrian for subsequent feature matching.
It should be understood that, for ease of explanation, the detailed description above takes the first appearance feature of the target pedestrian to be a feature associated with the frontal whole body and the second appearance feature to be a feature associated with the frontal half body. However, the whole-body and half-body features of the present disclosure are not limited to features extracted from the front of the target pedestrian. For example, the first appearance feature may be a feature associated with the lateral whole body of the target pedestrian and the second appearance feature a feature associated with the lateral half body, or the first appearance feature may be a feature associated with the back whole body and the second appearance feature a feature associated with the back half body. The pedestrian recognition unit 130 may match the features associated with the lateral or back whole body or half body of the detected pedestrian against the corresponding features of the target pedestrian to recognize the pedestrian from the video frame; the present disclosure is not limited in this regard.
According to another embodiment of the present disclosure, the first and second appearance features of the target pedestrian may be any two of a feature associated with the front, a feature associated with the side, and a feature associated with the back. As shown in Fig. 3(b), the first appearance feature of the target pedestrian may be a feature associated with the front, and the second appearance feature may be a feature associated with the side. It will be appreciated that the first and second appearance features may also be the features associated with the front and the back, or with the side and the back, respectively. The pedestrian recognition operation of the pedestrian re-identification apparatus 100 in this case is described in detail below.
As described above, the pedestrian detection unit 110 may perform pedestrian detection in each video frame of the video sequence and determine the sub-image regions corresponding to the pedestrians appearing in the frame. Accordingly, using suitable feature extraction methods, the feature extraction unit 120 may extract, from the sub-image region in which a pedestrian is located, appearance features such as color, texture, and shape features associated with the pedestrian's front, side, or back. Next, the pedestrian recognition unit 130 may match the feature associated with the front, side, or back of the detected pedestrian against the first and second appearance features of the target pedestrian (e.g., the feature associated with the front and the feature associated with the side), and associate the pedestrian detected in the video frame with the identity of the target pedestrian when any of those features matches the target pedestrian. That is, although a pedestrian constantly changes posture owing to high mobility, the target pedestrian has multiple appearance features associated with multiple appearances; therefore, as long as the feature actually extracted for one appearance of the detected pedestrian matches either the first or the second appearance feature of the target pedestrian, the pedestrian recognition unit 130 can recognize the pedestrian accurately rather than as another pedestrian. Optionally, the feature extraction unit 120 may further determine whether the pedestrian appears in the video frame from the front, the side, or the back, for example by using a trained neural network as a classifier or by analyzing the aspect ratio of the sub-image region in which the pedestrian is located, the proportion of the region occluded by hair, and the like. Thereafter, the pedestrian recognition unit 130 knows whether the feature extracted for the pedestrian is associated with the front, the side, or the back, and can match it against the corresponding feature of the target pedestrian while omitting the matching operations against the remaining features, thereby reducing the computation of the matching process.
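The orientation-aware matching just described might look like the sketch below, where the keys "front", "side", and "back" and the dictionary layout are illustrative assumptions; the orientation label is assumed to come from the classifier mentioned above.

```python
import numpy as np

def match_by_orientation(feature, orientation, target, threshold):
    """Compare only against the target feature of the detected orientation
    ("front", "side", or "back"), omitting the other two comparisons."""
    target_feature = target.get(orientation)
    if target_feature is None:
        return False
    return float(np.linalg.norm(feature - target_feature)) < threshold
```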
Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian, that is, none of the features associated with the front, side, or back of the detected pedestrian matches the feature associated with the front or the feature associated with the side of the target pedestrian, a new identity may be assigned to the detected pedestrian.
Optionally, the pedestrian re-identification apparatus 100 may further include a feature updating unit (not shown), which may update the target appearance features of the target pedestrian according to the recognition result of the pedestrian recognition unit 130. Specifically, when a pedestrian detected in the video frame matches a target pedestrian, the feature updating unit may update the target appearance features of the target pedestrian using the feature associated with the front, side, or back of the detected pedestrian. For example, when the pedestrian recognition unit 130 determines that a pedestrian detected in a video frame corresponds to a certain target pedestrian and associates the detected pedestrian with the identity of that target pedestrian, taking as an example that the front feature of the detected pedestrian matches the front feature of the target pedestrian, the feature updating unit may replace the existing front feature of the target pedestrian with the front feature of the detected pedestrian extracted from the video frame, or fuse the two as the updated front feature of the target pedestrian. Alternatively, the feature updating unit may transmit the front feature of the detected pedestrian to a server that maintains the target appearance features of the target pedestrian, so that the server updates them based on the received feature. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the target pedestrian and assigns a new identity to it, the feature updating unit may construct target appearance features for the new identity for subsequent feature matching.
Further, in the present disclosure, the appearance features of the target pedestrian are not limited to two kinds but may be of more kinds. For example, as shown in Fig. 4, multiple appearance features associated with each target pedestrian may be taken as the target appearance features of that pedestrian, where the first appearance feature may be a feature associated with the frontal whole body of the target pedestrian, the second a feature associated with the lateral whole body, the third a feature associated with the back whole body, the fourth a feature associated with the frontal half body, the fifth a feature associated with the lateral half body, the sixth a feature associated with the back half body, and so on. It is understood that any of the above appearance features may serve as a target appearance feature of the target pedestrian for subsequent feature matching. In this case, the feature extraction unit 120 may extract, for a detected pedestrian, appearance features such as color, texture, and shape features associated with the frontal whole body, lateral whole body, back whole body, frontal half body, lateral half body, back half body, or another appearance of the detected pedestrian. Accordingly, the pedestrian recognition unit 130 may match the extracted features against the various appearance features of the target pedestrian and associate the pedestrian detected in the video frame with the identity of the target pedestrian as long as some appearance feature of the detected pedestrian matches one of the target pedestrian's appearance features, so that the pedestrian can be recognized accurately rather than as another pedestrian. Optionally, the feature extraction unit 120 may further determine whether the pedestrian appears in the video frame as a frontal whole body, a lateral whole body, a back whole body, a frontal half body, a lateral half body, a back half body, or in another form. Thereafter, the pedestrian recognition unit 130 knows which appearance the extracted feature is associated with and can match it against the corresponding feature of the target pedestrian while omitting the matching operations against the remaining features, thereby reducing the computation of the matching process. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with any of the appearance features of the target pedestrian, a new identity may be assigned to the detected pedestrian.
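One way to organize such a many-appearance feature library is to key each target's features by an (orientation, extent) pair, as in the sketch below; the class layout and the key names are illustrative assumptions rather than structures specified by the disclosure.

```python
from typing import Dict, Tuple
import numpy as np

AppearanceKey = Tuple[str, str]   # e.g. ("front", "whole"), ("side", "half")

class TargetEntry:
    """Holds the multiple appearance features of one target pedestrian."""
    def __init__(self, target_id: str):
        self.target_id = target_id
        self.features: Dict[AppearanceKey, np.ndarray] = {}

    def matches(self, key: AppearanceKey, feature: np.ndarray,
                threshold: float) -> bool:
        stored = self.features.get(key)
        return stored is not None and \
            float(np.linalg.norm(feature - stored)) < threshold
```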
Optionally, the feature updating unit may update the target appearance features of the target pedestrian according to the recognition result of the pedestrian recognition unit 130. In particular, when a pedestrian detected in the video frame matches a target pedestrian, the feature updating unit may update the target appearance features of the target pedestrian with the various appearance features of the detected pedestrian (e.g., the features associated with the frontal whole body, lateral whole body, back whole body, frontal half body, lateral half body, or back half body of the detected pedestrian). The details of this update operation are similar to those described above and are not repeated here. Further, when the pedestrian recognition unit 130 determines that the pedestrian detected in the video frame cannot be matched with the respective features of the target pedestrian and assigns a new identity to it, the feature updating unit may construct target appearance features for the new identity for subsequent feature matching.
As described above, the pedestrian re-identification apparatus 100 according to the embodiments of the present disclosure recognizes a target pedestrian using at least a first appearance feature and a second appearance feature associated with the appearance of the target pedestrian, so that the different characteristics related to the different appearances of the target pedestrian can be tracked effectively, improving the accuracy of pedestrian re-identification in scenes with large changes in pedestrian posture or in cross-camera recognition.
The block diagrams used in the description of the above embodiments show blocks in units of functions. These functional blocks (structural units) may be implemented by any combination of hardware and/or software. The means for implementing each functional block is not particularly limited; that is, each functional block may be implemented by one physically and/or logically integrated apparatus, or by two or more physically and/or logically separate apparatuses connected directly and/or indirectly (for example, by wire and/or wirelessly).
Fig. 5 is a hardware block diagram illustrating an exemplary computing device for implementing the pedestrian re-identification apparatus of an embodiment of the present disclosure. The pedestrian re-identification apparatus 100 described above may be configured as a computer device physically including a processor 501, a memory 502, a storage 503, a communication device 504, an input device 505, an output device 506, a bus 507, and the like.
In the following description, the word "device" may be replaced with "circuit", "unit", or the like. The hardware configuration of the pedestrian re-identification apparatus 100 may include one or more of each of the devices shown in the figure, or may omit some of them.
For example, although only one processor 501 is illustrated, there may be multiple processors. Processing may be executed by one processor, or by one or more processors simultaneously, sequentially, or in another manner. The processor 501 may also be implemented by one or more chips.
Each function of the pedestrian re-identification apparatus 100 is realized, for example, by reading predetermined software (a program) into hardware such as the processor 501 and the memory 502, whereupon the processor 501 performs operations, controls communication by the communication device 504, and controls reading and/or writing of data in the memory 502 and the storage 503.
The processor 501 controls the computer as a whole, for example by running an operating system. The processor 501 may be a Central Processing Unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like. For example, the pedestrian detection unit 110, the feature extraction unit 120, and the pedestrian recognition unit 130 described above may be implemented by the processor 501.
Further, the processor 501 reads programs (program code), software modules, data, and the like from the storage 503 and/or the communication device 504 into the memory 502, and executes various processes according to them. As the program, a program that causes a computer to execute at least a part of the operations described in the above embodiments may be used.
The memory 502 is a computer-readable recording medium and may be configured by at least one of a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Random Access Memory (RAM), and other suitable storage media. The memory 502 may also be referred to as a register, a cache, a main memory (primary storage), or the like. The memory 502 may store an executable program (program code), software modules, and the like for implementing the pedestrian re-identification method according to an embodiment of the present disclosure.
The storage 503 is a computer-readable recording medium and may be configured by at least one of a flexible disk, a floppy (registered trademark) disk, a magneto-optical disk, an optical disc (for example, a Compact Disc Read Only Memory (CD-ROM), a Digital Versatile Disc (DVD), or a Blu-ray (registered trademark) disc), a removable disk, a hard disk drive, a smart card, a flash memory device (for example, a card, a stick, or a key drive), a magnetic stripe, a database, a server, and other suitable storage media. The storage 503 may also be referred to as a secondary storage device.
The communication device 504 is hardware (a transmission/reception device) for communication between computers via a wired and/or wireless network, and is also referred to as a network device, a network controller, a network card, a communication module, or the like. The communication device 504 may include a high-frequency switch, a duplexer, a filter, a frequency synthesizer, and the like in order to implement, for example, Frequency Division Duplexing (FDD) and/or Time Division Duplexing (TDD). For example, the video sequence to be analyzed may be received through the communication device 504.
The input device 505 is an input device (for example, a keyboard, a mouse, a microphone, a switch, a button, or a sensor) that accepts input from the outside. The output device 506 is an output device (for example, a display, a speaker, or a Light Emitting Diode (LED) lamp) that performs output to the outside. The input device 505 and the output device 506 may be integrated (e.g., as a touch panel).
The respective devices such as the processor 501 and the memory 502 are connected via a bus 507 for communicating information. The bus 507 may be constituted by a single bus or may be constituted by buses different among devices.
Further, the pedestrian re-identification apparatus 100 may include hardware such as a microprocessor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Programmable Logic Device (PLD), or a Field Programmable Gate Array (FPGA), and part or all of each functional block may be realized by such hardware. For example, the processor 501 may be implemented by at least one of these hardware components.
Next, a flowchart of a pedestrian re-identification method according to an embodiment of the present disclosure will be described with reference to Fig. 6. The steps of the method substantially correspond to the operations of the respective components of the pedestrian re-identification apparatus 100 described above in connection with Figs. 2 to 5; therefore, to avoid repetition, only a brief description of the method is given below, and details already described are omitted.
As shown in Fig. 6, in step S601, pedestrian detection is performed in each video frame of the video sequence. As mentioned above, unlike ordinary pedestrian tracking, the pedestrian re-identification technique according to the present disclosure can identify a specific target pedestrian in images or video sequences taken across different cameras, enabling long-term tracking and surveillance of that pedestrian. The video sequence is the sequence to be analyzed, in which the presence of the target pedestrian needs to be identified; it may be captured by a camera different from the one that previously captured the video frame from which the appearance features of the target pedestrian were extracted, or by the same camera at a different time. The video sequence may include one or more video frames, which may be consecutive or non-consecutive video frames captured by a single camera or by multiple cameras with overlapping or non-overlapping fields of view.
In this step, any suitable image detection technique in the art may be employed to detect pedestrians from the video frames of the video sequence, and the present disclosure is not limited in this regard. For example, each video frame may be subjected to foreground segmentation, edge extraction, motion detection, and the like, and the respective sub-image regions corresponding to the pedestrians appearing in the frame determined; for example, a sub-image region may be represented by a rectangular box circumscribing the body contour of the detected pedestrian. As another example, pedestrian detection may be performed on each video frame with a pre-trained pedestrian detection classifier based on a machine learning method such as a neural network or a support vector machine, to determine the position of each pedestrian appearing in the frame.
In step S602, the appearance features of each pedestrian detected in the video frame are extracted. The appearance features may include color features, texture features, shape features, facial features, and other features reflecting the appearance of the pedestrian, and the present disclosure is not limited in this regard. Feature extraction methods known in the art may be used in this step and are not described in detail herein.
In step S603, the appearance features of the detected pedestrians are matched against the target appearance features of the target pedestrian, and the target pedestrian is identified from the video frame according to the matching result. In this step, the appearance features of each detected pedestrian may be matched against the appearance features of the target pedestrian, and whether the pedestrian detected in the video frame corresponds to the target pedestrian may be determined based on the similarity between the two, thereby performing pedestrian re-identification. Optionally, there may be multiple target pedestrians; correspondingly, the appearance features of each detected pedestrian may be matched against the constructed appearance features of the respective target pedestrians, and whether the detected pedestrian corresponds to one of the target pedestrians determined based on the similarities. It will be appreciated that the target appearance features of a target pedestrian may be contained in a pre-constructed pedestrian appearance feature library, which may be stored locally at the pedestrian re-identification apparatus or at a server accessible to it. Alternatively, the target appearance features may be generated from a picture and input from the outside as needed, rather than stored in the library in advance.
In this embodiment, various feature matching methods may be employed to identify the target pedestrian from the video frames. For example, in step S603, a feature distance, such as a Manhattan distance, a Euclidean distance, or a Bhattacharyya distance, between the appearance feature of each detected pedestrian and the target appearance feature of the target pedestrian may be calculated and compared with a preset threshold to determine whether the detected pedestrian corresponds to the target pedestrian. Alternatively, when there are multiple target pedestrians, the feature distances between a detected pedestrian and the respective target pedestrians may be calculated, the closest feature distance among them determined, and, if the closest feature distance is below a preset threshold, the detected pedestrian identified as the target pedestrian corresponding to that distance. Thereafter, a pedestrian re-identification result may be output based on the matching result, for example associating the detected pedestrian with the identity of a target pedestrian when the two correspond, and assigning a new identity to the detected pedestrian when it cannot be matched with any target pedestrian.
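Putting steps S601 to S603 together, a minimal end-to-end loop might look like the following sketch; detect_pedestrians, color_feature, and match_pedestrian are the illustrative helpers sketched earlier in this description, not names from the disclosure.

```python
# End-to-end sketch of steps S601-S603. Relies on the illustrative helpers
# detect_pedestrians, color_feature, and match_pedestrian defined above.
def reidentify(frames, target_features, threshold):
    results, next_id = [], 0
    for frame in frames:
        for box in detect_pedestrians(frame):           # step S601: detect
            feature = color_feature(frame, box)         # step S602: extract
            identity = match_pedestrian(feature, target_features, threshold)
            if identity is None:                        # step S603: match
                identity = "pedestrian_%d" % next_id    # assign a new identity
                target_features[identity] = feature
                next_id += 1
            results.append((box, identity))
    return results
```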
As described above, in the present disclosure, the target appearance features of the target pedestrian include at least a first appearance feature and a second appearance feature associated with the appearance of the target pedestrian, so that the multiple pre-constructed appearance features can sufficiently reflect the different characteristics associated with the different appearances of the target pedestrian, thereby improving the accuracy of pedestrian re-identification. It is to be understood that an appearance in the present disclosure may refer to the form in which the target pedestrian appears in the picture, such as appearing as a whole body, as a half body, from the front, from the side, or from the back, and the present disclosure does not limit this. Accordingly, the first and second appearance features are two appearance features associated with different appearances of the target pedestrian. The pedestrian re-identification method is described in detail below for different examples of the first and second appearance features.
According to one embodiment of the present disclosure, the first appearance feature of the target pedestrian may be a feature associated with the whole body, and the second appearance feature may be a feature associated with the half body. It is understood that the half body in the present disclosure is not limited to exactly half of the whole body; it may be any suitable proportion of the whole body, such as a quarter or a third, as desired. The specific operation of the pedestrian re-identification method in this case is described below.
As described above, pedestrian detection may be performed in each video frame of the video sequence in step S601, and the sub-image regions corresponding to the pedestrians appearing in the frame determined. Step S601 may further include determining whether a pedestrian appears in the video frame as a whole body or a half body, for example based on the aspect ratio of the detected pedestrian in the frame (e.g., the aspect ratio of the pedestrian's circumscribing rectangular box), based on a whole-body/half-body classifier trained with a neural network, or by another suitable method.
Accordingly, when it is determined in step S601 that the detected pedestrian appears in the video frame as a whole body, step S602 may include extracting features associated with the whole body of the detected pedestrian, for example appearance features such as color, texture, shape, and facial features extracted from the sub-image region in which the pedestrian is located. Step S602 may further include cutting out the half body of the detected pedestrian from the video frame and extracting features associated with it, for example by cutting out the upper half of the sub-image region in which the pedestrian is located and extracting color, texture, shape, and facial features from the cut-out half-body sub-image region.
Next, step S603 may include matching the feature associated with the whole body of the detected pedestrian against the feature associated with the whole body of the target pedestrian and matching the feature associated with the half body against the feature associated with the half body of the target pedestrian, and associating the pedestrian detected in the video frame with the identity of the target pedestrian when at least one of the two features matches the target pedestrian. Further, step S603 may include assigning a new identity to the detected pedestrian when it is determined that the pedestrian detected in the video frame cannot be matched with the target pedestrian, that is, neither the whole-body feature nor the half-body feature of the detected pedestrian matches the corresponding feature of the target pedestrian.
Optionally, the pedestrian re-identification method may further include updating the target appearance features of the target pedestrian according to the pedestrian recognition result. Specifically, when it is determined in step S603 that the pedestrian detected in the video frame matches the target pedestrian, the target appearance features of the target pedestrian may be updated with the feature associated with the whole body of the detected pedestrian and the feature associated with the half body. For example, when it is determined in step S603 that a pedestrian detected in the video frame corresponds to a certain target pedestrian and the detected pedestrian is associated with the identity of that target pedestrian: if the target pedestrian lacks a feature associated with the half body or with the whole body, the corresponding whole-body or half-body feature of the detected pedestrian may be added directly to the part the target pedestrian lacks; an existing whole-body feature of the target pedestrian may be replaced with the whole-body feature of the detected pedestrian extracted from the video frame, or the two may be fused as the updated whole-body feature; likewise, the half-body feature of the target pedestrian may be replaced with the half-body feature extracted after cutting out the video frame, or the two may be fused as the updated half-body feature, so that the updated features can track at all times the characteristics corresponding to the different appearances of the target pedestrian. Further, when it is determined that the pedestrian detected in the video frame cannot be matched with the target pedestrian and a new identity is assigned to it, target appearance features may be constructed for the new identity, that is, the whole-body feature of the detected pedestrian in the video frame taken as the whole-body feature of the target pedestrian, and the half-body feature extracted after cutting out the video frame taken as the half-body feature of the target pedestrian, for subsequent feature matching.
The pedestrian recognition operation when it is determined that the detected pedestrian appears in the video frame as a whole body has been described above. The corresponding operation when it is determined that the detected pedestrian appears in the video frame as a half body is described below.
Specifically, when it is determined in step S601 that the detected pedestrian appears in the video frame as a half body, step S602 may include extracting features associated with the half body of the pedestrian, for example appearance features such as color, texture, shape, and facial features extracted from the sub-image region in which the pedestrian is located. Accordingly, step S603 may include matching the feature associated with the detected pedestrian's half body against the feature associated with the target pedestrian's half body, and associating the pedestrian detected in the video frame with the identity of the target pedestrian when the half-body feature matches the target pedestrian. Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the target pedestrian, a new identity may be assigned to the detected pedestrian.
Optionally, the pedestrian re-identification method may further include the following step: when it is determined in step S603 that the feature associated with the half body of the pedestrian detected in the video frame matches the feature associated with the half body of the target pedestrian, updating the target appearance features of the target pedestrian with the half-body feature of the detected pedestrian. For example, the existing half-body feature of the target pedestrian may be replaced with the half-body feature of the detected pedestrian extracted from the video frame, or the extracted half-body feature may be fused with the existing half-body feature of the target pedestrian as the updated half-body feature, thereby enabling the updated features to track the characteristics corresponding to the different appearances of the target pedestrian. Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the target pedestrian and a new identity is assigned to it, a target appearance feature may be constructed for the new identity, that is, the half-body feature of the detected pedestrian in the video frame taken as the half-body feature of the target pedestrian for subsequent feature matching.
It should be understood that, for ease of explanation, the foregoing detailed description has taken the first appearance feature of the target pedestrian to be a feature associated with its front whole body and the second appearance feature to be a feature associated with its front half-body. However, the whole-body-associated and half-body-associated features of the present disclosure are not limited to features extracted from the target pedestrian viewed from the front. For example, the first appearance feature may be a feature associated with the lateral whole body of the target pedestrian and the second appearance feature a feature associated with its lateral half-body, or the first appearance feature may be a feature associated with the back whole body of the target pedestrian and the second appearance feature a feature associated with its back half-body. In these cases, the pedestrian re-identification method may match the features associated with the lateral or back whole body or half-body of the detected pedestrian with the corresponding features of the target pedestrian to identify the pedestrian from the video frames; the present disclosure is not limited in this respect.
According to another embodiment of the present disclosure, the first and second appearance features of the target pedestrian may be any two of a feature associated with the front face, a feature associated with the side face, and a feature associated with the back face. For example, the first appearance feature of the target pedestrian may be a feature associated with its front face, while the second appearance feature may be a feature associated with its side face. The specific operation of the pedestrian re-identification method in this case will be described specifically below.
As described above, pedestrian detection may be performed in each video frame of the video sequence in step S601, and the sub-image region corresponding to each pedestrian appearing in the video frame may be determined. Accordingly, in step S602, feature extraction methods such as color, texture, and shape feature extraction may be adopted to extract, from the sub-image region in which the pedestrian is located, appearance features (e.g., a color feature, a texture feature, and a shape feature) associated with the front face of the pedestrian, with its side face, or with its back face. Next, in step S603, the feature associated with the front face, side face, or back face of the detected pedestrian may be matched with the first appearance feature and the second appearance feature of the target pedestrian (for example, the feature associated with the front face and the feature associated with the side face), and when the extracted feature matches either of them, the pedestrian detected in the video frame may be associated with the identity of the target pedestrian. That is, although the posture of a pedestrian changes from moment to moment owing to its high mobility, the target pedestrian has a plurality of appearance features associated with a plurality of its appearances; therefore, as long as the feature actually extracted for the detected pedestrian under one of these appearances matches one of the first appearance feature and the second appearance feature of the target pedestrian, the pedestrian can be accurately identified without being mistaken for another pedestrian. Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the target pedestrian, that is, when the feature associated with the front face, side face, or back face of the detected pedestrian matches neither the feature associated with the front face of the target pedestrian nor the feature associated with the side face of the target pedestrian, a new identity may be assigned to the detected pedestrian.
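The "match against any stored appearance feature" rule described above may be sketched as follows; the helper name and threshold are illustrative assumptions, and the features are again assumed to be L2-normalized:

```python
import numpy as np

def matches_target(extracted_feat, target_feats, threshold=0.35):
    """Return True as soon as the feature extracted for the detected
    pedestrian (front, side, or back) is close enough to ANY of the
    target's stored appearance features, e.g. the first (front) and
    second (side) appearance features."""
    return any(
        1.0 - float(np.dot(extracted_feat, f)) < threshold  # cosine distance
        for f in target_feats
        if f is not None  # skip appearance features the target lacks
    )
```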
Optionally, the pedestrian re-identification method may further include the following step: updating the target appearance features of the target pedestrian according to the pedestrian recognition result. Specifically, when it is determined in step S603 that the pedestrian detected in the video frame matches the target pedestrian, the target appearance features of the target pedestrian may be updated with the feature associated with the front face of the detected pedestrian, the feature associated with its side face, or the feature associated with its back face. For example, when it is determined in step S603 that the pedestrian detected in the video frame corresponds to a certain target pedestrian and the detected pedestrian is associated with the identity of the target pedestrian,
taking as an example the case where the feature associated with the front face of the detected pedestrian matches the feature associated with the front face of the target pedestrian, the existing feature associated with the front face of the target pedestrian may be replaced with the feature associated with the front face of the detected pedestrian extracted from the video frame, or the two may be fused as the updated feature associated with the front face of the target pedestrian. Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the target pedestrian and a new identity is assigned to the detected pedestrian, the feature updating unit may construct target appearance features for it for subsequent feature matching.
Further, in the present disclosure, the appearance features of the target pedestrian are not limited to two kinds, but may be of more kinds. For example, a plurality of appearance features associated with each target pedestrian may be taken as the target appearance features of that target pedestrian, wherein the first appearance feature may be a feature associated with the front whole body of the target pedestrian, the second a feature associated with its side whole body, the third a feature associated with its back whole body, the fourth a feature associated with its front half-body, the fifth a feature associated with its side half-body, the sixth a feature associated with its back half-body, and so on. It is understood that any of the above-mentioned appearance features may be used as the appearance features of the target pedestrian for subsequent feature matching. In this case, in step S602, appearance features such as a color feature, a texture feature, or a shape feature associated with the front whole body, side whole body, back whole body, front half-body, side half-body, back half-body, or another appearance of the detected pedestrian may be extracted. Accordingly, in step S603, the extracted features may be matched with the various appearance features of the target pedestrian, and the pedestrian detected in the video frame may be associated with the identity of the target pedestrian as long as the extracted appearance feature matches one of the various appearance features of the target pedestrian, so that the pedestrian may be accurately identified without being mistaken for another pedestrian. Alternatively, in step S602, it may be further determined whether the pedestrian appears in the video frame as a front whole body, a side whole body, a back whole body, a front half-body, a side half-body, a back half-body, or another shape. Thereafter, in step S603, it is known with which shape the feature extracted for the pedestrian is associated, so that the extracted feature may be matched only with the corresponding feature of the target pedestrian while the matching operations with the remaining features of the target pedestrian are omitted, thereby reducing the amount of computation of the matching process. Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the various appearance features of the target pedestrian, a new identity may be assigned to the detected pedestrian.
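A sketch of this shape-hinted matching is given below, under the assumptions of L2-normalized features, hypothetical shape labels, and a hypothetical threshold; only the stored feature corresponding to the detected shape is compared, and the remaining comparisons are omitted:

```python
import numpy as np

# Illustrative shape labels for a detection.
SHAPES = ('front_whole', 'side_whole', 'back_whole',
          'front_half', 'side_half', 'back_half')

def match_with_shape(det_feat, shape, gallery, threshold=0.35):
    """Compare the detection only against each target's feature for
    the detected shape, reducing the cost of the matching process.

    `gallery` maps target pedestrian IDs to dicts of shape -> feature.
    """
    if shape not in SHAPES:
        raise ValueError(f'unknown shape label: {shape}')
    best_id, best_dist = None, float('inf')
    for pid, feats in gallery.items():
        feat = feats.get(shape)
        if feat is None:
            continue  # this target has no stored feature for the shape
        dist = 1.0 - float(np.dot(det_feat, feat))  # cosine distance
        if dist < best_dist:
            best_id, best_dist = pid, dist
    return best_id if best_dist < threshold else None
```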
Optionally, the pedestrian re-identification method may further include the following step: updating the target appearance features of the target pedestrian according to the pedestrian recognition result. Specifically, when it is determined in step S603 that the pedestrian detected in the video frame matches the target pedestrian, the target appearance features of the target pedestrian may be updated with the various appearance features associated with the detected pedestrian (e.g., a feature associated with the front whole body of the detected pedestrian, a feature associated with its side whole body, its back whole body, its front half-body, its side half-body, its back half-body, or the like). Further, when it is determined in step S603 that the pedestrian detected in the video frame cannot be matched with the respective features of the target pedestrian and a new identity is assigned to the detected pedestrian, target appearance features may be constructed for it for subsequent feature matching.
As described above, according to the pedestrian re-identification method of the embodiments of the present disclosure, at least the first appearance feature and the second appearance feature associated with the shape of the target pedestrian are used for identifying the target pedestrian, so that the different characteristics related to the different shapes of the target pedestrian can be effectively tracked, and the accuracy of pedestrian re-identification is improved in scenarios where the posture of the pedestrian changes greatly or the pedestrian is recognized across cameras.
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also include a computer-readable storage medium having stored thereon computer program instructions executable by a processor to cause the processor to perform the above-described pedestrian re-identification method.
Result Verification
Hereinafter, the results of verifying the following two pedestrian re-identification methods are presented: (1) pedestrian re-identification based on a single appearance feature of the target pedestrian, without considering the shape factor of the target pedestrian; and (2) pedestrian re-identification based on the first appearance feature and the second appearance feature associated with the shape of the target pedestrian. Specifically, in the exemplary verification, a target pedestrian library including the appearance features of a plurality of target pedestrians is constructed in advance, pedestrian detection is performed on one or more video frames of a video sequence to be analyzed, and the target pedestrians are identified from the video frames by calculating the feature distances between the appearance features of the respective detected pedestrians and the appearance features of the constructed target pedestrians. In the verification, the accuracy of pedestrian re-identification by the two methods is evaluated using the following two indexes:
Rank-1: this index represents the probability that, after the appearance features of a detected pedestrian are matched with the appearance features of the target pedestrians, the top-ranked target pedestrian in the recognition result (i.e., the target pedestrian with the smallest feature distance) is the correct result, and may be expressed as follows:
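Assuming N query detections with ground-truth identities y_i, gallery features g_j of the target pedestrians, and a feature distance d(·,·) (symbols introduced here purely for illustration), a standard formulation consistent with this definition is:

\[
\mathrm{Rank\text{-}1} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{1}\!\left[\operatorname*{arg\,min}_{j}\, d\!\left(q_i,\, g_j\right) = y_i\right]
\]

where q_i denotes the appearance feature extracted for the i-th detected pedestrian and \(\mathbf{1}[\cdot]\) is the indicator function.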
Accuracy: this index represents the average of the probabilities that the same pedestrian retains the same ID throughout the image sequence, and may be expressed as follows:
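Assuming M distinct pedestrians, where pedestrian k appears in n_k frames and keeps its assigned ID in n_k^same of those frames (symbols introduced here purely for illustration), a formulation consistent with this definition is:

\[
\mathrm{Accuracy} = \frac{1}{M}\sum_{k=1}^{M}\frac{n_k^{\mathrm{same}}}{n_k}
\]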
Based on the above two indexes, the test results of the two pedestrian re-identification methods are shown in Table 1.
TABLE 1
As shown in Table 1, the pedestrian re-identification method of the present disclosure, which is based on the first appearance feature and the second appearance feature associated with the shape of the target pedestrian, achieves better results on both indexes.
Further, figs. 7(a)-7(b) schematically show the results of re-identifying a pedestrian by the above two pedestrian re-identification methods. Specifically, fig. 7(a) shows the result of pedestrian re-identification based on a single appearance feature of the target pedestrian: taking the circled pedestrian as an example, the same pedestrian is recognized as P7 and P3 in different video frames due to changes in shooting angle and posture. Fig. 7(b) shows the result of pedestrian re-identification based on the first appearance feature and the second appearance feature associated with the shape of the target pedestrian: because the target pedestrian has a plurality of appearance features associated with its shape, correct recognition results are obtained at different shooting angles and pedestrian postures.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. However, it is noted that the advantages, effects, and the like mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description only and is not intended to be limiting, since the present disclosure is not limited to the specific details described above.
The term "according to" used in the present specification does not mean "according only" unless explicitly stated in other paragraphs. In other words, the statement "according to" means both "according to only" and "according to at least".
Any reference to elements using designations such as "first" and "second" in this specification does not generally limit the quantity or order of those elements. These designations may be used in this specification as a convenient way of distinguishing between two or more elements. Thus, reference to a first unit and a second unit does not mean that only two units may be employed, or that the first unit must precede the second unit in some manner.
The term "determining" used in the present specification may include various operations. For example, regarding "determination (determination)", calculation (computing), estimation (computing), processing (processing), derivation (deriving), investigation (analyzing), search (looking up) (for example, a search in a table, a database, or another data structure), confirmation (ascertaining), and the like may be regarded as "determination (determination)". In addition, regarding "determination (determination)", reception (e.g., reception information), transmission (e.g., transmission information), input (input), output (output), access (access) (e.g., access to data in a memory), and the like may be regarded as "determination (determination)". Further, regarding "judgment (determination)", it is also possible to regard solution (solving), selection (selecting), selection (breathing), establishment (evaluating), comparison (comparing), and the like as performing "judgment (determination)". That is, with respect to "determining (confirming)", several actions may be considered as performing "determining (confirming)".
The terms "connected", "coupled" or any variation thereof as used in this specification refer to any connection or coupling, either direct or indirect, between two or more elements, and may include the following: between two units "connected" or "coupled" to each other, there are one or more intermediate units. The combination or connection between the elements may be physical, logical, or a combination of both. For example, "connected" may also be replaced with "accessed". As used in this specification, two units may be considered to be "connected" or "joined" to each other by the use of one or more wires, cables, and/or printed electrical connections, and by the use of electromagnetic energy or the like having wavelengths in the radio frequency region, the microwave region, and/or the optical (both visible and invisible) region, as a few non-limiting and non-exhaustive examples.
When the terms "including", "including" and "comprising" and variations thereof are used in the present specification or claims, these terms are open-ended as in the term "including". Further, the term "or" as used in the specification or claims is not exclusive or.
The embodiments and modes described in this specification may be used alone or in combination, and may also be switched in accordance with execution. Further, the order of the processing steps, sequences, flowcharts, and the like of the embodiments and modes described in this specification may be changed as long as no contradiction arises. For example, with respect to the methods described in this specification, the various elements of the steps are presented in an exemplary order, and the methods are not limited to the specific order presented.
It should also be noted that, in the apparatus and methods of the present disclosure, software (whether referred to as software, firmware, middleware, microcode, a hardware description language, or by another name) is to be construed broadly to mean commands, command sets, code, code segments, program code, programs, subprograms, software modules, applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.
Further, software, commands, information, and the like may be transmitted or received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using wired technologies (e.g., coaxial cable, fiber optic cable, twisted pair, or digital subscriber line (DSL)) and/or wireless technologies (e.g., infrared or microwave), these wired and/or wireless technologies are included within the definition of a transmission medium.
While the present invention has been described in detail above, it is apparent to those skilled in the art that the present invention is not limited to the embodiments described in the present specification. The present invention can be implemented with modifications and variations without departing from the spirit and scope of the present invention as defined by the claims. Therefore, the description in the present specification is for illustrative purposes and has no limiting meaning with respect to the present invention.