
CN113989761B - Object tracking method, device, electronic device and storage medium - Google Patents

Object tracking method, device, electronic device and storage medium

Info

Publication number
CN113989761B
CN113989761B CN202111275859.XA CN202111275859A
Authority
CN
China
Prior art keywords
information
image
determining
historical
preset object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111275859.XA
Other languages
Chinese (zh)
Other versions
CN113989761A (en)
Inventor
丁华杰
马强
赵杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Automotive Innovation Corp
Original Assignee
China Automotive Innovation Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Automotive Innovation Corp filed Critical China Automotive Innovation Corp
Priority to CN202111275859.XA priority Critical patent/CN113989761B/en
Publication of CN113989761A publication Critical patent/CN113989761A/en
Application granted granted Critical
Publication of CN113989761B publication Critical patent/CN113989761B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an object tracking method and device, an electronic device and a storage medium. The method includes: acquiring a current image and a historical image; when a preset object exists in the historical image, acquiring historical state information of at least one first preset object in the historical image; determining position prediction information corresponding to the at least one first preset object according to the historical state information; determining an image acquisition type according to the position prediction information; when the image acquisition type is a cross-camera acquisition type, determining at least one combination and the distance information and first characteristic information corresponding to each combination; determining weight information corresponding to each combination according to the distance information and the first characteristic information; and updating object track information according to the weight information. The technical scheme of the invention realizes target tracking for surround-view automatic driving, reduces the possibility of losing objects across cameras, and improves tracking efficiency.

Description

Object tracking method and device, electronic device and storage medium
Technical Field
The present invention relates to the field of automatic driving technology, and in particular to an object tracking method, an object tracking device, an electronic device, and a storage medium.
Background
Automatic driving is currently a popular field, and it presents a need for research on multi-target tracking. However, most existing cross-camera pedestrian tracking methods originate from research in the field of surveillance video.
A major difference from the surround-view setting of automatic driving is that the scenes of multiple surveillance cameras overlap, so the same target can be captured at relative positions within the same scene. In contrast, the surround-view cameras commonly used in automatic driving today are mounted below the vehicle head, the vehicle tail and the two rearview mirrors, and the four cameras point in different directions. After the four images are de-distorted, parts of the corresponding images disappear, and the backgrounds captured by the four cameras have little similarity, which makes matching targets during tracking considerably more difficult.
Automatic driving is well known to place high demands on efficiency. After targets are detected and classified in the images acquired by the four cameras, these targets need to be tracked. Analyzing the images already takes a great deal of time, so reducing the amount of computation in the tracking stage as much as possible is a key difficulty, and the time consumed multiplies when the number of cameras is large. For example, since four surround-view cameras with completely different orientations are currently used, having to analyze all four cameras simultaneously at every tracking step creates a certain efficiency problem.
Disclosure of Invention
The object tracking method and device, the electronic device and the storage medium provided by the invention determine position prediction information from historical images, determine an image acquisition type based on the position prediction information, determine at least one combination and the weight information corresponding to each combination based on the image acquisition type, and update object track information through the weight information. This realizes object tracking for surround-view automatic driving, reduces the possibility of losing objects across cameras, reduces time consumption to a certain extent, and improves object tracking efficiency.
In order to achieve the above object, the present invention provides the following solutions:
An object tracking method, the method comprising:
acquiring a current image and a historical image;
acquiring respective historical state information of at least one first preset object in the historical image based on the historical image under the condition that a preset object exists in the historical image;
determining position prediction information corresponding to each of the at least one first preset object according to the historical state information;
determining an image acquisition type according to the position prediction information;
determining at least one combination based on the current image and the historical image and determining distance information and corresponding first characteristic information corresponding to the at least one combination under the condition that the image acquisition type is a cross-camera acquisition type;
determining weight information corresponding to each of the at least one combination according to the distance information and the first characteristic information;
and updating the object track information according to the weight information.
Optionally, after determining the image acquisition type according to the position prediction information, the method further includes:
determining at least one combination based on the current image and the historical image and determining the intersection ratio information and corresponding second characteristic information of the at least one combination under the condition that the image acquisition type is a non-cross-camera acquisition type;
and determining the weight information corresponding to each combination according to the intersection ratio information and the second characteristic information.
Optionally, the determining the image acquisition type according to the position prediction information includes:
determining that the image acquisition type is the cross-camera acquisition type when the position point corresponding to the position prediction information does not belong to a preset range;
and determining that the image acquisition type is a non-cross-camera acquisition type when the position point corresponding to the position prediction information belongs to the preset range.
Optionally, the historical state information includes a historical time, position information and speed information corresponding to at least one first preset object, and determining, according to the historical state information, position prediction information corresponding to each of the at least one first preset object includes:
acquiring the current moment;
obtaining time difference information according to the historical time and the current time;
and determining the position prediction information corresponding to each first preset object according to the time difference information, the position information and the speed information.
Optionally, the determining, based on the current image and the historical image, at least one combination in the case that the image acquisition type is a cross-camera acquisition type includes:
performing identification processing on an image corresponding to the position prediction information in the current image, and taking a preset object in the image as the second preset object;
and combining the first preset object with each second preset object in the historical image to obtain at least one combination, wherein the image acquisition equipment corresponding to the current image and the image acquisition equipment corresponding to the historical image are adjacently arranged.
Optionally, after the current image and the historical image are acquired, the method further includes:
and under the condition that the preset object exists in the current image and the preset object does not exist in the historical image, updating object track information, initializing the historical state information, and returning to the step of acquiring the current image and the historical image.
Optionally, the updating the object track information according to the weight information includes:
Determining a target combination according to the weight information;
and carrying out object association matching based on the target combination, and updating the object track information according to the matched objects.
In another aspect, the present invention also provides an object tracking apparatus, including:
The first information acquisition module is used for acquiring a current image and a historical image;
The second information acquisition module is used for acquiring the respective historical state information of at least one first preset object in the historical image based on the historical image under the condition that the preset object exists in the historical image;
The first information determining module is used for determining the position prediction information corresponding to each of the at least one first preset object according to the history state information;
the image acquisition type determining module is used for determining an image acquisition type according to the position prediction information;
A second information determining module, configured to determine at least one combination based on the current image and the history image and determine distance information and corresponding first feature information corresponding to the at least one combination, where the image acquisition type is a cross-camera acquisition type;
a third information determining module, configured to determine weight information corresponding to each of the at least one combination according to the distance information and the first feature information;
and the association matching module is used for updating the object track information according to the weight information.
In another aspect, the invention also provides an electronic device, comprising a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to execute the above object tracking method.
In another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the above-described object tracking method.
According to the object tracking method and device, the electronic device and the storage medium, the position prediction information is determined from the historical image, the image acquisition type is determined based on the position prediction information, at least one combination and the weight information corresponding to each combination are determined based on the image acquisition type, and the object track information is updated through the weight information. Object tracking for surround-view automatic driving is thereby achieved, the possibility of losing objects across cameras is reduced, time consumption is reduced to a certain extent, and object tracking efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a method flow chart of an object tracking method according to an embodiment of the present invention;
FIG. 2 is a flowchart of method steps performed after the image acquisition type is determined according to the position prediction information, according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for determining an image acquisition type based on position prediction information according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for determining position prediction information corresponding to each of at least one first preset object according to historical state information according to an embodiment of the present invention;
FIG. 5 is a flow chart of a method for determining at least one combination based on a current image and a historical image in the case where the image acquisition type is a cross-camera acquisition type, provided by an embodiment of the present invention;
FIG. 6 is a flowchart of method steps performed after the current image and the historical image are acquired, according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for updating object track information according to weight information according to an embodiment of the present invention;
FIG. 8 is a block diagram of an object tracking device according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a preset range in an object tracking method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
An embodiment of an object tracking method according to the present invention is described below; fig. 1 is a flowchart of an object tracking method according to an embodiment of the present invention. It should be noted that the present specification provides the method steps described in the examples or flowcharts, but more or fewer steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In an actual system or product, the steps may be executed sequentially or in parallel (for example, in a parallel-processor or multithreaded environment) according to the methods shown in the embodiments or figures. As shown in fig. 1, the present embodiment provides an object tracking method, which includes:
S101, acquiring a current image and a historical image.
The current image may refer to the images acquired by the plurality of image acquisition devices at the current time. The historical image may refer to the previous frame image relative to the current image among the images acquired by the plurality of image acquisition devices; it may be the latest frame image of an existing target track. The plurality of image acquisition devices may be disposed around the autonomous vehicle.
In practical application, the autonomous vehicle can control the plurality of image acquisition devices around the vehicle to acquire images in real time while driving, and correspondingly store the acquired images and their acquisition times in the memory of the image acquisition devices. According to the current time, the controller of the object tracking device may acquire from this memory the image corresponding to the current time and its previous frame image, and use them as the current image and the historical image, respectively.
S102, acquiring respective historical state information of at least one first preset object in the historical image based on the historical image under the condition that the preset object exists in the historical image.
The preset object may refer to an object whose detection frame for object recognition in an image has a width-to-height ratio smaller than a preset value; for example, the preset object may be a pedestrian. The first preset object may refer to a preset object in the historical image. It can be appreciated that the historical image may contain no preset object, one preset object, or a plurality of preset objects. The historical state information is the state information of at least one first preset object in the historical image, and represents the motion state of the first preset object at the time corresponding to the historical image. The historical state information may include the photographing time corresponding to the historical image, the position (x, y) of the first preset object in the coordinate system corresponding to the historical image, the width w and the height h of the detection frame of the first preset object in the historical image, and the change speeds corresponding to the four variables (x, y, w, h).
In practical application, the historical image is subjected to image recognition processing; a recognized preset object is taken as a first preset object, and if a plurality of preset objects are detected, they are taken as a plurality of first preset objects. A coordinate system is established in the historical image, the position of the midpoint of the detection frame of each first preset object in this coordinate system is determined, and the position information of the at least one first preset object is taken as current coordinate information. By comparing the historical image with its previous frame image, the change speeds corresponding to the four variables can be determined, so that the historical state information of the first preset object is obtained.
S103, determining position prediction information corresponding to each of at least one first preset object according to the historical state information.
The position prediction information may refer to the predicted position of the first preset object in the current frame. Specifically, it may be a position in the image coordinate system of the image acquisition device whose shooting range covers the first preset object.
In practical application, according to the historical state information, the position prediction information of each first preset object can be obtained through a Kalman filter.
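As an illustration of this step, the sketch below applies the predict step of a Kalman filter to the 8-dimensional state described above ([x, y, w, h] plus their change speeds). It is a minimal sketch under a constant-velocity motion model; the noise level and other parameter values are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def kalman_predict(state, cov, dt, q=1e-2):
    """Predict step for an 8-dim state [x, y, w, h, vx, vy, vw, vh].

    state: current state estimate (8,); cov: its covariance (8, 8);
    dt: time elapsed since the historical image; q: assumed process noise.
    """
    F = np.eye(8)
    F[:4, 4:] = dt * np.eye(4)      # x' = x + vx*dt, and likewise for y, w, h
    Q = q * np.eye(8)               # illustrative process-noise covariance
    state_pred = F @ state          # predicted state for the current frame
    cov_pred = F @ cov @ F.T + Q
    return state_pred, cov_pred     # state_pred[:4] is the predicted box
```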
S104, determining the image acquisition type according to the position prediction information.
The image acquisition type may represent the relative positional relationship between the image acquisition device whose shooting range the first preset object may appear in during the current frame and the image acquisition device corresponding to the previous frame. The image acquisition types may include a cross-camera acquisition type and a non-cross-camera acquisition type.
In practical application, if the position of the first preset object in the previous frame corresponds to a first image acquisition device, and it can be predicted from the position prediction information that the first preset object may appear in the shooting range of a second image acquisition device in the current frame (the second image acquisition device being adjacent to the first), the image acquisition type is the cross-camera acquisition type. Similarly, if it can be predicted from the position prediction information that the first preset object may still appear in the shooting range of the first image acquisition device in the current frame, the image acquisition type is the non-cross-camera acquisition type.
S105, determining at least one combination based on the current image and the historical image, and determining distance information corresponding to the at least one combination and corresponding first characteristic information under the condition that the image acquisition type is a cross-camera acquisition type.
The combination may be a pairing of a first preset object in the historical image with a second preset object in the current image. The distance information may refer to the Euclidean distance between the histogram features of the detection frames of the first preset object and the second preset object in each combination, computed after the two detection frames are resized to a unified size. The histogram features characterize the numerical distribution of the image within the detection frame. The first characteristic information corresponding to each combination may represent the similarity between the first preset object and the second preset object in the combination.
In practical application, the detection frame of the first preset object and the detection frame of the second preset object in each combination can be resized to a unified size; the histogram features of the two detection frames are then obtained, and the Euclidean distance between the two histogram features is calculated. A pedestrian ReID (Person Re-identification) model is used to extract features from the detection frames of the first preset object and the second preset object in each combination, the feature similarity between the two is calculated from the two extracted feature vectors, and this feature similarity is taken as the first characteristic information.
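A minimal sketch of these two appearance cues is given below, assuming OpenCV for the histogram computation; the ReID feature vectors are assumed to come from an external pedestrian ReID model whose interface the patent does not specify.

```python
import cv2
import numpy as np

def histogram_distance(crop1, crop2, size=(64, 128), bins=32):
    """Euclidean distance between histogram features of size-unified crops."""
    feats = []
    for crop in (crop1, crop2):
        gray = cv2.cvtColor(cv2.resize(crop, size), cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [bins], [0, 256]).flatten()
        feats.append(hist / (hist.sum() + 1e-12))  # normalized distribution
    return float(np.linalg.norm(feats[0] - feats[1]))

def feature_similarity(fea1, fea2):
    """Cosine similarity between ReID features (the first characteristic information)."""
    denom = np.linalg.norm(fea1) * np.linalg.norm(fea2) + 1e-12
    return float(np.dot(fea1, fea2) / denom)
```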
S106, determining weight information corresponding to each combination according to the distance information and the first characteristic information.
The weight information of each combination can be used for representing the matching degree between the first preset object and the second preset object in the combination.
In practical application, the weight information of each combination can be calculated according to the distance information corresponding to the combination and the first characteristic information corresponding to the combination.
Specifically, the weight information W of each combination can be calculated according to the following formula:

W = w1·dist(F_H1, F_H2) + w2·cos(Fea1, Fea2) + w3·|wt/ht − wc/hc|

where w1, w2 and w3 are preset parameters; preferably, they represent the proportional relationship among dist(F_H1, F_H2), cos(Fea1, Fea2) and |wt/ht − wc/hc|, and may be set according to tests. dist(F_H1, F_H2) is the Euclidean distance between the histogram features of the two detection frames, used as the distance information, and cos(Fea1, Fea2) is the feature similarity, used as the first characteristic information. F_H1 and F_H2 are the histogram features of the latest track detection frame and the current-frame detection frame, respectively, after their sizes are unified. Fea1 and Fea2 represent the feature information extracted by the pedestrian ReID model: Fea1 is the feature inside the latest detection frame (i.e., the frame immediately preceding the current frame) of the existing target track, and Fea2 is the feature of the current-frame detection frame. wt and ht are the width and height of the latest detection frame of the existing track, and wc and hc are the width and height of the detection frame in the current frame.

Specifically, the Euclidean distance is calculated as:

dist(F_H1, F_H2) = sqrt( Σ_i ( F_H1(i) − F_H2(i) )^2 )
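Putting the terms together, a sketch of the weight computation for one combination might look as follows; the default values of w1, w2 and w3 are placeholders, since the patent only states that they are set according to tests:

```python
def cross_camera_weight(hist_dist, fea_cos, track_box, cur_box,
                        w1=0.4, w2=0.4, w3=0.2):
    """Weight W for a (track, detection) combination; boxes are (w, h) pairs."""
    aspect_term = abs(track_box[0] / track_box[1] - cur_box[0] / cur_box[1])
    return w1 * hist_dist + w2 * fea_cos + w3 * aspect_term
```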
S107, updating the object track information according to the weight information.
The object track information may refer to track information of a preset object photographed by an image acquisition device loaded on the vehicle.
In practical application, based on the weight information corresponding to the at least one combination, object matching can be performed using the KM (Kuhn-Munkres) algorithm (a weighted Hungarian matching algorithm), and the object track information can be updated according to the matching result.
The position prediction information is determined from the historical image, the image acquisition type is determined based on the position prediction information, at least one combination and the weight information corresponding to each combination are determined based on the image acquisition type, and the object track information is updated through the weight information. Object tracking for surround-view automatic driving is thereby realized, the possibility of losing objects across cameras is reduced, time consumption is reduced to a certain extent, and object tracking efficiency is improved.
Fig. 2 is a flowchart of method steps performed after the image acquisition type is determined according to the position prediction information, according to an embodiment of the present invention. In one possible implementation, as shown in fig. 2, after the image acquisition type is determined according to the position prediction information, the method may further include:
S201, under the condition that the image acquisition type is a non-cross-camera acquisition type, determining at least one combination based on the current image and the historical image, and determining the intersection ratio information and corresponding second characteristic information of the at least one combination.
The intersection ratio information may refer to the intersection ratio (intersection over union, IoU) of the detection frame of the first preset object and the detection frame of the second preset object in the combination; specifically, the intersection ratio is the ratio of the area of the intersection of two rectangular frames to the area of their union. The second characteristic information may characterize the similarity between the first preset object and the second preset object in the combination.
In practical application, the intersection ratio of the detection frames of the first preset object and the second preset object in the combination can be calculated directly, or the two detection frames can first be expanded outward and the intersection ratio then calculated. In this embodiment, the intersection ratio W_IOU is calculated after outer expansion, as follows:

W_IOU = area(Rc1 ∩ Rc2) / area(Rc1 ∪ Rc2)

R_e = w_v·v_host + w_yaw·v_yaw_rate

where Rc1 and Rc2 respectively represent the two detection frames participating in the calculation after expansion, R_e represents the expansion ratio of the detection frames, and v_host and v_yaw_rate respectively represent the vehicle speed and the corresponding yaw rate. w_v and w_yaw are preset parameters representing the corresponding weight relationship and can be determined through repeated experiments; preferably, w_v and w_yaw are 0.6 and 0.4 respectively, with which the tracking effect is evident and mismatches are less likely to occur.
S202, determining the weight information corresponding to each combination according to the intersection ratio information and the second characteristic information.
In practical application, the feature similarity can be measured by the cosine distance to obtain the second characteristic information. The weight information W of each combination is then obtained by combining the intersection ratio information and the second characteristic information, according to the following formula:

W = w1·W_IOU + w2·cos(Fea1, Fea2)

where w1 and w2 are preset parameters representing the proportional relationship between W_IOU and cos(Fea1, Fea2). Preferably, the result of target association matching is best when w1 and w2 are 0.6 and 0.4, respectively. Fea1 and Fea2 represent the feature information extracted by the pedestrian ReID model: Fea1 is the feature inside the latest detection frame of the existing target track, and Fea2 is the feature of the current-frame detection frame.
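A sketch of the non-cross-camera weight, including the outer expansion of the detection frames, is given below; boxes are assumed to be (x1, y1, x2, y2) corner coordinates (a representation chosen here for illustration), and the parameter defaults follow the preferred values stated above:

```python
def expand(box, re):
    """Expand a (x1, y1, x2, y2) box outward by ratio re on each side."""
    x1, y1, x2, y2 = box
    dw, dh = re * (x2 - x1) / 2, re * (y2 - y1) / 2
    return (x1 - dw, y1 - dh, x2 + dw, y2 + dh)

def iou(a, b):
    """Intersection ratio (IoU) of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def non_cross_camera_weight(box1, box2, fea_cos, v_host, v_yaw_rate,
                            w_v=0.6, w_yaw=0.4, w1=0.6, w2=0.4):
    re = w_v * v_host + w_yaw * v_yaw_rate           # expansion ratio R_e
    w_iou = iou(expand(box1, re), expand(box2, re))  # W_IOU after expansion
    return w1 * w_iou + w2 * fea_cos
```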
It should be noted that, in the case that the image acquisition type is the non-cross-camera acquisition type, the at least one combination may be determined as follows. In this case, the image acquisition device corresponding to the current image is the same device as the image acquisition device corresponding to the historical image, so the current image whose device identifier is identical to that of the historical image containing the first preset object can be acquired for analysis. In the case that preset objects exist in this current image, at least one preset object is identified as a second preset object, and the first preset object is combined with each second preset object in the current image. It will be appreciated that the number of combinations is the same as the number of second preset objects.
After the image acquisition type is determined to be the non-cross-camera acquisition type, the image to be processed is determined first and then processed in a targeted manner, which reduces the amount of data to be analyzed and improves analysis efficiency.
Fig. 3 is a flowchart of a method for determining an image acquisition type according to position prediction information according to an embodiment of the present invention. In one possible embodiment, as shown in fig. 3, the step S104 may include:
S301, determining that the image acquisition type is a cross-camera acquisition type when the position point corresponding to the position prediction information does not belong to a preset range.
The preset range may be determined according to the shooting range of each image acquisition device. It may be the same as the shooting range of the image acquisition device or slightly smaller; the present disclosure does not specifically limit this. In this embodiment, the preset range is smaller than the shooting range of the image acquisition device. The position point corresponding to the position prediction information represents the predicted position of the first preset object in the current image in the form of a point; specifically, it may be the midpoint of the bottom edge of the predicted detection frame.
In practical application, the position prediction information gives the predicted detection frame of each first preset object at the current moment relative to the historical image, from which the specific position of the midpoint of the detection frame's bottom edge and the preset range can be determined; the image acquisition type is then determined according to whether this midpoint lies within the preset range. It will be appreciated that the image acquisition type is the non-cross-camera acquisition type when the midpoint of the bottom edge is within the preset range, and the cross-camera acquisition type when it is outside. For example, as shown in fig. 9, W and H respectively represent the width and height of the image, the gray area represents the preset range, and A and B represent the detection frames of two preset objects. The midpoint of the bottom edge of A lies within the preset range, so the image acquisition type is determined to be the non-cross-camera acquisition type; the midpoint of the bottom edge of B lies outside the preset range, so the image acquisition type is determined to be the cross-camera acquisition type.
S302, determining that the image acquisition type is a non-cross-camera acquisition type when the position point corresponding to the position prediction information belongs to a preset range.
Through the positional relationship between the position point corresponding to the position prediction information and the preset range, the relative positional relationship with respect to the image acquisition device that corresponded to the first preset object at the historical moment can be determined, so that the image acquisition type can be determined quickly and accurately.
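A minimal sketch of this range test is shown below; the preset range is modeled as the image rectangle shrunk by an assumed margin, approximating the gray region of fig. 9 (the margin value is illustrative, not from the patent):

```python
def acquisition_type(pred_box, img_w, img_h, margin=0.05):
    """Classify by whether the predicted box's bottom-edge midpoint is in range.

    pred_box is (x1, y1, x2, y2) in the history camera's image coordinates.
    """
    bx = (pred_box[0] + pred_box[2]) / 2.0  # midpoint of the bottom edge
    by = pred_box[3]
    mx, my = margin * img_w, margin * img_h
    inside = (mx <= bx <= img_w - mx) and (my <= by <= img_h - my)
    return "non-cross-camera" if inside else "cross-camera"
```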
Fig. 4 is a flowchart of a method for determining position prediction information corresponding to each of at least one first preset object according to historical state information according to an embodiment of the present invention. In one possible embodiment, the history status information includes a history time, position information and speed information corresponding to at least one first preset object, as shown in fig. 4, the step S103 may include:
S401, acquiring the current moment.
The current time may refer to a shooting time corresponding to the current image.
In practical applications, the image capturing apparatus synchronously saves the capturing time when performing image capturing. The current time may be obtained according to the identification information of the current image.
S402, obtaining time difference information according to the historical time and the current time.
The historical time may refer to a shooting time corresponding to the historical image, that is, a shooting time corresponding to the previous frame of image. The time difference information may refer to difference information between a historical time and a current time.
In practical application, the time difference information can be obtained by taking the difference between the current time and the historical time.
S403, determining the position prediction information corresponding to each first preset object according to the time difference information, the position information and the speed information.
The position information of each first preset object may refer to a position coordinate of the first preset object in the history image. The position information may specifically include a position (x, y) of the first preset object in a coordinate system corresponding to the history image, and a width w and a height h of a detection frame of the first preset object in the history image. The speed information of each object may include the changing speeds corresponding to the four variables (x, y, w, h) described above.
In practical application, the position of the first preset object in the historical image coordinate system at the current time and the width and the height of the detection frame can be calculated according to the time difference information, the position information and the speed information.
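Written out per variable, this is the constant-velocity case of the Kalman prediction sketched earlier, under the same assumptions:

```python
def predict_box(x, y, w, h, vx, vy, vw, vh, t_hist, t_now):
    """Predict the current-frame box from historical state and time difference."""
    dt = t_now - t_hist  # time difference information (S402)
    return (x + vx * dt, y + vy * dt, w + vw * dt, h + vh * dt)
```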
FIG. 5 is a flow chart of a method for determining at least one combination based on a current image and a historical image in the case where the image acquisition type is a cross-camera acquisition type, provided by an embodiment of the present invention. In one possible implementation, as shown in fig. 5, in the case where the image acquisition type is a cross-camera acquisition type, determining at least one combination based on the current image and the history image may include:
S501, performing identification processing on an image corresponding to the position prediction information in the current image, and taking a preset object in the image as a second preset object.
In practical application, the current image may include the images captured by the plurality of image acquisition devices at the current time. From the position prediction information and the relative position of the image in which the first preset object is located, the device identifier of the image acquisition device whose shooting range the first preset object may appear in can be determined; the image in the current image corresponding to this device identifier is the image corresponding to the position prediction information. It can be understood that the position prediction information reflects the image acquisition device in whose shooting area the first preset object may appear at the current moment, so the corresponding image in the current image can be determined from it. This image is subjected to recognition processing, and the preset objects in it are taken as second preset objects. For example, when 2 preset objects are identified in the image, they are taken as 2 second preset objects.
S502, combining a first preset object with each second preset object in the historical image to obtain at least one combination, wherein the image acquisition equipment corresponding to the current image and the image acquisition equipment corresponding to the historical image are adjacently arranged.
It is understood that different first preset objects may correspond to different images and to different image acquisition types according to their position prediction information. In the case that the image acquisition type of one first preset object is the cross-camera acquisition type, that first preset object is combined with each second preset object, yielding the same number of combinations as second preset objects. For example, assuming there are 2 second preset objects (object 21 and object 22) and the first preset object is object 11, 2 combinations are obtained: the first combination (object 11 and object 21) and the second combination (object 11 and object 22).
Fig. 6 is a flowchart of method steps performed after the current image and the historical image are acquired, according to an embodiment of the present invention. In one possible implementation, as shown in fig. 6, after the current image and the historical image are acquired, the method may further include:
S601, under the condition that a preset object exists in a current image and the preset object does not exist in a historical image, updating object track information, initializing historical state information, and returning to the step of acquiring the current image and the historical image.
It can be understood that, when a preset object exists in the current image but not in the historical image, the current image is the first frame image in the track corresponding to that preset object.
In practical application, when a preset object exists in the current image but not in the historical image, track information for the preset object can be newly added to the object track information to update it, and the historical state information of the preset object is initialized. Specifically, the position of the detection frame of the preset object in the image coordinate system is taken as the position information in its historical state information, and the change speeds of the four variables are initialized to 0. After returning to the step of acquiring the current image and the historical image, the preset object is taken as a first preset object, and the historical state information acquired for it in subsequent steps is the initialized information.
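A sketch of this initialization step, assuming the 8-dimensional state layout used in the earlier sketches:

```python
def init_state(box, t_now):
    """Initialize historical state for a newly appeared preset object.

    box is the (x, y, w, h) of its detection frame; all four change speeds
    start at 0, as described above.
    """
    x, y, w, h = box
    return {"time": t_now, "state": [x, y, w, h, 0.0, 0.0, 0.0, 0.0]}
```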
Fig. 7 is a flowchart of a method for updating object track information according to weight information according to an embodiment of the present invention. In one possible embodiment, as shown in fig. 7, the step S107 may include:
S701, determining target combinations according to the weight information.
A target combination represents a combination in which the first preset object and the second preset object have a good degree of matching.
In practical application, according to the weight information of the at least one combination, the combinations corresponding to the maximum-weight perfect matching of the weighted bipartite graph are taken as the target combinations based on the KM algorithm.
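As a sketch, the maximum-weight matching can be obtained with SciPy's Hungarian solver as a stand-in for the KM algorithm named above (SciPy minimizes cost, hence the negated weights); handling of unmatched tracks or detections is omitted here:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def select_target_combinations(weights: np.ndarray):
    """weights[i, j] is W for the combination of track i and detection j."""
    rows, cols = linear_sum_assignment(-weights)  # maximize total weight
    return list(zip(rows, cols))                  # the target combinations
```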
S702, performing object association matching based on the target combination, and updating object track information according to the matched objects.
In practical application, a first preset object in the target combination corresponds to a track, a second preset object in the target combination is added to the track, and object track information of the track is updated according to related information of the second preset object.
Fig. 8 is a block diagram of an object tracking device according to an embodiment of the present invention. On the other hand, as shown in fig. 8, the present embodiment further provides an object tracking apparatus, which includes:
A first information acquisition module 10 for acquiring a current image and a history image;
the second information obtaining module 20 is configured to obtain, based on the history image, respective history state information of at least one first preset object in the history image when the preset object exists in the history image;
a first information determining module 30, configured to determine, according to the historical state information, position prediction information corresponding to each of at least one first preset object;
An image acquisition type determining module 40 for determining an image acquisition type based on the position prediction information;
A second information determining module 50, configured to determine at least one combination based on the current image and the history image and determine distance information and corresponding first feature information corresponding to the at least one combination, in a case where the image acquisition type is a cross-camera acquisition type;
A third information determining module 60, configured to determine weight information corresponding to each of the at least one combination according to the distance information and the first feature information;
the association matching module 70 is configured to update the object track information according to the weight information.
In another aspect, the embodiment of the invention also provides an electronic device, comprising a processor and a memory for storing instructions executable by the processor, wherein the processor is configured to execute the above object tracking method.
In another aspect, embodiments of the present invention further provide a non-transitory computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the above-described object tracking method.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should understand that the present invention is not limited by the order of actions described, as some steps may be performed in another order or simultaneously according to the present invention. Likewise, the modules of the object tracking device refer to computer programs or program segments for performing one or more specific functions, and the distinction between modules does not mean that the actual program code must be separate. In addition, the above embodiments may be combined arbitrarily to obtain other embodiments.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for the portions of an embodiment not described in detail, reference may be made to the related descriptions of other embodiments. Those skilled in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments of the invention may be implemented by electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design requirements of the overall system. Those skilled in the art may implement the described functionality in varying ways for each particular application, but such implementation should not be understood as going beyond the scope of the embodiments of the present invention.
The foregoing description has fully disclosed specific embodiments of this invention. It should be noted that any modifications to the specific embodiments of the invention may be made by those skilled in the art without departing from the scope of the invention as defined in the appended claims. Accordingly, the scope of the claims of the present invention is not limited to the foregoing detailed description.

Claims (10)

1. An object tracking method, the method comprising:
acquiring a current image and a historical image;
acquiring respective historical state information of at least one first preset object in the historical image based on the historical image under the condition that a preset object exists in the historical image;
determining position prediction information corresponding to each of the at least one first preset object according to the historical state information;
determining an image acquisition type according to the position prediction information, wherein the image acquisition type characterizes the relative positional relationship between the image acquisition device whose shooting range the first preset object may appear in during the current frame and the image acquisition device corresponding to the previous frame; the image acquisition types comprise a cross-camera acquisition type, the cross-camera acquisition type indicating that the image acquisition device corresponding to the position of the first preset object in the previous frame is different from the image acquisition device corresponding to the position of the first preset object in the current frame predicted according to the position prediction information;
determining at least one combination based on the current image and the historical image and determining distance information and corresponding first characteristic information corresponding to the at least one combination under the condition that the image acquisition type is a cross-camera acquisition type;
determining weight information corresponding to each of the at least one combination according to the distance information and the first characteristic information;
and updating the object track information according to the weight information.
2. The method of claim 1, wherein after determining the image acquisition type based on the position prediction information, further comprising:
determining at least one combination based on the current image and the historical image and determining the intersection ratio information and corresponding second characteristic information of the at least one combination under the condition that the image acquisition type is a non-cross-camera acquisition type;
and determining the weight information corresponding to each combination according to the intersection ratio information and the second characteristic information.
3. The method of claim 1, wherein determining an image acquisition type based on the position prediction information comprises:
determining that the image acquisition type is the cross-camera acquisition type when the position point corresponding to the position prediction information does not belong to a preset range;
and determining that the image acquisition type is a non-cross-camera acquisition type when the position point corresponding to the position prediction information belongs to the preset range.
4. The method according to claim 1, wherein the historical state information includes a historical time, position information corresponding to at least one first preset object, and speed information, and the determining, according to the historical state information, position prediction information corresponding to each of the at least one first preset object includes:
acquiring the current moment;
obtaining time difference information according to the historical time and the current time;
and determining the position prediction information corresponding to each first preset object according to the time difference information, the position information and the speed information.
5. The method of claim 1, wherein the determining at least one combination based on the current image and the historical image if the image acquisition type is a cross-camera acquisition type comprises:
performing identification processing on an image corresponding to the position prediction information in the current image, and taking a preset object in the image as a second preset object;
and combining the first preset object with each second preset object in the historical image to obtain at least one combination, wherein the image acquisition equipment corresponding to the current image and the image acquisition equipment corresponding to the historical image are adjacently arranged.
6. The method of claim 1, further comprising, after the capturing the current image and the history image:
and under the condition that the preset object exists in the current image and the preset object does not exist in the historical image, updating object track information, initializing the historical state information, and returning to the step of acquiring the current image and the historical image.
7. The method of claim 1, wherein updating object trajectory information based on the weight information comprises:
Determining a target combination according to the weight information;
and carrying out object association matching based on the target combination, and updating the object track information according to the matched objects.
8. An object tracking device, the device comprising:
The first information acquisition module is used for acquiring a current image and a historical image;
The second information acquisition module is used for acquiring the respective historical state information of at least one first preset object in the historical image based on the historical image under the condition that the preset object exists in the historical image;
The first information determining module is used for determining the position prediction information corresponding to each of the at least one first preset object according to the history state information;
The image acquisition type determining module is used for determining an image acquisition type according to the position prediction information, wherein the image acquisition type characterizes the relative positional relationship between the image acquisition device whose shooting range the first preset object may appear in during the current frame and the image acquisition device corresponding to the previous frame; the image acquisition types comprise a cross-camera acquisition type, the cross-camera acquisition type indicating that the image acquisition device corresponding to the position of the first preset object in the previous frame is different from the image acquisition device corresponding to the position of the first preset object in the current frame predicted according to the position prediction information;
A second information determining module, configured to determine at least one combination based on the current image and the history image and determine distance information and corresponding first feature information corresponding to the at least one combination, where the image acquisition type is a cross-camera acquisition type;
a third information determining module, configured to determine weight information corresponding to each of the at least one combination according to the distance information and the first feature information;
and the association matching module is used for updating the object track information according to the weight information.
9. An electronic device, comprising:
A processor;
a memory for storing processor-executable instructions;
Wherein the processor is configured to execute the executable instructions to implement the object tracking method of any one of claims 1 to 7.
10. A non-transitory computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the object tracking method of any of claims 1 to 7.
CN202111275859.XA 2021-10-29 2021-10-29 Object tracking method, device, electronic device and storage medium Active CN113989761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111275859.XA CN113989761B (en) 2021-10-29 2021-10-29 Object tracking method, device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN113989761A (en) 2022-01-28
CN113989761B (en) 2025-06-24

Family

ID=79744777

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111275859.XA Active CN113989761B (en) 2021-10-29 2021-10-29 Object tracking method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113989761B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114596337B (en) * 2022-03-03 2022-11-25 捻果科技(深圳)有限公司 Self-recognition target tracking method and system based on linkage of multiple camera positions
CN116959022A (en) * 2022-03-29 2023-10-27 北京君正集成电路股份有限公司 Method for rapidly detecting human skeleton key points
CN114862946B (en) * 2022-06-06 2023-04-18 重庆紫光华山智安科技有限公司 Location prediction method, system, device, and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684916A (en) * 2018-11-13 2019-04-26 恒睿(重庆)人工智能技术研究院有限公司 Path-trajectory-data-based anomaly detection method, system, device and storage medium
CN110399808A (en) * 2019-07-05 2019-11-01 桂林安维科技有限公司 Human behavior recognition method and system based on multi-target tracking

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7242423B2 (en) * 2003-06-16 2007-07-10 Active Eye, Inc. Linking zones for object tracking and camera handoff
US7583815B2 (en) * 2005-04-05 2009-09-01 Objectvideo Inc. Wide-area site-based video surveillance system
US9453904B2 (en) * 2013-07-18 2016-09-27 Golba Llc Hybrid multi-camera based positioning
CN104881637B (en) * 2015-05-09 2018-06-19 广东顺德中山大学卡内基梅隆大学国际联合研究院 Multimodal information system and its fusion method based on heat transfer agent and target tracking
CN106373143A (en) * 2015-07-22 2017-02-01 中兴通讯股份有限公司 Adaptive method and system
US9495763B1 (en) * 2015-09-28 2016-11-15 International Business Machines Corporation Discovering object pathways in a camera network
KR20180032400A (en) * 2016-09-22 2018-03-30 한국전자통신연구원 multiple object tracking apparatus based Object information of multiple camera and method therefor
US10331960B2 (en) * 2017-05-10 2019-06-25 Fotonation Limited Methods for detecting, identifying and displaying object information with a multi-camera vision system
CN111105436B (en) * 2018-10-26 2023-05-09 曜科智能科技(上海)有限公司 Target tracking method, computer device and storage medium
US11630197B2 (en) * 2019-01-04 2023-04-18 Qualcomm Incorporated Determining a motion state of a target object
US11138469B2 (en) * 2019-01-15 2021-10-05 Naver Corporation Training and using a convolutional neural network for person re-identification
CN110728702B (en) * 2019-08-30 2022-05-20 深圳大学 High-speed cross-camera single-target tracking method and system based on deep learning
CN110827321B (en) * 2019-10-16 2023-05-30 天津大学 Multi-camera collaborative active target tracking method based on three-dimensional information
CN113924462A (en) * 2020-01-03 2022-01-11 移动眼视觉科技有限公司 Navigation system and method for determining dimensions of an object
CN111612827B (en) * 2020-05-21 2023-12-15 广州海格通信集团股份有限公司 Target position determining method and device based on multiple cameras and computer equipment
CN112084914B (en) * 2020-08-31 2024-04-26 的卢技术有限公司 Multi-target tracking method integrating space motion and apparent feature learning
CN112465866B (en) * 2020-11-27 2024-02-02 杭州海康威视数字技术股份有限公司 Multi-target track acquisition method, device, system and storage medium
CN112183506A (en) * 2020-11-30 2021-01-05 成都市谛视科技有限公司 Human body posture generation method and system
CN112562331A (en) * 2020-11-30 2021-03-26 的卢技术有限公司 Vision perception-based other-party vehicle track prediction method
CN112287906B (en) * 2020-12-18 2021-04-09 中汽创智科技有限公司 Template matching tracking method and system based on depth feature fusion
CN112833883B (en) * 2020-12-31 2023-03-10 杭州普锐视科技有限公司 Indoor mobile robot positioning method based on multiple cameras
CN112785655A (en) * 2021-01-28 2021-05-11 中汽创智科技有限公司 Method, device and equipment for automatically calibrating external parameters of all-round camera based on lane line detection and computer storage medium
CN113011435B (en) * 2021-02-04 2024-09-10 精英数智科技股份有限公司 Image processing method and device of target object and electronic equipment
CN112801018B (en) * 2021-02-07 2023-07-07 广州大学 A cross-scene target automatic identification and tracking method and application
CN113449606B (en) * 2021-06-04 2022-12-16 南京苏宁软件技术有限公司 Target object identification method and device, computer equipment and storage medium
CN113250513B (en) * 2021-06-16 2021-11-02 中汽创智科技有限公司 Parking lot barrier gate control method and system
CN113487651B (en) * 2021-06-17 2022-07-05 超节点创新科技(深圳)有限公司 Luggage tracking method, device, equipment and readable storage medium
CN114550115B (en) * 2021-11-11 2025-01-14 北京小米移动软件有限公司 Target tracking method, device, terminal and storage medium
CN117670939B (en) * 2024-01-31 2024-04-19 苏州元脑智能科技有限公司 Multi-camera multi-target tracking method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN113989761A (en) 2022-01-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant