CN115619815A - Object tracking method and object tracking device - Google Patents
Object tracking method and object tracking device
- Publication number
- CN115619815A CN202110797357.7A
- Authority
- CN
- China
- Prior art keywords
- object tracking
- image frames
- region
- interest
- detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the present invention provide an object tracking method and an object tracking device suitable for low-latency applications. In the method, object detection is performed on one of a number of consecutive image frames to identify a target. The consecutive image frames are buffered, and object tracking is then performed on the buffered consecutive image frames according to the object detection result. Object tracking associates the target in one of the consecutive image frames with the target in another. In this way, tracking accuracy can be improved while meeting low-latency requirements.
Description
Technical Field
The present invention relates to an image processing technology, and in particular, to an object tracking method and an object tracking device.
Background
Object detection and object tracking are important research topics in computer vision and have been widely applied in fields such as video calling, medicine, driving assistance, and security.
The main function of object detection is to identify the type of object in a region of interest (ROI). There are many object detection algorithms. For example, YOLO (You Only Look Once) is a neural-network algorithm characterized by a lightweight model and high efficiency. Notably, in the YOLO version 3 (V3) architecture, the upsampling layers can learn finer features, which helps detect smaller objects. As another example, RetinaFace targets face detection: it provides single-stage dense face localization in natural scenes, uses a Feature Pyramid Network (FPN) to handle faces of different sizes (for example, smaller faces), and adopts a multi-task loss, thereby achieving high face-detection accuracy. As yet another example, Adaptive Boosting (AdaBoost) trains the next classifier on the samples misclassified by the previous classifier and combines weak classifiers to improve the classification result, which makes it highly sensitive to abnormal or noisy data.
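For illustration only, the boosting loop described above can be sketched in plain Python. This is a minimal sketch with one-dimensional threshold stumps, not the detector of any embodiment; the function names and the toy data are assumptions.

```python
# Minimal AdaBoost sketch with 1-D threshold stumps (illustrative only).
import math

def stump_predict(x, thresh, sign):
    return sign if x >= thresh else -sign

def train_adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n                      # uniform sample weights
    ensemble = []                          # list of (alpha, thresh, sign)
    for _ in range(rounds):
        best = None
        for thresh in xs:                  # candidate thresholds at sample points
            for sign in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if stump_predict(x, thresh, sign) != y)
                if best is None or err < best[0]:
                    best = (err, thresh, sign)
        err, thresh, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # weight of this weak classifier
        ensemble.append((alpha, thresh, sign))
        # Reweight: misclassified samples gain weight for the next round.
        w = [wi * math.exp(-alpha * y * stump_predict(x, thresh, sign))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [-1, -1, -1, 1, 1, 1]
model = train_adaboost(xs, ys)
print([predict(model, x) for x in xs])
```

The reweighting line is the step the text describes: samples the previous stump got wrong receive larger weights, so the next stump focuses on them.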
On the other hand, the main function of object tracking is to track the same object framed in preceding and subsequent image frames. There are also many object tracking algorithms. For example, the optical-flow method infers an object's moving speed and direction by detecting how the intensity of image pixels changes over time; however, it is easily misled by lighting changes and interference from other objects. As another example, the Minimum Output Sum of Squared Error (MOSSE) filter uses the correlation between a candidate region and the tracking target to identify the candidate region as the target. Notably, the MOSSE filter can update its filter parameters for an occluded target so that the target can be re-tracked when it reappears. As yet another example, the Scale-Invariant Feature Transform (SIFT) algorithm determines the position, scale, and rotation invariants of feature points, generates corresponding feature vectors, and determines the position and orientation of the target by matching these feature vectors.
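As a hedged illustration of frame-to-frame association (one simple alternative to the named algorithms, not the method of any embodiment), bounding boxes in adjacent frames can be matched greedily by intersection-over-union (IoU):

```python
# Sketch: associate detections across two adjacent frames by IoU (illustrative).
def iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Greedy matching: each previous box pairs with its best current box."""
    matches = {}
    used = set()
    for i, pb in enumerate(prev_boxes):
        best_j, best_v = None, min_iou
        for j, cb in enumerate(curr_boxes):
            if j in used:
                continue
            v = iou(pb, cb)
            if v > best_v:
                best_j, best_v = j, v
        if best_j is not None:
            matches[i] = best_j
            used.add(best_j)
    return matches

prev = [(0, 0, 10, 10), (50, 50, 60, 60)]
curr = [(52, 51, 62, 61), (1, 1, 11, 11)]
print(associate(prev, curr))   # maps each previous box to its best current box
```

Production trackers such as SORT combine a similar IoU cost with motion prediction and optimal assignment rather than this greedy pass.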
Generally speaking, object detection is more time-consuming than object tracking, but object tracking results may be inaccurate. In some application scenarios, both techniques may affect the user experience. For example, real-time video conferencing requires low latency. If detection takes too long, a moving object cannot be framed accurately. For instance, if the bounding result for the first frame is obtained only after four frames have elapsed, the target's position may have changed over those four frames, so the bounding result shown in the fourth, currently displayed frame is inaccurate, or the wrong target is tracked. Hence, the existing technology still needs improvement to meet the requirements of low latency and high accuracy.
Summary
Embodiments of the present invention are directed to an object tracking method and an object tracking device that perform continuous tracking based on object detection results, thereby meeting low-latency requirements while providing high accuracy.
According to an embodiment of the present invention, the object tracking method is suitable for low-latency applications and includes (but is not limited to) the following steps: performing object detection on one of one or more consecutive image frames, where object detection is used to identify a target; buffering the consecutive image frames; and performing object tracking on the buffered consecutive image frames according to the object detection result, where object tracking is used to associate the target in one of the consecutive image frames with the target in another.
According to an embodiment of the present invention, the object tracking device is suitable for low-latency applications and includes (but is not limited to) a memory and a processor. The memory stores program code. The processor is coupled to the memory and is configured to load and execute the program code to perform the following steps: performing object detection on one of one or more consecutive image frames, buffering the consecutive image frames, and performing object tracking on the buffered consecutive image frames according to the object detection result. Object detection is used to identify a target. Object tracking is used to associate the target in one of the consecutive image frames with the target in another.
Based on the above, the object tracking method and the object tracking device according to the embodiments of the present invention buffer the consecutive image frames received during object detection and, once the object detection result is obtained, track the target in those buffered frames based on that result. In this way, the high accuracy of object detection can be combined with the high efficiency of object tracking, meeting the requirements of low-latency applications.
Brief Description of the Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
FIG. 1 is a component block diagram of an object tracking device according to an embodiment of the present invention;
FIG. 2 is a flowchart of an object tracking method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the tracking of consecutive image frames according to an embodiment of the present invention;
FIG. 4 is a flowchart of a target update mechanism according to an embodiment of the present invention;
FIG. 5 is a timing diagram of object detection and tracking according to an embodiment of the present invention;
FIG. 6 is a timing diagram of object detection and tracking according to another embodiment of the present invention;
FIG. 7 is a timing diagram of a target update mechanism according to an embodiment of the present invention;
FIG. 8 is a timing diagram of a target update mechanism according to another embodiment of the present invention.
Description of Reference Numerals
100: object tracking device;
110: memory;
111: buffer;
130: processor;
131: detection tracker;
132: detector;
133: main tracker;
135: secondary tracker;
S210~S250, S410~S460, S510~S530: steps;
F1~F5: consecutive image frames;
ROI~ROI6: regions of interest;
QF: queued frames;
501: object detection;
503: object tracking;
D1: period;
D2, D3, D4: cycles;
C1~C4: confidence levels;
t1: time point.
Detailed Description
Reference will now be made in detail to the exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numerals are used in the drawings and the description to refer to the same or similar parts.
FIG. 1 is a component block diagram of an object tracking device 100 according to an embodiment of the present invention. Referring to FIG. 1, the object tracking device 100 includes (but is not limited to) a memory 110 and a processor 130. The object tracking device 100 may be a desktop computer, a notebook computer, a smartphone, a tablet computer, a server, a surveillance device, a medical testing instrument, an optical inspection instrument, or another computing device.
The memory 110 may be any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk drive (HDD), solid-state drive (SSD), or similar component. In an embodiment, the memory 110 records program code, software modules, configurations, data (for example, image frames, detection/tracking results, confidence levels, etc.), or other files, and its use is detailed in the embodiments below.
In an embodiment, the memory 110 includes a buffer 111. The buffer 111 may be one of one or more memories 110, or may represent one or more memory blocks in the memory 110. The buffer 111 is used to temporarily store image frames, and its function is detailed in subsequent embodiments. One or more image frames may be provided by an image capture device (for example, a camera, video camera, or monitor), a server (for example, an image streaming server or a cloud server), or a storage medium (for example, a flash drive, hard disk, or database server) connected by wire or wirelessly.
The processor 130 is coupled to the memory 110 and may be a central processing unit (CPU), a graphics processing unit (GPU), or another programmable general-purpose or special-purpose microprocessor, digital signal processor (DSP), programmable controller, field-programmable gate array (FPGA), application-specific integrated circuit (ASIC), neural network accelerator, or similar component, or a combination of the above components. In an embodiment, the processor 130 executes all or part of the operations of the object tracking device 100 and can load and execute the program code, software modules, files, and data recorded in the memory 110. In some embodiments, the functions of the processor 130 may be implemented in software.
The processor 130 includes a detection tracker 131 and a secondary tracker 135. Either or both of the detection tracker 131 and the secondary tracker 135 may be implemented by an independent digital circuit, chip, neural network accelerator, or other processor, or their functions may be implemented in software.
In an embodiment, the detection tracker 131 includes a detector 132 and a main tracker 133. The detector 132 performs object detection. Object detection, for example, determines the region of interest (ROI) (or bounding box, bounding rectangle) corresponding to a target (for example, a person, an animal, a non-living object, or a part thereof) in an image frame, and then identifies the type of the target (for example, male or female, dog or cat, table or chair, car or traffic light, etc.). The detector 132 may implement object detection with, for example, a neural-network-based algorithm (for example, YOLO, Region-Based Convolutional Neural Networks (R-CNN), or Fast R-CNN) or a feature-matching-based algorithm (for example, feature comparison using Histogram of Oriented Gradients (HOG), Haar, or Speeded Up Robust Features (SURF)). It should be noted that the embodiments of the present invention do not limit the algorithm used by the detector 132.
In an embodiment, the main tracker 133 and the secondary tracker 135 perform object tracking. Object tracking associates the target in one of the consecutive image frames with the target in another. The consecutive image frames are the successive image frames of a video or video stream. Object tracking, for example, determines the correlation in position, movement, direction, and other motion of the same target (whose corresponding position may be framed by a region of interest) across adjacent image frames, thereby localizing the moving target. The main tracker 133 and the secondary tracker 135 may implement object tracking with, for example, the optical-flow method, Simple Online and Realtime Tracking (SORT), Deep SORT, the Joint Detection and Embedding (JDE) model, or other tracking algorithms. It should be noted that the embodiments of the present invention do not limit the algorithms used by the main tracker 133 and the secondary tracker 135, and the two trackers may use the same or different algorithms.
In some embodiments, the object tracking device 100 may further include a display (not shown). The display is coupled to the processor 130. The display may be a liquid-crystal display (LCD), a light-emitting diode (LED) display, an organic light-emitting diode (OLED) display, a quantum dot display, or another type of display. In an embodiment, the display shows image frames or image frames processed by object detection/tracking.
Hereinafter, the methods described in the embodiments of the present invention will be explained in conjunction with the devices, components, and/or modules of the object tracking device 100. Each step of the method may be adjusted according to the implementation situation and is not limited thereto.
FIG. 2 is a flowchart of an object tracking method according to an embodiment of the present invention. Referring to FIG. 2, the detector 132 of the detection tracker 131 performs object detection on one of one or more consecutive image frames (step S210). Specifically, in some application scenarios, such as video calling, image streaming, video surveillance, or gaming, the processor 130 obtains one or more consecutive image frames (referred to herein as consecutive image frames). Consecutive image frames are a set of adjacent image frames captured or recorded at a given frame rate (measured, for example, in frames per second (FPS) or as a frequency). For example, at a frame rate of 60 FPS, the 60 image frames within one second may be called consecutive image frames. However, consecutive image frames are not limited to the image frames within one second; they may also be the image frames within, for example, one and a half seconds, two seconds, or two and one-third seconds.
In response to the input of consecutive image frames (for example, originating from an image capture device, a server, or a storage medium, and possibly stored in the memory 110), the detector 132 accesses one input consecutive image frame from the memory 110. In an embodiment, to achieve real-time processing, the detector 132 may perform object detection on the first consecutive image frame currently input. In another embodiment, the detector 132 may perform object detection on another input consecutive image frame, that is, skipping the first consecutive image frame or skipping several consecutive image frames. It should be noted that the first frame here refers to the first frame input at a given time point, or the first frame accessed from the memory 110 at that time point, and is not limited to the initial frame of an image or video stream.
On the other hand, for the description of object detection, refer to the above description of the detector 132; it is not repeated here.
For example, FIG. 3 is a schematic diagram illustrating the tracking of consecutive image frames by the detection tracker 131 according to an embodiment of the present invention. Referring to FIG. 3, the detector 132 determines, in the first consecutive image frame F1 of the consecutive image frames F1~F4, the region of interest ROI corresponding to the target's position, and identifies the target in this region of interest ROI accordingly. It should be noted that the second to fourth consecutive image frames F2~F4 shown in FIG. 3 represent the image frames subsequent to the first consecutive image frame F1.
The processor 130 may buffer one or more consecutive image frames into the buffer 111 (step S230). Specifically, some low-latency applications require real-time processing of input, accessed, or captured images. Low-latency applications are video applications in which the delay between the input time point of a consecutive image frame and the output time point of the same frame must stay within a specific tolerance, for example, video calls/conferences or live streaming. Depending on requirements, these video applications may additionally provide face detection, brightness adjustment, special effects, or other image processing. However, if the image processing period is too long, the application experience suffers. For example, in real-time video conferencing, if face detection takes too long, head movement may cause the detected face position to deviate from the face position in the currently output image, so that the displayed image cannot frame the face accurately. Therefore, in the embodiments of the present invention, the consecutive image frames received during object detection are retained, so that the object detection result can update the tracking target in the retained image frames, and the output time point of such an image frame may be later than the end time point of its object detection.
In an embodiment, during all or part of the object detection in step S210, the processor 130 may buffer, in the buffer 111, one or more consecutive image frames input to the system (for example, the object tracking device 100) during this period. Taking FIG. 3 as an example, the detector 132 performs object detection on the consecutive image frame F1. Between the time the detector 132 receives the frame F1 and the time it obtains the region of interest ROI in the frame F1, the memory 110 sequentially stores the consecutive image frames F1~F4. The processor 130 may store these consecutive image frames F1~F4 in the buffer 111 as queued frames QF.
In another embodiment, the processor 130 may further buffer other consecutive image frames accessed outside the object detection period, for example, the last consecutive image frame before the object detection period or the next one after it.
In yet another embodiment, the processor 130 may buffer one or more consecutive image frames input to the system during all or part of the period before the object tracking completes.
It should be noted that in the example shown in FIG. 3, all consecutive image frames within the object detection period are buffered into the buffer 111, but the invention is not limited thereto.
In an embodiment, the processor 130 may compare the number of buffered consecutive image frames with an upper limit. This upper limit relates to the capacity of the buffer 111, the detection speed of the detector 132, or processing-efficiency requirements, for example, 8, 10, or 20 frames. The processor 130 may delete at least one of the buffered consecutive image frames according to the comparison result. In response to the number of buffered consecutive image frames reaching or exceeding the upper limit, the processor 130 may delete some of the frames in the buffer 111, for example, deleting the frames at even-numbered or odd-numbered positions, or randomly deleting a specific number of frames. On the other hand, in response to the number of buffered consecutive image frames not reaching the upper limit, the processor 130 may retain all or some of the frames in the buffer 111.
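The buffering and decimation policy above might be sketched as follows; the capacity value and the drop-every-other-frame rule are just the example values from the text, and the class name is an assumption:

```python
# Sketch: buffer frames during detection, dropping every other frame when full.
class FrameBuffer:
    def __init__(self, max_frames=8):
        self.max_frames = max_frames
        self.frames = []          # the queued frames (QF in the text)

    def push(self, frame):
        self.frames.append(frame)
        if len(self.frames) >= self.max_frames:
            # Drop frames at even (0-indexed) positions, one possible
            # decimation rule mentioned in the text.
            self.frames = self.frames[1::2]

    def drain(self):
        """Hand all buffered frames to the tracker once detection finishes."""
        frames, self.frames = self.frames, []
        return frames

buf = FrameBuffer(max_frames=4)
for i in range(5):                # frames F1..F5 arriving during detection
    buf.push(f"F{i + 1}")
print(buf.drain())                # → ['F2', 'F4', 'F5']
```

A `collections.deque` with `maxlen` would cap the count too, but it drops the oldest frames instead of thinning the whole span; the decimation above keeps frames spread across the detection period.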
In another embodiment, if the capacity of the buffer 111 accommodates all consecutive image frames received during object detection, the processor 130 may retain all of them.
It should be noted that the upper limit may be fixed, or may vary in response to factors such as the real-time processing speed of the detector 132, the computational complexity of the system, or subsequent application requirements.
The main tracker 133 may perform object tracking on the buffered one or more consecutive image frames according to the object detection result (step S250). In an embodiment, the object detection result includes the target's region of interest. As shown in FIG. 3, the region of interest ROI corresponds to the position of the target in the consecutive image frame subjected to object detection. It should be noted that the region of interest ROI may completely or partially frame the target, and the embodiments of the present invention do not limit this. In some embodiments, the object detection result also includes the type of the target.
On the other hand, for the description of object tracking, refer to the above description of the main tracker 133; it is not repeated here.
In addition, the main tracker 133 performs object tracking on the one or more consecutive image frames in the buffer 111 only in response to the completion of object detection on one of the consecutive image frames (that is, the object detection result is obtained, for example, the region of interest ROI of the consecutive image frame F1 is detected as shown in FIG. 3). In other words, before the object detection on the first consecutive image frame completes, the main tracker 133 is disabled from (or does not perform) tracking the first consecutive image frame or other subsequently input consecutive image frames.
In an embodiment, the main tracker 133 may determine the correlation of the region of interest in the object detection result across the buffered consecutive image frames, and determine another region of interest based on this correlation. The correlation relates to the position, orientation, and/or velocity of one or more targets in one or more regions of interest between adjacent consecutive image frames.
Taking FIG. 3 as an example, the main tracker 133 continuously tracks, across the consecutive image frames F1~F4, the target in the region of interest ROI obtained by the detector 132, and updates the region to the region of interest ROI2 as the target moves.
In an embodiment, suppose the object detection result includes a detection region of interest corresponding to the target (that is, corresponding to the target's position in the consecutive image frame subjected to object detection). Further, suppose the tracking region of interest refers to the region previously tracked by object tracking; in other words, the tracking region of interest is the region of interest that object tracking uses as a tracking basis in one or more consecutive image frames at, or just before, the current time point. The main tracker 133 may update the tracking region of interest targeted by object tracking to the detection region of interest obtained by object detection. In other words, the tracking region of interest is directly replaced by the detection region of interest.
FIG. 4 is a flowchart of a target update mechanism according to an embodiment of the present invention. Referring to FIG. 4, the processor 130 accesses an input consecutive image frame from the memory 110 (step S410) and detects the target in that frame through the detection tracker 131. At this point, the secondary tracker 135 may have finished tracking the previous consecutive image frame, and further determines whether the detection tracker 131 is busy (step S420). Regardless of whether the detection tracker 131 is busy, the secondary tracker 135 continues to track the target using the region of interest obtained from the previous consecutive image frame (step S430). On the other hand, if the detection tracker 131 is not busy, a detection-tracking region of interest has been obtained (that is, the detector 132 has finished detection, and the main tracker 133 has finished tracking all buffered consecutive image frames) (step S440). The main tracker 133 may then update the currently tracked region of interest with the new region of interest output by the detector 132 (that is, update the tracking target, step S450). After continuously tracking all buffered consecutive image frames, it obtains a detection-tracking region of interest, which is compared with or computed against the tracking region of interest obtained by the secondary tracker 135; one of the two is selected, or they are blended, to obtain a final region of interest, which is used to update the region of interest currently tracked by the secondary tracker 135 (step S460).
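Assuming the flow just described, the busy check and region-of-interest fusion of steps S420~S460 might be sketched as below; the function names, the weighted-average fusion, and the weight value are illustrative assumptions, not the claimed implementation:

```python
# Sketch of the FIG. 4 update flow: keep tracking with the previous ROI while
# detection is busy (S430); when detection tracking is done, fuse its ROI with
# the secondary tracker's ROI into a final ROI (S440-S460).
def fuse(det_roi, trk_roi, w_det=0.7):
    """Blend two ROIs (x1, y1, x2, y2) by weighted average; weight assumed."""
    return tuple(w_det * d + (1 - w_det) * t for d, t in zip(det_roi, trk_roi))

def update_step(detection_busy, det_roi, trk_roi):
    if detection_busy or det_roi is None:
        return trk_roi                     # S430: keep the previous tracking ROI
    return fuse(det_roi, trk_roi)          # S440-S460: blend into a final ROI

print(update_step(True, (0, 0, 10, 10), (2, 2, 12, 12)))   # busy → (2, 2, 12, 12)
print(update_step(False, (0, 0, 10, 10), (2, 2, 12, 12)))  # fused final ROI
```

Selecting one of the two ROIs outright, instead of blending, corresponds to the "select one" branch of step S460.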
In an embodiment, the processor 130 may disable object tracking on the previous tracking region of interest according to the time at which the detection-tracking result is produced. Suppose the detection tracker 131 produces a detection-tracking result while the secondary tracker 135 has started, but not yet completed, a round of tracking; the secondary tracker 135 may be disabled from (or refrain from) object tracking before the next round of detection tracking starts. In the next object tracking cycle, the secondary tracker 135 then starts tracking directly based on that detection-tracking result.
For example, FIG. 5 is a timing diagram of object detection and tracking according to an embodiment of the present invention, used to further explain the decision mechanism of step S460 in FIG. 4. Referring to FIG. 5, during the period D1 in which the detection tracker 131 performs detection tracking 501, the object tracking 503 of the secondary tracker 135 completes the tracking of two consecutive image frames. While the secondary tracker 135 performs object tracking 503 on the third consecutive image frame, the detection tracker 131 has completed or almost completed detection tracking 501. That is, during the cycle D2 of the third object tracking 503, the detection tracker 131 executes detection tracking 501 and obtains a new region of interest accordingly (step S510). Within a certain period around the time the detection tracker 131 next starts detection tracking 501, object tracking 503 may be performed based on the new region of interest obtained by detection tracking 501 (step S530). In another embodiment, the restarted secondary tracker 135 may perform object tracking 503 based on both the detection-tracking region of interest most recently obtained by detection tracking 501 and the tracking region of interest obtained by the previous object tracking 503. For example, the secondary tracker 135 may use a weighted average of the two regions of interest; the weights used for the weighted average may be changed according to the user's needs, and the embodiments of the present invention do not limit them. Alternatively, the secondary tracker 135 may choose one of the detection-tracking region of interest and the tracking region of interest.
In an embodiment, the processor 130 may determine the time difference between the completion time point of the most recent detection tracking 501 and that of the most recent object tracking 503. This time difference indicates whether the time point at which the secondary tracker 135 most recently produced a result is close to the time point at which the detection tracker 131 most recently produced a result. The secondary tracker 135 and the detection tracker 131 may decide, based on this time difference, whether to use both the detection-tracking region of interest and the tracking region of interest for object tracking and object detection.
For example, FIG. 6 is a timing diagram of object detection and tracking according to another embodiment of the present invention. Referring to FIG. 6, object tracking 503 executes continuously, regardless of whether a result has been obtained. However, the secondary tracker 135 may determine the time difference between the end of the period D1 and the result of the cycle D4, and compare this time difference with a difference threshold. If the time difference is less than the difference threshold, object tracking 503 may use a weighted average of the region of interest obtained in the cycle D4 and the region of interest obtained by detection tracking 501 in the period D1. On the other hand, if the time difference is not less than the difference threshold, object tracking 503 and detection tracking 501 use only the region of interest obtained by detection tracking 501 in the period D1.
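One possible reading of the time-difference rule, with the threshold, units, and weight chosen purely for illustration:

```python
# Sketch: choose between a fused ROI and the detection-only ROI based on how
# close in time the latest tracking and detection results are.
def select_roi(det_roi, trk_roi, det_done_ms, trk_done_ms,
               diff_threshold_ms=50, w_det=0.5):
    delta = abs(det_done_ms - trk_done_ms)
    if delta < diff_threshold_ms:
        # Results are close in time: blend the two ROIs.
        return tuple(w_det * d + (1 - w_det) * t
                     for d, t in zip(det_roi, trk_roi))
    # Otherwise trust the detection-tracking ROI alone.
    return det_roi

print(select_roi((0, 0, 8, 8), (4, 4, 12, 12), det_done_ms=100, trk_done_ms=120))
print(select_roi((0, 0, 8, 8), (4, 4, 12, 12), det_done_ms=100, trk_done_ms=400))
```

The first call blends (the results are 20 ms apart, under the assumed 50 ms threshold); the second falls back to the detection ROI.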
In an embodiment, suppose the object detection period is not recorded; the secondary tracker 135 may decide to update the tracking region of interest (that is, the region previously tracked by the secondary tracker 135) to the detection-tracking region of interest (that is, the detection-tracking result) according to the confidence of the tracking region of interest in object tracking. In some application scenarios, the tracked target may be suddenly occluded, so the object tracking result may have low confidence (for example, below a confidence threshold). In that case, when the object tracking of the secondary tracker 135 completes, the secondary tracker 135 may update to the detection-tracking result, or use a weighted average of the detection-tracking region of interest and the tracking region of interest, as the final region of interest.
For example, FIG. 7 is a timing diagram of a target update mechanism according to an embodiment of the present invention. Referring to FIG. 7, suppose that, among the confidence levels C1~C4 of the results of the secondary tracker 135 for the consecutive image frames F1~F4, the confidence level C4 of the region of interest ROI3 is below the confidence threshold. The secondary tracker 135 may then update the region of interest ROI3 to the region of interest ROI4 obtained by the detection tracker 131. As another example, if the number of confidence levels among C1~C4 that are below the confidence threshold exceeds a count threshold, the secondary tracker 135 may likewise update the region of interest ROI3 to the region of interest ROI4 obtained by the detection tracker 131. As yet another example, the secondary tracker 135 may use a weighted average of the regions of interest ROI3 and ROI4, where the weight of the region of interest ROI3 may be lower.
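The confidence rules in this example (a low-confidence latest frame, or too many low-confidence frames) might be sketched as follows; the threshold values are assumptions:

```python
# Sketch: decide whether the tracker's ROI should be replaced by the
# detector's ROI based on per-frame tracking confidences.
def should_update(confidences, conf_threshold=0.5, count_threshold=1):
    low = sum(1 for c in confidences if c < conf_threshold)
    # Update if the latest frame is low-confidence, or too many frames are.
    return confidences[-1] < conf_threshold or low > count_threshold

print(should_update([0.9, 0.8, 0.7, 0.3]))   # latest (C4) below threshold → True
print(should_update([0.9, 0.8, 0.7, 0.9]))   # all confident → False
```

When `should_update` returns true, the caller would replace the tracking ROI with the detector's ROI, or down-weight it in a weighted average as the text suggests.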
In an embodiment, the secondary tracker 135 may decide to update the tracking region of interest (that is, the region previously tracked by the secondary tracker 135) to the detection-tracking region of interest (that is, the detection-tracking result) according to the detection result of a scene change. A scene change relates to two adjacent consecutive image frames having different scenes. The processor 130 may judge the degree of change in the background's color, contrast, or specific patterns, and derive the scene-change detection result accordingly (for example, scene different/changed, or same/unchanged). For example, if the degree of change exceeds a change threshold, the detection result is that the scene has changed, and the secondary tracker 135 may update the region of interest. As another example, if the degree of change does not exceed the change threshold, the detection result is that the scene has not changed, and the secondary tracker 135 may maintain the tracking region of interest or use both the detection-tracking region of interest and the tracking region of interest.
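A scene-change check of the kind described might compare coarse grayscale histograms; this is only one possible "degree of change" measure, and all values here are assumptions:

```python
# Sketch: detect a scene change between two frames by comparing coarse
# grayscale histograms (pure Python; frames are 2-D lists of pixels 0..255).
def histogram(frame, bins=8):
    counts = [0] * bins
    total = 0
    for row in frame:
        for px in row:
            counts[min(px * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]

def scene_changed(frame_a, frame_b, change_threshold=0.5):
    ha, hb = histogram(frame_a), histogram(frame_b)
    # L1 distance between normalized histograms, in [0, 2].
    change = sum(abs(a - b) for a, b in zip(ha, hb))
    return change > change_threshold

day = [[200] * 4 for _ in range(4)]      # bright frame (e.g., daytime F2)
night = [[20] * 4 for _ in range(4)]     # dark frame (e.g., nighttime F3)
print(scene_changed(day, night))         # True
print(scene_changed(day, day))           # False
```

Real implementations would typically operate on downsampled frames and may also compare contrast or edge patterns, as the text mentions.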
For example, FIG. 8 is a timing diagram of a target update mechanism according to another embodiment of the present invention. Referring to FIG. 8, suppose the processor 130 detects at time point t1 that the scene has changed; for example, the content of the consecutive image frame F2 is daytime, but the content of the consecutive image frame F3 is nighttime. Then, for the consecutive image frame F3, the secondary tracker 135 may update the region of interest ROI5 obtained from the consecutive image frame F2 to the region of interest ROI6 most recently output by the detection tracker 131.
In an embodiment, in response to the completion of object tracking on one of the one or more consecutive image frames, the processor 130 may request the display of the object tracking result. For example, the processor 130 may, through the display, show the consecutive image frame and the region of interest framed by object tracking.
Taking FIG. 3 as an example, Table (1) is a timing relationship table:

Table (1)
During the period in which the detector 132 detects the consecutive image frame F1, the processor 130 inputs the consecutive image frames F1~F4 into the buffer 111. At this point, the consecutive image frames F1~F3 shown on the display do not yet carry object detection or object tracking results. When the display shows the consecutive image frame F4, the main tracker 133 may use the region of interest output by the detector 132 to track the target in the buffered consecutive image frames F1~F4, and the object tracking result may be displayed accordingly (such as the region of interest ROI2 in the consecutive image frame F4 shown in FIG. 3). In other embodiments, the region of interest ROI2 is compared with or computed against the tracking region of interest obtained by the secondary tracker 135; one of the two is selected, or they are blended, to obtain the final region of interest, which is displayed simultaneously when the display shows the consecutive image frame F4.
In an embodiment, the detector 132 may perform object detection on the image frames after the consecutive image frames buffered in the buffer 111, and may be disabled from (or not perform) object detection on the remaining originally buffered frames. That is, the detector 132 does not perform object detection on every input consecutive image frame. The detection period of the detector 132 for a single frame may be far longer than the tracking period of the main tracker 133 for a single frame, and the detection period may even fail to meet the low-latency requirements of the application scenario. By the time the detector 132 outputs one result, other consecutive image frames within the detection period may already have been requested for output or other processing multiple times. As shown in Table (1), the display outputs the consecutive image frames F1~F3 while the detector 132 is still performing object detection on the consecutive image frame F1. In response to the output of the object detection result, the detector 132 may directly perform object detection on a newly input consecutive image frame, without continuing object detection on the other previously buffered consecutive image frames. Taking FIG. 3 as an example, the detector 132 detects the image frames input after the consecutive image frame F4.
In another embodiment, the detection tracker 131 starts object detection on a newly input consecutive image frame according to a fixed time interval, a fixed frame-count interval, or a scene-change detection result, and each object detection is an independent event, regardless of whether an earlier object detection is still unfinished. Whenever the result of any detection tracking is output, it is used to update the previously output detection-tracking result; since the time each detection tracking takes varies, the "previous" detection tracking here is determined by the time point at which results are output. In yet another embodiment, the detection tracker 131 selects which of the consecutive image frames to subject to object detection according to a fixed time interval, a fixed frame-count interval, or a scene-change detection result. The start time point of the detection tracker 131 may be slightly earlier or later than in the preceding embodiment that starts according to such an interval or detection result; after starting, however, it selects a specific one of the consecutive image frames for object detection according to the fixed time interval, fixed frame-count interval, or scene-change detection result, and selectively stops the previous object detection or object tracking, thereby increasing the flexibility of the start time point of the detection tracker 131.
To sum up, in the object tracking method and the object tracking device of the embodiments of the present invention, the target in previously buffered consecutive image frames can be tracked based on the object detection result. In this way, the accuracy of object tracking can be improved regardless of the type of the target (for example, human, animal, or non-living). In addition, given the high processing efficiency of the tracker, the embodiments of the present invention can meet the requirements of real-time video or other low-latency applications.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, without departing in essence from the scope of the technical solutions of the embodiments of the present invention.
Claims (22)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110797357.7A CN115619815A (en) | 2021-07-14 | 2021-07-14 | Object tracking method and object tracking device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115619815A true CN115619815A (en) | 2023-01-17 |
Family
ID=84854465
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110797357.7A Pending CN115619815A (en) | 2021-07-14 | 2021-07-14 | Object tracking method and object tracking device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115619815A (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190114804A1 (en) * | 2017-10-13 | 2019-04-18 | Qualcomm Incorporated | Object tracking for neural network systems |
| CN110555862A (en) * | 2019-08-23 | 2019-12-10 | 北京数码视讯技术有限公司 | Target tracking method, device, electronic equipment and computer-readable storage medium |
| CN111209837A (en) * | 2019-12-31 | 2020-05-29 | 武汉光庭信息技术股份有限公司 | Target tracking method and device |
Non-Patent Citations (1)
| Title |
|---|
| Sui Yunfeng; Li Xingbo; Zhao Shi; Huang Zhongtao; Cheng Zhi: "Tracking method for aircraft takeoff and landing based on weak image detectors", Journal of Computer Applications, no. 1, 30 June 2018 (2018-06-30) * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| TWI783572B (en) | Object tracking method and object tracking apparatus | |
| CN110602527B (en) | Video processing method, device and storage medium | |
| JP2022023887A (en) | Appearance search system and method | |
| WO2021238586A1 (en) | Training method and apparatus, device, and computer readable storage medium | |
| WO2020125623A1 (en) | Method and device for live body detection, storage medium, and electronic device | |
| WO2021051545A1 (en) | Behavior identification model-based fall-down action determining method and apparatus, computer device, and storage medium | |
| CN107274433A (en) | Method for tracking target, device and storage medium based on deep learning | |
| WO2018188453A1 (en) | Method for determining human face area, storage medium, and computer device | |
| KR20180054709A (en) | Managing cloud sourced photos on a wireless network | |
| WO2020103462A1 (en) | Video search method and apparatus, computer device, and storage medium | |
| CN117237867A (en) | Adaptive scene surveillance video target detection method and system based on feature fusion | |
| JP2020109644A (en) | Fall detection method, fall detection apparatus, and electronic device | |
| CN111915713A (en) | A method for creating a three-dimensional dynamic scene, a computer device, and a storage medium | |
| Alotibi et al. | CNN-based crowd counting through IoT: Application for Saudi public places | |
| CN103947192A (en) | Video Analytics Coding | |
| CN112580435B (en) | Face positioning method, face model training and detecting method and device | |
| CN116468753A (en) | Target tracking method, device, equipment, storage medium and program product | |
| CN103187083A (en) | Storage method and system based on time domain video fusion | |
| CN112085002A (en) | Portrait segmentation method, portrait segmentation device, storage medium and electronic equipment | |
| CN110799984A (en) | Tracking control method, device and computer readable storage medium | |
| Wang et al. | Multi-scale aggregation network for temporal action proposals | |
| CN115619815A (en) | Object tracking method and object tracking device | |
| CN114219938A (en) | Region-of-interest acquisition method | |
| JP2010002960A (en) | Image processor, image processing method, and image processing program | |
| Choudhary et al. | Real time video summarization on mobile platform |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||