
CN100486336C - Real time method for segmenting motion object based on H.264 compression domain - Google Patents


Info

Publication number
CN100486336C
CN100486336C, CN200610116363A, CN 200610116363
Authority
CN
China
Prior art keywords
motion vector
motion
vector field
sigma
segmentation
Prior art date
Legal status
Expired - Fee Related
Application number
CN 200610116363
Other languages
Chinese (zh)
Other versions
CN1960491A (en)
Inventor
刘志
张兆杨
陆宇
Current Assignee
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN 200610116363
Publication of CN1960491A
Application granted
Publication of CN100486336C



Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

The invention proposes a real-time method, based on a matching matrix, for segmenting moving objects in the H.264 compressed domain. The only information the segmentation relies on is the motion vector field, uniformly sampled on 4×4 blocks, extracted from the H.264 video. First, the motion vector fields of several consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field that enhances salient motion information. Global motion compensation is then performed on the accumulated motion vector field, and a fast statistical region-growing algorithm partitions it into multiple regions according to motion similarity. Combining these two results, a moving object segmentation method based on a matching matrix is proposed that effectively handles object tracking and updating, object merging and splitting, the appearance of new objects, and the disappearance of objects in video sequences. Experimental results on MPEG-4 test sequences show that, on a computer with a 3.0 GHz CPU and 512 MB of memory, CIF-format video sequences are processed at an average of 38 ms per frame, which meets the 25 fps requirement of most real-time applications while providing good segmentation quality. Since the proposed method uses only motion vector field information, it is equally applicable to moving object segmentation in the MPEG compressed domain.

Description

Real-Time Method for Segmenting Moving Objects Based on the H.264 Compressed Domain

Technical Field

The present invention relates to a real-time method for segmenting moving objects based on the H.264 compressed domain. In sharp contrast to existing methods, it dispenses with full decoding of the compressed video: the motion vectors extracted by entropy decoding alone serve as the motion features needed for segmentation, so the amount of computation is greatly reduced. Moreover, it is not restricted to video sequences with a static background; it can quickly and reliably segment moving objects from sequences with either moving or static backgrounds. Since the method uses only motion vector field information, it is equally applicable to moving object segmentation in the MPEG compressed domain.

Background Art

Moving object segmentation is a prerequisite for many content-based multimedia applications such as video indexing and retrieval, intelligent video surveillance, video editing, and face recognition. Since MPEG-4 introduced content-based video coding, most research on moving object segmentation has focused on the pixel domain; segmentation in the compressed domain has only begun to attract attention in recent years. Compared with pixel-domain methods, segmentation in the compressed domain is better suited to practical applications. In particular, most video sequences in practice are already compressed in some format, and segmenting moving objects directly in the compressed domain avoids fully decoding the compressed video. Moreover, the amount of data to be processed in the compressed domain is far smaller than in the pixel domain, so the computational cost is greatly reduced. In addition, the motion vectors and DCT coefficients extracted from the compressed video by entropy decoding alone can be used directly as the motion and texture features required for segmentation. Segmenting moving objects in the compressed domain is therefore fast, overcomes the difficulty traditional pixel-domain methods have in meeting real-time requirements, and suits the many applications that demand real-time performance.

At present, although compressed-domain moving object segmentation methods have been proposed, they basically target the MPEG-2 compressed domain. H.264 is the latest video coding standard and doubles the coding efficiency of MPEG-2; more and more applications are switching from MPEG-2 to H.264, yet research on moving object segmentation in the H.264 compressed domain remains scarce. Unlike the MPEG compressed domain, the DCT coefficients of I-frames in the H.264 compressed domain cannot be used directly as texture features for segmentation, because the transform is applied to each block's spatial prediction residual rather than to the original block. Therefore, the only feature directly usable for moving object segmentation in the H.264 domain is the motion vector information. So far, only Zeng et al. have proposed a method in the H.264 domain, using a block-based MRF model to segment moving objects from the sparse motion vector field: each block is assigned a label type according to the magnitude of its motion vector, and the blocks belonging to moving objects are labeled by maximizing the posterior probability of the MRF. However, this method applies only to video sequences with a static background, its segmentation accuracy is not high, and its computational cost is relatively large.

Summary of the Invention

The object of the present invention is to provide a real-time method for segmenting moving objects based on the H.264 compressed domain, in which the only information used for segmentation is the motion vector field, uniformly sampled on 4×4 blocks, extracted from the H.264 compressed video. The method dispenses with full decoding of the compressed video and uses the motion vectors extracted by entropy decoding as the motion features needed for segmentation; the computation is thereby greatly reduced, achieving real-time moving object segmentation.

To achieve the above object, the concept of the present invention is as follows:

As shown in Fig. 1, motion vectors are extracted from the input H.264 compressed video stream and normalized, and iterative back-projection is then performed to obtain an accumulated motion vector field that markedly enhances the motion information. Global motion compensation is then carried out, and the accumulated motion vector field is partitioned into multiple regions according to motion similarity. A matching matrix is then used to represent the correlation between the segmented regions of the current frame and the projections into the current frame of the previous frame's moving objects, and the moving objects are segmented on the basis of this matching matrix.

According to the above concept, the technical solution of the present invention is as follows:

A real-time method for segmenting moving objects based on the H.264 compressed domain, characterized in that the motion vector fields of several consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field; global motion compensation is then performed on the accumulated motion vector field, while a fast statistical region-growing algorithm partitions the accumulated motion vector field into multiple regions according to motion similarity. Using these two results, the matching-matrix-based moving object segmentation method proposed by the present invention segments the moving objects, effectively handling object tracking and updating, object merging and splitting, and the appearance and disappearance of objects in video sequences. The steps are:

a. Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;

b. Accumulated motion vector field: perform iterative back-projection using the motion vector fields of several consecutive frames to obtain a more reliable accumulated motion vector field;

c. Global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;

d. Region segmentation: use the statistical region-growing method to partition the accumulated motion vector field into multiple regions with similar motion;

e. Object segmentation: segment the moving objects using the matching-matrix-based segmentation method.

The steps of the above motion vector field normalization are:

(1) Temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e., their temporal distance;

(2) Spatial normalization: assign the motion vector of each macroblock partition larger than 4×4 directly to all the 4×4 blocks it covers.
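The two normalization steps can be sketched in a few lines (an illustrative sketch only: the function name, the partition-list input format, and the use of NumPy are our assumptions, not the patent's):

```python
import numpy as np

def normalize_mv_field(partitions, frame_dist, field_h, field_w):
    """Normalize H.264 motion vectors onto a uniform grid of 4x4 blocks.

    partitions: iterable of (x, y, w, h, mvx, mvy) tuples, one per
        macroblock partition, with pixel coordinates/sizes (multiples of 4).
    frame_dist: number of frames between the current frame and its reference.
    Returns an array of shape (field_h, field_w, 2), one vector per block.
    """
    field = np.zeros((field_h, field_w, 2), dtype=np.float64)
    for x, y, w, h, mvx, mvy in partitions:
        # Temporal normalization: divide by the temporal distance.
        v = np.array([mvx, mvy], dtype=np.float64) / frame_dist
        # Spatial normalization: replicate the vector over every
        # 4x4 block covered by the partition.
        bx, by = x // 4, y // 4
        field[by:by + h // 4, bx:bx + w // 4] = v
    return field
```

For example, a 16×16 macroblock at the origin with motion vector (8, 4) and temporal distance 2 yields the vector (4, 2) on all sixteen of its 4×4 blocks.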

The steps of the above accumulation of the motion vector field are:

(1) Using the motion vector fields of several frames after the current frame, back-project the motion vector field of the adjacent frame: the projected motion vector of the current block is obtained by multiplying the motion vectors of the projected blocks by different scale factors and summing them. The scale factors are selected as follows: if the total area of the overlap regions is greater than half the area of the current block, the scale factor of each projected block is its area of overlap with the current block divided by the total area of all projected blocks' overlap with the current block; otherwise, the scale factor of each projected block is the ratio of its overlap area to the area of the current block;

(2) Iteratively accumulate, starting from the last frame, to obtain the accumulated motion vector field of the current frame.
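One back-projection step with the overlap-area scale factors of step (1) can be sketched as follows (an illustrative sketch: the dense-grid representation, the treatment of blocks projected outside the frame, and all names are our assumptions):

```python
import numpy as np

BLK = 4  # block size in pixels

def back_project(next_field):
    """Project a later frame's (H, W, 2) block motion vector field onto
    the current frame's 4x4-block grid, weighting each contribution by
    its overlap area as described in step (1)."""
    H, W, _ = next_field.shape
    overlap_sum = np.zeros((H, W))      # total overlap per current block
    weighted = np.zeros((H, W, 2))      # overlap-weighted vector sums
    for by in range(H):
        for bx in range(W):
            mvx, mvy = next_field[by, bx]
            # top-left corner of the displaced block in the current frame
            px, py = bx * BLK + mvx, by * BLK + mvy
            x0, y0 = int(np.floor(px / BLK)), int(np.floor(py / BLK))
            for gy in (y0, y0 + 1):         # at most 4 grid blocks overlap
                for gx in (x0, x0 + 1):
                    if not (0 <= gx < W and 0 <= gy < H):
                        continue
                    ox = max(0.0, BLK - abs(px - gx * BLK))
                    oy = max(0.0, BLK - abs(py - gy * BLK))
                    a = ox * oy             # overlap area with grid block
                    if a <= 0:
                        continue
                    overlap_sum[gy, gx] += a
                    weighted[gy, gx] += a * next_field[by, bx]
    proj = np.zeros((H, W, 2))
    half = BLK * BLK / 2.0
    # total overlap > half the block area: normalize by the total overlap;
    # otherwise: normalize by the full block area
    denom = np.where(overlap_sum > half, overlap_sum, float(BLK * BLK))
    nz = overlap_sum > 0
    proj[nz] = weighted[nz] / denom[nz][:, None]
    return proj
```

Step (2) then applies this projection iteratively, starting from the last frame of the group, combining each frame's field with the projected field to build the accumulated motion vector field of the current frame.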

The steps of the above global motion compensation are:

(1) Estimate the global motion vector field using a 6-parameter affine motion model;

① Model parameter initialization: let $m=(m_1,m_2,m_3,m_4,m_5,m_6)$ be the parameter vector of the global motion model; the model parameters $m^{(0)}$ are initialized as

$$m^{(0)} = \begin{bmatrix} 1 & 0 & \frac{1}{N}\sum_{i=1}^{N} mvx_i & 0 & 1 & \frac{1}{N}\sum_{i=1}^{N} mvy_i \end{bmatrix}^{T};$$

② Outlier rejection: first compute the estimated centre coordinates in the previous frame of the $i$-th block, whose centre coordinates in the current frame are $(x_i, y_i)$:

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_4 & m_5 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} m_3 \\ m_6 \end{bmatrix}.$$

The deviation $(ex_i, ey_i)$ between the predicted motion vector and the original accumulated motion vector $(mvx_i, mvy_i)$ is then computed as

$$ex_i = x_i' - x_i - mvx_i, \qquad ey_i = y_i' - y_i - mvy_i.$$

Using these formulas, compute the prediction deviation $(ex_i, ey_i)$ of every 4×4 block, then build the histogram of the squared deviation magnitudes $ex_i^2 + ey_i^2$ and discard the motion vectors whose squared deviation magnitude falls in the largest 25% of the histogram;

③ Model parameter update: update the model parameters with the motion vectors retained in the previous step, using the Newton-Raphson method. The new model parameter vector $m^{(l)}$ at iteration $l$ is defined as $m^{(l)} = m^{(l-1)} - H^{-1}b$, where the Hessian matrix $H$ and the gradient vector $b$ are computed as

$$H = \begin{bmatrix}
\sum_{i\in R} x_i^2 & \sum_{i\in R} x_i y_i & \sum_{i\in R} x_i & 0 & 0 & 0 \\
\sum_{i\in R} x_i y_i & \sum_{i\in R} y_i^2 & \sum_{i\in R} y_i & 0 & 0 & 0 \\
\sum_{i\in R} x_i & \sum_{i\in R} y_i & \sum_{i\in R} 1 & 0 & 0 & 0 \\
0 & 0 & 0 & \sum_{i\in R} x_i^2 & \sum_{i\in R} x_i y_i & \sum_{i\in R} x_i \\
0 & 0 & 0 & \sum_{i\in R} x_i y_i & \sum_{i\in R} y_i^2 & \sum_{i\in R} y_i \\
0 & 0 & 0 & \sum_{i\in R} x_i & \sum_{i\in R} y_i & \sum_{i\in R} 1
\end{bmatrix}$$

$$b = \begin{bmatrix} \sum_{i\in R} x_i\,ex_i & \sum_{i\in R} y_i\,ex_i & \sum_{i\in R} ex_i & \sum_{i\in R} x_i\,ey_i & \sum_{i\in R} y_i\,ey_i & \sum_{i\in R} ey_i \end{bmatrix}^{T}$$

where $R$ denotes the set of retained blocks;

④ Termination conditions: repeat steps ② and ③ at most 5 times; the iteration also stops early if either of the following two conditions is met: (i) compute $m^{(l)} - m_{static}$ to obtain a difference vector; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends, where $m_{static} = [1\ 0\ 0\ 0\ 1\ 0]^{T}$ is the global motion vector in the static-camera case; (ii) compute the difference between $m^{(l)}$ and $m^{(l-1)}$; if the parameter components $m_3$ and $m_6$ of this difference are smaller than 0.01 and the other parameter components are smaller than 0.0001, the iteration ends;

⑤ Substitute the obtained global motion model parameter vector $m$ into

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_4 & m_5 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} m_3 \\ m_6 \end{bmatrix}$$

to obtain the estimated coordinates $(x_i', y_i')$ in the previous frame, and finally obtain the global motion vector field, whose vector for block $i$ is $(x_i' - x_i,\; y_i' - y_i)$.

(2) Compute the residual of each 4×4 block between the global motion vector field and the accumulated motion vector field.
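Because the affine model is linear in its parameters, the Newton-Raphson update of step ③ can be written in closed form; a single update already reaches the least-squares optimum over the retained blocks. The sketch below (our own variable names; the outlier rejection of step ② and the termination tests of step ④ are omitted) builds the Hessian $H$ and gradient $b$ exactly as defined above:

```python
import numpy as np

def affine_update(m, xs, ys, mvx, mvy):
    """One Newton-Raphson step m <- m - H^{-1} b for the 6-parameter
    affine global-motion model.

    m = (m1..m6); block i with centre (xs[i], ys[i]) carries the
    accumulated motion vector (mvx[i], mvy[i]); all args are 1-D arrays
    except m, which is a length-6 array.
    """
    m1, m2, m3, m4, m5, m6 = m
    # predicted position in the previous frame and prediction deviation
    xp = m1 * xs + m2 * ys + m3
    yp = m4 * xs + m5 * ys + m6
    ex = xp - xs - mvx
    ey = yp - ys - mvy
    n = len(xs)
    Sxx, Sxy, Syy = (xs * xs).sum(), (xs * ys).sum(), (ys * ys).sum()
    Sx, Sy = xs.sum(), ys.sum()
    # block-diagonal Hessian: the same 3x3 block serves x and y parts
    A = np.array([[Sxx, Sxy, Sx], [Sxy, Syy, Sy], [Sx, Sy, n]])
    H = np.block([[A, np.zeros((3, 3))], [np.zeros((3, 3)), A]])
    b = np.array([(xs * ex).sum(), (ys * ex).sum(), ex.sum(),
                  (xs * ey).sum(), (ys * ey).sum(), ey.sum()])
    return m - np.linalg.solve(H, b)
```

With noise-free affine data, one call recovers the generating parameters exactly; with outliers, steps ② and ③ alternate as the text describes.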

The above region segmentation uses the statistical region-growing algorithm to partition the accumulated motion vector field into multiple regions with similar motion; the steps are as follows:

(1) Compute the motion dissimilarity measure of every pair of 4-neighbourhood-adjacent blocks;

(2) Sort all adjacent block pairs in increasing order of their motion dissimilarity measure;

(3) Merge the adjacent block pair with the smallest motion dissimilarity measure, and start the region-growing process from there. At each growing step, the two current blocks belong to two adjacent regions, and the condition for merging the two regions is that the difference between their average motion vectors is smaller than the threshold

$$\Delta(R) = \frac{SR^2}{2Q|R|}\left(\min(SR,|R|)\log(1+|R|) + 2\log(6wh)\right)$$

where $SR$ denotes the dynamic range of the motion vectors, $|R|$ the number of motion vectors contained in the region, $wh$ the size of the motion vector field, and the parameter $Q$ controls the degree of segmentation. In this way the motion vector field can be partitioned appropriately into several regions with similar motion;

(4) Compute the average residual of each segmented region after global motion compensation;

(5) Distinguish the most reliable background region from the regions where the other objects lie: among the segmented regions whose area exceeds 10% of the whole motion vector field, select the region with the smallest average residual as the reliable background region, and take the remaining regions as regions where moving objects may exist. Finally, label the M object regions and the 1 background region segmented from the current frame, and record the segmentation result as the region label map.
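Steps (1)-(3) amount to a statistical-region-merging pass over the 4-neighbour graph of blocks, in the spirit of Nock and Nielsen's SRM. In the sketch below, the dissimilarity measure (maximum absolute component difference), the merging test (squared mean difference within $\Delta(R_1)+\Delta(R_2)$ per component), and the default values of Q and SR are our assumptions; only the $\Delta(R)$ formula follows the text:

```python
import numpy as np

def srm_segment(field, Q=32.0, SR=64.0):
    """Statistical-region-growing segmentation of an (H, W, 2) motion
    vector field. Returns an (H, W) label map (labels are region roots)."""
    H, W, _ = field.shape
    wh = H * W
    parent = np.arange(wh)              # union-find forest
    size = np.ones(wh)                  # region sizes |R|
    mean = field.reshape(wh, 2).astype(np.float64).copy()

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def delta(sz):                      # statistical bound Delta(R)
        return SR * SR / (2.0 * Q * sz) * (
            min(SR, sz) * np.log(1.0 + sz) + 2.0 * np.log(6.0 * wh))

    # 4-neighbour edges, sorted by motion dissimilarity (smallest first)
    edges = []
    for y in range(H):
        for x in range(W):
            i = y * W + x
            if x + 1 < W:
                edges.append((np.abs(field[y, x] - field[y, x + 1]).max(), i, i + 1))
            if y + 1 < H:
                edges.append((np.abs(field[y, x] - field[y + 1, x]).max(), i, i + W))
    edges.sort(key=lambda e: e[0])

    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        # merge when the mean motion difference is within the bound
        if np.all((mean[ri] - mean[rj]) ** 2 <= delta(size[ri]) + delta(size[rj])):
            mean[ri] = (size[ri] * mean[ri] + size[rj] * mean[rj]) / (size[ri] + size[rj])
            size[ri] += size[rj]
            parent[rj] = ri
    # relabel every block to its region root
    return np.array([find(i) for i in range(wh)]).reshape(H, W)
```

Steps (4)-(5) then score the resulting regions by their average global-motion-compensation residual to pick the reliable background region.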

The above object segmentation uses the moving object segmentation result already obtained for the previous frame (time t-1) to judge whether each segmented region of the current frame (time t) matches some object of the previous frame, and thereby constructs a matching matrix. Based on the matching matrix, object tracking and updating, object merging, object splitting, the appearance of new objects, and the disappearance of old objects are determined, finally yielding the moving objects of the current frame. The steps are:

(1) Use the backward projection method to obtain the projection region in the current frame (time t) of each object of the previous frame (time t-1). First label the N moving objects and the 1 background object of the previous frame, then obtain each previous-frame object's projection region in the current frame by backward projection: for any block in the current frame's accumulated motion vector field, the difference between the block's coordinates and its accumulated motion vector gives the block's matching position in the previous frame, and the object label at that matching position in the previous frame is projected onto the current frame; the projections are labeled one by one;

(2) Construct the matrix CM_t, which records the area of overlap between the segmented regions and the object projections; the matrix CMR_t, which records the proportion of each segmented region that falls inside each object projection; and the matrix CMC_t, which records the proportion of each object projection that falls inside each segmented region. From the two label maps, construct three matrices CM_t, CMR_t, CMC_t of M+1 rows and N+1 columns. An element CM_t(i, j) of the matrix CM_t is the number of pixels labeled i in the region map and labeled j in the projection map, i.e., the area of overlap between segmented region i and object projection j. The elements of row i of CMR_t are the proportions of segmented region i that fall inside the individual object projections, and the elements of column j of CMC_t are the proportions of object projection j that fall inside the individual segmented regions;

(3) Construct the matrix CMM_t, which expresses the degree of association between the segmented regions of the current frame and the object projections; CMM_t records the correlation information between them reflected by CMR_t and CMC_t. CMM_t is first set to the zero matrix of M+1 rows and N+1 columns. Then CMR_t is scanned row by row to find the position of the maximum of each row, and 1 is added to the element of CMM_t at the corresponding position; next CMC_t is scanned column by column to find the position of the maximum of each column, and 2 is added to the element of CMM_t at the corresponding position. The rows of the resulting matrix CMM_t correspond, in order, to the background region and the motion regions of the current frame, and the columns correspond, in order, to the background object and the moving objects of the previous frame. The possible values of the elements of the matrix are 0, 1, 2, 3. Any nonzero element CMM_t(i, j) indicates a certain correlation between segmented region i and object j; specifically:

① CMM_t(i, j) = 1 indicates that segmented region i largely belongs to object j of the previous frame;

② CMM_t(i, j) = 2 indicates that object j of the previous frame is largely contained in segmented region i;

③ CMM_t(i, j) = 3 covers both of the above situations, indicating an extremely strong correlation between segmented region i and object j; a further comparison is needed: if CMR_t(i, j) > CMC_t(i, j), then CMM_t(i, j) = 1; otherwise, CMM_t(i, j) = 2. The final CMM_t thus takes values in {0, 1, 2};

(4) Based on the matching matrix CMM_t, perform object segmentation for the five cases of tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object. The matrix CMM_t effectively establishes the association between segmented regions and moving objects, and it can handle the following five cases in a unified way:

① Tracking and updating of a single object (1→1);

② Appearance of a new object (0→1);

③ Merging of objects (m→1);

④ Splitting of an object (1→m);

⑤ Disappearance of an object (1→0).
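The construction of CM_t, CMR_t, CMC_t, and CMM_t in steps (2)-(3), including the resolution of doubly-matched entries by rule ③, can be sketched as follows (label maps are assumed to use 0 for the background and 1..M / 1..N for regions and objects; function and variable names are ours):

```python
import numpy as np

def matching_matrix(region_map, proj_map, M, N):
    """Build CM, CMR, CMC, CMM from the current frame's region label map
    (labels 0..M) and the previous frame's object projection map
    (labels 0..N); all four are (M+1) x (N+1) arrays."""
    CM = np.zeros((M + 1, N + 1))
    for i, j in zip(region_map.ravel(), proj_map.ravel()):
        CM[i, j] += 1                            # overlap area
    row = CM.sum(axis=1, keepdims=True)          # region sizes
    col = CM.sum(axis=0, keepdims=True)          # projection sizes
    CMR = np.divide(CM, row, out=np.zeros_like(CM), where=row > 0)
    CMC = np.divide(CM, col, out=np.zeros_like(CM), where=col > 0)

    CMM = np.zeros((M + 1, N + 1), dtype=int)
    CMM[np.arange(M + 1), CMR.argmax(axis=1)] += 1   # row maxima: +1
    CMM[CMC.argmax(axis=0), np.arange(N + 1)] += 2   # column maxima: +2
    # rule (3): resolve doubly-matched entries (value 3) to 1 or 2
    for i, j in zip(*np.where(CMM == 3)):
        CMM[i, j] = 1 if CMR[i, j] > CMC[i, j] else 2
    return CM, CMR, CMC, CMM
```

The nonzero entries of the returned CMM then drive the five cases above, e.g. a region and an object matched one-to-one corresponds to tracking (1→1).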

Compared with the prior art, the present invention has the following outstanding features and advantages. The real-time compressed-domain moving object segmentation method provided by the present invention is based on the H.264 video stream; in sharp contrast, existing compressed-domain video object segmentation methods mainly apply to the MPEG domain, whereas the present invention applies not only to the H.264 compressed domain but equally to the MPEG compressed domain. Moreover, the present invention is not restricted to video sequences with a static background: it can quickly and reliably segment moving objects from sequences with either moving or static backgrounds. Furthermore, the matching-matrix method of segmenting moving objects proposed by the present invention can perform real-time segmentation in almost all situations of video object motion, so the segmentation effect is very good and the method has strong applicability.

Brief Description of the Drawings

Fig. 1 is the block diagram of the matching-matrix-based real-time method of the present invention for segmenting moving objects in the H.264 compressed domain.

Fig. 2 is the structural block diagram of the motion vector field normalization and the accumulated motion vector field in Fig. 1.

Fig. 3 is the structural block diagram of the global motion compensation and region segmentation in Fig. 1.

Fig. 4 is the structural block diagram of the object segmentation in Fig. 1.

Fig. 5 illustrates the moving object segmentation results for typical frames (frames 4, 37, 61, and 208) of the sequence Coastguard.

Fig. 6 illustrates the moving object segmentation results for typical frames (frames 4, 43, 109, and 160) of the sequence Mobile.

Detailed Description of the Embodiments

An embodiment of the present invention is described in detail below with reference to the accompanying drawings:

The matching-matrix-based real-time method of the present invention for segmenting moving objects in the H.264 compressed domain was implemented in software, following the block diagram of Fig. 1, on a PC test platform with a 3.0 GHz CPU and 512 MB of memory; Figs. 5 and 6 show the simulation test results.

Referring to Fig. 1, the matching-matrix-based real-time method of the present invention for segmenting moving objects in the H.264 compressed domain enhances salient motion information through normalization and accumulation of the motion vector field, then performs global motion compensation on the accumulated motion vector field, and uses the statistical region-growing algorithm and the matching-matrix-based moving object segmentation method to segment regions and moving objects. It is characterized by a simple algorithm, fast object segmentation, and good segmentation quality.

The steps are:

(1) Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;

(2) Accumulated motion vector field: perform iterative back-projection using the motion vector fields of several consecutive frames to obtain a more reliable accumulated motion vector field;

(3) Global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;

(4) Region segmentation: use the statistical region-growing method to partition the accumulated motion vector field into multiple regions with similar motion;

(5) Object segmentation: segment the moving objects using the matching-matrix-based segmentation algorithm.

上述步骤(1)的运动矢量场归一化的过程如下:The process of the motion vector field normalization of above-mentioned step (1) is as follows:

①时域归一化:将当前帧的运动矢量除以当前帧与参考帧的间隔帧数,即时域距离;① Temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e., the temporal distance;

②空域归一化:将凡尺寸大于4×4的各个宏块运动矢量直接赋给该宏块所覆盖的所有4×4块。② Spatial normalization: assign the motion vector of each macroblock larger than 4×4 directly to all the 4×4 blocks covered by that macroblock.

上述步骤(2)的累积运动矢量场的过程如下:The process of accumulating motion vector field of above-mentioned steps (2) is as follows:

①利用当前帧之后若干帧的运动矢量场,对相邻帧的运动矢量场进行后向投影;① Use the motion vector field of several frames after the current frame to back-project the motion vector field of the adjacent frame;

②从后帧开始迭代累积以获得当前帧的累积运动矢量场。② Iteratively accumulate from the next frame to obtain the accumulated motion vector field of the current frame.

上述步骤(3)的全局运动补偿的过程如下:The process of the global motion compensation of above-mentioned steps (3) is as follows:

①采用6参数的仿射运动模型估算全局运动矢量场;① Estimate the global motion vector field using a 6-parameter affine motion model;

②计算出各4×4块经全局运动补偿后的残差。② Calculate the residual error of each 4×4 block after global motion compensation.

上述步骤(4)的区域分割的过程如下:The process of the region segmentation of the above-mentioned step (4) is as follows:

①计算四邻域内任意相邻块组的运动差异性度量;① Calculate the motion difference measure of any adjacent block group in the four neighborhoods;

②所有相邻块组按照运动差异性度量从小到大的次序进行排序;② All adjacent block groups are sorted according to the order of motion difference measure from small to large;

③将运动差异性度量最小的相邻块组合并,以此处开始区域生长过程;③Merge the adjacent blocks with the smallest motion difference measure, and start the region growing process here;

④计算每个分割区域在全局运动补偿后的平均残差;④ Calculate the average residual error of each segmented area after global motion compensation;

⑤区分最可靠的背景区域和其它对象所在的区域。⑤ Distinguish the most reliable background area from the area where other objects are located.

上述步骤(5)的对象分割的过程如下:The process of object segmentation in the above step (5) is as follows:

①采用后向投影方法获得前一帧各个对象在当前帧的投影区域;① Use the back projection method to obtain the projection area of each object in the previous frame in the current frame;

②构造矩阵CMt,它表示分割区域与投影对象相互重叠的面积;构造矩阵CMRt,它表示每个分割区域落在各个对象投影内的比例;构造矩阵CMCt,它表示每个对象投影落在各个分割区域内的比例;② Construct the matrix CMt, which gives the overlap area between each segmented region and each projected object; the matrix CMRt, which gives the proportion of each segmented region falling within each object projection; and the matrix CMCt, which gives the proportion of each object projection falling within each segmented region;

③构造矩阵CMMt,它表示当前帧分割区域和对象投影之间的关联程度;③ Construct matrix CMM t , which represents the degree of association between the current frame segmentation area and the object projection;

④基于匹配矩阵CMMt对单个对象跟踪与更新、新对象出现、对象的合并、对象的分裂以及对象的消失等五类情况进行对象分割。④ Based on the matching matrix CMM t, object segmentation is performed for five types of situations: single object tracking and updating, new object appearance, object merging, object splitting, and object disappearance.

下面对本实施例子结合总框图(图1)的五个步骤给予进一步详细说明:The five steps of this embodiment are described in further detail below with reference to the overall block diagram (Fig. 1):

a.运动矢量场归一化:a. Motion vector field normalization:

如图2所示,将当前帧的运动矢量除以当前帧与参考帧的间隔帧数得到时域上的归一化,将当前帧中尺寸大于4×4的块的运动矢量直接赋给该块所覆盖的所有4×4块获得空域上的归一化。As shown in Figure 2, temporal normalization is obtained by dividing the motion vectors of the current frame by the number of frames between the current frame and its reference frame, and spatial normalization is obtained by directly assigning the motion vector of every block larger than 4×4 in the current frame to all the 4×4 blocks it covers.
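The two normalization steps can be sketched as follows; the partition list layout and the function name are illustrative assumptions, not part of the original method.

```python
import numpy as np

def normalize_mv_field(partitions, height, width):
    """Temporal + spatial normalization of an H.264 motion vector field.

    partitions: list of (y, x, h, w, mvx, mvy, dist) in pixel units, one entry
    per coded partition, where dist is the temporal distance to the reference
    frame (an assumed input layout). Returns a dense (height//4, width//4, 2)
    array holding one motion vector per 4x4 block."""
    field = np.zeros((height // 4, width // 4, 2), dtype=np.float32)
    for (y, x, h, w, mvx, mvy, dist) in partitions:
        # temporal normalization: divide by the frame distance to the reference
        v = (mvx / dist, mvy / dist)
        # spatial normalization: copy the vector to every covered 4x4 block
        field[y // 4:(y + h) // 4, x // 4:(x + w) // 4] = v
    return field
```

A 16×16 macroblock with vector (8, −4) and reference distance 2 thus yields (4, −2) on each of its sixteen 4×4 blocks.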

b.累积运动矢量场:b. Cumulative motion vector field:

如图2所示,先利用当前帧之后若干帧的运动矢量场,对相邻帧的运动矢量场进行后向投影。就是通过对各投影块的运动矢量乘以不同的比例因子后相加得到当前块的投影运动矢量,比例因子的选定方法为:如果重叠区域的总面积大于当前块面积的一半,则各投影块的比例因子取为该投影块与当前块相重叠的面积除以所有投影块与当前块重叠区域的总面积;否则,各投影块的比例因子取为其重叠面积与当前块面积之比。然后从后帧开始迭代累积以获得当前帧的累积运动矢量场。As shown in Fig. 2, the motion vector fields of several frames after the current frame are first used to back-project the motion vector field of the adjacent frame. The projected motion vector of the current block is obtained by multiplying the motion vector of each projected block by a scale factor and summing; the scale factors are chosen as follows: if the total overlap area is greater than half of the current block's area, the scale factor of each projected block is its overlap area with the current block divided by the total overlap area of all projected blocks with the current block; otherwise, the scale factor of each projected block is the ratio of its overlap area to the current block's area. The accumulation then iterates starting from the last frame to obtain the cumulative motion vector field of the current frame.
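A minimal sketch of the scale-factor rule for one 4×4 block, assuming the overlap areas between the projected blocks and the current block have already been computed (the data layout and function name are assumptions):

```python
def project_mv(overlaps, block_area=16.0):
    """Weighted back-projection of motion vectors onto one 4x4 block.

    overlaps: list of (overlap_area, mvx, mvy) for each projected block that
    intersects the current 4x4 block; block_area is 16 pixels for a 4x4 block.
    Returns the projected motion vector of the current block."""
    total = sum(a for a, _, _ in overlaps)
    if total > block_area / 2:
        # dense coverage: weight each block by its share of the total overlap
        weights = [a / total for a, _, _ in overlaps]
    else:
        # sparse coverage: weight each block by overlap over the full block area
        weights = [a / block_area for a, _, _ in overlaps]
    mvx = sum(w * vx for w, (_, vx, _) in zip(weights, overlaps))
    mvy = sum(w * vy for w, (_, _, vy) in zip(weights, overlaps))
    return mvx, mvy
```

With dense coverage the weights sum to 1, so the result is a true average; with sparse coverage the vector is attenuated in proportion to the uncovered area.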

c.全局运动补偿:c. Global motion compensation:

如图3所示,采用6参数的仿射运动模型估算全局运动矢量场,利用它与累积运动矢量场之差就可获得累积运动矢量场任意块经全局运动补偿后的残差。其步骤是:As shown in Figure 3, a 6-parameter affine motion model is used to estimate the global motion vector field, and the difference between it and the cumulative motion vector field can be used to obtain the residual error of any block in the cumulative motion vector field after global motion compensation. The steps are:

(1)采用6参数的仿射运动模型估算全局运动矢量场:(1) Estimate the global motion vector field using an affine motion model with 6 parameters:

①模型参数初始化:①Model parameter initialization:

设m=(m1,m2,m3,m4,m5,m6)是全局运动模型的参数矢量,模型参数m(0)初始化为:m(0) = [1, 0, (1/N)·Σ(i=1..N) mvx_i, 0, 1, (1/N)·Σ(i=1..N) mvy_i]^T;Let m=(m1, m2, m3, m4, m5, m6) be the parameter vector of the global motion model; the model parameters m(0) are initialized as m(0) = [1, 0, (1/N)·Σ(i=1..N) mvx_i, 0, 1, (1/N)·Σ(i=1..N) mvy_i]^T;

②剔除局外点:② Eliminate outliers:

首先计算当前帧中心坐标为(x_i, y_i)的第i个块在前一帧的估计中心坐标(x_i', y_i'):

[x_i'; y_i'] = [m1 m2; m4 m5]·[x_i; y_i] + [m3; m6]

则预测运动矢量(x_i'-x_i, y_i'-y_i)和原始累积运动矢量(mvx_i, mvy_i)的偏差(ex_i, ey_i)计算为:ex_i = x_i' - x_i - mvx_i,ey_i = y_i' - y_i - mvy_i。使用这个式子计算出每个4×4块的预测偏差(ex_i, ey_i),最后计算出偏差幅度平方和 ex_i² + ey_i² 的直方图,然后剔除直方图中偏差幅度平方和最大的25%的运动矢量。First, the estimated center coordinates (x_i', y_i') in the previous frame of the i-th block, whose center in the current frame is (x_i, y_i), are computed as [x_i'; y_i'] = [m1 m2; m4 m5]·[x_i; y_i] + [m3; m6]. The deviation (ex_i, ey_i) between the predicted motion vector (x_i'-x_i, y_i'-y_i) and the original cumulative motion vector (mvx_i, mvy_i) is then ex_i = x_i' - x_i - mvx_i, ey_i = y_i' - y_i - mvy_i. This formula gives the prediction deviation of every 4×4 block; a histogram of the squared deviation magnitudes ex_i² + ey_i² is built, and the 25% of motion vectors with the largest squared deviations are discarded as outliers.
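The outlier-rejection step might be sketched as below; the "25%" criterion is read here as discarding the quarter of vectors with the largest squared deviation, which is one plausible interpretation of the histogram rule, and the function name and block layout are assumptions:

```python
import numpy as np

def reject_outliers(m, blocks, keep=0.75):
    """Drop the motion vectors that fit the affine model worst.

    m: affine parameters (m1..m6); blocks: array with rows (x, y, mvx, mvy),
    one per 4x4 block. Keeps the fraction `keep` of blocks with the smallest
    squared prediction deviation ex^2 + ey^2."""
    m1, m2, m3, m4, m5, m6 = m
    x, y, mvx, mvy = blocks.T
    xp = m1 * x + m2 * y + m3          # estimated x' in the previous frame
    yp = m4 * x + m5 * y + m6          # estimated y' in the previous frame
    ex = xp - x - mvx                  # deviation from the accumulated MV
    ey = yp - y - mvy
    err = ex ** 2 + ey ** 2
    thresh = np.quantile(err, keep)    # 75th percentile of the deviations
    return blocks[err <= thresh]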

③模型参数更新:③Model parameter update:

使用前面步骤中保留下来的运动矢量和Newton-Raphson方法来更新模型参数。第l步迭代中新的模型参数矢量m(l)定义如下:m(l) = m(l-1) - H^(-1)·b,这里Hessian矩阵H和梯度矢量b计算如下:Use the motion vectors retained in the previous step and the Newton-Raphson method to update the model parameters. The new model parameter vector m(l) in the l-th iteration is m(l) = m(l-1) - H^(-1)·b, where the Hessian matrix H and the gradient vector b are computed as:

H = | Σ x_i²    Σ x_i·y_i  Σ x_i   0         0         0      |
    | Σ x_i·y_i Σ y_i²     Σ y_i   0         0         0      |
    | Σ x_i     Σ y_i      Σ 1     0         0         0      |
    | 0         0          0       Σ x_i²    Σ x_i·y_i Σ x_i  |
    | 0         0          0       Σ x_i·y_i Σ y_i²    Σ y_i  |
    | 0         0          0       Σ x_i     Σ y_i     Σ 1    |

b = [ Σ x_i·ex_i, Σ y_i·ex_i, Σ ex_i, Σ x_i·ey_i, Σ y_i·ey_i, Σ ey_i ]^T

(所有求和均对i∈R进行。All sums are taken over i ∈ R.)

这里R代表保留下来的块的集合。Here R represents the set of retained blocks.

④结束条件:重复步骤②和③最多5次,而且以下两个条件之一如果被满足的话也提前结束迭代:④ End condition: Repeat steps ② and ③ up to 5 times, and if one of the following two conditions is met, the iteration will also end early:

(i)计算m(l)-mstatic,得到一个差值向量,如果该差值向量中的每一个参数分量都小于0.01,就判断为属于摄像机静止的情况,结束迭代;其中mstatic=[1 0 0 0 1 0]^T为摄像机静止情况下的全局运动模型参数矢量;(i) Compute m(l) - mstatic to obtain a difference vector; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends, where mstatic = [1 0 0 0 1 0]^T is the global motion parameter vector for a static camera;

(ii)计算m(l)和m(l-1)的差值,如果这个差值的参数分量m3和m6小于0.01,而且其它参数分量小于0.0001,则迭代结束。(ii) Calculate the difference between m (l) and m (l-1) , if the parameter components m 3 and m 6 of this difference are less than 0.01, and other parameter components are less than 0.0001, then the iteration ends.

⑤将得到的全局运动模型参数矢量m代入 [x_i'; y_i'] = [m1 m2; m4 m5]·[x_i; y_i] + [m3; m6],求出前一帧的估计坐标(x_i', y_i'),最后得到全局运动矢量场 (gmvx_i, gmvy_i) = (x_i'-x_i, y_i'-y_i)。⑤ Substitute the obtained global motion model parameter vector m into [x_i'; y_i'] = [m1 m2; m4 m5]·[x_i; y_i] + [m3; m6] to find the estimated coordinates (x_i', y_i') in the previous frame; the global motion vector field is then obtained as (gmvx_i, gmvy_i) = (x_i'-x_i, y_i'-y_i).

(2)计算全局运动矢量场与累积运动矢量场中各4×4块的残差。(2) Calculate the residuals of each 4×4 block in the global motion vector field and the cumulative motion vector field.

d.区域分割:d. Region segmentation:

如图3所示,本发明采用统计区域生长算法实现对累积运动矢量场的区域分割。步骤详述如下:As shown in FIG. 3 , the present invention uses a statistical region growing algorithm to realize region segmentation of the accumulated motion vector field. The steps are detailed below:

(1)计算四邻域内任意相邻块组的运动差异性度量;(1) Calculate the motion difference measure of any adjacent block group in the four neighborhoods;

(2)所有相邻块组按照运动差异性度量从小到大的次序进行排序;(2) All adjacent block groups are sorted according to the order of motion difference measure from small to large;

(3)将运动差异性度量最小的相邻块组合并,以此处开始区域生长过程。在每次区域生长时,当前两个块组分别属于相邻的两个区域,则判断这两个区域是否合并的条件是这两个区域的平均运动矢量之差是否小于阈值:

Δ(R) = (SR² / (2Q|R|)) · ( min(SR, |R|)·log(1+|R|) + 2·log(6wh) )

其中SR表示运动矢量的动态范围,|R|表示区域包含的运动矢量数目,wh表示运动矢量场的尺寸,参数Q用来控制运动矢量场的分割程度,这样就可以将运动矢量场适度地分割成若干具有相似运动的区域;(3) Merge the adjacent block pair with the smallest motion difference measure and start the region growing process there. At each growing step the two current blocks belong to two adjacent regions, and these two regions are merged when the difference of their average motion vectors is below the threshold Δ(R) given above, where SR is the dynamic range of the motion vectors, |R| is the number of motion vectors contained in a region, wh is the size of the motion vector field, and the parameter Q controls the granularity of the segmentation; in this way the motion vector field is moderately divided into several regions of similar motion;
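The merging predicate might be sketched as below, assuming two regions merge when the squared difference of their mean motion vectors is below the sum of the two regions' thresholds Δ(R); the exact comparison rule and the default Q=32 are assumptions for illustration:

```python
import math

def merge_threshold(SR, size, wh, Q):
    """Delta(R) from the text: SR = dynamic range of the motion vectors,
    size = |R| (number of vectors in the region), wh = field size,
    Q = segmentation-granularity parameter."""
    return (SR * SR / (2.0 * Q * size)) * (
        min(SR, size) * math.log(1 + size) + 2 * math.log(6 * wh))

def should_merge(mean_mv_a, mean_mv_b, SR, size_a, size_b, wh, Q=32):
    """Statistical-region-growing style predicate on two adjacent regions:
    compare the squared difference of their mean motion vectors against the
    combined thresholds (comparison rule is an illustrative assumption)."""
    d2 = (mean_mv_a[0] - mean_mv_b[0]) ** 2 + (mean_mv_a[1] - mean_mv_b[1]) ** 2
    return d2 <= merge_threshold(SR, size_a, wh, Q) + merge_threshold(SR, size_b, wh, Q)
```

Larger Q shrinks Δ(R) and yields a finer segmentation; small regions get a larger threshold and so merge more easily, which is what makes the growing process stable on noisy compressed-domain vectors.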

(4)计算每个分割区域在全局运动补偿后的平均残差;(4) Calculate the average residual error of each segmented region after global motion compensation;

(5)区分最可靠的背景区域和其它对象所在的区域。在面积大于整个运动矢量场10%的若干分割区域中选择平均残差最小的区域作为可靠的背景区域,标记为R_0^t;剩下的区域作为运动对象可能存在的区域R_i^t(i=1,…,M);最后对当前帧所分割的M个对象区域和1个背景区域分别标记,分割结果记为{R_i^t, i=0,1,…,M}。(5) Distinguish the most reliable background region from the regions where other objects may lie. Among the segmented regions whose area exceeds 10% of the whole motion vector field, the region with the smallest average residual is chosen as the reliable background region and labeled R_0^t; the remaining regions R_i^t (i=1,…,M) are taken as regions where moving objects may exist. Finally the M object regions and 1 background region segmented in the current frame are labeled separately, and the segmentation result is denoted {R_i^t, i=0,1,…,M}.

e.对象分割e. Object Segmentation

如图4所示,先通过计算找到在相邻两帧中匹配的块,再将前一帧的运动对象投影至当前帧并标记为对象投影,然后利用当前帧对象投影和分割区域的相关性构造3个M+1行N+1列的矩阵CMt,CMRt,CMCt。再由矩阵CMRt和CMCt生成匹配矩阵CMMt,基于这个匹配矩阵对五类不同的运动对象情况作出分割。步骤详述如下:As shown in Figure 4, first find the matching blocks in two adjacent frames by calculation, then project the moving object of the previous frame to the current frame and mark it as object projection, and then use the correlation between the current frame object projection and the segmented area Construct three matrices CM t , CMR t , and CMC t with M+1 rows and N+1 columns. Then the matching matrix CMM t is generated from the matrices CMR t and CMC t , and based on this matching matrix, five different types of moving objects are segmented. The steps are detailed below:

(1)采用后向投影方法获得前一帧(t-1时刻)各个对象在当前帧(t时刻)的投影区域。先将前一帧的N个运动对象O_j^{t-1}(j=1,…,N)和1个背景对象O_0^{t-1}标记出来,然后采用后向投影的方法获得前一帧各个对象在当前帧的投影区域。就是利用当前帧累积运动矢量场中任意块的坐标和其对应的累积运动矢量的差求出这个块在前一帧中的匹配位置,然后将前一帧匹配位置上的块对象标记投影到当前帧并逐个标记出来,记为Proj(O_j^{t-1})。(1) Use the backward projection method to obtain, in the current frame (time t), the projection region of each object of the previous frame (time t-1). The N moving objects O_j^{t-1} (j=1,…,N) and 1 background object O_0^{t-1} of the previous frame are first labeled; backward projection then gives the projection region of each previous-frame object in the current frame: the difference between the coordinates of any block in the current cumulative motion vector field and its cumulative motion vector locates the matching position of that block in the previous frame, and the object label of the block at that matching position is projected into the current frame and marked block by block, denoted Proj(O_j^{t-1}).

(2)构造矩阵CMt,它表示分割区域与对象投影相互重叠的面积;构造矩阵CMRt,它表示每个分割区域落在各个对象投影内的比例;构造矩阵CMCt,它表示每个对象投影落在各个分割区域内的比例。根据标记图象{R_i^t}和{Proj(O_j^{t-1})}构造3个M+1行N+1列的矩阵CMt,CMRt,CMCt。其中矩阵CMt中的任意元素CMt(i,j)取值为在区域标记图中标记为i且在投影标记图中标记为j的象素数目,即分割区域R_i^t与对象投影Proj(O_j^{t-1})相互重叠的面积。矩阵CMRt第i行的各个元素是分割区域R_i^t落在各个对象投影内的比例,即CMRt(i,j)=CMt(i,j)/Σ_j CMt(i,j);矩阵CMCt第j列的各个元素是对象O_j^{t-1}的投影落在各个分割区域内的比例,即CMCt(i,j)=CMt(i,j)/Σ_i CMt(i,j)。(2) Construct the matrix CMt, which gives the overlap areas between segmented regions and object projections; the matrix CMRt, which gives the proportion of each segmented region falling within each object projection; and the matrix CMCt, which gives the proportion of each object projection falling within each segmented region. From the label images {R_i^t} and {Proj(O_j^{t-1})}, three matrices CMt, CMRt, CMCt of M+1 rows and N+1 columns are constructed. Element CMt(i,j) is the number of pixels labeled i in the region map and labeled j in the projection map, i.e., the overlap area of region R_i^t and projection Proj(O_j^{t-1}). Each element in row i of CMRt is the proportion of region R_i^t that falls within each object projection, CMRt(i,j)=CMt(i,j)/Σ_j CMt(i,j); each element in column j of CMCt is the proportion of the projection of object O_j^{t-1} that falls within each segmented region, CMCt(i,j)=CMt(i,j)/Σ_i CMt(i,j).
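Given per-pixel label maps for the segmented regions and the projected objects, the three matrices can be built as in this sketch (the function name and input layout are assumptions):

```python
import numpy as np

def build_matrices(region_map, proj_map, M, N):
    """Build CM (overlap areas), CMR (row-normalized), CMC (column-normalized).

    region_map: per-pixel region labels 0..M of the current frame;
    proj_map:   per-pixel projected-object labels 0..N from the previous frame.
    Both are integer arrays of the same shape."""
    CM = np.zeros((M + 1, N + 1))
    for i, j in zip(region_map.ravel(), proj_map.ravel()):
        CM[i, j] += 1                              # count overlapping pixels
    row = CM.sum(axis=1, keepdims=True)            # total area of each region
    col = CM.sum(axis=0, keepdims=True)            # total area of each projection
    CMR = np.divide(CM, row, out=np.zeros_like(CM), where=row > 0)
    CMC = np.divide(CM, col, out=np.zeros_like(CM), where=col > 0)
    return CM, CMR, CMC
```

Each row of CMR and each column of CMC sums to 1 (when non-empty), so the two matrices directly express the containment proportions described above.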

(3)构造矩阵CMMt,它表示当前帧分割区域和对象投影之间的关联程度。矩阵CMMt记录了CMRt和CMCt所反映的{R_i^t}和{Proj(O_j^{t-1})}之间的相关信息。CMMt首先置为M+1行N+1列的零矩阵;接着对CMRt进行行扫描找到每一行最大值所在的位置,对CMMt中相应位置处的元素值加1;然后对CMCt进行列扫描找到每一列最大值所在的位置,对CMMt中相应位置处的元素值加2。生成的矩阵CMMt的纵坐标依次表示当前帧背景区域R_0^t和运动区域R_1^t,…,R_M^t,横坐标依次表示前一帧背景对象O_0^{t-1}和运动对象O_1^{t-1},…,O_N^{t-1};矩阵中各元素的可能取值为0,1,2,3。CMMt中任意不为0的元素CMMt(i,j)表明了分割区域R_i^t与对象O_j^{t-1}存在一定的相关性,具体而言:(3) Construct the matrix CMMt, which expresses the degree of association between current-frame segmented regions and object projections. CMMt records the correlation between {R_i^t} and {Proj(O_j^{t-1})} reflected by CMRt and CMCt. CMMt is first set to an (M+1)×(N+1) zero matrix; each row of CMRt is then scanned for the position of its maximum and 1 is added to the element at the corresponding position of CMMt; next each column of CMCt is scanned for the position of its maximum and 2 is added at the corresponding position of CMMt. The rows of the resulting CMMt correspond in order to the current-frame background region R_0^t and motion regions R_1^t,…,R_M^t, and the columns to the previous-frame background object O_0^{t-1} and moving objects O_1^{t-1},…,O_N^{t-1}; the possible values of its elements are 0, 1, 2, 3. Any nonzero element CMMt(i,j) indicates a certain correlation between region R_i^t and object O_j^{t-1}; specifically:

①CMMt(i,j)=1,表明分割区域R_i^t在很大程度上属于前一帧对象O_j^{t-1};① CMMt(i,j)=1 indicates that the segmented region R_i^t largely belongs to the previous-frame object O_j^{t-1};

②CMMt(i,j)=2,表明前一帧对象O_j^{t-1}在很大程度上包含在分割区域R_i^t中;② CMMt(i,j)=2 indicates that the previous-frame object O_j^{t-1} is largely contained in the segmented region R_i^t;

③CMMt(i,j)=3,同时包含了上述两种情况,表明R_i^t和O_j^{t-1}具有极强的相关性,需要进一步比较:如果CMRt(i,j)>CMCt(i,j),则CMMt(i,j)=1;否则,CMMt(i,j)=2。最后生成的CMMt取值范围为0,1,2。③ CMMt(i,j)=3 covers both of the above cases and indicates a very strong correlation between R_i^t and O_j^{t-1}; a further comparison is made: if CMRt(i,j)>CMCt(i,j) then CMMt(i,j)=1, otherwise CMMt(i,j)=2. The final CMMt therefore takes values in {0, 1, 2}.
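The row/column-scan construction of CMMt, including the tie-breaking of value-3 entries, can be sketched as:

```python
import numpy as np

def build_cmm(CMR, CMC):
    """Matching matrix: +1 at each row-wise argmax of CMR, +2 at each
    column-wise argmax of CMC; entries that reach 3 are resolved by
    comparing CMR and CMC, so the final values are 0, 1 or 2."""
    CMM = np.zeros(CMR.shape, dtype=int)
    for i in range(CMR.shape[0]):
        CMM[i, np.argmax(CMR[i])] += 1          # region i's dominant projection
    for j in range(CMC.shape[1]):
        CMM[np.argmax(CMC[:, j]), j] += 2       # object j's dominant region
    both = CMM == 3                              # both relations hold at once
    CMM[both] = np.where(CMR[both] > CMC[both], 1, 2)
    return CMM
```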

(4)基于匹配矩阵CMMt对单个对象的跟踪与更新、新对象出现、对象的合并、对象的分裂以及对象的消失五类情况进行对象分割。通过矩阵CMMt可以有效地建立起分割区域与运动对象的关联关系,它能够以一种统一的方式有效地处理以下五种情况:(4) Carry out object segmentation based on the matching matrix CMM t for tracking and updating of a single object, appearance of new objects, merging of objects, splitting of objects and disappearance of objects. The relationship between the segmented area and the moving object can be effectively established through the matrix CMM t , which can effectively handle the following five situations in a unified manner:

①单个对象跟踪与更新(1→1):如果CMMt的第i行只有一个非零元素CMMt(i,j),而且第j列也只有这一个非零元素CMMt(i,j),那么表明分割区域R_i^t只与对象O_j^{t-1}存在相关性,根据CMMt(i,j)的取值采取不同的策略:如果CMMt(i,j)=2,采取更新策略,用当前帧的分割区域来表示更新后的对象,即O_j^t=R_i^t。如果CMMt(i,j)=1,一般采取对象跟踪的策略,即用前一帧对象的投影来表示当前帧的对象,即O_j^t=Proj(O_j^{t-1});另外,如果分割区域R_i^t同时还满足阈值条件:|R_i^t|>Ts 且 ER_i^t>α·ER_0^t,其中Ts=64,α=1.5,|R_i^t|表示区域R_i^t所包含的运动矢量数目,ER_i^t表示区域R_i^t的平均残差,ER_0^t表示背景的平均残差;则认为R_i^t是一个可靠的运动区域,可用来表示当前帧的运动对象,即O_j^t=R_i^t。① Single object tracking and updating (1→1): if row i of CMMt has only one nonzero element CMMt(i,j), and column j also has only this same nonzero element, then region R_i^t is correlated only with object O_j^{t-1}, and the strategy depends on the value of CMMt(i,j). If CMMt(i,j)=2, the update strategy is used: the current-frame segmented region represents the updated object, O_j^t=R_i^t. If CMMt(i,j)=1, the tracking strategy is generally used: the projection of the previous-frame object represents the current-frame object, O_j^t=Proj(O_j^{t-1}); in addition, if region R_i^t also satisfies the threshold conditions |R_i^t|>Ts and ER_i^t>α·ER_0^t, where Ts=64, α=1.5, |R_i^t| is the number of motion vectors contained in R_i^t, ER_i^t is the average residual of region R_i^t, and ER_0^t is the average residual of the background, then R_i^t is considered a reliable motion region and can represent the current-frame moving object, O_j^t=R_i^t.

②新对象出现(0→1):如果CMMt的第i行只有一个非零元素且位于第1列,值为1,表明该分割区域R_i^t在前一帧还是背景对象O_0^{t-1}的一部分,并不属于已有的任何运动对象。如果R_i^t同时满足上面①中的阈值条件,则可认为R_i^t是一个新出现的运动对象,记为O_{M+1}^t=R_i^t。② New object appearance (0→1): if row i of CMMt has only one nonzero element, located in column 1 with value 1, the segmented region R_i^t still belonged to the background object O_0^{t-1} in the previous frame and to none of the existing moving objects. If R_i^t also satisfies the threshold conditions in ① above, it is regarded as a newly appearing moving object, denoted O_{M+1}^t=R_i^t.

上述①和②两种情况下,CMMt某行的非零元素个数都为1。如果CMMt的第i行存在多个非零元素,则表明分割区域R_i^t可能与多个对象存在相关性。在这种情况下,只需要将前一帧的对象投影到当前帧作为当前帧的运动对象,实现对象的跟踪,即O_j^t=Proj(O_j^{t-1})。In both cases ① and ② above, the number of nonzero elements in the corresponding row of CMMt is 1. If row i of CMMt contains several nonzero elements, region R_i^t may be correlated with several objects; in this case the previous-frame objects are simply projected into the current frame as the current-frame moving objects, realizing object tracking, O_j^t=Proj(O_j^{t-1}).

③对象的合并(m→1):如果CMMt的第i行上除第1列外有2个以上的元素取值为2,表明前一帧中2个以上的对象在很大程度上包含在新的分割区域R_i^t中,则R_i^t表示了这些对象合并后的新对象,记为O_{M+1}^t=R_i^t。在这种情况下,分割区域R_i^t往往包含了2个或2个以上具有十分相似的运动且在空间相互邻接的对象,它们在当前帧作为一个新的合并对象而被分割出来。③ Object merging (m→1): if, apart from column 1, two or more elements in row i of CMMt take the value 2, two or more previous-frame objects are largely contained in the new segmented region R_i^t, and R_i^t represents the new merged object, denoted O_{M+1}^t=R_i^t. In this case R_i^t usually contains two or more spatially adjacent objects with very similar motion, which are segmented out in the current frame as one new merged object.

④对象的分裂(1→m):如果CMMt的第j列中有2个以上的元素取值为1,则表明前一帧对象O_j^{t-1}在当前帧分裂成多个分割区域R_{s_i}^t。即使这些区域在空间上并不邻接,在当前帧的分割中,仍然认为这些分割区域属于同一个对象;直到这些具有相同对象标记却在空间相互不邻接的多个分割区域在随后的若干帧中表现出不同的运动,才对这些分割区域赋予不同的对象标记,记为O_{s_i}^t=R_{s_i}^t,以实现真正的对象分裂。④ Object splitting (1→m): if two or more elements in column j of CMMt take the value 1, the previous-frame object O_j^{t-1} has split into several segmented regions R_{s_i}^t in the current frame. Even if these regions are not spatially adjacent, they are still considered to belong to the same object in the current frame's segmentation; only when these regions, sharing the same object label but not spatially adjacent, exhibit different motion in subsequent frames are they assigned different object labels, denoted O_{s_i}^t=R_{s_i}^t, realizing a true object split.

⑤对象的消失(1→0):如果CMMt的第j列只有1个非零元素且位于第1行,值为2,表明前一帧对象O_j^{t-1}的投影落在当前帧的背景区域R_0^t中,则认为O_j^{t-1}在当前帧消失。⑤ Object disappearance (1→0): if column j of CMMt has only one nonzero element, located in row 1 with value 2, the projection of the previous-frame object O_j^{t-1} falls in the background region R_0^t of the current frame, and O_j^{t-1} is considered to have disappeared in the current frame.
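A simplified sketch of the row-wise part of the five-case decision logic (cases ④ splitting and ⑤ disappearance are detected column-wise in the same spirit; the function name and returned strings are illustrative):

```python
import numpy as np

def classify_region(CMM, i):
    """Classify segmented region i of the current frame against the previous
    frame's objects from its row of the matching matrix CMM (column 0 is the
    background object)."""
    row = CMM[i]
    nz = np.nonzero(row)[0]
    if len(nz) == 1:
        j = nz[0]
        if j == 0 and row[j] == 1:
            return "new object candidate"        # case 2: split off the background
        if row[j] == 2:
            return f"update object {j}"          # case 1: region replaces object j
        return f"track object {j}"               # case 1: keep projected object j
    if len(nz) > 1:
        if np.count_nonzero(row[1:] == 2) >= 2:  # several objects contained here
            return "merge objects " + ",".join(map(str, nz[nz > 0]))  # case 3
        return "track projected objects"         # ambiguous: fall back to projection
    return "background"
```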

如上所述已经能够有效地处理在视频序列的运动对象分割过程中可能出现的5种情况。但当场景发生较大变化时,连续多帧都对所有对象采取了跟踪的策略,即都是上一帧各个对象的投影,表明当前帧各个分割区域与前一帧各个运动对象的相关性很弱,因此将按照情况②来判断是否有新对象出现,需要重新检测运动对象。As described above, the five situations that may arise during moving object segmentation of a video sequence can be handled effectively. However, when the scene changes considerably, the tracking strategy is applied to all objects over several consecutive frames, i.e., every object is merely the projection of the previous frame's object, which indicates that the correlation between the current frame's segmented regions and the previous frame's moving objects is very weak; in that case, situation ② is used to decide whether new objects have appeared, and the moving objects need to be detected anew.

以下给出输入视频格式为352×288的CIF时的实例,采用JM8.6版本的H.264编码器对MPEG-4标准测试序列进行编码,作为测试用的H.264压缩视频。H.264编码器的配置如下:Baseline Profile,IPPP,每30帧插入1个I帧,3个参考帧,运动估计的搜索范围为[-32,32],量化参数为30,编码帧数为300帧。在实验中,我们采取每隔3帧(运动矢量累积过程中使用的帧数)计算一次累积运动矢量场的做法,总共获得了100帧累积运动矢量场来测试本文提出的运动对象分割算法的性能。先从当前帧由累积运动矢量场得到区域分割结果,然后将前一帧运动对象投影到当前帧,基于这两个结果采用基于匹配矩阵的分割方法将运动对象分割出来。The following is an example when the input video format is 352×288 CIF, and the H.264 encoder of JM8.6 version is used to encode the MPEG-4 standard test sequence as the H.264 compressed video for testing. The configuration of the H.264 encoder is as follows: Baseline Profile, IPPP, inserting 1 I frame and 3 reference frames every 30 frames, the search range of motion estimation is [-32, 32], the quantization parameter is 30, and the number of encoded frames is 300 frames. In the experiment, we calculated the cumulative motion vector field every 3 frames (the number of frames used in the motion vector accumulation process), and obtained a total of 100 frames of cumulative motion vector field to test the performance of the moving object segmentation algorithm proposed in this paper . Firstly, the region segmentation result is obtained from the current frame by accumulating the motion vector field, and then the moving object in the previous frame is projected to the current frame. Based on these two results, the segmentation method based on matching matrix is used to segment the moving object.

采用典型的标准测试序列Coastguard和Mobile作为输入视频进行测试,实验结果分别如图5和图6所示。两图中第1列为当前帧的原始图象,第2列为当前帧由累积运动矢量场分割所得的区域分割结果,第3列为前一帧运动对象的在当前帧的投影区域,第4列为当前帧分割出的运动对象。平均每帧的处理时间为38ms,已能满足大多数实时应用25fps的要求。考虑到本文的分割方法其实是每隔3帧进行一次运动对象分割,对于给出的原始视频序列而言,完全可以在实时解码的同时就能分割出相应的运动对象,即使要求每帧都分割出相应的运动对象,只需要对其余帧进行对象投影,其计算量也很小,仍能实时分割出运动对象。The typical standard test sequences Coastguard and Mobile are used as the input video for testing, and the experimental results are shown in Figure 5 and Figure 6, respectively. The first column in the two figures is the original image of the current frame, the second column is the region segmentation result of the current frame by the cumulative motion vector field segmentation, the third column is the projection area of the previous frame moving object in the current frame, and the second column is Column 4 is the moving object segmented from the current frame. The average processing time of each frame is 38ms, which can meet the requirement of 25fps for most real-time applications. Considering that the segmentation method in this paper is actually segmenting moving objects every 3 frames, for the given original video sequence, it is completely possible to segment corresponding moving objects while decoding in real time, even if each frame is required to be segmented To generate the corresponding moving objects, it only needs to perform object projection on the remaining frames, and the calculation amount is very small, and the moving objects can still be segmented in real time.

实验1:序列Coastguard具有明显的全局运动,摄像机首先自右向左平移来跟踪画面中间的小船,然后自左向右运动来跟踪从画面左边出现的大船。图5第1行(序列第4帧)为摄像机自右向左跟踪小船的运动,图5第2行(序列第37帧)为新对象大船由左向右运动,图5第3行(序列第61帧)为两个运动对象大船和小船完全出现在摄像机的场景中,图5第4行(序列第208帧)为摄像机开始自左向右跟踪大船的运动。由图5第2列图象可以看出,对累积运动矢量场的分割大多能够比较准确地分割出两个运动对象所在的区域,而且符合全局运动模型的大部分背景区域也都包含在一个大的分割区域中,白色的区域表示了经运动补偿后最可靠的背景区域,因此本文采取的对运动矢量场的累积以及分割方法是有效的,能够利用运动矢量信息获得一个适度分割的结果。结合第3列所示的前一帧各个对象在当前帧的投影区域,利用基于匹配矩阵的运动对象分割方法,能够在整个序列中稳定可靠地分割出第4列所示的运动对象。Experiment 1: The sequence Coastguard has obvious global motion. The camera first pans from right to left to track the small boat in the middle of the picture, and then moves from left to right to track the big boat that appears from the left of the picture. Line 1 in Figure 5 (frame 4 in the sequence) is the camera tracking the movement of the boat from right to left, line 2 in Figure 5 (frame 37 in the sequence) is the movement of the new object ship from left to right, line 3 in Figure 5 (frame 3 in the sequence Frame 61) shows that two moving objects, a large ship and a small ship, completely appear in the scene of the camera. Line 4 in Figure 5 (frame 208 of the sequence) shows that the camera starts to track the movement of the large ship from left to right. From the images in the second column of Figure 5, it can be seen that most of the segmentation of the accumulated motion vector field can accurately segment the areas where the two moving objects are located, and most of the background areas that conform to the global motion model are also included in a large In the segmentation area of , the white area represents the most reliable background area after motion compensation, so the accumulation and segmentation method of the motion vector field adopted in this paper is effective, and a moderate segmentation result can be obtained by using the motion vector information. 
Combining the projection area of each object in the previous frame shown in the third column in the current frame, using the moving object segmentation method based on the matching matrix, the moving object shown in the fourth column can be stably and reliably segmented in the entire sequence.

实验2:序列Mobile具有更复杂的全局运动,除了摄像机的平移和俯仰运动外,在序列的前半段还有明显的缩放运动。图6第1行(序列第4帧)场景中总共包括3个运动对象,小火车推动球在轨道上运动,而挂历在间歇性地上下运动,因此运动对象分割的难度更大。由图6的分割结果可以看出,本发明提出的运动对象分割算法在运动对象停止运动的情况下,能够通过对象投影分割出该运动对象,如图6第2行(序列第43帧)的球以及图6第3行(序列第109帧)的挂历。此外,图6的实验结果也表明了本文的运动对象分割算法能够很好地处理运动对象的合并与分裂。在图6第3行(序列第109帧),由于小火车已经在无缝隙地推着球运动,因此两个在空间上紧密邻接且运动完全一致的运动对象被视作发生了对象合并;在图6第4行(序列第160帧)对有了间隙的两个对象,且运动程度不再相同时,两个对象被分割成了两个区域,真正实现两个对象的分裂。Experiment 2: Sequence Mobile has a more complex global motion. In addition to the pan and tilt motion of the camera, there is also an obvious zoom motion in the first half of the sequence. The scene in row 1 of Figure 6 (frame 4 of the sequence) includes a total of 3 moving objects. The small train pushes the ball to move on the track, while the wall calendar moves up and down intermittently, so the segmentation of moving objects is more difficult. It can be seen from the segmentation result in Fig. 6 that the moving object segmentation algorithm proposed by the present invention can segment the moving object through object projection when the moving object stops moving, as shown in the second line of Fig. 6 (the 43rd frame of the sequence) The ball and the wall calendar in line 3 of Figure 6 (frame 109 of the sequence). In addition, the experimental results in Figure 6 also show that the moving object segmentation algorithm in this paper can handle the merging and splitting of moving objects well. In line 3 of Figure 6 (frame 109 of the sequence), since the train is already pushing the ball seamlessly, two moving objects that are closely adjacent in space and have exactly the same motion are considered to have merged objects; Line 4 in Figure 6 (frame 160 of the sequence) pairs two objects with a gap, and when the motion levels are no longer the same, the two objects are divided into two regions, and the split of the two objects is truly realized.

Claims (6)

1. A real-time moving object segmentation method in the H.264 compressed domain based on a matching matrix, characterized in that: the motion vector fields of consecutive frames are normalized and iteratively backward-projected to obtain an accumulated motion vector field; global motion compensation is then performed on the accumulated motion vector field, while a fast statistical region growing algorithm divides the accumulated motion vector field into multiple regions according to motion similarity; using these two results, the matching-matrix-based moving object segmentation method proposed by the present invention segments the moving objects, effectively handling object tracking and updating, object merging and splitting, and object appearance and disappearance in a video sequence; the steps are as follows:
A. Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in both the temporal and spatial domains;
B. Accumulated motion vector field: use the motion vector fields of consecutive frames with iterative backward projection to obtain a more reliable accumulated motion vector field;
C. Global motion compensation: after global motion estimation on the accumulated motion vector field, compensate to obtain the residual of each 4×4 block;
D. Region segmentation: use the statistical region growing method to divide the accumulated motion vector field into multiple regions with similar motion;
E. Object segmentation: use the matching-matrix-based segmentation method to segment out the moving objects.
2. The method according to claim 1, wherein the motion vector field normalization steps are: (1) temporal normalization: divide each motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance; (2) spatial normalization: assign the motion vector of each block larger than 4×4 directly to all the 4×4 blocks covered by that block.
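The two normalization steps can be sketched as follows, assuming each inter-coded partition is given as a tuple of position, size, motion vector and reference distance (a data layout assumed here for illustration, not specified by the patent):

```python
import numpy as np

def normalize_motion_vectors(partitions, frame_w, frame_h):
    """Sketch of claim 2: temporal and spatial normalization of an
    H.264 motion vector field onto a 4x4 block grid.
    `partitions` is a list of (x, y, w, h, mvx, mvy, ref_dist) tuples,
    where ref_dist is the frame distance to the reference frame."""
    bw, bh = frame_w // 4, frame_h // 4
    field = np.zeros((bh, bw, 2), dtype=np.float32)
    for (x, y, w, h, mvx, mvy, ref_dist) in partitions:
        # (1) temporal normalization: divide by the temporal distance
        nmvx, nmvy = mvx / ref_dist, mvy / ref_dist
        # (2) spatial normalization: assign the partition's vector to
        # every 4x4 block it covers
        for by in range(y // 4, (y + h) // 4):
            for bx in range(x // 4, (x + w) // 4):
                field[by, bx] = (nmvx, nmvy)
    return field
```

For example, a 16×16 macroblock coded with one motion vector two frames back contributes that vector, halved, to all sixteen of its 4×4 blocks.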
3. The method according to claim 1, wherein the steps for the accumulated motion vector field are: (1) using the motion vector fields of several frames after the current frame, backward-project the motion vector field of each adjacent frame: the projected motion vector of a current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing; the scale factors are selected as follows: if the total area of the overlap regions is larger than half the area of the current block, the scale factor of each projecting block is its overlap area with the current block divided by the total overlap area of all projecting blocks with the current block; otherwise, the scale factor of each projecting block is the ratio of its overlap area to the area of the current block; (2) then accumulate iteratively, starting from the last frame, to obtain the accumulated motion vector field of the current frame.
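The scale-factor selection rule of step (1) can be sketched as follows for a single current block (a minimal illustration; the overlap bookkeeping and names are my own):

```python
def weighted_projection_mv(overlaps, block_area):
    """Sketch of the scale-factor rule of claim 3.
    `overlaps` is a list of (mv, overlap_area) pairs for the blocks that
    project onto the current block; returns the projected motion vector
    of the current block as an overlap-weighted sum."""
    total = sum(area for _, area in overlaps)
    if total > block_area / 2:
        # normalize each weight by the total overlap area
        weights = [area / total for _, area in overlaps]
    else:
        # normalize each weight by the current block's own area
        weights = [area / block_area for _, area in overlaps]
    mvx = sum(w * mv[0] for w, (mv, _) in zip(weights, overlaps))
    mvy = sum(w * mv[1] for w, (mv, _) in zip(weights, overlaps))
    return (mvx, mvy)
```

When the projections cover most of the block the weights sum to one (a convex combination); when coverage is sparse the second branch shrinks the result toward zero instead of amplifying unreliable vectors.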
4. The method according to claim 1, wherein the global motion compensation first estimates the global motion vector field using an affine motion model, and then computes the residual of each 4×4 block after global motion compensation; the steps are as follows:
(1) estimate the global motion vector field with a 6-parameter affine motion model:
① model parameter initialization: let m = (m_1, m_2, m_3, m_4, m_5, m_6) be the parameter vector of the global motion model; m^(0) is initialized as

m^(0) = [ 1, 0, (1/N)·Σ_{i=1}^{N} mvx_i, 0, 1, (1/N)·Σ_{i=1}^{N} mvy_i ]^T ;

② outlier rejection: first compute the estimated previous-frame center coordinates (x'_i, y'_i) of the i-th block, whose current-frame center coordinates are (x_i, y_i):

[ x'_i ]   [ m_1  m_2 ] [ x_i ]   [ m_3 ]
[ y'_i ] = [ m_4  m_5 ] [ y_i ] + [ m_6 ] ;

the deviation (ex_i, ey_i) between the motion vector predicted by the model and the original accumulated motion vector (mvx_i, mvy_i) is then

ex_i = x'_i − x_i − mvx_i ,  ey_i = y'_i − y_i − mvy_i ;

using this formula, compute the prediction deviation (ex_i, ey_i) of every 4×4 block, build the histogram of the squared deviation magnitudes ex_i² + ey_i², and reject the motion vectors whose squared deviation magnitude lies in the largest 25% of the histogram;
③ model parameter update: update the model parameters with the motion vectors retained in the previous step using the Newton-Raphson method; the new parameter vector m^(l) at iteration l is defined as m^(l) = m^(l−1) − H^(−1)·b, where the Hessian matrix H and the gradient vector b are computed as

    | Σx_i²    Σx_iy_i  Σx_i  0        0        0    |
    | Σx_iy_i  Σy_i²    Σy_i  0        0        0    |
H = | Σx_i     Σy_i     Σ1    0        0        0    |
    | 0        0        0     Σx_i²    Σx_iy_i  Σx_i |
    | 0        0        0     Σx_iy_i  Σy_i²    Σy_i |
    | 0        0        0     Σx_i     Σy_i     Σ1   |

b = [ Σx_i·ex_i  Σy_i·ex_i  Σex_i  Σx_i·ey_i  Σy_i·ey_i  Σey_i ]^T ,

where all sums run over i ∈ R, and R denotes the set of retained blocks;
④ termination condition: repeat steps ② and ③ at most 5 times, and end the iteration early if either of the following two conditions is satisfied: (i) compute m^(l) − m_static to obtain a difference vector; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends, where m_static = [1 0 0 0 1 0]^T is the global motion parameter vector for a static camera; (ii) compute the difference between m^(l) and m^(l−1); if its components m_3 and m_6 are smaller than 0.01 and the other parameter components are smaller than 0.0001, the iteration ends;
⑤ substitute the resulting global motion model parameter vector m into the projection formula of step ② to obtain the estimated previous-frame coordinates (x'_i, y'_i), and finally obtain the global motion vector field (gmvx_i, gmvy_i) = (x'_i − x_i, y'_i − y_i);
(2) compute the residual between the global motion vector field and the accumulated motion vector field for each 4×4 block.
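Step (1) can be sketched as follows. Because the Hessian H above is block diagonal, the Newton-Raphson update decouples into two independent 3-parameter linear least-squares solves; the array interface and helper names below are my own assumptions:

```python
import numpy as np

def estimate_affine_global_motion(xs, ys, mvx, mvy, max_iter=5):
    """Sketch of claim 4, step (1): 6-parameter affine global motion
    estimation with 25% outlier rejection and Newton-Raphson updates.
    xs, ys are 4x4-block center coordinates; mvx, mvy are the
    accumulated motion vectors (all 1-D arrays of equal length)."""
    # ① initialization: identity transform plus the mean motion vector
    m = np.array([1.0, 0.0, mvx.mean(), 0.0, 1.0, mvy.mean()])
    m_static = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
    keep = np.ones(len(xs), dtype=bool)
    for _ in range(max_iter):
        # ② deviations between the model prediction and observed vectors
        ex = m[0] * xs + m[1] * ys + m[2] - xs - mvx
        ey = m[3] * xs + m[4] * ys + m[5] - ys - mvy
        e2 = ex**2 + ey**2
        keep &= e2 <= np.quantile(e2[keep], 0.75)  # reject largest 25%
        # ③ Newton-Raphson step; H is block diagonal, so the x- and
        # y-parameters are solved independently
        A = np.stack([xs[keep], ys[keep], np.ones(keep.sum())], axis=1)
        H3 = A.T @ A
        m_prev = m.copy()
        m[0:3] -= np.linalg.solve(H3, A.T @ ex[keep])
        m[3:6] -= np.linalg.solve(H3, A.T @ ey[keep])
        # ④ early termination tests
        if np.all(np.abs(m - m_static) < 0.01):
            break  # camera judged static
        d = np.abs(m - m_prev)
        if d[2] < 0.01 and d[5] < 0.01 and np.all(np.delete(d, [2, 5]) < 1e-4):
            break
    return m
```

Since the residual is linear in the parameters, one Newton step is an exact least-squares solve, so the loop converges essentially after the outliers stop changing.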
5. The method according to claim 1, wherein the region segmentation uses the statistical region growing algorithm to divide the accumulated motion vector field into multiple regions with similar motion; the steps are as follows:
(1) compute the motion dissimilarity measure of every pair of adjacent blocks in the 4-neighborhood;
(2) sort all adjacent block pairs in ascending order of motion dissimilarity;
(3) merge the adjacent block pair with the smallest motion dissimilarity first and start the region growing process there; at each growing step the two current blocks belong to two adjacent regions, and the condition for merging these two regions is whether the difference of the average motion vectors of the two regions is smaller than the threshold

Δ(R) = ( SR² / (2Q|R|) ) · ( min(SR, |R|)·log(1+|R|) + 2·log(6wh) ) ,

where SR is the dynamic range of the motion vectors, |R| is the number of motion vectors contained in the region, wh is the size of the motion vector field, and the parameter Q controls the degree of subdivision of the motion vector field, so that the motion vector field is divided moderately into regions with similar motion;
(4) compute the average residual of each segmented region after global motion compensation;
(5) distinguish the most reliable background region from the regions where objects are located: among the segmented regions whose area exceeds 10% of the whole motion vector field, select the region with the smallest average residual as the reliable background region, labeled R_t^B; the remaining regions, where moving objects may exist, are labeled R_t^i; finally the M object regions and 1 background region segmented from the current frame are labeled distinctly, and the segmentation result is denoted R_t = {R_t^B, R_t^1, …, R_t^M}.
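Steps (1)–(3) can be sketched as follows. This is a minimal illustration using a union-find structure and my own reading of the merging predicate (comparing the squared difference of region mean vectors against Δ(R) + Δ(R')); the parameter values in the usage are made up:

```python
import numpy as np

class DSU:
    """Union-find tracking each region's average motion vector and size."""
    def __init__(self, field):
        h, w, _ = field.shape
        self.parent = list(range(h * w))
        self.size = [1] * (h * w)
        self.mean = field.reshape(-1, 2).astype(float).copy()
    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        na, nb = self.size[ra], self.size[rb]
        self.mean[ra] = (na * self.mean[ra] + nb * self.mean[rb]) / (na + nb)
        self.parent[rb] = ra
        self.size[ra] = na + nb

def segment_mv_field(field, SR, Q):
    """Sketch of claim 5, steps (1)-(3): statistical region growing on
    an (h, w, 2) accumulated motion vector field."""
    h, w, _ = field.shape
    wh = h * w
    # (1) dissimilarity of every 4-neighbour block pair
    pairs = []
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                if y + dy < h and x + dx < w:
                    d = np.abs(field[y, x] - field[y + dy, x + dx]).max()
                    pairs.append((d, y * w + x, (y + dy) * w + (x + dx)))
    pairs.sort()                       # (2) ascending dissimilarity
    dsu = DSU(field)
    def delta(n):                      # merging threshold Δ(R)
        return (SR**2 / (2 * Q * n)) * (min(SR, n) * np.log(1 + n)
                                        + 2 * np.log(6 * wh))
    for _, a, b in pairs:              # (3) grow regions in sorted order
        ra, rb = dsu.find(a), dsu.find(b)
        if ra == rb:
            continue
        diff2 = ((dsu.mean[ra] - dsu.mean[rb]) ** 2).max()
        if diff2 <= delta(dsu.size[ra]) + delta(dsu.size[rb]):
            dsu.union(ra, rb)
    return np.array([dsu.find(i) for i in range(wh)]).reshape(h, w)
```

Larger Q merges less and yields a finer partition; smaller Q merges more aggressively, matching the claim's description of Q as the subdivision-control parameter.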
6. The method according to claim 1, wherein the object segmentation uses the moving object segmentation result already obtained for the previous frame (time t−1) to judge whether each segmented region of the current frame (time t) matches some object of the previous frame, and constructs the matching matrix accordingly; based on the matching matrix, the tracking and updating of objects, the merging of objects, the splitting of objects, the appearance of new objects and the disappearance of old objects are judged, finally yielding the moving objects of the current frame; the steps are as follows:
(1) use backward projection to obtain the projection region of each object of the previous frame (time t−1) in the current frame (time t): first label the N moving objects O_{t−1}^1, …, O_{t−1}^N and the background object O_{t−1}^B of the previous frame, then backward-project each object of the previous frame into the current frame: for any block of the current-frame accumulated motion vector field, the difference between its coordinates and its accumulated motion vector gives the matched position of that block in the previous frame; the object label at that matched position in the previous frame is projected to the current frame and labeled block by block, yielding the projection label image P_t;
(2) construct the matrix CM^t, which expresses the overlap area between segmented regions and object projections; construct the matrix CMR^t, which expresses the proportion of each segmented region falling in each object projection; construct the matrix CMC^t, which expresses the proportion of each object projection falling in each segmented region; from the label images R_t and P_t, construct the three matrices CM^t, CMR^t and CMC^t of M+1 rows and N+1 columns: an element CM^t(i, j) is the number of pixels labeled i in R_t and labeled j in P_t, i.e. the overlap area of segmented region R_t^i and object projection P_t^j; each element of row i of CMR^t is the proportion of segmented region R_t^i falling in the corresponding object projection; each element of column j of CMC^t is the proportion of the projection of object O_{t−1}^j falling in the corresponding segmented region;
(3) construct the matrix CMM^t, which expresses the degree of correlation between the segmented regions of the current frame and the object projections; CMM^t records the correlation information between R_t and P_t reflected by CMR^t and CMC^t; CMM^t is first set to the zero matrix of M+1 rows and N+1 columns; then CMR^t is scanned row by row to find the position of each row maximum, and 1 is added to the element of CMM^t at the corresponding position; then CMC^t is scanned column by column to find the position of each column maximum, and 2 is added to the element of CMM^t at the corresponding position; in the generated CMM^t, the rows correspond in turn to the current-frame background region R_t^B and the moving regions R_t^i (i = 1, 2, …, M), and the columns correspond in turn to the previous-frame background object O_{t−1}^B and the moving objects O_{t−1}^i (i = 1, 2, …, N); each element of the matrix may take the value 0, 1, 2 or 3; any nonzero element CMM^t(i, j) indicates that there is some correlation between segmented region R_t^i and object O_{t−1}^j, specifically:
① CMM^t(i, j) = 1 indicates that segmented region R_t^i largely belongs to the previous-frame object O_{t−1}^j;
② CMM^t(i, j) = 2 indicates that the previous-frame object O_{t−1}^j is largely contained in segmented region R_t^i;
③ CMM^t(i, j) = 3 covers both of the above cases at once, indicating that R_t^i and O_{t−1}^j are extremely strongly correlated; a further comparison is needed: if CMR^t(i, j) > CMC^t(i, j), then CMM^t(i, j) = 1; otherwise CMM^t(i, j) = 2; the finally generated CMM^t takes values in {0, 1, 2};
(4) based on the matching matrix CMM^t, perform object segmentation for the five classes of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object; the matrix CMM^t effectively establishes the association between segmented regions and moving objects, and handles the following five situations in a uniform way:
① tracking and updating of a single object (1 → 1);
② appearance of a new object (0 → 1);
③ merging of objects (m → 1);
④ splitting of an object (1 → m);
⑤ disappearance of an object (1 → 0).
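Steps (2) and (3) of the matching-matrix construction can be sketched as follows; the label convention (0 = background in both label images) and the function interface are my own assumptions:

```python
import numpy as np

def matching_matrix(R, P, M, N):
    """Sketch of claim 6, steps (2)-(3): build CM, CMR, CMC and the
    matching matrix CMM from the region label image R (values 0..M,
    0 = background) and the projection label image P (values 0..N)."""
    CM = np.zeros((M + 1, N + 1))
    for i in range(M + 1):
        for j in range(N + 1):
            CM[i, j] = np.count_nonzero((R == i) & (P == j))  # overlap area
    # row proportions: share of region i lying in each projection
    CMR = CM / np.maximum(CM.sum(axis=1, keepdims=True), 1)
    # column proportions: share of projection j lying in each region
    CMC = CM / np.maximum(CM.sum(axis=0, keepdims=True), 1)
    CMM = np.zeros((M + 1, N + 1), dtype=int)
    CMM[np.arange(M + 1), CMR.argmax(axis=1)] += 1   # row maxima of CMR
    CMM[CMC.argmax(axis=0), np.arange(N + 1)] += 2   # column maxima of CMC
    # resolve value 3 by comparing the two proportions, as in the claim
    for i, j in zip(*np.where(CMM == 3)):
        CMM[i, j] = 1 if CMR[i, j] > CMC[i, j] else 2
    return CM, CMR, CMC, CMM
```

A downstream step would then read CMM row by row and column by column to classify each region/object pair into the five tracking situations listed above.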
CN 200610116363 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain Expired - Fee Related CN100486336C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610116363 CN100486336C (en) 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain

Publications (2)

Publication Number Publication Date
CN1960491A CN1960491A (en) 2007-05-09
CN100486336C true CN100486336C (en) 2009-05-06

Family

ID=38071950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610116363 Expired - Fee Related CN100486336C (en) 2006-09-21 2006-09-21 Real time method for segmenting motion object based on H.264 compression domain

Country Status (1)

Country Link
CN (1) CN100486336C (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101237581B (en) * 2008-02-29 2010-11-17 上海大学 A Real-time Video Object Segmentation Method Based on Motion Feature in H.264 Compressed Domain
CN101320085B (en) * 2008-07-21 2012-07-25 哈尔滨工业大学 Ultra-broadband wall-through point target positioning and imaging method based on back-projection algorithm
US20120207217A1 (en) * 2008-12-19 2012-08-16 Thomson Licensing Video coding based on global movement compensation
CN101719979B (en) * 2009-11-27 2011-08-03 北京航空航天大学 Video Object Segmentation Method Based on Memory Compensation of Fixed Interval in Time Domain
CN102196259B (en) * 2010-03-16 2015-07-01 北京中星微电子有限公司 Moving object detection system and method suitable for compression domain
ES2746182T3 (en) 2010-04-13 2020-03-05 Ge Video Compression Llc Prediction between planes
CN106454373B (en) 2010-04-13 2019-10-01 Ge视频压缩有限责任公司 Decoder, method, encoder and the coding method for rebuilding array
EP4398576B1 (en) 2010-04-13 2025-09-17 GE Video Compression, LLC Video coding using multi-tree sub-divisions of images
CN106303522B9 (en) * 2010-04-13 2020-01-31 Ge视频压缩有限责任公司 Decoder and method, encoder and method, data stream generating method
CN102123234B (en) * 2011-03-15 2012-09-05 北京航空航天大学 Unmanned airplane reconnaissance video grading motion compensation method
CN102333213A (en) * 2011-06-15 2012-01-25 夏东 H.264 compressed domain moving object detection algorithm under complex background
CN102917224B (en) * 2012-10-18 2015-06-17 北京航空航天大学 Mobile background video object extraction method based on novel crossed diamond search and five-frame background alignment
CN103198297B (en) * 2013-03-15 2016-03-30 浙江大学 Based on the kinematic similarity assessment method of correlativity geometric properties
CN104125430B (en) * 2013-04-28 2017-09-12 华为技术有限公司 Video moving object detection method, device and video monitoring system
CN104683803A (en) * 2015-03-24 2015-06-03 江南大学 A Moving Object Detection and Tracking Method in Compressed Domain
HK1203289A2 (en) * 2015-07-07 2015-10-23 香港生产力促进局 A method and a device for detecting moving object
CN108965869B (en) * 2015-08-29 2023-09-12 华为技术有限公司 Image prediction methods and equipment
CN105931274B (en) * 2016-05-09 2019-02-15 中国科学院信息工程研究所 A Fast Object Segmentation and Tracking Method Based on Motion Vector Trajectory
CN108574846B (en) * 2018-05-18 2019-03-08 中南民族大学 A kind of video compress domain method for tracking target and system
CN109389031B (en) * 2018-08-27 2021-12-03 浙江大丰实业股份有限公司 Automatic positioning mechanism for performance personnel
CN114567781A (en) * 2020-11-27 2022-05-31 安徽寒武纪信息科技有限公司 Method, device, electronic equipment and storage medium for coding and decoding video image
CN112990273B (en) * 2021-02-18 2021-12-21 中国科学院自动化研究所 Video-sensitive person recognition method, system and device for compressed domain

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A flexible and variable segmentation algorithm for moving images. Zhao Yanling et al. Engineering Science (China), Vol. 8, No. 5. 2006 *


Similar Documents

Publication Publication Date Title
CN100486336C (en) Real time method for segmenting motion object based on H.264 compression domain
CN100538743C (en) Understanding video content through real-time video motion analysis
US8285045B2 (en) Image analysis method, medium and apparatus and moving image segmentation system
US7986810B2 (en) Mesh based frame processing and applications
Smith et al. Layered motion segmentation and depth ordering by tracking edges
Zhong et al. Video object model and segmentation for content-based video indexing
Zhong et al. Spatio-temporal video search using the object based video representation
US7142602B2 (en) Method for segmenting 3D objects from compressed videos
Bebeselea-Sterp et al. A comparative study of stereovision algorithms
Philip et al. A comparative study of block matching and optical flow motion estimation algorithms
Porikli et al. Compressed domain video object segmentation
Toklu et al. Simultaneous alpha map generation and 2-D mesh tracking for multimedia applications
Huang et al. Automatic feature-based global motion estimation in video sequences
Gao et al. Shot-based video retrieval with optical flow tensor and HMMs
Morand et al. Scalable object-based video retrieval in hd video databases
CN101600106A (en) A global motion estimation method and device
Gu et al. Tracking of multiple semantic video objects for internet applications
KR20010011348A (en) Recording medium and method for constructing and retrieving a data base of a mpeg video sequence by using a object
Choo et al. Scene mapping-based video registration using frame similarity measurement and feature tracking
Chen et al. Progressive motion vector clustering for motion estimation and auxiliary tracking
Yongsheng et al. A Survey on Content based video retrival
Felip et al. Robust dominant motion estimation using MPEG information in sport sequences
Cuevas et al. Temporal segmentation tool for high-quality real-time video editing software
Vo et al. Precise estimation of motion vectors and its application to MPEG video retrieval
Smeaton et al. Coherent segmentation of video into syntactic regions

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090506

Termination date: 20110921