CN100486336C - Real time method for segmenting motion object based on H.264 compression domain - Google Patents
- Publication number
- CN100486336C (application CN200610116363A)
- Authority
- CN
- China
- Prior art keywords
- motion vector
- motion
- vector field
- sigma
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Analysis (AREA)
Abstract
The invention proposes a matching-matrix-based method for real-time segmentation of moving objects in the H.264 compressed domain. The only information the segmentation relies on is the motion vector field, uniformly sampled on 4×4 blocks, extracted from the H.264 video. First, the motion vector fields of several consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field that enhances salient motion information. Global motion compensation is then performed on the accumulated motion vector field, and a fast statistical region-growing algorithm partitions it into multiple regions according to motion similarity. Combining these two results, a matching-matrix-based moving object segmentation method is proposed that effectively handles object tracking and updating, merging and splitting of objects, the appearance of new objects, and the disappearance of objects in video sequences. Experimental results on MPEG-4 test sequences show that, on a computer with a 3.0 GHz CPU and 512 MB of memory, processing CIF-format video takes 38 ms per frame on average, which meets the 25 fps requirement of most real-time applications while maintaining good segmentation quality. Since the proposed method uses only motion vector field information, it is equally applicable to moving object segmentation in the MPEG compressed domain.
Description
Technical Field
The present invention relates to a method for real-time segmentation of moving objects in the H.264 compressed domain. In sharp contrast to existing methods, it dispenses with full decoding of the compressed video: only the motion vectors extracted by entropy decoding are used as the motion features required for segmentation, so the amount of computation is greatly reduced. Moreover, the method is not limited to video sequences with static backgrounds; it can quickly and reliably segment moving objects from sequences with either moving or static backgrounds. Since the method uses only motion vector field information, it is equally applicable to moving object segmentation in the MPEG compressed domain.
Background Art
Moving object segmentation is a prerequisite for many content-based multimedia applications such as video indexing and retrieval, intelligent video surveillance, video editing, and face recognition. Since content-based video coding was introduced by MPEG-4, most research on moving object segmentation has focused on the pixel domain, while segmentation in the compressed domain has only begun to attract attention in recent years. Segmenting moving objects in the compressed domain suits the needs of practical applications better than pixel-domain methods. In particular, most video sequences in practice are already compressed into some format, and performing segmentation directly in that compressed domain avoids fully decoding the video; moreover, the amount of data to be processed in the compressed domain is much smaller than in the pixel domain, so the computational cost drops sharply; in addition, the motion vectors and DCT coefficients extracted from the compressed video by entropy decoding alone can be used directly as the motion and texture features required for segmentation. Compressed-domain segmentation is therefore fast, overcomes the difficulty traditional pixel-domain methods have in meeting real-time requirements, and suits the many applications that demand real-time operation.
At present, although compressed-domain moving object segmentation methods have been proposed, they basically target the MPEG-2 compressed domain. H.264 is the latest video coding standard, roughly doubling the coding efficiency of MPEG-2, and more and more applications are switching from MPEG-2 to H.264; yet research on moving object segmentation in the H.264 compressed domain remains scarce. Unlike the MPEG compressed domain, the DCT coefficients of I-frames in the H.264 compressed domain cannot be used directly as texture features for segmentation, because the transform is applied to the spatial prediction residual of each block rather than to the original block. The only feature directly usable for moving object segmentation in the H.264 domain is therefore the motion vector information. So far, only Zeng et al. have proposed a method in the H.264 domain, using a block-based MRF model to segment moving objects from the sparse motion vector field: each block is labeled according to the magnitude of its motion vector, and the blocks belonging to moving objects are identified by maximizing the posterior probability of the MRF. However, that method applies only to sequences with static backgrounds, its segmentation accuracy is limited, and its computational cost is relatively high.
Summary of the Invention
The object of the present invention is to provide a real-time segmentation method for moving objects based on the H.264 compressed domain, in which the only information used for segmentation is the motion vector field, uniformly sampled on 4×4 blocks, extracted from the H.264 compressed video. The method dispenses with full decoding of the compressed video and uses the motion vectors extracted by entropy decoding as the motion features required for segmentation, so the amount of computation is greatly reduced and real-time moving object segmentation becomes achievable.
To achieve the above object, the concept of the present invention is as follows:
As shown in Fig. 1, motion vectors are extracted from the input H.264 compressed video stream and normalized, then iteratively back-projected to obtain an accumulated motion vector field that markedly enhances the motion information. Global motion compensation is then performed, and the accumulated motion vector field is partitioned into multiple regions according to motion similarity; a matching matrix then represents the correlation between the segmented regions of the current frame and the projections of the previous frame's moving objects into the current frame, and the moving objects are segmented on the basis of this matching matrix.
According to the above concept, the technical solution of the present invention is:
A method for real-time segmentation of moving objects based on the H.264 compressed domain, characterized in that the motion vector fields of consecutive frames are normalized and iteratively back-projected to obtain an accumulated motion vector field; global motion compensation is then performed on the accumulated motion vector field, while a fast statistical region-growing algorithm partitions it into multiple regions according to motion similarity; using these two results, the matching-matrix-based moving object segmentation method proposed by the present invention segments the moving objects, effectively handling object tracking and updating, merging and splitting of objects, and the appearance and disappearance of objects in video sequences. The steps are:
a. Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;
b. Motion vector field accumulation: iteratively back-project the motion vector fields of several consecutive frames to obtain a more reliable accumulated motion vector field;
c. Global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;
d. Region segmentation: partition the accumulated motion vector field into multiple regions with similar motion using the statistical region-growing method;
e. Object segmentation: segment the moving objects with the matching-matrix-based segmentation method.
The steps of the above motion vector field normalization are:
(1) Temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance;
(2) Spatial normalization: assign the motion vector of each macroblock partition whose size is larger than 4×4 directly to all 4×4 blocks covered by that partition.
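The two normalization steps above can be sketched in a few lines of Python. The data layout and all names below are illustrative assumptions, not from the patent: each entry describes one H.264 partition by its origin and size in 4×4-block units, its motion vector, and its temporal distance to the reference frame.

```python
import numpy as np

def normalize_mv_field(partitions, frame_w4, frame_h4):
    """Expand per-partition H.264 motion vectors onto a uniform 4x4-block grid.

    partitions: list of (x4, y4, w4, h4, mvx, mvy, ref_dist) tuples, where
    (x4, y4) is the partition origin and (w4, h4) its size, in 4x4-block
    units, and ref_dist is the frame distance to the reference frame.
    """
    field = np.zeros((frame_h4, frame_w4, 2), dtype=np.float32)
    for x4, y4, w4, h4, mvx, mvy, ref_dist in partitions:
        # Temporal normalization: divide by the distance to the reference frame.
        mv = np.array([mvx, mvy], dtype=np.float32) / ref_dist
        # Spatial normalization: copy the vector to every covered 4x4 block.
        field[y4:y4 + h4, x4:x4 + w4] = mv
    return field

# A 16x16 macroblock (4 blocks on each side) predicted 2 frames back:
f = normalize_mv_field([(0, 0, 4, 4, 8.0, -4.0, 2)], frame_w4=8, frame_h4=8)
```

After normalization every 4×4 block of the covered macroblock carries the vector (4.0, −2.0), and uncovered blocks stay zero.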
The steps of accumulating the motion vector field described above are:
(1) Using the motion vector fields of several frames after the current frame, back-project the motion vector field of each adjacent frame: the projected motion vector of the current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing the results. The scale factors are chosen as follows: if the total area of the overlapping regions is greater than half of the current block's area, the scale factor of each projecting block is its overlap area with the current block divided by the total overlap area of all projecting blocks; otherwise, the scale factor of each projecting block is the ratio of its overlap area to the current block's area;
(2) Iterate the accumulation starting from the last frame to obtain the accumulated motion vector field of the current frame.
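The scale-factor rule of step (1) can be sketched as follows. This is a minimal illustration with assumed names; areas are measured in pixels, so a 4×4 block has area 16, and the overlap bookkeeping that produces the `(area, mv)` pairs is taken as given.

```python
def project_block_mv(overlaps, block_area=16):
    """Combine the motion vectors of the blocks that project onto the current
    block, weighted by overlap area according to the rule above.

    overlaps: list of (overlap_area, (mvx, mvy)) pairs, one per projecting block.
    """
    total = sum(area for area, _ in overlaps)
    mvx = mvy = 0.0
    for area, (vx, vy) in overlaps:
        # Overlaps cover more than half the block: normalize by the total
        # overlap area; otherwise normalize by the full block area.
        scale = area / total if total > block_area / 2 else area / block_area
        mvx += scale * vx
        mvy += scale * vy
    return mvx, mvy

# Two projecting blocks, each covering half of the current 4x4 block:
mv = project_block_mv([(8, (2.0, 0.0)), (8, (4.0, 0.0))])
```

Here the total overlap (16) exceeds half the block area, so both projecting blocks get weight 8/16 and the result is the average (3.0, 0.0); with a single small overlap of area 4, the weight would instead be 4/16.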
The steps of the above global motion compensation are:
(1) Estimate the global motion vector field with a 6-parameter affine motion model:
① Model parameter initialization: let m = (m1, m2, m3, m4, m5, m6) be the parameter vector of the global motion model; m(0) is initialized to the static-camera parameter vector mstatic = [1 0 0 0 1 0]T;
② Outlier rejection: first compute, for the i-th block whose center coordinates in the current frame are (xi, yi), its estimated center coordinates in the previous frame under the current model; blocks whose motion vectors deviate too strongly from the model prediction are rejected as outliers, and only the remaining motion vectors are retained;
③ Model parameter update: the motion vectors retained in the previous step and the Newton-Raphson method are used to update the model parameters. The new model parameter vector m(l) at iteration l is defined as m(l) = m(l−1) − H−1b, where the Hessian matrix H and the gradient vector b are accumulated over the retained blocks; here R denotes the set of retained blocks;
④ Termination condition: steps ② and ③ are repeated at most 5 times, and the iteration also terminates early if either of the following two conditions is satisfied: (i) compute the difference vector m(l) − mstatic; if every parameter component of this difference vector is smaller than 0.01, the camera is judged to be static and the iteration ends, where mstatic = [1 0 0 0 1 0]T is the global motion parameter vector for a static camera; (ii) compute the difference between m(l) and m(l−1); if the components m3 and m6 of this difference are smaller than 0.01 and the other components are smaller than 0.0001, the iteration ends;
⑤ Substitute the obtained global motion model parameter vector m into the affine model to compute the global motion vector of each block.
(2) Compute the residual of each 4×4 block between the global motion vector field and the accumulated motion vector field.
The above region segmentation uses a statistical region-growing algorithm to partition the accumulated motion vector field into multiple regions with similar motion, as follows:
(1) Compute the motion difference measure of every pair of adjacent blocks in the 4-neighborhood;
(2) Sort all adjacent block pairs by motion difference measure in ascending order;
(3) Merge the pair of adjacent blocks with the smallest motion difference measure and start the region-growing process from there. At each growing step, if the two current blocks belong to two different adjacent regions, the condition for merging those regions is that the difference between their average motion vectors is smaller than a threshold;
(4) Compute the average residual of each segmented region after global motion compensation;
(5) Distinguish the most reliable background region from the regions where other objects may reside: among the segmented regions whose area exceeds 10% of the entire motion vector field, select the region with the smallest average residual as the reliable background region; the remaining regions are treated as regions where moving objects may exist. Finally, the M object regions and the single background region segmented in the current frame are labeled separately to form the segmentation result.
The above object segmentation uses the moving object segmentation result already obtained for the previous frame (time t−1) to judge whether each segmented region of the current frame (time t) matches some object of the previous frame, and constructs a matching matrix from these correspondences; based on the matching matrix, object tracking and updating, merging of objects, splitting of objects, the appearance of new objects, and the disappearance of old objects are detected, and the moving objects of the current frame are finally obtained. The steps are:
(1) Obtain, by backward projection, the region in the current frame (time t) onto which each object of the previous frame (time t−1) projects. First the N moving objects and the single background object of the previous frame are labeled; then the projection region of each previous-frame object in the current frame is obtained by backward projection: the coordinates of any block in the accumulated motion vector field of the current frame, minus its accumulated motion vector, give the matching position of this block in the previous frame, and the object label at that matching position in the previous frame is projected onto the current frame and recorded block by block.
(2) Construct the matrix CMt, whose entries give the overlap area between each segmented region and each object projection; the matrix CMRt, which gives the proportion of each segmented region that falls within each object projection; and the matrix CMCt, which gives the proportion of each object projection that falls within each segmented region. From the two label maps, three matrices CMt, CMRt, CMCt of M+1 rows and N+1 columns are built. An entry CMt(i, j) is the number of pixels labeled i in the current-frame segmentation and labeled j in the object projection, i.e. the overlap area of segmented region i and projected object j. Each element in row i of CMRt is the proportion of segmented region i falling within each object projection, and each element in column j of CMCt is the proportion of the projection of object j falling within each segmented region.
(3) Construct the matrix CMMt, which expresses the degree of association between the segmented regions of the current frame and the object projections; CMMt records the correlation information reflected by CMRt and CMCt. CMMt is first initialized as a zero matrix of M+1 rows and N+1 columns. CMRt is then scanned row by row to find the position of the maximum in each row, and 1 is added to the element at the corresponding position in CMMt; CMCt is scanned column by column to find the position of the maximum in each column, and 2 is added to the element at the corresponding position in CMMt. The rows of the resulting matrix CMMt correspond, in order, to the background region and the moving regions of the current frame, and its columns to the background object and the moving objects of the previous frame; each element can take the value 0, 1, 2 or 3. Any nonzero element CMMt(i, j) indicates that segmented region i and object j are correlated to some degree. Specifically:
① CMMt(i, j) = 1 indicates that segmented region i largely belongs to object j of the previous frame;
② CMMt(i, j) = 2 indicates that object j of the previous frame is largely contained in segmented region i;
③ CMMt(i, j) = 3 covers both of the above cases and indicates that region i and object j are very strongly correlated; a further comparison is then needed: if CMRt(i, j) > CMCt(i, j), CMMt(i, j) is set to 1, otherwise to 2. The final CMMt therefore takes values in {0, 1, 2}.
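The construction of CMt, CMRt, CMCt and CMMt described above can be sketched directly with NumPy. The function name and the label-map encoding (one integer label per 4×4 block, 0 for background) are assumptions for illustration:

```python
import numpy as np

def build_matching_matrix(seg_labels, proj_labels, M, N):
    """Build CM (overlap counts), CMR (row-normalized), CMC (column-normalized)
    and the match matrix CMM, following the construction described above.

    seg_labels:  region labels of the current frame, values 0..M (0 = background).
    proj_labels: previous-frame object labels projected into the current frame,
                 values 0..N (0 = background object).
    """
    CM = np.zeros((M + 1, N + 1), dtype=np.float64)
    for i, j in zip(seg_labels.ravel(), proj_labels.ravel()):
        CM[i, j] += 1  # overlap area of region i and projected object j
    row = CM.sum(axis=1, keepdims=True)
    col = CM.sum(axis=0, keepdims=True)
    CMR = np.divide(CM, row, out=np.zeros_like(CM), where=row > 0)
    CMC = np.divide(CM, col, out=np.zeros_like(CM), where=col > 0)
    CMM = np.zeros_like(CM, dtype=np.int64)
    CMM[np.arange(M + 1), CMR.argmax(axis=1)] += 1   # row maxima of CMR
    CMM[CMC.argmax(axis=0), np.arange(N + 1)] += 2   # column maxima of CMC
    # Resolve the "3" case by the stronger of the two ratios.
    both = CMM == 3
    CMM[both] = np.where(CMR[both] > CMC[both], 1, 2)
    return CM, CMR, CMC, CMM

# Tiny 2x2 field: region 1 exactly overlaps projected object 1.
seg = np.array([[0, 0], [1, 1]])
proj = np.array([[0, 0], [1, 1]])
CM, CMR, CMC, CMM = build_matching_matrix(seg, proj, M=1, N=1)
```

In this toy case CMR(1, 1) = CMC(1, 1) = 1, so the entry first becomes 3 and the tie-break rule resolves it to 2.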
(4) Based on the matching matrix CMMt, object segmentation is carried out for five kinds of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object. The matrix CMMt effectively establishes the association between segmented regions and moving objects and handles the following five cases in a unified way:
① tracking and updating of a single object (1→1);
② appearance of a new object (0→1);
③ merging of objects (m→1);
④ splitting of an object (1→m);
⑤ disappearance of an object (1→0).
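One hypothetical reading of these cases as decision rules over CMMt is sketched below. This is a simplification of the patent's procedure: the merging case (several previous objects sharing one region) is not distinguished from plain tracking here, and all names are illustrative.

```python
def classify_objects(CMM):
    """Given the match matrix CMM (rows: current-frame regions, columns:
    previous-frame objects, index 0 = background), classify each previous
    object and detect new regions. Illustrative rules only."""
    n_rows, n_cols = len(CMM), len(CMM[0])
    events = {}
    for j in range(1, n_cols):                      # previous-frame objects
        linked = [i for i in range(1, n_rows) if CMM[i][j] != 0]
        if not linked:
            events[j] = "disappeared"               # 1 -> 0
        elif len(linked) == 1:
            events[j] = "tracked"                   # 1 -> 1
        else:
            events[j] = "split"                     # 1 -> m
    # A region linked to no object at all corresponds to a new object (0 -> 1).
    new_regions = [i for i in range(1, n_rows)
                   if all(CMM[i][j] == 0 for j in range(n_cols))]
    return events, new_regions

# Region 1 tracks object 1; object 2 vanished; region 2 is new.
ev, new = classify_objects([[2, 0, 0],
                            [0, 1, 0],
                            [0, 0, 0]])
```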
Compared with the prior art, the present invention has the following outstanding features and advantages. The compressed-domain real-time moving object segmentation method provided by the invention operates on H.264 video streams; in sharp contrast, existing compressed-domain video object segmentation methods are mainly applicable to the MPEG domain, whereas the present invention applies not only to the H.264 compressed domain but equally to the MPEG compressed domain. Moreover, the invention is not limited to video sequences with static backgrounds: it can quickly and reliably segment moving objects from sequences with either moving or static backgrounds. In addition, the matching-matrix method for segmenting moving objects proposed here can handle virtually all kinds of object motion in real time, so the segmentation quality is high and the applicability is broad.
Brief Description of the Drawings
Fig. 1 is the block diagram of the matching-matrix-based real-time segmentation method for moving objects in the H.264 compressed domain according to the present invention.
Fig. 2 is the block diagram of motion vector field normalization and motion vector field accumulation in Fig. 1.
Fig. 3 is the block diagram of global motion compensation and region segmentation in Fig. 1.
Fig. 4 is the block diagram of object segmentation in Fig. 1.
Fig. 5 illustrates the moving object segmentation results for typical frames (frames 4, 37, 61 and 208) of the sequence Coastguard.
Fig. 6 illustrates the moving object segmentation results for typical frames (frames 4, 43, 109 and 160) of the sequence Mobile.
Detailed Description of the Embodiments
An embodiment of the present invention is described in detail below with reference to the accompanying drawings.
The matching-matrix-based real-time segmentation method for moving objects in the H.264 compressed domain follows the block diagram shown in Fig. 1 and was implemented in software on a PC test platform with a 3.0 GHz CPU and 512 MB of memory; Figs. 5 and 6 show the simulation test results.
Referring to Fig. 1, the method enhances salient motion information through normalization and accumulation of the motion vector field, then performs global motion compensation on the accumulated motion vector field, and segments regions and moving objects with a statistical region-growing algorithm and a matching-matrix-based moving object segmentation method. It is characterized by a simple algorithm, fast object segmentation, and good segmentation quality.
The steps are:
(1) Motion vector field normalization: extract the motion vector field from the H.264 video and normalize it in the temporal and spatial domains;
(2) Motion vector field accumulation: iteratively back-project the motion vector fields of several consecutive frames to obtain a more reliable accumulated motion vector field;
(3) Global motion compensation: perform global motion estimation on the accumulated motion vector field and then compensate to obtain the residual of each 4×4 block;
(4) Region segmentation: partition the accumulated motion vector field into multiple regions with similar motion using the statistical region-growing method;
(5) Object segmentation: segment the moving objects with the matching-matrix-based segmentation algorithm.
The motion vector field normalization in step (1) proceeds as follows:
① Temporal normalization: divide the motion vector of the current frame by the number of frames between the current frame and its reference frame, i.e. the temporal distance;
② Spatial normalization: assign the motion vector of each macroblock partition whose size is larger than 4×4 directly to all 4×4 blocks covered by that partition.
The motion vector field accumulation in step (2) proceeds as follows:
① Using the motion vector fields of several frames after the current frame, back-project the motion vector field of each adjacent frame;
② Iterate the accumulation starting from the last frame to obtain the accumulated motion vector field of the current frame.
The global motion compensation in step (3) proceeds as follows:
① Estimate the global motion vector field with a 6-parameter affine motion model;
② Compute the residual of each 4×4 block after global motion compensation.
The region segmentation in step (4) proceeds as follows:
① Compute the motion difference measure of every pair of adjacent blocks in the 4-neighborhood;
② Sort all adjacent block pairs by motion difference measure in ascending order;
③ Merge the pair of adjacent blocks with the smallest motion difference measure and start the region-growing process from there;
④ Compute the average residual of each segmented region after global motion compensation;
⑤ Distinguish the most reliable background region from the regions where other objects are located.
The object segmentation in step (5) proceeds as follows:
① Obtain, by backward projection, the projection region of each previous-frame object in the current frame;
② Construct the matrix CMt, giving the overlap area between each segmented region and each projected object; the matrix CMRt, giving the proportion of each segmented region falling within each object projection; and the matrix CMCt, giving the proportion of each object projection falling within each segmented region;
③ Construct the matrix CMMt, which expresses the degree of association between the segmented regions of the current frame and the object projections;
④ Based on the matching matrix CMMt, perform object segmentation for the five kinds of situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object.
The five steps of this embodiment are now described in further detail with reference to the overall block diagram (Fig. 1):
a. Motion vector field normalization:
As shown in Fig. 2, temporal normalization divides the motion vector of the current frame by the number of frames between the current frame and its reference frame, and spatial normalization assigns the motion vector of every block larger than 4×4 in the current frame directly to all the 4×4 blocks it covers.
b. Motion vector field accumulation:
As shown in Fig. 2, the motion vector fields of several frames after the current frame are first used to back-project the motion vector field of each adjacent frame: the projected motion vector of the current block is obtained by multiplying the motion vector of each projecting block by a scale factor and summing the results. The scale factors are chosen as follows: if the total area of the overlapping regions is greater than half of the current block's area, the scale factor of each projecting block is its overlap area with the current block divided by the total overlap area of all projecting blocks; otherwise, the scale factor of each projecting block is the ratio of its overlap area to the current block's area. The accumulation then iterates from the last frame to obtain the accumulated motion vector field of the current frame.
c.全局运动补偿:c. Global motion compensation:
如图3所示,采用6参数的仿射运动模型估算全局运动矢量场,利用它与累积运动矢量场之差就可获得累积运动矢量场任意块经全局运动补偿后的残差。其步骤是:As shown in Figure 3, a 6-parameter affine motion model is used to estimate the global motion vector field, and the difference between it and the cumulative motion vector field can be used to obtain the residual error of any block in the cumulative motion vector field after global motion compensation. The steps are:
(1)采用6参数的仿射运动模型估算全局运动矢量场:(1) Estimate the global motion vector field using an affine motion model with 6 parameters:
①模型参数初始化:①Model parameter initialization:
设m=(m1,m2,m3,m4,m5,m6)是全局运动模型的参数矢量,模型参数m(0)初始化为:
②剔除局外点:② Eliminate outliers:
首先计算当前帧中心坐标为(xi,yi)的第i个块在前一帧的估计中心坐标
③ Model parameter update: the motion vectors retained in the preceding step and the Newton-Raphson method are used to update the model parameters. The new parameter vector at iteration l is defined as m(l) = m(l-1) - H⁻¹b, where the Hessian matrix H and the gradient vector b are computed over R, the set of retained blocks.
④ Termination: steps ② and ③ are repeated at most 5 times; the iteration also ends early if either of the following two conditions is met:
(i) Compute the difference vector m(l) - m_static; if every component of the difference is smaller than 0.01, the camera is judged to be static and the iteration ends. Here m_static = [1 0 0 0 1 0]ᵀ is the global motion parameter vector of a static camera.
(ii) Compute the difference between m(l) and m(l-1); if the components m3 and m6 of this difference are smaller than 0.01 and the remaining components are smaller than 0.0001, the iteration ends.
⑤ Substitute the resulting global motion model parameter vector m into the affine model to obtain the global motion vector field.
(2) Compute the residual between the global motion vector field and the accumulated motion vector field for each 4×4 block.
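The model fit and residual computation can be sketched as follows. The patent refines the parameters with Newton-Raphson iterations and outlier rejection; this illustration substitutes a plain least-squares solve for the same 6-parameter affine model, so it is a simplified stand-in rather than the patented procedure:

```python
import numpy as np

def fit_affine_global_motion(centers, mvs):
    """Least-squares fit of the 6-parameter affine model.

    centers: (K, 2) block centres (x, y); mvs: (K, 2) motion vectors.
    Assumed model, consistent with m_static = [1 0 0 0 1 0]^T:
        x' = m1*x + m2*y + m3,   y' = m4*x + m5*y + m6,
    so the global motion vector at (x, y) is (x' - x, y' - y).
    """
    x, y = centers[:, 0], centers[:, 1]
    A = np.stack([x, y, np.ones_like(x)], axis=1)   # K x 3 design matrix
    tx = x + mvs[:, 0]                              # observed target x'
    ty = y + mvs[:, 1]                              # observed target y'
    m123, *_ = np.linalg.lstsq(A, tx, rcond=None)
    m456, *_ = np.linalg.lstsq(A, ty, rcond=None)
    return np.concatenate([m123, m456])             # (m1, ..., m6)

def global_mv(m, centers):
    """Global motion vectors predicted by parameter vector m."""
    x, y = centers[:, 0], centers[:, 1]
    gx = m[0] * x + m[1] * y + m[2] - x
    gy = m[3] * x + m[4] * y + m[5] - y
    return np.stack([gx, gy], axis=1)
```

Under this model a static camera (all motion vectors zero) yields m ≈ m_static, and the per-block residual of step (2) is simply the accumulated motion vector minus `global_mv(m, centers)`.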
d. Region segmentation:
As shown in FIG. 3, the present invention uses a statistical region growing algorithm to segment the accumulated motion vector field into regions. The steps are detailed below:
(1) Compute the motion dissimilarity measure of every pair of 4-neighbouring blocks;
(2) Sort all neighbouring block pairs in ascending order of the motion dissimilarity measure;
(3) Merge the block pair with the smallest motion dissimilarity and start the region growing process from there. At each growing step, when the two current blocks belong to two adjacent regions, the condition for merging the two regions is that the difference between their mean motion vectors is below a threshold.
(4) Compute the average residual of each segmented region after global motion compensation;
(5) Distinguish the most reliable background region from the regions occupied by other objects. Among the segmented regions whose area exceeds 10% of the entire motion vector field, the region with the smallest average residual is selected as the reliable background region; the remaining regions are regions where moving objects may exist. Finally, the M object regions and the one background region segmented from the current frame are labelled separately, giving the segmentation result.
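Steps (1)-(3) can be sketched with a union-find over edges sorted by motion difference. The threshold `tau` and the exact merging predicate are assumptions for illustration; the patent's statistical merging condition is given only as a formula not reproduced here:

```python
import numpy as np

def segment_mv_field(mvf, tau=1.0):
    """Region segmentation of a motion vector field by motion similarity.

    mvf: H x W x 2 array of (accumulated) block motion vectors.
    Edges between 4-neighbours are sorted by motion difference, and two
    regions are merged when their mean motion vectors differ by less
    than tau (a hypothetical threshold). Returns an H x W label map.
    """
    H, W, _ = mvf.shape
    n = H * W
    parent = list(range(n))
    mv_sum = mvf.reshape(n, 2).astype(float)   # running per-region MV sum
    size = np.ones(n)

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]      # path halving
            a = parent[a]
        return a

    # (1) motion dissimilarity of every 4-neighbouring block pair
    edges = []
    for i in range(H):
        for j in range(W):
            for di, dj in ((0, 1), (1, 0)):
                if i + di < H and j + dj < W:
                    w = np.linalg.norm(mvf[i, j] - mvf[i + di, j + dj])
                    edges.append((w, i * W + j, (i + di) * W + j + dj))
    edges.sort()                               # (2) ascending dissimilarity

    # (3) grow regions, merging when mean motion vectors are close
    for _, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        if np.linalg.norm(mv_sum[ra] / size[ra] - mv_sum[rb] / size[rb]) < tau:
            parent[rb] = ra
            mv_sum[ra] += mv_sum[rb]
            size[ra] += size[rb]

    return np.array([find(k) for k in range(n)]).reshape(H, W)
```

On a field whose left half moves with one vector and whose right half moves with a clearly different one, the sketch returns exactly two regions, matching the intended behaviour of steps (1)-(3).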
e. Object segmentation
As shown in FIG. 4, matching blocks in two adjacent frames are first found by computation; the moving objects of the previous frame are then projected into the current frame and labelled as object projections. From the correlation between the object projections and the segmented regions of the current frame, three (M+1)-row, (N+1)-column matrices CMt, CMRt and CMCt are constructed. The matching matrix CMMt is then generated from CMRt and CMCt, and segmentation decisions for five different moving-object situations are made on its basis. The steps are detailed below:
(1) Obtain, by backward projection, the projection region at current time t of each object of the previous frame (time t-1). First, the N moving objects and the one background object of the previous frame are labelled; then backward projection gives the projection region of each previous-frame object in the current frame: the difference between the coordinates of any block in the current accumulated motion vector field and its accumulated motion vector gives the block's matching position in the previous frame, and the object label of the previous-frame block at that position is projected into the current frame and recorded block by block.
(2) Construct the matrix CMt, whose entries are the overlap areas between the segmented regions and the object projections; the matrix CMRt, whose entries are the fractions of each segmented region falling inside each object projection; and the matrix CMCt, whose entries are the fractions of each object projection falling inside each segmented region. From the two label images, three (M+1)-row, (N+1)-column matrices CMt, CMRt and CMCt are built. The element CMt(i, j) is the number of pixels labelled i in the segmentation and j in the object projection, i.e. the overlap area of segmented region i and object projection j. The elements of row i of CMRt are the fractions of segmented region i that lie inside each object projection; the elements of column j of CMCt are the fractions of object projection j that lie inside each segmented region.
(3) Construct the matrix CMMt, which expresses the degree of association between the segmented regions of the current frame and the object projections, combining the information reflected in CMRt and CMCt. CMMt is first set to the (M+1)-row, (N+1)-column zero matrix. Each row of CMRt is then scanned to find the position of its maximum, and 1 is added to the corresponding element of CMMt; each column of CMCt is scanned to find the position of its maximum, and 2 is added to the corresponding element of CMMt. The rows of the resulting matrix correspond, in order, to the background region and the moving regions of the current frame; the columns correspond to the background object and the moving objects of the previous frame. Each element takes a value in {0, 1, 2, 3}, and any non-zero element CMMt(i, j) indicates a certain correlation between segmented region i and object j. Specifically:
① CMMt(i, j) = 1: segmented region i belongs largely to object j of the previous frame;
② CMMt(i, j) = 2: object j of the previous frame is largely contained in segmented region i;
③ CMMt(i, j) = 3: both of the above hold, indicating an extremely strong correlation between region i and object j. A further comparison is then made: if CMRt(i, j) > CMCt(i, j), CMMt(i, j) is set to 1; otherwise it is set to 2. The final CMMt therefore takes values in {0, 1, 2}.
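Steps (2)-(3) translate naturally into a few lines of matrix code. The sketch below is illustrative: it takes per-pixel label maps as inputs (the patent works on 4×4-block labels, which changes nothing structurally), and the function name is hypothetical:

```python
import numpy as np

def matching_matrix(seg, proj, M, N):
    """Build CM, CMR, CMC and the matching matrix CMM.

    seg:  label map of the current frame's segmentation, values 0..M
          (0 = background region).
    proj: label map of the previous frame's object projections,
          values 0..N (0 = background object).
    Returns CM (overlap areas), CMR (row fractions), CMC (column
    fractions) and CMM with entries in {0, 1, 2} per the text's rules.
    """
    CM = np.zeros((M + 1, N + 1))
    for i, j in zip(seg.ravel(), proj.ravel()):
        CM[i, j] += 1                       # overlap area of region i, object j

    row_sums = CM.sum(axis=1, keepdims=True)
    col_sums = CM.sum(axis=0, keepdims=True)
    CMR = np.divide(CM, row_sums, out=np.zeros_like(CM), where=row_sums > 0)
    CMC = np.divide(CM, col_sums, out=np.zeros_like(CM), where=col_sums > 0)

    CMM = np.zeros((M + 1, N + 1), dtype=int)
    for i in range(M + 1):                  # row scan: +1 at each row maximum
        if row_sums[i, 0] > 0:
            CMM[i, np.argmax(CMR[i])] += 1
    for j in range(N + 1):                  # column scan: +2 at each column maximum
        if col_sums[0, j] > 0:
            CMM[np.argmax(CMC[:, j]), j] += 2
    # resolve value 3 by comparing CMR and CMC, as described in ③
    for i, j in zip(*np.nonzero(CMM == 3)):
        CMM[i, j] = 1 if CMR[i, j] > CMC[i, j] else 2
    return CM, CMR, CMC, CMM
```

For a region and an object projection that coincide exactly, the row and column maxima land on the same element, producing a 3 that the final comparison resolves to 2 (the tie CMR = CMC is not strictly greater), so the returned CMM stays within {0, 1, 2}.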
(4) Perform object segmentation based on the matching matrix CMMt for five situations: tracking and updating of a single object, appearance of a new object, merging of objects, splitting of an object, and disappearance of an object. The matrix CMMt effectively establishes the association between segmented regions and moving objects and handles the following five situations in a unified way:
① Tracking and updating of a single object (1→1): if row i of CMMt has exactly one non-zero element CMMt(i, j) and column j also contains only that non-zero element, segmented region i is correlated with object j alone, and the strategy depends on the value of CMMt(i, j): if CMMt(i, j) = 2, an update strategy is adopted, and segmented region i of the current frame represents the updated object; otherwise the projection of object j is retained as the tracked object in the current frame.
② Appearance of a new object (0→1): if row i of CMMt has only one non-zero element, located in the first column with value 1, the segmented region was still part of the background object in the previous frame and does not belong to any existing moving object. If it also satisfies the threshold condition in ① above, it is regarded as a newly appearing moving object.
In both cases ① and ② the corresponding row of CMMt has exactly one non-zero element. If row i of CMMt contains several non-zero elements, segmented region i may be correlated with several objects. In that case the objects of the previous frame are simply projected into the current frame as the current frame's moving objects, realizing object tracking.
③ Merging of objects (m→1): if row i of CMMt has, outside the first column, two or more elements with value 2, then two or more objects of the previous frame are largely contained in the new segmented region i, which then represents the new object formed by their merger.
④ Splitting of an object (1→m): if column j of CMMt has two or more elements with value 1, object j of the previous frame has split into several segmented regions in the current frame. Even if these regions are not spatially adjacent, they are still considered to belong to the same object in the current frame's segmentation; only when regions carrying the same object label but spatially disconnected from one another exhibit different motion over the following frames are they assigned different object labels.
⑤ Disappearance of an object (1→0): if column j of CMMt has only one non-zero element, located in the first row with value 2, the projection of object j of the previous frame falls in the background region of the current frame, and the object is considered to have disappeared in the current frame.
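A simplified reading of the five rules can be expressed as a classifier over CMMt. This is an illustration only; the patent's full decision logic also involves threshold conditions not reproduced here, and the event names are hypothetical:

```python
import numpy as np

def classify_cases(CMM):
    """Enumerate segmentation events from a matching matrix.

    CMM: (M+1) x (N+1) integer matrix with entries in {0, 1, 2};
    row/column 0 correspond to the background. Returns a list of
    (event, region_index_or_indices, object_index_or_indices) tuples.
    """
    M1, N1 = CMM.shape
    events = []
    for i in range(1, M1):                       # scan segmented regions
        nz = np.nonzero(CMM[i])[0]
        if len(nz) == 1:
            j = nz[0]
            if j == 0 and CMM[i, 0] == 1:        # rule ②: was background
                events.append(("new_object", i, None))
            elif j > 0 and np.count_nonzero(CMM[:, j]) == 1:
                events.append(("track_or_update", i, j))   # rule ①
        merged = [j for j in nz if j > 0 and CMM[i, j] == 2]
        if len(merged) >= 2:                     # rule ③: objects merged
            events.append(("merge", i, tuple(merged)))
    for j in range(1, N1):                       # scan previous objects
        col = CMM[:, j]
        split_rows = [i for i in range(1, M1) if col[i] == 1]
        if len(split_rows) >= 2:                 # rule ④: object split
            events.append(("split", tuple(split_rows), j))
        nz = np.nonzero(col)[0]
        if len(nz) == 1 and nz[0] == 0 and col[0] == 2:
            events.append(("disappear", None, j))   # rule ⑤
    return events
```

For example, a matrix whose only non-zero entries are CMM(1, 0) = 1 and CMM(0, 1) = 2 reports one new object (region 1) and one disappeared object (object 1), matching rules ② and ⑤.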
The rules above effectively handle the five situations that can occur while segmenting moving objects in a video sequence. When the scene changes greatly, however, the tracking strategy ends up being applied to all objects over many consecutive frames, i.e. every object is merely the projection of an object from the previous frame. This indicates that the correlation between the segmented regions of the current frame and the moving objects of the previous frame is weak; in that case rule ② is used to decide whether new objects have appeared, and the moving objects are re-detected.
An example follows for 352×288 CIF input video. The JM8.6 H.264 encoder was used to encode the MPEG-4 standard test sequences into the H.264 compressed video used for testing, with the following configuration: Baseline profile, IPPP, one I-frame inserted every 30 frames, 3 reference frames, motion estimation search range [-32, 32], quantization parameter 30, and 300 encoded frames. In the experiments the accumulated motion vector field is computed once every 3 frames (the number of frames used in motion vector accumulation), giving 100 accumulated motion vector fields in total for testing the performance of the proposed moving object segmentation algorithm. The region segmentation result is first obtained from the accumulated motion vector field of the current frame; the moving objects of the previous frame are then projected into the current frame, and the matching-matrix-based segmentation method extracts the moving objects from these two results.
The typical standard test sequences Coastguard and Mobile were used as input video; the experimental results are shown in FIG. 5 and FIG. 6 respectively. In both figures the first column is the original image of the current frame, the second column is the region segmentation obtained from the accumulated motion vector field, the third column is the projection region in the current frame of the previous frame's moving objects, and the fourth column is the moving objects segmented from the current frame. The average processing time is 38 ms per frame, which already meets the 25 fps requirement of most real-time applications. Since the method actually performs moving object segmentation once every 3 frames, the moving objects of the given video sequence can be segmented while decoding in real time; even if segmentation is required for every frame, only object projection is needed for the remaining frames, whose computational cost is small, so real-time segmentation is still achieved.
Experiment 1: the Coastguard sequence has clear global motion; the camera first pans from right to left to track the small boat in the middle of the picture, then moves from left to right to track the large boat entering from the left. Row 1 of FIG. 5 (frame 4 of the sequence) shows the camera tracking the small boat from right to left; row 2 (frame 37) shows the new object, the large boat, moving from left to right; row 3 (frame 61) shows both moving objects fully within the camera's scene; row 4 (frame 208) shows the camera beginning to track the large boat from left to right. The second column of FIG. 5 shows that segmenting the accumulated motion vector field mostly locates the regions of the two moving objects quite accurately, and that most of the background conforming to the global motion model is contained in one large segmented region; the white region marks the most reliable background region after motion compensation. The accumulation and segmentation of the motion vector field adopted here are therefore effective, and a moderately segmented result can be obtained from motion vector information alone. Combined with the projection regions of the previous frame's objects shown in the third column, the matching-matrix-based moving object segmentation method segments the moving objects shown in the fourth column stably and reliably throughout the sequence.
Experiment 2: the Mobile sequence has more complex global motion; besides camera panning and tilting, there is also clear zooming in the first half of the sequence. The scene in row 1 of FIG. 6 (frame 4 of the sequence) contains three moving objects in total: the toy train pushes the ball along the track while the calendar moves up and down intermittently, making object segmentation harder. The segmentation results in FIG. 6 show that when a moving object stops moving, the proposed algorithm can still segment it through object projection, as with the ball in row 2 (frame 43) and the calendar in row 3 (frame 109). The results also show that the algorithm handles merging and splitting of moving objects well: in row 3 (frame 109), since the train is pushing the ball with no gap between them, the two spatially adjacent objects with identical motion are treated as having merged; in row 4 (frame 160), once a gap appears between the two objects and their motion is no longer the same, they are segmented into two regions, truly realizing the split of the two objects.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 200610116363 CN100486336C (en) | 2006-09-21 | 2006-09-21 | Real time method for segmenting motion object based on H.264 compression domain |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN 200610116363 CN100486336C (en) | 2006-09-21 | 2006-09-21 | Real time method for segmenting motion object based on H.264 compression domain |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1960491A CN1960491A (en) | 2007-05-09 |
| CN100486336C true CN100486336C (en) | 2009-05-06 |
Family
ID=38071950
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN 200610116363 Expired - Fee Related CN100486336C (en) | 2006-09-21 | 2006-09-21 | Real time method for segmenting motion object based on H.264 compression domain |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN100486336C (en) |
Families Citing this family (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101237581B (en) * | 2008-02-29 | 2010-11-17 | 上海大学 | A Real-time Video Object Segmentation Method Based on Motion Feature in H.264 Compressed Domain |
| CN101320085B (en) * | 2008-07-21 | 2012-07-25 | 哈尔滨工业大学 | Ultra-broadband wall-through point target positioning and imaging method based on back-projection algorithm |
| US20120207217A1 (en) * | 2008-12-19 | 2012-08-16 | Thomson Licensing | Video coding based on global movement compensation |
| CN101719979B (en) * | 2009-11-27 | 2011-08-03 | 北京航空航天大学 | Video Object Segmentation Method Based on Memory Compensation of Fixed Interval in Time Domain |
| CN102196259B (en) * | 2010-03-16 | 2015-07-01 | 北京中星微电子有限公司 | Moving object detection system and method suitable for compression domain |
| ES2746182T3 (en) | 2010-04-13 | 2020-03-05 | Ge Video Compression Llc | Prediction between planes |
| CN106454373B (en) | 2010-04-13 | 2019-10-01 | Ge视频压缩有限责任公司 | Decoder, method, encoder and the coding method for rebuilding array |
| EP4398576B1 (en) | 2010-04-13 | 2025-09-17 | GE Video Compression, LLC | Video coding using multi-tree sub-divisions of images |
| CN106303522B9 (en) * | 2010-04-13 | 2020-01-31 | Ge视频压缩有限责任公司 | Decoder and method, encoder and method, data stream generating method |
| CN102123234B (en) * | 2011-03-15 | 2012-09-05 | 北京航空航天大学 | Unmanned airplane reconnaissance video grading motion compensation method |
| CN102333213A (en) * | 2011-06-15 | 2012-01-25 | 夏东 | H.264 compressed domain moving object detection algorithm under complex background |
| CN102917224B (en) * | 2012-10-18 | 2015-06-17 | 北京航空航天大学 | Mobile background video object extraction method based on novel crossed diamond search and five-frame background alignment |
| CN103198297B (en) * | 2013-03-15 | 2016-03-30 | 浙江大学 | Based on the kinematic similarity assessment method of correlativity geometric properties |
| CN104125430B (en) * | 2013-04-28 | 2017-09-12 | 华为技术有限公司 | Video moving object detection method, device and video monitoring system |
| CN104683803A (en) * | 2015-03-24 | 2015-06-03 | 江南大学 | A Moving Object Detection and Tracking Method in Compressed Domain |
| HK1203289A2 (en) * | 2015-07-07 | 2015-10-23 | 香港生产力促进局 | A method and a device for detecting moving object |
| CN108965869B (en) * | 2015-08-29 | 2023-09-12 | 华为技术有限公司 | Image prediction methods and equipment |
| CN105931274B (en) * | 2016-05-09 | 2019-02-15 | 中国科学院信息工程研究所 | A Fast Object Segmentation and Tracking Method Based on Motion Vector Trajectory |
| CN108574846B (en) * | 2018-05-18 | 2019-03-08 | 中南民族大学 | A kind of video compress domain method for tracking target and system |
| CN109389031B (en) * | 2018-08-27 | 2021-12-03 | 浙江大丰实业股份有限公司 | Automatic positioning mechanism for performance personnel |
| CN114567781A (en) * | 2020-11-27 | 2022-05-31 | 安徽寒武纪信息科技有限公司 | Method, device, electronic equipment and storage medium for coding and decoding video image |
| CN112990273B (en) * | 2021-02-18 | 2021-12-21 | 中国科学院自动化研究所 | Video-sensitive person recognition method, system and device for compressed domain |
- 2006-09-21: application CN 200610116363 filed in China; granted as CN100486336C; current status: Expired - Fee Related
Non-Patent Citations (1)
| Title |
|---|
| A flexible and variable segmentation algorithm for moving images. Zhao Yanling et al., Engineering Sciences (China), Vol. 8, No. 5, 2006 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN1960491A (en) | 2007-05-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN100486336C (en) | Real time method for segmenting motion object based on H.264 compression domain | |
| CN100538743C (en) | Understanding video content through real-time video motion analysis | |
| US8285045B2 (en) | Image analysis method, medium and apparatus and moving image segmentation system | |
| US7986810B2 (en) | Mesh based frame processing and applications | |
| Smith et al. | Layered motion segmentation and depth ordering by tracking edges | |
| Zhong et al. | Video object model and segmentation for content-based video indexing | |
| Zhong et al. | Spatio-temporal video search using the object based video representation | |
| US7142602B2 (en) | Method for segmenting 3D objects from compressed videos | |
| Bebeselea-Sterp et al. | A comparative study of stereovision algorithms | |
| Philip et al. | A comparative study of block matching and optical flow motion estimation algorithms | |
| Porikli et al. | Compressed domain video object segmentation | |
| Toklu et al. | Simultaneous alpha map generation and 2-D mesh tracking for multimedia applications | |
| Huang et al. | Automatic feature-based global motion estimation in video sequences | |
| Gao et al. | Shot-based video retrieval with optical flow tensor and HMMs | |
| Morand et al. | Scalable object-based video retrieval in hd video databases | |
| CN101600106A (en) | A global motion estimation method and device | |
| Gu et al. | Tracking of multiple semantic video objects for internet applications | |
| KR20010011348A (en) | Recording medium and method for constructing and retrieving a data base of a mpeg video sequence by using a object | |
| Choo et al. | Scene mapping-based video registration using frame similarity measurement and feature tracking | |
| Chen et al. | Progressive motion vector clustering for motion estimation and auxiliary tracking | |
| Yongsheng et al. | A Survey on Content based video retrival | |
| Felip et al. | Robust dominant motion estimation using MPEG information in sport sequences | |
| Cuevas et al. | Temporal segmentation tool for high-quality real-time video editing software | |
| Vo et al. | Precise estimation of motion vectors and its application to MPEG video retrieval | |
| Smeaton et al. | Coherent segmentation of video into syntactic regions |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant | ||
| C17 | Cessation of patent right | ||
| CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20090506 Termination date: 20110921 |