CN106815854A - An online video foreground-background separation method based on regularized error modeling - Google Patents
An online video foreground-background separation method based on regularized error modeling
- Publication number
- CN106815854A (application number CN201611252353.6A)
- Authority
- CN
- China
- Prior art keywords
- sigma
- video
- background
- model
- foreground
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Image Analysis (AREA)
Abstract
An online video foreground-background separation method based on regularized error modeling: 1. acquire the surveillance system's video data in real time; 2. embed in the model a transform-operator optimization variable that adapts to real-time changes in the video background; 3. construct a regularized error model based on the characteristics of the real-time changes of the video foreground targets; 4. combine steps 2 and 3 into a complete statistical model and, via the maximum a posteriori estimation method, obtain the complete surveillance-video foreground-background separation model; 5. down-sample the video data to accelerate the computation of the separation model of step 4, achieving real-time solution of the model; 6. from the separation results of step 5, output the foreground and background in real time. The invention is a fast, high-accuracy online surveillance-video foreground-background separation method, and is of practical significance for detecting, tracking, recognizing, and analyzing targets in surveillance video.
Description
Technical Field
The invention relates to a video processing method for surveillance video, and in particular to an online video foreground-background separation method based on regularized error modeling.
Background
Foreground-background separation of surveillance video has important real-world applications, such as target tracking and urban traffic monitoring. Surveillance equipment is now deployed everywhere, however, and the daily volume of surveillance video data is enormous and structurally complex. Performing real-time foreground-background separation while guaranteeing both high accuracy and high efficiency remains a major challenge.
In the field of image processing, many techniques for video foreground-background separation already exist. Common approaches include direct separation based on statistical assumptions, subspace learning, and online separation.
Direct-separation methods place a statistical-distribution assumption on the video's frame data and then separate foreground from background using summary statistics, for example median or mean models and histogram models. The MoG and MoGG methods impose finer-grained distributional assumptions on the frame data, fitting each frame with a mixture distribution (such as a mixture of Gaussians) and achieving better separation. However, all of these methods ignore the video's structural information, such as the spatial continuity of the foreground and the temporal similarity of the background. Subspace learning methods, by contrast, encode the video's structural information more carefully: by assuming that the video background has a low-rank structure, they build the foreground's spatial continuity and the background's temporal similarity into the model, and the resulting separation quality is considerably better.
Although subspace learning has achieved remarkable results, a gap to practical deployment remains. Video data is growing rapidly at every moment, so foreground-background separation must be highly efficient while remaining accurate; moreover, the continuous stream of surveillance data calls for a real-time online technique. Some online separation methods exist, but they typically cannot meet the dual requirements of high accuracy and high efficiency.
Given these deficiencies of the prior art, it is necessary to provide a technique that performs real-time online foreground-background separation on continuously arriving surveillance video with both high accuracy and high efficiency, and that in particular adapts effectively to changes in the dynamic foreground target types and the dynamic background environment.
Summary of the Invention
The purpose of the invention is to provide an online video foreground-background separation method based on regularized error modeling that exploits the video's structural information more fully and accurately for statistical modeling, thereby achieving higher separation accuracy while guaranteeing high processing efficiency.
To achieve the above purpose, the technical solution adopted by the invention comprises the following steps:
Step S1: acquire the surveillance system's video data online;
Step S2: build a model on the low-rank structural assumption for the video background, embedding an adaptive transform-factor variable in the model to encode the background's dynamic changes, so that the real dynamic video background is modeled adaptively;
Step S3: model the randomness of foreground-target changes with a parametric distribution so that it adapts to the dynamic video foreground at different times and in different scenes, and further embed into it a regularized encoding of the noise information of previous frames' foregrounds, so that the real dynamic video foreground is modeled adaptively;
Step S4: combine steps S2 and S3 to construct the complete statistical model for surveillance-video foreground-background separation;
Step S5: down-sample the video data of step S1 and apply the statistical separation model of step S4 to the sampled data, accelerating the computation;
Step S6: apply TV (total-variation) continuity modeling to the foreground targets obtained in step S5;
Step S7: from the results of steps S5 and S6, output the finally detected video foreground targets and background scene.
The model of step S2 is built as follows. The backgrounds of the individual frames of the video are similar, and this similarity is encoded through the following low-rank representation of each frame:
x_t = U_t v_t + ε_t  (1)
where x_t ∈ R^d is the t-th frame of the surveillance video and U_t ∈ R^{d×r} is the current representation basis of the video background, with r ≪ d; the subspace spanned by these basis vectors is a low-rank subspace of the original image space. v_t ∈ R^r is the coefficient vector, U_t v_t is the low-rank representation of x_t in the subspace U_t, and ε_t is the residual.
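The low-rank representation (1) can be made concrete with a small sketch: a truncated SVD stands in for the background estimate U_t (an assumption for illustration; the patent updates the basis incrementally online rather than by batch SVD), and the residual flags foreground pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, r = 200, 50, 2              # pixels per frame, number of frames, background rank

# Synthetic video matrix: a fixed rank-r background plus a small foreground blob.
U_true = rng.normal(size=(d, r))
V_true = rng.normal(size=(r, n))
X = U_true @ V_true
X[:10, 25:30] += 3.0              # a "foreground object" present in frames 25-29

# Rank-r background estimate x_t ~ U v_t via truncated SVD of the frame matrix.
Uc, s, Vt = np.linalg.svd(X, full_matrices=False)
U = Uc[:, :r] * s[:r]             # background basis (scaled by singular values)
V = Vt[:r, :]                     # per-frame coefficients v_t
background = U @ V
residual = X - background         # epsilon_t: large entries indicate foreground
```

The residual concentrates on the injected blob, which is exactly the signal that the noise model of step S3 then classifies into mixture components.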
An adaptive transform-factor variable is embedded in the model, improving model (1) to:
x_t ∘ τ_t = U_t v_t + ε_t  (2)
where τ_t is an affine-transform operator variable acting on the frame x_t, expressing background transformations of rotation, translation, warping, and scale.
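As a minimal illustration of the role of τ_t, namely aligning each incoming frame with the current background estimate before the low-rank fit, the sketch below restricts the operator to integer translations found by brute-force search. Both simplifications are assumptions; the patent uses a full affine operator solved by first-order approximation.

```python
import numpy as np

def translate(frame, dy, dx):
    # A toy transform operator tau: circular integer translation of a 2-D frame.
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def best_shift(frame, background, shifts=range(-3, 4)):
    # Search a small grid of translations for the one that best aligns
    # the frame with the background estimate (least-squares criterion).
    best, best_err = (0, 0), np.inf
    for dy in shifts:
        for dx in shifts:
            err = np.sum((translate(frame, dy, dx) - background) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

rng = np.random.default_rng(1)
bg = rng.normal(size=(32, 32))
jittered = translate(bg, -2, 1)    # simulated camera jitter
dy, dx = best_shift(jittered, bg)  # recovers the inverse shift
```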
The parametric distribution modeling of step S3 encodes the residual variable ε_t of model (2) as a mixture of Gaussians, so that it adapts to the dynamics of the video foreground, i.e. of the video background residual, at different times and in different scenes. The corresponding model is:
p(x_t^i | u_t^i, v_t, π_t, Σ_t) = Π_k N(x_t^i | u_t^i v_t, (σ_t^k)^2)^{z_tk^i},  z_t^i ~ Multi(π_t)  (3)
where x_t^i is the i-th pixel value of x_t, u_t^i is the i-th row of U_t, and z_t^i is a latent indicator variable: z_tk^i = 1 means that the i-th pixel value of the t-th frame belongs to the k-th component of the Gaussian mixture, with Σ_k z_tk^i = 1. Multi denotes the multinomial distribution and (σ_t^k)^2 is the variance of the k-th mixture component.
To embed a regularized encoding of the noise information of previous frames' foregrounds, and thereby model the real dynamic foreground adaptively, conjugate prior forms are assumed for the noise-distribution variables of model (3):
(σ_t^k)^2 ~ Inv-Gamma(·),  π_t ~ Dir(·)
where Inv-Gamma denotes the inverse-Gamma distribution and Dir the Dirichlet distribution; their hyperparameters are expressed as follows:
in which the membership degrees indicate the degree to which the i-th pixel value of the j-th frame belongs to the k-th component of the Gaussian mixture; the remaining symbols have the same meaning as in equation (3).
Step S4 constructs the statistical model from steps S2 and S3.
Here P(v_t) denotes a Gaussian distribution with sufficiently large variance, π_t is the mixing-coefficient vector of the Gaussian mixture to which the residual ε_t of the t-th frame belongs, Σ_t collects the variances of the mixture components, and τ_t is the adaptive transform factor, as defined in equation (6).
According to the principle of maximum a posteriori estimation, with U_t = U_{t-1} held fixed, the video foreground-background separation model derived from the statistical model can be converted into the following optimization problem:
which simplifies to:
where D_KL(·‖·) denotes the KL divergence and R(π_t, Σ_t) is the noise regularization term, of the form:
Here π_t is the mixing-coefficient vector of the Gaussian mixture to which the residual ε_t of the t-th frame belongs, Σ_t collects the variances of the mixture components, π_{t-1} and Σ_{t-1} are the corresponding mixing coefficients and variance vector of frame t-1 as defined in equation (6), and C is a constant independent of π_t and Σ_t.
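The exact expression for R(π_t, Σ_t) did not survive extraction. The sketch below implements one plausible KL-based form (a hypothetical reconstruction, not the patent's formula) that penalizes the current MoG noise parameters for drifting away from the previous frame's, which is the behavior the text describes:

```python
import numpy as np

def kl_gauss(var_p, var_q):
    # KL( N(0, var_p) || N(0, var_q) ) for scalar variances.
    return 0.5 * (var_p / var_q - 1.0 + np.log(var_q / var_p))

def noise_regularizer(pi_prev, var_prev, pi_cur, var_cur):
    # Hypothetical R(pi_t, Sigma_t): KL between the previous and current
    # mixing weights plus a weighted KL between component variances.
    kl_pi = np.sum(pi_prev * np.log(pi_prev / pi_cur))       # categorical KL
    kl_var = np.sum(pi_prev * kl_gauss(var_prev, var_cur))   # per-component KL
    return kl_pi + kl_var

pi_prev = np.array([0.7, 0.2, 0.1])
var_prev = np.array([0.01, 0.1, 1.0])
r_same = noise_regularizer(pi_prev, var_prev, pi_prev, var_prev)   # no drift
r_drift = noise_regularizer(pi_prev, var_prev,
                            np.array([0.5, 0.3, 0.2]),
                            np.array([0.05, 0.2, 2.0]))            # drift
```

The regularizer vanishes when the noise parameters are unchanged and grows with drift, biasing frame t's noise model toward frame t-1's, as intended.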
In step S5, the t-th frame is down-sampled before the model is solved, to speed up the solution. Let Ω denote the subscript set; after down-sampling we obtain the sampled frame, i.e.
Ω = {k_1, k_2, …, k_m | 1 ≤ k_j ≤ d, j = 1, 2, …, m}
Correspondingly, down-sampling the row vectors of U_t yields the sampled basis.
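The down-sampling step can be sketched directly (dimensions are illustrative; the 1% rate follows the embodiment later in the document):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, rate = 10000, 3, 0.01        # frame length, background rank, sampling rate

x_t = rng.normal(size=d)           # current frame, vectorized
U_t = rng.normal(size=(d, r))      # current background basis

# Draw the subscript set Omega and restrict both the frame and the rows
# of U_t to it; the per-frame model is then solved on m pixels instead of d.
m = int(rate * d)
omega = rng.choice(d, size=m, replace=False)
x_sub = x_t[omega]                 # sampled frame
U_sub = U_t[omega, :]              # correspondingly sampled basis rows
```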
In step S5, a first-order approximation of τ_t in (7) is used; the subproblem in Δτ_t then degenerates into a weighted least-squares problem, and the update is obtained by solving the following model:
where π_t is the mixing-coefficient vector of the Gaussian mixture to which the residual ε_t of the t-th frame belongs, Σ_t collects the variances of the mixture components, τ_t is the adaptive transform-factor variable, J is the Jacobian matrix of x at τ, and u_i is the i-th row of the basis matrix U.
Step S5 uses the EM algorithm to update the parameters π_t, Σ_t, v_t of the online foreground-background separation model; in the formulas below the superscript s denotes the s-th iteration. The procedure comprises:
S7.1: the E-step update formula for the membership degrees of the EM algorithm:
S7.2: the iteration format and termination condition of the M-step of the EM algorithm:
The iteration format is:
where:
The iteration terminates when:
S7.3: set the initial values of the iteration:
Apply the PCA method to the initial l frames of data to obtain an initial subspace decomposition, then use the MoG method to initialize the parameters π_{t,0}, Σ_{t,0}, v_{t,0};
S7.4: iterate equations (8)-(13) until the termination condition (14) is satisfied.
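Steps S7.1-S7.4 amount to an EM loop for a zero-mean mixture of Gaussians over the residuals, alternating responsibilities (E-step) with closed-form updates of the mixing weights and variances (M-step). The sketch below shows that core loop on synthetic residuals; it omits the coupled updates of v_t and the prior-based regularization, so it illustrates the EM structure rather than the patent's full algorithm.

```python
import numpy as np

def em_zero_mean_mog(eps, K=3, iters=50, tol=1e-6, seed=0):
    # EM for a zero-mean mixture of Gaussians on residuals eps, updating
    # only the mixing weights pi and variances sigma^2 (means pinned at 0,
    # matching the residual model in the text).
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    var = rng.uniform(0.5, 1.5, K) * eps.var()
    for _ in range(iters):
        # E-step: responsibilities gamma[i, k] (log-domain for stability)
        log_p = (-0.5 * eps[:, None] ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var)
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: closed-form updates of pi and sigma^2
        nk = gamma.sum(axis=0)
        new_pi = nk / len(eps)
        new_var = (gamma * eps[:, None] ** 2).sum(axis=0) / nk
        converged = np.abs(new_var - var).max() < tol
        pi, var = new_pi, new_var
        if converged:
            break
    return pi, var

# Synthetic residuals: a narrow "background noise" and a wide "foreground" part.
rng = np.random.default_rng(1)
eps = np.concatenate([rng.normal(0, 0.1, 800), rng.normal(0, 1.0, 200)])
pi, var = em_zero_mean_mog(eps, K=2)
```

With two well-separated noise scales, the two recovered variances differ by a large factor, which is what lets the model tell foreground-like residuals from background noise.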
In step S5, for the t-th frame of data, after the parameters π_t, Σ_t, v_t, τ_t have been updated, the background basis U_{t-1} is fine-tuned through the following model to obtain the updated U_t:
where:
Model (15) has the following solution:
where u_t^i denotes the i-th row of U_t, and the accompanying expressions are as follows:
Update U_t according to equations (16)-(18), and output the foreground.
In step S6, a TV-norm model is established using the continuity characteristics of the surveillance-video background as follows:
Here ‖·‖_TV denotes the TV norm, F̂ is the output of equation (19), and λ is set using the maximum variance; the optimization problem above is converted to:
s.t. F = Z_i, i = 1, 2  (21)
where Z_1, Z_2 ∈ R^{m×n} and S_i(·) is defined as follows:
The TV-norm model (21) is solved with the alternating-direction method of multipliers (ADMM):
S9.1: write down the augmented Lagrangian function of problem (21):
where P_1, P_2 ∈ R^{m×n};
S9.2: establish the iteration format and termination condition of ADMM:
The iteration format is:
where ρ is a positive number greater than 1, taken to be 1.5;
The iteration terminates when:
S9.3: solve (22) and (23), giving the explicit iteration formulas;
S9.4: set the initial values of the iteration;
S9.5: iterate (22)-(24) until the termination condition (25) is satisfied.
By performing targeted analysis and encoding of the intrinsic statistical priors of the foreground and the background of surveillance-video data, the invention realizes a fast, high-accuracy online model and method for surveillance-video foreground-background separation, which is of significant practical value for detecting, tracking, recognizing, and analyzing targets in surveillance video.
Description of the Drawings
Figure 1 is a flowchart of the invention.
Figure 2 shows frame data from some of the videos in the Li Datasets collection.
Figure 3 illustrates the down-sampling of step S5: the first row contains original frames from Li Datasets, the second row the results after 1% down-sampling.
Figure 4 shows the video-separation results of the invention: the first column contains original frames from Li Datasets, the second column the pre-annotated ground-truth foreground labels, the third column the foreground separated in step S5, and the fourth column the foreground separated in step S7.
Detailed Description
The invention is described in further detail below with reference to the drawings and an embodiment.
Embodiment 1: the dataset Li Datasets (https://fling.seas.upenn.edu/~xiaowz/dynamic/wordpress/decolor/) is used as the subject of the computer-simulation experiments of the invention (see the first column of Figure 4). See the table below for the foreground-background separation speed of the invention on each video of Li Datasets, where FPS denotes the number of data frames processed per second.
The collection contains 9 surveillance video datasets, including static-background videos, videos whose background changes with illumination, and dynamic-background videos; part of the data comes with pre-annotated ground-truth foreground labels (see the second column of Figure 4). Frame data from some of the videos are shown in Figure 2. For the experiments, 200 frames are extracted from each video. The procedure is shown in Figure 1:
Step S1: obtain the Li Datasets video data;
Step S2: build a low-rank decomposition model for the t-th frame of data following the basic principle of foreground-background separation; here the invention fits the noise with a mixture of three Gaussian components. The expression is:
x_t = U_t v_t + ε_t  (26)
where x_t ∈ R^d is the t-th frame of the surveillance video and U_t ∈ R^{d×r} is the current representation basis of the video background, with r ≪ d; the subspace spanned by these basis vectors is a low-rank subspace of the original image space. v_t ∈ R^r is the coefficient vector, U_t v_t is the low-rank representation of x_t in the subspace U_t, and ε_t is the residual.
An adaptive transform-factor variable is embedded in the model, improving model (26) to:
x_t ∘ τ_t = U_t v_t + ε_t  (27)
where τ_t is an affine-transform operator variable acting on the frame x_t, expressing background transformations of rotation, translation, warping, and scale.
Step S3: under the mixture-of-Gaussians assumption on the noise, we have
p(x_t^i | u_t^i, v_t, π_t, Σ_t) = Π_k N(x_t^i | u_t^i v_t, (σ_t^k)^2)^{z_tk^i},  z_t^i ~ Multi(π_t)  (28)
where x_t^i is the i-th pixel value of x_t, u_t^i is the i-th row of U_t, and z_t^i is a latent indicator variable: z_tk^i = 1 means that the i-th pixel value of the t-th frame belongs to the k-th component of the Gaussian mixture, with Σ_k z_tk^i = 1. Multi denotes the multinomial distribution and (σ_t^k)^2 is the variance of the k-th mixture component.
Based on the similarity between the frame backgrounds of the surveillance video, the following prior distributions can be assumed:
where Inv-Gamma denotes the inverse-Gamma distribution and Dir the Dirichlet distribution; their hyperparameters are expressed as follows:
in which the membership degrees indicate the degree to which the i-th pixel value of the j-th frame belongs to the k-th component of the Gaussian mixture; the remaining symbols have the same meaning as in (28).
Step S4: combining steps S2 and S3 with the maximum a posteriori estimation method yields the following optimization problem for surveillance-video foreground-background separation:
which simplifies to:
Here π_t is the mixing-coefficient vector of the Gaussian mixture to which the residual ε_t of the t-th frame belongs, Σ_t collects the variances of the mixture components, π_{t-1} and Σ_{t-1} are the corresponding mixing coefficients and variance vector of frame t-1 as defined in equation (29), and C is a constant independent of π_t and Σ_t.
Step S5: referring to Figure 3, based on the video data input in step S1, the maximum a posteriori model of step S4 is used to construct the video foreground-background separation solver.
A. Apply the PCA algorithm to the first 50 frames of data to obtain the initial subspace U and v, then use the MoG method (Algorithm 1) to initialize the parameters:
B. Down-sampling: the t-th frame of data x_t is down-sampled at a rate of 1% to obtain the sampled frame, where Ω denotes the subscript set, i.e.
Ω = {k_1, k_2, …, k_m | 1 ≤ k_j ≤ d, j = 1, 2, …, m}
Down-sampling U_t in the same way yields the sampled basis.
C. Fix U = U_{t-1} and update π_t, Σ_t, v_t with the EM algorithm; the iteration (the superscript s denotes the iteration count) has the following format:
E-step:
M-step:
D. The iteration terminates when:
E. After π_t, Σ_t, v_t have been updated by the above procedure, U_t is updated by fine-tuning only some of its elements; the optimization model is:
where:
The model (30) above has the following explicit solution:
where the first quantity denotes the i-th row of the updated basis, and the accompanying expressions are as follows:
F. Output the foreground (see the third column of Figure 4).
Step S6: on the foreground targets obtained in step S5, a TV-norm model is established using the continuity characteristics of the surveillance-video background as follows:
Here ‖·‖_TV denotes the TV norm, F̂ is the foreground output in step S5, and λ is set using the maximum variance. The optimization problem above can be converted to:
s.t. F = Z_i, i = 1, 2
where Z_1, Z_2 ∈ R^{m×n} and S_i(·) is defined as follows:
The solution process uses the following iteration format:
where ρ is a positive number greater than 1, here taken to be 1.5.
A. Equation (32) amounts to solving the following problem:
Problem (35) has the following optimal solution:
B. The next equation amounts to solving the following problem:
Problem (36) can be decomposed into the following two subproblems (i = 1, 2), solved separately:
or, equivalently,
where:
Let v_1, v_2, …, v_n denote the column vectors of Z_i and t_1, t_2, …, t_n the column vectors of T_i; problem (37) can then be further decomposed into the following n subproblems, solved independently:
Problem (38) can be solved with the following 1-D TV denoising algorithm:
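The 1-D TV denoising algorithm referenced here was lost in extraction. As a stand-in, a generic ADMM solver for the standard 1-D TV problem min_f ½‖f − y‖² + λ‖Df‖₁ (an assumed formulation, not the patent's exact routine) can be sketched as:

```python
import numpy as np

def tv1d_admm(y, lam, rho=1.0, iters=200):
    # 1-D total-variation denoising via ADMM:
    #   min_f 0.5*||f - y||^2 + lam * ||D f||_1,  D = first differences.
    n = len(y)
    D = np.diff(np.eye(n), axis=0)          # (n-1) x n difference matrix
    A = np.eye(n) + rho * D.T @ D
    z = np.zeros(n - 1)
    u = np.zeros(n - 1)
    f = y.copy()
    for _ in range(iters):
        # f-update: quadratic subproblem, solved as a linear system
        f = np.linalg.solve(A, y + rho * D.T @ (z - u))
        Df = D @ f
        # z-update: soft-thresholding of the differences
        z = np.sign(Df + u) * np.maximum(np.abs(Df + u) - lam / rho, 0.0)
        # dual update
        u = u + Df - z
    return f

# Noisy piecewise-constant signal: TV denoising recovers the two levels.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])
noisy = clean + rng.normal(0, 0.1, 100)
denoised = tv1d_admm(noisy, lam=1.0)
```

On a noisy piecewise-constant signal the solver recovers the two levels, which is the spatial-continuity behavior the TV term is introduced to enforce.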
Step S7: the final outputs are the foreground FG_t (the solution of equation (31) in step S6) and the background BG_t = x_t − FG_t (see the fourth column of Figure 4).
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201611252353.6A CN106815854B (en) | 2016-12-30 | 2016-12-30 | On-line video foreground and background separation method based on regular error modeling |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN106815854A true CN106815854A (en) | 2017-06-09 |
| CN106815854B CN106815854B (en) | 2020-05-15 |
Family
ID=59109330
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201611252353.6A Expired - Fee Related CN106815854B (en) | 2016-12-30 | 2016-12-30 | On-line video foreground and background separation method based on regular error modeling |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN106815854B (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007076890A1 (en) * | 2005-12-30 | 2007-07-12 | Telecom Italia S.P.A. | Segmentation of video sequences |
| CN101216888A (en) * | 2008-01-14 | 2008-07-09 | 浙江大学 | Video foreground extraction method under the condition of changing viewing angle based on fast image registration |
| US20090154807A1 (en) * | 2005-12-30 | 2009-06-18 | Telecom Italia S.P.A. | Edge Comparison in Segmentation of Video Sequences |
| CN105761251A (en) * | 2016-02-02 | 2016-07-13 | 天津大学 | Separation method of foreground and background of video based on low rank and structure sparseness |
| CN106204477A (en) * | 2016-07-06 | 2016-12-07 | 天津大学 | Video frequency sequence background restoration methods based on online low-rank background modeling |
- 2016-12-30: Application CN201611252353.6A filed; granted as patent CN106815854B (status: not active, Expired - Fee Related)
Non-Patent Citations (4)
| Title |
|---|
| LIYUAN LI 等: "Statistical Modeling of Complex Backgrounds for Foreground Object Detection", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 * |
| PEIXIAN CHEN 等: "Bayesian Adaptive Matrix Factorization with Automatic Model Selection", 《CVPR 2015》 * |
| WENFEI CAO 等: "Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements", 《ARXIV》 * |
| XU ZONGBEN: "Error Modeling Principle and Its Applications (误差建模原理及其应用)", 《HTTPS://WENKU.BAIDU.COM/VIEW/F61DE11BB9D528EA81C779F0.HTML?PN=NAN》 * |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107346547A (en) * | 2017-07-04 | 2017-11-14 | 易视腾科技股份有限公司 | Real-time foreground extracting method and device based on monocular platform |
| CN107346547B (en) * | 2017-07-04 | 2020-09-04 | 易视腾科技股份有限公司 | Monocular platform-based real-time foreground extraction method and device |
| CN108846804A (en) * | 2018-04-23 | 2018-11-20 | 杭州电子科技大学 | Deblurring method based on row figure and column graph model |
| CN108846804B (en) * | 2018-04-23 | 2022-04-01 | 杭州电子科技大学 | Deblurring method based on line graph and column graph model |
| CN108933703A (en) * | 2018-08-14 | 2018-12-04 | 西安交通大学 | Environment self-adaption cognitive radio communications channel estimation method based on error modeling |
| CN109150775A (en) * | 2018-08-14 | 2019-01-04 | 西安交通大学 | A kind of online channel state information estimation method of the robustness of environment adaptive noise dynamic change |
| CN110018529A (en) * | 2019-02-22 | 2019-07-16 | 南方科技大学 | Rainfall measurement method, rainfall measurement device, computer equipment and storage medium |
| CN112734791A (en) * | 2021-01-18 | 2021-04-30 | 烟台南山学院 | On-line video foreground and background separation method based on regular error modeling |
Also Published As
| Publication number | Publication date |
|---|---|
| CN106815854B (en) | 2020-05-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Fu et al. | Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification | |
| Cong et al. | Going from RGB to RGBD saliency: A depth-guided transformation model | |
| CN112307995B (en) | Semi-supervised pedestrian re-identification method based on feature decoupling learning | |
| Liu et al. | Learning depth from single monocular images using deep convolutional neural fields | |
| Li et al. | Traffic scene segmentation based on RGB-D image and deep learning | |
| Lei et al. | A universal framework for salient object detection | |
| CN106815854A (en) | A kind of Online Video prospect background separation method based on normal law error modeling | |
| CN114973305B (en) | Accurate human body analysis method for crowded people | |
| CN113657387A (en) | Semi-supervised 3D point cloud semantic segmentation method based on neural network | |
| CN101477633B (en) | Method for automatically estimating visual significance of image and video | |
| CN105488812A (en) | Motion-feature-fused space-time significance detection method | |
| CN102982544B (en) | Many foreground object image interactive segmentation method | |
| CN101315663A (en) | A Natural Scene Image Classification Method Based on Regional Latent Semantic Features | |
| CN105528794A (en) | Moving object detection method based on Gaussian mixture model and superpixel segmentation | |
| Fan et al. | Point-gcc: Universal self-supervised 3d scene pre-training via geometry-color contrast | |
| Wang et al. | MappingFormer: Learning cross-modal feature mapping for visible-to-infrared image translation | |
| CN104408158B (en) | A kind of viewpoint method for tracing based on geometry reconstruction and semantic fusion | |
| CN110598746A (en) | Adaptive scene classification method based on ODE solver | |
| Li et al. | OccScene: Semantic occupancy-based cross-task mutual learning for 3D scene generation | |
| CN106845376B (en) | A kind of face identification method based on sparse coding | |
| Zhang | [Retracted] Sports Action Recognition Based on Particle Swarm Optimization Neural Networks | |
| Jing et al. | Video prediction: a step-by-step improvement of a video synthesis network: Video prediction: a step-by-step improvement of a video synthesis network | |
| Liu et al. | AGDF-Net: Learning domain generalizable depth features with adaptive guidance fusion | |
| Wang et al. | Query-based semantic gaussian field for scene representation in reinforcement learning | |
| Yu et al. | A dual-stream cross-domain integration network for RGB-T salient object detection |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200515 |