CN104778453B

CN104778453B - Nighttime pedestrian detection method based on statistical brightness features of infrared pedestrians

Info

Publication number: CN104778453B
Application number: CN201510154382.8A
Authority: CN
Inventors: 徐向华; 王路杰
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2015-04-02
Filing date: 2015-04-02
Publication date: 2017-12-22
Anticipated expiration: 2035-04-02
Also published as: CN104778453A

Abstract

The invention discloses a nighttime pedestrian detection method based on statistical brightness features of infrared pedestrians. The method first computes gray-level mean statistics for each pedestrian part and for the negative samples in the sample library, uses these statistics to determine the boundaries of the mapping intervals, and constructs a brightness histogram feature with differentiated voting intervals. It then computes the histogram of oriented gradients feature and combines the two features into the final feature descriptor. Next, a model is trained using Adaboost combined with decision trees, and pedestrians are identified and located by sliding-window scanning. Finally, when the classifier assigns a low confidence to a detection box, the box is judged again using a brightness interval template, thereby achieving pedestrian detection at night. The invention effectively achieves pedestrian detection in nighttime environments and offers a high detection rate and strong adaptability.

Description

A Pedestrian Detection Method at Night Based on Infrared Pedestrian Brightness Statistical Features

Technical field

The invention relates to pedestrian detection in vehicle-mounted video images, and in particular to a nighttime pedestrian detection method based on statistical brightness features of infrared pedestrians.

Background

Pedestrian detection is an important application of computer vision and has high practical value in daily life and production. Vision-based pedestrian detection uses image processing techniques to determine where pedestrians appear in an input image or video frame sequence. Intelligent driver-assistance systems improve driving safety and thereby reduce traffic accidents, and pedestrian detection is one of the core technologies of such systems.

Vision-based nighttime pedestrian detection mainly uses visible-light images or infrared images. At night, poor illumination degrades the imaging quality of visible-light cameras, which in turn degrades pedestrian detection. Infrared cameras sense objects by passively capturing infrared radiation, so objects at different temperatures appear with different brightness in the image. In infrared images of road scenes, pedestrians generally radiate more heat than the background and therefore generally appear brighter than it. Infrared imaging is also unaffected by darkness at night or poor visibility in fog, so it provides good night vision and adapts well to different lighting conditions. Pedestrian detection on infrared images is therefore an effective solution for nighttime pedestrian detection.

Most current infrared pedestrian detection techniques are based on machine learning. For example, the invention patent "Pedestrian Detection Method and Device Based on Feature Combination" (CN103632170A) combines HOG and LBP features into a joint feature and trains a classifier with a support vector machine (SVM) to detect pedestrians. LBP features describe image texture well, but texture in infrared images is weak, so LBP features perform only moderately in infrared pedestrian detection. The invention patent "A Pedestrian Detection Method Based on Infrared Image" (CN103902976A) fuses a brightness histogram feature with the HOG feature to describe pedestrians and trains a classifier with an SVM for nighttime pedestrian detection. Besides the contour information of pedestrians, that method also extracts their infrared brightness information and therefore detects infrared pedestrians better.

The above methods do not consider the statistical characteristics of the brightness distribution of infrared pedestrians. Building on the HOG feature, the present invention incorporates these statistical characteristics to construct an infrared pedestrian feature with stronger descriptive power, uses Adaboost as the learning algorithm, and designs a pedestrian detection framework that combines a pedestrian detection classifier with re-determination by a brightness interval template. The method of the invention describes pedestrian features better and achieves better detection results in infrared pedestrian detection.

Summary of the invention

The purpose of the present invention is to address the shortcomings of the prior art by providing a nighttime pedestrian detection method based on statistical brightness features of infrared pedestrians. The invention uses both the contour information and the brightness information of infrared pedestrians. First, gray-level mean statistics are computed for the pedestrian parts (head, upper body, lower body) and for the negative sample images in the sample library; the voting mapping intervals of the pedestrian feature are determined from the distribution of these means, and a brightness histogram feature with differentiated voting intervals (DBHOI) is constructed. Then the histogram of oriented gradients (HOG) feature is computed, and the two features are combined into the final pedestrian feature descriptor. Next, a model is trained using Adaboost combined with decision trees, and pedestrians are identified and located by sliding-window scanning. Finally, when the classifier assigns a low confidence to a pedestrian box, the box is judged again using a brightness interval template, thereby achieving nighttime infrared pedestrian detection. The method has a high detection rate and strong adaptability.

The technical solution adopted by the invention comprises the following steps:

Step 1. Construct positive and negative sample data sets of infrared images;

Step 2. Construct the DBHOI feature of the infrared image based on brightness statistics;

Step 3. Construct the HOG feature of the infrared image based on contour information;

Step 4. Train the classifier using Adaboost;

Step 5. Judge and locate detection windows;

Step 6. Re-determine pedestrian boxes.

The construction of the positive and negative sample data sets of infrared images in step 1 is as follows:

The positive sample data set is constructed as follows: pedestrian samples are extracted from infrared images using the minimum rectangular window; with pedestrian height h and width w, the window is chosen such that w/h = 0.41; a total of N₁ positive pedestrian samples are extracted;

The negative sample data set is constructed as follows: a total of N₀×10 negative samples are drawn at random from N₀ infrared images that contain no pedestrians, that is, 10 negative samples are drawn at random from each infrared image;

All positive and negative samples are scaled to sample images 64 pixels wide and 128 pixels high.
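The sampling arithmetic of step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function names are hypothetical, and drawing negative windows directly at 64×128 is an assumption (the text fixes only the count per image and the final scaled size).

```python
import random

SAMPLE_W, SAMPLE_H = 64, 128   # final sample size given in step 1
NEG_PER_IMAGE = 10             # negatives drawn per pedestrian-free image


def random_negative_windows(img_w, img_h, rng, win_w=SAMPLE_W, win_h=SAMPLE_H,
                            count=NEG_PER_IMAGE):
    """Return `count` random (x, y, w, h) windows that fit inside the image."""
    if img_w < win_w or img_h < win_h:
        raise ValueError("image smaller than sampling window")
    windows = []
    for _ in range(count):
        x = rng.randint(0, img_w - win_w)   # randint is inclusive on both ends
        y = rng.randint(0, img_h - win_h)
        windows.append((x, y, win_w, win_h))
    return windows


def pedestrian_window_width(height, aspect=0.41):
    """Width of a positive window of the given height so that w/h = 0.41."""
    return round(height * aspect)


rng = random.Random(0)                       # fixed seed for repeatability
negatives = random_negative_windows(320, 240, rng)
```

With N₀ pedestrian-free images, calling `random_negative_windows` once per image yields the N₀×10 negatives described above.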

The construction of the DBHOI feature of the infrared image based on brightness statistics in step 2 is as follows:

2-1. Let the sample image size be w′×h′. The sample image is divided into local images of equal size, each 8×8 pixels, giving (w′/8)×(h′/8) local images; each local image is called a cell;

2-2. Determine the mapping rule from the pedestrian parts;

2-2-1. First, crop the pedestrian parts (head, upper body, lower body) from the N₁ positive samples;

2-2-2. Randomly crop N₁ background images from the negative samples;

2-2-3. Compute the gray-level mean of the head images, upper-body images, lower-body images and background images respectively;

2-2-4. Let the gray-level means of the four image types, in increasing order, be G₁, G₂, G₃, G₄. Three mapping boundaries are determined from the four means: t₁ = (G₁+G₂)/2, t₂ = (G₂+G₃)/2, t₃ = (G₃+G₄)/2. These boundaries divide the gray-value range into four intervals: [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255];

2-2-5. According to this mapping rule, and using the gray value of each pixel in the infrared image as its weight, build a brightness histogram in each cell, which gives a four-dimensional feature vector;
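Steps 2-2-4 and 2-2-5 can be sketched as below. The part means are made-up numbers and the function names are hypothetical; a toy 2×2 cell stands in for the 8×8 cell of the text.

```python
def mapping_boundaries(means):
    """Midpoints t1, t2, t3 between the four gray-level means, sorted ascending."""
    g1, g2, g3, g4 = sorted(means)
    return [(g1 + g2) / 2, (g2 + g3) / 2, (g3 + g4) / 2]


def bin_index(gray, boundaries):
    """Index 0..3 of the interval [0,t1), [t1,t2), [t2,t3), [t3,255] holding `gray`."""
    return sum(gray >= t for t in boundaries)


def cell_histogram(cell, boundaries):
    """4-bin brightness histogram of one cell; each pixel votes with its gray value."""
    hist = [0.0] * 4
    for row in cell:
        for gray in row:
            hist[bin_index(gray, boundaries)] += gray
    return hist


# Hypothetical means for background, upper body, lower body and head.
t = mapping_boundaries([40, 100, 140, 200])   # midpoints 70, 120, 170
cell = [[30, 80], [120, 210]]                 # toy 2x2 cell instead of 8x8
hist = cell_histogram(cell, t)                # one pixel falls in each interval
```

Because the boundaries come from the measured part means, the four bins generally have different widths, which is what distinguishes this histogram from a uniform-bin intensity histogram.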

2-3. Normalize each group of adjacent 2×2 cells within its block;

With a stride of one cell, the (w′/8)×(h′/8) cells of the sample image are traversed from top to bottom and from left to right, and each group of four adjacent cells is normalized within its block using the L1-Sqrt method;

The L1-Sqrt method is

$$v \leftarrow \sqrt{\frac{v}{\lVert v \rVert_1 + \varepsilon}}$$

where v is the brightness histogram vector and ε is a small constant, set to 0.001;

2-4. Concatenate the features of all blocks to obtain the DBHOI feature;

DBHOI denotes a brightness histogram whose bins have different sizes.
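Steps 2-3 and 2-4 can be sketched as follows, assuming the standard L1-sqrt normalization (the square root of the L1-normalized vector) and a row-by-row block order; the function names are hypothetical.

```python
import math

EPS = 0.001  # value of epsilon given in the text


def l1_sqrt(v, eps=EPS):
    """L1-sqrt normalization: element-wise sqrt(v / (||v||_1 + eps))."""
    norm = sum(abs(x) for x in v) + eps
    return [math.sqrt(x / norm) for x in v]


def dbhoi_from_cells(cells):
    """cells[r][c] is a 4-bin cell histogram; returns the concatenated block features."""
    rows, cols = len(cells), len(cells[0])
    feature = []
    for r in range(rows - 1):            # blocks slide with a stride of one cell
        for c in range(cols - 1):
            block = (cells[r][c] + cells[r][c + 1]
                     + cells[r + 1][c] + cells[r + 1][c + 1])
            feature.extend(l1_sqrt(block))
    return feature


# A 64x128 sample has 8x16 cells, hence 7x15 overlapping blocks of 16 values each.
cells = [[[1.0, 2.0, 3.0, 4.0] for _ in range(8)] for _ in range(16)]
feature = dbhoi_from_cells(cells)
```

Under these assumptions the DBHOI vector of a 64×128 sample has 7×15×16 = 1680 entries.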

The construction of the HOG feature of the infrared image based on contour information in step 3 is as follows:

The classic Sobel operator is used to compute the horizontal and vertical gradient components G_x(i,j) and G_y(i,j) at each pixel, from which the gradient magnitude G(i,j) and direction D(i,j) are computed as follows:

$$G(i,j) = \sqrt{G_x(i,j)^2 + G_y(i,j)^2}, \qquad D(i,j) = \arctan\frac{G_y(i,j)}{G_x(i,j)}$$
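A minimal sketch of the per-pixel gradient computation, assuming the usual 3×3 Sobel kernels and an unsigned direction folded into [0, 180) degrees (the folding is an assumption borrowed from common HOG practice, not stated in the text):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient(img, i, j):
    """Return (magnitude, direction in degrees) at the interior pixel (i, j)."""
    gx = gy = 0
    for di in range(3):
        for dj in range(3):
            p = img[i + di - 1][j + dj - 1]
            gx += SOBEL_X[di][dj] * p
            gy += SOBEL_Y[di][dj] * p
    magnitude = math.hypot(gx, gy)                          # sqrt(gx^2 + gy^2)
    direction = math.degrees(math.atan2(gy, gx)) % 180.0    # unsigned orientation
    return magnitude, direction


# Vertical edge: dark on the left, bright on the right.
img = [[0, 0, 255],
       [0, 0, 255],
       [0, 0, 255]]
mag, ang = sobel_gradient(img, 1, 1)
```

`atan2` is used instead of a plain arctangent of G_y/G_x so that pixels with G_x = 0 do not cause a division by zero.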

The classifier training using Adaboost in step 4 is as follows:

4-1. Scale all positive and negative samples to the same size; then, for each sample image, extract the DBHOI feature vector and the HOG feature vector, labelling positive samples 1 and negative samples -1;

4-2. Input the DBHOI-HOG feature vectors and labels of the N₁ positive samples and N₀×10 negative samples into the Adaboost learning algorithm for training, which yields a classifier composed of a series of weak classifiers.
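For orientation, the textbook discrete AdaBoost form of such a classifier is given below. It is a standard formulation, not quoted from the patent, and the symbols T, λ_t, h_t and e_t are introduced here for illustration only.

```latex
H(x) = \operatorname{sign}\!\left( \sum_{t=1}^{T} \lambda_t \, h_t(x) \right),
\qquad
\lambda_t = \frac{1}{2} \ln \frac{1 - e_t}{e_t}
```

Here x is the DBHOI-HOG feature vector of a sample, h_t(x) ∈ {-1, 1} is the output of the t-th weak classifier (matching the labels of step 4-1), and e_t is its weighted training error, so weak classifiers with lower error receive larger weights.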

The judgment and location of detection windows in step 5 is as follows:

5-1. Determine the scaling factor, and hence the number of scale levels of the infrared image to be detected, from the size of that image and the size of the detection window; then scan each level with the classifier obtained in step 4 using a stride of 4 pixels. Let the original size of the infrared image to be detected be W_i×H_i, where W_i is its width and H_i its height; let the detection window size be W_d×H_d, where W_d is its width and H_d its height; and let s_s denote the scaling factor. The initial scaling factor is s_s = 1 and the final scaling factor is s_s = min{W_i/W_d, H_i/H_d}. A sliding-window scan is performed on the infrared image at each scale, and each detection window is judged by the trained pedestrian classifier. If the detection window is a pedestrian box, its position and confidence are recorded as {posX, posY, width, height, score}, where (posX, posY) is the top-left corner of the pedestrian box, width and height are its width and height, and score is the confidence;

The confidence is the error rate recorded at the leaf nodes of all the two-level decision trees;

5-2. After the multi-scale sliding-window scan, the same pedestrian may be detected at several scales of the infrared image, so non-maximum suppression is used to fuse the pedestrian boxes obtained at different scales;

The criterion for non-maximum suppression is the confidence score of each pedestrian box;

5-3. Sort all pedestrian boxes by confidence from low to high into an array A[n], then take from the array the detection window A[n] that currently has the highest confidence;

5-4. Compare pedestrian box A[n] with the next pedestrian box A[n-1]: if the overlap α of the two boxes is greater than 0.5, they are regarded as the same pedestrian box; otherwise A[n-1] becomes the pedestrian box with the current highest confidence;

The overlap α is defined in terms of area(B_n), the area of the n-th pedestrian box, and area(B_n∩B_{n-1}), the area of the intersection of the n-th and (n-1)-th pedestrian boxes;

5-5. Repeat step 5-4 until all pedestrian boxes have been judged.
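Steps 5-3 to 5-5 can be sketched as a greedy suppression loop. Two points are assumptions: the overlap is taken as the intersection area divided by the area of the higher-scoring box, and each kept box is compared with all remaining boxes (a common variant) rather than only with its neighbour in the sorted array.

```python
def overlap(a, b):
    """Intersection area divided by the area of box a (assumed definition of alpha)."""
    ix = min(a["posX"] + a["width"], b["posX"] + b["width"]) - max(a["posX"], b["posX"])
    iy = min(a["posY"] + a["height"], b["posY"] + b["height"]) - max(a["posY"], b["posY"])
    if ix <= 0 or iy <= 0:
        return 0.0
    return (ix * iy) / (a["width"] * a["height"])


def non_max_suppression(boxes, threshold=0.5):
    """Keep the best box, drop boxes overlapping it by more than `threshold`, repeat."""
    remaining = sorted(boxes, key=lambda b: b["score"])   # ascending, as in step 5-3
    kept = []
    while remaining:
        best = remaining.pop()                            # highest confidence left
        kept.append(best)
        remaining = [b for b in remaining if overlap(best, b) <= threshold]
    return kept


boxes = [
    {"posX": 10, "posY": 10, "width": 64, "height": 128, "score": 0.9},
    {"posX": 12, "posY": 14, "width": 64, "height": 128, "score": 0.6},   # same pedestrian
    {"posX": 200, "posY": 20, "width": 64, "height": 128, "score": 0.7},  # another pedestrian
]
kept = non_max_suppression(boxes)
```

In this toy input the second box overlaps the first by about 0.94 and is suppressed, while the third box does not overlap and is kept.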

The re-determination of pedestrian boxes in step 6 is as follows:

6-1. After step 5, every detection window judged to be a pedestrian has a confidence value score. A brightness interval template is used to re-determine pedestrian boxes: when score_i ≤ τ, the pedestrian box is re-determined, where the threshold τ is determined statistically; if score_i > τ, the classifier result alone is used. Given the confidences of N pedestrian boxes, τ satisfies a formula over those N values (the formula is not reproduced in this text);

Here score_i is the i-th confidence value in the array A[n] defined in step 5-3;

6-2. According to step 2 and the imaging characteristics of infrared pedestrians, the four intervals [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255] correspond to the background, upper body, lower body and head respectively. Let the pedestrian box have width w and height h. If the box locates a pedestrian correctly, then when its height h is divided into four equal parts from top to bottom, the first quarter must contain brightness values in [t₃,255], and the total number of pixels with gray values in this interval is denoted Σp₁; the second quarter must contain brightness values in [t₁,t₂), and the total number of such pixels is denoted Σp₂; the last two quarters must contain brightness values in [t₂,t₃), and the total number of such pixels is denoted Σp₃. The verification rules are: Σp₁/(w×h) ≥ 1/16, Σp₂/(w×h) ≥ 1/8, Σp₃/(w×h) ≥ 1/16. If Σp₁, Σp₂ and Σp₃ satisfy all three conditions, the box is again judged to be a pedestrian box; otherwise it is judged to be a non-pedestrian box.

Beneficial effects of the invention:

Targeting the characteristics of pedestrians in infrared images, the invention constructs a more expressive brightness feature descriptor, DBHOI, and combines it with the HOG feature. The DBHOI descriptor gathers statistics on the brightness distribution of the pedestrian parts and the background in the training samples and encodes this distribution when the feature is constructed, so that the descriptor captures the difference in brightness between pedestrians and background more effectively. Constructing the descriptor in this way improves its descriptive power and thus gives the final classifier stronger classification ability.

When the classifier judges a pedestrian box with low confidence, the box is re-determined using the brightness interval template. This markedly reduces the error rate caused by classifier misjudgment and improves detection accuracy.

Description of the drawings

Fig. 1 is the overall flow chart of the invention.

Fig. 2 is the flow chart of DBHOI feature descriptor extraction.

Fig. 3 shows detection results of the pedestrian detection model of the invention.

Detailed description

Specific embodiments of the invention are described in further detail below with reference to the drawings.

As shown in Fig. 1, Fig. 2 and Fig. 3, a nighttime pedestrian detection method based on statistical brightness features of infrared pedestrians comprises the following steps:

Step 1. Construct positive and negative sample data sets of infrared images.

The invention mainly targets pedestrian detection from a vehicle, so the images are mainly infrared images of road scenes.

The positive sample data set is constructed as follows: pedestrian samples are extracted from infrared images using the minimum rectangular window, that is, a rectangle that just encloses the pedestrian; with pedestrian height h and width w, the window is chosen such that w/h = 0.41; a total of N₁ positive pedestrian samples are extracted;

The minimum rectangular window is defined as described by P. Dollár in "Pedestrian Detection: An Evaluation of the State of the Art".

Step 2. Construct the DBHOI feature of the infrared image based on brightness statistics.

DBHOI denotes a brightness histogram whose bins have different sizes; the full name is Different Bins Histogram of Intensity.

As shown in Fig. 2, the invention encodes the brightness information in the infrared image so that it becomes a descriptor with strong discriminative power. The descriptor is constructed as follows.

2-1. Let the sample image size be w′×h′. The sample image is divided into local images of equal size, each 8×8 pixels, giving (w′/8)×(h′/8) local images; each local image is called a cell.

2-2. Determine the mapping rule from the pedestrian parts. First, crop the pedestrian parts (head, upper body, lower body) from the N₁ positive samples; then randomly crop N₁ background images from the negative samples, and compute the gray-level mean of the head images, upper-body images, lower-body images and background images respectively. Let the gray-level means of the four image types, in increasing order, be G₁, G₂, G₃, G₄. Three mapping boundaries are determined from the four means: t₁ = (G₁+G₂)/2, t₂ = (G₂+G₃)/2, t₃ = (G₃+G₄)/2. These boundaries divide the gray-value range into four intervals: [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255]. According to this mapping rule, and using the gray value as the weight, a brightness histogram is built in each cell, which gives a four-dimensional feature vector.

2-3. Normalize each group of adjacent 2×2 cells within its block.

With a stride of one cell, the (w′/8)×(h′/8) cells of the sample image are traversed from top to bottom and from left to right, and each group of 2×2 adjacent cells is normalized within its block. The normalization method is L1-Sqrt:

$$v \leftarrow \sqrt{\frac{v}{\lVert v \rVert_1 + \varepsilon}}$$

where v is the brightness histogram vector and ε is a small constant, set to 0.001 in the invention.

2-4. Concatenate the features of all blocks to obtain the DBHOI feature. A DBHOI descriptor constructed in this way has strong descriptive power.

Step 3. Construct the HOG feature of the infrared image based on contour information.

The HOG feature is constructed in the classic way. For example, a 64×128 sample image is divided into cells of 8×8 pixels. In each cell the gradient direction is divided into 9 equal bins, and a histogram of gradient directions is built using the gradient magnitude as the weight. Four adjacent cells are combined into a block, which is normalized with the L1-sqrt method. Finally the gradient histograms of all blocks are concatenated to form the HOG descriptor.
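The per-cell voting described above can be sketched as follows. The sketch assumes unsigned directions over [0, 180) degrees and hard assignment to a single bin; interpolation between neighbouring bins, which many HOG implementations use, is left out.

```python
def orientation_histogram(magnitudes, directions, bins=9):
    """9-bin histogram over [0, 180) degrees; each pixel votes with its gradient magnitude."""
    width = 180.0 / bins
    hist = [0.0] * bins
    for mag_row, dir_row in zip(magnitudes, directions):
        for mag, ang in zip(mag_row, dir_row):
            index = int((ang % 180.0) // width) % bins   # guard against index == bins
            hist[index] += mag
    return hist


# Toy 2x2 cell (a real cell is 8x8): gradient magnitudes and directions in degrees.
mags = [[1.0, 2.0], [3.0, 4.0]]
dirs = [[0.0, 19.9], [20.0, 179.0]]
hist = orientation_histogram(mags, dirs)
```

With 9 bins each bin spans 20 degrees, so the first two pixels vote into bin 0, the third into bin 1 and the fourth into bin 8.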

Step 4. Train the classifier using Adaboost.

Scale all positive and negative samples to the same size, for example 64×128. Then, for each sample image, extract the DBHOI feature vector and the HOG feature vector, labelling positive samples 1 and negative samples -1.

Input the DBHOI-HOG feature vectors and labels of the N₁ positive samples and N₀×10 negative samples into the Adaboost learning algorithm for training, which yields a classifier composed of a series of weak classifiers. Each weak classifier is a two-level decision tree.
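How a boosted set of two-level trees can be evaluated is sketched below. The data layout and names are hypothetical, and the weighted vote is the textbook AdaBoost margin; the text does not specify how the leaf error rates are turned into the confidence score, so this is not the patent's scoring rule.

```python
from dataclasses import dataclass


@dataclass
class Split:
    feature: int        # index into the DBHOI-HOG feature vector
    threshold: float


@dataclass
class TwoLevelTree:
    root: Split
    left: Split         # second-level test used when the root test is false
    right: Split        # second-level test used when the root test is true
    leaves: tuple       # four outputs in {-1, +1}, ordered (LL, LR, RL, RR)
    weight: float       # AdaBoost weight of this weak classifier

    def predict(self, x):
        go_right = x[self.root.feature] >= self.root.threshold
        child = self.right if go_right else self.left
        second = x[child.feature] >= child.threshold
        return self.leaves[2 * go_right + second]


def ensemble_score(trees, x):
    """Weighted vote; a positive value means 'pedestrian' (label 1)."""
    return sum(t.weight * t.predict(x) for t in trees)


# Two made-up weak classifiers over a 3-dimensional toy feature vector.
trees = [
    TwoLevelTree(Split(0, 0.5), Split(1, 0.2), Split(1, 0.8), (-1, -1, -1, 1), 0.7),
    TwoLevelTree(Split(2, 0.3), Split(0, 0.1), Split(0, 0.9), (-1, 1, 1, 1), 0.4),
]
score = ensemble_score(trees, [0.6, 0.9, 0.1])
```

Each tree inspects at most two features per sample, which keeps evaluation cheap when the classifier is applied to every sliding window.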

Step 5. Judge and locate pedestrian boxes.

As shown in Fig. 3, the size of the pedestrian classifier is fixed, but pedestrians appear at different sizes in infrared images. For example, a pedestrian close to the camera appears larger than the classifier window. To detect pedestrians at different scales, the image therefore has to be scaled and a sliding-window scan performed at each scale. The same pedestrian may be classified as a pedestrian at several image scales, so the resulting pedestrian boxes need to be fused.

5-1. Determine the scaling factor, and hence the number of scale levels of the infrared image to be detected, from the size of that image and the size of the detection window; then scan each level with the classifier obtained in step 4 using a stride of 4 pixels. Let the original size of the infrared image to be detected be W_i×H_i, where W_i is its width and H_i its height; let the detection window size be W_d×H_d, where W_d is its width and H_d its height; and let s_s denote the scaling factor. The initial scaling factor is s_s = 1 and the final scaling factor is s_s = min{W_i/W_d, H_i/H_d}. A sliding-window scan is performed on the infrared image at each scale, and each window is judged by the trained pedestrian classifier. If the detection window is a pedestrian box, its position and confidence are recorded as {posX, posY, width, height, score}, where (posX, posY) is the top-left corner of the pedestrian box, width and height are its width and height, and score is the confidence.

The confidence is the error rate recorded at the leaf nodes of all the two-level decision trees.

5-2. After the multi-scale sliding-window scan, the same pedestrian may be detected at several scales of the infrared image. So that the system outputs the one pedestrian box most likely to match the actual pedestrian position, non-maximum suppression is used to fuse the pedestrian boxes obtained at different scales. The criterion for non-maximum suppression is the confidence score of each pedestrian box.

5-4. Compare pedestrian box A[n] with the next pedestrian box A[n-1]: if the overlap α of the two boxes is greater than 0.5, they are regarded as the same pedestrian box; otherwise A[n-1] becomes the pedestrian box with the current highest confidence.

The overlap α is defined in terms of area(B_n), the area of the n-th pedestrian box, and area(B_n∩B_{n-1}), the area of the intersection of the n-th and (n-1)-th pedestrian boxes.
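The multi-scale scan of step 5-1 can be sketched as follows. The text fixes only the first and last scaling factors and the 4-pixel stride; the geometric step `ratio` between levels, the reading of s_s as a shrink factor applied to the image, and all function names are assumptions.

```python
def scale_factors(img_w, img_h, win_w=64, win_h=128, ratio=1.2):
    """Scaling factors from 1 up to min(W_i/W_d, H_i/H_d); `ratio` is an assumed step."""
    limit = min(img_w / win_w, img_h / win_h)
    scales, s = [], 1.0
    while s <= limit:
        scales.append(s)
        s *= ratio
    return scales


def window_positions(img_w, img_h, scale, win_w=64, win_h=128, stride=4):
    """Top-left corners of detection windows in the image shrunk by `scale`."""
    scaled_w, scaled_h = int(img_w / scale), int(img_h / scale)
    return [(x, y)
            for y in range(0, scaled_h - win_h + 1, stride)
            for x in range(0, scaled_w - win_w + 1, stride)]


def to_original(x, y, scale, win_w=64, win_h=128):
    """Map a window found at `scale` back to original-image coordinates."""
    return {"posX": round(x * scale), "posY": round(y * scale),
            "width": round(win_w * scale), "height": round(win_h * scale)}


scales = scale_factors(320, 256)               # limit is min(5.0, 2.0) = 2.0
windows = window_positions(320, 256, 1.0)      # 65 columns x 33 rows of windows
box = to_original(4, 8, 2.0)
```

At the final scaling factor the shrunken image is just large enough to hold one detection window along its tighter dimension, which is why the scan stops there.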

步骤6、行人框重判定。Step 6: Pedestrian frame weight determination.

6-1.通过步骤5之后，每一个被判定为行人的检测窗口都会有一个置信度值score。如果分类器对一个检测窗口的判定越明确，则相应的score值也越大；相反，如果分类器对检测窗口的判定越不明确，则会得到一个很低的score值。因此当一个区域被错误的判定为行人框时，该行人框所得到的置信度score都比较低。为了降低这种因分类器在置信度不够时，做出的错误判定，本发明采用了亮度区间模板再判定技术。当score_i≤τ时，则进行行人框再判定，其中阈值τ的取值由统计方式确定。如果score_i＞τ，则仅用分类器进行行人检测，假设得到N个行人框的置信度(默认值N为1000)，那么τ满足式 6-1. After passing step 5, each detection window that is determined to be a pedestrian will have a confidence value score. If the classifier has a clearer judgment on a detection window, the corresponding score value will be larger; on the contrary, if the classifier is less clear on the judgment of the detection window, a very low score value will be obtained. Therefore, when an area is wrongly judged as a pedestrian frame, the confidence score obtained by the pedestrian frame is relatively low. In order to reduce the erroneous judgment made by the classifier when the confidence level is not enough, the present invention adopts the brightness interval template re-judgment technology. When score _i ≤τ, the pedestrian frame re-judgment is performed, and the value of the threshold τ is determined by a statistical method. If score _i >τ, then only the classifier is used for pedestrian detection, assuming that the confidence of N pedestrian frames is obtained (the default value N is 1000), then τ satisfies the formula

Here score_i is the confidence value of the i-th element of the array A[n] defined in step 5-3.

6-2. According to step 2 and the characteristics of infrared pedestrian imaging, the four intervals [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255] correspond to the background, upper body, lower body, and head, respectively. Let the pedestrian frame have width w and height h. If the frame correctly locates a pedestrian, then, with its height h divided into four equal parts from top to bottom, the first quarter must contain brightness in the range [t₃,255] (the number of pixels with gray values in this range is denoted Σp₁), the second quarter must contain brightness in the range [t₁,t₂) (pixel count Σp₂), and the last two quarters must contain brightness in the range [t₂,t₃) (pixel count Σp₃). The verification rules are: Σp₁/(w×h) ≥ 1/16, Σp₂/(w×h) ≥ 1/8, Σp₃/(w×h) ≥ 1/16. If Σp₁, Σp₂, and Σp₃ satisfy all three conditions, the frame is confirmed as a pedestrian frame; otherwise it is judged to be a non-pedestrian frame.
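The re-judgment of steps 6-1 and 6-2 can be sketched as follows. This is an illustrative sketch only: the window is assumed to be a list of rows of integer gray values in 0..255, the function names are hypothetical, and τ is passed in as a parameter because its formula is not reproduced in this text.

```python
def count_in_range(rows, lo, hi):
    """Count pixels g with lo <= g < hi in the given rows."""
    return sum(1 for row in rows for g in row if lo <= g < hi)


def passes_template(window, t1, t2, t3):
    """Brightness interval template check of step 6-2."""
    h, w = len(window), len(window[0])
    q = h // 4                                         # height of one quarter
    p1 = count_in_range(window[:q], t3, 256)           # head: [t3, 255], first quarter
    p2 = count_in_range(window[q:2 * q], t1, t2)       # [t1, t2), second quarter
    p3 = count_in_range(window[2 * q:], t2, t3)        # [t2, t3), last two quarters
    area = w * h
    # Integer form of p1/area >= 1/16, p2/area >= 1/8, p3/area >= 1/16.
    return p1 * 16 >= area and p2 * 8 >= area and p3 * 16 >= area


def is_pedestrian(window, score, tau, t1, t2, t3):
    """Step 6-1: trust the classifier above tau, otherwise apply the template."""
    if score > tau:
        return True
    return passes_template(window, t1, t2, t3)
```

Comparing integers rather than ratios avoids floating-point rounding exactly at the 1/16 and 1/8 thresholds.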

Claims

1. A night pedestrian detection method based on infrared pedestrian brightness statistical features, characterized by comprising the following steps:

Step 1: construction of positive and negative sample data sets of infrared images;

Step 2: infrared image DBHOI feature construction based on brightness statistical features;

Step 3: infrared image HOG feature construction based on contour information;

Step 4: classifier training using Adaboost;

Step 5: detection window judgment and positioning;

Step 6: pedestrian frame re-judgment;

The details of constructing the positive and negative sample data sets of infrared images described in step 1 are as follows:

The positive sample data set is constructed as follows: use the smallest enclosing rectangular window to extract pedestrian samples from the infrared images; let the height of the pedestrian be h and the width be w, such that w/h=0.41; a total of N₁ positive pedestrian samples are extracted;

The negative sample data set is constructed as follows: randomly extract N₀×10 negative samples from N₀ infrared images that do not contain pedestrians, that is, randomly select 10 negative samples from each infrared image;

Scale all positive and negative samples to a sample image with a width of 64 pixels and a height of 128 pixels;

The infrared image DBHOI feature construction based on the brightness statistical feature described in step 2 is specifically as follows:

2-1. Let the size of the sample image be w′×h′, and divide the sample image into partial images of the same size; each partial image is 8×8 pixels, that is, the sample image is divided evenly into (w′/8)×(h′/8) partial images, which are referred to as cells;

2-2. Determine the mapping rules according to the pedestrian components;

2-2-1. First, crop each pedestrian part from the N₁ positive samples; the pedestrian parts are the head, the upper body, and the lower body;

2-2-2. Randomly crop N₁ background images from the negative samples;

2-2-3. Calculate the gray mean value of the head image, upper body image, lower body image and background image respectively;

2-2-4. Denote the gray mean values of the four images, in increasing order, as G₁, G₂, G₃, and G₄; determine three mapping boundaries from these four mean values, namely t₁=(G₁+G₂)/2, t₂=(G₂+G₃)/2, t₃=(G₃+G₄)/2; and divide the gray value range into four gray-level intervals according to the mapping boundaries, namely [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255];

2-2-5. According to the mapping rule and using the gray value of the pixel in the infrared image as the weight, construct a brightness histogram in each cell to obtain a four-dimensional feature vector;

2-3. Normalize each group of adjacent 2×2 cells within its block;

With a step size of one cell, traverse the (w′/8)×(h′/8) cells of the sample image from top to bottom and from left to right, and normalize each group of four adjacent cells within its block using the L1-Sqrt method;

The L1-Sqrt method is as follows: v ← √(v / (‖v‖₁ + ε)), where v is the brightness histogram vector and ε is a very small constant, set to 0.001;

2-4. Concatenate the features of all blocks to obtain the DBHOI features;

DBHOI is a brightness histogram whose bins have different interval sizes;
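Steps 2-1 to 2-4 can be sketched as follows. This is a rough sketch under stated assumptions: the image is a list of rows of gray values, all function names are hypothetical, and the L1-Sqrt normalization is taken to be the usual v ← √(v / (‖v‖₁ + ε)) from the HOG literature, since its formula is not reproduced in this text.

```python
import math


def mapping_boundaries(g1, g2, g3, g4):
    """Step 2-2-4: midpoints of consecutive gray means (g1 <= g2 <= g3 <= g4)."""
    return (g1 + g2) / 2, (g2 + g3) / 2, (g3 + g4) / 2


def cell_histogram(image, top, left, bounds, cell=8):
    """Step 2-2-5: four-bin brightness histogram, each pixel voting with its gray value."""
    t1, t2, t3 = bounds
    hist = [0.0, 0.0, 0.0, 0.0]
    for r in range(top, top + cell):
        for c in range(left, left + cell):
            g = image[r][c]
            # Bin 0: [0,t1), bin 1: [t1,t2), bin 2: [t2,t3), bin 3: [t3,255].
            hist[(g >= t1) + (g >= t2) + (g >= t3)] += g
    return hist


def l1_sqrt(v, eps=0.001):
    """Assumed L1-Sqrt normalization of a non-negative vector."""
    norm = sum(abs(x) for x in v) + eps
    return [math.sqrt(x / norm) for x in v]


def dbhoi(image, bounds, cell=8):
    """Steps 2-3 and 2-4: normalize 2x2-cell blocks (stride one cell) and concatenate."""
    rows, cols = len(image) // cell, len(image[0]) // cell
    cells = [[cell_histogram(image, r * cell, c * cell, bounds, cell)
              for c in range(cols)] for r in range(rows)]
    feature = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            block = cells[r][c] + cells[r][c + 1] + cells[r + 1][c] + cells[r + 1][c + 1]
            feature.extend(l1_sqrt(block))
    return feature
```

For a 64×128 sample image this layout gives 8×16 cells and 7×15 blocks of 16 values each, that is, a vector of 1680 values under these assumptions.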

The HOG feature construction of the infrared image based on the contour information described in step 3 is as follows:

Use the classic Sobel operator to compute the horizontal and vertical gradient components G_x(i, j) and G_y(i, j) of each pixel, and then compute the corresponding gradient magnitude G(i, j) and direction D(i, j) as follows: G(i, j) = √(G_x(i, j)² + G_y(i, j)²), D(i, j) = arctan(G_y(i, j) / G_x(i, j));
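The per-pixel gradient computation of step 3 can be sketched as below. This sketch assumes the standard 3×3 Sobel kernels, the usual magnitude √(G_x² + G_y²), and an unsigned direction folded into [0°, 180°), as is common for HOG; it also assumes that i indexes rows and j indexes columns. The function name is hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative kernel
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative kernel


def sobel_gradient(image, i, j):
    """Return (magnitude, direction in degrees) at interior pixel (row i, column j)."""
    gx = gy = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            g = image[i + di][j + dj]
            gx += SOBEL_X[di + 1][dj + 1] * g
            gy += SOBEL_Y[di + 1][dj + 1] * g
    magnitude = math.hypot(gx, gy)
    # atan2 avoids division by zero when gx == 0; fold to an unsigned angle.
    direction = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, direction
```

Border pixels need padding or must be skipped, a choice this sketch leaves open.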

The classifier training using Adaboost described in step 4 is as follows:

4-1. Scale all positive and negative samples to the same scale, then extract the DBHOI feature vector and HOG feature vector for each sample image, and mark the label of the positive sample as 1, and the label of the negative sample as -1;

4-2. Input the DBHOI-HOG feature vectors and sample labels of the N₁ positive samples and the N₀×10 negative samples into the Adaboost learning algorithm for training, obtaining a classifier composed of a series of weak classifiers;
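The training in step 4 can be illustrated with a small discrete AdaBoost loop. This is a simplified stand-in, not the claimed classifier: it uses single-threshold decision stumps as weak classifiers where the method uses two-level decision trees, and all names are hypothetical. Labels follow step 4-1 (1 for positive samples, -1 for negative samples).

```python
import math


def train_stump(X, y, w):
    """Find the feature, threshold, and sign with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (sign if xi[f] >= thr else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best


def adaboost(X, y, rounds=10):
    """Train `rounds` weighted stumps; returns a list of (alpha, feature, thr, sign)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)        # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)      # weight of this weak classifier
        model.append((alpha, f, thr, sign))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * (sign if xi[f] >= thr else -sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, x):
    """Signed confidence: positive means pedestrian, negative means background."""
    return sum(a * (s if x[f] >= t else -s) for a, f, t, s in model)
```

In practice each row of X would be the concatenated DBHOI-HOG vector of one sample; the exhaustive threshold search shown here is only practical for small illustrative data.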

The determination and positioning of the detection window described in step 5 are as follows:

5-1. Determine the scaling factor according to the size of the infrared image to be detected and the size of the detection window, and determine the number of scaling layers of the image; then scan layer by layer with the classifier obtained in step 4, using a step size of 4 pixels. Suppose the original size of the infrared image to be detected is W_i×H_i, where W_i is its width and H_i is its height, and the size of the detection window is W_d×H_d, where W_d is its width and H_d is its height, and let s_s denote the scaling factor; then the initial scaling factor is s_s=1 and the final scaling factor is s_s=min{W_i/W_d, H_i/H_d}. Sliding-window scanning is performed on the image at each scale, and each detection window is judged by the trained pedestrian classifier; if the detection window is a pedestrian frame, its position and confidence are recorded as {posX, posY, width, height, score}, where posX and posY are the coordinates of the upper-left corner of the pedestrian frame, width and height are the width and height of the pedestrian frame, and score is the confidence;

The confidence degree is the error rate recorded by all the leaf nodes of the two-layer decision tree;

5-2. After multi-scale sliding-window scanning of the infrared image to be detected, the same pedestrian may be detected at several scales; non-maximum suppression is used to merge the resulting pedestrian frames of different scales;

The criterion for non-maximum suppression is the confidence score of each pedestrian frame;

5-3. Sort all pedestrian frames by confidence from low to high into an array A[n], and then take the detection window A[n] with the highest current confidence from the array;

5-4. Determine the relationship between the pedestrian frame A[n] and the next pedestrian frame A[n-1]. If the degree of overlap α of the two pedestrian frames is greater than 0.5, they are regarded as the same pedestrian frame; otherwise, the pedestrian frame A[n-1] is taken as the pedestrian frame with the highest current confidence;

The degree of overlap α is defined in terms of area(B_n), the area of the n-th pedestrian frame, and area(B_n ∩ B_{n-1}), the area of the intersection of the n-th and (n-1)-th pedestrian frames;

5-5. Repeat steps 5-4 until all pedestrian frames are determined.
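The multi-scale scan of step 5-1 can be sketched as follows. Several details are assumptions made for illustration: the ratio between consecutive scales (`scale_step`) is not stated in this text, the classifier is represented by a hypothetical callback `classify(s, x, y)` that returns a confidence for a pedestrian window and `None` otherwise, and detections are mapped back to original image coordinates by multiplying by the scale.

```python
def scales(img_w, img_h, win_w=64, win_h=128, scale_step=1.2):
    """Scaling factors from 1 up to min(W_i / W_d, H_i / H_d)."""
    s, s_end = 1.0, min(img_w / win_w, img_h / win_h)
    out = []
    while s <= s_end:
        out.append(s)
        s *= scale_step  # assumed geometric spacing between layers
    return out


def scan(img_w, img_h, classify, win_w=64, win_h=128, stride=4, scale_step=1.2):
    """Slide the detection window over every layer with a 4-pixel step."""
    detections = []
    for s in scales(img_w, img_h, win_w, win_h, scale_step):
        layer_w, layer_h = int(img_w / s), int(img_h / s)  # size of the shrunken layer
        for y in range(0, layer_h - win_h + 1, stride):
            for x in range(0, layer_w - win_w + 1, stride):
                conf = classify(s, x, y)
                if conf is not None:
                    # Record the frame in original image coordinates.
                    detections.append({"posX": int(x * s), "posY": int(y * s),
                                       "width": int(win_w * s), "height": int(win_h * s),
                                       "score": conf})
    return detections
```

The resulting records use the {posX, posY, width, height, score} layout of step 5-1 and can be passed directly to a non-maximum suppression routine for steps 5-2 to 5-5.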

2. The night pedestrian detection method based on infrared pedestrian brightness statistical features according to claim 1, characterized in that the pedestrian frame re-judgment described in step 6 is as follows:

6-1. After step 5, each detection window judged to be a pedestrian has a confidence value score; pedestrian frames are re-judged using the brightness interval template; when score_i ≤ τ, the pedestrian frame is re-judged, where the threshold τ is determined statistically; if score_i > τ, only the classifier is used for pedestrian detection; assuming the confidences of N pedestrian frames have been obtained, τ is computed from these N confidence values;

Here score_i is the confidence value of the i-th element of the array A[n] defined in step 5-3;

6-2. According to step 2 and the characteristics of infrared pedestrian imaging, the four intervals [0,t₁), [t₁,t₂), [t₂,t₃), [t₃,255] correspond to the background, upper body, lower body, and head, respectively; let the pedestrian frame have width w and height h; if the frame correctly locates a pedestrian, then, with its height h divided into four equal parts from top to bottom, the first quarter must contain brightness in the range [t₃,255], the number of pixels with gray values in this range being denoted ∑p₁; the second quarter must contain brightness in the range [t₁,t₂), with pixel count ∑p₂; and the last two quarters must contain brightness in the range [t₂,t₃), with pixel count ∑p₃; the verification rules are: ∑p₁/(w×h)≥1/16, ∑p₂/(w×h)≥1/8, ∑p₃/(w×h)≥1/16; if ∑p₁, ∑p₂, and ∑p₃ satisfy all three conditions, the frame is confirmed as a pedestrian frame; otherwise it is judged to be a non-pedestrian frame.