CN111814595A

CN111814595A - Low-light pedestrian detection method and system based on multi-task learning

Info

Publication number: CN111814595A
Application number: CN202010568470.3A
Authority: CN
Inventors: 卢涛; 王元植; 张彦铎; 赵康辉; 汪慧; 程芳芳
Original assignee: Wuhan Institute of Technology
Current assignee: Wuhan Institute of Technology
Priority date: 2020-06-19
Filing date: 2020-06-19
Publication date: 2020-10-23
Anticipated expiration: 2040-06-19
Also published as: CN111814595B

Abstract

The invention discloses a low-light pedestrian detection method and system based on multi-task learning. The method comprises: acquiring a normal-light pedestrian dataset and a low-light pedestrian dataset; constructing an illumination enhancement network and pre-training it on the normal-light and low-light pedestrian datasets; constructing a pedestrian detection network and pre-training it on the normal-light pedestrian dataset; designing, based on multi-task learning, a multi-task learning module that fuses features across tasks, and sharing features between the two networks to build a low-light pedestrian detection network with multi-task feature sharing; importing the two pre-trained models into that network and training it on the normal-light and low-light pedestrian datasets to obtain a low-light pedestrian detection model with multi-task feature sharing; and using that model to detect the image under test and obtain the positions of pedestrians in it. The invention can accurately and efficiently detect pedestrian positions in low-light images.

Description

Low-light pedestrian detection method and system based on multi-task learning

Technical Field

The invention relates to the technical field of object detection in computer vision, and in particular to a low-light pedestrian detection method and system based on multi-task learning.

Background

Rapid economic development has brought frequent movement of people between regions and cities, and the resulting public-safety risks demand considerable effort from the authorities. Video surveillance equipment, an important part of urban security systems, is now widely deployed in public areas such as streets, schools, and stations. These devices mainly record and store what happens at those locations, supporting remote monitoring and emergency command and helping to safeguard public safety. Pedestrians are the main subjects of video surveillance, and using intelligent techniques to study and analyze pedestrian behavior is an important part of intelligent surveillance. Pedestrian detection is one of the key technologies in this field.

Pedestrian detection also plays a critical role in autonomous driving. In recent years, with the rapid progress of artificial intelligence, research on driverless vehicles has advanced significantly. Pedestrian detection is an important topic in that research and strongly affects a vehicle's ability to perceive the pedestrians around it.

Pedestrian detection is a specific application of object detection. From this perspective, pedestrian detection algorithms fall into two categories: anchor-box-based methods and keypoint-based methods. Convolutional neural networks (CNNs) were first introduced to object detection in R-CNN, which detects objects without hand-crafted features. Keypoint-based object detection algorithms generate object bounding boxes by detecting and grouping keypoints, which greatly simplifies the network output and removes the need to design anchor boxes. CornerNet and CornerNet-Lite are typical keypoint-based methods; they determine an object's boundary by predicting its top-left and bottom-right corners.

Although these algorithms perform satisfactorily under normal lighting, normal lighting cannot always be guaranteed in practice, and low-light environments are very common. Detection degrades in low light mainly because the input image loses much of its color information, which is key to pedestrian detection. Infrared pedestrian detection has been developed to address low-light conditions, but it requires infrared images and cannot operate directly on the original low-light image.

Summary of the Invention

The technical problem solved by the invention is to provide a low-light pedestrian detection method and system based on multi-task learning that addresses the poor performance of pedestrian detection in low-light environments.

To solve this problem, the invention provides a low-light pedestrian detection method based on multi-task learning, comprising the following steps:

S1. Acquire a normal-light pedestrian dataset and a low-light pedestrian dataset;

S2. Construct an illumination enhancement network comprising a decomposition network and an enhancement network, and train it on the normal-light and low-light pedestrian datasets to obtain an illumination enhancement pre-trained model;

S3. Construct a pedestrian detection network that uses two hourglass networks as its backbone, with a spatial transformer network and a squeeze-and-excitation network added respectively, and train it on the normal-light pedestrian dataset to obtain a pedestrian detection pre-trained model;

S4. Based on multi-task learning, design a multi-task learning module that fuses features across tasks, and share features between the illumination enhancement network and the pedestrian detection network: the features of the first 3*3 convolutional layer of the enhancement network and the features of the last spatial transformer network of the pedestrian detection network are added and fed back to both networks, and the features of the last 3*3 convolutional layer of the enhancement network and the features of the first residual module of the pedestrian detection network are likewise added and fed back to both networks, thereby constructing a low-light pedestrian detection network with multi-task feature sharing;

S5. Import the illumination enhancement pre-trained model and the pedestrian detection pre-trained model into the low-light pedestrian detection network with multi-task feature sharing, and train that network on the normal-light and low-light pedestrian datasets to obtain a low-light pedestrian detection model with multi-task feature sharing;

S6. Use the low-light pedestrian detection model with multi-task feature sharing to detect the image under test and obtain the positions of pedestrians in the image.

Further, in step S2, the illumination enhancement network is constructed based on the RetinexNet convolutional neural network. The loss function of the illumination enhancement network is:

$$L_{enh} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{is} L_{is}$$

where $\lambda_{ir}$ and $\lambda_{is}$ are weight coefficients, and $L_{recon}$, $L_{ir}$, and $L_{is}$ denote the reconstruction, reflectance, and illumination-smoothness loss functions, respectively.

Further, in step S3, the pedestrian detection network is constructed based on CornerNet-Saccade. The loss function of the pedestrian detection network is:

$$L_{cor} = L_{det} + \delta L_{pull} + \eta L_{push} + \gamma L_{off}$$

where $\delta$, $\eta$, and $\gamma$ are the weights of the three loss functions $L_{pull}$, $L_{push}$, and $L_{off}$, respectively.

Here $L_{det}$ is the corner detection loss, $N$ is the number of objects in the image, $\alpha$ and $\beta$ are hyperparameters that control the contribution of each corner, $C$, $H$, and $W$ are the number of channels, height, and width of the input, $p_{aij}$ is the score at position $(i, j)$ for class $a$ in the predicted image, and $y_{aij}$ is the unnormalized original image;

$L_{off}$ is the offset loss, $o_k$ is the offset, $x_k$ and $y_k$ are the x and y coordinates of corner $k$, and $n$ is the downsampling factor;

$L_{pull}$ groups corners and $L_{push}$ separates them; $m$ indexes the objects, and $e_m$ is the average of the embedding of the top-left corner of object $m$ and the embedding of its bottom-right corner.

Further, the total training loss function of the low-light pedestrian detection network with multi-task feature sharing is:

$$L = L_{cor} + \zeta L_{enh} = L_{det} + \delta L_{pull} + \eta L_{push} + \gamma L_{off} + \zeta L_{enh}$$

where $L$ is the total loss and $\zeta$ is the weight of the illumination enhancement loss $L_{enh}$.

The invention also provides a low-light pedestrian detection system based on multi-task learning for implementing the above method, comprising:

a dataset module for acquiring the normal-light pedestrian dataset and the low-light pedestrian dataset;

an illumination enhancement module for constructing the illumination enhancement network, which comprises a decomposition network and an enhancement network, and training it on the normal-light and low-light pedestrian datasets to obtain the illumination enhancement pre-trained model;

a pedestrian detection module for constructing the pedestrian detection network, which uses two hourglass networks as its backbone with a spatial transformer network and a squeeze-and-excitation network added respectively, and training it on the normal-light pedestrian dataset to obtain the pedestrian detection pre-trained model;

a multi-task learning module for sharing features between the illumination enhancement network and the pedestrian detection network: the features of the first 3*3 convolutional layer of the enhancement network and the features of the last spatial transformer network of the pedestrian detection network are added and fed back to both networks, and the features of the last 3*3 convolutional layer of the enhancement network and the features of the first residual module of the pedestrian detection network are likewise added and fed back to both networks, constructing the low-light pedestrian detection network with multi-task feature sharing;

a model training module for importing the illumination enhancement pre-trained model and the pedestrian detection pre-trained model into the low-light pedestrian detection network with multi-task feature sharing, and training that network on the normal-light and low-light pedestrian datasets to obtain the low-light pedestrian detection model with multi-task feature sharing;

an image detection module for using the low-light pedestrian detection model with multi-task feature sharing to detect the image under test and obtain the positions of pedestrians in the image.

Further, the illumination enhancement network is constructed based on the RetinexNet convolutional neural network.

Further, the pedestrian detection network is constructed based on CornerNet-Saccade.

The invention also provides a computer storage medium storing a computer program executable by a computer processor, the computer program performing the above low-light pedestrian detection method based on multi-task learning.

The beneficial effects of the invention are as follows. The invention provides a low-light pedestrian detection method and system based on multi-task learning that can accurately and efficiently detect pedestrian positions in low-light images. In addition, two modules, a spatial transformer network and a squeeze-and-excitation network, are added to the hourglass backbone of the pedestrian detection network, improving the network's spatial and channel attention and thereby its detection performance.

Description of Drawings

FIG. 1 is a flowchart of the low-light pedestrian detection method based on multi-task learning of the invention.

FIG. 2 is a structure diagram of the low-light pedestrian detection network based on multi-task learning of the invention.

FIG. 3 is a schematic diagram of the low-light pedestrian detection system based on multi-task learning of the invention.

FIG. 4 is a comparison chart of the test results of an embodiment of the invention.

Detailed Description

The low-light pedestrian detection method and system based on multi-task learning of the invention are further described below with reference to the accompanying drawings.

The invention has four main parts: the illumination enhancement pre-trained model; the pedestrian detection pre-trained model; the low-light pedestrian detection model with multi-task feature sharing; and inference with that model to locate pedestrians in low-light images.

The low-light pedestrian detection method based on multi-task learning according to an embodiment of the invention, as shown in FIG. 1, comprises the following steps:

S1. Acquire a normal-light pedestrian dataset and a low-light pedestrian dataset.

The datasets can be captured directly with a camera. Alternatively, the illumination of the CityPersons (a public large-scale pedestrian detection dataset) training and validation sets can be reduced. Here we use the RGB-space brightness adjustment algorithm in OpenCV (a cross-platform computer vision library released under the open-source BSD license). The algorithm scales each channel in proportion to its current value, so the larger the R, G, and B values, the larger the adjustment. For example, if the current pixel is (100, 200, 50) and the adjustment coefficient is 1.1, the adjusted pixel is (110, 220, 55). In this embodiment we use an adjustment coefficient of 0.8. After reducing the brightness, we use the normal-light and low-light CityPersons training and validation sets as our training and test sets.
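The proportional adjustment described above can be sketched in plain Python. This is a minimal illustration, not the embodiment's code: the function names are hypothetical, and rounding plus clipping to the 0..255 range is an assumption (the source does not name the OpenCV call it uses; `cv2.convertScaleAbs` with `alpha` as the coefficient would be one candidate).

```python
def scale_pixel(pixel, factor):
    """Scale each RGB channel by `factor`, then round and clip to 0..255.

    Rounding and clipping are assumptions; the source only gives the scaling rule.
    """
    return tuple(max(0, min(255, round(c * factor))) for c in pixel)


def darken_image(image, factor=0.8):
    """Apply scale_pixel to every pixel of an image stored as a list of rows."""
    return [[scale_pixel(p, factor) for p in row] for row in image]


# Worked example from the text: a coefficient of 1.1 brightens the pixel.
print(scale_pixel((100, 200, 50), 1.1))  # (110, 220, 55)
# The embodiment's coefficient of 0.8 darkens the same pixel.
print(scale_pixel((100, 200, 50), 0.8))  # (80, 160, 40)
```

Because the change is multiplicative, bright channels lose more absolute intensity than dark ones, which matches the statement that larger values receive larger adjustments.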

S2. Construct an illumination enhancement network comprising a decomposition network and an enhancement network, and train it on the normal-light and low-light pedestrian datasets to obtain an illumination enhancement pre-trained model.

The illumination enhancement network is trained on its own for 100 epochs on the normal-light and low-light pedestrian detection training sets to obtain the illumination enhancement pre-trained model. Its structure is shown in FIG. 2. It is built on the RetinexNet convolutional neural network, which introduces Retinex theory. Classical Retinex theory models human color perception and assumes that an observed image can be decomposed into a reflectance component and an illumination component. Let $S$ denote the source image; then $S = R * I$, where $R$ is the reflectance component, $I$ is the illumination component, and $*$ is element-wise multiplication. To ensure that the illumination-restored image preserves object edge information while keeping illumination transitions smooth, the illumination enhancement network uses the following loss function:

$$L_{enh} = L_{recon} + \lambda_{ir} L_{ir} + \lambda_{is} L_{is}$$

where $\lambda_{ir}$ and $\lambda_{is}$ are coefficients that balance the reflectance and illumination terms, and $L_{recon}$, $L_{ir}$, and $L_{is}$ denote the reconstruction, reflectance, and illumination-smoothness loss functions, respectively.
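The Retinex relation and the weighted loss can be illustrated with a small sketch. The function names and all numeric values are hypothetical; the source does not state the values of the two balancing coefficients, so they are passed in as parameters, and the three component losses are treated as already computed scalars.

```python
def recompose(reflectance, illumination):
    """Element-wise product S = R * I for two equally sized 2-D grids."""
    return [[r * i for r, i in zip(r_row, i_row)]
            for r_row, i_row in zip(reflectance, illumination)]


def enhancement_loss(l_recon, l_ir, l_is, lambda_ir, lambda_is):
    """L_enh = L_recon + lambda_ir * L_ir + lambda_is * L_is."""
    return l_recon + lambda_ir * l_ir + lambda_is * l_is


refl = [[0.5, 0.8], [0.2, 1.0]]      # illustrative reflectance values
illum = [[0.5, 0.5], [0.25, 0.25]]   # illustrative illumination values
source = recompose(refl, illum)      # each entry is refl * illum at that pixel
```

A darker scene corresponds to a smaller illumination grid with the same reflectance, which is why enhancement can work on the illumination component alone.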

S3. Construct a pedestrian detection network that uses two hourglass networks as its backbone, with a spatial transformer network and a squeeze-and-excitation network added respectively, and train it on the normal-light pedestrian dataset to obtain a pedestrian detection pre-trained model.

The pedestrian detection network is trained on its own for 100 epochs on the normal-light pedestrian detection training set to obtain the pedestrian detection pre-trained model. Its structure is shown in FIG. 2; it is built on CornerNet-Saccade. The network follows the keypoint-based approach to object detection, which generates object bounding boxes by detecting and grouping keypoints; this greatly simplifies the network output and removes the need to design anchor boxes. An attention mechanism is also introduced into the network to further improve detection performance. This patent uses the hourglass network as the backbone of the pedestrian detection network and adds two modules to it, a spatial transformer network (STN) and a squeeze-and-excitation network (SENet), to improve the network's spatial and channel attention and thereby its detection performance. In addition, we pruned the hourglass network from its original 54 layers to 36 layers, which greatly reduces the number of parameters and lowers the cost of training.
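To make the channel-attention idea concrete, the following is a minimal sketch of a squeeze-and-excitation block as it is generally described: global average pooling, two fully connected layers, a sigmoid gate, and channel-wise rescaling. The source does not specify where the block sits in the hourglass network, its reduction ratio, or its weights, so the function name, the weight matrices `w1` and `w2`, and the omission of biases are all assumptions.

```python
import math


def squeeze_excite(x, w1, w2):
    """Channel attention on a feature map x[c][h][w].

    w1 has shape [hidden][C] and w2 has shape [C][hidden]; biases are omitted.
    Both weight matrices are hypothetical stand-ins for learned parameters.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields one gate per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]
```

With all-zero weights every gate is 0.5, so the block simply halves the input; learned weights let the network emphasize some channels over others.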

For the loss function of the pedestrian detection network, a focal loss with $\alpha = 2$ and $\beta = 4$ can be used. Let $p_{aij}$ be the score at position $(i, j)$ for class $a$ in the predicted image, and let $y_{aij}$ be the unnormalized original image.

Here $N$ is the number of objects in the image, and $\alpha$ and $\beta$ are hyperparameters that control the contribution of each point. $C$, $H$, and $W$ are the number of channels, height, and width of the input, respectively.
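As an illustration of a focal loss with these hyperparameters, the sketch below assumes the variant used in CornerNet, on which the detection network is based; whether the patent's formula matches it term for term is an assumption. In that formulation `target` is a ground-truth heatmap whose positive locations equal 1 and whose other values lie in [0, 1); the function name and the flattened-list input layout are hypothetical.

```python
import math


def corner_focal_loss(pred, target, num_objects, alpha=2, beta=4):
    """Focal loss over heatmap scores flattened across C x H x W.

    pred values must lie strictly between 0 and 1; target values lie in [0, 1].
    Assumed form (CornerNet style):
      positives (y == 1): (1 - p)^alpha * log(p)
      otherwise:          (1 - y)^beta * p^alpha * log(1 - p)
    The sum is negated and divided by the number of objects N.
    """
    total = 0.0
    for p, y in zip(pred, target):
        if y == 1:
            total += (1 - p) ** alpha * math.log(p)
        else:
            total += (1 - y) ** beta * p ** alpha * math.log(1 - p)
    return -total / num_objects
```

The `(1 - p)^alpha` factor shrinks the contribution of positives that are already scored confidently, and `(1 - y)^beta` reduces the penalty for locations close to a true corner.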

When the input image passes through the convolutional layers, the output is usually smaller than the input. A position $(x, y)$ in the image therefore maps to a coarser position in the heatmap, scaled down by the downsampling factor $n$. Some precision may be lost when we remap positions from the heatmap back to the input image. To address this, we predict position offsets that slightly adjust the corner positions before they are remapped to the input resolution.

Here $o_k$ is the offset, and $x_k$ and $y_k$ are the x and y coordinates of corner $k$. We predict one set of offsets shared by the top-left corners of all classes and another set shared by the bottom-right corners. For training, the offset loss is denoted $L_{off}$, and a smooth L1 loss is applied as the offset loss.
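The precision loss and its correction can be sketched as follows, assuming the mapping is the floor division used in CornerNet and that the offset is the fractional part discarded by it. The function names are hypothetical, and the smooth L1 transition point of 1 is the common default rather than a value stated in the source.

```python
import math


def to_heatmap(x, y, n):
    """Map an image position to its heatmap cell and the fractional offset lost."""
    cell = (math.floor(x / n), math.floor(y / n))
    offset = (x / n - cell[0], y / n - cell[1])
    return cell, offset


def smooth_l1(pred, target):
    """Smooth L1 summed over coordinates: quadratic below 1, linear above."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1 else d - 0.5
    return total


def offset_loss(pred_offsets, true_offsets):
    """Average smooth L1 distance between predicted and true corner offsets."""
    count = len(true_offsets)
    return sum(smooth_l1(p, t) for p, t in zip(pred_offsets, true_offsets)) / count


cell, offset = to_heatmap(13, 22, 4)
print(cell, offset)  # (3, 5) (0.25, 0.5)
```

Adding the predicted offset to the heatmap cell and multiplying by `n` recovers the original coordinates, here (3.25 * 4, 5.5 * 4) = (13, 22).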

An image may contain multiple objects, so multiple top-left and bottom-right corners may be detected, and it is necessary to determine whether a given pair of top-left and bottom-right corners belongs to the same bounding box. Each corner is assigned an embedding: one for the top-left corner of object $m$ and one for its bottom-right corner. A "pull" loss trains the network to group corners, and a "push" loss trains it to separate corners.

Here $e_m$ is the average of the two corner embeddings of object $m$. The total training loss for pedestrian detection is:

$$L_{cor} = L_{det} + \delta L_{pull} + \eta L_{push} + \gamma L_{off}$$

where $\delta$, $\eta$, and $\gamma$ are the weights of the three loss functions $L_{pull}$, $L_{push}$, and $L_{off}$, respectively, and $L_{cor}$ is the total loss of the pedestrian detection network.
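A minimal sketch of the grouping losses follows, assuming the CornerNet formulation with scalar embeddings: the pull term is the mean squared distance of each corner embedding from its object's mean, and the push term is a hinge on the distance between the means of different objects. The function name and the margin of 1 are assumptions not stated in the source.

```python
def pull_push_losses(top_left, bottom_right, margin=1.0):
    """Return (pull, push) for per-object scalar corner embeddings.

    top_left[m] and bottom_right[m] are the embeddings of object m's corners.
    """
    count = len(top_left)
    means = [(t + b) / 2 for t, b in zip(top_left, bottom_right)]
    # Pull: draw both corners of an object toward their shared mean e_m.
    pull = sum((t - e) ** 2 + (b - e) ** 2
               for t, b, e in zip(top_left, bottom_right, means)) / count
    # Push: keep the means of different objects at least `margin` apart.
    push = 0.0
    if count > 1:
        push = sum(max(0.0, margin - abs(means[k] - means[j]))
                   for k in range(count) for j in range(count) if j != k)
        push /= count * (count - 1)
    return pull, push
```

At inference time, corners whose embeddings are close are paired into one box, so a low pull loss and a low push loss together make the pairing unambiguous.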

S4. Based on multi-task learning, design a multi-task learning module that fuses features across tasks, and share features between the illumination enhancement network and the pedestrian detection network: the features of the first 3*3 convolutional layer of the enhancement network and the features of the last spatial transformer network of the pedestrian detection network are added and fed back to both networks, and the features of the last 3*3 convolutional layer of the enhancement network and the features of the first residual module of the pedestrian detection network are likewise added and fed back to both networks, constructing a low-light pedestrian detection network with multi-task feature sharing.

The low-light pedestrian detection network with multi-task feature sharing comprises the illumination enhancement network and the pedestrian detection network. To further improve the performance of both, a multi-task learning module is introduced on top of them; its detailed structure is shown in FIG. 2. The features $a_1$ and $a_2$ of the enhancement network are added to the features $b_1$ and $b_2$ of the hourglass network + STN module, respectively; the two sums are denoted $F_1$ and $F_2$ and are fed back to both networks. Features $a_1$ and $a_2$ come from the first and the last 3*3 convolutional layers of the enhancement network, respectively. Feature $b_1$ comes from the last STN module of the hourglass network + STN, and feature $b_2$ comes from the first residual module. All of these features have the same size. $F_1$ and $F_2$ are given by:

$$F_1 = a_1 + b_1, \quad F_2 = a_2 + b_2.$$
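The fusion itself is a plain element-wise sum, which is only defined because the paired features share one size. The sketch below uses nested lists as stand-ins for feature maps; the function names and sample values are hypothetical, and how each network consumes the fused feature afterwards is not specified here.

```python
def shape(t):
    """Return (channels, height, width) of a nested-list feature map."""
    return len(t), len(t[0]), len(t[0][0])


def fuse(a, b):
    """Element-wise sum F = a + b; both maps must have the same shape."""
    if shape(a) != shape(b):
        raise ValueError(f"shape mismatch: {shape(a)} vs {shape(b)}")
    return [[[u + v for u, v in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


a1 = [[[1.0, 2.0], [3.0, 4.0]]]   # stand-in for the first 3*3 conv output
b1 = [[[0.5, 0.5], [0.5, 0.5]]]   # stand-in for the last STN output
f1 = fuse(a1, b1)                 # the same fused map is handed to both branches
```

Addition keeps the channel count unchanged, so the fused feature can replace either input without altering the layers that follow.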

Finally, the training loss function $L$ of the low-light pedestrian detection network with multi-task feature sharing is:

$$L = L_{det} + \delta L_{pull} + \eta L_{push} + \gamma L_{off} + \zeta L_{enh}$$

where $\delta$, $\eta$, and $\gamma$ are the weights of the three loss functions $L_{pull}$, $L_{push}$, and $L_{off}$, respectively, and $\zeta$ is the weight of the illumination enhancement loss $L_{enh}$. We set $\delta$ and $\eta$ to 0.1, $\gamma$ to 1, and $\zeta$ to 0.05.
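With the stated weights, the combined objective reduces to a few lines. The sketch treats each component loss as an already computed scalar; the function name and the sample inputs are hypothetical.

```python
def total_loss(l_det, l_pull, l_push, l_off, l_enh,
               delta=0.1, eta=0.1, gamma=1.0, zeta=0.05):
    """L = L_det + delta*L_pull + eta*L_push + gamma*L_off + zeta*L_enh."""
    # Detection part, identical to the stand-alone detection loss L_cor.
    l_cor = l_det + delta * l_pull + eta * l_push + gamma * l_off
    # The enhancement loss enters with a small weight zeta.
    return l_cor + zeta * l_enh


# Illustrative component values only: 1.0 + 0.2 + 0.3 + 0.5 + 0.2
print(total_loss(1.0, 2.0, 3.0, 0.5, 4.0))
```

The small value of `zeta` means the enhancement branch acts mainly as an auxiliary signal, while the detection terms dominate the gradient.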

S5. Import the illumination enhancement pre-trained model and the pedestrian detection pre-trained model into the low-light pedestrian detection network with multi-task feature sharing, and train that network on the normal-light and low-light pedestrian datasets to obtain a low-light pedestrian detection model with multi-task feature sharing.

The low-light pedestrian detection network with multi-task feature sharing is initialized from the previously trained illumination enhancement and pedestrian detection pre-trained models and then trained for 100 epochs on the normal-light and low-light pedestrian detection training sets, yielding the low-light pedestrian detection model with multi-task feature sharing.

S6. Use the low-light pedestrian detection model with multi-task feature sharing to detect the image under test and obtain the positions of pedestrians in the image. The low-light test-set images are fed into the trained model for inference, and the pedestrians' positions are marked with boxes in the image.

本发明还提供一种用于实现上述基于多任务学习的低光照行人检测方法的基于多任务学习的低光照行人检测系统，如图3所示，包括：The present invention also provides a low-light pedestrian detection system based on multi-task learning for realizing the above-mentioned multi-task learning-based low-light pedestrian detection method, as shown in FIG. 3 , including:

数据集模块101，用于获取正常光照行人数据集和低光照行人数据集；The dataset module 101 is used to obtain a normal lighting pedestrian dataset and a low lighting pedestrian dataset;

光照增强模块102，用于构建光照增强网络，光照增强网络包括分解网络和增强网络，利用正常光照行人数据集和低光照行人数据集对光照增强网络进行训练，得到光照增强预训练模型；The illumination enhancement module 102 is used to construct an illumination enhancement network, the illumination enhancement network includes a decomposition network and an enhancement network, and the illumination enhancement network is trained by using the normal illumination pedestrian dataset and the low illumination pedestrian dataset to obtain an illumination enhancement pre-training model;

行人检测模块103，用于构建行人检测网络，行人检测网络以两个沙漏网络为主干网络，并分别加入空间转换网络和挤压激励网络，利用正常光照行人数据集对行人检测网络进行训练，得到行人检测预训练模型；The pedestrian detection module 103 is used to construct a pedestrian detection network. The pedestrian detection network uses two hourglass networks as the backbone network, and adds a spatial transformation network and a squeeze excitation network respectively, and uses the normal illumination pedestrian dataset to train the pedestrian detection network to obtain Pedestrian detection pre-training model;

多任务学习模块104，用于对不同任务之间的特征进行融合，对光照增强网络和行人检测网络进行特征共享，将增强网络的第一个3*3卷积网络的特征和行人检测网络的最后一个空间转换网络的特征进行相加并反馈给两个网络，将增强网络的最后一个3*3卷积网络的特征和行人检测网络的第一个残差模块的特征也进行相加并反馈给两个网络，构建多任务特征共享的低光照行人检测网络；The multi-task learning module 104 is used to fuse the features between different tasks, share the features of the illumination enhancement network and the pedestrian detection network, and combine the characteristics of the first 3*3 convolutional network of the enhancement network with the characteristics of the pedestrian detection network. The features of the last spatial transformation network are added and fed back to the two networks. The features of the last 3*3 convolutional network of the enhancement network and the features of the first residual module of the pedestrian detection network are also added and fed back. For two networks, build a low-light pedestrian detection network with multi-task feature sharing;

模型训练模块105，用于将光照增强预训练模型和行人检测预训练模型导入到多任务特征共享的低光照行人检测网络，并利用正常光照行人数据集和低光照行人数据集对多任务特征共享的低光照行人检测网络进行训练，得到多任务特征共享的低光照行人检测模型；The model training module 105 is used to import the illumination enhancement pre-training model and the pedestrian detection pre-training model into the multi-task feature sharing low-light pedestrian detection network, and use the normal illumination pedestrian data set and the low-light pedestrian data set to share the multi-task features The low-light pedestrian detection network is trained to obtain a low-light pedestrian detection model with multi-task feature sharing;

The image detection module 106 is configured to detect the image to be detected using the low-light pedestrian detection model with multi-task feature sharing, and to obtain the positions of pedestrians in the image.

Further, the illumination enhancement network is constructed based on the RetinexNet convolutional neural network, and the pedestrian detection network is constructed based on CornerNet-Saccade.
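The feature sharing performed by the multi-task learning module 104 amounts to an element-wise sum from which both branches then continue. The following is a minimal sketch of one such fusion point, not the patented implementation. It assumes the two feature maps have already been brought to the same shape (the text does not say how), and it uses a hypothetical helper `fuse_features` on plain nested lists rather than framework tensors.

```python
from typing import List, Tuple

# One H x W feature channel, kept as nested lists for brevity (an assumption;
# a real network would use framework tensors with batch and channel axes).
FeatureMap = List[List[float]]


def fuse_features(enh: FeatureMap, det: FeatureMap) -> Tuple[FeatureMap, FeatureMap]:
    """Add two same-shaped feature maps and return one fused copy per branch."""
    if len(enh) != len(det) or any(len(r) != len(s) for r, s in zip(enh, det)):
        raise ValueError("feature maps must have the same shape before they can be added")
    fused = [[a + b for a, b in zip(r, s)] for r, s in zip(enh, det)]
    # Each branch (enhancement, detection) continues from its own copy of the sum.
    return [row[:] for row in fused], [row[:] for row in fused]


if __name__ == "__main__":
    enh_feat = [[1.0, 2.0], [3.0, 4.0]]   # e.g. output of the first 3x3 convolution
    det_feat = [[0.5, 0.5], [0.5, 0.5]]   # e.g. output of the last spatial transformer
    to_enh, to_det = fuse_features(enh_feat, det_feat)
    print(to_enh, to_det)
```

The same helper would be applied a second time for the other fusion point (last 3×3 convolution with the first residual module).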

Finally, the present invention provides a test example using the normal-illumination and low-light CityPersons datasets, comprising 2975 images each in the normal-illumination and low-light training sets and 500 images each in the normal-illumination and low-light test sets. The experiments were implemented in PyTorch and trained on two RTX 2080Ti GPUs with the Adam optimizer. The learning rate was set to 0.0001. The evaluation metric follows the Caltech evaluation standard: the log-average miss rate over false positives per image (MR^-2), where a lower MR^-2 value indicates better performance. The superiority of the present invention is demonstrated by comparing it with other strong pedestrian detection algorithms on the MR^-2 metric. Table 1 shows the comparison results in terms of MR^-2, and Figure 4 compares the pedestrian position detection results.
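For readers unfamiliar with MR^-2, the metric can be sketched as follows. This is an illustrative reading of the Caltech protocol, not the evaluation code used in the experiments: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points evenly spaced in log space over [10^-2, 10^0], and the geometric mean of the samples is taken. The function name `log_average_miss_rate` and the choice to count a miss rate of 1.0 when the curve has no point at or below a reference are assumptions of this sketch.

```python
import math
from typing import Sequence


def log_average_miss_rate(fppi: Sequence[float], miss_rate: Sequence[float]) -> float:
    """Geometric mean of the miss rate at 9 FPPI points log-spaced in [1e-2, 1e0].

    `fppi` must be sorted in ascending order; `miss_rate[i]` is the miss rate
    reached at `fppi[i]` on the detector's miss-rate/FPPI curve.
    """
    if not fppi or len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must be non-empty and of equal length")
    refs = [10 ** (-2 + 0.25 * k) for k in range(9)]
    sampled = []
    for ref in refs:
        # Use the curve point with the largest FPPI that does not exceed the reference.
        below = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve starts to the right of the reference, count 1.0.
        sampled.append(below[-1] if below else 1.0)
    # Clamp to avoid log(0), then average in log space.
    return math.exp(sum(math.log(max(mr, 1e-10)) for mr in sampled) / len(sampled))


if __name__ == "__main__":
    curve_fppi = [0.001, 0.1, 1.0]
    curve_mr = [0.6, 0.3, 0.15]
    print(f"MR^-2 = {log_average_miss_rate(curve_fppi, curve_mr):.4f}")
```

Averaging in log space keeps a single very low miss rate at high FPPI from dominating the score.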

The pedestrian detection methods selected for comparison are CSP, ALFNet and CornerNet-Saccade. ALFNet is the most representative anchor-based algorithm in pedestrian detection, and CSP is the best center-point-based algorithm in pedestrian detection, while CornerNet-Saccade is the best keypoint-based algorithm in object detection. CSP proposes two models: CSPoff, which takes the offset into account, and CSPnooff, which does not. The algorithms in the first four rows of Table 1 were all trained on both the normal-illumination and low-light training sets, so all of these pedestrian detection networks are able to handle low-light images, which keeps the comparison fair. In addition, to further illustrate the role of the multi-task learning module in the algorithm of the present invention, the algorithm in the fifth row of Table 1 cascades the RetinexNet illumination enhancement algorithm with the CornerNet-Saccade detection algorithm. The results in the table show that the cascaded method still does not outperform the algorithm of the present invention, which indicates that the multi-task learning module plays an important role.

Table 1. Comparison of the present invention with five competitive algorithms

As the experimental results in the above table show, the present algorithm achieves a clear advantage over the other five methods.

Anything not elaborated in this specification belongs to the prior art or to common general knowledge. The embodiments serve only to illustrate the invention and not to limit its scope; modifications such as equivalent substitutions made to the invention by those skilled in the art are considered to fall within the scope of protection of the claims.

Those skilled in the art will readily understand that the above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A low-light pedestrian detection method based on multi-task learning, characterized by comprising the following steps:

S1, acquiring a normal-illumination pedestrian dataset and a low-light pedestrian dataset;

S2, constructing an illumination enhancement network, wherein the illumination enhancement network comprises a decomposition network and an enhancement network, and training the illumination enhancement network on the normal-illumination pedestrian dataset and the low-light pedestrian dataset to obtain an illumination enhancement pre-trained model;

S3, constructing a pedestrian detection network, wherein the pedestrian detection network uses two hourglass networks as its backbone, to which a spatial transformer network and a squeeze-and-excitation network are respectively added, and training the pedestrian detection network on the normal-illumination pedestrian dataset to obtain a pedestrian detection pre-trained model;

S4, designing, based on multi-task learning, a multi-task learning module capable of fusing features across different tasks, and sharing features between the illumination enhancement network and the pedestrian detection network, wherein the features of the first 3×3 convolutional network of the enhancement network and the features of the last spatial transformer network of the pedestrian detection network are added together and fed back to both networks, and the features of the last 3×3 convolutional network of the enhancement network and the features of the first residual module of the pedestrian detection network are also added together and fed back to both networks, thereby constructing a low-light pedestrian detection network with multi-task feature sharing;

S5, importing the illumination enhancement pre-trained model and the pedestrian detection pre-trained model into the low-light pedestrian detection network with multi-task feature sharing, and training that network on the normal-illumination pedestrian dataset and the low-light pedestrian dataset to obtain a low-light pedestrian detection model with multi-task feature sharing;

S6, detecting the image to be detected using the low-light pedestrian detection model with multi-task feature sharing to obtain the positions of pedestrians in the image.

2. The low-light pedestrian detection method based on multi-task learning according to claim 1, wherein in step S2 the illumination enhancement network is constructed based on the RetinexNet convolutional neural network.

3. The low-light pedestrian detection method based on multi-task learning according to claim 2, characterized in that the loss function of the illumination enhancement network is:

L_enh = L_recon + λ_ir·L_ir + λ_is·L_is

where λ_ir and λ_is are weight coefficients, and L_recon, L_ir and L_is denote the reconstruction, reflectance and illumination smoothness loss functions, respectively.

4. The low-light pedestrian detection method based on multi-task learning according to claim 3, wherein in step S3 the pedestrian detection network is constructed based on CornerNet-Saccade.

5. The low-light pedestrian detection method based on multi-task learning according to claim 4, characterized in that the loss function of the pedestrian detection network is:

L_cor = L_det + L_pull + η·L_push + γ·L_off

where η and γ are the weights of the loss terms L_pull, L_push and L_off;

L_det is the corner loss, in which N is the number of objects in the image, α and β are hyperparameters controlling the contribution of each corner point, C, H and W denote the number of channels, the height and the width of the input, respectively, p_aij is the score at position (i, j) for class a in the predicted image, and y_aij is the unnormalized original image;

L_off is the offset loss, in which o_k is the offset, x_k and y_k are the x and y coordinates of corner k, and n is the down-sampling factor;

L_pull is the loss used to group corners and L_push is the loss used to separate corners, in which m is the object index, e_tm is the embedding of the top-left corner of object m, e_bm is the embedding of the bottom-right corner of object m, and e_m is the average of e_tm and e_bm.
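For reference, the symbol definitions in claim 5 correspond to the loss terms published for CornerNet, on which CornerNet-Saccade builds. The block below is a sketch of that published formulation. It is assumed, not confirmed, to match the formulas of this claim. The margin Δ, the SmoothL1 penalty, the predicted offset ô_k, and the notation e_{t_m}, e_{b_m} for the top-left and bottom-right corner embeddings of object m are taken from CornerNet and are not named in the text above.

```latex
\begin{aligned}
L_{det} &= -\frac{1}{N}\sum_{a=1}^{C}\sum_{i=1}^{H}\sum_{j=1}^{W}
\begin{cases}
(1-p_{aij})^{\alpha}\,\log(p_{aij}) & \text{if } y_{aij}=1\\[2pt]
(1-y_{aij})^{\beta}\,p_{aij}^{\alpha}\,\log(1-p_{aij}) & \text{otherwise}
\end{cases}\\[6pt]
o_k &= \left(\frac{x_k}{n}-\left\lfloor\frac{x_k}{n}\right\rfloor,\;
             \frac{y_k}{n}-\left\lfloor\frac{y_k}{n}\right\rfloor\right),
\qquad
L_{off} = \frac{1}{N}\sum_{k=1}^{N}\operatorname{SmoothL1}\!\left(o_k,\hat{o}_k\right)\\[6pt]
L_{pull} &= \frac{1}{N}\sum_{m=1}^{N}\left[(e_{t_m}-e_m)^2+(e_{b_m}-e_m)^2\right]\\[6pt]
L_{push} &= \frac{1}{N(N-1)}\sum_{m=1}^{N}\sum_{\substack{j=1\\ j\neq m}}^{N}
            \max\!\left(0,\;\Delta-\lvert e_m-e_j\rvert\right)
\end{aligned}
```

In this formulation L_pull draws the two corner embeddings of one object toward their mean e_m, while L_push keeps the mean embeddings of different objects at least Δ apart.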

6. The low-light pedestrian detection method based on multi-task learning according to claim 5, wherein the overall training loss function of the low-light pedestrian detection network with multi-task feature sharing is:

L = L_cor + ζ·L_enh = L_det + L_pull + η·L_push + γ·L_off + ζ·L_enh

where L is the total loss and ζ is the weight of the illumination enhancement loss L_enh.
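How the loss terms of claims 3, 5 and 6 combine can be illustrated with a short sketch that follows the expanded right-hand side of the total-loss formula. The function names are hypothetical, and the weight values in the usage example are placeholders chosen for easy arithmetic; the text gives no values for η, γ, ζ, λ_ir or λ_is.

```python
def enhancement_loss(l_recon: float, l_ir: float, l_is: float,
                     lambda_ir: float, lambda_is: float) -> float:
    """L_enh = L_recon + lambda_ir * L_ir + lambda_is * L_is (claim 3)."""
    return l_recon + lambda_ir * l_ir + lambda_is * l_is


def detection_loss(l_det: float, l_pull: float, l_push: float, l_off: float,
                   eta: float, gamma: float) -> float:
    """L_cor = L_det + L_pull + eta * L_push + gamma * L_off (claim 5)."""
    return l_det + l_pull + eta * l_push + gamma * l_off


def total_loss(l_cor: float, l_enh: float, zeta: float) -> float:
    """L = L_cor + zeta * L_enh, i.e. the expanded form given in claim 6."""
    return l_cor + zeta * l_enh


if __name__ == "__main__":
    # Placeholder loss values and weights, for illustration only.
    l_enh = enhancement_loss(1.0, 0.5, 0.25, lambda_ir=2.0, lambda_is=4.0)
    l_cor = detection_loss(1.0, 0.5, 0.25, 0.125, eta=2.0, gamma=4.0)
    print(total_loss(l_cor, l_enh, zeta=0.5))
```

Keeping ζ as a separate weight lets the enhancement objective act as an auxiliary task without overwhelming the detection loss during joint training.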

7. A low-light pedestrian detection system based on multi-task learning for implementing the low-light pedestrian detection method based on multi-task learning according to claim 1, characterized by comprising:

a dataset module, configured to acquire a normal-illumination pedestrian dataset and a low-light pedestrian dataset;

an illumination enhancement module, configured to construct an illumination enhancement network, wherein the illumination enhancement network comprises a decomposition network and an enhancement network, and to train the illumination enhancement network on the normal-illumination pedestrian dataset and the low-light pedestrian dataset to obtain an illumination enhancement pre-trained model;

a pedestrian detection module, configured to construct a pedestrian detection network, wherein the pedestrian detection network uses two hourglass networks as its backbone, to which a spatial transformer network and a squeeze-and-excitation network are respectively added, and to train the pedestrian detection network on the normal-illumination pedestrian dataset to obtain a pedestrian detection pre-trained model;

a multi-task learning module, configured to share features between the illumination enhancement network and the pedestrian detection network, to add the features of the first 3×3 convolutional network of the enhancement network to the features of the last spatial transformer network of the pedestrian detection network and feed the sum back to both networks, and to add the features of the last 3×3 convolutional network of the enhancement network to the features of the first residual module of the pedestrian detection network and feed the sum back to both networks, thereby constructing a low-light pedestrian detection network with multi-task feature sharing;

a model training module, configured to import the illumination enhancement pre-trained model and the pedestrian detection pre-trained model into the low-light pedestrian detection network with multi-task feature sharing, and to train that network on the normal-illumination pedestrian dataset and the low-light pedestrian dataset to obtain a low-light pedestrian detection model with multi-task feature sharing;

an image detection module, configured to detect the image to be detected using the low-light pedestrian detection model with multi-task feature sharing to obtain the positions of pedestrians in the image.

8. The low-light pedestrian detection system based on multi-task learning according to claim 7, characterized in that the illumination enhancement network is constructed based on the RetinexNet convolutional neural network.

9. The low-light pedestrian detection system based on multi-task learning according to claim 7, wherein the pedestrian detection network is constructed based on CornerNet-Saccade.

10. A computer storage medium, characterized in that it stores a computer program executable by a computer processor, the computer program performing the low-light pedestrian detection method based on multi-task learning according to any one of claims 1 to 6.