CN111476843A - Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network
- Publication number: CN111476843A (application CN202010380789.3A)
- Authority
- CN
- China
- Prior art keywords
- network
- wolfberry
- branches
- point
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01D—HARVESTING; MOWING
- A01D46/00—Picking of fruits, vegetables, hops, or the like; Devices for shaking trees or shrubs
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30181—Earth observation
- G06T2207/30188—Vegetation; Agriculture
Abstract
Description
Technical Field

The present invention relates to the technical field of wolfberry harvesting, and in particular to a method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network.
Background

With the continuous expansion of the wolfberry planting area, picking has become a bottleneck that restricts the sustainable development of the wolfberry industry. Since no technically mature picking machinery exists on the domestic or foreign markets, wolfberry harvesting relies mainly on manual labour; however, manual harvesting reaches only 3-5 kg/h, and its cost exceeds 50% of the production cost. Developing wolfberry harvesting machinery suited to China's conditions is therefore of great significance for reducing costs, increasing farmers' income, and ensuring the steady, sustainable development of the wolfberry industry.

Existing wolfberry harvesting machines all depend on the operator's subjective judgement: a clamping device grips a wolfberry branch and swings or brushes it, which is inefficient. Moreover, wolfberry fruits are numerous and small, and during picking they are occluded by leaves and branches, so the branches are difficult to identify and locate accurately in two-dimensional images.

Computer recognition based on three-dimensional point-cloud data (which captures spatial information such as a target's spatial dimensions, distribution characteristics, and three-dimensional shape) could accurately identify the position of a wolfberry branch and the coordinates of the key point at its end. A robotic arm could then grip the branch at a fixed point, according to the branch's position and orientation and its end-keypoint coordinates, for high-efficiency picking. This would not only improve picking efficiency and the harvest rate, but also minimize damage to the fruit and protect the tree from harm. At present, however, wolfberry-branch detection still relies mainly on two-dimensional images, and because the natural environment is complex and suffers from occlusion, breakpoints, and overlap, branch positions are hard to determine directly from such images.

Therefore, improving the accuracy of wolfberry-branch detection has become a key technical problem that urgently needs to be solved.
Summary of the Invention

The purpose of the present invention is to overcome the defect in the prior art that two-dimensional images are difficult to recognize accurately because wolfberry branches are occluded, broken, or overlapping, by providing a wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network.

To achieve the above purpose, the technical scheme of the present invention is as follows.

The wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network comprises the following steps:
Collection and preprocessing of training samples: acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model to obtain a three-dimensional point cloud, annotate the point cloud, and annotate each branch-end key point with a sphere of radius r.

Voxelization of the three-dimensional point cloud: pass the point cloud through a VoxelNet network for voxelization, forming multiple grid cells; the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8.

Construction of the wolfberry-branch and key-point detection network: build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for locating branch and key-point targets.

Training of the wolfberry-branch and key-point detection network: train the detection network with the training samples.

Collection and preprocessing of images of the branches to be recognized: acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, obtain the point cloud to be recognized with the constructed three-dimensional model, and voxelize it.

Recognition and positioning of wolfberry branches: feed the processed point-cloud data into the trained detection network to obtain the positions of the branches and of the key points at the branch ends, thereby recognizing and locating the branches.
Constructing the wolfberry-branch and key-point detection network comprises the following steps:

Build the detection network on the basis of the PV-RCNN network, and set its input layer to the grid cells obtained by voxelizing the branch point cloud, together with the spheres of radius r around the branch-end key points.

Set its feature-extraction layer as follows: use a sparse 3D convolutional network to perform multi-scale, layer-by-layer feature extraction on the input cells and the key-point spheres of radius r; and extract point-cloud features for the relevant points selected by FPS through an attention-based PointNet network.

Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: on the basis of the attention mechanism, construct a precise localization network for the candidate boxes of branches and key points, serving as the target-regression refinement network.

Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end key points.
Training the wolfberry-branch and key-point detection network comprises the following steps:

Feed the grid cells and the key-point spheres of radius r into the 3D sparse convolutional neural network for layer-by-layer feature extraction.

The sparse convolutional neural network consists of four layers C1, C2, C3, C4 of 3x3x3 3D sparse convolutions, extracting features layer by layer.

Convert the C4 feature map into a bird's-eye (top-view) feature map.

According to the feature-map size, the RPN generates anchor boxes at angles of 0°, 45°, and 135°; 3D proposals are generated by non-maximum suppression (NMS), finally yielding each proposal's class and coordinates.

Select k relevant points with FPS and extract their point-cloud features through the attention-based PointNet network.

Training of the target-regression refinement network: concatenate the top-view features of the 3D proposals with the k weighted relevant-point features F'_i; use the fusion model to multiply the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multilayer perceptron.

Train the loss functions during training: the loss functions comprise the RPN multi-task loss L_RPN and the box-refinement loss L_REFINE.
Extracting point-cloud features for the k FPS-selected relevant points through the attention-based PointNet network comprises the following steps:

Select k relevant points from the three-dimensional point cloud with the FPS algorithm:

κ = {p_1, p_2, ..., p_k}.

The feature of each relevant point p_i is the concatenation

F_i = [f_i^(1), f_i^(2), f_i^(3), f_i^(4), f_i^(raw), f_i^(bev)], i = 1, 2, ..., k,

where f_i^(c) is the feature from the feature map produced by the c-th 3D sparse convolution layer, c = 1, 2, 3, 4; f_i^(raw) is the feature of the i-th relevant point p_i computed from the point cloud by the SA model; and f_i^(bev) is the feature obtained from the top view by bilinear interpolation.

The weight of the feature F_i of relevant point p_i is computed as

F'_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k,

where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e. the importance of that point, and F_i is the feature of p_i.
Training the loss functions during training comprises the following steps:

Train the multi-task loss L_RPN, which comprises the classification loss L_cls, the target-box regression loss L_boxreg, and the key-point regression loss L_keyreg.

When IoU > 0.6 an anchor is treated as a positive sample; when IoU < 0.45 it is treated as a negative sample. The total loss is

L_RPN = L_cls + L_boxreg + L_keyreg.

The classification loss L_cls is built from the per-anchor cross-entropy

L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)),

where N+ denotes the number of positive samples and N- the number of negative samples.

The target-box regression loss L_boxreg is a Smooth-L1 loss over the box residuals, with σ = 2.

Train the key-point regression loss L_keyreg, computed between the labelled key-point coordinates and the predicted key-point coordinates f(x_i).

Train the 3D-box refinement loss L_REFINE over the branch target boxes and the key-point target boxes, computed between the labelled target box and the predicted 3D box.
Beneficial Effects

Compared with the prior art, the wolfberry-branch recognition and positioning method of the present invention fuses into the PV-RCNN network an attention network over the relevant points, together with attention produced from the 3D-proposal convolutional features in the refinement network. This compensates for the large amount of localization information lost during voxelization and the convolution operations of the sparse 3D convolutional network. At the same time, the attention network's estimate of each relevant point's contribution to detection, and the feature enhancement in the refinement network, improve the detection accuracy of branches and branch-end key points, achieving accurate recognition and positioning of wolfberry branches.
Description of the Drawings

FIG. 1 is a sequence diagram of the method of the present invention.
Detailed Description

For a further understanding of the structural features and achieved effects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawings:
As shown in FIG. 1, the wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network according to the present invention comprises the following steps.

Step 1: collection and preprocessing of training samples. Acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model with existing techniques to obtain a three-dimensional point cloud, annotate the point cloud, and annotate each branch-end key point with a sphere of radius r. The point cloud is voxelized into grid cells during preprocessing; the sphere of radius r is used for annotation only so that the key points can be fed, in the same data format as the grid cells, into the sparse 3D convolutional network for feature extraction.

Step 2: voxelization of the three-dimensional point cloud. Pass the point cloud through the VoxelNet network for voxelization, forming multiple grid cells; the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8.
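The voxelization step above (fixed-size cells, at most 8 points per cell) can be sketched in Python. The region and cell sizes below are illustrative placeholders, since the patent leaves (L, W, H) and (l, w, h) symbolic; the function name is my own.

```python
from collections import defaultdict

def voxelize(points, region=(8.0, 8.0, 4.0), cell=(0.2, 0.2, 0.2), max_pts=8):
    """Group 3-D points into voxel grid cells, keeping at most max_pts per cell.

    points: iterable of (x, y, z) tuples inside [0,L) x [0,W) x [0,H).
    Returns {cell_index: [points]}. Only the per-cell budget (8) is fixed
    by the patent; region/cell sizes here are assumed for illustration.
    """
    L, W, H = region
    l, w, h = cell
    grid = defaultdict(list)
    for x, y, z in points:
        if not (0 <= x < L and 0 <= y < W and 0 <= z < H):
            continue  # points outside the input region are ignored
        key = (int(x // l), int(y // w), int(z // h))
        if len(grid[key]) < max_pts:  # cap the number of points per cell
            grid[key].append((x, y, z))
    return dict(grid)
```

In a full pipeline each cell's points would then be encoded into a fixed-length feature (as in VoxelNet) before the sparse 3D convolutions.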
Step 3: construction of the wolfberry-branch and key-point detection network. Build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for locating branch and key-point targets. Here the PV-RCNN architecture is used to build the detection network, and, to realize the attention mechanism, a refinement network is established inside the PV-RCNN. The specific steps are as follows.

(1) Build the detection network on the basis of the PV-RCNN network, and set its input layer to the grid cells obtained by voxelizing the branch point cloud, together with the spheres of radius r around the branch-end key points.

(2) Set its feature-extraction layer as follows: use a sparse 3D convolutional network to perform multi-scale, layer-by-layer feature extraction on the input cells and the key-point spheres of radius r; and extract point-cloud features for the relevant points selected by FPS through an attention-based PointNet network.

(3) Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: on the basis of the attention mechanism, construct a precise localization network for the candidate boxes of branches and key points, serving as the target-regression refinement network.

(4) Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end key points.
Step 4: training the wolfberry-branch and key-point detection network with the training samples. The specific steps are as follows.

(1) Feed the grid cells and the key-point spheres of radius r into the 3D sparse convolutional neural network for layer-by-layer feature extraction.

The sparse convolutional neural network consists of four layers C1, C2, C3, C4 of 3x3x3 3D sparse convolutions, extracting features layer by layer.

Convert the C4 feature map into a bird's-eye (top-view) feature map.

According to the feature-map size, the RPN generates anchor boxes at angles of 0°, 45°, and 135°; 3D proposals are generated by non-maximum suppression (NMS), finally yielding each proposal's class and coordinates.
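The proposal-pruning step relies on greedy non-maximum suppression. As a hedged illustration only (the patent's NMS operates on rotated 3D proposals; for brevity this sketch uses axis-aligned 2D boxes in the bird's-eye view), NMS can be written as:

```python
def iou_2d(a, b):
    """Axis-aligned IoU of boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou_2d(boxes[i], boxes[j]) < thresh]
    return keep
```

The surviving indices become the 3D proposals passed to the refinement stage.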
(2) Select k relevant points with FPS and extract their point-cloud features through the attention-based PointNet network. Because a large amount of important information is lost when the branch point cloud is voxelized, k relevant points are selected from the point cloud for feature extraction to compensate for the lost information as far as possible; at the same time, the attention mechanism estimates the importance of each relevant point. Processing the point cloud in this way both speeds up training and improves detection accuracy. The specific steps are as follows.

A1) Select k relevant points from the three-dimensional point cloud with the FPS algorithm:

κ = {p_1, p_2, ..., p_k}.

The feature of each relevant point p_i is the concatenation

F_i = [f_i^(1), f_i^(2), f_i^(3), f_i^(4), f_i^(raw), f_i^(bev)], i = 1, 2, ..., k,

where f_i^(c) is the feature from the feature map produced by the c-th 3D sparse convolution layer, c = 1, 2, 3, 4; f_i^(raw) is the feature of the i-th relevant point p_i computed from the point cloud by the SA model; and f_i^(bev) is the feature obtained from the top view by bilinear interpolation.
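Farthest point sampling (FPS) itself is standard: start from an arbitrary point and repeatedly pick the point farthest from everything chosen so far. A minimal pure-Python sketch (function name assumed):

```python
def farthest_point_sampling(points, k):
    """Return indices of k points selected by FPS from a list of (x, y, z)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    selected = [0]  # seed with the first point; the choice is arbitrary
    d2 = [dist2(p, points[0]) for p in points]  # distance to nearest selected
    while len(selected) < k:
        nxt = max(range(len(points)), key=lambda i: d2[i])
        selected.append(nxt)
        d2 = [min(d2[i], dist2(points[i], points[nxt]))
              for i in range(len(points))]
    return selected
```

Each selected point would then gather its F_i features from the sparse-conv maps, the raw point cloud, and the top view.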
A2) Compute the weight of the feature F_i of relevant point p_i as

F'_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k,

where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e. the importance of that point, and F_i is the feature of p_i.
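A minimal sketch of the weighting F'_i = Λ(p_i) ⊙ F_i, with Λ modelled as a single linear layer plus sigmoid so its output lies in (0, 1). The patent does not specify Λ's internal architecture; W and b below are assumed parameters.

```python
import math

def attention_weights(keypoints, W, b):
    """Λ(p): score each 3-D point with a linear layer + sigmoid, into (0, 1)."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    return [sigmoid(sum(wi * ci for wi, ci in zip(W, p)) + b)
            for p in keypoints]

def reweight(features, weights):
    """F'_i = Λ(p_i) ⊙ F_i: scale each point's feature vector by its weight."""
    return [[w * f for f in feat] for feat, w in zip(features, weights)]
```

In practice Λ would take the point's feature rather than raw coordinates and produce a per-channel vector; the scalar gate here is the simplest instance of the same idea.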
(3) Training of the target-regression refinement network. Voxelizing the point cloud generally loses some localization information, which reduces detection accuracy, so the convolutional features and the weighted relevant-point features are both used as inputs to the refinement network, where they supply complementary semantic information to each other. In addition, the attention mechanism attaches surrounding context to each point, enhancing the features and further improving the detection results. The steps are: concatenate the top-view features of the 3D proposals with the k weighted relevant-point features F'_i; use the fusion model to multiply the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multilayer perceptron.
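The cascade-then-gate fusion described in step (3) can be sketched as follows; the feature names and dimensions are illustrative, not the patent's:

```python
def refine_features(bev_feat, keypoint_feats, att_feat):
    """Concatenate BEV and keypoint features, then gate elementwise by the
    attention feature derived from the 3D proposal's convolutional features.

    bev_feat: flat list of proposal BEV features.
    keypoint_feats: list of per-keypoint feature lists (the F'_i).
    att_feat: attention feature, same length as the concatenation.
    """
    fused = list(bev_feat)
    for kf in keypoint_feats:
        fused.extend(kf)  # cascade (concatenation)
    assert len(att_feat) == len(fused), "attention must match fused length"
    return [a * f for a, f in zip(att_feat, fused)]
```

The gated vector would then be fed to a multilayer perceptron that regresses the refined 3D box.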
(4) Train the loss functions during training: the loss functions comprise the RPN multi-task loss L_RPN and the box-refinement loss L_REFINE.

Given the importance of determining the grasping positions at the key points of the branches, the key points are introduced as supervision data: after training the network that coarsely localizes branches and key points, the refinement network is further trained on the branch target boxes and key-point target boxes, finally yielding precise target localization.
A1) Train the multi-task loss L_RPN, which comprises the classification loss L_cls, the target-box regression loss L_boxreg, and the key-point regression loss L_keyreg.

When IoU > 0.6 an anchor is treated as a positive sample; when IoU < 0.45 it is treated as a negative sample; when 0.45 < IoU < 0.6 it is hard to decide whether the anchor is positive or negative, so such anchors are not considered when computing the loss. The total loss is:
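The IoU-based anchor labelling rule can be written directly; None marks anchors ignored by the loss:

```python
def label_anchors(ious, pos_thresh=0.6, neg_thresh=0.45):
    """Assign 1 (positive), 0 (negative), or None (ignored) per anchor IoU."""
    labels = []
    for iou in ious:
        if iou > pos_thresh:
            labels.append(1)        # IoU > 0.6: positive sample
        elif iou < neg_thresh:
            labels.append(0)        # IoU < 0.45: negative sample
        else:
            labels.append(None)     # ambiguous: excluded from the loss
    return labels
```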
L_RPN = L_cls + L_boxreg + L_keyreg.

The classification loss L_cls is built from the per-anchor cross-entropy

L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)),

where N+ denotes the number of positive samples and N- the number of negative samples.
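Assuming L_cls averages the per-anchor cross-entropy over the N+ + N- retained anchors (the patent's overall normalisation appears in a formula image not reproduced in the text), a sketch:

```python
import math

def bce(x, y, eps=1e-7):
    """Per-anchor loss: L_cls(x, y) = -(x*log(y) + (1 - x)*log(1 - y))."""
    y = min(max(y, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(x * math.log(y) + (1 - x) * math.log(1 - y))

def cls_loss(labels, probs):
    """Average BCE over positive and negative anchors; ignored (None) anchors
    are excluded, matching the IoU labelling rule above."""
    terms = [bce(x, y) for x, y in zip(labels, probs) if x is not None]
    return sum(terms) / len(terms)
```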
The target-box regression loss L_boxreg is a Smooth-L1 loss over the box residuals, with σ = 2.
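A standard Smooth-L1 with parameter σ, which the patent sets to 2, is quadratic for |x| < 1/σ² and linear outside; this exact form is assumed from common practice, since the patent's formula image is not reproduced in the text:

```python
def smooth_l1(x, sigma=2.0):
    """Smooth-L1: quadratic near zero, linear in the tails."""
    s2 = sigma * sigma
    if abs(x) < 1.0 / s2:
        return 0.5 * s2 * x * x
    return abs(x) - 0.5 / s2

def box_reg_loss(residuals, sigma=2.0):
    """Average Smooth-L1 over the box-regression residuals."""
    return sum(smooth_l1(r, sigma) for r in residuals) / len(residuals)
```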
Train the key-point regression loss L_keyreg, computed between the labelled key-point coordinates and the predicted key-point coordinates f(x_i).
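Assuming an L2 (squared-error) form for the key-point regression, as is common (the patent's formula image is not reproduced in the text), a sketch:

```python
def keypoint_loss(pred, gt):
    """Mean squared error between predicted and labelled 3-D keypoints.

    pred, gt: lists of (x, y, z) tuples of equal length. The L2 form is an
    assumption; the patent only names the labelled and predicted coordinates.
    """
    n = len(pred)
    return sum((px - gx) ** 2 + (py - gy) ** 2 + (pz - gz) ** 2
               for (px, py, pz), (gx, gy, gz) in zip(pred, gt)) / n
```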
A2) Train the 3D-box refinement loss L_REFINE over the branch target boxes and the key-point target boxes, computed between the labelled target box and the predicted 3D box.
Step 5: collection and preprocessing of the images of the branches to be recognized. Acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, build the three-dimensional model to obtain the point cloud to be recognized, and voxelize it.

Step 6: recognition and positioning of the wolfberry branches. Feed the processed point-cloud data into the trained detection network to obtain the positions of the branches and of the key points at the branch ends, thereby recognizing and locating the branches.

The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments, which, together with the description, illustrate only its principles; various changes and improvements may be made without departing from the spirit and scope of the present invention, and all such changes and improvements fall within the scope of the claimed invention, which is defined by the appended claims and their equivalents.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010380789.3A CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010380789.3A CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111476843A true CN111476843A (en) | 2020-07-31 |
| CN111476843B CN111476843B (en) | 2023-03-24 |
Family
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010380789.3A Active CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111476843B (en) |
2020
- 2020-05-08 CN CN202010380789.3A patent/CN111476843B/en active Active
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190147245A1 (en) * | 2017-11-14 | 2019-05-16 | Nuro, Inc. | Three-dimensional object detection for autonomous robotic systems using image proposals |
| CN109410238A (en) * | 2018-09-20 | 2019-03-01 | 中国科学院合肥物质科学研究院 | A wolfberry recognition and counting method based on a PointNet++ network |
| CN109784294A (en) * | 2019-01-25 | 2019-05-21 | 中国科学院合肥物质科学研究院 | A wolfberry image recognition and positioning method based on rough-set-theory candidate frame selection |
| CN110674829A (en) * | 2019-09-26 | 2020-01-10 | 哈尔滨工程大学 | Three-dimensional target detection method based on graph convolution attention network |
Non-Patent Citations (2)
| Title |
|---|
| Wang Kai et al.: "Small target detection in images based on improved Faster R-CNN", Video Engineering * |
| Lu Qiang et al.: "3D object recognition based on voxel feature recombination network", Journal of Graphics * |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112215101A (en) * | 2020-09-27 | 2021-01-12 | 武汉科技大学 | A three-dimensional target recognition method and system based on attention mechanism |
| CN112215101B (en) * | 2020-09-27 | 2024-12-20 | 武汉科技大学 | A three-dimensional target recognition method and system based on attention mechanism |
| CN112967221A (en) * | 2020-12-04 | 2021-06-15 | 江苏龙冠新型材料科技有限公司 | Shield segment production and assembly information management system |
| CN112967221B (en) * | 2020-12-04 | 2024-05-14 | 江苏龙冠新型材料科技有限公司 | Shield segment production and assembly information management system |
| CN112950634A (en) * | 2021-04-22 | 2021-06-11 | 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 | Method, equipment and system for identifying damage of wind turbine blade based on unmanned aerial vehicle routing inspection |
| CN112950634B (en) * | 2021-04-22 | 2023-06-30 | 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 | Unmanned aerial vehicle inspection-based wind turbine blade damage identification method, equipment and system |
| CN114758222A (en) * | 2022-03-09 | 2022-07-15 | 哈尔滨工业大学水资源国家工程研究中心有限公司 | A method for damage identification and volume quantification of concrete pipes based on PointNet++ neural network |
| CN114758222B (en) * | 2022-03-09 | 2024-05-14 | 哈尔滨工业大学水资源国家工程研究中心有限公司 | Concrete pipeline damage identification and volume quantification method based on PointNet++ neural network |
| CN116486252A (en) * | 2023-03-03 | 2023-07-25 | 上海大学 | An intelligent unmanned search and rescue system and search and rescue method based on an improved PV-RCNN target detection algorithm |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111476843B (en) | 2023-03-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111476843B (en) | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network | |
| CN107563381B (en) | A multi-feature-fusion target detection method based on a fully convolutional network | |
| CN111080693A (en) | Robot autonomous classification grabbing method based on YOLOv3 | |
| Hou et al. | Detection and localization of citrus fruit based on improved You Only Look Once v5s and binocular vision in the orchard | |
| Wang et al. | Research on image recognition of insulators based on YOLO algorithm | |
| CN111523511B (en) | Video image Chinese wolfberry branch detection method for Chinese wolfberry harvesting and clamping device | |
| CN114821102A (en) | Intensive citrus quantity detection method, equipment, storage medium and device | |
| Zhang et al. | Three-dimensional branch segmentation and phenotype extraction of maize tassel based on deep learning | |
| CN114846998A (en) | Tomato picking method and system of binocular robot based on YOLOv4 algorithm | |
| CN114550166B (en) | A fruit detection method, device and storage medium for smart greenhouses | |
| CN116958823B (en) | Tea tender tip identification and picking point positioning method | |
| CN116704497A (en) | A method and system for extracting rapeseed phenotypic parameters based on three-dimensional point cloud | |
| CN111598172A (en) | Fast detection method of dynamic target grasping pose based on heterogeneous deep network fusion | |
| CN118587188A (en) | A PCB small target defect detection method based on improved YOLOv8s | |
| CN118279643A (en) | Unsupervised defect classification and segmentation method, system and storage medium based on double-branch flow model | |
| Le Louedec et al. | Segmentation and detection from organised 3D point clouds: A case study in broccoli head detection | |
| Hao et al. | [Retracted] Fast Recognition Method for Multiple Apple Targets in Complex Occlusion Environment Based on Improved YOLOv5 | |
| CN118799716A (en) | Crab detection and counting method, device, medium and product based on instance segmentation | |
| Rong et al. | RTMFusion: An enhanced dual-stream architecture algorithm fusing RGB and depth features for instance segmentation of tomato organs | |
| CN116652951A (en) | A robot vision positioning method and device in an unstructured large working space | |
| Zhang et al. | Segmentation of apple point clouds based on ROI in RGB images. | |
| Yu et al. | ASE-UNet: An orange fruit segmentation model in an agricultural environment based on deep learning | |
| CN118247729A (en) | A cattle farm multi-target detection method and system based on GCS-YOLO algorithm | |
| CN117893599A (en) | A method for locating the stalk shearing point of a tomato picking robot | |
| CN117036826A (en) | Power transmission line identification and positioning method, device, and storage device for live working on distribution networks | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication ||
| SE01 | Entry into force of request for substantive examination ||
| GR01 | Patent grant ||