CN111476843A - Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network
- Publication number: CN111476843A (application CN202010380789.3A)
- Authority
- CN
- China
- Prior art keywords
- network
- wolfberry
- branches
- point
- loss function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01D—HARVESTING; MOWING
- A01D46/00—Picking of fruits, vegetables, hops, or the like; Devices for shaking trees or shrubs
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30181—Earth observation
- G06T2207/30188—Vegetation; Agriculture
Abstract
Description
Technical Field

The present invention relates to the technical field of wolfberry harvesting, and in particular to a method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network.
Background

With the continuous expansion of the wolfberry planting area, picking has become a bottleneck that restricts the sustainable development of the wolfberry industry. Since no technically mature picking machinery exists on the domestic or foreign markets, wolfberry harvesting relies mainly on manual labour; however, manual harvesting reaches only 3-5 kg/h, and its cost exceeds 50% of the production cost. Developing wolfberry harvesting machinery suited to China's conditions is therefore of great significance for reducing costs, increasing farmers' income, and ensuring the steady, sustainable development of the wolfberry industry.

Existing wolfberry harvesting machines all depend on the operator's subjective judgement: a clamping device grips a wolfberry branch and swings or brushes it, which is inefficient. Moreover, wolfberry fruits are numerous and small, and during picking they are occluded by leaves and branches, so the branches are difficult to identify and locate accurately in two-dimensional images.

Computer recognition based on three-dimensional point-cloud data (which captures spatial information such as a target's spatial dimensions, distribution characteristics, and three-dimensional shape) could accurately identify the position of a wolfberry branch and the coordinates of the key point at its end. A robotic arm could then grip the branch at a fixed point, according to the branch's position and orientation and its end-keypoint coordinates, for high-efficiency picking. This would not only improve picking efficiency and the harvest rate, but also minimize damage to the fruit and protect the tree from harm. At present, however, wolfberry-branch detection still relies mainly on two-dimensional images, and because the natural environment is complex and suffers from occlusion, breakpoints, and overlap, branch positions are hard to determine directly from such images.

Therefore, improving the accuracy of wolfberry-branch detection has become a key technical problem that urgently needs to be solved.
Summary of the Invention

The purpose of the present invention is to overcome the defect in the prior art that two-dimensional images are difficult to recognize accurately because wolfberry branches are occluded, broken, or overlapping, by providing a wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network.

To achieve the above purpose, the technical scheme of the present invention is as follows.

The wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network comprises the following steps:
Collection and preprocessing of training samples: acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model to obtain a three-dimensional point cloud, annotate the point cloud, and annotate each branch-end key point with a sphere of radius r.

Voxelization of the three-dimensional point cloud: pass the point cloud through a VoxelNet network for voxelization, forming multiple grid cells; the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8.

Construction of the wolfberry-branch and key-point detection network: build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for locating branch and key-point targets.

Training of the wolfberry-branch and key-point detection network: train the detection network with the training samples.

Collection and preprocessing of images of the branches to be recognized: acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, obtain the point cloud to be recognized with the constructed three-dimensional model, and voxelize it.

Recognition and positioning of wolfberry branches: feed the processed point-cloud data into the trained detection network to obtain the positions of the branches and of the key points at the branch ends, thereby recognizing and locating the branches.
Constructing the wolfberry-branch and key-point detection network comprises the following steps:

Build the detection network on the basis of the PV-RCNN network, and set its input layer to the grid cells obtained by voxelizing the branch point cloud, together with the spheres of radius r around the branch-end key points.

Set its feature-extraction layer as follows: use a sparse 3D convolutional network to perform multi-scale, layer-by-layer feature extraction on the input cells and the key-point spheres of radius r; and extract point-cloud features for the relevant points selected by FPS through an attention-based PointNet network.

Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: on the basis of the attention mechanism, construct a precise localization network for the candidate boxes of branches and key points, serving as the target-regression refinement network.

Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end key points.
Training the wolfberry-branch and key-point detection network comprises the following steps:

Feed the grid cells and the key-point spheres of radius r into the 3D sparse convolutional neural network for layer-by-layer feature extraction.

The sparse convolutional neural network consists of four layers C1, C2, C3, C4 of 3x3x3 3D sparse convolutions, extracting features layer by layer.

Convert the C4 feature map into a bird's-eye (top-view) feature map.

According to the feature-map size, the RPN generates anchor boxes at angles of 0°, 45°, and 135°; 3D proposals are generated by non-maximum suppression (NMS), finally yielding each proposal's class and coordinates.

Select k relevant points with FPS and extract their point-cloud features through the attention-based PointNet network.

Training of the target-regression refinement network: concatenate the top-view features of the 3D proposals with the k weighted relevant-point features F'_i; use the fusion model to multiply the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multilayer perceptron.

Train the loss functions during training: the loss functions comprise the RPN multi-task loss L_RPN and the box-refinement loss L_REFINE.
Extracting point-cloud features for the k FPS-selected relevant points through the attention-based PointNet network comprises the following steps:

Select k relevant points from the three-dimensional point cloud with the FPS algorithm:

κ = {p_1, p_2, ..., p_k}.

The feature of each relevant point p_i is the concatenation

F_i = [f_i^(1), f_i^(2), f_i^(3), f_i^(4), f_i^(raw), f_i^(bev)], i = 1, 2, ..., k,

where f_i^(c) is the feature from the feature map produced by the c-th 3D sparse convolution layer, c = 1, 2, 3, 4; f_i^(raw) is the feature of the i-th relevant point p_i computed from the point cloud by the SA model; and f_i^(bev) is the feature obtained from the top view by bilinear interpolation.

The weight of the feature F_i of relevant point p_i is computed as

F'_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k,

where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e. the importance of that point, and F_i is the feature of p_i.
Training the loss functions during training comprises the following steps:

Train the multi-task loss L_RPN, which comprises the classification loss L_cls, the target-box regression loss L_boxreg, and the key-point regression loss L_keyreg.

When IoU > 0.6 an anchor is treated as a positive sample; when IoU < 0.45 it is treated as a negative sample. The total loss is

L_RPN = L_cls + L_boxreg + L_keyreg.

The classification loss L_cls is built from the per-anchor cross-entropy

L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)),

where N+ denotes the number of positive samples and N- the number of negative samples.

The target-box regression loss L_boxreg is a Smooth-L1 loss over the box residuals, with σ = 2.

Train the key-point regression loss L_keyreg, computed between the labelled key-point coordinates and the predicted key-point coordinates f(x_i).

Train the 3D-box refinement loss L_REFINE over the branch target boxes and the key-point target boxes, computed between the labelled target box and the predicted 3D box.
Beneficial Effects

Compared with the prior art, the wolfberry-branch recognition and positioning method of the present invention fuses into the PV-RCNN network an attention network over the relevant points, together with attention produced from the 3D-proposal convolutional features in the refinement network. This compensates for the large amount of localization information lost during voxelization and the convolution operations of the sparse 3D convolutional network. At the same time, the attention network's estimate of each relevant point's contribution to detection, and the feature enhancement in the refinement network, improve the detection accuracy of branches and branch-end key points, achieving accurate recognition and positioning of wolfberry branches.
Description of the Drawings

FIG. 1 is a sequence diagram of the method of the present invention.
Detailed Description

For a further understanding of the structural features and achieved effects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawings:
As shown in FIG. 1, the wolfberry-branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network according to the present invention comprises the following steps.

Step 1: collection and preprocessing of training samples. Acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model with existing techniques to obtain a three-dimensional point cloud, annotate the point cloud, and annotate each branch-end key point with a sphere of radius r. The point cloud is voxelized into grid cells during preprocessing; the sphere of radius r is used for annotation only so that the key points can be fed, in the same data format as the grid cells, into the sparse 3D convolutional network for feature extraction.

Step 2: voxelization of the three-dimensional point cloud. Pass the point cloud through the VoxelNet network for voxelization, forming multiple grid cells; the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8.
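The voxelization step above (fixed-size cells, at most 8 points per cell) can be sketched in Python. The region and cell sizes below are illustrative placeholders, since the patent leaves (L, W, H) and (l, w, h) symbolic; the function name is my own.

```python
from collections import defaultdict

def voxelize(points, region=(8.0, 8.0, 4.0), cell=(0.2, 0.2, 0.2), max_pts=8):
    """Group 3-D points into voxel grid cells, keeping at most max_pts per cell.

    points: iterable of (x, y, z) tuples inside [0,L) x [0,W) x [0,H).
    Returns {cell_index: [points]}. Only the per-cell budget (8) is fixed
    by the patent; region/cell sizes here are assumed for illustration.
    """
    L, W, H = region
    l, w, h = cell
    grid = defaultdict(list)
    for x, y, z in points:
        if not (0 <= x < L and 0 <= y < W and 0 <= z < H):
            continue  # points outside the input region are ignored
        key = (int(x // l), int(y // w), int(z // h))
        if len(grid[key]) < max_pts:  # cap the number of points per cell
            grid[key].append((x, y, z))
    return dict(grid)
```

In a full pipeline each cell's points would then be encoded into a fixed-length feature (as in VoxelNet) before the sparse 3D convolutions.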
Step 3: construction of the wolfberry-branch and key-point detection network. Build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for locating branch and key-point targets. Here the PV-RCNN architecture is used to build the detection network, and, to realize the attention mechanism, a refinement network is established inside the PV-RCNN. The specific steps are as follows.

(1) Build the detection network on the basis of the PV-RCNN network, and set its input layer to the grid cells obtained by voxelizing the branch point cloud, together with the spheres of radius r around the branch-end key points.

(2) Set its feature-extraction layer as follows: use a sparse 3D convolutional network to perform multi-scale, layer-by-layer feature extraction on the input cells and the key-point spheres of radius r; and extract point-cloud features for the relevant points selected by FPS through an attention-based PointNet network.

(3) Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: on the basis of the attention mechanism, construct a precise localization network for the candidate boxes of branches and key points, serving as the target-regression refinement network.

(4) Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end key points.
Step 4: training the wolfberry-branch and key-point detection network with the training samples. The specific steps are as follows.

(1) Feed the grid cells and the key-point spheres of radius r into the 3D sparse convolutional neural network for layer-by-layer feature extraction.

The sparse convolutional neural network consists of four layers C1, C2, C3, C4 of 3x3x3 3D sparse convolutions, extracting features layer by layer.

Convert the C4 feature map into a bird's-eye (top-view) feature map.

According to the feature-map size, the RPN generates anchor boxes at angles of 0°, 45°, and 135°; 3D proposals are generated by non-maximum suppression (NMS), finally yielding each proposal's class and coordinates.
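The proposal-pruning step relies on greedy non-maximum suppression. As a hedged illustration only (the patent's NMS operates on rotated 3D proposals; for brevity this sketch uses axis-aligned 2D boxes in the bird's-eye view), NMS can be written as:

```python
def iou_2d(a, b):
    """Axis-aligned IoU of boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou_2d(boxes[i], boxes[j]) < thresh]
    return keep
```

The surviving indices become the 3D proposals passed to the refinement stage.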
(2) Select k relevant points with FPS and extract their point-cloud features through the attention-based PointNet network. Because a large amount of important information is lost when the branch point cloud is voxelized, k relevant points are selected from the point cloud for feature extraction to compensate for the lost information as far as possible; at the same time, the attention mechanism estimates the importance of each relevant point. Processing the point cloud in this way both speeds up training and improves detection accuracy. The specific steps are as follows.

A1) Select k relevant points from the three-dimensional point cloud with the FPS algorithm:

κ = {p_1, p_2, ..., p_k}.

The feature of each relevant point p_i is the concatenation

F_i = [f_i^(1), f_i^(2), f_i^(3), f_i^(4), f_i^(raw), f_i^(bev)], i = 1, 2, ..., k,

where f_i^(c) is the feature from the feature map produced by the c-th 3D sparse convolution layer, c = 1, 2, 3, 4; f_i^(raw) is the feature of the i-th relevant point p_i computed from the point cloud by the SA model; and f_i^(bev) is the feature obtained from the top view by bilinear interpolation.
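Farthest point sampling (FPS) itself is standard: start from an arbitrary point and repeatedly pick the point farthest from everything chosen so far. A minimal pure-Python sketch (function name assumed):

```python
def farthest_point_sampling(points, k):
    """Return indices of k points selected by FPS from a list of (x, y, z)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    selected = [0]  # seed with the first point; the choice is arbitrary
    d2 = [dist2(p, points[0]) for p in points]  # distance to nearest selected
    while len(selected) < k:
        nxt = max(range(len(points)), key=lambda i: d2[i])
        selected.append(nxt)
        d2 = [min(d2[i], dist2(points[i], points[nxt]))
              for i in range(len(points))]
    return selected
```

Each selected point would then gather its F_i features from the sparse-conv maps, the raw point cloud, and the top view.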
A2) Compute the weight of the feature F_i of relevant point p_i as

F'_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k,

where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e. the importance of that point, and F_i is the feature of p_i.
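A minimal sketch of the weighting F'_i = Λ(p_i) ⊙ F_i, with Λ modelled as a single linear layer plus sigmoid so its output lies in (0, 1). The patent does not specify Λ's internal architecture; W and b below are assumed parameters.

```python
import math

def attention_weights(keypoints, W, b):
    """Λ(p): score each 3-D point with a linear layer + sigmoid, into (0, 1)."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    return [sigmoid(sum(wi * ci for wi, ci in zip(W, p)) + b)
            for p in keypoints]

def reweight(features, weights):
    """F'_i = Λ(p_i) ⊙ F_i: scale each point's feature vector by its weight."""
    return [[w * f for f in feat] for feat, w in zip(features, weights)]
```

In practice Λ would take the point's feature rather than raw coordinates and produce a per-channel vector; the scalar gate here is the simplest instance of the same idea.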
(3) Training of the target-regression refinement network. Voxelizing the point cloud generally loses some localization information, which reduces detection accuracy, so the convolutional features and the weighted relevant-point features are both used as inputs to the refinement network, where they supply complementary semantic information to each other. In addition, the attention mechanism attaches surrounding context to each point, enhancing the features and further improving the detection results. The steps are: concatenate the top-view features of the 3D proposals with the k weighted relevant-point features F'_i; use the fusion model to multiply the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multilayer perceptron.
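The cascade-then-gate fusion described in step (3) can be sketched as follows; the feature names and dimensions are illustrative, not the patent's:

```python
def refine_features(bev_feat, keypoint_feats, att_feat):
    """Concatenate BEV and keypoint features, then gate elementwise by the
    attention feature derived from the 3D proposal's convolutional features.

    bev_feat: flat list of proposal BEV features.
    keypoint_feats: list of per-keypoint feature lists (the F'_i).
    att_feat: attention feature, same length as the concatenation.
    """
    fused = list(bev_feat)
    for kf in keypoint_feats:
        fused.extend(kf)  # cascade (concatenation)
    assert len(att_feat) == len(fused), "attention must match fused length"
    return [a * f for a, f in zip(att_feat, fused)]
```

The gated vector would then be fed to a multilayer perceptron that regresses the refined 3D box.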
(4) Train the loss functions during training: the loss functions comprise the RPN multi-task loss L_RPN and the box-refinement loss L_REFINE.

Given the importance of determining the grasping positions at the key points of the branches, the key points are introduced as supervision data: after training the network that coarsely localizes branches and key points, the refinement network is further trained on the branch target boxes and key-point target boxes, finally yielding precise target localization.
A1) Train the multi-task loss L_RPN, which comprises the classification loss L_cls, the target-box regression loss L_boxreg, and the key-point regression loss L_keyreg.

When IoU > 0.6 an anchor is treated as a positive sample; when IoU < 0.45 it is treated as a negative sample; when 0.45 < IoU < 0.6 it is hard to decide whether the anchor is positive or negative, so such anchors are not considered when computing the loss. The total loss is:
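The IoU-based anchor labelling rule can be written directly; None marks anchors ignored by the loss:

```python
def label_anchors(ious, pos_thresh=0.6, neg_thresh=0.45):
    """Assign 1 (positive), 0 (negative), or None (ignored) per anchor IoU."""
    labels = []
    for iou in ious:
        if iou > pos_thresh:
            labels.append(1)        # IoU > 0.6: positive sample
        elif iou < neg_thresh:
            labels.append(0)        # IoU < 0.45: negative sample
        else:
            labels.append(None)     # ambiguous: excluded from the loss
    return labels
```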
L_RPN = L_cls + L_boxreg + L_keyreg.

The classification loss L_cls is built from the per-anchor cross-entropy

L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)),

where N+ denotes the number of positive samples and N- the number of negative samples.
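Assuming L_cls averages the per-anchor cross-entropy over the N+ + N- retained anchors (the patent's overall normalisation appears in a formula image not reproduced in the text), a sketch:

```python
import math

def bce(x, y, eps=1e-7):
    """Per-anchor loss: L_cls(x, y) = -(x*log(y) + (1 - x)*log(1 - y))."""
    y = min(max(y, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(x * math.log(y) + (1 - x) * math.log(1 - y))

def cls_loss(labels, probs):
    """Average BCE over positive and negative anchors; ignored (None) anchors
    are excluded, matching the IoU labelling rule above."""
    terms = [bce(x, y) for x, y in zip(labels, probs) if x is not None]
    return sum(terms) / len(terms)
```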
The target-box regression loss L_boxreg is a Smooth-L1 loss over the box residuals, with σ = 2.
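A standard Smooth-L1 with parameter σ, which the patent sets to 2, is quadratic for |x| < 1/σ² and linear outside; this exact form is assumed from common practice, since the patent's formula image is not reproduced in the text:

```python
def smooth_l1(x, sigma=2.0):
    """Smooth-L1: quadratic near zero, linear in the tails."""
    s2 = sigma * sigma
    if abs(x) < 1.0 / s2:
        return 0.5 * s2 * x * x
    return abs(x) - 0.5 / s2

def box_reg_loss(residuals, sigma=2.0):
    """Average Smooth-L1 over the box-regression residuals."""
    return sum(smooth_l1(r, sigma) for r in residuals) / len(residuals)
```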
Train the key-point regression loss L_keyreg, computed between the labelled key-point coordinates and the predicted key-point coordinates f(x_i).
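Assuming an L2 (squared-error) form for the key-point regression, as is common (the patent's formula image is not reproduced in the text), a sketch:

```python
def keypoint_loss(pred, gt):
    """Mean squared error between predicted and labelled 3-D keypoints.

    pred, gt: lists of (x, y, z) tuples of equal length. The L2 form is an
    assumption; the patent only names the labelled and predicted coordinates.
    """
    n = len(pred)
    return sum((px - gx) ** 2 + (py - gy) ** 2 + (pz - gz) ** 2
               for (px, py, pz), (gx, gy, gz) in zip(pred, gt)) / n
```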
A2) Train the 3D-box refinement loss L_REFINE over the branch target boxes and the key-point target boxes, computed between the labelled target box and the predicted 3D box.
Step 5: collection and preprocessing of the images of the branches to be recognized. Acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, build the three-dimensional model to obtain the point cloud to be recognized, and voxelize it.

Step 6: recognition and positioning of the wolfberry branches. Feed the processed point-cloud data into the trained detection network to obtain the positions of the branches and of the key points at the branch ends, thereby recognizing and locating the branches.

The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments, which, together with the description, illustrate only its principles; various changes and improvements may be made without departing from the spirit and scope of the present invention, and all such changes and improvements fall within the scope of the claimed invention, which is defined by the appended claims and their equivalents.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010380789.3A CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010380789.3A CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111476843A true CN111476843A (en) | 2020-07-31 |
| CN111476843B CN111476843B (en) | 2023-03-24 |
Family
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010380789.3A Active CN111476843B (en) | 2020-05-08 | 2020-05-08 | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111476843B (en) |
2020
- 2020-05-08 CN CN202010380789.3A patent/CN111476843B/en active Active
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190147245A1 (en) * | 2017-11-14 | 2019-05-16 | Nuro, Inc. | Three-dimensional object detection for autonomous robotic systems using image proposals |
| CN109410238A (en) * | 2018-09-20 | 2019-03-01 | 中国科学院合肥物质科学研究院 | A wolfberry recognition and counting method based on a PointNet++ network |
| CN109784294A (en) * | 2019-01-25 | 2019-05-21 | 中国科学院合肥物质科学研究院 | A wolfberry image recognition and positioning method based on rough-set-theory candidate frame selection |
| CN110674829A (en) * | 2019-09-26 | 2020-01-10 | 哈尔滨工程大学 | Three-dimensional target detection method based on graph convolution attention network |
Non-Patent Citations (2)
| Title |
|---|
| Wang Kai et al.: "Small target detection in images based on improved Faster R-CNN", Video Engineering * |
| Lu Qiang et al.: "3D object recognition based on voxel feature recombination network", Journal of Graphics * |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112215101A (en) * | 2020-09-27 | 2021-01-12 | 武汉科技大学 | A three-dimensional target recognition method and system based on attention mechanism |
| CN112215101B (en) * | 2020-09-27 | 2024-12-20 | 武汉科技大学 | A three-dimensional target recognition method and system based on attention mechanism |
| CN112967221A (en) * | 2020-12-04 | 2021-06-15 | 江苏龙冠新型材料科技有限公司 | Shield segment production and assembly information management system |
| CN112967221B (en) * | 2020-12-04 | 2024-05-14 | 江苏龙冠新型材料科技有限公司 | Shield segment production and assembly information management system |
| CN112950634A (en) * | 2021-04-22 | 2021-06-11 | 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 | Method, equipment and system for identifying damage of wind turbine blade based on unmanned aerial vehicle routing inspection |
| CN112950634B (en) * | 2021-04-22 | 2023-06-30 | 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 | Unmanned aerial vehicle inspection-based wind turbine blade damage identification method, equipment and system |
| CN114758222A (en) * | 2022-03-09 | 2022-07-15 | 哈尔滨工业大学水资源国家工程研究中心有限公司 | A method for damage identification and volume quantification of concrete pipes based on PointNet++ neural network |
| CN114758222B (en) * | 2022-03-09 | 2024-05-14 | 哈尔滨工业大学水资源国家工程研究中心有限公司 | Concrete pipeline damage identification and volume quantification method based on PointNet++ neural network |
| CN116486252A (en) * | 2023-03-03 | 2023-07-25 | 上海大学 | An intelligent unmanned search and rescue system and search and rescue method based on an improved PV-RCNN target detection algorithm |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111476843B (en) | 2023-03-24 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111476843B (en) | Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network | |
| CN107563381B (en) | A multi-feature-fusion target detection method based on a fully convolutional network | |
| CN111080693A (en) | Robot autonomous classification grabbing method based on YOLOv3 | |
| Hou et al. | Detection and localization of citrus fruit based on improved You Only Look Once v5s and binocular vision in the orchard | |
| Wang et al. | Research on image recognition of insulators based on YOLO algorithm | |
| CN111523511B (en) | Video image Chinese wolfberry branch detection method for Chinese wolfberry harvesting and clamping device | |
| CN114821102A (en) | Intensive citrus quantity detection method, equipment, storage medium and device | |
| Zhang et al. | Three-dimensional branch segmentation and phenotype extraction of maize tassel based on deep learning | |
| CN114846998A (en) | Tomato picking method and system of binocular robot based on YOLOv4 algorithm | |
| CN114550166B (en) | A fruit detection method, device and storage medium for smart greenhouses | |
| CN116958823B (en) | Tea tender tip identification and picking point positioning method | |
| CN116704497A (en) | A method and system for extracting rapeseed phenotypic parameters based on three-dimensional point cloud | |
| CN111598172A (en) | Fast detection method of dynamic target grasping pose based on heterogeneous deep network fusion | |
| CN118587188A (en) | A PCB small target defect detection method based on improved YOLOv8s | |
| CN118279643A (en) | Unsupervised defect classification and segmentation method, system and storage medium based on double-branch flow model | |
| Le Louedec et al. | Segmentation and detection from organised 3D point clouds: A case study in broccoli head detection | |
| Hao et al. | [Retracted] Fast Recognition Method for Multiple Apple Targets in Complex Occlusion Environment Based on Improved YOLOv5 | |
| CN118799716A (en) | Crab detection and counting method, device, medium and product based on instance segmentation | |
| Rong et al. | RTMFusion: An enhanced dual-stream architecture algorithm fusing RGB and depth features for instance segmentation of tomato organs | |
| CN116652951A (en) | A robot vision positioning method and device in an unstructured large working space | |
| Zhang et al. | Segmentation of apple point clouds based on ROI in RGB images. | |
| Yu et al. | ASE-UNet: An orange fruit segmentation model in an agricultural environment based on deep learning | |
| CN118247729A (en) | A cattle farm multi-target detection method and system based on GCS-YOLO algorithm | |
| CN117893599A (en) | A method for locating the stalk shearing point of a tomato picking robot | |
| CN117036826A (en) | Power transmission line identification and positioning method, device, and storage device for live working on distribution networks | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication ||
| SE01 | Entry into force of request for substantive examination ||
| GR01 | Patent grant ||