
CN111476843A - Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network - Google Patents


Info

Publication number
CN111476843A
CN111476843A (application CN202010380789.3A)
Authority
CN
China
Prior art keywords
network
wolfberry
branches
point
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010380789.3A
Other languages
Chinese (zh)
Other versions
CN111476843B (en)
Inventor
李伟
贾秀芳
王红艳
王儒敬
黄河
孙丙宇
李娇娥
胡宜敏
金洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West Electronic Business Co ltd
Hefei Institutes of Physical Science of CAS
Original Assignee
West Electronic Business Co ltd
Hefei Institutes of Physical Science of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West Electronic Business Co ltd, Hefei Institutes of Physical Science of CAS filed Critical West Electronic Business Co ltd
Priority to CN202010380789.3A priority Critical patent/CN111476843B/en
Publication of CN111476843A publication Critical patent/CN111476843A/en
Application granted granted Critical
Publication of CN111476843B publication Critical patent/CN111476843B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01D HARVESTING; MOWING
    • A01D46/00 Picking of fruits, vegetables, hops, or the like; Devices for shaking trees or shrubs
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30181 Earth observation
    • G06T2207/30188 Vegetation; Agriculture


Abstract

The invention relates to a wolfberry branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network. Compared with the prior art, it overcomes the difficulty of accurately recognizing wolfberry branches in two-dimensional images caused by occlusion, breakpoints, and overlap. The method comprises the following steps: collecting and preprocessing training samples; voxelizing the three-dimensional point cloud; constructing a wolfberry branch and keypoint detection network; training the network; collecting and preprocessing images of the branches to be recognized; and recognizing and positioning the branches. The method compensates for the large amount of positioning information lost by the convolution operations in the voxelization and sparse 3D convolutional network, and improves the detection accuracy of the keypoints at the branch ends by using the attention network to obtain each relevant point's contribution to target detection and by feature enhancement in the refinement network, thereby achieving accurate recognition and positioning of wolfberry branches.

Description

Recognition and positioning method for wolfberry branches based on an attention mechanism and an improved PV-RCNN network

Technical Field

The invention relates to the technical field of wolfberry harvesting, and in particular to a wolfberry branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network.

Background

With the continuous expansion of the wolfberry planting area, picking has become the bottleneck restricting the sustainable development of the wolfberry industry. Because no technically mature picking machinery is available on the domestic or foreign market, wolfberry harvesting relies mainly on manual labor; however, manual harvesting achieves only 3-5 kg/h, and its cost exceeds 50% of the production cost. Developing wolfberry harvesting machinery suited to China's conditions is therefore of great significance for reducing costs, increasing farmers' income, and ensuring the steady and sustainable development of the wolfberry industry.

Existing wolfberry harvesting machines all depend on the operator's subjective judgment: a clamping device grips the wolfberry branches, which are then swung or brushed, so efficiency is low. Moreover, wolfberry fruits are numerous and small, and during picking they are occluded by leaves and branches, which makes accurate recognition and positioning of wolfberry branches difficult in two-dimensional images.

If computer recognition technology based on three-dimensional point-cloud data (which captures spatial information such as the target's spatial dimensions, distribution characteristics, and three-dimensional shape) could accurately identify the positions of wolfberry branches and the coordinates of the keypoints at their ends, a robotic arm could grasp each branch at a fixed point, according to the branch's position and trend and its end-keypoint coordinates, for high-efficiency picking. This would not only improve picking efficiency and the harvest rate but also minimize damage to the fruit and protect the tree. At present, however, wolfberry branch detection still relies mainly on two-dimensional images; because the natural environment is complex and suffers from occlusion, breakpoints, and overlap, branch positions are difficult to determine directly from such images.

Therefore, improving the accuracy of wolfberry branch detection has become a key technical problem to be solved urgently.

Summary of the Invention

The purpose of the present invention is to overcome the defect in the prior art that two-dimensional images are difficult to recognize accurately because wolfberry branches are occluded, broken, or overlapping, by providing a wolfberry branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network.

To achieve the above purpose, the technical scheme of the present invention is as follows:

The wolfberry branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network comprises the following steps:

Collection and preprocessing of training samples: acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model to obtain a three-dimensional point cloud, annotate the point cloud, and label each branch-end keypoint with a sphere of radius r;

Voxelization of the three-dimensional point cloud: voxelize the point cloud through the VoxelNet network to form multiple grid cells, where the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8;

Construction of the wolfberry branch and keypoint detection network: build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for precisely locating branch and keypoint targets;

Training of the wolfberry branch and keypoint detection network: train the network with the training samples;

Collection and preprocessing of images of the wolfberry branches to be recognized: acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, obtain the three-dimensional point cloud to be recognized with the constructed three-dimensional model, and voxelize it;

Recognition and positioning of wolfberry branches: input the processed point-cloud data into the trained detection network to obtain the positions of the wolfberry branches and of the keypoints at the branch ends, thereby recognizing and positioning the branches.

Constructing the wolfberry branch and keypoint detection network comprises the following steps:

Build the detection network on the basis of the PV-RCNN network, setting its input layer to the grid cells produced by voxelizing the branch point cloud and the radius-r spheres around the branch-end keypoints;

Set its feature-extraction layer to: multi-scale, layer-by-layer feature extraction on the input grid cells and the radius-r keypoint spheres with a sparse 3D convolutional network, and point-cloud feature extraction on the relevant points selected by farthest point sampling (FPS) through an attention-based PointNet network;

Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: construct an attention-based network for precisely locating the candidate boxes of branches and keypoints, serving as the target-regression refinement network;

Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end keypoints.

Training the wolfberry branch and keypoint detection network comprises the following steps:

Input the grid cells and the radius-r keypoint spheres into the 3D sparse convolutional neural network for layer-by-layer feature extraction;

The sparse convolutional neural network consists of four layers, C1-C4, of 3×3×3 3D sparse convolutions that extract features layer by layer;

Convert the C4 feature map into a bird's-eye-view (BEV) feature map, whose size is given by an expression that appears only as an image in the original filing;

According to the feature-map size, the RPN generates a number of anchor boxes (the count is given only as an image in the original filing) at angles of 0°, 45°, and 135°; 3D proposals are generated through non-maximum suppression (NMS), and the category and coordinate position of each 3D proposal are finally obtained;

Use the k relevant points selected by FPS and extract point-cloud features through the attention-based PointNet network;

Training of the target-regression refinement network: concatenate the BEV features corresponding to the 3D proposals with the k weighted relevant-point features F′_i; use the Fusion model to fuse, by element-wise multiplication, the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multi-layer perceptron;

Train the loss function during training: it includes the RPN multi-task target loss L_RPN and the regression-box refinement loss L_REFINE.

Using the k relevant points selected by FPS and extracting point-cloud features through the attention-based PointNet network comprises the following steps:

Select k relevant points from the three-dimensional point cloud with the FPS algorithm: κ = {p_1, p_2, ..., p_k};

The feature of each relevant point p_i is represented as F_i = [F_i^(c), F_i^SA, F_i^bev], i = 1, 2, ..., k, where F_i^(c) is the feature drawn from the feature map produced by each layer c = 1, 2, 3, 4 of the 3D sparse convolution, F_i^SA is the feature of the i-th relevant point computed from the point cloud through the set-abstraction (SA) model, and F_i^bev is the feature obtained from the bird's-eye view by bilinear interpolation;

Compute the weight of the feature F_i of relevant point p_i as F′_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k, where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e., the importance of that point, and F_i is the feature of relevant point p_i.

Training the loss function during training comprises the following steps:

Train the multi-task target loss L_RPN, which consists of the classification loss L_cls, the target-regression-box loss L_boxreg, and the keypoint-regression loss L_keyreg:

An anchor is regarded as a positive sample when IoU > 0.6 and as a negative sample when IoU < 0.45; the loss is expressed as follows:

L_RPN = L_cls + L_boxreg + L_keyreg;

The classification loss L_cls (given only as an image in the original filing) averages the binary cross-entropy L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)) over the samples, where N_+ is the number of positive samples and N_- the number of negative samples;

The target-regression-box loss L_boxreg (given only as an image in the original filing) is a smooth-L1 regression loss over the box residuals, with σ = 2;

Train the keypoint-regression loss L_keyreg (given only as an image in the original filing), which penalizes the deviation between the labeled keypoint coordinates and the predicted keypoint coordinates f(x_i);

Train the 3D-box regression refinement loss L_REFINE, covering both the branch target boxes and the keypoint target boxes; it (given only as an image in the original filing) measures the discrepancy between each labeled target box and the predicted 3D box.

Beneficial Effects

Compared with the prior art, the wolfberry branch recognition and positioning method of the present invention fuses into the PV-RCNN network an attention network over the relevant points and, in its refinement network, attention produced from the convolutional features of the 3D proposals. This compensates for the large amount of positioning information lost by the convolution operations in the voxelization and sparse 3D convolutional network; at the same time, the contribution of each relevant point to target detection obtained from the attention network, together with the feature enhancement in the refinement network, improves the detection accuracy of wolfberry branches and branch-end keypoints, achieving accurate recognition and positioning of wolfberry branches.

Brief Description of the Drawings

FIG. 1 is a flow chart of the method of the present invention.

Detailed Description

For a further understanding of the structural features and effects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawing:

As shown in FIG. 1, the wolfberry branch recognition and positioning method based on an attention mechanism and an improved PV-RCNN network according to the present invention comprises the following steps:

The first step is the collection and preprocessing of training samples. Acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model with existing technology to obtain a three-dimensional point cloud, annotate the point cloud, and label each branch-end keypoint with a sphere of radius r. The point cloud is voxelized into grid cells during preprocessing; the radius-r spheres are used only so that the keypoints can be fed into the sparse 3D convolutional network for feature extraction in the same data format as the grid cells.
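
As an illustrative sketch (not part of the patent's implementation), the radius-r sphere label around a branch-end keypoint amounts to a simple point mask; the function name and the radius value below are hypothetical:

```python
import numpy as np

def sphere_label(points, keypoint, r=0.05):
    """Mask of the points that fall inside the radius-r sphere
    centered on a labeled branch-end keypoint."""
    return np.linalg.norm(points - np.asarray(keypoint, dtype=float), axis=1) <= r
```

The mask can then be used to export the labeled keypoint region in the same point-list format as a voxel grid cell.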

The second step is the voxelization of the three-dimensional point cloud. Voxelize the point cloud through the VoxelNet network to form multiple grid cells, where the point-cloud input region has size (L, W, H), each cell has size (l, w, h), the total number of cells is (L/l, W/w, H/h), and the number of points per cell is set to 8.
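
The grid arithmetic above can be sketched as follows; this is a minimal illustration with hypothetical region and voxel sizes, not the VoxelNet implementation:

```python
import numpy as np

def voxelize(points, region=(8.0, 8.0, 4.0), voxel=(0.1, 0.1, 0.2), max_pts=8):
    """Split the (L, W, H) region into (L/l, W/w, H/h) cells of size
    (l, w, h) and keep at most max_pts points per cell."""
    dims = tuple(int(R / v) for R, v in zip(region, voxel))
    idx = np.floor(points / np.asarray(voxel)).astype(int)
    inside = np.all((idx >= 0) & (idx < np.asarray(dims)), axis=1)
    cells = {}
    for p, i in zip(points[inside], map(tuple, idx[inside])):
        cell = cells.setdefault(i, [])
        if len(cell) < max_pts:          # cap each cell at 8 points
            cell.append(p)
    return dims, cells
```

With the hypothetical sizes above, an 8 m × 8 m × 4 m region and 0.1 m × 0.1 m × 0.2 m voxels give an 80 × 80 × 20 grid.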

The third step is the construction of the wolfberry branch and keypoint detection network. Build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for precisely locating branch and keypoint targets. The PV-RCNN architecture is used to build the detection network, and, to realize the attention mechanism, a refinement network fusing it is established inside PV-RCNN. The specific steps are as follows:

(1) Build the detection network on the basis of the PV-RCNN network, setting its input layer to the grid cells produced by voxelizing the branch point cloud and the radius-r spheres around the branch-end keypoints.

(2) Set its feature-extraction layer to: multi-scale, layer-by-layer feature extraction on the input grid cells and the radius-r keypoint spheres with a sparse 3D convolutional network, and point-cloud feature extraction on the relevant points selected by FPS through the attention-based PointNet network.

(3) Build a refinement network fusing the attention mechanism inside the PV-RCNN detection network: construct an attention-based network for precisely locating the candidate boxes of branches and keypoints, serving as the target-regression refinement network.

(4) Set the output layer of the improved PV-RCNN detection network to the branch positions and the coordinates of the branch-end keypoints.

The fourth step is the training of the wolfberry branch and keypoint detection network: train the network with the training samples. The specific steps are as follows:

(1) Input the grid cells and the radius-r keypoint spheres into the 3D sparse convolutional neural network for layer-by-layer feature extraction;

The sparse convolutional neural network consists of four layers, C1-C4, of 3×3×3 3D sparse convolutions that extract features layer by layer;

Convert the C4 feature map into a bird's-eye-view (BEV) feature map, whose size is given by an expression that appears only as an image in the original filing;

According to the feature-map size, the RPN generates a number of anchor boxes (the count is given only as an image in the original filing) at angles of 0°, 45°, and 135°; 3D proposals are generated through the NMS non-maximum suppression operation, and the category and coordinate position of each 3D proposal are finally obtained.
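
The NMS step can be illustrated with a minimal axis-aligned version in the bird's-eye view; the patent's proposals are oriented 3D boxes, so this is only a simplified sketch with hypothetical box coordinates:

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS over axis-aligned BEV boxes given as [x1, y1, x2, y2]."""
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]          # suppress heavy overlaps
    return keep
```

Boxes that overlap the current best box above the threshold are discarded; the survivors become the 3D proposals passed to the refinement stage.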

(2) Use FPS to select k relevant points and extract point-cloud features through the attention-based PointNet network. Because voxelizing the branch point cloud loses a large amount of important information, k relevant points are selected from the point cloud for feature extraction to compensate as far as possible for the lost information; at the same time, the attention mechanism yields the importance of each relevant point. Processing the point cloud in this way both speeds up training and improves detection accuracy. The specific steps are as follows:

A1) Select k relevant points from the three-dimensional point cloud with the FPS algorithm: κ = {p_1, p_2, ..., p_k};
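
Farthest point sampling, used above to pick the k relevant points, can be sketched as follows (an illustrative implementation, seeded at the first point):

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Iteratively pick the point farthest from all points chosen so far."""
    chosen = [0]                               # seed with the first point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))             # farthest from the chosen set
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return chosen
```

The returned indices spread across the cloud, which is why FPS recovers spatial coverage that voxelization would otherwise discard.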

The feature of each relevant point p_i is represented as F_i = [F_i^(c), F_i^SA, F_i^bev], i = 1, 2, ..., k, where F_i^(c) is the feature drawn from the feature map produced by each layer c = 1, 2, 3, 4 of the 3D sparse convolution, F_i^SA is the feature of the i-th relevant point computed from the point cloud through the set-abstraction (SA) model, and F_i^bev is the feature obtained from the bird's-eye view by bilinear interpolation;

A2) Compute the weight of the feature F_i of relevant point p_i as F′_i = Λ(p_i) ⊙ F_i, i = 1, 2, ..., k, where Λ(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input point, i.e., the importance of that point, and F_i is the feature of relevant point p_i.
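
The gating F′_i = Λ(p_i) ⊙ F_i can be illustrated with a sigmoid attention layer; the parameters w and b below stand in for the learned attention network and are hypothetical:

```python
import numpy as np

def attention_gate(feature, w, b):
    """Λ(p) = sigmoid(w @ F + b) lies in [0, 1]; the weighted feature
    F' is the element-wise product Λ(p) ⊙ F."""
    gate = 1.0 / (1.0 + np.exp(-(w @ feature + b)))
    return gate * feature
```

Channels the attention network scores near 0 are suppressed, while important channels pass through nearly unchanged.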

(3) Training of the target-regression refinement network. Voxelizing the point cloud generally loses some positioning information, which degrades detection accuracy; therefore both the convolutional features and the weighted relevant-point features are fed to the refinement network, where they provide complementary semantic information, while the attention mechanism attaches surrounding context to each point, enhances the features, and further improves the detection results. The steps are: concatenate the BEV features corresponding to the 3D proposals with the k weighted relevant-point features F′_i; use the Fusion model to fuse, by element-wise multiplication, the concatenated result with the attention features produced by convolving the 3D proposals; finally obtain the precise position of the refined 3D bounding box through a multi-layer perceptron.

(4) Train the loss function during training: it includes the RPN multi-task target loss L_RPN and the regression-box refinement loss L_REFINE.

Given the importance of determining the grasping position at the branch keypoints, the keypoints are introduced as supervision, so that after training the network that coarsely locates the branches and keypoints, the refinement network is further trained on the branch target boxes and keypoint target boxes to obtain precise target positioning.

A1) Train the multi-task target loss L_RPN, which consists of the classification loss L_cls, the target-regression-box loss L_boxreg, and the keypoint-regression loss L_keyreg:

An anchor is regarded as a positive sample when IoU > 0.6 and as a negative sample when IoU < 0.45; when 0.45 < IoU < 0.6 the anchor's polarity is hard to judge, so such anchors are ignored when computing the loss. The expression is as follows:

L_RPN = L_cls + L_boxreg + L_keyreg;
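
The anchor assignment rule (positive above IoU 0.6, negative below 0.45, otherwise ignored) can be sketched as:

```python
import numpy as np

def label_anchors(ious, pos_thr=0.6, neg_thr=0.45):
    """Return 1 for positive anchors, 0 for negatives,
    and -1 for anchors ignored by the loss (0.45 < IoU < 0.6)."""
    ious = np.asarray(ious)
    labels = np.full(len(ious), -1, dtype=int)
    labels[ious > pos_thr] = 1
    labels[ious < neg_thr] = 0
    return labels
```

Only anchors labeled 1 or 0 contribute to L_cls; the ambiguous band is excluded so the classifier is not trained on uncertain examples.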

The classification loss L_cls (given only as an image in the original filing) averages the binary cross-entropy L_cls(x, y) = -(x log(y) + (1 - x) log(1 - y)) over the samples, where N_+ is the number of positive samples and N_- the number of negative samples;

The target-regression-box loss L_boxreg (given only as an image in the original filing) is a smooth-L1 regression loss over the box residuals, with σ = 2;
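
A common smooth-L1 form parameterized by σ (here σ = 2, as stated in the text) is shown below; the exact expression in the filing appears only as an image, so this is an assumed standard form, not the patent's verbatim formula:

```python
import numpy as np

def smooth_l1(x, sigma=2.0):
    """Quadratic for |x| < 1/sigma^2, linear beyond; the two pieces
    join continuously at the knee 1/sigma^2."""
    s2 = sigma ** 2
    ax = np.abs(x)
    return np.where(ax < 1.0 / s2, 0.5 * s2 * ax ** 2, ax - 0.5 / s2)
```

Small residuals are penalized quadratically for smooth gradients near zero, while large residuals grow only linearly, limiting the influence of outlier boxes.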

Train the keypoint-regression loss L_keyreg (given only as an image in the original filing), which penalizes the deviation between the labeled keypoint coordinates and the predicted keypoint coordinates f(x_i);

A2) Train the 3D-box refinement loss function LREFINE over both the branch target boxes and the key-point target boxes; its expression is as follows:

(equation image: BDA0002481987310000085)

where
(equation image: BDA0002481987310000086)
is the labeled target box and 3Dbox is the predicted target box.

Fifth step, collection and preprocessing of the images of the wolfberry branches to be recognized: acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, build a three-dimensional model to obtain the three-dimensional point cloud to be recognized, and voxelize that point cloud.
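The voxelization step above (and in claim 1, step 12) bins the point cloud over a region of size (L, W, H) into cells of size (l, w, h), keeping at most 8 points per cell. A sketch under assumed conventions; the region/cell sizes below are placeholders, and anchoring the grid at the origin is an assumption:

```python
def voxelize(points, region=(4.0, 4.0, 2.0), cell=(0.1, 0.1, 0.1), max_pts=8):
    """Group (x, y, z) points into voxel cells of size `cell` over a
    region anchored at the origin; keep at most max_pts points per cell."""
    L, W, H = region
    l, w, h = cell
    voxels = {}
    for x, y, z in points:
        if not (0 <= x < L and 0 <= y < W and 0 <= z < H):
            continue  # discard points outside the input region
        key = (int(x // l), int(y // w), int(z // h))
        bucket = voxels.setdefault(key, [])
        if len(bucket) < max_pts:  # truncate crowded cells to 8 points
            bucket.append((x, y, z))
    return voxels
```

The number of cells then comes out to (L/l, W/w, H/h), matching the claim.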

Sixth step, recognition and localization of the wolfberry branches: feed the processed three-dimensional point-cloud data into the trained wolfberry-branch and key-point detection network to obtain the positions of the branches and of the branch-end key points, thereby recognizing and locating the wolfberry branches.

The foregoing has shown and described the basic principles, main features and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments; the embodiments and the description merely illustrate its principles, and various changes and improvements may be made without departing from its spirit and scope, all of which fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (5)

1. A method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network, characterized by comprising the following steps:

11) Collection and preprocessing of training samples: acquire 20 images of a wolfberry tree from different angles with a binocular camera, build a three-dimensional model to obtain a three-dimensional point cloud, annotate the point cloud, and mark each branch-end key point with a sphere of radius r;

12) Voxelization of the three-dimensional point cloud: voxelize the point cloud through the VoxelNet network to form multiple grid cells, where the point-cloud input region has size (L, W, H), each grid cell has size (l, w, h), the number of grid cells is (L/l, W/w, H/h), and the number of points in each cell is set to 8;

13) Construction of the wolfberry-branch and key-point detection network: build the detection network on the basis of the PV-RCNN network, and fuse an attention mechanism into its PV-RCNN to obtain a refinement network for precise localization of branch and key-point targets;

14) Training of the wolfberry-branch and key-point detection network: train the detection network with the training samples;

15) Collection and preprocessing of the images to be recognized: acquire 20 images of the wolfberry tree to be recognized from different angles with the binocular camera, obtain the three-dimensional point cloud to be recognized using the constructed three-dimensional model, and voxelize that point cloud;

16) Recognition and localization of wolfberry branches: feed the processed three-dimensional point-cloud data into the trained detection network to obtain the positions of the branches and of the branch-end key points, thereby recognizing and locating the wolfberry branches.

2. The method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network according to claim 1, characterized in that constructing the wolfberry-branch and key-point detection network comprises the following steps:

21) build the detection network on the basis of the PV-RCNN network, its input layer being the grid cells produced by voxelizing the three-dimensional point cloud of the branches, together with the spheres of radius r around the branch-end key points;

22) set its feature-extraction layer to: multi-scale, layer-by-layer feature extraction on the input grid cells and the radius-r spheres with a sparse 3D convolutional network, and point-cloud feature extraction on the related points selected by FPS through a PointNet network based on the attention mechanism;

23) build, inside the detection network's PV-RCNN, a refinement network fused with the attention mechanism: a network for precise localization of the candidate boxes of branches and key points, used as the target-regression refinement network;

24) set the output layer of the improved-PV-RCNN detection network to the branch positions and the branch-end key-point coordinates.

3. The method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network according to claim 1, characterized in that training the wolfberry-branch and key-point detection network comprises the following steps:

31) feed the grid cells and the radius-r key-point spheres into a 3D sparse convolutional neural network for layer-by-layer feature extraction;

the sparse convolutional neural network consists of four layers C1, C2, C3, C4 of 3×3×3 3D sparse convolutions, extracting features layer by layer;

convert the C4 feature map into a bird's-eye-view feature map of size (equation image: FDA0002481987300000021);

according to the feature-map size, the RPN generates (equation image: FDA0002481987300000022) anchor boxes at angles of 0, 45 and 135 degrees, 3D proposals are generated by NMS non-maximum suppression, and the category and coordinate position of each 3D proposal are finally obtained;

32) select k related points with FPS and extract point-cloud features through the PointNet network based on the attention mechanism;

33) training of the target-regression refinement network: concatenate the bird's-eye-view features corresponding to the 3D proposals with the weight features F′i of the k related points; use the Fusion model to multiply the concatenated result with the attention features produced by convolving the 3D proposals and fuse them; finally obtain the precise position of the refined 3D bounding box through a multi-layer perceptron;

34) train the loss functions during training: the losses comprise the RPN multi-task target loss LRPN and the regression-box refinement loss LREFINE.
4. The method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network according to claim 3, characterized in that selecting the k related points with FPS and extracting point-cloud features through the attention-based PointNet network comprises the following steps:

41) select k related points from the three-dimensional point cloud with the FPS algorithm, expressed as:

κ = {p1, p2, …, pk};

the feature of each related point pi is expressed as:

(equation image: FDA0002481987300000031), where i = 1, 2, 3, …, k;

where (equation image: FDA0002481987300000032) is the feature map produced by each layer of 3D sparse convolution, c = 1, 2, 3, 4; (equation image: FDA0002481987300000033) is the feature of the i-th related point pi computed from the three-dimensional point cloud by the SA model; (equation image: FDA0002481987300000034) is the feature obtained from the bird's-eye view by bilinear interpolation;

42) compute the weight of the feature Fi of related point pi as:

F′i = A(pi) ⊙ Fi, i = 1, 2, 3, …, k;

where A(·) ∈ [0, 1] is the attention network, whose value is the attention vector of the corresponding input related point, i.e. the importance of that point, and Fi is the feature of related point pi.
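The two operations in claim 4 — farthest point sampling of k related points and the element-wise attention weighting F′i = A(pi) ⊙ Fi — can be sketched as follows. This is an illustrative implementation only: the claim names FPS but does not spell out the algorithm, and the seed point and the attention values here are placeholders:

```python
import math

def farthest_point_sampling(points, k):
    """Greedy FPS: repeatedly pick the point farthest from the set
    already selected (standard algorithm; the exact variant used by
    the patent is an assumption)."""
    chosen = [0]  # arbitrary seed: start from the first point
    d2 = [math.dist(p, points[0]) ** 2 for p in points]
    for _ in range(k - 1):
        idx = max(range(len(points)), key=lambda i: d2[i])
        chosen.append(idx)
        # shrink each point's distance to the nearest selected point
        for i, p in enumerate(points):
            d2[i] = min(d2[i], math.dist(p, points[idx]) ** 2)
    return chosen

def weight_features(features, attention):
    # F'_i = A(p_i) * F_i, element-wise, as in step 42
    return [[a * f for f in feat] for a, feat in zip(attention, features)]
```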
5. The method for recognizing and locating wolfberry branches based on an attention mechanism and an improved PV-RCNN network according to claim 3, characterized in that training the loss functions during training comprises the following steps:

51) train the multi-task target loss function LRPN, which comprises the classification loss Lcls, the target-box regression loss Lboxreg and the key-point regression loss Lkeyreg:

when IoU > 0.6, an anchor is treated as a positive sample; when IoU < 0.45, as a negative sample; the expression is as follows:

LRPN = Lcls + Lboxreg + Lkeyreg;

the classification loss Lcls is expressed as follows:

(equation image: FDA0002481987300000035)

where Lcls(x, y) = -(x log(y) + (1 - x) log(1 - y)), N+ is the number of positive samples and N- is the number of negative samples;

the target-box regression loss Lboxreg is expressed as follows:

(equation image: FDA0002481987300000036)

where σ = 2;

the key-point regression loss Lkeyreg is expressed as follows:

(equation image: FDA0002481987300000041)

where (equation image: FDA0002481987300000042) denotes the labeled key-point coordinates and f(xi) the predicted key-point coordinates;

52) train the 3D-box refinement loss function LREFINE over both the branch target boxes and the key-point target boxes; its expression is as follows:

(equation image: FDA0002481987300000043)

where (equation image: FDA0002481987300000044) is the labeled target box and 3Dbox is the predicted target box.
CN202010380789.3A 2020-05-08 2020-05-08 Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network Active CN111476843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010380789.3A CN111476843B (en) 2020-05-08 2020-05-08 Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010380789.3A CN111476843B (en) 2020-05-08 2020-05-08 Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network

Publications (2)

Publication Number Publication Date
CN111476843A true CN111476843A (en) 2020-07-31
CN111476843B CN111476843B (en) 2023-03-24

Family

ID=71762225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010380789.3A Active CN111476843B (en) 2020-05-08 2020-05-08 Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network

Country Status (1)

Country Link
CN (1) CN111476843B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215101A (en) * 2020-09-27 2021-01-12 武汉科技大学 A three-dimensional target recognition method and system based on attention mechanism
CN112950634A (en) * 2021-04-22 2021-06-11 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 Method, equipment and system for identifying damage of wind turbine blade based on unmanned aerial vehicle routing inspection
CN112967221A (en) * 2020-12-04 2021-06-15 江苏龙冠新型材料科技有限公司 Shield constructs section of jurisdiction production and assembles information management system
CN114758222A (en) * 2022-03-09 2022-07-15 哈尔滨工业大学水资源国家工程研究中心有限公司 A method for damage identification and volume quantification of concrete pipes based on PointNet++ neural network
CN116486252A (en) * 2023-03-03 2023-07-25 上海大学 An intelligent unmanned search and rescue system and search and rescue method based on an improved PV-RCNN target detection algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410238A (en) * 2018-09-20 2019-03-01 中国科学院合肥物质科学研究院 A kind of fructus lycii identification method of counting based on PointNet++ network
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN109784294A (en) * 2019-01-25 2019-05-21 中国科学院合肥物质科学研究院 A method for identification and positioning of wolfberry images based on candidate frame selection technology of rough set theory
CN110674829A (en) * 2019-09-26 2020-01-10 哈尔滨工程大学 Three-dimensional target detection method based on graph convolution attention network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190147245A1 (en) * 2017-11-14 2019-05-16 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN109410238A (en) * 2018-09-20 2019-03-01 中国科学院合肥物质科学研究院 A kind of fructus lycii identification method of counting based on PointNet++ network
CN109784294A (en) * 2019-01-25 2019-05-21 中国科学院合肥物质科学研究院 A method for identification and positioning of wolfberry images based on candidate frame selection technology of rough set theory
CN110674829A (en) * 2019-09-26 2020-01-10 哈尔滨工程大学 Three-dimensional target detection method based on graph convolution attention network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王凯等: "基于改进Faster R-CNN图像小目标检测", 《电视技术》 *
路强等: "基于体素特征重组网络的三维物体识别", 《图学学报》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215101A (en) * 2020-09-27 2021-01-12 武汉科技大学 A three-dimensional target recognition method and system based on attention mechanism
CN112215101B (en) * 2020-09-27 2024-12-20 武汉科技大学 A three-dimensional target recognition method and system based on attention mechanism
CN112967221A (en) * 2020-12-04 2021-06-15 江苏龙冠新型材料科技有限公司 Shield constructs section of jurisdiction production and assembles information management system
CN112967221B (en) * 2020-12-04 2024-05-14 江苏龙冠新型材料科技有限公司 Shield segment production and assembly information management system
CN112950634A (en) * 2021-04-22 2021-06-11 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 Method, equipment and system for identifying damage of wind turbine blade based on unmanned aerial vehicle routing inspection
CN112950634B (en) * 2021-04-22 2023-06-30 内蒙古电力(集团)有限责任公司内蒙古电力科学研究院分公司 Unmanned aerial vehicle inspection-based wind turbine blade damage identification method, equipment and system
CN114758222A (en) * 2022-03-09 2022-07-15 哈尔滨工业大学水资源国家工程研究中心有限公司 A method for damage identification and volume quantification of concrete pipes based on PointNet++ neural network
CN114758222B (en) * 2022-03-09 2024-05-14 哈尔滨工业大学水资源国家工程研究中心有限公司 Concrete pipeline damage identification and volume quantification method based on PointNet ++ neural network
CN116486252A (en) * 2023-03-03 2023-07-25 上海大学 An intelligent unmanned search and rescue system and search and rescue method based on an improved PV-RCNN target detection algorithm

Also Published As

Publication number Publication date
CN111476843B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
CN111476843B (en) Chinese wolfberry branch recognition and positioning method based on attention mechanism and improved PV-RCNN network
CN107563381B (en) A target detection method based on multi-feature fusion based on fully convolutional network
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
Hou et al. Detection and localization of citrus fruit based on improved You Only Look Once v5s and binocular vision in the orchard
Wang et al. Research on image recognition of insulators based on YOLO algorithm
CN111523511B (en) Video image Chinese wolfberry branch detection method for Chinese wolfberry harvesting and clamping device
CN114821102A (en) Intensive citrus quantity detection method, equipment, storage medium and device
Zhang et al. Three-dimensional branch segmentation and phenotype extraction of maize tassel based on deep learning
CN114846998A (en) Tomato picking method and system of binocular robot based on YOLOv4 algorithm
CN114550166B (en) A fruit detection method, device and storage medium for smart greenhouses
CN116958823B (en) Tea tender tip identification and picking point positioning method
CN116704497A (en) A method and system for extracting rapeseed phenotypic parameters based on three-dimensional point cloud
CN111598172A (en) Fast detection method of dynamic target grasping pose based on heterogeneous deep network fusion
CN118587188A (en) A PCB small target defect detection method based on improved YOLOv8s
CN118279643A (en) Unsupervised defect classification and segmentation method, system and storage medium based on double-branch flow model
Le Louedec et al. Segmentation and detection from organised 3D point clouds: A case study in broccoli head detection
Hao et al. [Retracted] Fast Recognition Method for Multiple Apple Targets in Complex Occlusion Environment Based on Improved YOLOv5
CN118799716A (en) Crab detection and counting method, device, medium and product based on instance segmentation
Rong et al. RTMFusion: An enhanced dual-stream architecture algorithm fusing RGB and depth features for instance segmentation of tomato organs
CN116652951A (en) A robot vision positioning method and device in an unstructured large working space
Zhang et al. Segmentation of apple point clouds based on ROI in RGB images.
Yu et al. ASE-UNet: An orange fruit segmentation model in an agricultural environment based on deep learning
CN118247729A (en) A cattle farm multi-target detection method and system based on GCS-YOLO algorithm
CN117893599A (en) A method for locating the stalk shearing point of a tomato picking robot
CN117036826A (en) Power transmission line identification positioning method and equipment for distribution network live working and storage equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant