
CN114691516B - Industrial robot debugging method based on natural language and computer vision - Google Patents


Info

Publication number
CN114691516B
CN114691516B
Authority
CN
China
Prior art keywords
vector
network
hidden state
time
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210375780.2A
Other languages
Chinese (zh)
Other versions
CN114691516A (en)
Inventor
胡海洋
李川豪
陈洁
李忠金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202210375780.2A
Publication of CN114691516A
Application granted
Publication of CN114691516B
Legal status: Active
Anticipated expiration

Classifications

    • G06F 11/3624 Debugging of software by performing operations on the source code, e.g. via a compiler
    • G06F 11/3698 Environments for analysis, debugging or testing of software
    • G06F 40/30 Semantic analysis
    • G06F 8/447 Target code generation
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract


The present invention discloses an industrial robot debugging method based on the combination of natural language and computer vision. A natural language description is passed through a word2vec network and a linear layer to generate semantic information. The environmental features of the industrial robot are extracted by a three-dimensional recurrent convolutional neural network. The environmental features are input into a long short-term memory network to generate an intermediate context, and the intermediate context is used as the input of a GRU network; an API recommendation is then obtained through a recurrent neural network (RNN). The text embedding generated from the natural language description by the word2vec network is input into a long short-term memory network encoder and decoder, which output an AST construction action sequence; the API recommendation and the construction action sequence are combined to generate industrial robot debugging code. The debugging code is added to a robot program editor to complete the debugging. The present invention effectively improves the development efficiency of robot debugging and reduces the deployment time of robot production lines in industrial environments.

Description

Industrial robot debugging method based on combination of natural language and computer vision
Technical Field
The invention belongs to the technical field of industrial robots, and particularly relates to an industrial robot debugging method based on combination of natural language and computer vision.
Background
In recent years, with national initiatives promoting intelligent factories, the manufacturing industry has begun to use robot technology on a large scale to assist production, and the intelligent-manufacturing concept has moved from pilot promotion to comprehensive popularization. As a main component of the intelligent factory, the robot helps improve factory productivity, performs operations that workers cannot, and adapts quickly to new production requirements.
Industrial robot debugging modes currently on the market can be divided into online debugging and offline debugging. Online debugging requires controlling the robot to complete specified actions and saving them, so that the operations can be repeated by running the saved actions. Offline debugging requires the user to program the robot virtually with a software tool; the robot does not need to be stopped during debugging, so production is not hindered. Both approaches have corresponding drawbacks. Online debugging requires the user to operate the robot on site and to have rich experience; programming complicated tasks takes a great deal of time, and the robot must be stopped during debugging, which interrupts production. Offline debugging likewise requires rich experience and programming ability in the corresponding robot language, and the user must fine-tune the robot's actions to the actual scene afterwards. For example, in an elevator company that debugged its robots with the offline method, an engineer took six months to realize a new welding-plate conveying production line, of which up to five months were spent on virtual programming. Debugging robots in industrial environments requires not only specialized domain knowledge but also familiarity with the industrial field environment and the robots' motion trajectories, and even specialized engineers must spend a great deal of time. Shortening the development cycle and adapting rapidly to production requirements have become the main demands of enterprises today.
Both industry and academia have shown great interest in robot-related fields, and methods based on machine learning and deep learning have made important progress; learning natural language features and visual features through neural networks has become a research hotspot. However, current robot debugging methods based on deep learning models have disadvantages such as poor task-code robustness and debugging results that are unsuitable for the field environment, so such debugging methods cannot be applied to factory production environments.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an industrial robot debugging method based on the combination of natural language and computer vision.
The general idea of the method of the invention is:
To remedy the shortcomings of traditional neural-network-based debugging methods, this application adds the environmental features around the robot into the neural network, and at the same time converts the natural language describing the robot code into semantic features that are added to the network to strengthen robustness. The application is mainly divided into five parts: 1) the natural language description is passed through a word2vec network and a linear layer to generate semantic information representing the language semantics, and a linear network then produces a semantic vector of the specified dimension; 2) the features of the photographed industrial robot environment are extracted by a three-dimensional recurrent convolutional neural network (3D-RCNN) model; 3) the obtained features are input into a long short-term memory (LSTM) encoder to generate intermediate contexts, the GRU network is initialized with the semantic vector generated in 1), and the intermediate contexts are fed as the input of the GRU network and through a recurrent neural network (RNN) to obtain an API recommendation; 4) the text embedding generated from the natural language description by the word2vec network is input sequentially into an LSTM encoder and an LSTM decoder, which output an abstract syntax tree (AST) construction action sequence, and the API recommendation is combined with the construction action sequence to generate the industrial robot debugging code; 5) the debugging code from 4) is added to the task module of a robot program editor and finally processed by the user to complete the debugging of the industrial robot. The specific implementation steps are as follows:
Step (1): semantic information generation;
Step (2): environmental feature extraction;
Step (3): API recommendation generation;
Step (4): target code programming of the industrial robot;
Step (5): completing robot debugging.
It is a further object of the present invention to provide a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the above-mentioned method.
It is a further object of the present invention to provide a computing device comprising a memory having executable code stored therein and a processor which, when executing the executable code, implements the method described above.
The invention has the beneficial effect of solving the problems of low robot debugging efficiency and long robot program commissioning periods in actual industrial environments. The main innovations of the industrial robot debugging method based on the combination of natural language and computer vision provided by the invention are: 1) the debugging of the robot is completed using a neural-network-based method; 2) image feature extraction based on computer vision is combined with target code generation based on natural language; 3) image features in the industrial environment are extracted using a three-dimensional recurrent convolutional network; 4) API recommendations are generated using a neural network augmented with semantic information; and 5) the debugging work of the robot is completed by combining an encoder-decoder network with the API recommendations.
The invention does not require extensive robot programming knowledge: the user can accomplish a task simply by inputting a natural language description of the robot code to be generated, which provides a feasible debugging method for inexperienced novices and serves as an auxiliary tool for code engineers. The invention uses an advanced neural-network-based robot debugging technique and incorporates the industrial field environment into the neural network; because existing neural networks do not incorporate the environmental features and actual scenes of the industrial environment, the invention addresses the high time cost and low accuracy of robot debugging in industrial settings. The invention can effectively improve the development efficiency of robot debugging and reduce the deployment time of the robot production line in an industrial environment.
Drawings
FIG. 1 is a schematic flow chart of the method of the invention;
FIG. 2 is a diagram of the construction action sequence;
FIG. 3 is a schematic diagram of code generation.
Detailed Description
The invention will be further analyzed with reference to specific examples.
An industrial robot debugging method based on the combination of natural language and computer vision, as shown in FIG. 1, comprises the following steps:
step (1), generating semantic information
1-1. The natural language description describing a robot action code is input into a word2vec network to generate text embeddings, specifically as follows:
A natural language instruction X = {x_i | i = 1, 2, …, n} is passed through the word2vec network to generate a text-embedding vector matrix E = {e_i | i = 1, 2, …, n} ∈ R^(L×C);
E = word2vec(X) (1)
wherein x_i represents the i-th natural language word, e_i represents the i-th text-embedding vector, word2vec() represents the word2vec network function, L is the number of text embeddings, and C is the embedding dimension;
1-2. The text-embedding vector matrix generated in step 1-1 is passed through two cascaded linear layers A to generate semantic information, which is then input into a linear layer B to output a semantic vector of the specified dimension, specifically as follows:
1-2-1. The text-embedding vector matrix E generated in step 1-1 is converted into a K-dimensional vector representation I, where K = L × C;
K = L × C (2)
1-2-2. The vector representation I is used as the input of the two cascaded linear layers A to obtain semantic information S;
S = W_2σ(W_1I + b_1) + b_2 (3)
wherein W_1, W_2, b_1, b_2 are the trainable weights and biases of the linear functions corresponding to the two cascaded linear layers A, and σ is the ReLU activation function;
1-2-3. The semantic information S is converted into a semantic vector of the specified dimension using linear layer B.
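As an illustration, steps 1-1 through 1-2-3 can be sketched in a few lines of numpy. Everything here is a toy stand-in: the vocabulary, the embedding table (random instead of a trained word2vec model), and all layer sizes are hypothetical, since the patent fixes none of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for step (1). The vocabulary, the (random) embedding
# table, and all layer sizes are hypothetical assumptions.
L, C, D_SEM = 6, 8, 16
vocab = {"move": 0, "arm": 1, "to": 2, "the": 3, "red": 4, "block": 5}
embedding_table = rng.normal(size=(len(vocab), C))

# Random stand-ins for the trainable weights of Eq. (3) and linear layer B
W1, b1 = rng.normal(size=(32, L * C)), np.zeros(32)
W2, b2 = rng.normal(size=(24, 32)), np.zeros(24)
WB = rng.normal(size=(D_SEM, 24))

def relu(z):
    return np.maximum(z, 0.0)

def semantic_vector(words):
    E = embedding_table[[vocab[w] for w in words]]  # Eq. (1): L x C matrix
    I = E.reshape(-1)                               # Eq. (2): K = L * C
    S = W2 @ relu(W1 @ I + b1) + b2                 # Eq. (3): two linear layers A
    return WB @ S                                   # linear layer B -> fixed dim

s = semantic_vector(["move", "arm", "to", "the", "red", "block"])
print(s.shape)  # (16,)
```

The flattening in Eq. (2) is what fixes K = L × C; the final linear layer B is what lets the semantic vector's dimension be chosen independently of the instruction length.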
Step (2) environmental feature extraction
The image data of the industrial environment where the industrial robot is located is used as the input of a three-dimensional recurrent convolutional network, which outputs the visual features of the environment, specifically as follows:
The industrial robot field environment image Q is input into the three-dimensional recurrent convolutional network to generate the environmental visual feature f;
f = 3D-RCNN(Q) (4)
where f is the visual feature representation of the image environment, and 3D-RCNN() is the three-dimensional recurrent convolutional network function.
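A minimal numpy sketch of Eq. (4): the 3D-RCNN is replaced by a single hand-rolled 3-D convolution with ReLU and pooling, purely to show the tensor shapes involved. The clip size, kernel size, and random weights are illustrative assumptions, not the patent's network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for f = 3D-RCNN(Q): one 3-D convolution over a short
# clip of workcell frames, ReLU, then spatial pooling. All shapes
# are illustrative.
T, H, W = 4, 8, 8                 # frames, height, width (hypothetical)
Q = rng.normal(size=(T, H, W))    # grayscale clip of the robot workcell
kernel = rng.normal(size=(2, 3, 3))

def conv3d_valid(x, k):
    """Naive 'valid' 3-D convolution (correlation) with a single kernel."""
    t, h, w = k.shape
    out = np.empty((x.shape[0] - t + 1, x.shape[1] - h + 1, x.shape[2] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for l in range(out.shape[2]):
                out[i, j, l] = np.sum(x[i:i + t, j:j + h, l:l + w] * k)
    return out

feat_map = np.maximum(conv3d_valid(Q, kernel), 0.0)  # conv + ReLU
f = feat_map.mean(axis=(1, 2))                       # pool to one value per time step
print(f.shape)  # (3,)
```

The time axis survives the pooling, which is what allows f_t to be fed step by step into the LSTM encoder of step (3).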
Step (3), API recommendation generation
3-1. The environmental visual features generated in step (2) are input into a long short-term memory network encoder for encoding to generate an intermediate semantic vector, specifically as follows:
The environmental visual feature f is used as the input of the long short-term memory network to generate a hidden state vector g, which is then converted into an intermediate semantic vector;
g_t = LSTM(f_t, g_{t-1}) (5)
wherein g_t is the hidden state vector at time t, g_{t-1} is the hidden state vector at time t-1, f_t is the environmental visual feature at time t, and LSTM() is the long short-term memory network function;
3-2. The GRU network is initialized according to the semantic vector S generated in step (1); the intermediate semantic vector generated in step 3-1 is then converted into a hidden state vector, which is input into the initialized GRU network to generate an intermediate vector, specifically as follows:
3-2-1. The GRU network is initialized according to the semantic vector generated in step (1);
3-2-2. The intermediate semantic vector generated in step 3-1 is converted into a hidden state vector r, which is used as the input of the GRU network to generate an intermediate vector k;
k_t = GRU(r_t, k_{t-1}) (6)
where k_t is the intermediate vector at time t, GRU() is the GRU function, k_{t-1} is the intermediate vector at time t-1, and r_t is the hidden state vector at time t;
3-3. The intermediate vector k generated in step 3-2 is used to generate an API recommendation list through a recurrent neural network, specifically as follows:
The intermediate vector generated in step 3-2 is used as the input of a recurrent neural network (RNN), and the probability distribution of the API recommendation list is then obtained through softmax-layer normalization;
P = softmax(RNN(k)) (7)
where P is the probability distribution of the API recommendation list, softmax() is the normalized exponential function, and RNN() is the recurrent neural network function; the network is trained with the loss
ℓ = -Σ_i y_i log(τ_i) (8)
where τ_i is the predicted probability of the i-th API recommendation output by the recurrent neural network, y_i is the ground-truth API recommendation, and ℓ is the loss rate;
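Steps 3-2 and 3-3 (Eqs. (6)-(7)) can be sketched as a single GRU cell stepped over encoder states, followed by a softmax readout over a toy API list. The API names, dimensions, and random weights are hypothetical; a real implementation would use trained LSTM/GRU/RNN layers.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy GRU cell (Eq. (6)) plus softmax readout (Eq. (7)). Dimensions,
# API names, and all weights are illustrative assumptions.
D, APIS = 8, ["move_to", "grip", "release", "rotate"]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

Wz, Uz = rng.normal(size=(D, D)), rng.normal(size=(D, D))
Wr, Ur = rng.normal(size=(D, D)), rng.normal(size=(D, D))
Wh, Uh = rng.normal(size=(D, D)), rng.normal(size=(D, D))
Wout = rng.normal(size=(len(APIS), D))

def gru_step(r_t, k_prev):                 # Eq. (6): k_t = GRU(r_t, k_{t-1})
    z = sigmoid(Wz @ r_t + Uz @ k_prev)    # update gate
    g = sigmoid(Wr @ r_t + Ur @ k_prev)    # reset gate
    h = np.tanh(Wh @ r_t + Uh @ (g * k_prev))
    return (1 - z) * k_prev + z * h

k = rng.normal(size=D)                     # stands in for init from semantic vector S
for r_t in rng.normal(size=(5, D)):        # 5 hidden-state vectors r_t
    k = gru_step(r_t, k)
P = softmax(Wout @ k)                      # Eq. (7): API probability list
print(len(P), round(float(P.sum()), 6))    # 4 1.0
```

Initializing k from the semantic vector of step (1) is what conditions the API distribution on the natural language description rather than on the visual stream alone.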
Step (4), object code programming of the industrial robot
4-1. According to formula (1), the natural language description describing the robot action code is converted into a text-embedding vector E using the word2vec network;
4-2. The text-embedding vector E from step 4-1 is used as the input of a long short-term memory network encoder, and the encoder output is then input into a long short-term memory network decoder to generate the construction action sequence of the AST tree, specifically as follows:
4-2-1. The text-embedding vector E is used as the input of a long short-term memory network encoder (LSTM-Encoder) to generate a hidden state vector h, which is then converted into an intermediate semantic vector;
h_t = LSTM_e(e_t, h_{t-1}) (9)
wherein h_t is the hidden state vector at time t, LSTM_e() represents the long short-term memory network encoder function, e_t represents the embedding vector at time t, and h_{t-1} represents the hidden state vector at time t-1;
4-2-2. The intermediate semantic vector is converted into a hidden state vector θ, which is used as the input of a long short-term memory network decoder (LSTM-Decoder);
θ_t = LSTM_d([a_{t-1} : ã_{t-1} : β_t], θ_{t-1}) (10)
ã_t = tanh(W_c[θ_t : h_t]) (11)
wherein θ_t is the hidden state at time t, θ_{t-1} is the hidden state at time t-1, LSTM_d() represents the long short-term memory network decoder function, [:] denotes vector concatenation, a_{t-1} represents the AST tree construction action at time t-1, ã_{t-1} represents the attention vector of the hidden state at time t-1, β_t is a vector containing parent boundary information in the derivation process, ã_t is the attention vector of the hidden state at time t, h_t is the context vector, W_c is the connection layer function, and tanh() is the hyperbolic tangent function;
4-2-3. The construction action sequence of the AST tree is generated, specifically:
p(a_t = ApplyConstr[c] | a_<t, x) = softmax(a_c^T W ã_t) (12)
wherein ApplyConstr[c] is one of the action types among the AST tree (abstract syntax tree) construction actions, which applies construction operation c to a boundary field of the same type as c, the field being used to fill a node; p(a_t = ApplyConstr[c] | a_<t, x) represents the probability of action ApplyConstr[c] given the action information before time t and the natural language description; a_t is the AST tree construction action at time t; a_<t represents the AST tree construction action information before time t; x is a natural language word; softmax() is the normalized exponential function; a_c is the AST tree construction action of construction operation c; T denotes vector transposition; W is the connection layer function; and ã_t is the attention vector of the hidden state at time t;
p(a_t = GenToken[v] | a_<t, x) = p(gen | a_t, x)·p(v | gen, a_t, x) + p(copy | a_t, x)·p(v | copy, a_t, x) (13)
wherein GenToken[v] is the other action type among the AST tree construction actions, which fills an AST tree boundary field with a code token v; p(a_t = GenToken[v] | a_<t, x) represents the probability of action GenToken[v] given the action information before time t and the natural language description; a_t is the AST tree construction action at time t; a_<t represents the AST tree construction action information before time t; x is a natural language word; gen is the generation operation; and copy is the copy operation;
Finally, all the construction actions of the AST tree are obtained through formulas (12)-(13), and the construction action sequence of the AST tree is thereby obtained, as shown in FIG. 2;
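The two action-scoring rules of formulas (12)-(13) can be illustrated numerically. The constructor names, token distributions, and gen/copy gate values below are made up for the sketch; only the functional form (a softmax over a_c^T W ã_t, and the gen/copy mixture) follows the text.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy scoring of AST construction actions against the decoder's
# attentional state. Names and sizes are illustrative, not from the patent.
D = 8
CONSTRS = ["Module", "Call", "Assign"]
a_c = rng.normal(size=(len(CONSTRS), D))   # constructor action embeddings
W = rng.normal(size=(D, D))
s_tilde = rng.normal(size=D)               # attention vector of the hidden state

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Eq. (12): p(a_t = ApplyConstr[c] | a_<t, x) = softmax(a_c^T W s_tilde)
p_constr = softmax(a_c @ (W @ s_tilde))

# Eq. (13): GenToken mixes a generation branch and a copy branch;
# both branches here are toy distributions over two candidate tokens.
p_gen, p_copy = 0.7, 0.3                   # p(gen | a_t, x), p(copy | a_t, x)
p_v_gen = np.array([0.9, 0.1])             # p(v | gen, a_t, x)
p_v_copy = np.array([0.2, 0.8])            # p(v | copy, a_t, x)
p_token = p_gen * p_v_gen + p_copy * p_v_copy

print(round(float(p_constr.sum()), 6), round(float(p_token.sum()), 6))  # 1.0 1.0
```

Both distributions are proper (they sum to one), so at each step the decoder can pick either a constructor to expand the tree or a token to fill a leaf.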
4-2-4. The API recommendation list generated in step (3) and the construction action sequence of the AST tree generated in step 4-2-3 are combined to output the final robot code, specifically as follows:
The API recommendations are sorted according to the probability distribution of the list generated in step (3) to obtain the API recommendation with the maximum probability; this API is embedded into the construction action sequence of the AST tree generated in step 4-2-3 to obtain the optimized construction action sequence; the optimized construction action sequence is then used to generate an abstract syntax tree (AST), and the abstract syntax tree is converted by a conversion function into the debugging code of the industrial robot in the current industrial environment, as shown in FIG. 3;
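Step 4-2-4 amounts to an argmax over the API probability distribution followed by splicing the chosen API into the action sequence before code generation. A toy sketch (the action strings, API names, probabilities, and the final conversion are all illustrative placeholders):

```python
# Toy sketch of step 4-2-4: take the highest-probability API from the
# recommendation list and splice it into the AST construction action
# sequence before emitting code. All names/values are placeholders.
APIS = {"move_to": 0.6, "grip": 0.3, "release": 0.1}   # P from step (3)
actions = ["ApplyConstr[Module]", "ApplyConstr[Call]", "GenToken[?]"]

best_api = max(APIS, key=APIS.get)                     # argmax over P
optimized = [a.replace("GenToken[?]", f"GenToken[{best_api}]") for a in actions]
code = f"{best_api}(target)"                           # stand-in for the AST-to-code conversion
print(optimized[-1], code)  # GenToken[move_to] move_to(target)
```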
step (5) completing robot debugging
The debugging code is input into the task module of the robot program editor to complete the debugging of the industrial robot.

Claims (8)

1.一种基于自然语言和计算机视觉结合的工业机器人调试方法,其特征在于包括以下步骤:1. An industrial robot debugging method based on the combination of natural language and computer vision, characterized by comprising the following steps: 步骤(1)、将描述机器人动作代码的自然语言描述生成语义信息步骤(2)、环境特征提取Step (1): Generate semantic information from the natural language description of the robot action code Step (2): Extract environmental features 将工业机器人所在工业环境的图像数据作为三维循环卷积网络的输入,从而输出环境视觉特征;The image data of the industrial environment where the industrial robot is located is used as the input of the three-dimensional recurrent convolutional network to output the visual features of the environment; 步骤(3)、API推荐生成Step (3): Generate API recommendations 3-1利用步骤(2)中生成的环境视觉特征输入长短期记忆网络编码器中进行编码生成中间语义向量;具体如下:3-1 Use the environmental visual features generated in step (2) to input into the long short-term memory network encoder for encoding to generate an intermediate semantic vector; the details are as follows: 将环境视觉特征f作为长短期记忆网络的输入,生成隐藏状态向量g,随后将隐藏状态向量g转化为中间语义向量;The environmental visual feature f is used as the input of the long short-term memory network to generate a hidden state vector g, which is then converted into an intermediate semantic vector. 
gt=LSTM(ft,gt-1) (5)g t =LSTM (f t ,g t-1 ) (5) 其中gt为t时刻的隐藏状态向量,gt-1表示t-1时刻的隐藏状态向量,ft为t时刻的环境视觉特征,LSTM()为长短期记忆网络函数;Where g t is the hidden state vector at time t, g t-1 is the hidden state vector at time t-1, f t is the environmental visual feature at time t, and LSTM() is the long short-term memory network function; 3-2根据步骤(1)生成的语义向量S初始化GRU网络,然后将步骤(3-1)生成的中间语义向量转化为隐藏状态向量,随后将隐藏状态向量输入至初始化后的GRU网络生成中间向量;具体如下:3-2 Initialize the GRU network according to the semantic vector S generated in step (1), then convert the intermediate semantic vector generated in step (3-1) into a hidden state vector, and then input the hidden state vector into the initialized GRU network to generate an intermediate vector; specifically as follows: 3-2-1根据步骤(1)生成的语义向量初始化GRU网络;3-2-1 Initialize the GRU network according to the semantic vector generated in step (1); 3-2-2将步骤3-1生成的中间语义向量转化为隐藏状态向量r,将隐藏状态向量r作为GRU网络的输入,生成中间向量k;3-2-2 Convert the intermediate semantic vector generated in step 3-1 into a hidden state vector r, use the hidden state vector r as the input of the GRU network, and generate an intermediate vector k; kt=GRU(rt,kt-1) (6)k t =GRU(r t ,k t-1 ) (6) 其中kt为t时刻的中间向量,GRU()是GRU函数,kt-1是t-1时刻的中间向量,rt是t时刻的隐藏状态向量;Where k t is the intermediate vector at time t, GRU() is the GRU function, k t-1 is the intermediate vector at time t-1, and r t is the hidden state vector at time t; 3-3利用步骤3-2生成的中间向量k通过循环神经网络来生成API推荐列表;具体如下:3-3 Use the intermediate vector k generated in step 3-2 to generate an API recommendation list through a recurrent neural network; the details are as follows: 根据步骤3-2生成的中间向量作为循环神经网络RNN的输入,然后通过softmax层归一化获得API推荐列表的概率分布;The intermediate vector generated in step 3-2 is used as the input of the recurrent neural network (RNN), and then normalized through the softmax layer to obtain the probability distribution of the API recommendation list. 
P=softmax(RNN(k)) (7)P = softmax(RNN(k)) (7) 其中P是API推荐列表的概率分布,softmax()是归一化指数函数,RNN()是循环神经网络函数,τi是循环神经网络输出的第i个API推荐的预测概率,yi是真实情况的API推荐,是损失率;Where P is the probability distribution of the API recommendation list, softmax() is the normalized exponential function, RNN() is the recurrent neural network function, τ i is the predicted probability of the i-th API recommendation output by the recurrent neural network, yi is the actual API recommendation, is the loss rate; 步骤(4)、工业机器人的目标代码编程Step (4): Target code programming of industrial robots 4-1根据公式(1)利用word2vec网络将描述机器人动作代码的自然语言描述转化为文本嵌入向量E;4-1 According to formula (1), the word2vec network is used to convert the natural language description of the robot action code into a text embedding vector E; 4-2使用步骤4-1文本嵌入向量E作为长短期记忆网络编码器的输入,然后将输出输入到长短期记忆网络解码器中,生成AST树的构建动作序列;具体如下:4-2 Use the text embedding vector E in step 4-1 as the input of the long short-term memory network encoder, and then input the output into the long short-term memory network decoder to generate the AST tree construction action sequence; specifically as follows: 4-2-1将文本嵌入向量E作为长短期记忆网络编码器(LSTM-Encoder)的输入,生成隐藏状态向量h,随后将隐藏状态向量转化为中间语义向量;4-2-1 Take the text embedding vector E as the input of the long short-term memory network encoder (LSTM-Encoder) to generate a hidden state vector h, and then convert the hidden state vector into an intermediate semantic vector; ht=LSTMe(et,ht-1) (9)h t =LSTM e (e t ,h t-1 ) (9) 其中ht为t时刻的隐藏状态向量;LSTMe()表示长短期记忆网络编码器函数,et表示t时刻的嵌入向量,ht-1表示t-1时刻的隐藏状态向量;Where h t is the hidden state vector at time t; LSTM e () represents the long short-term memory network encoder function, e t represents the embedding vector at time t, and h t-1 represents the hidden state vector at time t-1; 4-2-2将中间语义向量转化为隐藏状态向量θ,将隐藏状态向量θ作为长短期记忆网络解码器(LSTM-Decoder)的输入;4-2-2 Convert the intermediate semantic vector into a hidden state vector θ, and use the hidden state vector θ as the input of the long short-term memory network decoder (LSTM-Decoder); 
其中θt为t时刻的隐藏状态,θt-1为t-1时刻的隐藏状态,LSTMd()表示长短期记忆网络解码器函数,[:]表示向量联合特征,at-1表示t-1时刻AST树构建动作,表示t-1时刻隐藏状态的注意力向量,βt为推导过程中包含父亲边界信息的向量,为t时刻隐藏状态的注意力向量,ht是上下文向量,Wc为连接层函数,tanh()为双曲正切函数;Where θt is the hidden state at time t, θt -1 is the hidden state at time t-1, LSTMd () represents the long short-term memory network decoder function, [:] represents the vector joint feature, at-1 represents the AST tree construction action at time t-1, represents the attention vector of the hidden state at time t-1, βt is the vector containing the father boundary information in the derivation process, is the attention vector of the hidden state at time t, h t is the context vector, W c is the connection layer function, and tanh() is the hyperbolic tangent function; 4-2-3生成AST树的构建动作序列,具体是:4-2-3 Generate the construction action sequence of the AST tree, specifically: 其中ApplyConstr[c]是AST树(抽象语法树)构建动作中的其中一种动作类型,该动作将构造操作c应用到与c具有相同类型的边界字段上,该字段用来填充节点;p(at=ApplyConstr[c]|a<t,x)表示在t时刻之前的动作信息和自然语言描述下动作ApplyConstr[c]的概率,at是表示t时刻AST树构建动作,a<t表示t时刻之前的AST树构建动作信息,x是自然语言单词,softmax()为归一化指数函数,ac为构造操作c的AST树构建动作,T为向量转置,W为连接层函数,为t时刻隐藏状态的注意力向量;Where ApplyConstr[c] is one of the action types in the AST tree (abstract syntax tree) construction action, which applies the construction operation c to the boundary field of the same type as c, which is used to fill the node; p( at = ApplyConstr[c]|a <t , x) represents the probability of the action ApplyConstr[c] under the action information and natural language description before time t, a t represents the AST tree construction action at time t, a <t represents the AST tree construction action information before time t, x is a natural language word, softmax() is a normalized exponential function, a c is the AST tree construction action of the construction operation c, T is a vector transpose, W is a connection layer function, is the attention vector of the hidden state at time t; p(at=GenToken[v]|a<t,x)p( at = GenToken[v]|a <t , x) =p(gen|at,x)p(v|gen,at,x)+=p(gen|a t 
,x)p(v|gen,a t ,x)+ p(copy|at,x)p(v|copy,at,x) (13)p(copy|a t ,x)p(v|copy,a t ,x) (13) 其中GenToken[v]是AST树构建动作中的另一种动作类型,该动作将AST树边界字段填充为代码v;pa(t=GenToken[v]|a<t,x)表示在时间片之前的动作信息和自然语言描述下动作GenToken[v]的概率,at是表示t时刻AST树构建动作,a<t表示t时刻之前的AST树构建动作信息,x是自然语言单词,gen为生成操作,copy为复制操作;Where GenToken[v] is another type of action in the AST tree construction action, which fills the AST tree boundary field with code v; pa( t = GenToken[v]| a<t , x) represents the probability of action GenToken[v] under the action information and natural language description before the time slice, a t represents the AST tree construction action at time t, a <t represents the AST tree construction action information before time t, x is a natural language word, gen is a generation operation, and copy is a copy operation; 最后将公式(12)-(13)得到AST树的所有构建动作,进而得到AST树的构建动作序列;Finally, formulas (12)-(13) are used to obtain all the construction actions of the AST tree, and then the construction action sequence of the AST tree is obtained; 4-2-4结合步骤(3)生成的API推荐列表和步骤4-2-3生成的AST树的构建动作序列来输出最终的机器人代码;4-2-4 Combine the API recommendation list generated in step (3) and the AST tree construction action sequence generated in step 4-2-3 to output the final robot code; 步骤(5)、完成机器人调试Step (5): Complete robot debugging 利用步骤4-2-4所得代码输入至机器人程序编辑器的任务模块中,完成工业机器人的调试。Use the code obtained in step 4-2-4 to enter it into the task module of the robot program editor to complete the debugging of the industrial robot. 2.如权利要求1所述的方法,其特征在于步骤(1)具体如下:2. 
The method according to claim 1, characterized in that step (1) is specifically as follows:

1-1. Input the natural language description of the robot action code into a word2vec network to generate text embeddings;

1-2. Pass the text embedding vector matrix generated in step 1-1 through two cascaded linear layers A to generate semantic information, then input the semantic information into linear layer B to output a semantic vector of the specified dimension.

3. The method according to claim 2, characterized in that step 1-1 is specifically as follows:

A natural language instruction X = {x_i | i = 1, 2, …, n} composed of n natural language words is passed through the word2vec network to generate the text embedding vector matrix E = {e_i | i = 1, 2, …, n} ∈ R^{L×C}:

E = word2vec(X)    (1)

where x_i is the i-th natural language word, e_i is the i-th text embedding vector, word2vec() is the word2vec network function, L is the number of text embeddings, and C is the embedding dimension.

4.
The method according to claim 2, characterized in that step 1-2 is specifically as follows:

1-2-1. Flatten the text embedding vector matrix E generated in step 1-1 into a K-dimensional vector representation I, where the dimension K is

K = L × C    (2)

1-2-2. Use the vector representation I as the input of the two cascaded linear layers A to obtain the semantic information S:

S = W_2 σ(W_1 I + b_1) + b_2    (3)

where W_1, W_2, b_1, b_2 are the trainable weights and biases of the linear functions of the two cascaded linear layers A, and σ is the ReLU activation function;

1-2-3. Convert the semantic information S into a semantic vector of the specified dimension using linear layer B.

5. The method according to claim 1, characterized in that step (2) is specifically as follows:

Input the industrial robot's on-site environment image Q into a three-dimensional recurrent convolutional network to generate the environment visual feature f:

f = 3D-RCNN(Q)    (4)

where f is the visual feature representation of the image environment and 3D-RCNN() is the three-dimensional recurrent convolutional network function.

6. The method according to claim 1, characterized in that step 4-2 specifically comprises:

The API recommendation list generated in step (3) is sorted by its probability distribution to obtain the API recommendation with the highest probability.
This API is embedded into the AST construction action sequence generated in step 4-2-3 to obtain the optimized AST construction action sequence; the optimized sequence is then used to generate the abstract syntax tree (AST), and the abstract syntax tree is converted by a conversion function into the debugging code of the industrial robot for the current industrial environment.

7. A computer-readable storage medium on which a computer program is stored, the computer program, when executed in a computer, causing the computer to perform the method of any one of claims 1-6.

8. A computing device comprising a memory and a processor, the memory storing executable code, wherein the processor, when executing the executable code, implements the method of any one of claims 1-6.
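The action probabilities of step 4-2-3 (formulas (12) and (13)) can be sketched numerically. The following is a minimal NumPy illustration, not the patented implementation: the dimensions, the random weights, and the constructor embedding matrix are assumptions introduced purely for the example; only the functional forms — the softmax over a_c^T W ã_t for ApplyConstr actions and the gen/copy mixture for GenToken actions — come from the claims above.

```python
import numpy as np

def softmax(z):
    # normalized exponential function, as in formula (12)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_vector(h_t, theta_t, W_c):
    # attention vector of the hidden state: ã_t = tanh(W_c [h_t : θ_t])
    return np.tanh(W_c @ np.concatenate([h_t, theta_t]))

def p_apply_constr(a_embed, W, a_tilde):
    # formula (12): p(a_t = ApplyConstr[c] | a_<t, x) = softmax(a_c^T W ã_t),
    # computed jointly over all constructors c (one row of a_embed per constructor)
    scores = a_embed @ (W @ a_tilde)
    return softmax(scores)

def p_gen_token(p_gen, p_v_gen, p_copy, p_v_copy):
    # formula (13): mixture of generating v from the vocabulary (gen)
    # and copying v from the natural language input x (copy)
    return p_gen * p_v_gen + p_copy * p_v_copy

rng = np.random.default_rng(0)
d = 8                                      # hidden size (illustrative)
h_t = rng.normal(size=d)                   # context vector h_t
theta_t = rng.normal(size=d)               # hidden state θ_t
W_c = rng.normal(size=(d, 2 * d))          # connection-layer weight
a_tilde = attention_vector(h_t, theta_t, W_c)

constructors = rng.normal(size=(5, d))     # embeddings a_c for 5 hypothetical constructors
W = rng.normal(size=(d, d))
p_constr = p_apply_constr(constructors, W, a_tilde)

p_gen, p_copy = 0.7, 0.3                   # p(gen|a_t,x) + p(copy|a_t,x) = 1
p_v_gen = softmax(rng.normal(size=4))      # p(v|gen, a_t, x) over 4 tokens
p_v_copy = softmax(rng.normal(size=4))     # p(v|copy, a_t, x) over the same tokens
p_token = p_gen_token(p_gen, p_v_gen, p_copy, p_v_copy)
```

Because p(v|gen, a_t, x) and p(v|copy, a_t, x) are each proper distributions and the gen/copy weights sum to one, the mixture in formula (13) is again a valid probability distribution over tokens.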
CN202210375780.2A 2022-04-11 2022-04-11 Industrial robot debugging method based on natural language and computer vision Active CN114691516B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210375780.2A CN114691516B (en) 2022-04-11 2022-04-11 Industrial robot debugging method based on natural language and computer vision


Publications (2)

Publication Number Publication Date
CN114691516A CN114691516A (en) 2022-07-01
CN114691516B true CN114691516B (en) 2025-05-27

Family

ID=82143292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210375780.2A Active CN114691516B (en) 2022-04-11 2022-04-11 Industrial robot debugging method based on natural language and computer vision

Country Status (1)

Country Link
CN (1) CN114691516B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033189A (en) * 2021-04-08 2021-06-25 北京理工大学 Semantic coding method of long-short term memory network based on attention dispersion
CN113569932A (en) * 2021-07-18 2021-10-29 湖北工业大学 An Image Description Generation Method Based on Text Hierarchy

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10565305B2 (en) * 2016-11-18 2020-02-18 Salesforce.Com, Inc. Adaptive attention model for image captioning
CN110609849B (en) * 2019-08-27 2022-03-25 广东工业大学 Natural language generation method based on SQL syntax tree node type
CN111267097B (en) * 2020-01-20 2021-03-02 杭州电子科技大学 A Natural Language-Based Aided Programming Method for Industrial Robots




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant