
CN118927260B - Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration - Google Patents


Info

Publication number
CN118927260B
CN118927260B (application CN202411419480.5A)
Authority
CN
China
Prior art keywords
mechanical arm
time
link mechanical
real
module
Prior art date
Legal status
Active
Application number
CN202411419480.5A
Other languages
Chinese (zh)
Other versions
CN118927260A (en)
Inventor
陈晨
罗健
邓洪
陈伟
古晓东
赵耀
何常红
周敏
郭雅婕
李勇
王振鹏
张煜
杨嘉琛
郭晓旭
白裔峰
梁茹楠
晏寒
张妍君
Current Assignee
China Railway Design Corp
Original Assignee
China Railway Design Corp
Priority date
Filing date
Publication date
Application filed by China Railway Design Corp
Priority application: CN202411419480.5A
Publication of CN118927260A
Application granted
Publication of CN118927260B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/02 Programme-controlled manipulators characterised by movement of the arms, e.g. cartesian coordinate type
    • B25J9/04 Programme-controlled manipulators characterised by movement of the arms, e.g. cartesian coordinate type, by rotating at least one arm, excluding the head movement itself, e.g. cylindrical coordinate type or polar coordinate type
    • B25J9/046 Revolute coordinate type
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1628 Programme controls characterised by the control loop
    • B25J9/163 Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mechanical Engineering (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Robotics (AREA)
  • Feedback Control In General (AREA)
  • Manipulator (AREA)

Abstract

The invention discloses an event-triggered control method and system for a single-link mechanical arm under cloud-edge-end collaboration. The terminal acquires operation data of the single-link mechanical arm and sends it to the edge for preprocessing. According to a preset event-triggering rule, the edge feeds the data into either a model-free controller or a trigger holder to obtain a control instruction. The terminal controls the movement of the single-link mechanical arm according to the control instruction and sends newly acquired operation data to the edge. The cloud, according to the operation data fed back by the edge and the preset event-triggering rule, selectively optimizes and updates the parameters of the model-free controller.

Description

Event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration
Technical Field
The invention belongs to the field of mechanical arm control, and particularly relates to an event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration.
Background
With the progress of technology and the accelerating development of industrial automation, the mechanical arm has become an indispensable part of the automation field, used for important tasks such as assembly, transportation and welding. Thanks to its simple structure and low cost, the single-link mechanical arm is widely applied in industrial production, medical rehabilitation and other fields. Traditional single-link mechanical arm control methods divide mainly into model-based control and data-driven control. Model-based control requires an accurate dynamic model, and the complexity of the arm itself and of the external environment makes such modeling difficult, hindering deeper research and application. Data-driven control, by contrast, designs the controller directly from rich input-output data, and has therefore attracted wide attention from researchers.
Direct model-free adaptive control was first proposed in 2013 by Hou Zhongsheng and Zhu Yuanming in the paper "Controller-dynamic-linearization-based model free adaptive control for discrete-time nonlinear systems," IEEE Transactions on Industrial Informatics, 2013, 9(4): 2301-2309, and has since developed into a typical data-driven control method, prompting researchers to explore its combination with neural networks to improve the control performance of single-input single-output controlled systems. The development of cloud computing, edge computing and direct model-free adaptive control offers a new approach to single-link mechanical arm control. In practical applications, however, the cloud-edge-end system is limited by system bandwidth and communication resources: the traditional scheme of periodic sampling and data transmission consumes large amounts of network bandwidth and energy even when the system state changes only slightly, wasting communication resources and increasing the operating cost of the system.
To address these problems, event-triggered control schemes have emerged: data transmission and control operations are performed only when the system state changes significantly, effectively reducing unnecessary communication and computation and thus saving system resources. To further improve the control performance of the single-link mechanical arm while exploiting the advantages of the cloud-edge-end architecture, the invention deeply fuses direct model-free adaptive control with reinforcement learning into an event-triggered control method. Since no prior study covers this combination, the invention provides an event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration.
Disclosure of Invention
To solve the problems in the background art, the invention aims to provide an event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, wherein the cloud-edge-end architecture comprises a cloud, an edge and a terminal, and the method comprises the following steps:
Step (1): the terminal collects real-time data of the single-link mechanical arm through a sensor signal acquisition device, the real-time data comprising the real-time desired joint angle and the real-time running joint angle, and sends the real-time data to the edge;
Step (2): the edge preprocesses the real-time data and applies a designed dynamic event-triggering mechanism. If the triggering rule is satisfied, the preprocessed real-time data are fed into a model-free controller whose time-varying parameters are adaptively learned by an online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is computed; if the triggering rule is not satisfied, the control instruction of the last trigger time is held. The control instruction is sent to the terminal, and the terminal controls the movement of the single-link mechanical arm according to it;
Step (3): the cloud, according to the data fed back by the edge and the dynamic event-triggering mechanism, optimizes and updates the model-free controller parameters. If the triggering rule is satisfied, the updated parameters are transmitted to the edge's model-free controller; if not, no operation is performed;
Steps (1)-(3) are repeated until the control task is finished.
In the method above, the preprocessing performed by the edge in step (2) comprises filtering and noise reduction, and the dynamic event-triggering mechanism designed in step (2), which determines the next trigger time, is:
;
where s is a positive integer indexing the s-th trigger time; the triggering error is defined from the real-time running joint angle of the single-link mechanical arm at the s-th trigger time and at the k-th sampling time; and the internal dynamic variable obeys the update rule:
;
where the quantities above are the event-triggering parameters together with a preset threshold parameter;
In step (2), an indicator factor is introduced into the triggering rule to indicate whether the rule is satisfied:
;
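The dynamic event-triggering mechanism and indicator factor described above can be traced in a small sketch. The patent's exact trigger rule and parameter values are not reproduced in the text, so the construction below (and the names `sigma`, `lam`, `theta`, `eta0`) follows the common dynamic-trigger design with an internal dynamic variable and is an assumption, not the claimed formula.

```python
def make_dynamic_trigger(sigma=0.5, lam=0.8, theta=1.0, eta0=1.0):
    """Dynamic event trigger sketch for step (2).

    Keeps the last transmitted joint angle y_s and an internal
    dynamic variable eta. All parameter names and values are
    illustrative assumptions.
    """
    state = {"y_s": None, "eta": eta0}

    def should_trigger(y_k):
        if state["y_s"] is None:        # first sample always triggers
            state["y_s"] = y_k
            return True
        e = state["y_s"] - y_k          # triggering error
        # fire when the squared error exceeds a state-dependent
        # threshold softened by the internal dynamic variable
        fired = e ** 2 >= sigma * y_k ** 2 + state["eta"] / theta
        # internal dynamic variable update (illustrative rule)
        state["eta"] = max(lam * state["eta"] + sigma * y_k ** 2 - e ** 2, 0.0)
        if fired:
            state["y_s"] = y_k          # refresh the last-trigger sample
        return fired

    return should_trigger
```

A small state change after a trigger does not fire the rule again, while a large excursion does, which is exactly the bandwidth-saving behaviour the Background section motivates.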
If the triggering rule of step (2) is satisfied, the preprocessed real-time data are fed into the model-free controller whose time-varying parameters are adaptively learned by the online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is computed through the following steps:
Step (2.1): at the k-th sampling time, the model-free controller takes the mathematical form:
;
where the control instruction of the single-link mechanical arm at the k-th sampling time is computed from the control instruction at the last trigger time, gated by the indicator factor; the time-varying parameter vector at the k-th sampling time has L elements, L being the pseudo-order of the controller (a positive integer); the trajectory tracking error of the single-link mechanical arm at the k-th sampling time is the difference between the desired joint angle and the running joint angle at that time; and its first-order backward difference is also used;
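Since the controller equation itself did not survive as text, the sketch below assumes one plausible incremental form of a step (2.1)-style law: the held command plus a theta-weighted combination of L tracking-error terms, with the indicator factor gating between a freshly computed command and the held one. Both function bodies are assumptions, not the patent's formula.

```python
def mfac_control(u_prev, theta, e_terms):
    """Model-free control law sketch for step (2.1).

    theta   : time-varying parameter vector (length L, the pseudo-order)
    e_terms : tracking error and its backward differences, length L
    The incremental form is an assumed stand-in for the patent's
    unextracted equation.
    """
    assert len(theta) == len(e_terms)
    return u_prev + sum(t * e for t, e in zip(theta, e_terms))

def gated_command(gamma, u_new, u_held):
    """Event-trigger gating with the indicator factor gamma:
    gamma = 1 -> use the freshly computed command,
    gamma = 0 -> hold the command from the last trigger time."""
    return gamma * u_new + (1 - gamma) * u_held
```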
Step (2.2): construct the time-varying parameter adaptive learning mechanism based on the online Actor-Critic reinforcement learning network:
The inputs of the online Actor-Critic reinforcement learning network are the trajectory tracking error of the single-link mechanical arm at the k-th sampling time and its first-order backward difference. The output comprises an Actor part and a Critic part: the Actor part outputs the elements of the time-varying parameter vector, and the Critic part outputs a value function.
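The single-hidden-layer radial basis function network named later in the text can be sketched with the interface step (2.2) describes. Only that interface (inputs: tracking error and its backward difference; outputs: L controller parameters and a scalar value) comes from the text; the Gaussian centers, width, hidden size `h` and zero-initialized weights are illustrative assumptions.

```python
import math
import random

class RBFActorCritic:
    """Single-hidden-layer RBF Actor-Critic sketch for step (2.2).

    Actor head -> L time-varying controller parameters;
    Critic head -> scalar value estimate.
    Centers, width and initial weights are assumptions.
    """
    def __init__(self, L=2, h=6, width=1.0, seed=0):
        rng = random.Random(seed)
        self.centers = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                        for _ in range(h)]
        self.width = width
        self.W_actor = [[0.0] * L for _ in range(h)]   # h x L actor weights
        self.w_critic = [0.0] * h                      # critic weights

    def hidden(self, e, de):
        # Gaussian radial basis activations of the single hidden layer
        return [math.exp(-((c1 - e) ** 2 + (c2 - de) ** 2)
                         / (2 * self.width ** 2))
                for c1, c2 in self.centers]

    def forward(self, e, de):
        phi = self.hidden(e, de)
        L = len(self.W_actor[0])
        theta = [sum(p * w[j] for p, w in zip(phi, self.W_actor))
                 for j in range(L)]                    # parameter vector
        value = sum(p * w for p, w in zip(phi, self.w_critic))
        return theta, value
```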
In step (3), the cloud, according to the data fed back by the edge and the dynamic event-triggering mechanism, optimizes and updates the model-free controller parameters if the triggering rule is satisfied, through the following steps:
Step (3.1): construct the real-time dynamic linearization model of the single-link mechanical arm:
;
where k is the sampling time; the model relates the real-time running joint angle of the single-link mechanical arm at the next time to the control instruction at time k through the pseudo partial derivative of the single-link mechanical arm at the k-th sampling time;
The iterative learning law of the pseudo partial derivative is:
;
where the quantities refer to the s-th trigger time and the pseudo partial derivative of the single-link mechanical arm at that trigger time; the update is gated by the indicator factor and governed by a step-size factor and a penalty factor;
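The pseudo-partial-derivative (PPD) learning law of step (3.1) can be sketched with the standard projection-style MFAC estimator, gated by the indicator factor. The exact law in the patent is not shown above, so the update formula, the reset heuristic, and the values of the step-size factor `eta` and penalty factor `mu` are assumptions.

```python
def ppd_update(phi_prev, du, dy, eta=0.5, mu=1.0, gamma=1):
    """PPD estimation sketch for step (3.1).

    phi_prev : estimate at the previous trigger time
    du, dy   : control-input and output increments
    gamma    : indicator factor (0 -> no trigger, hold the estimate)
    """
    if gamma == 0:
        return phi_prev
    # projection-style correction toward the observed increment ratio
    phi = phi_prev + eta * du * (dy - phi_prev * du) / (mu + du * du)
    # simple reset heuristic keeping the estimate away from zero or
    # sign flips (an assumption; MFAC texts use a reset of this kind)
    if abs(phi) < 1e-5 or phi * phi_prev < 0:
        phi = phi_prev
    return phi
```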
Step (3.2): a one-step-forward error is used in the optimization update; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), it is computed as:
;
where the one-step-forward error is the trajectory tracking error of the single-link mechanical arm at the next time, computed with the desired joint angle of the single-link mechanical arm at the (k+1)-th sampling time;
Gradient information of the single-link mechanical arm is also used in the optimization update; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), it is computed as:
;
where the resulting partial derivative is the gradient information;
Step (3.3): the optimization update uses the partial derivatives of the control instruction of the single-link mechanical arm with respect to the elements of the time-varying parameter vector, computed as follows:
If the controller pseudo-order L = 1, then
;
If the controller pseudo-order L > 1, then
,;
Step (3.4): define a temporal-difference function in terms of the value function and a discount factor, so as to minimize the system performance index function. The weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights, are optimized and updated by gradient descent;
The Actor network weights at the sampling time are optimized and updated, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order L = 1, then
;
If the controller pseudo-order L > 1, then
,;
where the quantities refer to the Actor network weights at the s-th trigger time, the indicator factor, and the learning rate of the Actor network; the one-step-forward error takes the mathematical form of step (3.2); a normal distribution function with zero mean and given variance provides exploration; and the output of the i-th hidden-layer node of the reinforcement learning network enters the update;
The Critic network weights at the sampling time are optimized and updated:
;
where the quantities refer to the Critic network weights at the s-th trigger time, the indicator factor, and the learning rate of the Critic network;
Step (3.5): the elements of the controller's time-varying parameter vector at the sampling time are optimized and updated:
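The gradient-descent machinery of step (3.4) can be outlined as follows: a temporal-difference function drives both heads, and each weight moves down the gradient of the squared TD error. The patent's Actor update additionally involves the one-step-forward error, the indicator factor and exploration noise; the simplified forms, learning rates and reward interface below are assumptions.

```python
def td_function(r, v_next, v, discount=0.95):
    """Temporal-difference function sketch for step (3.4):
    delta = r + discount * V(k+1) - V(k)."""
    return r + discount * v_next - v

def critic_update(w, phi, delta, lr=0.05, gamma=1):
    """Gradient-descent Critic weight step, gated by the indicator
    factor gamma. With V = sum(phi_i * w_i) and cost 0.5*delta^2,
    the descent direction is +lr * delta * phi_i (v_next held fixed).
    """
    if gamma == 0:
        return list(w)
    return [wi + lr * delta * pi for wi, pi in zip(w, phi)]
```

The Actor weights would be updated the same way, with the chain of partial derivatives from step (3.3) replacing `phi` in the gradient.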
In the method above, the Actor-Critic reinforcement learning network adopts a radial basis function network whose hidden layer is a single layer, i.e. a three-layer structure consisting of an input layer, a single hidden layer and an output layer.
The invention also provides a single-link mechanical arm control system triggered by the event under the cooperation of the cloud edge end, which comprises the following components:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending it to the first dynamic event triggering module and to the cloud;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
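The edge-side module chain above (data input, preprocessing, trigger check, controller or holder, data output) can be traced in a compact sketch. The static trigger threshold and proportional parameter update are deliberate simplifications standing in for the dynamic trigger and the RBF-tuned model-free controller; every constant (`sigma`, `eps`, `kp`) is an assumption.

```python
def edge_step(y_d_k, y_k, state, sigma=0.5, eps=1e-3, kp=0.5):
    """One edge-side decision for sample k.

    state holds the last transmitted angle y_s and the current
    command u. On a trigger the command is refreshed (model-free
    controller module); otherwise it is held (control-instruction
    holding module).
    """
    e = y_d_k - y_k                       # tracking error after preprocessing
    triggered = (state["y_s"] is None
                 or abs(state["y_s"] - y_k) >= sigma * abs(y_k) + eps)
    if triggered:
        state["y_s"] = y_k                # event fired: update trigger sample
        state["u"] += kp * e              # recompute the control instruction
    return state["u"]                     # command sent to the terminal
```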
Further, the invention adopts the following technical scheme:
A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
Further, the invention adopts the following technical scheme:
An electronic device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
The beneficial technical effects of the invention are as follows:
For single-input single-output controlled systems, no prior study combines direct model-free adaptive control with a reinforcement learning network, nor addresses the system-bandwidth and communication-resource limits faced by the cloud-edge-end architecture through an event-triggering mechanism. The event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration provide at least the following beneficial technical effects:
(1) Reduced modeling difficulty: the data-driven control method requires no accurate dynamic model, lowering the modeling difficulty and easing practical application;
(2) Improved control precision: deeply fusing model-free adaptive control with reinforcement learning yields a stronger learning capability, improving the control precision of the single-link mechanical arm and meeting higher-precision application requirements;
(3) Improved computing efficiency and saved system resources: the strong computing power of the cloud platform is used for model training and data analysis, while edge devices handle real-time data processing and fast response of control instructions, effectively improving computing efficiency; moreover, the event-triggering mechanism performs data transmission and control operations only when the system state changes significantly, reducing unnecessary communication and computation and saving system resources;
(4) The proposed control method and system apply not only to the single-link mechanical arm but can also be extended to other single-input single-output control systems, with broad application prospects.
Drawings
FIG. 1 is a schematic view of a single link mechanical arm;
Fig. 2 is a schematic flow chart of a control method of a single-link mechanical arm triggered by an event under the coordination of a cloud edge end;
Fig. 3 is a schematic diagram of a module connection of a control system of a single-link mechanical arm triggered by an event under the cooperation of a cloud edge end.
Detailed Description
The invention discloses an event-triggered control method and system for a single-link mechanical arm under cloud-edge-end collaboration. The terminal acquires operation data of the single-link mechanical arm and sends it to the edge for preprocessing; according to a preset event-triggering rule, the edge feeds the data into either the model-free controller or a trigger holder to obtain a control instruction; the terminal controls the movement of the single-link mechanical arm according to the instruction, acquires new operation data, and sends it to the edge; and the cloud, according to the operation data fed back by the edge and the preset event-triggering rule, selectively optimizes and updates the model-free controller parameters and evaluates the control performance. The method and system suit the single-link mechanical arm, reduce modeling difficulty, improve control precision and computing efficiency, save system resources, and can be extended to other single-input single-output systems.
Fig. 1 shows a schematic view of the single-link mechanical arm, Fig. 2 the flow of the control method, and Fig. 3 the module connections of the control system.
the invention provides a single-link mechanical arm control method triggered by events under the cooperation of a cloud edge end, wherein the cloud edge end comprises a cloud end, an edge end and a terminal, and the method comprises the following steps:
the method comprises the steps that (1) a terminal collects real-time data of a single-link mechanical arm through a sensor signal collection device, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles, and the real-time data are sent to an edge end;
The method comprises the steps of (2) preprocessing real-time data by an edge end, designing a dynamic event trigger mechanism, inputting the preprocessed real-time data into a model-free controller for performing time-varying parameter self-adaptive learning based on an online Actor-Critic reinforcement learning network if a trigger rule is met, calculating to obtain a control instruction of a single-link mechanical arm, and if the trigger rule is not met, maintaining the control instruction at the last trigger moment, sending the control instruction to a terminal, and controlling the movement of the single-link mechanical arm by the terminal according to the control instruction;
The cloud end performs optimization updating on the model-free controller parameters according to the data fed back by the edge end and the dynamic event triggering mechanism, if the triggering rule is met, the optimized updated model-free controller parameters are transmitted to the model-free controller of the edge end, and if the triggering rule is not met, no operation is performed;
repeating the steps (1) - (3) until the control task is finished.
According to the single-link mechanical arm control method for event triggering under the cooperation of the cloud edge end, which is provided by the invention, the edge end in the step (2) carries out preprocessing on the real-time data, the preprocessing step comprises filtering and noise reduction, and the dynamic event triggering mechanism in the step (2) is designed and used for determining that the dynamic event triggering mechanism at the next triggering moment is as follows:
;
Wherein, S is a positive integer for the s-th trigger time,In order to trigger the error in the time of the error,The real-time operation joint angle of the single-link mechanical arm at the s-th trigger moment,The real-time operation joint angle of the single-link mechanical arm at the k sampling moment; For internal dynamic variables, the update rules are:
;
Wherein, AndAre all the event-triggering parameters, and the event-triggering parameters,,,Is a preset threshold parameter;
The triggering rule in the step (2) is introduced with an indicating factor Indicating whether the trigger rule is satisfied:
If the trigger rule is met in the step (2), inputting the preprocessed real-time data into a model-free controller based on-line Actor-Critic reinforcement learning network for time-varying parameter self-adaptive learning, and calculating to obtain a control instruction of the single-link mechanical arm, otherwise, keeping the control instruction of the last trigger moment, wherein the calculation of the model-free controller based on-line Actor-Critic reinforcement learning network for time-varying parameter self-adaptive learning comprises the following steps:
And (2.1) at the time of k sampling, the mathematical formula of the model-free controller is as follows:
;
Wherein, Is a control instruction of the single-link mechanical arm at k sampling moments,Is the firstControl instructions of the single-link mechanical arm at the moment of triggering; Is the indicator; For a time-varying parameter vector of k sample instants, Is thatIs a first element of the (c) a (c),L is the pseudo-order of the controller, L is a positive integer; the tracking error of the track of the single-link mechanical arm at the k sampling moment, ,For the desired joint angle of the single link mechanical arm at the k sampling time,The operation joint angle of the single-link mechanical arm at the k sampling moment;,; Tracking error of track of single-link mechanical arm at k sampling moment Is a first order backward difference of (a);
Step (2.2), constructing a time-varying parameter self-adaptive learning mechanism based on an online Actor-Critic reinforcement learning network:
The input of the online Actor-Critic reinforcement learning network comprises the track tracking error of the k sampling moment single-link mechanical arm First-order backward difference of tracking error of single-link mechanical arm track at k sampling momentThe output of the online Actor-Critic reinforcement learning network comprises an Actor output part and a Critic output part, and the Actor output part comprises the time-varying parameter vectorIn (a) and (b),The Critic output portion contains a value function.
According to the method for controlling the single-link mechanical arm triggered by the event under the cooperation of the cloud side end, the cloud side in the step (3) combines the dynamic event triggering mechanism according to the data fed back by the edge end, if the triggering rule is met, the model-free controller parameter is optimized and updated, otherwise, no operation is performed, and the optimizing and updating process comprises the following steps:
step (3.1), constructing a real-time dynamic linearization model of the single-link mechanical arm:
;
Wherein k is the sampling time, Is thatMoment single-link mechanical arm the joint angle is operated in real time,;Is a control instruction of the single-link mechanical arm at the moment k,;The pseudo partial derivative of the single-link mechanical arm at the k sampling moment;
the pseudo partial derivative The iterative learning law of (a) is:
;
Wherein, Is the firstThe time of the triggering of the device is the same,Is the firstPseudo partial derivatives of single-link mechanical arms at the moment of triggering; in order to indicate the factor(s), As a step-size factor,Is a penalty factor;
and (3.2) using a one-step forward error in the optimization updating process, and calculating a mathematical formula of the one-step forward error based on the real-time dynamic linearization model of the single-link mechanical arm in the step (3.1) as follows:
;
Wherein, Is thatTracking errors of the track of the moment single-connecting-rod mechanical arm, namely the one-step forward errors; The expected joint angle of the single-link mechanical arm at the sampling moment of k+1;
And (3) in the optimization updating process, gradient information of the single-link mechanical arm is used, and based on the real-time dynamic linearization model of the single-link mechanical arm in the step (3.1), a mathematical formula for calculating the gradient information of the single-link mechanical arm is as follows:
;
Wherein is the derivative of the joint angle with respect to the control instruction, namely the gradient information;
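Under an assumed compact-form dynamic linearization model y(k+1) = y(k) + phi(k)·Δu(k) (the patent's exact formulas appear only as images), the one-step forward error and the gradient information reduce to very simple expressions; the function names and the model form below are assumptions:

```python
def one_step_forward_error(y_d_next, y_k, phi_k, du_k):
    """e(k+1) = y_d(k+1) - y(k+1), with y(k+1) predicted by the assumed
    dynamic linearization model y(k+1) = y(k) + phi(k) * du(k)."""
    y_next_pred = y_k + phi_k * du_k
    return y_d_next - y_next_pred

def gradient_info(phi_k):
    """Under the same model, d y(k+1) / d u(k) is just the PPD phi(k)."""
    return phi_k
```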
Step (3.3), the control instruction of the single-link mechanical arm is used in the optimization and update process; the partial derivatives of the control instruction with respect to the elements of the time-varying parameter vector are calculated as:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
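For a controller of the assumed incremental form u(k) = u(last trigger) + Σ_i λ_i(k)·d_i(k), where d_i(k) is the (i−1)-th backward difference of the tracking error and the number of terms is the pseudo-order L, the partial derivative of the control instruction with respect to each time-varying parameter is simply the corresponding error difference. A sketch (the controller form and all names are assumptions, since the patent's law is given only as images):

```python
def control_and_param_grads(u_prev, lambdas, err_diffs):
    """Assumed controller: u(k) = u_prev + sum_i lambdas[i] * err_diffs[i],
    where err_diffs[i] is the i-th backward difference of the tracking
    error (err_diffs[0] is the error itself). Because u is linear in each
    lambda_i, d u / d lambda_i = err_diffs[i]."""
    u = u_prev + sum(l * d for l, d in zip(lambdas, err_diffs))
    grads = list(err_diffs)   # partial derivatives w.r.t. each lambda_i
    return u, grads
```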
Step (3.4), define a temporal-difference function , wherein is the value function and is the discount factor. To minimize the system performance index function , the gradient descent method is adopted to optimize and update the weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights; h is the number of hidden-layer nodes of the reinforcement learning network;
Optimize and update the Actor network weights at the sampling time, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Wherein is the Actor network weight at the s-th trigger time, is the indicator factor, is the learning rate of the Actor network, and adopts the mathematical formula of the one-step forward error in step (3.2); is a normal distribution function with expectation 0 and variance ; is the output of the i-th node of the hidden layer of the reinforcement learning network;
Optimize and update the Critic network weights at the sampling time:
;
Wherein is the Critic network weight at the s-th trigger time, is the indicator factor, and is the learning rate of the Critic network;
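The gradient-descent updates of step (3.4) can be sketched as one temporal-difference step on an RBF Actor-Critic pair. The shapes, learning rates and the exact actor gradient below are illustrative assumptions consistent with the described scheme (one-step forward error propagated through the plant gradient), not the patent's precise formulas:

```python
import numpy as np

def td_update(w_actor, w_critic, h_phi, value, value_next, reward,
              grad_u, e_next, gamma=0.95, lr_a=0.01, lr_c=0.05):
    """One gradient-descent step on Actor and Critic weights.

    h_phi   : hidden-layer output vector shared by both heads, shape (h,)
    w_actor : shape (h, L)  -- one column per time-varying parameter
    w_critic: shape (h,)    -- linear value-function head
    grad_u  : d u / d lambda vector, shape (L,)  (error differences)
    e_next  : one-step forward tracking error (scalar)
    """
    # temporal-difference error: delta = r + gamma * V(k+1) - V(k)
    delta = reward + gamma * value_next - value
    # Critic: dV/dw_c = h_phi, so descend along delta * h_phi
    w_critic = w_critic - lr_c * delta * h_phi
    # Actor: chain rule through the plant, driven by the forward error
    w_actor = w_actor - lr_a * e_next * grad_u * h_phi[:, None]
    return w_actor, w_critic, delta
```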
Step (3.5), optimize and update the elements of the time-varying parameter vector of the controller at the sampling time:
According to the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, the Actor-Critic reinforcement learning network adopts a radial basis function network, and the radial basis function network adopts a single-hidden-layer structure, i.e., a three-layer network structure consisting of an input layer, a single hidden layer and an output layer.
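A minimal sketch of such a three-layer network follows, assuming Gaussian basis functions and randomly placed centers (the patent fixes only the topology; all dimensions, centers and widths here are illustrative assumptions):

```python
import numpy as np

class RBFActorCritic:
    """Three-layer RBF network: input layer -> single Gaussian hidden
    layer -> two linear heads: an Actor head producing the time-varying
    controller parameters and a Critic head producing a scalar value."""

    def __init__(self, n_in=2, n_hidden=7, n_actor=2, width=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))
        self.width = width
        self.w_actor = rng.normal(0.0, 0.1, size=(n_hidden, n_actor))
        self.w_critic = rng.normal(0.0, 0.1, size=n_hidden)

    def hidden(self, x):
        # Gaussian radial basis on the input [e(k), delta_e(k)]
        d2 = np.sum((self.centers - x) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def forward(self, x):
        phi = self.hidden(np.asarray(x, dtype=float))
        lambdas = phi @ self.w_actor   # Actor head: parameter vector
        value = phi @ self.w_critic    # Critic head: value function
        return lambdas, value, phi
```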
Fig. 3 shows a schematic diagram of the module connections of the event-triggered single-link mechanical arm control system under cloud-edge-end collaboration. The invention also provides an event-triggered single-link mechanical arm control system under cloud-edge-end collaboration, comprising:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm through the sensor signal acquisition device, wherein the real-time data comprises a real-time expected joint angle and a real-time running joint angle;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending the preprocessed real-time data to the first dynamic event triggering module and the cloud end;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
Further, the invention adopts the following technical scheme:
A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
Further, the invention adopts the following technical scheme:
An electronic device comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
From the above description of the embodiments, it will be apparent to those skilled in the art that the apparatus of the present invention may be implemented by means of software plus a necessary general-purpose hardware platform. Embodiments of the invention may be implemented using existing processors, by special-purpose processors for this or another purpose in an appropriate system, or by a hardwired system. Embodiments of the invention also include non-transitory computer-readable storage media comprising machine-readable media for carrying or storing machine-executable instructions or data structures, which may be any available media accessible by a general-purpose or special-purpose computer or other machine with a processor. Such machine-readable media may include, for example, RAM, ROM, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of machine-executable instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or other machine with a processor. When information is transferred or provided over a network or another communications connection (hardwired, wireless, or a combination of hardwired and wireless) to a machine, the connection is also considered a machine-readable medium.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will fall within the scope of the present invention.

Claims (9)

1. An event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, characterized in that the cloud-edge-end comprises a cloud end, an edge end and a terminal, and the method comprises the following steps:
Step (1), the terminal collects real-time data of the single-link mechanical arm, the real-time data comprising a real-time expected joint angle and a real-time running joint angle, and sends the real-time data to the edge end;
Step (2), the edge end preprocesses the real-time data and designs a dynamic event-triggering mechanism; if the trigger rule is satisfied, the preprocessed real-time data are input into a model-free controller performing time-varying-parameter adaptive learning based on an online Actor-Critic reinforcement learning network, and a control instruction of the single-link mechanical arm is calculated; if the trigger rule is not satisfied, the control instruction at the last trigger time is held; the control instruction is sent to the terminal, and the terminal controls the motion of the single-link mechanical arm according to the control instruction;
Step (3), the cloud end optimizes and updates the model-free controller parameters according to the data fed back by the edge end and the dynamic event-triggering mechanism; if the trigger rule is satisfied, the optimized and updated model-free controller parameters are transmitted to the model-free controller of the edge end; if the trigger rule is not satisfied, no operation is performed;
Repeating the steps (1) - (3) until the control task is finished;
The dynamic event-triggering mechanism designed in step (2) for determining the next trigger time is:
;
Wherein is the s-th trigger time, s is a positive integer; is the trigger error; is the real-time running joint angle of the single-link mechanical arm at the s-th trigger time; is the real-time running joint angle of the single-link mechanical arm at sampling time k; is an internal dynamic variable, whose update rule is:
;
Wherein and are event-trigger parameters, , , and is a preset threshold parameter;
An indicator factor is introduced into the trigger rule in step (2) to indicate whether the trigger rule is satisfied:
2. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that in step (2), if the trigger rule is satisfied, the preprocessed real-time data are input into the model-free controller performing time-varying-parameter adaptive learning based on the online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is calculated, comprising the following steps:
Step (2.1), at sampling time k, the mathematical formula of the model-free controller is:
;
Wherein is the control instruction of the single-link mechanical arm at sampling time k, and is the control instruction of the single-link mechanical arm at the s-th trigger time; is the indicator factor; is the time-varying parameter vector at sampling time k, is its first element, , where L is the pseudo-order of the controller and L is a positive integer; is the trajectory tracking error of the single-link mechanical arm at sampling time k, , where is the expected joint angle of the single-link mechanical arm at sampling time k and is the running joint angle of the single-link mechanical arm at sampling time k; is the first-order backward difference of the trajectory tracking error of the single-link mechanical arm at sampling time k;
Step (2.2), constructing a time-varying parameter self-adaptive learning mechanism based on an online Actor-Critic reinforcement learning network:
The input of the online Actor-Critic reinforcement learning network comprises the trajectory tracking error of the single-link mechanical arm at sampling time k and the first-order backward difference of that tracking error; the output of the online Actor-Critic reinforcement learning network comprises an Actor output part and a Critic output part, wherein the Actor output part contains the elements of the time-varying parameter vector and the Critic output part contains a value function.
3. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 2, characterized in that in step (3) the cloud end applies the dynamic event-triggering mechanism to the data fed back by the edge end, and if the trigger rule is satisfied, the method for optimizing and updating the model-free controller parameters comprises the following steps:
step (3.1), constructing a real-time dynamic linearization model of the single-link mechanical arm:
;
Wherein k is the sampling time; is the real-time running joint angle of the single-link mechanical arm at time k+1; is the control instruction of the single-link mechanical arm at time k; is the pseudo partial derivative of the single-link mechanical arm at sampling time k;
the pseudo partial derivative The iterative learning law of (a) is:
;
Wherein is the s-th trigger time, is the pseudo partial derivative of the single-link mechanical arm at the s-th trigger time; is the indicator factor, is the step-size factor, and is the penalty factor;
Step (3.2), a one-step forward error is used in the optimization and update process; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), the one-step forward error is calculated as:
;
Wherein is the trajectory tracking error of the single-link mechanical arm at time k+1; is the expected joint angle of the single-link mechanical arm at sampling time k+1;
Gradient information of the single-link mechanical arm is also used in the optimization and update process; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), the gradient information of the single-link mechanical arm is calculated as:
;
Wherein is the derivative of the joint angle with respect to the control instruction, namely the gradient information;
Step (3.3), the control instruction of the single-link mechanical arm is used in the optimization and update process; the partial derivatives of the control instruction with respect to the elements of the time-varying parameter vector are calculated as:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Step (3.4), define a temporal-difference function , wherein is the value function and is the discount factor; to minimize the system performance index function , the gradient descent method is used to optimize and update the weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights;
Optimize and update the Actor network weights at the sampling time, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Wherein is the Actor network weight at the s-th trigger time, is the indicator factor, is the learning rate of the Actor network, and adopts the mathematical formula of the one-step forward error in step (3.2); is a normal distribution function with expectation 0 and variance ; is the output of the i-th node of the hidden layer of the reinforcement learning network;
Optimize and update the Critic network weights at the sampling time:
;
Wherein is the Critic network weight at the s-th trigger time, is the indicator factor, and is the learning rate of the Critic network;
Step (3.5), optimize and update the elements of the time-varying parameter vector of the controller at the sampling time:
4. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that the Actor-Critic reinforcement learning network adopts a radial basis function network, and the radial basis function network adopts a single-hidden-layer structure.
5. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that the preprocessing of the real-time data by the edge end in step (2) comprises filtering and noise reduction.
6. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that in step (1) the terminal collects the real-time data of the single-link mechanical arm through a sensor signal acquisition device.
7. An event-triggered single-link mechanical arm control system under cloud-edge-end collaboration, performing the single-link mechanical arm control method according to any one of claims 1 to 6, comprising:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending the preprocessed real-time data to the first dynamic event triggering module and the cloud end;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to any one of claims 1 to 6.
CN202411419480.5A 2024-10-12 2024-10-12 Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration Active CN118927260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411419480.5A CN118927260B (en) 2024-10-12 2024-10-12 Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration

Publications (2)

Publication Number Publication Date
CN118927260A CN118927260A (en) 2024-11-12
CN118927260B true CN118927260B (en) 2025-01-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant