
CN118927260B - Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration - Google Patents


Info

Publication number
CN118927260B
CN118927260B (application CN202411419480.5A)
Authority
CN
China
Prior art keywords
mechanical arm
time
link mechanical
real
module
Prior art date
Legal status
Active
Application number
CN202411419480.5A
Other languages
Chinese (zh)
Other versions
CN118927260A (en)
Inventor
陈晨
罗健
邓洪
陈伟
古晓东
赵耀
何常红
周敏
郭雅婕
李勇
王振鹏
张煜
杨嘉琛
郭晓旭
白裔峰
梁茹楠
晏寒
张妍君
Current Assignee
China Railway Design Corp
Original Assignee
China Railway Design Corp
Priority date
Filing date
Publication date
Application filed by China Railway Design Corp
Priority application: CN202411419480.5A
Publication of CN118927260A
Application granted
Publication of CN118927260B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/02 Programme-controlled manipulators characterised by movement of the arms, e.g. cartesian coordinate type
    • B25J9/04 Programme-controlled manipulators characterised by movement of the arms, e.g. cartesian coordinate type, by rotating at least one arm, excluding the head movement itself, e.g. cylindrical coordinate type or polar coordinate type
    • B25J9/046 Revolute coordinate type
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1628 Programme controls characterised by the control loop
    • B25J9/163 Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mechanical Engineering (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Robotics (AREA)
  • Feedback Control In General (AREA)
  • Manipulator (AREA)

Abstract

The invention discloses an event-triggered control method and system for a single-link mechanical arm under cloud-edge-end collaboration. The terminal acquires operation data of the single-link mechanical arm and sends it to the edge for preprocessing. According to a preset event-triggering rule, the edge feeds the data into either a model-free controller or a trigger holder to obtain a control instruction. The terminal controls the movement of the single-link mechanical arm according to the control instruction and sends newly acquired operation data to the edge. The cloud, according to the operation data fed back by the edge and the preset event-triggering rule, selectively optimizes and updates the parameters of the model-free controller.

Description

Event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration
Technical Field
The invention belongs to the field of mechanical arm control, and particularly relates to an event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration.
Background
With the progress of technology and the accelerating development of industrial automation, the mechanical arm has become an indispensable part of the automation field, used for important tasks such as assembly, transportation and welding. Thanks to its simple structure and low cost, the single-link mechanical arm is widely applied in industrial production, medical rehabilitation and other fields. Traditional single-link mechanical arm control methods divide mainly into model-based control and data-driven control. Model-based control requires an accurate dynamic model, and the complexity of the arm itself and of the external environment makes such modeling difficult, hindering deeper research and application. Data-driven control, by contrast, designs the controller directly from rich input-output data, and has therefore attracted wide attention from researchers.
Direct model-free adaptive control was first proposed in 2013 by Hou Zhongsheng and Zhu Yuanming in the paper "Controller-dynamic-linearization-based model free adaptive control for discrete-time nonlinear systems," IEEE Transactions on Industrial Informatics, 2013, 9(4): 2301-2309, and has since developed into a typical data-driven control method, prompting researchers to explore its combination with neural networks to improve the control performance of single-input single-output controlled systems. The development of cloud computing, edge computing and direct model-free adaptive control offers a new approach to single-link mechanical arm control. In practical applications, however, the cloud-edge-end system is limited by system bandwidth and communication resources: the traditional scheme of periodic sampling and data transmission consumes large amounts of network bandwidth and energy even when the system state changes only slightly, wasting communication resources and increasing the operating cost of the system.
To address these problems, event-triggered control schemes have emerged: data transmission and control operations are performed only when the system state changes significantly, effectively reducing unnecessary communication and computation and thus saving system resources. To further improve the control performance of the single-link mechanical arm while exploiting the advantages of the cloud-edge-end architecture, the invention deeply fuses direct model-free adaptive control with reinforcement learning into an event-triggered control method. Since no prior study covers this combination, the invention provides an event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration.
Disclosure of Invention
To solve the problems in the background art, the invention aims to provide an event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, wherein the cloud-edge-end architecture comprises a cloud, an edge and a terminal, and the method comprises the following steps:
Step (1): the terminal collects real-time data of the single-link mechanical arm through a sensor signal acquisition device, the real-time data comprising the real-time desired joint angle and the real-time running joint angle, and sends the real-time data to the edge;
Step (2): the edge preprocesses the real-time data and applies a designed dynamic event-triggering mechanism. If the triggering rule is satisfied, the preprocessed real-time data are fed into a model-free controller whose time-varying parameters are adaptively learned by an online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is computed; if the triggering rule is not satisfied, the control instruction of the last trigger time is held. The control instruction is sent to the terminal, and the terminal controls the movement of the single-link mechanical arm according to it;
Step (3): the cloud, according to the data fed back by the edge and the dynamic event-triggering mechanism, optimizes and updates the model-free controller parameters. If the triggering rule is satisfied, the updated parameters are transmitted to the edge's model-free controller; if not, no operation is performed;
Steps (1)-(3) are repeated until the control task is finished.
In the method above, the preprocessing performed by the edge in step (2) comprises filtering and noise reduction, and the dynamic event-triggering mechanism designed in step (2), which determines the next trigger time, is:
;
where s is a positive integer indexing the s-th trigger time; the triggering error is defined from the real-time running joint angle of the single-link mechanical arm at the s-th trigger time and at the k-th sampling time; and the internal dynamic variable obeys the update rule:
;
where the quantities above are the event-triggering parameters together with a preset threshold parameter;
In step (2), an indicator factor is introduced into the triggering rule to indicate whether the rule is satisfied:
;
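The dynamic event-triggering mechanism and indicator factor described above can be traced in a small sketch. The patent's exact trigger rule and parameter values are not reproduced in the text, so the construction below (and the names `sigma`, `lam`, `theta`, `eta0`) follows the common dynamic-trigger design with an internal dynamic variable and is an assumption, not the claimed formula.

```python
def make_dynamic_trigger(sigma=0.5, lam=0.8, theta=1.0, eta0=1.0):
    """Dynamic event trigger sketch for step (2).

    Keeps the last transmitted joint angle y_s and an internal
    dynamic variable eta. All parameter names and values are
    illustrative assumptions.
    """
    state = {"y_s": None, "eta": eta0}

    def should_trigger(y_k):
        if state["y_s"] is None:        # first sample always triggers
            state["y_s"] = y_k
            return True
        e = state["y_s"] - y_k          # triggering error
        # fire when the squared error exceeds a state-dependent
        # threshold softened by the internal dynamic variable
        fired = e ** 2 >= sigma * y_k ** 2 + state["eta"] / theta
        # internal dynamic variable update (illustrative rule)
        state["eta"] = max(lam * state["eta"] + sigma * y_k ** 2 - e ** 2, 0.0)
        if fired:
            state["y_s"] = y_k          # refresh the last-trigger sample
        return fired

    return should_trigger
```

A small state change after a trigger does not fire the rule again, while a large excursion does, which is exactly the bandwidth-saving behaviour the Background section motivates.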
If the triggering rule of step (2) is satisfied, the preprocessed real-time data are fed into the model-free controller whose time-varying parameters are adaptively learned by the online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is computed through the following steps:
Step (2.1): at the k-th sampling time, the model-free controller takes the mathematical form:
;
where the control instruction of the single-link mechanical arm at the k-th sampling time is computed from the control instruction at the last trigger time, gated by the indicator factor; the time-varying parameter vector at the k-th sampling time has L elements, L being the pseudo-order of the controller (a positive integer); the trajectory tracking error of the single-link mechanical arm at the k-th sampling time is the difference between the desired joint angle and the running joint angle at that time; and its first-order backward difference is also used;
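Since the controller equation itself did not survive as text, the sketch below assumes one plausible incremental form of a step (2.1)-style law: the held command plus a theta-weighted combination of L tracking-error terms, with the indicator factor gating between a freshly computed command and the held one. Both function bodies are assumptions, not the patent's formula.

```python
def mfac_control(u_prev, theta, e_terms):
    """Model-free control law sketch for step (2.1).

    theta   : time-varying parameter vector (length L, the pseudo-order)
    e_terms : tracking error and its backward differences, length L
    The incremental form is an assumed stand-in for the patent's
    unextracted equation.
    """
    assert len(theta) == len(e_terms)
    return u_prev + sum(t * e for t, e in zip(theta, e_terms))

def gated_command(gamma, u_new, u_held):
    """Event-trigger gating with the indicator factor gamma:
    gamma = 1 -> use the freshly computed command,
    gamma = 0 -> hold the command from the last trigger time."""
    return gamma * u_new + (1 - gamma) * u_held
```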
Step (2.2): construct the time-varying parameter adaptive learning mechanism based on the online Actor-Critic reinforcement learning network:
The inputs of the online Actor-Critic reinforcement learning network are the trajectory tracking error of the single-link mechanical arm at the k-th sampling time and its first-order backward difference. The output comprises an Actor part and a Critic part: the Actor part outputs the elements of the time-varying parameter vector, and the Critic part outputs a value function.
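The single-hidden-layer radial basis function network named later in the text can be sketched with the interface step (2.2) describes. Only that interface (inputs: tracking error and its backward difference; outputs: L controller parameters and a scalar value) comes from the text; the Gaussian centers, width, hidden size `h` and zero-initialized weights are illustrative assumptions.

```python
import math
import random

class RBFActorCritic:
    """Single-hidden-layer RBF Actor-Critic sketch for step (2.2).

    Actor head -> L time-varying controller parameters;
    Critic head -> scalar value estimate.
    Centers, width and initial weights are assumptions.
    """
    def __init__(self, L=2, h=6, width=1.0, seed=0):
        rng = random.Random(seed)
        self.centers = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                        for _ in range(h)]
        self.width = width
        self.W_actor = [[0.0] * L for _ in range(h)]   # h x L actor weights
        self.w_critic = [0.0] * h                      # critic weights

    def hidden(self, e, de):
        # Gaussian radial basis activations of the single hidden layer
        return [math.exp(-((c1 - e) ** 2 + (c2 - de) ** 2)
                         / (2 * self.width ** 2))
                for c1, c2 in self.centers]

    def forward(self, e, de):
        phi = self.hidden(e, de)
        L = len(self.W_actor[0])
        theta = [sum(p * w[j] for p, w in zip(phi, self.W_actor))
                 for j in range(L)]                    # parameter vector
        value = sum(p * w for p, w in zip(phi, self.w_critic))
        return theta, value
```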
In step (3), the cloud, according to the data fed back by the edge and the dynamic event-triggering mechanism, optimizes and updates the model-free controller parameters if the triggering rule is satisfied, through the following steps:
Step (3.1): construct the real-time dynamic linearization model of the single-link mechanical arm:
;
where k is the sampling time; the model relates the real-time running joint angle of the single-link mechanical arm at the next time to the control instruction at time k through the pseudo partial derivative of the single-link mechanical arm at the k-th sampling time;
The iterative learning law of the pseudo partial derivative is:
;
where the quantities refer to the s-th trigger time and the pseudo partial derivative of the single-link mechanical arm at that trigger time; the update is gated by the indicator factor and governed by a step-size factor and a penalty factor;
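The pseudo-partial-derivative (PPD) learning law of step (3.1) can be sketched with the standard projection-style MFAC estimator, gated by the indicator factor. The exact law in the patent is not shown above, so the update formula, the reset heuristic, and the values of the step-size factor `eta` and penalty factor `mu` are assumptions.

```python
def ppd_update(phi_prev, du, dy, eta=0.5, mu=1.0, gamma=1):
    """PPD estimation sketch for step (3.1).

    phi_prev : estimate at the previous trigger time
    du, dy   : control-input and output increments
    gamma    : indicator factor (0 -> no trigger, hold the estimate)
    """
    if gamma == 0:
        return phi_prev
    # projection-style correction toward the observed increment ratio
    phi = phi_prev + eta * du * (dy - phi_prev * du) / (mu + du * du)
    # simple reset heuristic keeping the estimate away from zero or
    # sign flips (an assumption; MFAC texts use a reset of this kind)
    if abs(phi) < 1e-5 or phi * phi_prev < 0:
        phi = phi_prev
    return phi
```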
Step (3.2): a one-step-forward error is used in the optimization update; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), it is computed as:
;
where the one-step-forward error is the trajectory tracking error of the single-link mechanical arm at the next time, computed with the desired joint angle of the single-link mechanical arm at the (k+1)-th sampling time;
Gradient information of the single-link mechanical arm is also used in the optimization update; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), it is computed as:
;
where the resulting partial derivative is the gradient information;
Step (3.3): the optimization update uses the partial derivatives of the control instruction of the single-link mechanical arm with respect to the elements of the time-varying parameter vector, computed as follows:
If the controller pseudo-order L = 1, then
;
If the controller pseudo-order L > 1, then
,;
Step (3.4): define a temporal-difference function in terms of the value function and a discount factor, so as to minimize the system performance index function. The weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights, are optimized and updated by gradient descent;
The Actor network weights at the sampling time are optimized and updated, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order L = 1, then
;
If the controller pseudo-order L > 1, then
,;
where the quantities refer to the Actor network weights at the s-th trigger time, the indicator factor, and the learning rate of the Actor network; the one-step-forward error takes the mathematical form of step (3.2); a normal distribution function with zero mean and given variance provides exploration; and the output of the i-th hidden-layer node of the reinforcement learning network enters the update;
The Critic network weights at the sampling time are optimized and updated:
;
where the quantities refer to the Critic network weights at the s-th trigger time, the indicator factor, and the learning rate of the Critic network;
Step (3.5): the elements of the controller's time-varying parameter vector at the sampling time are optimized and updated:
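The gradient-descent machinery of step (3.4) can be outlined as follows: a temporal-difference function drives both heads, and each weight moves down the gradient of the squared TD error. The patent's Actor update additionally involves the one-step-forward error, the indicator factor and exploration noise; the simplified forms, learning rates and reward interface below are assumptions.

```python
def td_function(r, v_next, v, discount=0.95):
    """Temporal-difference function sketch for step (3.4):
    delta = r + discount * V(k+1) - V(k)."""
    return r + discount * v_next - v

def critic_update(w, phi, delta, lr=0.05, gamma=1):
    """Gradient-descent Critic weight step, gated by the indicator
    factor gamma. With V = sum(phi_i * w_i) and cost 0.5*delta^2,
    the descent direction is +lr * delta * phi_i (v_next held fixed).
    """
    if gamma == 0:
        return list(w)
    return [wi + lr * delta * pi for wi, pi in zip(w, phi)]
```

The Actor weights would be updated the same way, with the chain of partial derivatives from step (3.3) replacing `phi` in the gradient.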
In the method above, the Actor-Critic reinforcement learning network adopts a radial basis function network whose hidden layer is a single layer, i.e. a three-layer structure consisting of an input layer, a single hidden layer and an output layer.
The invention also provides a single-link mechanical arm control system triggered by the event under the cooperation of the cloud edge end, which comprises the following components:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending it to the first dynamic event triggering module and to the cloud;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
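The edge-side module chain above (data input, preprocessing, trigger check, controller or holder, data output) can be traced in a compact sketch. The static trigger threshold and proportional parameter update are deliberate simplifications standing in for the dynamic trigger and the RBF-tuned model-free controller; every constant (`sigma`, `eps`, `kp`) is an assumption.

```python
def edge_step(y_d_k, y_k, state, sigma=0.5, eps=1e-3, kp=0.5):
    """One edge-side decision for sample k.

    state holds the last transmitted angle y_s and the current
    command u. On a trigger the command is refreshed (model-free
    controller module); otherwise it is held (control-instruction
    holding module).
    """
    e = y_d_k - y_k                       # tracking error after preprocessing
    triggered = (state["y_s"] is None
                 or abs(state["y_s"] - y_k) >= sigma * abs(y_k) + eps)
    if triggered:
        state["y_s"] = y_k                # event fired: update trigger sample
        state["u"] += kp * e              # recompute the control instruction
    return state["u"]                     # command sent to the terminal
```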
Further, the invention adopts the following technical scheme:
A non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
Further, the invention adopts the following technical scheme:
An electronic device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
The beneficial technical effects of the invention are as follows:
For single-input single-output controlled systems, no prior study combines direct model-free adaptive control with a reinforcement learning network, nor addresses the system-bandwidth and communication-resource limits faced by the cloud-edge-end architecture through an event-triggering mechanism. The event-triggered single-link mechanical arm control method and system under cloud-edge-end collaboration provide at least the following beneficial technical effects:
(1) Reduced modeling difficulty: the data-driven control method requires no accurate dynamic model, lowering the modeling difficulty and easing practical application;
(2) Improved control precision: deeply fusing model-free adaptive control with reinforcement learning yields a stronger learning capability, improving the control precision of the single-link mechanical arm and meeting higher-precision application requirements;
(3) Improved computing efficiency and saved system resources: the strong computing power of the cloud platform is used for model training and data analysis, while edge devices handle real-time data processing and fast response of control instructions, effectively improving computing efficiency; moreover, the event-triggering mechanism performs data transmission and control operations only when the system state changes significantly, reducing unnecessary communication and computation and saving system resources;
(4) The proposed control method and system apply not only to the single-link mechanical arm but can also be extended to other single-input single-output control systems, with broad application prospects.
Drawings
FIG. 1 is a schematic view of a single link mechanical arm;
Fig. 2 is a schematic flow chart of a control method of a single-link mechanical arm triggered by an event under the coordination of a cloud edge end;
Fig. 3 is a schematic diagram of a module connection of a control system of a single-link mechanical arm triggered by an event under the cooperation of a cloud edge end.
Detailed Description
The invention discloses an event-triggered control method and system for a single-link mechanical arm under cloud-edge-end collaboration. The terminal acquires operation data of the single-link mechanical arm and sends it to the edge for preprocessing; according to a preset event-triggering rule, the edge feeds the data into either the model-free controller or a trigger holder to obtain a control instruction; the terminal controls the movement of the single-link mechanical arm according to the instruction, acquires new operation data, and sends it to the edge; and the cloud, according to the operation data fed back by the edge and the preset event-triggering rule, selectively optimizes and updates the model-free controller parameters and evaluates the control performance. The method and system suit the single-link mechanical arm, reduce modeling difficulty, improve control precision and computing efficiency, save system resources, and can be extended to other single-input single-output systems.
Fig. 1 shows a schematic view of the single-link mechanical arm, Fig. 2 the flow of the control method, and Fig. 3 the module connections of the control system.
the invention provides a single-link mechanical arm control method triggered by events under the cooperation of a cloud edge end, wherein the cloud edge end comprises a cloud end, an edge end and a terminal, and the method comprises the following steps:
the method comprises the steps that (1) a terminal collects real-time data of a single-link mechanical arm through a sensor signal collection device, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles, and the real-time data are sent to an edge end;
The method comprises the steps of (2) preprocessing real-time data by an edge end, designing a dynamic event trigger mechanism, inputting the preprocessed real-time data into a model-free controller for performing time-varying parameter self-adaptive learning based on an online Actor-Critic reinforcement learning network if a trigger rule is met, calculating to obtain a control instruction of a single-link mechanical arm, and if the trigger rule is not met, maintaining the control instruction at the last trigger moment, sending the control instruction to a terminal, and controlling the movement of the single-link mechanical arm by the terminal according to the control instruction;
The cloud end performs optimization updating on the model-free controller parameters according to the data fed back by the edge end and the dynamic event triggering mechanism, if the triggering rule is met, the optimized updated model-free controller parameters are transmitted to the model-free controller of the edge end, and if the triggering rule is not met, no operation is performed;
repeating the steps (1) - (3) until the control task is finished.
According to the single-link mechanical arm control method for event triggering under the cooperation of the cloud edge end, which is provided by the invention, the edge end in the step (2) carries out preprocessing on the real-time data, the preprocessing step comprises filtering and noise reduction, and the dynamic event triggering mechanism in the step (2) is designed and used for determining that the dynamic event triggering mechanism at the next triggering moment is as follows:
;
Wherein, S is a positive integer for the s-th trigger time,In order to trigger the error in the time of the error,The real-time operation joint angle of the single-link mechanical arm at the s-th trigger moment,The real-time operation joint angle of the single-link mechanical arm at the k sampling moment; For internal dynamic variables, the update rules are:
;
Wherein, AndAre all the event-triggering parameters, and the event-triggering parameters,,,Is a preset threshold parameter;
The triggering rule in the step (2) is introduced with an indicating factor Indicating whether the trigger rule is satisfied:
If the trigger rule is met in the step (2), inputting the preprocessed real-time data into a model-free controller based on-line Actor-Critic reinforcement learning network for time-varying parameter self-adaptive learning, and calculating to obtain a control instruction of the single-link mechanical arm, otherwise, keeping the control instruction of the last trigger moment, wherein the calculation of the model-free controller based on-line Actor-Critic reinforcement learning network for time-varying parameter self-adaptive learning comprises the following steps:
And (2.1) at the time of k sampling, the mathematical formula of the model-free controller is as follows:
;
Wherein, Is a control instruction of the single-link mechanical arm at k sampling moments,Is the firstControl instructions of the single-link mechanical arm at the moment of triggering; Is the indicator; For a time-varying parameter vector of k sample instants, Is thatIs a first element of the (c) a (c),L is the pseudo-order of the controller, L is a positive integer; the tracking error of the track of the single-link mechanical arm at the k sampling moment, ,For the desired joint angle of the single link mechanical arm at the k sampling time,The operation joint angle of the single-link mechanical arm at the k sampling moment;,; Tracking error of track of single-link mechanical arm at k sampling moment Is a first order backward difference of (a);
Step (2.2), constructing a time-varying parameter self-adaptive learning mechanism based on an online Actor-Critic reinforcement learning network:
The input of the online Actor-Critic reinforcement learning network comprises the track tracking error of the k sampling moment single-link mechanical arm First-order backward difference of tracking error of single-link mechanical arm track at k sampling momentThe output of the online Actor-Critic reinforcement learning network comprises an Actor output part and a Critic output part, and the Actor output part comprises the time-varying parameter vectorIn (a) and (b),The Critic output portion contains a value function.
According to the method for controlling the single-link mechanical arm triggered by the event under the cooperation of the cloud side end, the cloud side in the step (3) combines the dynamic event triggering mechanism according to the data fed back by the edge end, if the triggering rule is met, the model-free controller parameter is optimized and updated, otherwise, no operation is performed, and the optimizing and updating process comprises the following steps:
step (3.1), constructing a real-time dynamic linearization model of the single-link mechanical arm:
;
Wherein k is the sampling time, Is thatMoment single-link mechanical arm the joint angle is operated in real time,;Is a control instruction of the single-link mechanical arm at the moment k,;The pseudo partial derivative of the single-link mechanical arm at the k sampling moment;
the pseudo partial derivative The iterative learning law of (a) is:
;
Wherein, Is the firstThe time of the triggering of the device is the same,Is the firstPseudo partial derivatives of single-link mechanical arms at the moment of triggering; in order to indicate the factor(s), As a step-size factor,Is a penalty factor;
and (3.2) using a one-step forward error in the optimization updating process, and calculating a mathematical formula of the one-step forward error based on the real-time dynamic linearization model of the single-link mechanical arm in the step (3.1) as follows:
;
Wherein, Is thatTracking errors of the track of the moment single-connecting-rod mechanical arm, namely the one-step forward errors; The expected joint angle of the single-link mechanical arm at the sampling moment of k+1;
And (3) in the optimization updating process, gradient information of the single-link mechanical arm is used, and based on the real-time dynamic linearization model of the single-link mechanical arm in the step (3.1), a mathematical formula for calculating the gradient information of the single-link mechanical arm is as follows:
;
Wherein is the derivative of the joint angle with respect to the control instruction, namely the gradient information;
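Under an assumed compact-form dynamic linearization model y(k+1) = y(k) + phi(k)·Δu(k) (the patent's exact formulas appear only as images), the one-step forward error and the gradient information reduce to very simple expressions; the function names and the model form below are assumptions:

```python
def one_step_forward_error(y_d_next, y_k, phi_k, du_k):
    """e(k+1) = y_d(k+1) - y(k+1), with y(k+1) predicted by the assumed
    dynamic linearization model y(k+1) = y(k) + phi(k) * du(k)."""
    y_next_pred = y_k + phi_k * du_k
    return y_d_next - y_next_pred

def gradient_info(phi_k):
    """Under the same model, d y(k+1) / d u(k) is just the PPD phi(k)."""
    return phi_k
```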
Step (3.3), the control instruction of the single-link mechanical arm is used in the optimization and update process; the partial derivatives of the control instruction with respect to the elements of the time-varying parameter vector are calculated as:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
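For a controller of the assumed incremental form u(k) = u(last trigger) + Σ_i λ_i(k)·d_i(k), where d_i(k) is the (i−1)-th backward difference of the tracking error and the number of terms is the pseudo-order L, the partial derivative of the control instruction with respect to each time-varying parameter is simply the corresponding error difference. A sketch (the controller form and all names are assumptions, since the patent's law is given only as images):

```python
def control_and_param_grads(u_prev, lambdas, err_diffs):
    """Assumed controller: u(k) = u_prev + sum_i lambdas[i] * err_diffs[i],
    where err_diffs[i] is the i-th backward difference of the tracking
    error (err_diffs[0] is the error itself). Because u is linear in each
    lambda_i, d u / d lambda_i = err_diffs[i]."""
    u = u_prev + sum(l * d for l, d in zip(lambdas, err_diffs))
    grads = list(err_diffs)   # partial derivatives w.r.t. each lambda_i
    return u, grads
```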
Step (3.4), define a temporal-difference function , wherein is the value function and is the discount factor. To minimize the system performance index function , the gradient descent method is adopted to optimize and update the weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights; h is the number of hidden-layer nodes of the reinforcement learning network;
Optimize and update the Actor network weights at the sampling time, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Wherein is the Actor network weight at the s-th trigger time, is the indicator factor, is the learning rate of the Actor network, and adopts the mathematical formula of the one-step forward error in step (3.2); is a normal distribution function with expectation 0 and variance ; is the output of the i-th node of the hidden layer of the reinforcement learning network;
Optimize and update the Critic network weights at the sampling time:
;
Wherein is the Critic network weight at the s-th trigger time, is the indicator factor, and is the learning rate of the Critic network;
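The gradient-descent updates of step (3.4) can be sketched as one temporal-difference step on an RBF Actor-Critic pair. The shapes, learning rates and the exact actor gradient below are illustrative assumptions consistent with the described scheme (one-step forward error propagated through the plant gradient), not the patent's precise formulas:

```python
import numpy as np

def td_update(w_actor, w_critic, h_phi, value, value_next, reward,
              grad_u, e_next, gamma=0.95, lr_a=0.01, lr_c=0.05):
    """One gradient-descent step on Actor and Critic weights.

    h_phi   : hidden-layer output vector shared by both heads, shape (h,)
    w_actor : shape (h, L)  -- one column per time-varying parameter
    w_critic: shape (h,)    -- linear value-function head
    grad_u  : d u / d lambda vector, shape (L,)  (error differences)
    e_next  : one-step forward tracking error (scalar)
    """
    # temporal-difference error: delta = r + gamma * V(k+1) - V(k)
    delta = reward + gamma * value_next - value
    # Critic: dV/dw_c = h_phi, so descend along delta * h_phi
    w_critic = w_critic - lr_c * delta * h_phi
    # Actor: chain rule through the plant, driven by the forward error
    w_actor = w_actor - lr_a * e_next * grad_u * h_phi[:, None]
    return w_actor, w_critic, delta
```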
Step (3.5), optimize and update the elements of the time-varying parameter vector of the controller at the sampling time:
According to the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, the Actor-Critic reinforcement learning network adopts a radial basis function network, and the radial basis function network adopts a single-hidden-layer structure, i.e., a three-layer network structure consisting of an input layer, a single hidden layer and an output layer.
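A minimal sketch of such a three-layer network follows, assuming Gaussian basis functions and randomly placed centers (the patent fixes only the topology; all dimensions, centers and widths here are illustrative assumptions):

```python
import numpy as np

class RBFActorCritic:
    """Three-layer RBF network: input layer -> single Gaussian hidden
    layer -> two linear heads: an Actor head producing the time-varying
    controller parameters and a Critic head producing a scalar value."""

    def __init__(self, n_in=2, n_hidden=7, n_actor=2, width=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))
        self.width = width
        self.w_actor = rng.normal(0.0, 0.1, size=(n_hidden, n_actor))
        self.w_critic = rng.normal(0.0, 0.1, size=n_hidden)

    def hidden(self, x):
        # Gaussian radial basis on the input [e(k), delta_e(k)]
        d2 = np.sum((self.centers - x) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def forward(self, x):
        phi = self.hidden(np.asarray(x, dtype=float))
        lambdas = phi @ self.w_actor   # Actor head: parameter vector
        value = phi @ self.w_critic    # Critic head: value function
        return lambdas, value, phi
```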
Fig. 3 shows a schematic diagram of the module connections of the event-triggered single-link mechanical arm control system under cloud-edge-end collaboration. The invention also provides an event-triggered single-link mechanical arm control system under cloud-edge-end collaboration, comprising:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm through the sensor signal acquisition device, wherein the real-time data comprises a real-time expected joint angle and a real-time running joint angle;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending the preprocessed real-time data to the first dynamic event triggering module and the cloud end;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
Further, the invention adopts the following technical scheme:
A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
Further, the invention adopts the following technical scheme:
An electronic device comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the above-described event-triggered single-link mechanical arm control method under cloud-edge-end collaboration.
From the above description of the embodiments, it will be apparent to those skilled in the art that the apparatus of the present invention may be implemented by means of software plus a necessary general-purpose hardware platform. Embodiments of the invention may be implemented using existing processors, by special-purpose processors for this or another purpose in an appropriate system, or by a hardwired system. Embodiments of the invention also include non-transitory computer-readable storage media comprising machine-readable media for carrying or storing machine-executable instructions or data structures, which may be any available media accessible by a general-purpose or special-purpose computer or other machine with a processor. Such machine-readable media may include, for example, RAM, ROM, EPROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of machine-executable instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or other machine with a processor. When information is transferred or provided over a network or another communications connection (hardwired, wireless, or a combination of hardwired and wireless) to a machine, the connection is also considered a machine-readable medium.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will fall within the scope of the present invention.

Claims (9)

1. An event-triggered single-link mechanical arm control method under cloud-edge-end collaboration, characterized in that the cloud-edge-end comprises a cloud end, an edge end and a terminal, and the method comprises the following steps:
Step (1), the terminal collects real-time data of the single-link mechanical arm, the real-time data comprising a real-time expected joint angle and a real-time running joint angle, and sends the real-time data to the edge end;
Step (2), the edge end preprocesses the real-time data and designs a dynamic event-triggering mechanism; if the trigger rule is satisfied, the preprocessed real-time data are input into a model-free controller performing time-varying-parameter adaptive learning based on an online Actor-Critic reinforcement learning network, and a control instruction of the single-link mechanical arm is calculated; if the trigger rule is not satisfied, the control instruction at the last trigger time is held; the control instruction is sent to the terminal, and the terminal controls the motion of the single-link mechanical arm according to the control instruction;
Step (3), the cloud end optimizes and updates the model-free controller parameters according to the data fed back by the edge end and the dynamic event-triggering mechanism; if the trigger rule is satisfied, the optimized and updated model-free controller parameters are transmitted to the model-free controller of the edge end; if the trigger rule is not satisfied, no operation is performed;
Repeating the steps (1) - (3) until the control task is finished;
The dynamic event-triggering mechanism designed in step (2) for determining the next trigger time is:
;
Wherein is the s-th trigger time, s is a positive integer; is the trigger error; is the real-time running joint angle of the single-link mechanical arm at the s-th trigger time; is the real-time running joint angle of the single-link mechanical arm at sampling time k; is an internal dynamic variable, whose update rule is:
;
Wherein and are event-trigger parameters, , , and is a preset threshold parameter;
An indicator factor is introduced into the trigger rule in step (2) to indicate whether the trigger rule is satisfied:
2. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that in step (2), if the trigger rule is satisfied, the preprocessed real-time data are input into the model-free controller performing time-varying-parameter adaptive learning based on the online Actor-Critic reinforcement learning network, and the control instruction of the single-link mechanical arm is calculated, comprising the following steps:
Step (2.1), at sampling time k, the mathematical formula of the model-free controller is:
;
Wherein is the control instruction of the single-link mechanical arm at sampling time k, and is the control instruction of the single-link mechanical arm at the s-th trigger time; is the indicator factor; is the time-varying parameter vector at sampling time k, is its first element, , where L is the pseudo-order of the controller and L is a positive integer; is the trajectory tracking error of the single-link mechanical arm at sampling time k, , where is the expected joint angle of the single-link mechanical arm at sampling time k and is the running joint angle of the single-link mechanical arm at sampling time k; is the first-order backward difference of the trajectory tracking error of the single-link mechanical arm at sampling time k;
Step (2.2), constructing a time-varying parameter self-adaptive learning mechanism based on an online Actor-Critic reinforcement learning network:
The input of the online Actor-Critic reinforcement learning network comprises the trajectory tracking error of the single-link mechanical arm at sampling time k and the first-order backward difference of that tracking error; the output of the online Actor-Critic reinforcement learning network comprises an Actor output part and a Critic output part, wherein the Actor output part contains the elements of the time-varying parameter vector and the Critic output part contains a value function.
3. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 2, characterized in that in step (3) the cloud end applies the dynamic event-triggering mechanism to the data fed back by the edge end, and if the trigger rule is satisfied, the method for optimizing and updating the model-free controller parameters comprises the following steps:
step (3.1), constructing a real-time dynamic linearization model of the single-link mechanical arm:
;
Wherein k is the sampling time; is the real-time running joint angle of the single-link mechanical arm at time k+1; is the control instruction of the single-link mechanical arm at time k; is the pseudo partial derivative of the single-link mechanical arm at sampling time k;
the pseudo partial derivative The iterative learning law of (a) is:
;
Wherein is the s-th trigger time, is the pseudo partial derivative of the single-link mechanical arm at the s-th trigger time; is the indicator factor, is the step-size factor, and is the penalty factor;
Step (3.2), a one-step forward error is used in the optimization and update process; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), the one-step forward error is calculated as:
;
Wherein is the trajectory tracking error of the single-link mechanical arm at time k+1; is the expected joint angle of the single-link mechanical arm at sampling time k+1;
Gradient information of the single-link mechanical arm is also used in the optimization and update process; based on the real-time dynamic linearization model of the single-link mechanical arm in step (3.1), the gradient information of the single-link mechanical arm is calculated as:
;
Wherein is the derivative of the joint angle with respect to the control instruction, namely the gradient information;
Step (3.3), the control instruction of the single-link mechanical arm is used in the optimization and update process; the partial derivatives of the control instruction with respect to the elements of the time-varying parameter vector are calculated as:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Step (3.4), define a temporal-difference function , wherein is the value function and is the discount factor; to minimize the system performance index function , the gradient descent method is used to optimize and update the weights of the Actor-Critic reinforcement learning network, comprising the Actor network weights and the Critic network weights;
Optimize and update the Actor network weights at the sampling time, where h is the number of hidden-layer nodes of the reinforcement learning network:
If the controller pseudo-order , then
;
If the controller pseudo-order , then
,;
Wherein is the Actor network weight at the s-th trigger time, is the indicator factor, is the learning rate of the Actor network, and adopts the mathematical formula of the one-step forward error in step (3.2); is a normal distribution function with expectation 0 and variance ; is the output of the i-th node of the hidden layer of the reinforcement learning network;
Optimize and update the Critic network weights at the sampling time:
;
Wherein is the Critic network weight at the s-th trigger time, is the indicator factor, and is the learning rate of the Critic network;
Step (3.5), optimize and update the elements of the time-varying parameter vector of the controller at the sampling time:
4. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that the Actor-Critic reinforcement learning network adopts a radial basis function network, and the radial basis function network adopts a single-hidden-layer structure.
5. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that the preprocessing of the real-time data by the edge end in step (2) comprises filtering and noise reduction.
6. The event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to claim 1, characterized in that in step (1) the terminal collects the real-time data of the single-link mechanical arm through a sensor signal acquisition device.
7. An event-triggered single-link mechanical arm control system under cloud-edge-end collaboration, performing the single-link mechanical arm control method according to any one of claims 1 to 6, comprising:
The terminal comprises a single-link mechanical arm which operates in real time, a data acquisition module, a first data output module and a first data input module;
The data acquisition module is used for acquiring real-time data of the single-link mechanical arm, wherein the real-time data comprise real-time expected joint angles and real-time running joint angles;
the first data output module is used for sending the real-time data to an edge end;
the first data input module is used for receiving a control instruction issued by the edge end;
The edge end comprises a second data input module, a data preprocessing module, a first dynamic event triggering module, a model-free controller calculating module, a control instruction holding module and a second data output module;
The second data input module is used for receiving real-time data of the single-link mechanical arm uploaded by the terminal;
the data preprocessing module is used for preprocessing the real-time data and then sending the preprocessed real-time data to the first dynamic event triggering module and the cloud end;
The first dynamic event triggering module is used for designing a dynamic event triggering mechanism, inputting the preprocessed real-time data into the model-free controller computing module if a triggering rule is met, and otherwise triggering the control instruction holding module;
the model-free controller calculation module is used for calculating and obtaining a control instruction of the single-link mechanical arm through the model-free controller which performs time-varying parameter self-adaptive learning based on the online Actor-Critic reinforcement learning network;
the control instruction holding module is used for holding the control instruction at the last trigger time;
The second data output module is used for sending the control instruction to the terminal;
The cloud comprises a third data input module, a data storage module, a second dynamic event triggering module, a model-free controller optimization updating module and a third data output module;
The third data input module is used for receiving the preprocessed real-time data uploaded by the edge terminal;
the data storage module is used for storing the preprocessed real-time data uploaded by the edge end;
the second dynamic event triggering module is used for judging whether the triggering rule is met, if yes, triggering the model-free controller optimizing updating module, otherwise, not performing any operation;
The model-free controller optimization updating module is used for optimizing and updating parameters of the model-free controller;
and the third data output module is used for transmitting the optimized and updated model-free controller parameters to the edge end.
8. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the event-triggered single-link mechanical arm control method under cloud-edge-end collaboration according to any one of claims 1 to 6.
CN202411419480.5A 2024-10-12 2024-10-12 Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration Active CN118927260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411419480.5A CN118927260B (en) 2024-10-12 2024-10-12 Event-triggered single-link robotic arm control method and system under cloud-edge-end collaboration

Publications (2)

Publication Number Publication Date
CN118927260A CN118927260A (en) 2024-11-12
CN118927260B true CN118927260B (en) 2025-01-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant