
CN114740846B - Hierarchical path planning method for topology-grid-metric hybrid map - Google Patents


Info

Publication number
CN114740846B
CN114740846B (application number CN202210340211.4A)
Authority
CN
China
Prior art keywords
path
map
value
grid
sub
Prior art date
Legal status
Active
Application number
CN202210340211.4A
Other languages
Chinese (zh)
Other versions
CN114740846A (en)
Inventor
武星
杨俊杰
王超超
楼佩煌
余文康
翟晶晶
胡子寒
汤凯
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202210340211.4A
Publication of CN114740846A
Application granted
Publication of CN114740846B
Status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02: Control of position or course in two dimensions
    • G05D1/021: Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231: Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246: Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • G05D1/0253: Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting relative motion information from a plurality of images taken successively, e.g. visual odometry, optical flow

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Electromagnetism (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract


The present invention relates to a hierarchical path planning method for topology-grid-metric hybrid maps and belongs to the field of path planning. First, a hybrid map of the mobile robot's operating environment is created according to the topology-grid-metric hybrid map construction method. Second, an overall optimized global preset path is generated according to the topology-grid hierarchical planning method. Finally, while the mobile robot runs along the global preset path, surrounding obstacles are detected in real time; if an obstacle may block the robot's path, local path planning is performed on the metric map using an improved Deep Deterministic Policy Gradient algorithm from deep reinforcement learning, and after obstacle avoidance is completed the robot returns to the global preset path. This process repeats until the mobile robot reaches the target position. The method adopts a hierarchical idea, divides path planning into several sub-processes, and applies a different path planning method on each map layer, reducing the complexity of the planning problem and achieving higher search efficiency and adaptability in partially unknown, complex, large-scale environments.

Description

Hierarchical path planning method for topology-grid-metric hybrid map
Technical Field
The invention relates to the field of path planning, in particular to a hierarchical path planning method oriented to a topology-grid-metric hybrid map.
Background
Path planning is the basis of navigation. It aims to search for a collision-free optimal or suboptimal path from a start point to an end point according to one or more performance indexes such as safety, shortest distance, or optimal time. Path planning on fully known environment maps has produced many mature research results, but most practical scenes are complex and highly random, and belong to partially unknown environments. Most existing path planning methods depend on complete environment knowledge, and for path planning in partially unknown environments it is difficult to guarantee obstacle avoidance safety and real-time performance. In addition, the tasks of mobile robots are becoming more complex and their working range is expanding; for example, a sweeping robot needs to clean every corner of a whole house, and AGVs in a factory are involved in inter-workshop distribution. The expansion of the working range not only challenges indoor positioning technology, but also enlarges the search space so much that widely applied intelligent search methods spend a long time solving and may fail to find the optimal solution.
The prior art discloses a path planning method based on an improved A* algorithm (publication number: CN113359768A), in which the improved A* algorithm and the artificial potential field method are combined into a hybrid path planning algorithm that optimizes a global initial path and takes unknown environments into account. However, when planning the global path, the A* algorithm and the ant colony algorithm are applied directly to the grid map; in a large-scale environment the search space expands rapidly and path search efficiency drops. Meanwhile, the artificial potential field method may cause frequent swinging when an obstacle suddenly appears, which is unfavorable for robot motion control.
A path planning algorithm based on a single map is not ideal when facing partially unknown, complex, large-scale environments, mainly because of the high requirement on the completeness of environment information, the large search space, time-consuming planning, and low obstacle avoidance real-time performance.
Disclosure of Invention
In order to overcome the above technical defects in the prior art, the invention provides a hierarchical path planning method for a topology-grid-metric hybrid map. It divides path planning into several sub-processes and applies different path planning methods on the topology, grid and metric maps, thereby solving the prior-art problems of large search space, time-consuming planning, and low obstacle avoidance real-time performance when facing partially unknown, complex, large-scale environments.
The invention is realized by the following technical scheme:
the hierarchical path planning method for the topology-grid-metric hybrid map comprises the following steps:
Step 1, creating a hybrid map of the mobile robot working environment according to the topology-grid-metric hybrid map construction method, wherein the hybrid map comprises an abstract topological map, a concrete grid map and a fine metric map;
Step 2, before the mobile robot operates, generating an overall optimized global preset path according to a topology-grid hierarchical planning method;
Step 3, detecting the surrounding environment in real time when the mobile robot runs along the global preset path, judging whether the distance from the nearest obstacle of the robot is smaller than the obstacle avoidance threshold value, if so, entering the step 4, and if not, entering the step 5;
Step 4, carrying out local path planning on the metric map using an improved Deep Deterministic Policy Gradient (DDPG) algorithm. First, a value-classified experience playback pool is constructed, storing high-value samples, common samples and recent samples in its three sub playback pools respectively. A hierarchical sampling strategy randomly extracts a suitable number of samples from the three sub playback pools for the learning and training of the DDPG local path planner. A continuous reward function comprising an arrival reward, a collision penalty and a process reward rewards or penalizes the local path planned by the DDPG algorithm, training it to plan better local paths. After obstacle avoidance is completed through local path planning, the robot returns to the global preset path;
and 5, judging whether the mobile robot reaches the target position, if so, ending the algorithm, and if not, entering the step 3.
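As an illustration, the Step 1-5 control loop can be sketched as follows. All class and method names here (StubRobot, run_mission, and the toy one-dimensional world) are hypothetical scaffolding, not identifiers from the invention; the DDPG local planner of step 4 is reduced to a placeholder:

```python
# Minimal sketch of the Step 1-5 loop. Every name below is illustrative,
# not from the patent; the DDPG local planner is a stub.

class StubRobot:
    """Toy 1-D robot used only to make the sketch executable."""
    def __init__(self, start, goal, obstacle_at=None):
        self.pos, self.goal, self.obstacle_at = start, goal, obstacle_at

    def nearest_obstacle_distance(self):
        if self.obstacle_at is None:
            return float("inf")
        return abs(self.obstacle_at - self.pos)

    def step_along(self, path):
        # Follow the global preset path (here: move one cell toward the goal).
        self.pos += 1

    def at_goal(self):
        return self.pos >= self.goal


def local_avoidance(robot):
    """Placeholder for the DDPG local planner of Step 4."""
    robot.obstacle_at = None   # pretend the obstacle was bypassed
    robot.step_along(None)     # then rejoin the global preset path


def run_mission(robot, global_path, avoid_threshold=2):
    steps = 0
    while not robot.at_goal() and steps < 100:
        # Step 3: check distance to the nearest obstacle in real time.
        if robot.nearest_obstacle_distance() < avoid_threshold:
            local_avoidance(robot)          # Step 4: local planning
        else:
            robot.step_along(global_path)   # follow global preset path
        steps += 1                          # Step 5: loop until goal reached
    return robot.at_goal()
```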
Further, the topology-grid-metric hybrid map construction method includes the steps of:
step 1.1, rasterizing the whole operation environment to construct a global raster map;
Step 1.2, dividing the global grid map into a plurality of sub-grid maps, each provided with key nodes, and describing the working environment as a layered map representation using a hierarchical graph whose tree nodes are the sub-maps; in a layer-k map, each layer-(k+1) sub-map is represented by its key nodes, and key nodes of the same layer are connected through offline prior paths;
Step 1.3, generating an off-line priori path with higher safety degree on the grid map by adopting a refinement algorithm, and constructing a topological map corresponding to each level of sub-grid map by adopting a topological feature extraction method;
And 1.4, maintaining and updating the local measurement map by using the onboard sensor information carried by the mobile robot in the running process.
Further, the sub-map granularity should not be too fine; the map is generally divided at the scale of individual rooms.
Further, when the refinement algorithm processes the two-dimensional grid map, the free grid with the value of 0 is regarded as '1' in the image, the obstacle grid with the value of 1 is regarded as '0' in the image, and finally, the offline path of the idle area is generated.
Further, the topological feature extraction method takes bifurcation points of the offline prior paths as common nodes in a topological graph, manually selects regional entrance grids or other representative grids to represent key nodes of a sub-map, counts offline prior path lengths among the key nodes as weights of edges, and the path lengths are the number of grids contained in the paths.
Further, the topology-grid hierarchical planning method comprises the following steps:
Step 2.1, judging whether the initial grid S and the target grid D are in the same sub-map, if so, entering the step 2.2, and if not, entering the step 2.3;
Step 2.2, planning an overall optimized global preset path from the starting point S to the end point D with the improved A* algorithm, and ending the algorithm;
Step 2.3, respectively obtaining the depths L_S and L_D of the starting point S and the end point D in the hierarchical graph architecture, and the key nodes C_S and C_D of the sub-grid maps where S and D are located. If L_S < L_D, the search starts from C_D and proceeds bottom-up, layer by layer, until C_S is found in the topological map corresponding to layer L_S - 1; otherwise the search starts from C_S and proceeds upward until C_D is found in the topological map corresponding to layer L_D - 1. During the search, the Floyd algorithm plans an optimal node sequence on the topological map of each layer, and the offline prior paths between the nodes are saved as the middle-section preset path;
Step 2.4, taking the starting point S as the start and the key node C_S as the end, searching a local path on the sub-grid map containing S with the improved A* algorithm and saving it as the front-section preset path; then taking the key node C_D as the start and the end point D as the end, searching another local path on the sub-grid map containing D with the improved A* algorithm and saving it as the rear-section preset path;
and 2.5, merging the middle preset path generated in the step 2.3 with the front preset path and the rear preset path searched in the step 2.4 to obtain an overall optimized global preset path, and ending the algorithm.
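Step 2.3 plans the optimal key-node sequence on each topological map with the Floyd algorithm. A minimal sketch of that step, assuming the topology is given as a matrix of offline prior path lengths (function names are illustrative):

```python
import math

def floyd(weights):
    """Floyd-Warshall all-pairs shortest paths on a key-node topology graph.
    weights[i][j] is the offline prior path length (number of grids) between
    key nodes i and j, or math.inf when there is no edge."""
    n = len(weights)
    dist = [row[:] for row in weights]
    nxt = [[j if dist[i][j] < math.inf else None for j in range(n)]
           for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]   # route via k: first hop of i->k
    return dist, nxt

def node_sequence(nxt, i, j):
    """Reconstruct the optimal key-node sequence from i to j."""
    if nxt[i][j] is None:
        return []
    seq = [i]
    while i != j:
        i = nxt[i][j]
        seq.append(i)
    return seq
```

The saved offline prior paths between consecutive key nodes in the returned sequence would then be concatenated into the middle-section preset path.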
Further, the improved A* algorithm introduces an expansion point screening strategy, a bidirectional search mechanism and a path redundant point elimination technique into the standard A* algorithm.
The expanding point screening strategy ignores nodes with weak directivity with a certain probability during expanding, and comprises the following steps:
The rectangular area S_m enclosed by the current expansion point m, the starting point S and the end point D reflects the degree to which m conforms to the expected path: the smaller S_m is, the better node m conforms to the expected path, and the more opportunity it has to join the Open table for continued expansion.

S_m = |(x_D - x_S)(y_m - y_S) - (y_D - y_S)(x_m - x_S)| (1)

In formula (1), (x_S, y_S), (x_D, y_D) and (x_m, y_m) are the grid coordinates of the starting point S, the end point D and the current expansion point m, respectively.

An area threshold S_0 is set. When S_m <= S_0, m directly joins the Open table, which stores detected but not yet visited nodes; when S_m > S_0, a random number p(m) between 0 and 1 is generated, and if p(m) is larger than the trust threshold probability p_0 the node m is ignored, otherwise it is added to the Open table.
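A sketch of this screening rule. The area is computed as twice the triangle area spanned by S, D and m, one plausible reading of the enclosed-area criterion above; S0 and p0 are tuning parameters:

```python
import random

def enclosed_area(S, D, m):
    """Area reconstruction: twice the triangle area spanned by S, D and m.
    Zero when m lies on the straight line from S to D."""
    (xs, ys), (xd, yd), (xm, ym) = S, D, m
    return abs((xd - xs) * (ym - ys) - (yd - ys) * (xm - xs))

def keep_expansion_point(S, D, m, S0=4.0, p0=0.5, rng=random):
    """Return True if node m should be added to the Open table."""
    if enclosed_area(S, D, m) <= S0:
        return True               # strong directivity: always keep
    return rng.random() <= p0     # weak directivity: keep with probability p0
```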
The bidirectional search mechanism starts a forward search and a backward search simultaneously from the starting point and the end point. Each search takes the latest node obtained by the other side as its target node, and the two sides expand alternately, continuously closing in on each other until their target nodes meet; finally the paths generated by the forward and backward searches are spliced. The mechanism comprises the following steps:
Step 3.1, initializing the Open1, Open2, Close1 and Close2 lists, adding the starting point S to Open1 and the end point D to Open2;
Step 3.2, respectively taking out nodes n1 and n2 with the minimum cost value from the Open1 and Open2 lists, and respectively moving into Close1 and Close2;
Step 3.3, judging whether n1 and n2 are the current target nodes of the opposite search directions; if so, the algorithm ends and the paths generated in the forward and reverse directions are spliced into the final optimized path; if not, entering step 3.4;
Step 3.4, forward searching takes S as a starting point, n2 as an end point, n1 as a center and extends to the surrounding 8 neighborhood, and updating Open1 according to a rule of screening extension points;
step 3.5, reversely searching, namely expanding to the surrounding 8 neighborhood by taking D as a starting point and n1 as an end point and n2 as a center, and updating Open2 according to a rule of screening expansion points;
and 3.6, judging whether Open1 and Open2 are empty, if yes, ending the algorithm, and failing to find an effective path, and if not, entering step 3.2.
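The steps above can be sketched as a compact bidirectional search on an 8-connected grid. This is a simplified illustration, not the patented algorithm: the priority is a Chebyshev-distance heuristic toward the other side's latest node, and the Open-table screening rule is omitted:

```python
import heapq

def neighbors(grid, n):
    """Free 8-neighbourhood cells of n in a 0/1 occupancy grid."""
    r, c = n
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]) \
                    and grid[rr][cc] == 0:
                yield rr, cc

def h(a, b):
    """Chebyshev distance, matching 8-neighbourhood moves."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def bidirectional_search(grid, S, D):
    open1, open2 = [(h(S, D), S)], [(h(D, S), D)]
    parent1, parent2 = {S: None}, {D: None}
    close1, close2 = set(), set()
    while open1 and open2:
        _, n1 = heapq.heappop(open1)       # forward front
        _, n2 = heapq.heappop(open2)       # backward front
        close1.add(n1); close2.add(n2)
        if n1 in close2 or n2 in close1:   # fronts have met: splice paths
            meet = n1 if n1 in close2 else n2
            fwd, p = [], meet
            while p is not None:
                fwd.append(p); p = parent1.get(p)
            bwd, p = [], parent2.get(meet)
            while p is not None:
                bwd.append(p); p = parent2.get(p)
            return fwd[::-1] + bwd
        # each side aims at the latest node popped by the other side
        for nb in neighbors(grid, n1):
            if nb not in parent1:
                parent1[nb] = n1
                heapq.heappush(open1, (h(nb, n2), nb))
        for nb in neighbors(grid, n2):
            if nb not in parent2:
                parent2[nb] = n2
                heapq.heappush(open2, (h(nb, n1), nb))
    return None   # both Open tables empty: no valid path (step 3.6)
```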
The path redundant point eliminating technology specifically comprises the following steps:
Step 4.1, inserting new nodes at equal intervals between each pair of adjacent nodes of the original path;
Step 4.2, denoting the initial node as P_1 and the next node as P_2, numbering sequentially up to the end point P_k, and setting each node's predecessor in this numbering as its father node;
Step 4.3, setting P_n as the optimization reference node and P_c as the node under exploration, initializing n to k and c to k-2;
Step 4.4, judging whether the line L_nc between P_n and P_c passes through an obstacle; if so, P_{c+1} cannot be deleted and P_n jumps to P_{c+1}; otherwise P_{c+1} is deleted, P_c is set as the father node of P_n, and P_c jumps to the father node P_{c-1};
Step 4.5, repeating step 4.4 until P_n = P_1 or P_c = P_1;
Step 4.6, connecting the remaining nodes according to their father node pointers.
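The redundant point elimination can be illustrated with a simplified line-of-sight shortcutting pass. The sketch below skips the equidistant node insertion of step 4.1 and the father-pointer bookkeeping: it walks from the end point and keeps the earliest node visible from the current reference node, which captures the spirit of steps 4.2-4.6:

```python
def line_crosses_obstacle(grid, a, b):
    """Sampled straight-line walk from a to b; True if any visited cell
    of the 0/1 occupancy grid is an obstacle."""
    (r0, c0), (r1, c1) = a, b
    steps = max(abs(r1 - r0), abs(c1 - c0))
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        r = round(r0 + (r1 - r0) * t)
        c = round(c0 + (c1 - c0) * t)
        if grid[r][c] == 1:
            return True
    return False

def prune_path(grid, path):
    """Greedy line-of-sight shortcutting (simplified redundant point
    elimination). Walks from the end, keeping the earliest visible node."""
    if len(path) < 3:
        return path[:]
    pruned = [path[-1]]
    n = len(path) - 1
    while n > 0:
        c = 0
        # advance c until the segment path[n] -> path[c] is collision-free
        while c < n - 1 and line_crosses_obstacle(grid, path[n], path[c]):
            c += 1
        pruned.append(path[c])
        n = c
    return pruned[::-1]
```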
Further, the improved DDPG algorithm introduces a value-classified experience playback pool and a hierarchical sampling strategy, and designs a continuous reward function that blends target distance, safety, motion and step count.
The value-classified experience playback pool consists of three queues: a sub playback pool B1 storing high-value samples, a sub playback pool B2 storing recent samples, and a sub playback pool B3 storing common samples. Each sub playback pool stores experience samples as five-tuples:
(S_t, a_t, r_t, S_{t+1}, done) (2)

Formula (2) indicates that the agent reaches a new state S_{t+1} after performing action a_t in state S_t, while obtaining the environmental reward r_t; done indicates whether S_{t+1} is a termination state.
The specific construction steps of the value classification experience playback pool are as follows:
At initialization, the average value V_m of all sample values in the two sub playback pools B1 and B3 is set to 0, and the weight coefficient alpha of the reward value is set to 1. At each time step of training an experience sample is generated and added to the tail of the B2 queue; once B2 reaches its capacity limit, a sample is popped from the head of the queue and its value is calculated according to the sample value metric. If the value is higher than V_m the sample is added to B1, otherwise it is stored in B3. After storage, V_m and alpha are updated, and the above process repeats until training ends.
Taking a sample generated in the time step t as an example, the sample value measurement standard is calculated by the following steps:
V_t = alpha * |r_t| + (1 - alpha) * |delta_t| (3)

delta_t = r_t + gamma * Q'(S_{t+1}, a_{t+1} | theta^{Q'}) - Q(S_t, a_t | theta^Q) (4)

In formula (3), V_t is the value of the sample, |r_t| is the absolute value of the sample reward, alpha is the weight coefficient of |r_t|, |delta_t| is the absolute value of the sample's temporal-difference error, and (1 - alpha) is the weight of |delta_t|.

In formula (4), gamma is the reward decay coefficient, Q' is the target value network's state-action value, theta^{Q'} are the target value network parameters, Q is the online value network's state-action value, theta^Q are the online value network parameters, S_t is the state at time step t, a_t is the action decided in that state, S_{t+1} is the new state reached by the agent after executing a_t in S_t, and a_{t+1} is the action in the new state.
and the hierarchical sampling strategy randomly extracts a proper amount of samples from the sub-playback pools B1, B2 and B3 respectively for training the network.
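A minimal sketch of the value-classified playback pool and the hierarchical sampling strategy. The capacity, the running-mean update of V_m, and keeping alpha fixed are assumptions for illustration; the text above leaves the exact update rules open:

```python
import random
from collections import deque

class ValueClassifiedReplay:
    """Sketch: B1 holds high-value samples, B2 recent samples, B3 common
    samples. Sample value follows formula (3): V = a*|r| + (1-a)*|delta|.
    Running-mean V_m update and fixed alpha are illustrative assumptions."""

    def __init__(self, recent_cap=100, alpha=1.0):
        self.B1, self.B3 = [], []            # high-value / common samples
        self.B2 = deque(maxlen=recent_cap)   # recent samples
        self.V_m, self._n = 0.0, 0
        self.alpha = alpha

    def add(self, sample, td_error):
        """sample = (S_t, a_t, r_t, S_{t+1}, done) as in formula (2).
        When B2 is full, its head is classified into B1 or B3."""
        if len(self.B2) == self.B2.maxlen:
            old, old_td = self.B2[0]         # head, evicted by append below
            v = self.alpha * abs(old[2]) + (1 - self.alpha) * abs(old_td)
            (self.B1 if v > self.V_m else self.B3).append(old)
            self._n += 1
            self.V_m += (v - self.V_m) / self._n   # running mean of values
        self.B2.append((sample, td_error))

    def stratified_batch(self, k, rng=random):
        """Hierarchical sampling: draw up to k samples from each sub-pool."""
        pools = [self.B1, [s for s, _ in self.B2], self.B3]
        batch = []
        for p in pools:
            batch += rng.sample(p, min(k, len(p)))
        return batch
```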
The continuous reward function comprises three parts: an arrival reward r_a, a collision penalty r_c and a process reward r_n. The reward function r_t is defined as:

r_t = r_a if the target is reached; r_t = r_c if a collision occurs; r_t = r_n otherwise (5)

r_n = r_d + r_s + r_m + r_p (6)

During navigation, the process reward r_n obtained by the robot for executing an action is divided into 4 parts, namely the distance reward r_d, the safety penalty r_s, the motion penalty r_m and the step penalty r_p, as shown in formula (6).
r_d is the most important part of the navigation process reward; it guides the robot to approach the target point and is defined as:

r_d = k_1 * (D_g(S_{t-1}) - D_g(S_t)) (7)

In formula (7), k_1 is the distance reward coefficient, D_g(S_{t-1}) is the robot's distance to the target point in the previous state, and D_g(S_t) is its distance to the target point in the current state.
r_s warns the robot to stay as far from obstacles as possible and to select paths with a higher degree of safety; it is defined as:

r_s = -k_2 * s_r (8)

In formula (8), k_2 is the safety penalty coefficient and s_r represents the safety degree of the path, defined as:

s_r = (D_s - D_o) / D_s if D_o < D_s, otherwise s_r = 0 (9)

In formula (9), D_o is the distance between the robot and the nearest surrounding obstacle, and D_s is the safety distance threshold.
r_m penalizes frequent swinging of the robot and simplifies its motion control; it is defined as:

r_m = -k_3 * |w_t - w_{t-1}| (10)

In formula (10), k_3 is the motion penalty coefficient, w_t is the angular velocity output in the current state, and w_{t-1} is the angular velocity output in the previous state.
To keep the robot from blindly pursuing the maximum reward and heading straight for the target, r_p is introduced to prevent blind exploration and to indirectly stimulate the robot to find a better path; it is defined as:
rp=k4 (11)
In equation (11), k 4 is the step penalty coefficient.
Compared with the prior art, the invention has at least the following beneficial effects or advantages:
the hierarchical path planning method for the topology-grid-metric hybrid map divides the whole operation environment into sub-maps with different levels, generates paths by respectively applying different path planning methods on each sub-map, and finally splices the paths. The layering idea avoids detailed description of the whole environment, only expands key point information when needed, limits the search space of a path planning algorithm to be in a sub map, reduces the scale of the path planning problem in a large-scale scene, and improves the search efficiency.
Meanwhile, for global path planning, the globally optimized initial path generated from the topological map and the grid map is both safe and smooth, improving the quality of path planning. For local path planning, aiming at the long training time and low experience sample utilization of the DDPG algorithm, a value-classified experience playback pool, a hierarchical sampling strategy and a continuous reward function are designed, which accelerate training, guide the network to converge to a more reasonable obstacle avoidance strategy, help the robot better cope with emergencies, and make it more suitable for dynamic unknown environments.
Drawings
FIG. 1 is a flow chart of a hierarchical path planning method for a topology-grid-metric hybrid map of the present invention;
FIG. 2 is a schematic diagram of the expansion point screening strategy;
FIG. 3 is a schematic diagram of the topology-grid-metric hybrid map;
FIG. 4 is a schematic diagram of the hierarchical graph.
Detailed Description
The present invention will be described in further detail below with reference to the drawings and the specific embodiments of the present invention according to the technical solutions provided by the inventors, but the present invention is not limited to the embodiments, and all the similar methods and similar variations using the present invention should be included in the protection scope of the present invention.
The invention relates to a mixed map hierarchical path planning method facing to a topology-grid-metric map, which is shown in fig. 1 and specifically comprises the following steps:
And 1, creating a hybrid map of the mobile robot working environment according to a topology-grid-measurement hybrid map construction method.
The topology-grid-metric hybrid map completes abstract representation, materialization representation and refinement representation of the working environment from multiple dimensions through a topology map, a grid map and a metric map respectively, as shown in fig. 3. The topology-grid-metric hybrid map construction method comprises the following steps:
and 1.1, carrying out rasterization representation on the whole working environment, and constructing a global raster map.
The grid map is readily available: a grid representation of the global environment can be obtained by fully scanning the surrounding environment with sonar sensors or with tools such as the gmapping package in the Robot Operating System (ROS).
And 1.2, dividing the global grid map into a plurality of sub-grid maps, setting key nodes in each sub-grid map, describing the working environment as a hierarchical map representation taking the sub-map as a tree node by using a hierarchical map, wherein in a k-layer map, a k+1th sub-map is represented by the key nodes, and the key nodes of the same hierarchy are connected through an offline path.
The hierarchical graph is expressed as a multi-way tree containing several abstraction levels, with the abstraction level decreasing layer by layer from top to bottom, as shown in fig. 4. The granularity of the sub-map should not be too fine; the map is generally divided into room-scale regions of 10-30 m².
And 1.3, generating an offline priori path with higher safety degree on the grid map by adopting a refinement algorithm, and constructing a topological map corresponding to each level by a topological feature extraction method.
Since the refinement algorithm processes only the region having a pixel value of 1, when the two-dimensional grid map is processed, the free grid having a value of "0" is regarded as "1" in the image, and the obstacle grid having a value of "1" is regarded as "0" in the image, and the offline path of the free region is finally generated. On the basis, topological features are extracted, branch points of an offline prior path are used as common nodes in a topological graph, regional entrance grids or other representative grids are manually selected to represent key nodes of a sub-map, offline prior path lengths among the key nodes are counted to be used as weights of edges, the path lengths are the number of grids contained in the path, and finally the topological map corresponding to each level is constructed.
And 1.4, maintaining and updating the local measurement map by using the onboard sensor information carried by the mobile robot in the running process.
The refined position representation is realized by an inertial navigation system of the mobile robot equipment and a sensor for detecting the surrounding environment, such as a laser radar and the like, and a local measurement map is maintained.
And 2, before the mobile robot operates, generating an overall optimized global preset path according to a topology-grid hierarchical planning method.
The topology-grid hierarchical planning method comprises the following steps:
Step 2.1, judging whether the initial grid S and the target grid D are in the same sub-map, if so, entering the step 2.2, and if not, entering the step 2.3;
step 2.2, aiming at the starting point S to the end point D, adopting an improved A-type algorithm to plan a global preset path of overall optimization, and ending the algorithm;
Step 2.3, respectively obtaining the depths L_S and L_D of the starting point S and the end point D in the hierarchical graph architecture, and the key nodes C_S and C_D of the sub-grid maps where S and D are located. If L_S < L_D, the search starts from C_D and proceeds bottom-up, layer by layer, until C_S is found in layer L_S - 1. During the search, the Floyd algorithm plans an optimal node sequence on the topological map of each level, and the offline prior paths between nodes are saved as the middle-section preset path;
Step 2.4, taking the starting point S as the start and the key node C_S as the end, a local path is searched on the sub-grid map containing S with the improved A* algorithm and saved as the front-section preset path. Similarly, taking the key node C_D as the start and the end point D as the end, another local path is searched on the sub-grid map containing D with the improved A* algorithm and saved as the rear-section preset path. If no feasible solution can be found, path planning fails and the algorithm ends;
and 2.5, merging the middle preset path generated in the step 2.3 with the front preset path and the rear preset path searched in the step 2.4 to obtain an overall optimized global preset path, and ending the algorithm.
And 3, detecting the surrounding environment in real time when the mobile robot runs along the global preset path, judging whether the distance from the nearest obstacle of the robot is smaller than the obstacle avoidance threshold value, if so, entering the step 4, and if not, entering the step 5.
The laser radar mounted on the robot scans a full circle to obtain point cloud data reflecting the distances between surrounding obstacles and the robot; when the minimum value of the point cloud data is smaller than the obstacle avoidance threshold, the robot is close to an obstacle and obstacle avoidance is needed.
And 4, carrying out local path planning on the measurement map by utilizing an improved depth deterministic strategy gradient algorithm, and returning to the global preset path after obstacle avoidance is completed.
The metric map makes it easy to obtain the relative pose of two points, which benefits the state-space representation. The laser radar's distance point cloud, the robot's current pose and the target pose are concatenated into a state vector and fed into the trained obstacle avoidance policy network, whose end-to-end output of linear and angular velocities controls the robot's movement. When the robot detects an obstacle, a sub-target point behind the obstacle is selected on the initial path according to the obstacle's direction, and an obstacle avoidance route is planned autonomously.
And 5, judging whether the mobile robot reaches the target position, if so, ending the algorithm, and if not, entering the step 3.
The improved A* algorithm introduces an expansion point screening strategy, a bidirectional search mechanism and a path redundant point elimination technique into the standard A* algorithm.
(1) Expansion point screening strategy
The expansion point screening strategy ignores nodes with weak directivity with a certain probability during expansion, as follows:
Path planning requires the path length to be as short as possible, i.e. the path direction should be as close as possible to the direction from the start point to the end point. The area A(m) spanned by the current expansion point m relative to the start point S and the end point D reflects how well m complies with the expected path, as indicated by the green rectangle in fig. 2. The smaller A(m) is, the higher the compliance of node m with the expected path, and the more opportunity it has to join the Open table for further expansion:
A(m)=|(xD-xS)*(ym-yS)-(xm-xS)*(yD-yS)| (1)
In formula (1), (xS, yS), (xD, yD) and (xm, ym) are the grid coordinates of the start point S, the end point D and the current expansion point m, respectively;
Set an area threshold A0. When A(m) ≤ A0, m joins the Open table directly; when A(m) > A0, a random number p(m) is generated and compared with the trust threshold probability p0: if p(m) is larger than p0, node m is ignored, otherwise it is added to the Open table.
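The screening decision can be sketched as follows. The cross-product form of the deviation area is an assumption (the original formula is not legible in this copy), and the function names are illustrative:

```python
import random

def deviation_area(S, D, m):
    """Deviation of expansion point m from the start-to-end direction,
    measured as the cross-product area spanned by S->D and S->m
    (an assumed form of the patent's area criterion)."""
    return abs((D[0] - S[0]) * (m[1] - S[1]) - (m[0] - S[0]) * (D[1] - S[1]))

def keep_node(area, area_thresh, p0, rng=random.random):
    """Expansion-point screening: nodes within the area threshold are
    always kept; weakly directed nodes are kept only when p(m) <= p0."""
    if area <= area_thresh:
        return True          # strong directivity: always expand
    return rng() <= p0       # weak directivity: keep with probability p0
```

Collinear points yield zero area and are always expanded; far-off points survive only the probabilistic test.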
(2) Bidirectional search mechanism
The bidirectional search mechanism performs a forward search and a backward search simultaneously, starting from the start point and the end point respectively. Each search takes the latest node obtained by the other side as its target node and expands alternately, continuously closing in on the other side until the two searches meet; finally, the paths generated by the forward and backward searches are spliced together. The steps are as follows:
Step 3.1, initializing the Open1, Open2, Close1 and Close2 lists, adding the start point S to Open1, and adding the end point D to Open2;
Step 3.2, respectively taking out nodes n1 and n2 with minimum cost value F from the Open1 and Open2 lists, and respectively moving into Close1 and Close2;
Further, the cost value F is calculated as follows:
F(n)=g(n)+h(n) (2)
In equation (2), F(n) is the estimated cost for the start point s to reach the target point via node n, g(n) is the actual cost from the start point s to node n, and h(n) is the estimated cost from node n to the target d.
g(n)=√((xn-xs)²+(yn-ys)²) (3)
In formula (3), (xn, yn) and (xs, ys) are the coordinates of node n and the start point s, respectively;
h(n)=√((xn-xd)²+(yn-yd)²) (4)
In formula (4), (xd, yd) is the coordinate of the end point d;
Step 3.3, judging whether n1 or n2 is the target node of the opposite search direction; if so, the algorithm ends and the paths generated in the forward and backward directions are spliced into the final optimized path; if not, entering step 3.4;
Step 3.4, the forward search takes S as the start point and n2 as the end point, expands from n1 to its surrounding 8-neighborhood, and updates Open1 according to the expansion point screening rule;
Step 3.5, the backward search takes D as the start point and n1 as the end point, expands from n2 to its surrounding 8-neighborhood, and updates Open2 according to the expansion point screening rule;
Step 3.6, judging whether Open1 or Open2 is empty; if so, the algorithm ends and no valid path is found; if not, entering step 3.2.
It should be noted that as the bidirectional search expands toward the middle, the expansion point screening strategy is still followed.
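The bidirectional search steps above can be sketched as a compact, simplified implementation. This is illustrative only: it omits the explicit Close lists and the screening strategy, uses a Euclidean heuristic toward the opposite frontier's latest node, and all names are assumptions:

```python
import heapq
from math import hypot

def bidirectional_astar(grid, start, goal):
    """Simplified bidirectional A* on a 0/1 occupancy grid (1 = obstacle).
    Forward and backward searches alternate; each aims at the other's most
    recently expanded node, and the search stops when the frontiers meet."""
    rows, cols = len(grid), len(grid[0])

    def neighbors(p):
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    q = (p[0] + dx, p[1] + dy)
                    if 0 <= q[0] < rows and 0 <= q[1] < cols and not grid[q[0]][q[1]]:
                        yield q

    sides = [
        {"open": [(0.0, start)], "g": {start: 0.0}, "parent": {start: None}, "tip": start},
        {"open": [(0.0, goal)],  "g": {goal: 0.0},  "parent": {goal: None},  "tip": goal},
    ]
    while sides[0]["open"] and sides[1]["open"]:
        for i, side in enumerate(sides):
            if not side["open"]:
                return None
            other = sides[1 - i]
            target = other["tip"]                 # latest node of the opposite search
            _, n = heapq.heappop(side["open"])
            side["tip"] = n
            if n in other["g"]:                   # frontiers meet: splice both halves
                half = []
                p = n
                while p is not None:
                    half.append(p)
                    p = side["parent"][p]
                half.reverse()
                p = other["parent"][n]
                while p is not None:
                    half.append(p)
                    p = other["parent"][p]
                return half if i == 0 else half[::-1]
            for q in neighbors(n):                # 8-neighborhood expansion
                g = side["g"][n] + hypot(q[0] - n[0], q[1] - n[1])
                if g < side["g"].get(q, float("inf")):
                    side["g"][q] = g
                    side["parent"][q] = n
                    f = g + hypot(q[0] - target[0], q[1] - target[1])
                    heapq.heappush(side["open"], (f, q))
    return None
```

On an empty grid the two frontiers meet near the middle of the start-to-goal line, and the spliced result runs from start to goal.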
(3) Path redundancy point eliminating technology
On the basis of the original path generated by the bidirectional search, the method further applies a path redundancy point elimination technique, with the following specific steps:
Step 4.1, inserting new nodes at equal intervals between each pair of adjacent nodes of the original path;
Step 4.2, denoting the initial node as P1 and the next node as P2, numbering sequentially up to the end point Pk; each node records the preceding node in the numbering as its parent node;
Step 4.3, letting Pn be the current optimization reference node and Pc the node under exploration, initializing n to k and c to k-2;
Step 4.4, judging whether the line segment Lnc between Pn and Pc passes through an obstacle; if so, Pc+1 cannot be deleted and Pn jumps to Pc+1; otherwise, Pc+1 is deleted, Pc is set as the parent node of Pn, and Pc jumps to its parent node Pc-1;
Step 4.5, repeating step 4.4 until Pn=P1 or Pc=P1;
Step 4.6, connecting the remaining nodes according to their parent node pointers.
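Steps 4.1 to 4.6 amount to a visibility-based pruning pass over the path. A compact sketch follows; the sampled line-of-sight test and the function names are illustrative simplifications of the parent-pointer bookkeeping described above:

```python
def line_blocked(grid, a, b):
    """Check whether the straight segment a-b crosses an occupied cell,
    by sampling points along the segment (illustrative collision test)."""
    steps = max(abs(b[0] - a[0]), abs(b[1] - a[1])) * 2 + 1
    for i in range(steps + 1):
        t = i / steps
        x = round(a[0] + t * (b[0] - a[0]))
        y = round(a[1] + t * (b[1] - a[1]))
        if grid[x][y]:
            return True
    return False

def prune_path(grid, path):
    """Greedy redundancy-point removal: walk back from the end of the
    path and drop intermediate nodes whose neighbors stay mutually
    visible, keeping a node only when the line of sight is blocked."""
    if len(path) < 3:
        return list(path)
    keep = [path[-1]]
    anchor = len(path) - 1
    probe = anchor - 2
    while probe >= 0:
        if line_blocked(grid, path[anchor], path[probe]):
            anchor = probe + 1          # cannot skip: keep the node after probe
            keep.append(path[anchor])
        probe -= 1
    keep.append(path[0])
    return keep[::-1]
```

On an obstacle-free straight run the interpolated nodes all collapse, leaving only the two endpoints.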
The improved DDPG algorithm introduces a value-classified experience replay pool and a hierarchical sampling strategy, and designs a continuous reward function that fuses target distance, safety, motion, and step count.
(1) Value classification experience playback pool
The value-classified experience replay pool consists of a sub replay pool B1 storing high-value samples, a sub replay pool B2 storing recent samples, and a sub replay pool B3 storing ordinary samples. Each sub replay pool stores experience samples in the form of the following five-tuple:
(St,at,rt,St+1,done) (5)
Equation (5) indicates that the agent reaches a new state St+1 after performing action at in state St, while obtaining the environmental reward rt, where done indicates whether St+1 is a terminal state.
The specific construction steps of the value classification experience playback pool are as follows:
At initialization, the average value Vm of all sample values in the two sub replay pools B1 and B3 is set to 0, and the weight coefficient α of the reward value is set to 1. During training, an experience sample is generated at each time step and appended to the tail of the B2 queue. After B2 reaches its capacity limit, a sample is popped from the head of the queue and its value is computed according to the sample value metric; if the value is higher than Vm, the sample is added to B1, otherwise it is stored in B3. After storage, Vm and α are updated, and the above process is repeated until training ends.
Taking a sample generated in the time step t as an example, the sample value measurement standard is calculated by the following steps:
Vt=α*|rt|+(1-α)*|δt| (6)
δt=rt+γQ′(St+1,at+1|θQ′)-Q(St,at|θQ) (7)
In formula (6), Vt is the value of the sample, |rt| is the absolute value of the sample reward, α is the weight coefficient of |rt|, |δt| is the absolute value of the sample temporal-difference error, and (1-α) is the weight of |δt|;
In formula (7), γ is the reward discount coefficient, Q′ is the state-action value of the target value network, θQ′ are the target value network parameters, Q is the state-action value of the online value network, θQ are the online value network parameters, St is the state at time step t, at is the action decided in that state, St+1 is the new state reached by the agent after executing at in St, and at+1 is the action in the new state;
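The pool-construction procedure can be sketched as follows. This is a minimal illustration: the α update schedule is not specified in the text, so α is held fixed here, and the class and method names are assumptions:

```python
from collections import deque

class ValueClassifiedReplay:
    """Three-queue replay pool: B1 (high value), B2 (recent), B3 (ordinary).
    Sample value V = alpha*|r| + (1-alpha)*|td_error| is compared against
    the running mean V_m of the samples already classified into B1/B3."""
    def __init__(self, recent_cap, alpha=1.0):
        self.b1, self.b3 = [], []
        self.b2 = deque()
        self.cap = recent_cap
        self.alpha = alpha     # reward-value weight (update schedule unspecified)
        self.v_mean = 0.0      # running mean V_m of classified sample values
        self._count = 0

    def add(self, sample, reward, td_error):
        self.b2.append((sample, reward, td_error))
        if len(self.b2) > self.cap:                 # B2 full: classify the oldest
            s, r, d = self.b2.popleft()
            v = self.alpha * abs(r) + (1 - self.alpha) * abs(d)
            (self.b1 if v > self.v_mean else self.b3).append(s)
            self._count += 1
            self.v_mean += (v - self.v_mean) / self._count   # update V_m
```

Samples only migrate to B1 or B3 once they age out of the recent queue, so B2 always holds the freshest transitions.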
(2) Hierarchical sampling strategy
The hierarchical sampling strategy randomly draws an appropriate number of samples from each of the sub replay pools B1, B2 and B3 to train the network. The total minibatch size is denoted N; the number of samples drawn from sub replay pool B1 is N1, from B2 is N2, and from B3 is N3.
N=N1+N2+N3 (8)
N2 is fixed, so that every minibatch contains recent samples; their timeliness helps adjust the network's convergence direction promptly. In the early stage of training, N1 takes a larger value, and selecting many high-quality samples drives the network to converge in the correct direction. As training progresses, the robot performs better and high-value samples appear more and more frequently; to avoid overfitting, N1 is gradually reduced and N3 increased.
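A hypothetical sketch of one hierarchical sampling step follows; the quota names N1/N2/N3 mirror formula (8), while the high-value fraction parameter (and how it decays over training) is a design choice left open above:

```python
import random

def stratified_batch(b1, b2, b3, n_total, n_recent, high_frac):
    """Draw a minibatch from the three sub-pools: a fixed quota of recent
    samples (N2), a tunable share of high-value samples (N1, reduced as
    training progresses), and the remainder from ordinary samples (N3)."""
    n2 = min(n_recent, len(b2))                        # fixed recent quota
    n1 = min(int(high_frac * (n_total - n2)), len(b1)) # high-value share
    n3 = min(n_total - n1 - n2, len(b3))               # rest from ordinary pool
    return (random.sample(b1, n1)
            + random.sample(list(b2), n2)
            + random.sample(b3, n3))
```

Lowering `high_frac` as training advances reproduces the N1-shrinking schedule described above.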
(3) Continuous bonus function
The continuous reward function comprises three parts: an arrival reward, a collision penalty, and a process reward. As shown in formula (9), when the distance between the robot's center and the target position is smaller than the arrival threshold, the robot is judged to have reached the target point and obtains a reward value of 300; when the minimum value of the lidar point cloud data of the robot is smaller than the collision threshold, the robot is judged to have collided and obtains a reward value of -500; otherwise the process reward rn fusing target distance, safety, motion and step count is given, as defined in formula (10):
rt = {300, arrival; -500, collision; rn, otherwise} (9)
rn=rd+rs+rm+rp (10)
During navigation, the process reward rn obtained by the robot's action is divided into 4 parts: a distance reward rd, a safety penalty rs, a motion penalty rm, and a step-count penalty rp.
rd is the most important part of the navigation process reward, guiding the robot to approach the target point, and is defined as:
rd=k1*(Dg(St-1)-Dg(St)) (11)
In equation (11), k1 is the distance reward coefficient (k1=20 in this example), Dg(St-1) denotes the robot's distance from the target point in the previous state, and Dg(St) its distance in the current state.
rs warns the robot to keep away from obstacles as far as possible and to choose a safer path, and is defined as:
rs=k2*sr (12)
In formula (12), k2 is the safety penalty coefficient (k2=-1 in this example), Do is the distance between the robot and the nearest surrounding obstacle, ds is the safety distance threshold, and sr represents the safety degree of the path, defined as:
sr = {(ds-Do)/ds, Do<ds; 0, Do≥ds} (13)
rm penalizes frequent swinging of the robot and simplifies its motion control, and is defined as:
rm=k3*|wt-wt-1| (14)
In equation (14), k3 is the motion penalty coefficient (k3=-0.2 in this example); wt denotes the angular velocity output in the current state and wt-1 that output in the previous state.
To prevent the robot from blindly pursuing the maximum cumulative reward and heading straight for the target, rp is introduced to curb blind exploration and indirectly stimulate the robot to find a better path; it is defined as:
rp=k4 (15)
In formula (15), k4 is the step-count penalty coefficient (k4=-0.05 in this example).
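Putting the reward terms together, the per-step reward computation can be sketched as below. The coefficient values follow the embodiment; the exact functional forms of the safety degree and motion penalty are assumptions (linear safety degree below the threshold, absolute angular-velocity change), and the signature is illustrative:

```python
def continuous_reward(d_prev, d_curr, d_obs, w_curr, w_prev,
                      reached=False, collided=False,
                      k1=20.0, k2=-1.0, k3=-0.2, k4=-0.05, d_s=0.5):
    """Continuous reward combining arrival, collision, and process terms.
    The safety-degree and motion-penalty forms are assumed reconstructions."""
    if reached:
        return 300.0                        # arrival reward
    if collided:
        return -500.0                       # collision penalty
    r_d = k1 * (d_prev - d_curr)            # progress toward the goal
    s_r = max(0.0, (d_s - d_obs) / d_s)     # safety degree (assumed linear form)
    r_s = k2 * s_r                          # safety penalty
    r_m = k3 * abs(w_curr - w_prev)         # swing (motion) penalty
    r_p = k4                                # per-step penalty
    return r_d + r_s + r_m + r_p
```

With the robot well clear of obstacles and a steady heading, the reward reduces to the distance term minus the step penalty.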
It should be noted that DDPG is a deep reinforcement learning algorithm; the value-classified experience replay pool and the continuous reward function are designed to accelerate the training process and ensure that a high-quality obstacle avoidance policy is ultimately learned. Training can be performed in a simulation environment and, once complete, transferred to the real environment by loading the network parameters.
The foregoing is merely a preferred embodiment of the invention, and it should be noted that modifications could be made by those skilled in the art without departing from the principles of the invention, which modifications would also be considered to be within the scope of the invention.

Claims (5)

1. The hierarchical path planning method for the topology-grid-metric hybrid map is characterized by comprising the following steps of:
Step 1, creating a hybrid map of a mobile robot working environment according to a topology-grid-measurement hybrid map construction method, wherein the hybrid map comprises an abstract topological map, a concrete grid map and a fine measurement map, and the concrete steps are as follows:
step 1.1, rasterizing the whole operation environment to construct a global raster map;
Step 1.2, dividing the global grid map into a plurality of sub-grid maps, each sub-grid map being provided with key nodes, and using a hierarchical graph to describe the working environment as a layered map representation with the sub-maps as tree nodes; in the k-th layer map, each (k+1)-th layer sub-map is represented by its key node, and key nodes of the same layer are connected by offline prior paths;
Step 1.3, generating offline prior paths with a high degree of safety on the grid map by a thinning algorithm, and constructing a topological map corresponding to each level of sub-grid map by a topological feature extraction method, wherein the topological feature extraction method takes the bifurcation points of the offline prior paths on the grid map as common nodes of the topological map, manually selects region entrance grids or other representative grids as the key nodes of the sub-grid map, and counts the offline prior path length between key nodes as the weight of the edge, the path length being the number of grids contained in the path;
step 1.4, the mobile robot uses the information of the onboard sensor carried by the mobile robot to maintain and update the local measurement map in the running process;
Step 2, before the mobile robot operates, generating an overall optimized global preset path according to a topology-grid hierarchical planning method, wherein the method comprises the following steps of:
Step 2.1, judging whether the initial grid S and the target grid D are in the same sub-map, if so, entering the step 2.2, and if not, entering the step 2.3;
Step 2.2, aiming at the starting point S to the end point D, adopting an improved A-type algorithm to plan a global preset path which is optimized integrally, and ending the step 2;
Step 2.3, respectively obtaining the depths LS and LD of the start point S and the end point D in the hierarchical graph framework and the key nodes CS and CD of the sub-grid maps where the start point S and the end point D are located; if LS < LD, starting from CD and searching bottom-up in the hierarchical graph framework until CS is found in the topological map corresponding to layer LS-1, otherwise starting from CS and searching upward until CD is found in the topological map corresponding to layer LD-1; during the search, planning the optimal node sequence on the topological map of each layer with the Floyd algorithm, and saving the offline prior paths between the nodes as the middle-segment preset path;
Step 2.4, taking S as the start point and the key node CS as the end point, searching a local path on the sub-grid map where the start point S is located with the improved A* algorithm and saving it as the front-segment preset path; taking the key node CD as the start point and the end point D as the end point, searching another local path on the sub-grid map where the end point D is located with the improved A* algorithm and saving it as the rear-segment preset path;
Step 2.5, merging the middle preset path generated in the step 2.3 with the front preset path and the rear preset path searched in the step 2.4 to obtain an overall optimized global preset path, and ending the step 2;
Step 3, detecting the surrounding environment in real time when the mobile robot runs along the global preset path, judging whether the distance from the nearest obstacle of the robot is smaller than the obstacle avoidance threshold value, if so, entering the step 4, and if not, entering the step 5;
Step 4, carrying out local path planning on the metric map by using an improved Deep Deterministic Policy Gradient (DDPG) algorithm, namely: first constructing a value-classified experience replay pool, storing high-value samples, ordinary samples and recent samples in its three sub replay pools respectively; randomly drawing an appropriate number of samples from the three sub replay pools with a hierarchical sampling strategy for the learning and training of local path planning by the DDPG algorithm; introducing a continuous reward function comprising an arrival reward, a collision penalty and a process reward to reward or penalize the local paths planned by the DDPG algorithm, so that the DDPG algorithm is trained to plan better local paths; and returning to the global preset path after obstacle avoidance is completed through the local path planning;
and 5, judging whether the mobile robot reaches the target position, if so, ending the algorithm, and if not, entering the step 3.
2. The hierarchical path planning method for a topology-grid-metric hybrid map of claim 1, wherein said improved A* algorithm introduces a bidirectional search mechanism into the standard A* algorithm, performing forward and backward searches simultaneously from the start point and the end point, comprising the following steps:
Step 3.1, initializing the Open1, Open2, Close1 and Close2 lists, adding the start point S to Open1, and adding the end point D to Open2;
Step 3.2, respectively taking out nodes n1 and n2 with the minimum cost value from the Open1 and Open2 lists, and respectively moving into Close1 and Close2;
Step 3.3, judging whether n1 or n2 is the target node of the opposite search direction; if so, the algorithm ends and the paths generated in the forward and backward directions are spliced into the final optimized path; if not, entering step 3.4;
Step 3.4, the forward search takes S as the start point and n2 as the end point, expands from n1 to its surrounding 8-neighborhood, and updates Open1 according to the expansion point screening rule;
Step 3.5, the backward search takes D as the start point and n1 as the end point, expands from n2 to its surrounding 8-neighborhood, and updates Open2 according to the expansion point screening rule;
Step 3.6, judging whether Open1 or Open2 is empty; if so, the algorithm ends and no valid path is found; if not, entering step 3.2.
3. The hierarchical path planning method for a topology-grid-metric hybrid map according to claim 1, wherein the improved A* algorithm introduces a path redundancy point elimination technique into the standard A* algorithm, specifically comprising the following steps:
Step 4.1, inserting new nodes at equal intervals between each pair of adjacent nodes of the original path;
Step 4.2, denoting the initial node as P1 and the next node as P2, numbering sequentially up to the end point Pk, each node recording the preceding node in the numbering as its parent node;
Step 4.3, setting Pn as the optimization reference node and Pc as the currently explored node, initializing n to k and c to k-2;
Step 4.4, judging whether the line segment Lnc between Pn and Pc passes through an obstacle; if so, Pc+1 cannot be deleted and Pn jumps to Pc+1; otherwise, Pc+1 is deleted, Pc is set as the parent node of Pn, and Pc jumps to its parent node Pc-1;
Step 4.5, repeating step 4.4 until Pn=P1 or Pc=P1;
Step 4.6, connecting the remaining nodes according to their parent node pointers.
4. The hierarchical path planning method for a topology-grid-metric hybrid map according to claim 1, wherein the value-classified experience replay pool consists of three queues: a sub replay pool B1 storing high-value samples, a sub replay pool B2 storing recent samples, and a sub replay pool B3 storing ordinary samples, each sub replay pool storing experience samples in the form of the following five-tuple:
(St,at,rt,St+1,done) (2)
Equation (2) indicates that the agent reaches a new state St+1 after performing action at in state St, while obtaining the environmental reward rt, where done indicates whether St+1 is a terminal state;
the specific construction steps of the value classification experience playback pool are as follows:
Setting the average value Vm of all sample values in the two sub replay pools B1 and B3 to 0 at initialization and setting the weight coefficient α of the reward value to 1; generating an experience sample at each time step during training and appending it to the tail of the B2 queue; after B2 reaches its capacity limit, popping a sample from the head of the queue, computing its value according to the sample value metric, adding the sample to B1 if the value is higher than Vm and otherwise storing it in B3; updating Vm and α after storage, and repeating the above process until training ends;
taking a sample generated in the time step t as an example, the sample value measurement standard is calculated by the following steps:
Vt=α*|rt|+(1-α)*|δt| (3)
δt=rt+γQ′(St+1,at+1|θQ′)-Q(St,at|θQ) (4)
In formula (3), Vt is the value of the sample, |rt| is the absolute value of the sample reward, α is the weight coefficient of |rt|, |δt| is the absolute value of the sample temporal-difference error, and (1-α) is the weight of |δt|;
In formula (4), γ is the reward discount coefficient, Q′ is the state-action value of the target value network, θQ′ are the target value network parameters, Q is the state-action value of the online value network, θQ are the online value network parameters, St is the state at time step t, at is the action decided in that state, St+1 is the new state reached by the agent after executing at in St, and at+1 is the action in the new state;
and the hierarchical sampling strategy randomly extracts a proper amount of samples from the sub-playback pools B1, B2 and B3 respectively for training the network.
5. The hierarchical path planning method for a topology-grid-metric hybrid map of claim 1, wherein the continuous reward function includes three parts, namely an arrival reward ra, a collision penalty rc, and a process reward rn, and the reward function rt is defined as:
rt = {ra, arrival; rc, collision; rn, otherwise} (5)
In the navigation process, a process reward r n obtained by the action performed by the robot is divided into 4 parts, namely a distance reward r d, a safety penalty r s, a motion penalty r m and a step number penalty r p, as shown in a formula (6):
rn=rd+rs+rm+rp (6)
In equation (6), the distance reward rd is:
rd=k1*(Dg(St-1)-Dg(St)) (7)
Where k 1 is a distance reward coefficient, D g(St-1) represents the distance of the robot from the target point in the previous state, and D g(St) represents the distance of the robot from the target point in the current state;
In equation (6), the safety penalty rs is:
rs=k2*sr (8)
wherein k2 is the safety penalty coefficient, Do is the distance between the robot and the nearest surrounding obstacle, ds is the safety distance threshold, and sr represents the safety degree of the path:
sr = {(ds-Do)/ds, Do<ds; 0, Do≥ds} (9)
In equation (6), the motion penalty rm is:
rm=k3*|wt-wt-1| (10)
wherein k3 is the motion penalty coefficient, wt denotes the angular velocity output in the current state, and wt-1 that output in the previous state;
in equation (6), step penalty r p is:
rp=k4 (11)
Where k 4 is the step penalty factor.
CN202210340211.4A 2022-04-01 2022-04-01 Hierarchical path planning method for topology-grid-metric hybrid map Active CN114740846B (en)


Publications (2)

Publication Number Publication Date
CN114740846A CN114740846A (en) 2022-07-12
CN114740846B true CN114740846B (en) 2024-11-29

Family

ID=82280422


Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115406459B (en) * 2022-09-09 2024-08-13 中国第一汽车股份有限公司 Path determination method, path determination device, nonvolatile storage medium and computer equipment
CN115420296B (en) * 2022-11-07 2023-04-11 山东大学 Path searching method and system based on multi-resolution topological map
CN116147606B (en) * 2022-12-02 2023-09-08 浙江大学 An autonomous exploration and mapping method and system based on wheeled mobile robots
CN115619900B (en) * 2022-12-16 2023-03-10 中国科学技术大学 Topology Extraction Method of Point Cloud Map Based on Distance Map and Probability Road Map
CN116429137B (en) * 2023-03-22 2024-06-25 上海知而行科技有限公司 Traversal path generation method and equipment for cleaning device
CN118896603B (en) * 2023-05-04 2025-11-04 广州视源电子科技股份有限公司 Global path planning methods and mobile robots
CN116481557B (en) * 2023-06-20 2023-09-08 北京斯年智驾科技有限公司 An intersection traffic path planning method, device, electronic equipment and storage medium
CN117073697B (en) * 2023-08-17 2025-07-22 中国电子科技南湖研究院 Autonomous hierarchical exploration map building method, device and system for ground mobile robot
CN116774733B (en) * 2023-08-21 2023-10-31 南京航空航天大学 Multi-unmanned aerial vehicle coverage path planning method
CN118092413B (en) * 2023-11-06 2025-07-22 合肥工业大学 Path planning method, terminal and storage medium for multi-robot system
CN119354225B (en) * 2024-12-25 2025-03-21 江苏中科重德智能科技有限公司 Navigation method, system, device and storage medium for encouraging a robot to avoid obstacles by keeping to the right
CN120043547A (en) * 2024-12-31 2025-05-27 苏州大学 Self-adaptive navigation method and system for robot
CN120215514B (en) * 2025-05-27 2025-08-12 四川参盘供应链科技有限公司 A reinforcement learning unmanned forklift obstacle avoidance scheduling method and system for dynamic obstacles
CN120595812B (en) * 2025-08-06 2025-10-28 江苏盛海智能科技有限公司 Obstacle avoidance method and system for unmanned vehicle
CN120722862B (en) * 2025-08-19 2025-11-07 洛阳理工学院 An intelligent control method and system for AGVs

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101667029B1 (en) * 2009-08-10 2016-10-17 삼성전자 주식회사 Method and apparatus of path planing for a robot
EP2466532A1 (en) * 2010-12-16 2012-06-20 Siemens Aktiengesellschaft Method and device for determining a navigation graph for flows of individuals and method for determining a navigation strategy
CN105511457B (en) * 2014-09-25 2019-03-01 科沃斯机器人股份有限公司 Robot static path planning method
CN106097431A (en) * 2016-05-09 2016-11-09 王红军 A kind of object global recognition method based on 3 d grid map
CN107179082B (en) * 2017-07-07 2020-06-12 上海阅面网络科技有限公司 Autonomous exploration method and navigation method based on fusion of topological map and measurement map
CN109163731A (en) * 2018-09-18 2019-01-08 北京云迹科技有限公司 A kind of semanteme map constructing method and system
CN110703747B (en) * 2019-10-09 2021-08-03 武汉大学 A robot autonomous exploration method based on simplified generalized Voronoi diagram
CN113052152B (en) * 2021-06-02 2021-07-30 中国人民解放军国防科技大学 A vision-based indoor semantic map construction method, device and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hierarchical Path Planning of Mobile Robots for Composite Maps; Wu Xing, Yang Junjie, Tang Kai, Zhai Jingjing, Lou Peihuang; China Mechanical Engineering; 2023-03-31; Vol. 34, No. 5; 563-575 *


Similar Documents

Publication Publication Date Title
CN114740846B (en) Hierarchical path planning method for topology-grid-metric hybrid map
CN113110592B (en) Unmanned aerial vehicle obstacle avoidance and path planning method
Tang et al. Dynamic reallocation model of multiple unmanned aerial vehicle tasks in emergent adjustment scenarios
CN111142522B (en) Method for controlling agent of hierarchical reinforcement learning
CN109945873B (en) A hybrid path planning method for motion control of indoor mobile robots
WO2022052406A1 (en) Automatic driving training method, apparatus and device, and medium
CN114791743B (en) A collaborative trajectory planning method for UAV swarm considering communication delay
CN113342030B (en) Multi-UAV cooperative self-organization control method and system based on reinforcement learning
CN112947594A (en) Unmanned aerial vehicle-oriented flight path planning method
CN114089751A (en) A Path Planning Method for Mobile Robots Based on Improved DDPG Algorithm
CN116578080A (en) Local path planning method based on deep reinforcement learning
CN116679710A (en) Robot obstacle avoidance strategy training and deployment method based on multitask learning
CN118778650A (en) Improved A-star algorithm for mobile robot motion planning
Li et al. Predictive hierarchical reinforcement learning for path-efficient mapless navigation with moving target
CN118643858A (en) A hybrid digital-analog unmanned swarm brain-like collaborative navigation method
CN113391633A (en) Urban environment-oriented mobile robot fusion path planning method
CN117631697A (en) Map construction method, device, equipment and storage medium based on air-ground cooperation
CN113687657A (en) Method and storage medium for dynamic path planning of multi-agent formations
CN113124874A (en) Time collaborative flight path planning method for unmanned equipment
CN112904901A (en) Path planning method based on binocular vision slam and fusion algorithm
CN115657678A (en) Underwater unmanned underwater vehicle track generation method and system for complex dynamic environment
CN114815791B (en) Drivable space planning method and device
Yu The Impact of Path Planning Model Based on Improved Ant Colony Optimization Algorithm on Green Traffic Management.
CN117962918A (en) Automatic driving decision method and device, mobile tool and storage medium
CN115712311A (en) Method for covering cooperative area of multiple unmanned aerial vehicles under communication constraint

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant