JP2016119040A

JP2016119040A - Behavior control system, and method and program thereof

Info

Publication number: JP2016119040A
Application number: JP2015094398A
Authority: JP
Inventors: 洋川野; Hiroshi Kawano
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 2014-12-17
Filing date: 2015-05-01
Publication date: 2016-06-30
Anticipated expiration: 2035-05-01
Also published as: JP6489923B2

Abstract

【課題】多数のロボットの存在を考慮しつつも、計画計算に必要な計算時間や計算機の記憶容量を一台のロボットを扱うときと同様に少ないものに低減可能であるロボット協調隊列形成技術を提供する。【解決手段】全ての目標位置単位の集合に制御対象物が存在しない場合、開始位置の集合に配置されたp台の制御対象物を目標位置の集合に移動させる際に、頭部制御対象物単位が移動の先頭を務めるように、価値関数を用いてある時刻の隊列Gaに対して隊列Gbを決定し、全ての目標位置単位の集合に制御対象物が存在しない場合、頭部制御対象物単位の位置にM×N台の制御対象物が位置するように各制御対象物の動作を制御し、全ての目標位置単位の集合に制御対象物が存在する場合に、目標位置単位の集合には含まれるが、目標位置の集合には含まれない位置に、各制御対象物が移動するように制御する。【選択図】図２３[PROBLEMS] To provide a robot coordination corps formation technology that can reduce the calculation time required for planned calculation and the storage capacity of a computer to as little as when handling a single robot while considering the existence of a large number of robots. provide. When a control object does not exist in a set of all target position units, the head control object is used when moving the p control objects arranged in the start position set to the target position set. If the column Gb is determined for the column Ga at a certain time using the value function so that the unit serves as the head of movement, and there is no control object in the set of all target position units, the head control object The operation of each control object is controlled so that M × N control objects are located at the unit position, and if the control object exists in the set of all target position units, Is included, but control is performed so that each control object moves to a position that is not included in the set of target positions. [Selection] Figure 23

Description

本発明は、複数の制御対象物の行動を制御する技術に関する。例えば、複数のロボットを、開始位置における隊列形成状態から協調して移動させ、障害物を回避させ、目標位置で隊列形成をさせるための各ロボットの行動計画を求めるロボット協調制御技術に関する。 The present invention relates to a technique for controlling actions of a plurality of control objects. For example, the present invention relates to a robot cooperative control technique for obtaining an action plan for each robot for moving a plurality of robots in a coordinated manner from a formation state at a start position, avoiding an obstacle, and forming a formation at a target position.

近年、多数の自律移動ロボットを効率的に制御にするための研究が活発に行われている。その任務内容は、人の入れない箇所の監視、物品の搬送などさまざまであるが、多数のロボットの協調動作による隊列形成を効率的に行わせるための技術が求められており盛んに研究が行われている（例えば、非特許文献１参照）。多数のロボットによる効率的な隊列形成を実現するには、それぞれのロボットの配置、動作順序などを事前に計画することが重要である。このような計画においては、当然ながら、複数のロボットが動作する実環境における障害物の存在や経路の形状なども十分に考慮しなければならない。 In recent years, research has been actively conducted to efficiently control a large number of autonomous mobile robots. Their missions vary, such as monitoring places where people can't enter, transporting goods, etc., but technology is being sought for efficient formation of platoons through the coordinated operation of many robots. (For example, refer nonpatent literature 1). In order to realize efficient formation of a formation by a large number of robots, it is important to plan the arrangement and operation order of each robot in advance. In such a plan, as a matter of course, it is necessary to sufficiently consider the presence of obstacles and the shape of a route in an actual environment where a plurality of robots operate.

このような計画計算を行うための効果的な手法の一つとして、マルコフ決定過程における動的計画法や強化学習の手法があり、さまざまな研究が行われている（例えば、非特許文献２参照）。 As an effective method for performing such a plan calculation, there are a dynamic programming method and a reinforcement learning method in a Markov decision process, and various studies have been conducted (for example, see Non-Patent Document 2). ).

また、ロボットの隊列制御の中でも、ロボット同士が互いに接したままの状態で、アメーバのように全体で移動を行うという仮定の下でのロボット隊列制御においては、ロボット同士の相対的な位置関係から、各ロボットの絶対位値の決定が可能であるという利点と、付加的な位置計測用の装備を必要としないという利点があり、そのようなロボットの研究もおこなわれている。例えば、非特許文献３に示すものでは任意の矩形形状隊列から他の矩形形状隊列までの隊列制御が示されている。 Also, in the robot row control, the robot row control under the assumption that the robot moves as a whole in a state where the robots are in contact with each other, the relative positional relationship between the robots The advantages of being able to determine the absolute position of each robot and the advantage of not requiring additional position measurement equipment are being studied. For example, in the non-patent document 3, the row control from an arbitrary rectangular shape row to another rectangular shape row is shown.

また、非特許文献４に示す研究に至る一連の研究や非特許文献５では、ある隊列から他の隊列に変化する隊列制御が示されている。 In addition, in a series of studies leading to the research shown in Non-Patent Document 4 and Non-Patent Document 5, the formation control that changes from one formation to another formation is shown.

M.Shimizu, A.Ishiguro, T.Kawakatsu, Y.Masubuchi, “Coherent Swarming from Local Interaction by Exploiting Molecular Dynamics and Stokesian Dynamics Methods”, Proceeaings of the 2003 IEE/RSJ International Conference on intelligent Robots and Systems, Las Veqas, pp.1614-1619, October 2003.M. Shimizu, A. Ishiguro, T. Kawakatsu, Y. Masubuchi, “Coherent Swarming from Local Interaction by Exploiting Molecular Dynamics and Stokesian Dynamics Methods”, Proceeaings of the 2003 IEE / RSJ International Conference on intelligent Robots and Systems, Las Veqas, pp.1614-1619, October 2003. Y.Wang, C.W.de Silva, “Multi-Robot Box-pushing: Single-Agent Q-Learning vs. Team Q-Learning”, Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, Beijing, China, pp.3694-3699, October 2006.Y.Wang, CWde Silva, “Multi-Robot Box-pushing: Single-Agent Q-Learning vs. Team Q-Learning”, Proceedings of the 2006 IEEE / RSJ International Conference on Intelligent Robots and Systems, Beijing, China, pp .3694-3699, October 2006. A.Becker, G.Habibi, J.Werfel, M.Rubenstein, and J.McLurkin, “Massive Uniform Manipulation: Controlling Large Populations of Simple Robots with a Common Input Signal”, Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Japan, pp.520-527, November, 2013.A. Becker, G. Habibi, J. Werfel, M. Rubenstein, and J. McLurkin, “Massive Uniform Manipulation: Controlling Large Populations of Simple Robots with a Common Input Signal”, Proceedings of the IEEE / RSJ International Conference on Intelligent Robots and Systems, Japan, pp.520-527, November, 2013. Stanton Wong1 and Jennifer Walter ”Deterministic Distributed Algorithm for Self-Reconfiguration of Modular Robots from Arbitrary to Straight Chain Configurations”, Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, pp.537-543, May 6-10, 2013.Stanton Wong1 and Jennifer Walter ”Deterministic Distributed Algorithm for Self-Reconfiguration of Modular Robots from Arbitrary to Straight Chain Configurations”, Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, pp.537-543, May 6-10, 2013. Michael Rubenstein, Alejandro Cornejo, Radhika Nagpal, “Programmable self-assembly in a thousand-robot swarm”, SCIENCE, 2014, Vol. 345, Issue 6198, pp.795-799.Michael Rubenstein, Alejandro Cornejo, Radhika Nagpal, “Programmable self-assembly in a thousand-robot swarm”, SCIENCE, 2014, Vol. 345, Issue 6198, pp.795-799.

しかしながら、非特許文献１の手法では、流体力学的な特性をロボット動作に組み込む手法を用いて群ロボットの動作を制御しており、低い計算負荷での制御を可能にしている利点があるが、任意の形状の隊列形成をすることができるとは限らない。 However, in the method of Non-Patent Document 1, the operation of the group robot is controlled using a method of incorporating the hydrodynamic characteristics into the robot operation, and there is an advantage that enables control with a low calculation load. It is not always possible to form a formation of any shape.

また、非特許文献２の手法のように、マルコフ決定過程における動的計画法や強化学習を使用してこのような計画を行おうとすると、単体のロボットを使用する場合に比べて複数のロボットを使用する場合には、その計算に要する時間や計算機の記憶容量がロボットの数に対して指数関数的に増大してしまう。その主たる原因となるのが、探索計算のためのマルコフ状態空間内の状態数の莫大な増加である。非特許文献２では、検証された強化学習の手法では、ロボット数の増加に伴い、指数関数的に計算負荷が増加するという、マルコフ状態空間内の爆発問題への解決策は示されていない。 In addition, as in the method of Non-Patent Document 2, when trying to perform such a plan using dynamic programming or reinforcement learning in the Markov decision process, a plurality of robots are used compared to the case of using a single robot. When used, the time required for the calculation and the storage capacity of the computer increase exponentially with respect to the number of robots. The main cause is the enormous increase in the number of states in the Markov state space for search computation. Non-Patent Document 2 does not show a solution to the explosion problem in the Markov state space in which the verified reinforcement learning method increases the computational load exponentially as the number of robots increases.

また、非特許文献１，２の手法ともに、付加的な位置計測用の装備を必要とする。 Further, both the methods of Non-Patent Documents 1 and 2 require additional equipment for position measurement.

また、非特許文献３では、各時刻でロボットに与えられる動作命令が皆同じ方向であるという条件を考慮しており、付加的な位置計測用の装備を必要としないが、その実現には障害物の存在を必要としている。また、一度の動作における各ロボットの移動誤差から発生する隊列の崩れの問題も解決できていない。 Non-Patent Document 3 considers the condition that the motion commands given to the robot at the respective times are in the same direction, and does not require any additional equipment for position measurement. We need the existence of things. Moreover, the problem of collapse of the formation caused by the movement error of each robot in one operation has not been solved.

非特許文献４の手法においては、一度、線形隊列への変換をしなければならず、可能な隊列形成動作そのものへの制約が大きい。 In the method of Non-Patent Document 4, conversion to a linear formation must be performed once, and there is a great restriction on the possible formation operation itself.

非特許文献５の手法においては、開始隊列と目標隊列が共有する点がなければならないことや、障害物を考慮していないなどの問題点がある。 In the method of Non-Patent Document 5, there are problems that the starting row and the target row must share a point, and that obstacles are not considered.

このような現状に鑑みて、本発明では、多数のロボットの存在を考慮しつつも、計画計算に必要な計算時間や計算機の記憶容量を少ないものに低減可能で、かつ、ロボット同士が接したままの状態を維持しつつ任意の開始位置における隊列形成状態から、他の任意の目標位置における隊列形成状態へ障害物のある環境にて変形動作を行うことを可能とする、ロボット協調隊列形成技術を提供することを目的とする。 In view of such a current situation, in the present invention, while considering the existence of a large number of robots, it is possible to reduce the calculation time required for planned calculation and the storage capacity of the computer to a small one, and the robots are in contact with each other. Robot coordination formation technology that enables transformation operations from a formation state at any starting position to a formation state at any other target position in an environment with obstacles while maintaining the state as it is The purpose is to provide.

上記の課題を解決するために、本発明の一態様によれば、行動制御システムは、M及びNをそれぞれ2以上の整数の何れかとし、pをM×N以上の整数の何れかとし、開始位置の集合に配置されたp台の制御対象物を所定の入口位置を含む目標位置の集合に移動させるための行動制御を行う。行動制御システムは、第一方向に対して平行でない方向を第二方向とし、第一方向に対して反対の方向を第三方向とし、第二方向に対して反対の方向を第四方向とし、各開始位置及び各目標位置は、それぞれ第一方向〜第四方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、目標位置の集合及び開始位置の集合はそれぞれ一塊の任意の形状を成し、制御対象物は、当該制御対象物の２次元平面上の第一方向において隣接する第一位置、第二方向において隣接する第二位置、第三方向において隣接する第三位置、及び、第四方向において隣接する第四位置とし、静止するか、または、二次元平面上の第一〜第四位置の何れかに移動するように制御されるものとし、p台の制御対象物は、M×N台毎に１つの制御対象物単位を構成し、１つの制御対象物単位を構成するM×N台の制御対象物はそれぞれ２つ以上の方向において当該制御対象物単位を構成する他の制御対象物と隣接し、目標位置の集合からN×N個の位置からなる位置単位からなる目標位置単位の集合を設定し、制御対象物単位がその制御対象物単位の現在の位置sにおいて各行動aを取ったときの適切さを表す１個の価値関数に基づいて制御され、静止するか、または、二次元平面上の第一〜第四方向の何れかのN×N個の位置からなる位置単位に移動するように制御されるものとし、価値関数が記憶される記憶部と、全ての目標位置単位の集合に制御対象物が存在しない場合、複数の制御対象物単位が成すある隊列をGaとし、複数の制御対象物単位が成す他の隊列をGbとし、隊列Gbに存在し隊列Gaに存在しない制御対象物単位を頭部制御対象物単位とし、開始位置の集合に配置されたp台の制御対象物を目標位置の集合に移動させる際に、頭部制御対象物単位が移動の先頭を務めるように、価値関数を用いてある時刻の隊列Gaに対して隊列Gbを決定する移動先隊列決定用動作計画部と、全ての目標位置単位の集合に制御対象物が存在しない場合、頭部制御対象物単位の位置にM×N台の制御対象物が位置するように各制御対象物の動作を制御する各制御対象物動作用動作計画部とを含み、全ての目標位置単位の集合に制御対象物が存在する場合に、各制御対象物動作用動作計画部は、目標位置単位の集合には含まれるが、目標位置の集合には含まれない位置に、各制御対象物が移動するように制御する。 In order to solve the above-described problem, according to one aspect of the present invention, the behavior control system is configured such that M and N are each an integer of 2 or more, p is an integer of M × N or more, Action control for moving the p control objects arranged in the set of start positions to a set of target positions including a predetermined entrance position is performed. The action control system has a direction that is not parallel to the first direction as a second direction, a direction opposite to the first direction as a third direction, a direction opposite to the second direction as a fourth direction, Each start position and each target position are adjacent to other start positions and target positions in at least one of the first to fourth directions, and the set of target positions and the set of start positions are each a set of arbitrary Forming a shape, the control object is a first position adjacent in the first direction on the two-dimensional plane of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, And it is assumed that the fourth position is adjacent in the fourth direction, and is controlled to move to any one of the first to fourth positions on the two-dimensional plane. Is one control object unit per M × N units The M × N control objects constituting one control object unit are adjacent to the other control objects constituting the control object unit in two or more directions, respectively, and from the set of target positions. A set of target position units composed of N × N position units is set, and 1 represents the appropriateness when the controlled object unit takes each action a at the current position s of the controlled object unit. Controlled based on individual value functions and controlled to move to a position unit consisting of N × N positions in any of the first to fourth directions on the two-dimensional plane. When there is no control object in the storage unit in which the value function is stored and the set of all target position units, a certain formation formed by a plurality of control object units is defined as Ga, and a plurality of control object units are formed. The other formation is Gb, and it exists in formation Gb and does not exist in formation Ga When the target unit is the head control target unit and the p control target objects arranged in the set of start positions are moved to the target position set, the head control target unit is assumed to be the head of the movement. In addition, if there is no control object in the set of movement target platoon determination and the set of all target position units that determine the platoon Gb for the platoon Ga at a certain time using the value function, the head control object Including an operation planning unit for each control object operation that controls the operation of each control object so that M × N control objects are located at the position of the object unit. When there is an object, each motion target motion planning unit is included in the set of target position units, but moves to a position that is not included in the set of target positions. Control.

上記の課題を解決するために、本発明の他の態様によれば、行動制御方法は、M及びNをそれぞれ2以上の整数の何れかとし、pをM×N以上の整数の何れかとし、開始位置の集合に配置されたp台の制御対象物を所定の入口位置を含む目標位置の集合に移動させるための行動制御を行う。行動制御方法は、第一方向に対して平行でない方向を第二方向とし、第一方向に対して反対の方向を第三方向とし、第二方向に対して反対の方向を第四方向とし、各開始位置及び各目標位置は、それぞれ第一方向〜第四方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、目標位置の集合及び開始位置の集合はそれぞれ一塊の任意の形状を成し、制御対象物は、当該制御対象物の２次元平面上の第一方向において隣接する第一位置、第二方向において隣接する第二位置、第三方向において隣接する第三位置、及び、第四方向において隣接する第四位置とし、静止するか、または、二次元平面上の第一〜第四位置の何れかに移動するように制御されるものとし、p台の制御対象物は、M×N台毎に１つの制御対象物単位を構成し、１つの制御対象物単位を構成するM×N台の制御対象物はそれぞれ２つ以上の方向において当該制御対象物単位を構成する他の制御対象物と隣接し、目標位置の集合からN×N個の位置からなる位置単位からなる目標位置単位の集合を設定し、制御対象物単位がその制御対象物単位の現在の位置sにおいて各行動aを取ったときの適切さを表す１個の価値関数に基づいて制御され、静止するか、または、二次元平面上の第一〜第四方向の何れかのN×N個の位置からなる位置単位に移動するように制御されるものとし、移動先隊列決定用動作計画部が、全ての目標位置単位の集合に制御対象物が存在しない場合、複数の制御対象物単位が成すある隊列をGaとし、複数の制御対象物単位が成す他の隊列をGbとし、隊列Gbに存在し隊列Gaに存在しない制御対象物単位を頭部制御対象物単位とし、開始位置の集合に配置されたp台の制御対象物を目標位置の集合に移動させる際に、頭部制御対象物単位が移動の先頭を務めるように、価値関数を用いてある時刻の隊列Gaに対して隊列Gbを決定する移動先隊列決定用動作計画ステップと、各制御対象物動作用動作計画部が、全ての目標位置単位の集合に制御対象物が存在しない場合、頭部制御対象物単位の位置にM×N台の制御対象物が位置するように各制御対象物の動作を制御するステップと、各制御対象物動作用動作計画部が、全ての目標位置単位の集合に制御対象物が存在する場合に、各制御対象物動作用動作計画ステップは、目標位置単位の集合には含まれるが、目標位置の集合には含まれない位置に、各制御対象物が移動するように制御するステップと、を含む。 In order to solve the above-described problem, according to another aspect of the present invention, in the behavior control method, M and N are each an integer of 2 or more, and p is an integer of M × N or more. Then, behavior control for moving the p control objects arranged in the set of start positions to a set of target positions including a predetermined entrance position is performed. In the behavior control method, the direction that is not parallel to the first direction is the second direction, the direction opposite to the first direction is the third direction, the direction opposite to the second direction is the fourth direction, Each start position and each target position are adjacent to other start positions and target positions in at least one of the first to fourth directions, and the set of target positions and the set of start positions are each a set of arbitrary Forming a shape, the control object is a first position adjacent in the first direction on the two-dimensional plane of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, And it is assumed that the fourth position is adjacent in the fourth direction, and is controlled to move to any one of the first to fourth positions on the two-dimensional plane. Consists of one control object unit for each M × N units. The M × N control objects constituting one control object unit are adjacent to other control objects constituting the control object unit in two or more directions, and N from the set of target positions. X A set of target position units consisting of position units consisting of N positions, and one representing the appropriateness when the control object unit takes each action a at the current position s of the control object unit It is controlled on the basis of the value function, and is controlled to move to a position unit consisting of N × N positions in any of the first to fourth directions on the two-dimensional plane. When the movement planning unit for determining the movement target platoon does not have a control object in the set of all target position units, Ga is defined as a platoon formed by a plurality of control object units, and other control object units are formed. A control pair that exists in formation Gb and does not exist in formation Ga is Gb. When the object unit is a head control object unit and the p control objects arranged in the set of start positions are moved to the set of target positions, the head control object unit is the head of the movement. , The movement plan determining operation sequence step for determining the column Gb for the column Ga at a certain time using the value function, and the operation planning unit for each control object operation are controlled objects in a set of all target position units. When there is no object, the step of controlling the operation of each control object so that M × N control objects are located at the position of the head control object unit, and the operation planning unit for each control object operation When the control object exists in the set of all target position units, the operation planning step for each control object operation is included in the set of target position units but is not included in the set of target positions. Step for controlling each control object to move And including.

上記の課題を解決するために、本発明の他の態様によれば、行動制御システムは、M,N,Qをそれぞれ2以上の整数の何れかとし、Rを24以上の整数の何れかとし、pをM×N×Q×Rとし、開始位置の集合に配置されたp台の制御対象物を所定の入口位置を含む目標位置の集合に移動させるための行動制御を行う。行動制御システムは、第一方向に対して平行でない方向を第二方向とし、第一方向に対して反対の方向を第三方向とし、第二方向に対して反対の方向を第四方向とし、第一方向と第二方向との成す平面に対して平行でない方向を第五方向とし、第五方向に対して反対の方向を第六方向とし、M×N×Q個の位置からなる一つの位置単位とし、各開始位置及び各目標位置は、それぞれ第一方向〜第六方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、目標位置の集合及び開始位置の集合はそれぞれR個の位置単位からなる一塊の任意の形状を成し、制御対象物は、当該制御対象物の３次元空間上の第一方向において隣接する第一位置、第二方向において隣接する第二位置、第三方向において隣接する第三位置、第四方向において隣接する第四位置とし、第五方向において隣接する第五位置とし、第六方向において隣接する第六位置とし、静止するか、または、３次元空間上の第一〜第六位置の何れかに移動するように制御されるものとし、p台の制御対象物は、M×N×Q台毎に１つの制御対象物単位を構成し、１つの制御対象物単位を構成するM×N×Q台の制御対象物はそれぞれ３つ以上の方向において当該制御対象物単位を構成する他の制御対象物と隣接し、制御対象物単位がその制御対象物単位の現在の位置単位において各行動を取ったときの適切さを表す１個の価値関数に基づいて制御され、静止するか、または、３次元空間上の第一〜第六方向の何れかのN×N×Q個の位置からなる位置単位に移動するように制御されるものとし、価値関数が記憶される記憶部と、複数の制御対象物単位が成すある隊列をGaとし、複数の制御対象物単位が成す他の隊列をGbとし、隊列Gbに存在し隊列Gaに存在しない制御対象物単位を頭部制御対象物単位とし、開始位置の集合に配置されたp台の制御対象物を目標位置の集合に移動させる際に、頭部制御対象物単位が移動の先頭を務めるように、価値関数を用いてある時刻の隊列Gaに対して隊列Gbを決定する移動先隊列決定用動作計画部と、頭部制御対象物単位の位置にM×N×Q台の制御対象物が位置するように各制御対象物の動作を制御する各制御対象物動作用動作計画部とを含み、価値関数は、ある位置単位から所定の入口位置を含む位置単位までのマンハッタン距離を用いて得られる。 In order to solve the above problems, according to another aspect of the present invention, the behavior control system sets M, N, and Q to any one of integers of 2 or more, and sets R to any of integers of 24 or more. , P is M × N × Q × R, and behavior control is performed for moving p control objects arranged in the set of start positions to a set of target positions including a predetermined entrance position. The action control system has a direction that is not parallel to the first direction as a second direction, a direction opposite to the first direction as a third direction, a direction opposite to the second direction as a fourth direction, The direction that is not parallel to the plane formed by the first direction and the second direction is the fifth direction, the opposite direction to the fifth direction is the sixth direction, and one of M × N × Q positions Each start position and each target position are adjacent to other start positions and target positions in at least one of the first direction to the sixth direction, and the set of target positions and the set of start positions are An arbitrary shape of a group of R position units is formed, and the control object is a first position adjacent in the first direction on the three-dimensional space of the control object, and a second position adjacent in the second direction. In the third direction, in the third direction, in the fourth direction The fourth position adjacent to each other, the fifth position adjacent in the fifth direction, the sixth position adjacent in the sixth direction, and either stationary or any of the first to sixth positions in the three-dimensional space It is assumed that the control objects of p units are configured as one control target unit for each M × N × Q units, and M × N × constituting one control target unit. Each of the Q control objects is adjacent to another control object constituting the control object unit in three or more directions, and the control object unit performs each action in the current position unit of the control object unit. It is controlled based on a single value function that represents the appropriateness when it is taken, and is stationary, or consists of N × N × Q positions in any of the first to sixth directions in a three-dimensional space. It is assumed that it is controlled to move to a position unit, a storage unit for storing a value function, and a plurality of A convoy that consists of control object units is Ga, another convoy that consists of multiple control object units is Gb, a control object unit that exists in the convoy Gb and does not exist in the convoy Ga is a head control object unit, When moving the p control objects arranged in the set of start positions to the set of target positions, the gamut Ga at a certain time using a value function so that the head control object unit takes the lead in the movement. Control unit movement control unit so that M × N × Q control objects are positioned at the position of the head control object unit. The value function is obtained using a Manhattan distance from a certain position unit to a position unit including a predetermined entrance position.

上記の課題を解決するために、本発明の他の態様によれば、行動制御方法は、M,N,Qをそれぞれ2以上の整数の何れかとし、Rを24以上の整数の何れかとし、pをM×N×Q×Rとし、開始位置の集合に配置されたp台の制御対象物を所定の入口位置を含む目標位置の集合に移動させるための行動制御を行う。行動制御方法は、第一方向に対して平行でない方向を第二方向とし、第一方向に対して反対の方向を第三方向とし、第二方向に対して反対の方向を第四方向とし、第一方向と第二方向との成す平面に対して平行でない方向を第五方向とし、第五方向に対して反対の方向を第六方向とし、M×N×Q個の位置からなる一つの位置単位とし、各開始位置及び各目標位置は、それぞれ第一方向〜第六方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、目標位置の集合及び開始位置の集合はそれぞれR個の位置単位からなる一塊の任意の形状を成し、制御対象物は、当該制御対象物の３次元空間上の第一方向において隣接する第一位置、第二方向において隣接する第二位置、第三方向において隣接する第三位置、第四方向において隣接する第四位置とし、第五方向において隣接する第五位置とし、第六方向において隣接する第六位置とし、静止するか、または、３次元空間上の第一〜第六位置の何れかに移動するように制御されるものとし、p台の制御対象物は、M×N×Q台毎に１つの制御対象物単位を構成し、１つの制御対象物単位を構成するM×N×Q台の制御対象物はそれぞれ３つ以上の方向において当該制御対象物単位を構成する他の制御対象物と隣接し、制御対象物単位がその制御対象物単位の現在の位置単位において各行動を取ったときの適切さを表す１個の価値関数に基づいて制御され、静止するか、または、３次元空間上の第一〜第六方向の何れかのN×N×Q個の位置からなる位置単位に移動するように制御されるものとし、移動先隊列決定用動作計画部が、複数の制御対象物単位が成すある隊列をGaとし、複数の制御対象物単位が成す他の隊列をGbとし、隊列Gbに存在し隊列Gaに存在しない制御対象物単位を頭部制御対象物単位とし、開始位置の集合に配置されたp台の制御対象物を目標位置の集合に移動させる際に、頭部制御対象物単位が移動の先頭を務めるように、価値関数を用いてある時刻の隊列Gaに対して隊列Gbを決定する移動先隊列決定用動作計画ステップと、各制御対象物動作用動作計画部が、頭部制御対象物単位の位置にM×N×Q台の制御対象物が位置するように各制御対象物の動作を制御する各制御対象物動作用動作計画ステップとを含み、価値関数は、ある位置単位から所定の入口位置を含む位置単位までのマンハッタン距離を用いて得られる。 In order to solve the above-mentioned problem, according to another aspect of the present invention, the behavior control method is such that M, N, and Q are each an integer of 2 or more, and R is an integer of 24 or more. , P is M × N × Q × R, and behavior control is performed for moving p control objects arranged in the set of start positions to a set of target positions including a predetermined entrance position. In the behavior control method, the direction that is not parallel to the first direction is the second direction, the direction opposite to the first direction is the third direction, the direction opposite to the second direction is the fourth direction, The direction that is not parallel to the plane formed by the first direction and the second direction is the fifth direction, the opposite direction to the fifth direction is the sixth direction, and one of M × N × Q positions Each start position and each target position are adjacent to other start positions and target positions in at least one of the first direction to the sixth direction, and the set of target positions and the set of start positions are An arbitrary shape of a group of R position units is formed, and the control object is a first position adjacent in the first direction on the three-dimensional space of the control object, and a second position adjacent in the second direction. , Third position adjacent in the third direction, in the fourth direction 4th position in contact, 5th position adjacent in 5th direction, 6th position adjacent in 6th direction, stationary or moved to any of first to sixth positions in 3D space The control objects of p units constitute one control object unit for each M × N × Q units, and M × N × Q units that constitute one control object unit. The control object is adjacent to other control objects constituting the control object unit in three or more directions, and the control object unit takes action in the current position unit of the control object unit. A position unit that is controlled based on a single value function that represents the appropriateness of time and is stationary, or consists of N × N × Q positions in any of the first to sixth directions in a three-dimensional space. The movement planning unit for determining the movement target platoon is controlled by a plurality of controls. Gait is a formation that consists of object units, Gb is another formation that consists of multiple control object units, and control object units that exist in formation Gb but do not exist in formation Ga are head control object units. When moving p control objects arranged in a set of positions to a set of target positions, a value function is used to move to the formation Ga at a certain time so that the head control object unit takes the lead of the movement. On the other hand, the movement plan step for determining the movement group for determining the row Gb and the movement planning unit for each movement of the control target object are M × N × Q control target objects at the position of the head control target unit. Thus, the value function is obtained by using a Manhattan distance from a certain position unit to a position unit including a predetermined entrance position.

本発明に拠れば、多数のロボットの存在を考慮しつつも、計画計算に必要な計算時間や計算機の記憶容量を少ないものに低減可能で、かつ、ロボット同士が接したままの状態を維持しつつ任意の開始位置における隊列形成状態から、他の任意の目標位置における隊列形成状態へ障害物のある環境にて変形動作を行うことを可能とする。 According to the present invention, while considering the existence of a large number of robots, it is possible to reduce the calculation time required for the planned calculation and the storage capacity of the computer to a small one and maintain the state where the robots are in contact with each other. However, it is possible to perform a deformation operation in an environment with an obstacle from a formation state at an arbitrary starting position to a formation state at another arbitrary target position.

ロボットの移動を説明するための図。The figure for demonstrating the movement of a robot. 多数のロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many robots move from the formation formation state in a starting position in cooperation, and form formation at a target position. ロボットの移動を説明するための図。The figure for demonstrating the movement of a robot. 尾部ロボット及び頭部ロボットを説明するための図。The figure for demonstrating a tail robot and a head robot. 頭部ロボットの位置にいた一つのボイドが、尾部ロボットの位置に移動していく様子を示す図。The figure which shows a mode that one void which was in the position of a head robot moves to the position of a tail robot. ロボット単位を説明するための図。The figure for demonstrating a robot unit. 目標隊列エリアG及び目標隊列エリア単位GUを説明するための図。The figure for demonstrating the target formation area G and the target formation area unit GU. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many invariant type robots cooperate and move from the formation formation state in a starting position, and form a formation in a target position. 隊列Ga及び隊列Gbを説明するための図。The figure for demonstrating formation Ga and formation Gb. 隊列Ga及び隊列Gbを説明するための図。The figure for demonstrating formation Ga and formation Gb. 図１１Ａは各ロボット動作決定用動作計画のモード１を説明するための図、図１１Ｂは各ロボット動作決定用動作計画のモード１を説明するための図。FIG. 11A is a diagram for explaining mode 1 of each robot motion determining motion plan, and FIG. 11B is a diagram for explaining mode 1 of each robot motion determining motion plan. 図１２Ａは各ロボット動作決定用動作計画のモード１を説明するための図、図１２Ｂは各ロボット動作決定用動作計画のモード１を説明するための図。FIG. 12A is a diagram for explaining mode 1 of each robot motion determining motion plan, and FIG. 12B is a diagram for explaining mode 1 of each robot motion determining motion plan. 図１３Ａは各ロボット動作決定用動作計画のモード１を説明するための図、図１３Ｂは各ロボット動作決定用動作計画のモード１を説明するための図。FIG. 13A is a diagram for explaining mode 1 of each robot motion determining motion plan, and FIG. 13B is a diagram for explaining mode 1 of each robot motion determining motion plan. 図１４Ａは各ロボット動作決定用動作計画のモード１を説明するための図、図１４Ｂは各ロボット動作決定用動作計画のモード１を説明するための図。FIG. 14A is a diagram for explaining mode 1 of each robot motion determining motion plan, and FIG. 14B is a diagram for explaining mode 1 of each robot motion determining motion plan. 各ロボット動作決定用動作計画のモード１を説明するための図。The figure for demonstrating the mode 1 of each robot operation | movement determination operation plan. ロボット序列計算処理Robot_Joretu_Decisionを説明するための図。The figure for demonstrating robot order calculation processing Robot_Joretu_Decision. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many invariant type robots cooperate and move from the formation formation state in a starting position, and form a formation in a target position. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many invariant type robots cooperate and move from the formation formation state in a starting position, and form a formation in a target position. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図であって、目標エリア単位GUが充填された状態を示す図。FIG. 5 is a diagram for explaining a mission in which a large number of invariant robots cooperate to move from a formation state at a start position and form a formation at a target position, and shows a state in which a target area unit GU is filled. Figure. 条件１を説明するための図。The figure for demonstrating the conditions 1. FIG. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many invariant type robots cooperate and move from the formation formation state in a starting position, and form a formation in a target position. 多数の不変型ロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図であって、目標エリアGが充填された状態を示す図。。FIG. 6 is a diagram for explaining a mission in which a large number of invariant robots cooperate to move from a formation state at a start position and form a formation at a target position, and illustrates a state in which a target area G is filled . . 第一実施形態に係る行動制御システムの機能ブロック図。The functional block diagram of the action control system which concerns on 1st embodiment. 第一実施形態に係る行動制御システムの処理フローの例を示す図。The figure which shows the example of the processing flow of the action control system which concerns on 1st embodiment. 各格子が菱形での場合の例を示す図。The figure which shows the example in case each lattice is a rhombus. ロボットの移動を説明するための図。The figure for demonstrating the movement of a robot. 図２７Ａは開始位置の集合を説明するための図、図２７Ｂは目標位置の集合を説明するための図。FIG. 27A is a diagram for explaining a set of start positions, and FIG. 27B is a diagram for explaining a set of target positions. 尾部ロボット及び頭部ロボットを説明するための図。The figure for demonstrating a tail robot and a head robot. 頭部ロボットの位置にいた一つのボイドが、尾部ロボットの位置に移動していく様子を示す図。The figure which shows a mode that one void which was in the position of a head robot moves to the position of a tail robot. ロボット単位を説明するための図。The figure for demonstrating a robot unit. ボイド移動制御を説明するための図。The figure for demonstrating void movement control. 第二実施形態に係る行動制御システムの機能ブロック図。The functional block diagram of the action control system which concerns on 2nd embodiment. 第二実施形態に係る行動制御システムの処理フローの例を示す図。The figure which shows the example of the processing flow of the action control system which concerns on 2nd embodiment. 多数のロボットが協調して開始位置における隊列形成状態から移動を行い、目標位置での隊列形成を行う任務を説明するための図。The figure for demonstrating the mission which many robots move from the formation formation state in a starting position in cooperation, and form formation at a target position.

以下、本発明の実施形態について説明する。なお、以下の説明に用いる図面では、同じ機能を持つ構成部や同じ処理を行うステップには同一の符号を記し、重複説明を省略する。 Hereinafter, embodiments of the present invention will be described. In the drawings used for the following description, constituent parts having the same function and steps for performing the same process are denoted by the same reference numerals, and redundant description is omitted.

＜第一実施形態＞
まず、行動制御システム及び方法の理論的背景について説明する。以下、行動制御の対象である制御対象物が、ロボットである場合を例に挙げて説明するが、制御対象物は、制御の対象となり得るものであれば、ロボット以外であってもよい。 <First embodiment>
First, the theoretical background of the behavior control system and method will be described. Hereinafter, the case where the control target object that is the target of behavior control is a robot will be described as an example, but the control target object may be other than the robot as long as it can be a control target.

[問題設定]
多数のロボットが協調して開始位置における隊列形成状態から、各ロボットが接した状態を維持しつつ移動を行い、目標位置での隊列形成を行う任務は、例えば図１に例示するような、互いに接する辺同志をスライドさせて移動していくことが可能な正方形型のロボットの使用を想定し、図２に示すように、壁で区切られた部屋においての開始位置から目標位置まで複数のロボットの移動によって実現するものである。 [Problem settings]
A number of robots cooperate to move from the formation state at the start position while maintaining the state in which each robot is in contact, and the task of forming the formation at the target position is as shown in FIG. Assuming the use of a square-type robot that can move by sliding the adjacent sides, as shown in FIG. 2, a plurality of robots can be moved from the start position to the target position in a room separated by walls. It is realized by movement.

ロボットについては、例えば図１に示すように、ロボットの周囲上下左右４マスのうち一つに他のロボットが存在している状態を維持しながら移動をするものとする。この手法では１つのロボット自身が、一台のロボットのサイズ分の距離を移動することで、一回の動作の移動量を正確に測ることができるというメリットがある。また、一つの辺を共有する隣り合うロボットとの相対的な位置を計測しあうことで、ロボットの群れ全体の中での各ロボットの位置も容易に知ることができる。このため、ロボットの移動量の誤差によって、隊列が崩れるといった問題を起こしにくい。また、図３のように、複数のロボットを連結したように、同時に複数のロボットを移動させていくことが可能である。図３では、隊列Aから隊列Bに変形する際に、及び、隊列Dから隊列Eに変形する際に二台のロボットが連結したように同時に移動する。なお、ロボットは、隣の位置に他のロボットが存在しているか否か、障害物があるか否か、そして、自身が目標位置上にいるかどうかを知ることができるものとする。例えば、何れのロボットもロボットの周囲のマスに存在する他のロボットと通信できるものとし、少なくとも一台のロボットがGPSを備えるか、または、少なくとも１台のロボットが一回の行動制御において移動しない場合、GPSを備えるロボットや移動しなかったロボットを基準として、隣接するロボットと通信することで、各ロボットは自身の絶対的な位置を容易に知ることができる。もしくは、少なくとも一台のロボットの絶対位置が開始位置にて既知であるならば、それを使用してGPSなしでも、各時刻の各ロボットの絶対位置を知ることが出来る。 For example, as shown in FIG. 1, the robot moves while maintaining the state in which another robot exists in one of the four squares around the robot. This method has an advantage that one robot itself can accurately measure the movement amount of one operation by moving the distance corresponding to the size of one robot. Also, by measuring the relative positions of adjacent robots that share one side, the position of each robot in the entire group of robots can be easily known. For this reason, it is difficult to cause a problem that the formation is collapsed due to an error in the movement amount of the robot. Further, as shown in FIG. 3, it is possible to move a plurality of robots at the same time as if a plurality of robots are connected. In FIG. 3, when transforming from platoon A to platoon B and when transforming from platoon D to platoon E, the two robots move simultaneously as if they were connected. It is assumed that the robot can know whether another robot is present at the adjacent position, whether there is an obstacle, and whether the robot is on the target position. For example, it is assumed that any robot can communicate with other robots existing in the mass around the robot, and at least one robot is equipped with GPS, or at least one robot does not move in one action control. In this case, each robot can easily know its absolute position by communicating with an adjacent robot with reference to a robot with GPS or a robot that has not moved. Alternatively, if the absolute position of at least one robot is known at the start position, it can be used to know the absolute position of each robot at each time without GPS.

任務を行うロボットは、p台（pは2以上の整数の何れかであり、例えばp≧50）であり、各ロボットは、隣接するロボットと一辺以上を共有しつつ、二次元平面におけるX-Y軸方向（この例では、図面の紙面上下左右の四方向）に移動可能とする。ただし、p台のロボットで１つの群れを成すものとする。各格子にはロボットは一台しか存在することができない。それぞれのロボットは、移動しようとする方向に障害物か他のロボットがある場合には、静止をするものと仮定する。図２において、Rが記載された格子はロボットが存在する位置を示し、Oが記載された格子は障害物が存在する位置を示す。また、太線の破線で囲まれた領域は開始位置の集合を示し、太線の一点鎖線で囲まれた領域は目標位置の集合Gで表される領域（以下、この領域のことを「目標隊列エリアG」ともいう）を示し、太線の実線で囲まれた領域は後述する目標隊列エリアGの入口位置Peを示す。このように、各開始位置及び各目標位置は、それぞれ上下左右方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、ロボットの開始位置及び目標位置での隊列形状はそれぞれ一塊の任意の形状である。 The robots that perform the mission are p units (p is any integer of 2 or more, for example, p ≧ 50), and each robot shares one or more sides with an adjacent robot, and the XY axis in a two-dimensional plane. It is possible to move in the direction (in this example, the four directions on the drawing up / down / left / right). However, p robots form one group. There can be only one robot in each grid. Each robot is assumed to be stationary if there are obstacles or other robots in the direction of movement. In FIG. 2, a grid in which R is described indicates a position where the robot exists, and a grid in which O is described indicates a position where an obstacle exists. An area surrounded by a thick broken line indicates a set of start positions, and an area surrounded by a thick dashed line is an area represented by a set G of target positions (hereinafter, this area is referred to as “target row area”). G ”), and an area surrounded by a thick solid line indicates an entrance position Pe of a target platoon area G to be described later. In this way, each start position and each target position are adjacent to other start positions and target positions in at least one of the vertical and horizontal directions, respectively, and the formation at the start position and the target position of the robot is a lump. Any shape.

[ロボットの座標設定]
それぞれのロボットi（iはロボット番号を表すi=0,1,2,3,,,p-1）の初期位置(開始位置)を（Xr0[i],Yr0[i]）とし、目標位置を（Xre[i],Yre[i]）とするとき、本問題は、開始位置に配置されたロボットが、目標位置まで移動するための行動計画を求めることと定義できる。 [Robot coordinate settings]
The initial position (start position) of each robot i (i is the robot number i = 0,1,2,3 ,,, p-1) is (Xr0 [i], Yr0 [i]) and the target position Is (Xre [i], Yre [i]), this problem can be defined as finding an action plan for the robot placed at the start position to move to the target position.

[任務空間の定義]
このような問題に対して単純にマルコフ状態遷移モデルを適用しようとする場合、マルコフ状態空間は、iをロボット番号としたとき、ロボットiの位置（Xr[i],Yr[i]）、ロボットiの行動aによって構成される。各状態（ロボットの位置と行動）は離散値で表現される。部屋をX,Yの直交座標系からなる２次元平面で表すと、X軸、Y軸をそれぞれ離散化表現した値により各位置を表現する。つまり、図２のように部屋（２次元平面）は格子で区切られ、各格子が各位置に対応する。また、各格子において、障害物の「ある／なし」が予め設定されている。 [Definition of mission space]
When simply applying the Markov state transition model to such a problem, the Markov state space is the position of the robot i (Xr [i], Yr [i]), i composed of i actions a. Each state (robot position and action) is represented by discrete values. When a room is represented by a two-dimensional plane composed of an orthogonal coordinate system of X and Y, each position is represented by a discrete representation of the X axis and the Y axis. That is, as shown in FIG. 2, the room (two-dimensional plane) is divided by a grid, and each grid corresponds to each position. In each grid, “present / none” of the obstacle is set in advance.

[ロボット動作の定義]
また、行動主体は部屋に配置されている各ロボットとなる。ロボットiの行動a[i]は、静止、上下左右方向への１格子分の移動の計５種類のうちのいずれかを取る。例えば、a[i]∈{0,1,2,3,4}として、
0: 静止
1: 二次元平面上でX座標値が増える方向（図2で右方向）に１格子だけ移動する
2: 二次元平面上でY座標値が増える方向（図2で下方向）に１格子だけ移動する
3: 二次元平面上でX座標値が減る方向（図2で左方向）に１格子だけ移動する
4: 二次元平面上でY座標値が減る方向（図2で上方向）に１格子だけ移動する
とする。 [Definition of robot motion]
The action subject is each robot arranged in the room. The action a [i] of the robot i takes one of a total of five types, that is, stationary and movement of one lattice in the vertical and horizontal directions. For example, a [i] ∈ {0,1,2,3,4}
0: stationary
1: Move by one grid in the direction in which the X coordinate value increases on the two-dimensional plane (to the right in Fig. 2)
2: Move one grid in the direction of increasing Y-coordinate value on the two-dimensional plane (downward in Fig. 2)
3: Move one grid in the direction of decreasing X-coordinate value on the two-dimensional plane (leftward in Fig. 2)
4: Suppose that one grid moves in the direction in which the Y coordinate value decreases on the two-dimensional plane (upward in Fig. 2).

[探索計算上の問題点]
このような任務環境におけるマルコフ状態空間は、ロボット数×2の次元数の状態を持ち、かつ選択可能な行動数は、ロボットの行動（＝５通り）のロボット数乗だけ存在する。例えば、ロボット数が５０で、部屋の縦横方向の格子数がそれぞれ２０であるとすれば状態数は２０の100乗個にもなり、探索計算に要する資源の量は膨大なものとなる。さらにロボット数が１台増えるごとに、その状態数は400倍増加していくことになる。 [Problems in search calculation]
The Markov state space in such a mission environment has a state of the number of dimensions of the number of robots × 2, and the number of selectable actions exists by the number of robot actions (= 5). For example, if the number of robots is 50 and the number of grids in the vertical and horizontal directions of the room is 20, the number of states becomes 20 to the 100th power, and the amount of resources required for the search calculation becomes enormous. Each time the number of robots increases by one, the number of states will increase 400 times.

[問題設定]の項で説明したように、ロボット同士が接しているという拘束条件を取り入れても、根本的な計算量の削減は難しく、複数のロボットを使用する場合の大きな問題となっている。 As explained in [Problem Setting], even if the constraint condition that robots are in contact with each other is incorporated, it is difficult to reduce the fundamental amount of calculation, which is a big problem when using multiple robots. .

具体的には、例えば、多数のロボットを複数の目標位置に誘導する問題においては、一台のロボットがある位置へ至るための経路を探索計算により求めることができるとしても、各ロボットにどの目標位置を割り振るかが大きな問題となる。この目標位置の割り振りがうまくいかなければ、早めに目標位置内に到達したロボットが、後から来るロボットの移動を邪魔するような位置に居座ってしまい、全体でのロボットの隊列形成を困難にしてしまう。そのようなことを避けるための各ロボットへの目標位置への割り振り方を探索により計算することを考えた場合、ロボットの台数の階乗分だけの目標位置への割り振り方が存在するため、計算にかかる負荷はロボットの台数とともに劇的に増加する。 Specifically, for example, in the problem of guiding a large number of robots to a plurality of target positions, even though a route for reaching one position can be obtained by search calculation, it is possible to determine which target for each robot. The problem is whether to allocate the position. If this target position allocation is not successful, a robot that has reached the target position early will stay in a position that hinders the movement of the robot that comes later, making it difficult to form a robot platoon as a whole. End up. In order to avoid such a situation, considering how to allocate the target position to each robot by searching, there is a way to allocate to the target position by the factorial of the number of robots. The load on the robot increases dramatically with the number of robots.

また、ロボットが互いに一辺を共有しつつ移動を行うための探索計算について考えた場合の探索計算も、各ロボットのお互いの移動を考慮したうえで行うために、必要な計算量はロボットの台数に対して指数関数的に増えてしまう。 In addition, the search calculation when considering the search calculation for the robots to move while sharing one side with each other also takes into account the mutual movement of each robot. On the other hand, it increases exponentially.

[二つの動作計画の導入]
そこで本実施形態では、これらの計算負荷の問題を解決するための方策の一つとして、ロボットの動作決定における探索計算を二つに分けることとする。ここでそのための準備として、ある隊列Gaをなしているロボットが、次になすべき隊列の形状としてGbを考える。隊列Gaと隊列Gbの間には、隊列Gaにて存在し、隊列Gbにて存在しない位置にあるロボットと、隊列Gbにて存在し隊列Gaにて存在しない位置にあるロボットがある（図４参照）。前者を尾部ロボットとし、後者を頭部ロボットとする。本実施形態では、ちょうど、頭部ロボットの位置にあるロボットが、ロボット群の移動において先頭を務め、尾部ロボットが、移動の最後尾を務めるロボットになるように、隊列Gaと隊列Gbとを定義する。 [Introduction of two action plans]
Therefore, in this embodiment, as one of the measures for solving these calculation load problems, the search calculation for determining the robot motion is divided into two. Here, as a preparation for this, a robot forming a certain column Ga considers Gb as the shape of the next column to be formed. Between the formations Ga and Gb, there are robots that exist in the formation Ga and that do not exist in the formation Gb, and robots that exist in the formation Gb and do not exist in the formation Ga (FIG. 4). reference). The former is a tail robot and the latter is a head robot. In the present embodiment, the formation Ga and the formation Gb are defined so that the robot at the position of the head robot is the first robot in the movement of the robot group and the tail robot is the last robot in the movement. To do.

この定義の元、一つ目の探索計算は、隊列Gaをなすロボットの群が、障害物を回避しつつも、目標位置まで移動するためには、次の隊列Gbとしてどの隊列を選択すればよいかを決定するための動作計画計算（以下「移動先隊列決定用動作計画」ともいう）である。この探索計算は、事実上、頭部ロボット位置に移動するロボットとして、隊列Gaのどのロボットを選択するかと、隊列Gaにおける尾部ロボットを決定するための計算である。 Based on this definition, the first search calculation shows that the group of robots that form the formation Ga is selected as the next formation Gb in order to move to the target position while avoiding obstacles. This is an operation plan calculation for deciding whether it is okay (hereinafter also referred to as “movement plan for movement target column determination”). This search calculation is actually a calculation for determining which robot of the formation Ga is selected as the robot moving to the head robot position and the tail robot in the formation Ga.

二つ目の探索計算は、ロボットが互いに接した位置関係を維持しつつも、隊列Gaから隊列Gbへの変形する際の各ロボットの動作を決定するための動作計画計算（以下「各ロボット動作用動作計画」ともいう）である。これは、事実上、移動先隊列決定用動作計画により決定済みの頭部ロボット及び尾部ロボットに対し、尾部ロボットの位置からロボットを追い出し、隊列Gbにおける頭部ロボットの位置へのロボット誘導のための探索計算である。 The second search calculation is an action plan calculation (hereinafter referred to as "each robot operation") that determines the operation of each robot when transforming from formation Ga to formation Gb while maintaining the positional relationship where the robots are in contact with each other. It is also called “operation plan”. In effect, for the head robot and tail robot that have already been determined by the movement plan for determining the destination platoon, the robot is expelled from the position of the tail robot and the robot is guided to the position of the head robot in the platoon Gb. Search calculation.

前者、後者ともに動的計画法を使用した探索計算を使用するが、ここでは、前者の頭部ロボットの決定のための探索計算ではマルコフ決定過程における動的計画法を利用する。また、前者の尾部ロボットの決定と、後者の探索計算ではロボット同士の隣接関係の情報を使用したダイクストラ法等の動的計画法を使用する。なお、前者の頭部ロボットを決定する際に用いる価値関数Q(s,a)（詳細は後述する）の計算は、ロボットの群れが開始位置からの移動を開始する前にあらかじめ行い、価値関数Q(s,a)を用いた実際の前者の頭部ロボットの決定ロボット移動中の実時間で行う。また、後者の探索計算とそれを使用した尾部ロボットの決定も、ロボット移動中の実時間で行う。 Search calculation using dynamic programming is used for both the former and the latter, but here, the dynamic calculation method in the Markov decision process is used in the search calculation for determining the head robot. Further, the determination of the former tail robot and the latter search calculation use a dynamic programming method such as a Dijkstra method using information on the adjacent relationship between the robots. Note that the value function Q (s, a) (details will be described later) used for determining the former head robot is calculated in advance before the group of robots starts moving from the start position. The real former head robot using Q (s, a) is performed in real time while moving the decision robot. The latter search calculation and the determination of the tail robot using it are also performed in real time while the robot is moving.

[2種類のボイド制御]
これらの二つの探索計算の計算負荷の軽減のため、本実施形態では、ボイド制御の考え方を導入する。まず、ここでいうボイドとは、あるロボットが別の位置に移動した後に、元いた位置にできる空隙のことである。別の言い方をすると、ボイドとは、ロボットの移動する方向と反対の方向に移動する仮想的な存在である。こうした群ロボットの隊列形成問題においては、複数のロボットの動作に着目するがゆえに、その探索計算量が爆発してしまうが、視点を変えて、ボイドの動きに着目すれば、多数のロボットの動作計画の問題を単一のボイドの動作計画として考えることができる。そのため、ボイド制御の考え方を導入することは、群ロボットの隊列形成問題における探索計算負荷の軽減に適している。 [Two types of void control]
In order to reduce the calculation load of these two search calculations, this embodiment introduces the concept of void control. First, the term “void” as used herein refers to a gap that can be restored to the original position after a robot has moved to another position. In other words, a void is a virtual entity that moves in a direction opposite to the direction in which the robot moves. In such a group robot formation problem, the amount of search calculation explodes because it focuses on the movement of multiple robots, but if you change the viewpoint and focus on the movement of voids, the movement of many robots The planning problem can be thought of as a single void motion plan. For this reason, introducing the concept of void control is suitable for reducing the search calculation load in the group formation problem of group robots.

まず、移動先隊列決定用動作計画において、ロボット群の移動の先頭を務める頭部ロボットの動作について考える。本実施形態では、目標隊列エリアGに一か所の入口位置Peを設けて、他の位置からは頭部ロボットが目標隊列エリアGに入ることはできないこととする。また、一度目標隊列エリアG内に入った全てのロボットは、二度と目標隊列エリアG外には出られないこととする。この条件のもと、目標隊列エリアG外にいる頭部ロボットがPeに至るまでのロボットの経路決定問題は、一台のロボットを入口位置Peまで誘導するための探索問題を解けばよく、得られた解を頭部ロボットの選択と誘導のために使用すればよい。目標隊列エリアG内の頭部ロボットの動作については、ボイドの動きに着目して、ボイドの誘導について同様のことを行う。すなわち、目標隊列エリアG内の頭部ロボットに望まれる動作とは、適切に頭部ロボットが入口位置Peから遠ざかり、目標隊列エリアG内の各場所に散らばっていきつつ目標隊列エリアGを埋めていくことである。これをボイドの動きとしてみれば、ロボットが入口位置Peから、目標隊列エリアG内に入るたびに、ボイドはそれに伴って、入口位置Peから目標隊列エリアGの外に出ていくのである。このとき、目標隊列エリアG内のボイドは、すべて一点の入口位置Peに向かって移動していくのが理想的な動作である。つまり、目標隊列エリアG内の頭部ロボットを目標隊列エリアGに適切に散らばるように誘導するためには、目標隊列エリアG内に位置する各ボイドが入口位置Peに至るための経路を探索計算すればよく、そうして求められたボイドの動作を頭部ロボットを動かすことで実現するようにすればよい。この考え方ならば、すべての位置にあるロボットが入口位置Peを目指して動作するための動作計画（価値関数Q(s,a)の計算）を一回行えばよいので、計算負荷が劇的に少なくなる。 First, consider the behavior of the head robot that acts as the head of the movement of the robot group in the movement plan for the movement target column determination. In this embodiment, it is assumed that one entrance position Pe is provided in the target platoon area G, and the head robot cannot enter the target platoon area G from other positions. In addition, all robots that have once entered the target platoon area G will never be able to leave the target platoon area G again. Under this condition, the robot routing problem until the head robot outside the target row area G reaches Pe can be obtained by solving the search problem for guiding one robot to the entrance position Pe. The obtained solution may be used for selection and guidance of the head robot. Regarding the movement of the head robot in the target row area G, the same is done for the guidance of the void, focusing on the movement of the void. That is, the desired behavior of the head robot in the target platoon area G is that the head robot appropriately moves away from the entrance position Pe and fills the target platoon area G while being scattered in each place in the target platoon area G. Is to go. If this is seen as the movement of the void, each time the robot enters the target platoon area G from the entrance position Pe, the void goes out of the target platoon area G from the entrance position Pe accordingly. At this time, it is an ideal action that all the voids in the target platoon area G move toward the single entrance position Pe. In other words, in order to guide the head robots in the target platoon area G so as to be properly scattered in the target platoon area G, search calculation is performed for each void located in the target platoon area G to reach the entrance position Pe. What is necessary is just to make it implement | achieve by moving a head robot so that the motion of the void calculated | required in that way may be carried out. With this concept, the operation load (calculation of the value function Q (s, a)) for the robots at all positions to move toward the entrance position Pe only needs to be performed once. Less.

続いて、隊列Gaを維持しているロボット群が、隊列Gbに変形することを考える。このとき、ロボット群が隊列Gaから隊列Gbに変形する動作とは、頭部ロボットの位置にいた一つのボイドが、尾部ロボットの位置に移動していくプロセスとしてとらえることができる（図４及び図５参照）。このようなボイドの動作は、頭部ロボットの位置をスタート位置とし、尾部ロボットの位置をゴール位置とした、一つのボイドの動作計画によって計算可能であり、計算負荷も小さい。この計算において考慮すべきボイドの移動の拘束条件としては、ボイドが移動する際に、ロボットが存在する位置をたどっていくという制限を設けるだけでよい。 Next, consider that the robot group that maintains the formation Ga transforms into the formation Gb. At this time, the movement of the robot group from the formation Ga to the formation Gb can be regarded as a process in which one void at the position of the head robot moves to the position of the tail robot (FIGS. 4 and 4). 5). Such a motion of the void can be calculated by a motion plan of one void with the position of the head robot as the start position and the position of the tail robot as the goal position, and the calculation load is small. As a constraint condition for the movement of the void to be considered in this calculation, it is only necessary to provide a restriction that the robot follows the position where the robot exists when the void moves.

[4マスロボット単位の導入]
さらに、本実施形態では、図６に示すように、田の字状に隣接した4つのロボットを一つの単位とし（以下、「ロボット単位」ともいう）、ロボットは、この田の字型のロボット単位を維持しつつ移動を行うこととする。言い換えると、四台毎に１つのロボット単位を構成し、１つのロボット単位を構成する四台のロボットはそれぞれ２つの方向において１つのロボット単位を構成する他のロボットと隣接した状態を維持しつつ移動を行う。このロボット単位の集団は、互いにロボット単位ごとに一辺を共有し、接しながら移動をするように制御される。 [Introduction of 4 mass robot units]
Further, in the present embodiment, as shown in FIG. 6, four robots adjacent to the rice field are defined as one unit (hereinafter also referred to as “robot unit”), and the robot is a rice field robot. Move while maintaining the unit. In other words, every four robots constitute one robot unit, and each of the four robots constituting one robot unit maintains a state of being adjacent to another robot constituting one robot unit in two directions. Move. This group of robot units is controlled so as to share one side and move while contacting each other.

このような４つのロボットを一つの単位とした移動を行う理由としては、このような状態で移動を行う限り、ロボットの群れの中のいずれのロボットが一台欠けても、各ロボットはお互いに一つの辺で接しあう位置関係を崩さずに済むからである。すなわちこれは、各ロボット動作用動作計画において、ロボット同士の接続を考慮するための計算負荷を軽減することにつながるからである。 The reason why such four robots are moved as one unit is that, as long as the robot moves in such a state, each robot will be able to This is because it is not necessary to destroy the positional relationship that touches on one side. That is, this is because the calculation load for considering the connection between the robots is reduced in each robot operation plan.

すなわち、先に述べた移動先隊列決定用動作計画は、4台のロボットがなすロボット単位がなす群れの移動について行う。先に述べた頭部ロボット、尾部ロボットは、それぞれ4台のロボットで構成される頭部ロボット単位、尾部ロボット単位となる。各ロボット動作用動作計画は、頭部ロボット単位に含まれる４つのロボット位置に、ロボットを誘導し、尾部ロボット単位位置に含まれるロボットから、４つのロボットを追い払うための動作を探索計算する。 That is, the movement plan for the movement destination platoon determination described above is performed for the movement of a group made up of four robots. The head robot and tail robot described above are in units of head robot and tail robot each composed of four robots. Each motion plan for robot operation guides the robot to four robot positions included in the head robot unit, and searches and calculates an operation for driving off the four robots from the robot included in the tail robot unit position.

ここでは4台のロボットがなすロボット単位に対応する四つのマスが一つのマスの単位（以下「マス単位」ともいう）であるとし、一つのマス単位を一状態として状態空間を組む。言い換えると、ロボット単位の位置を（Xr_unit[j],Yr_unit[j]）（j=0,1,2,…j_max-1）としたとき、そのロボット単位に所属するロボットをi1,i2,i3,i4とすれば、
Xr[i1] = 2 ×Xr_unit[j]
Yr[i1] = 2 ×Yr_unit[j]
Xr[i2] = 2 ×Xr_unit[j] + 1
Yr[i2] = 2 ×Yr_unit[j]
Xr[i3] = 2 ×Xr_unit[j]
Yr[i3] = 2 ×Yr_unit[j] + 1
Xr[i4] = 2 ×Xr_unit[j] + 1
Yr[i4] = 2 ×Yr_unit[j] + 1
である。なお、ロボットの全体数pが、4の倍数でない場合には、端数のロボットが存在することになるが、それらのロボットは、皆、移動先隊列決定用動作計画においては切り捨てられ（存在しないものとして扱い）、各ロボット動作用動作計画においては、尾部ロボット単位のメンバとして扱われる。 Here, it is assumed that four squares corresponding to the robot units formed by the four robots are one square unit (hereinafter also referred to as “mass unit”), and a state space is formed with one square unit as one state. In other words, when the position of the robot unit is (Xr_unit [j], Yr_unit [j]) (j = 0,1,2, ... j_max-1), the robots belonging to that robot unit are i1, i2, i3 , i4,
Xr [i1] = 2 × Xr_unit [j]
Yr [i1] = 2 × Yr_unit [j]
Xr [i2] = 2 × Xr_unit [j] + 1
Yr [i2] = 2 × Yr_unit [j]
Xr [i3] = 2 × Xr_unit [j]
Yr [i3] = 2 × Yr_unit [j] + 1
Xr [i4] = 2 × Xr_unit [j] + 1
Yr [i4] = 2 × Yr_unit [j] + 1
It is. If the total number of robots p is not a multiple of 4, there will be fractional robots, but all of these robots will be rounded down in the movement plan for moving to the target team (not present) In the motion plan for each robot operation, it is treated as a member of the tail robot unit.

ロボット単位の頭部ロボット決定用の目標位置の集合をGU（以下「目標隊列エリア単位GU」ともいう）とする。目標隊列エリア単位GUは、ロボット単位と同様に4つのロボット位置が一単位となったマス単位で構成され、図７、図８で示すように、目標隊列エリア単位GUを構成するマスが、目標隊列エリアGをはみ出さないように設定される。このとき、目標隊列エリアGに属するが目標隊列エリア単位GUに属さないロボット目標位置（4マスを構成できない端数のロボット目標位置）が存在する場合がある。なお、目標隊列エリアGは以下の２つの条件を満たすことを前提とする。(1)目標隊列エリアGが目標隊列エリア単位GUから飛び出している部分は目標隊列エリア単位GUに隣接するマス単位に含まれるものとする（図７参照）。(2)目標隊列エリアGが目標隊列エリア単位GUから飛び出している部分は、かぎ型のように飛び出したL字型とはならないものとする（図７参照）。このように、目標隊列エリアG及び目標隊列エリア単位GUを設定するため、移動先隊列決定用動作計画にて使用するロボット単位の総数は、目標隊列エリア単位GU内のロボット単位の数と同じかそれ以上となる。よって、目標隊列エリアGに属するが目標隊列エリア単位GUに属さないロボット目標位置の個数が４を超えることは起こりうるが、本実施形態はそのような場合でも使用可能である。すべての目標隊列エリア単位GU内にロボットが満たされたら、移動先隊列決定用動作計画はそれ以降は行わない。それ以降は、各ロボット動作決定用動作計画によって、目標隊列エリア単位GUに属さないが目標隊列エリアGに属する目標位置へロボットを埋めるための動作が計算される。 A set of target positions for determining a head robot in units of robots is defined as GU (hereinafter also referred to as “target unit area unit GU”). The target platoon area unit GU is composed of mass units in which four robot positions are united in the same manner as the robot unit. As shown in FIGS. 7 and 8, the masses constituting the target platoon area unit GU are the target units. It is set so as not to protrude in the convoy area G. At this time, there may be a robot target position that belongs to the target platoon area G but does not belong to the target platoon area unit GU. It is assumed that the target formation area G satisfies the following two conditions. (1) The portion where the target row area G protrudes from the target row area unit GU is included in the mass unit adjacent to the target row area unit GU (see FIG. 7). (2) The portion where the target row area G protrudes from the target row area unit GU is not an L-shape that protrudes like a hook shape (see FIG. 7). In this way, in order to set the target platoon area G and the target platoon area unit GU, is the total number of robot units used in the movement platoon determination action plan equal to the number of robot units in the target platoon area unit GU? More than that. Therefore, although the number of robot target positions belonging to the target platoon area G but not belonging to the target platoon area unit GU may exceed four, the present embodiment can be used even in such a case. If the robots are filled in all the target platoon area units GU, the movement plan for the movement platoon determination is not performed thereafter. Thereafter, an operation for filling the robot in a target position that does not belong to the target platoon area unit GU but belongs to the target platoon area G is calculated by each robot movement determination operation plan.

[移動先隊列決定用動作計画のためのMDP状態空間]
移動先隊列決定用動作計画での探索計算用のマルコフ状態空間を、一つのロボット単位の状態変数のみで構成することにする。すなわち、
状態変数s=（Xr_unit,Yr_unit），行動変数a∈{0,1,2,3,4}
である。頭部ロボット単位の選択と誘導は、この状態変数を引数とした一つの価値関数Q(s,a)を使用して行動決定を行う。価値関数Q（s,ａ）の計算は、動的計画法を使用して、任務の事前に行うものとする。入口位置Peを含む入口位置単位をPeUとし（図８参照）、動的計画法にて計算を行う際に、報酬の値は、入口位置単位PeUに頭部ロボット単位が移動した場合において高報酬値１が与えられ、その他は０の報酬が与えられる。 [MDP state space for action plan for determining the target team]
The Markov state space for search calculation in the movement plan for determining the movement target platoon is composed of only one state variable for each robot. That is,
State variable s = (Xr_unit, Yr_unit), action variable a∈ {0,1,2,3,4}
It is. For selection and guidance of the head robot unit, action determination is performed using one value function Q (s, a) with this state variable as an argument. The calculation of the value function Q (s, a) shall be done in advance of the task using dynamic programming. When the entrance position unit including the entrance position Pe is PeU (see FIG. 8) and the calculation is performed by dynamic programming, the reward value is high when the head robot unit moves to the entrance position unit PeU. The value 1 is given and the others are given 0 reward.

すなわち、価値関数Q（s,ａ）を計算するために必要なマルコフ状態空間での状態遷移確率と報酬値の設定は、行動aにより、状態がsからｓ'に移動したとして、その遷移確率をP(s'|s,a)とすると、例えば、設定方法の一例として
(1)sが障害物でなく、s'が行動aの行先のマス単位のとき、
P(s'|s,a)←1
(2)sが障害物でなく、s'が行動aの行先のマス単位でないとき、
P(s'|s,a)←0
(3)sが障害物のときで、s'=sのとき、
P(s'|s,a)←1
(4)sが障害物のときで、s'=sではないとき、
P(s'|s,a)←0
のように設定される。なお、マス単位の中に1つでも障害物が含まれる場合には、そのロボット単位の状態sを障害物とする。報酬Rについては、
If s'= PeUthen R(s',s, a) ← r_plus
Else If (sがGU内かつｓ'がGU外) then R(s',s, a) ← -r_minus
Else If (sがGU外かつｓ'がGU内でかつs'=PeUでない) then R(s',s, a) ← -r_minus
Else R(s',s, a) ← 0
である。r_minusとr_plusはともに正値だが、r_minus > 10 × r_plusとするのが望ましい。頭部ロボット単位が障害物には入れず、また、入口位置単位PeU以外から目標隊列エリア単位GUに入れず、目標隊列エリア単位GU内からは出られないという条件が記述できていれば、遷移確率と報酬の設定方法はこの限りではない。すべてのマス単位にあるロボット単位が入口位置単位PeUを目指して動作するための動作計画（価値関数Q(s,a)の計算）を一回行えばよいので、前述の通り、計算負荷が劇的に少なくなる。また、マス単位の個数は、マスの個数の4分の1となるため、状態の個数が減り計算負荷が少なくなる。 In other words, the state transition probability and reward value setting in the Markov state space necessary for calculating the value function Q (s, a) are determined by assuming that the state has moved from s to s ′ by action a. Is P (s' | s, a), for example, as an example of the setting method
(1) When s is not an obstacle and s' is the unit of mass of the destination of action a,
P (s' | s, a) ← 1
(2) When s is not an obstacle and s' is not a unit of mass for the destination of action a,
P (s' | s, a) ← 0
(3) When s is an obstacle and s' = s,
P (s' | s, a) ← 1
(4) When s is an obstacle and not s' = s,
P (s' | s, a) ← 0
It is set like this. In addition, when even one obstacle is included in the mass unit, the state s of the robot unit is set as the obstacle. For reward R,
If s '= Pe U then R (s', s, a) ← r_plus
Else If (s is in GU and s 'is out of GU) then R (s', s, a) ← -r_minus
Else If (s is out of GU and s' is in GU and s' = not PeU) then R (s', s, a) ← -r_minus
Else R (s', s, a) ← 0
It is. r_minus and r_plus are both positive values, but it is desirable that r_minus> 10 x r_plus. If the condition that the head robot unit does not enter the obstacle, can not enter the target platoon area unit GU other than the entrance position unit PeU, and can not be exited from within the target platoon area unit GU, transition is possible Probability and reward setting method is not limited to this. Since the robot unit in all the mass units only needs to perform the motion plan (calculation of the value function Q (s, a)) for the movement toward the entrance position unit PeU, the calculation load is dramatic as described above. Less. Moreover, since the number of square units is a quarter of the number of squares, the number of states is reduced and the calculation load is reduced.

通常は、各状態sにある頭部ロボット単位が、Q（s,a）を使用して行動選択を行う場合は、各状態sにおいて、価値関数Q（s,ａ）の値を最大にする行動aを選択するが、以上に提示した条件設定で、価値関数Q（s,ａ）を計算した場合、目標隊列エリア単位GU外にいる頭部ロボット単位も目標隊列エリア単位GU内にいる頭部ロボット単位もともに、入口位置単位PeUを目指して移動することになる。そこで、本実施形態では以下のルールにより行動選択を行うものとする。この行動方策を行動方策Policy_Gとする。 Normally, when the head robot unit in each state s performs action selection using Q (s, a), the value of the value function Q (s, a) is maximized in each state s. If action a is selected, but the value function Q (s, a) is calculated with the conditions set forth above, the head robot unit outside the target platoon area unit GU is also within the target platoon area unit GU. Both the robot units move toward the entrance position unit PeU. Therefore, in this embodiment, it is assumed that an action is selected according to the following rules. This action policy is defined as action policy Policy_G.

（行動方策Policy_G）
(1)ｓがGU内のとき、頭部ロボット単位は、他のロボット単位の位置や障害物位置に移動しない行動であって、かつ、GU外に出ていかない行動であり、かつ、Q(s,a)<Q(s,0)を満たす行動aの中でQ(s,a)を最小化する行動を選択する。そのような行動がない場合は行動a=0（静止）を選択する。
(2)ｓがGU外のとき、頭部ロボット単位は、他のロボットの位置や障害物位置に移動しない行動であって、かつ、Q(s,a) > Q(s,0)を満たす行動aの中でQ(s,a)を最大化する行動を選択する。そのような行動がない場合は行動a=0（静止）を選択する。 (Action Policy Policy_G)
(1) When s is within GU, the head robot unit is an action that does not move to the position of other robot units or the position of an obstacle, and does not go out of the GU, and Q ( An action that minimizes Q (s, a) is selected from actions a that satisfy s, a) <Q (s, 0). If there is no such action, action a = 0 (still) is selected.
(2) When s is outside GU, the head robot unit is an action that does not move to the position of another robot or an obstacle, and satisfies Q (s, a)> Q (s, 0) Select the action that maximizes Q (s, a) among the actions a. If there is no such action, action a = 0 (still) is selected.

このようにすることで、一度、(1)によりGU内に入った頭部ロボット単位は、入口位置単位PeUからなるべく離れた（遠い他の）GU内の位置に移動する。一方、(2)により、GU外の頭部ロボット単位は、入口位置単位PeUに向かって移動し、入口位置単位PeUよりGU内に入ろうとする行動をとるようにできる。また、他のロボットや障害物との衝突も避けることができる。なお(1)において、条件を満たす行動aが二つ以上存在するとき、その行動aにより移動した先のロボット単位が移動元ロボット単位以外のロボット単位と接していないならば、そのようなロボット単位に移動する行動aを優先出力する。 In this way, the head robot unit that has once entered the GU according to (1) moves to a position in the GU as far as possible from the entrance position unit PeU as far as possible. On the other hand, according to (2), the head robot unit outside the GU moves toward the entrance position unit PeU and can take an action to enter the GU from the entrance position unit PeU. Also, collisions with other robots and obstacles can be avoided. In addition, in (1), when there are two or more actions a that satisfy the condition, if the destination robot unit moved by the action a is not in contact with a robot unit other than the source robot unit, such robot unit Give priority to action a moving to.

（頭部決定処理Head_Robot_Decision）
以上の価値関数Q（s,ａ）、行動方策Policy_Gを用いて、ある隊列Gaをなすロボット単位の群が、つぎに移動すべき隊列Gbの頭部ロボット単位を決定する方法は以下のとおりである。なお、本処理を、頭部決定処理Head_Robot_Decisionともいう。
(1)GU内の各ロボット単位jについて、ロボット単位jの位置sにおける全行動aの中でのQ(s,a)の最小値Qmin(j)を計算する。
(2)GU内の各ロボット単位jについて、Qmin(j)の値が小さい順に順位付けを行う。
(3)GU外の各ロボット単位jについて、ロボット単位jの位置sにおける全行動aの中でのQ(s,a)の最大値Qmax(j)を計算する。
(4)GU外の各ロボット単位jについて、Qmax(j)の値が大きい順に順位付けを行う。
(5)GU外の各ロボット単位jの順位値にGU内にあるロボット単位数を加算する。
(6)変数head_num←0とする。
(7) (6)までで計算した順位値の第head_num位の、ロボット単位jを選択し、j_head←jとする。ロボット単位j_headの行動a_headを行動方策Policy_Gにより計算する。a_head=0でない場合、ロボット単位j_headを頭部ロボット単位位置に移動するロボット単位として選択する。ロボット単位j_headを行動a_headによって移動させた先の位置を、頭部ロボット単位位置とし、頭部ロボット単位をj_newとする。a_head=0のとき、head_numをインクリメントし、(7)を繰り返す。 (Head determination process Head_Robot_Decision)
Using the above value function Q (s, a) and action policy Policy_G, the method of determining the head robot unit of the row Gb to which the group of robots constituting the row Ga should move next is as follows. is there. This process is also referred to as a head determination process Head_Robot_Decision.
(1) For each robot unit j in the GU, the minimum value Qmin (j) of Q (s, a) in all actions a at the position s of the robot unit j is calculated.
(2) For each robot unit j in the GU, ranking is performed in ascending order of Qmin (j).
(3) For each robot unit j outside the GU, the maximum value Qmax (j) of Q (s, a) in all actions a at the position s of the robot unit j is calculated.
(4) For each robot unit j outside the GU, ranking is performed in descending order of Qmax (j).
(5) Add the number of robot units in the GU to the rank value of each robot unit j outside the GU.
(6) Set the variable head_num ← 0.
(7) Select the robot unit j at the head_num position of the rank value calculated in (6) above, and set j_head ← j. The behavior a_head of the robot unit j_head is calculated by the behavior policy Policy_G. If a_head = 0 is not satisfied, the robot unit j_head is selected as the robot unit that moves to the head robot unit position. The position where the robot unit j_head is moved by the action a_head is set as the head robot unit position, and the head robot unit is set as j_new. When a_head = 0, increment head_num and repeat (7).

以上の処理では、GU外にしかロボット単位が存在しないときになるべくPeUに近いロボット単位か、もしくは、GU内にロボット単位が存在するときには、なるべくQ_min(j)の値の小さいロボット単位のうちで、行動方策Policy_Gにより選択される行動が静止ではないものを頭部ロボット単位として選んでいる。 In the above processing, when the robot unit exists only outside the GU, the robot unit is as close to PeU as possible, or when there is a robot unit within the GU, the robot unit with the smallest Q_min (j) value The action selected by the action policy Policy_G is selected as a head robot unit that is not stationary.

（ロボット単位群内序列決定処理Unit_Joretu_Decision）
続いて、尾部ロボット単位位置を決定するための手法を述べる。まず、各ロボット単位が、頭部ロボット単位からどれだけの距離離れているかを計算する。この計算はダイクストラ法などの動的計画法で計算可能であるが、本実施形態で使用する方法を一例として以下に述べる。なお、本処理をロボット単位群内序列決定処理Unit_Joretu_Decisionともいう。
(1)各ロボット単位にて、上下左右４方向に隣接するロボット単位の番号を調べる。ロボット単位jの行動aの移動方向にロボット単位j_nextが隣接して存在する場合、next_u[a][j]変数の値をj_nextに設定する（next_u[a][j]←j_next）。ロボットが存在しないときは、next_u[a][j]変数の値をj_maxに設定する（next_u[a][j]←j_max）。ただし、j_maxはロボット単位の個数である。
(2)状態空間内（Xr_unit,Yr_unit）の各位置の序列変数値Joretu[Xr_unit][Yr_unit]の値をj_maxに初期化する(Joretu[Xr_unit][Yr_unit]←j_max)。
(3)ロボット単位j_headの位置の序列変数値Joretu[Xr_unit[j_head]][Yr_unit[j_head]]の値をj_minに初期化する（Joretu[Xr_unit[j_head]][Yr_unit[j_head]]←j_min）。ただし、j_minの値は、例えば、j_min=0とする。
(4)ロボット単位j_head以外の、各ロボット単位jの、全行動aについて、next_u[a][j]の値がj_maxではない場合の、Joretu[Xr_unit[next_u[a][j]]][Yr_unit[next_u[a][j]]]の値を調べ、その最小値に１を加えた値が、現在のJoretu[Xr_unit[j]][Yr_unit[j]]よりも小さい場合は、その値をJoretu[Xr_unit[j]][Yr_unit[j]]に代入する(Joretu[Xr_unit[j]][Yr_unit[j]] ←Joretu[Xr_unit[next_u[a][j]]][Yr_unit[next_u[a][j]]]+1)。
(5) (4)の処理にて、Joretu値の更新がなくなるまで、(4)を繰り返す。この処理により、ロボット単位j_headから離れた位置にあるロボット単位jの序列変数値Joretu[Xr_unit[j]][Yr_unit[j]]ほど大きな値となるように更新される（図９参照）。なお、図９の丸括弧内の数値は序列変数値Joretu[Xr_unit[j]][Yr_unit[j]]を表す。 (Robot unit group order determination processing Unit_Joretu_Decision)
Next, a method for determining the tail robot unit position will be described. First, how far each robot unit is from the head robot unit is calculated. This calculation can be performed by a dynamic programming method such as the Dijkstra method. The method used in this embodiment will be described below as an example. This process is also referred to as a robot unit group order determination process Unit_Joretu_Decision.
(1) For each robot unit, check the number of the robot unit adjacent in the four directions of up, down, left and right. When the robot unit j_next is adjacent to the movement direction of the action a of the robot unit j, the value of the next_u [a] [j] variable is set to j_next (next_u [a] [j] ← j_next). When there is no robot, the value of the next_u [a] [j] variable is set to j_max (next_u [a] [j] ← j_max). However, j_max is the number of robot units.
(2) The value of the ordinal variable value Joretu [Xr_unit] [Yr_unit] at each position in the state space (Xr_unit, Yr_unit) is initialized to j_max (Joretu [Xr_unit] [Yr_unit] ← j_max).
(3) The order variable value of the robot unit j_head is initialized to the value of Joretu [Xr_unit [j_head]] [Yr_unit [j_head]] to j_min (Joretu [Xr_unit [j_head]] [Yr_unit [j_head]] ← j_min) . However, the value of j_min is, for example, j_min = 0.
(4) Joretu [Xr_unit [next_u [a] [j]]] [when the value of next_u [a] [j] is not j_max for all actions a of each robot unit j other than the robot unit j_head Check the value of Yr_unit [next_u [a] [j]]], and if the minimum value plus 1 is smaller than the current Joretu [Xr_unit [j]] [Yr_unit [j]] Is assigned to Joretu [Xr_unit [j]] [Yr_unit [j]] (Joretu [Xr_unit [j]] [Yr_unit [j]] ← Joretu [Xr_unit [next_u [a] [j]] [Yr_unit [next_u [ a] [j]]] + 1).
(5) Repeat (4) until Joretu value is no longer updated in the process of (4). By this processing, the order variable value Joretu [Xr_unit [j]] [Yr_unit [j]] of the robot unit j located at a position away from the robot unit j_head is updated so as to become a larger value (see FIG. 9). In addition, the numerical value in the parenthesis of FIG. 9 represents an ordinal variable value Joretu [Xr_unit [j]] [Yr_unit [j]].

（尾部ロボット決定処理Tail_robot_Decision）
以上の、ロボット単位群内序列決定処理Joretu_Decisionが済んだうえで、Joretu[Xr_unit[j]][Yr_unit[j]]が最大となる位置にあるロボット単位を尾部ロボット単位j_tailとする。なお、尾部ロボット単位を決定する処理を尾部ロボット決定処理Tail_robot_Decisionともいう。以上の処理では、頭部ロボット単位からもっとも離れたロボット単位を尾部ロボット単位として設定している。 (Tail robot decision processing Tail_robot_Decision)
The robot unit group order determination process Joretu_Decision is completed, and the robot unit at the position where Joretu [Xr_unit [j]] [Yr_unit [j]] is maximized is defined as the tail robot unit j_tail. The process for determining the tail robot unit is also referred to as a tail robot determination process Tail_robot_Decision. In the above processing, the robot unit farthest from the head robot unit is set as the tail robot unit.

（移動先隊列決定用動作計画処理Next_Formation_Decision）
以上の処理をまとめて、以下のように、移動先隊列決定用動作計画処理Next_Formation_Decisionが行われる。
(1)頭部決定処理Head_Robot_Decisionを行う。
(2)ロボット単位群内序列決定処理Unit_Joretu_Decisionを行う。
(3)尾部ロボット決定処理Tail_robot_Decisionを行う。
(4)ロボット単位j_headが行動a_headを行った後の、移動後の位置に新たな頭部ロボット単位j_newを設定し、(3)にて、尾部ロボット単位位置が特定できたので、現在の隊列Gaから尾部ロボット位置j_tailのロボット単位を削除し、隊列Gaに頭部ロボット単位j_newを追加し、隊列Gbとする（図９参照）。 (Operation plan processing Next_Formation_Decision)
The above process is summarized, and the movement plan determination operation plan process Next_Formation_Decision is performed as follows.
(1) The head determination process Head_Robot_Decision is performed.
(2) Perform the order determination process Unit_Joretu_Decision in the robot unit group.
(3) Tail robot determination processing Tail_robot_Decision is performed.
(4) After the robot unit j_head performs the action a_head, a new head robot unit j_new is set at the moved position, and the tail robot unit position can be specified in (3). The robot unit at the tail robot position j_tail is deleted from Ga, and the head robot unit j_new is added to the formation Ga to form a formation Gb (see FIG. 9).

[簡略化された移動先隊列決定用動作計画処理]
なお、移動先隊列決定用動作計画処理Next_Formation_Decisionは、以下のように簡略化してもよい。
(1)現時点で、４つのロボットが収まっているロボット単位位置を特定し、隊列Gaのロボット単位の要素として設定する。
(2)頭部決定処理Head_Robot_Decisionを行う。
(3)ロボット単位j_headが行動a_headを行った後の、移動後の位置に新たな頭部ロボット単位j_newを設定し、頭部ロボット単位j_newとロボット単位j_headとからなるロボット単位隊列を、隊列Gbとする（図１０参照）。なお、隊列Gbに含まれれないロボットの群れを尾部ロボット群G_tailという。この時点では、尾部ロボット群G_tailに含まれるどのロボットが削除されるかは決定していない。 [Simplified action plan process for determining the destination column]
Note that the movement plan determination operation plan process Next_Formation_Decision may be simplified as follows.
(1) At present, the robot unit position where the four robots are accommodated is specified and set as an element of the robot unit of the formation Ga.
(2) The head determination process Head_Robot_Decision is performed.
(3) After the robot unit j_head performs the action a_head, a new head robot unit j_new is set at the position after the movement, and the robot unit column composed of the head robot unit j_new and the robot unit j_head is defined as the column Gb. (See FIG. 10). A group of robots not included in the convoy Gb is referred to as a tail robot group G_tail. At this time, it has not been determined which robot included in the tail robot group G_tail is to be deleted.

後述する各ロボット動作決定用動作計画においては、ロボット単位j_headと頭部ロボット単位j_new以外は、直接計算に使用されないので、省いて構わない。その代り、簡略化された移動先隊列決定用動作計画処理における(1)の処理にて、毎回、4つのロボットで埋まっているロボット単位の特定を行う。 In each robot motion determination motion plan to be described later, the robot unit j_head and the head robot unit j_new are not directly used for the calculation and may be omitted. Instead, the robot unit buried by the four robots is identified each time in the process (1) in the simplified movement plan determination process.

[各ロボット動作決定用動作計画]
移動先隊列決定用動作計画においてロボット単位j_headと頭部ロボット単位j_newが決定され、次の隊列Gbが決定されたので、現隊列Gaから次隊列Gbへの移動を各ロボットの移動によって実現するための探索計算を各ロボット動作決定用動作計画にて行う。 [Operation plan for determining each robot operation]
Since the robot unit j_head and the head robot unit j_new are determined in the movement plan for determining the movement target column, and the next column Gb is determined, the movement from the current column Ga to the next column Gb is realized by the movement of each robot. Search calculation is performed in each robot motion determination motion plan.

まず、各ロボット動作決定用動作計画においては、隊列Gb内のどのロボット単位にも属さない端数のロボットを特定し、それらを尾部ロボット群G_tailに属するロボットとする（図１０参照）。 First, in each robot motion determination motion plan, fractional robots that do not belong to any robot unit in the formation Gb are specified, and those robots belong to the tail robot group G_tail (see FIG. 10).

本実施形態では、4つのロボットを組にしたロボット単位を維持しつつロボット群が移動を行うようにしているので、ボイド制御を行う際の、各ロボットの接続維持を考える際には、尾部ロボット単位j_tail(前述の通り、各ロボット動作決定用動作計画における尾部ロボット単位は、何れのロボット単位にも属さない端数のロボットをメンバとする)と頭部ロボット単位j_new付近でのボイドの動きにさえ気を配れば、それ以外は、ボイドがロボット群内のどの位置にあるかを考慮する必要はない。 In this embodiment, since the robot group moves while maintaining the robot unit of four robots as a set, when considering the connection maintenance of each robot when performing void control, the tail robot The unit j_tail (as described above, the tail robot unit in each robot motion determination motion plan is a fractional robot that does not belong to any robot unit) and even the movement of voids near the head robot unit j_new Other than that, there is no need to consider where the void is in the robot group.

各ロボット動作決定用動作計画においては、以下の２つのモードが存在する。
モード１：GUがロボットで満たされていない場合。
モード２：GUがロボットで満たされている場合。
（モード１）
まず、モード１のGUがロボットで満たされていない場合の各ロボット動作決定用動作計画の動作について述べる。 In each robot motion determination motion plan, the following two modes exist.
Mode 1: GU is not filled with robots.
Mode 2: When GU is filled with robots.
(Mode 1)
First, the operation of each robot motion determination motion plan when the mode 1 GU is not satisfied by the robot will be described.

まず、ロボット単位j_headに所属するロボットのうち、頭部ロボット単位j_newに接した位置にある２つのロボットをロボットi_move_11,i_move_21とし(図１１参照)、ロボットi_move_11,i_move_21と行動a_headの移動方向の逆方向に接しているロボットをそれぞれ、ロボットi_move_12,i_move_22とする。頭部ロボット単位j_newに所属し、かつ、ロボットi_move_11,i_move21に、行動a_headの方向で接するロボットがある場合は、そのロボットをそれぞれi_move_101,i_move_201とし, ロボットi_move_101,i_move201の位置に、行動a_headの方向で接する位置にロボットがある場合は、そのロボットをそれぞれi_move_102,i_move_202とする。モード１の動作は、以下の手順で行われる。 First, of the robots belonging to the robot unit j_head, the two robots in contact with the head robot unit j_new are designated as robots i_move_11 and i_move_21 (see FIG. 11), and the movement directions of the robots i_move_11 and i_move_21 and the action a_head are reversed. The robots in contact with the directions are referred to as robots i_move_12 and i_move_22, respectively. If the robot i_move_11, i_move21 belongs to the head robot unit j_new and there is a robot that touches in the direction of action a_head, the robots are set to i_move_101, i_move_201, respectively, and the direction of action a_head at the position of robot i_move_101, i_move201 If there is a robot at the position where it touches, i_move_102 and i_move_202 are assigned to the robots, respectively. The operation in mode 1 is performed according to the following procedure.

（手順１）ロボットi_move_101,i_move102の両方が存在する場合は（図１１参照、なお、図１１においては破線で示したロボットi_move_101,i_move_102が存在する場合）、手順７へ。その他の場合で、ロボットi_move_101が存在する場合は（図１２Ａ参照）、ロボットi_move_11とロボットi_move12とロボットi_move101をa_head方向に移動する（図１２Ｂ参照）。その他の場合で、ロボットi_move_101が存在しない場合は（図１３Ａ）、ロボットi_move_11とロボットi_move12をa_head方向に移動する（図１３Ｂ）。 (Procedure 1) When both of the robots i_move_101 and i_move102 exist (see FIG. 11, the robots i_move_101 and i_move_102 indicated by broken lines in FIG. 11), go to the procedure 7. In other cases, when the robot i_move_101 exists (see FIG. 12A), the robot i_move_11, the robot i_move12, and the robot i_move101 are moved in the a_head direction (see FIG. 12B). In other cases, when the robot i_move_101 does not exist (FIG. 13A), the robot i_move_11 and the robot i_move12 are moved in the a_head direction (FIG. 13B).

（手順２）手順１で生じたボイド（ロボットi_move12の元あった位置）を位置void_endとする（図１２Ｂ、図１３Ｂ参照）。 (Procedure 2) The void generated in the procedure 1 (the position where the robot i_move12 was originally) is set as a position void_end (see FIGS. 12B and 13B).

（手順３）位置void_endを、ロボットi_move_11とロボットi_move12以外のロボットで埋めていくことで、尾部ロボット群G_tailに所属するロボットのうち、もっとも位置void_endから離れた位置のロボットを現在位置より、1マス分位置void_endに近づける。例えば、図１３Ｂから図１４Ａに遷移する。 (Procedure 3) By filling the position void_end with a robot other than the robot i_move_11 and robot i_move12, the robot farthest from the position void_end among the robots belonging to the tail robot group G_tail is Move closer to the minute position void_end. For example, the transition is made from FIG. 13B to FIG. 14A.

（手順４）ロボットi_move_101かロボットi_move_102が存在する場合は（図１２、図１３参照、なお、図１３においては破線で示したロボットi_move_102が存在する場合）、手順７へ。どちらも存在しない場合は（図１４Ａ参照）、ロボットi_move_11とロボットi_move12と、「ロボットi_move12に、行動a_headの移動方向と逆の方向で接するロボット」を、a_head方向に移動する（図１４Ｂ参照）。 (Procedure 4) When the robot i_move_101 or the robot i_move_102 exists (see FIGS. 12 and 13, the robot i_move_102 indicated by a broken line in FIG. 13 exists), go to the procedure 7. If neither exists (see FIG. 14A), the robot i_move_11, the robot i_move12, and the “robot in contact with the robot i_move12 in the direction opposite to the moving direction of the action a_head” are moved in the a_head direction (see FIG. 14B).

（手順５）手順４で生じたボイド（手順１実行前にロボットi_move12の元あった位置）を、位置void_endとする。 (Procedure 5) The void generated in the procedure 4 (the position where the robot i_move12 was generated before the execution of the procedure 1) is set as a position void_end.

（手順６）位置void_endを、ロボットi_move_11とロボットi_move12以外のロボットで埋めていくことで、尾部ロボット群G_tailに所属するロボットのうち、もっとも位置void_endから離れた位置のロボットを現在位置より、1マス分位置void_endに近づける。 (Procedure 6) By filling the position void_end with a robot other than the robot i_move_11 and robot i_move12, the robot farthest from the position void_end among the robots belonging to the tail robot group G_tail is Move closer to the minute position void_end.

以上、手順１から手順６により、頭部ロボット単位j_newの下側のマスをロボットで埋める。同様の処理を、手順７から手順１２で行い、頭部ロボット単位j_newの上側のマスをロボットで埋める。頭部ロボット単位j_newの上側のマスにおいて同様の処理を行えばよいため、図を省略する。 As described above, the lower mass of the head robot unit j_new is filled with the robot by the procedure 1 to the procedure 6. A similar process is performed in steps 7 to 12, and the upper square of the head robot unit j_new is filled with the robot. Since the same process may be performed on the upper square of the head robot unit j_new, the illustration is omitted.

（手順７）ロボットi_move_201,i_move202の両方が存在する場合は（図１１参照）、モード１の各ロボット動作決定用動作計画を終了する。頭部ロボット単位j_newが4台のロボットで埋まったことを意味する。 (Procedure 7) When both the robots i_move_201 and i_move202 exist (see FIG. 11), the operation plans for determining robot operations in mode 1 are terminated. This means that the head robot unit j_new is filled with 4 robots.

その他の場合で、ロボットi_move_201が存在する場合は、ロボットi_move_21とロボットi_move22とロボットi_move201をa_head方向に移動する。その他の場合で、ロボットi_move_201が存在しない場合は、ロボットi_move_21とロボットi_move22をa_head方向に移動する。 In other cases, when the robot i_move_201 exists, the robot i_move_21, the robot i_move22, and the robot i_move201 are moved in the a_head direction. In other cases, when the robot i_move_201 does not exist, the robot i_move_21 and the robot i_move22 are moved in the a_head direction.

（手順８）手順７で生じたボイド（ロボットi_move22の元あった位置）を位置void_endとする。 (Procedure 8) The void generated in the procedure 7 (position where the robot i_move22 was originally) is set as a position void_end.

（手順９）位置void_endを、ロボットi_move_21とロボットi_move22以外のロボットで埋めていくことで、尾部ロボット群G_tailに所属するロボットのうち、もっともを位置void_endから離れた位置のロボットを現在位置より、1マス分位置void_endに近づける。 (Procedure 9) By filling the position void_end with robots other than the robots i_move_21 and i_move22, the robot belonging to the tail robot group G_tail, the robot farthest away from the position void_end from the current position is 1 Move closer to the void position void_end.

（手順１０）ロボットi_move_201かロボットi_move_202が存在する場合は、終了する。どちらも存在しない場合は、ロボットi_move_21とロボットi_move22と、「ロボットi_move22に、行動a_headの移動方向と逆の方向で接するロボット」を、a_head方向に移動する。 (Procedure 10) When the robot i_move_201 or the robot i_move_202 exists, the process ends. If neither exists, the robot i_move_21, the robot i_move22, and the “robot in contact with the robot i_move22 in the direction opposite to the movement direction of the action a_head” are moved in the a_head direction.

（手順１１）手順１０で生じたボイド（手順７実行前にロボットi_move22の元あった位置）を位置void_endとする。 (Procedure 11) The void generated in the procedure 10 (the position where the robot i_move22 was created before the execution of the procedure 7) is set as a position void_end.

（手順１２）位置void_endを、ロボットi_move_21とロボットi_move22以外のロボットで埋めていくことで、尾部ロボット群G_tailに所属するロボットのうち、もっともを位置void_endから離れた位置のロボットを現在位置より、1マス分位置void_endに近づける。 (Procedure 12) By filling the position void_end with robots other than the robots i_move_21 and i_move22, the robot belonging to the tail robot group G_tail, the robot farthest away from the position void_end from the current position is 1 Move closer to the void position void_end.

ロボット単位j_headがロボット単位群の移動の先頭に位置する場合には、通常、頭部ロボット単位j_newに他のロボットは存在しない（言い換えると、図１１においては破線で示した４つのロボットi_move_101,i_move_102,i_move_201,i_move202が存在しない）が、開始位置の隊列によって（図１５参照）、ロボット単位j_headの移動方向に、ロボット単位を構成しない端数のロボットi_move_101,i_move_102,i_move_201,i_move202の多くとも何れか３つが存在する場合がある。その場合、手順１〜手順１２を行うことで、ロボットi_move_101,i_move_102,i_move_201,i_move202を頭部ロボット単位の一部として取り込む。つまり、ロボットi_move_101,i_move_102,i_move_201,i_move202の多くとも何れか３つが存在する場合、ロボット単位j_headを構成していた4台のロボットi_move_11,ロボットi_move_12,ロボットi_move_21,ロボットi_move_22がそのまま頭部ロボット単位j_newを構成するのではなく、存在するロボットi_move_101,i_move_102,i_move_201,i_move202の多くとも何れか３つと、4台のロボットi_move_11,ロボットi_move_12,ロボットi_move_21,ロボットi_move_22の何れか（足らない部分を補充するもの）とで、頭部ロボット単位j_newを構成する。例えば、図１５の場合には、ロボットi_move_201,i_move202と、ロボットi_move_11,ロボットi_move_12とで頭部ロボット単位j_newを構成する。 When the robot unit j_head is positioned at the head of the movement of the robot unit group, there is usually no other robot in the head robot unit j_new (in other words, four robots i_move_101, i_move_102 shown by broken lines in FIG. 11). , i_move_201, i_move202 does not exist), but depending on the starting position row (see FIG. 15), in the moving direction of the robot unit j_head, any one of at most 3 of the robots i_move_101, i_move_102, i_move_201, i_move202 that do not constitute a robot unit There may be one. In that case, the robots i_move_101, i_move_102, i_move_201, i_move202 are taken in as a part of the head robot unit by performing the steps 1 to 12. In other words, if at least three of robot i_move_101, i_move_102, i_move_201, i_move202 exist, the four robots i_move_11, robot i_move_12, robot i_move_21, and robot i_move_22 that constitute the robot unit j_head are used as the head robot unit j_new. And at least three of the existing robots i_move_101, i_move_102, i_move_201, i_move202, and four robots i_move_11, robot i_move_12, robot i_move_21, and robot i_move_22 (replenish the missing parts) ) Constitute a head robot unit j_new. For example, in the case of FIG. 15, the robot i_move_201, i_move202 and the robot i_move_11, robot i_move_12 constitute the head robot unit j_new.

以上の処理のうち、手順３，６，９，１２は、共通の処理である。以下、この処理をボイド制御処理Robot_Void_Controlともいい、この処理について説明する。 Among the above processes, procedures 3, 6, 9, and 12 are common processes. Hereinafter, this processing is also referred to as void control processing Robot_Void_Control, and this processing will be described.

(1)前回、本処理Robot_Void_Controlが実行された際のi_tail値を変数pre_i_tailに格納する(pre_i_tail←i_tail)。本処理が実行されるのがこれが最初ならば、pre_i_tail←-1とする。 (1) The i_tail value when this processing Robot_Void_Control was executed last time is stored in the variable pre_i_tail (pre_i_tail ← i_tail). If this is the first time that this process is executed, pre_i_tail ← −1.

(2)各ロボットが位置void_endからどれだけの距離離れているかを計算する。以下、この処理をロボット序列計算処理Robot_Joretu_Decisionともいう。ロボット序列計算処理Robot_Joretu_Decisionでは、各位置の序列変数値Joretu[Xr][Yr]を計算する。ロボット序列計算処理Robot_Joretu_Decisionについては後述する。 (2) Calculate how far each robot is from the position void_end. Hereinafter, this processing is also referred to as robot order calculation processing Robot_Joretu_Decision. In the robot order calculation process Robot_Joretu_Decision, an order variable value Joretu [Xr] [Yr] at each position is calculated. The robot order calculation process Robot_Joretu_Decision will be described later.

(3)(a)pre_i_tailが-1以外の値のとき（本処理を行うのが2回目以降のとき）で、ロボットpre_i_tailが一時刻ステップ前に所属していたロボット単位の位置に所属していて、かつG外にあるロボットがある場合は、その中で最もvoid_endから離れているロボット（joretu値のもっとも大きい位置にあるロボット）を選択し、ロボットi_tailとする（選択したロボットの番号をi_tailに代入する）。(b)pre_i_tailが-1以外の値のときで、ロボットpre_i_tailが一時刻ステップ前に所属していたロボット単位の位置に一つもロボットが所属していなく、他にロボット単位位置内に４つのロボットが充填されていないロボット単位位置があるときは、そのようなロボット単位内にありかつ、G外にあり、その中で最もvoid_endから離れているロボット（joretu値のもっとも大きい位置にあるロボット）ロボットを選択し、ロボットi_tailとする。(b)のような選択を行うことで、4つのロボットが充填されているロボット単位を構成するようにし、ロボット単位を構成しない端数のロボット数が４未満となるようにしている。(c)pre_i_tailが-1以外の値のときで、ロボットpre_i_tailが一時刻ステップ前に所属していたロボット単位の位置に一つもロボットが所属してなく、他に４つのロボットが充填されていないロボット単位位置がない場合は、尾部ロボット群G_tailに所属するロボットのうち、G外にあるものの中で、最もvoid_endから離れているロボット（joretu値のもっとも大きい位置にあるロボット）を選択し、ロボットi_tailとする。(d)pre_i_tailが-1のとき（本処理を行うのが1回目のとき）、尾部ロボット群G_tailに所属するロボットのうち、G外にあるものの中で、最もvoid_endから離れているロボット（joretu値のもっとも大きい位置にあるロボット）を選択し、ロボットi_tailとする。例えば、図１６は、図１１の状態において、モード１の手順１でロボットi_move_11とi_move_12とがa_head方向に移動し、ボイドが位置void_endに存在し、ロボットi_tailが設定された例を示す。 (3) (a) When pre_i_tail is a value other than -1 (when this processing is performed for the second time or later), the robot pre_i_tail belongs to the position of the robot unit to which the previous time step belonged. If there is a robot outside G, select the robot that is farthest from void_end (the robot with the largest joretu value) and set it as the robot i_tail (the number of the selected robot is i_tail Assigned to). (b) When pre_i_tail is a value other than -1, no robot belongs to the position of the robot unit to which the robot pre_i_tail belonged one step ago, and there are four robots in the robot unit position. If there is a robot unit position that is not filled with, the robot that is in such a robot unit, is outside G, and is farthest from the void_end among them (the robot that has the highest joretu value) Is selected as robot i_tail. By performing the selection as shown in (b), a robot unit filled with four robots is configured, and the number of robots that are not included in the robot unit is less than four. (c) When pre_i_tail is a value other than -1, no robot belongs to the position of the robot unit to which robot pre_i_tail belonged one time step before, and no other four robots are filled If there is no robot unit position, select the robot belonging to the tail robot group G_tail that is far from void_end among the robots outside G (the robot with the highest joretu value) i_tail. (d) When pre_i_tail is -1 (when this processing is performed for the first time), among the robots belonging to the tail robot group G_tail, those that are outside G but farthest from void_end (joretu The robot having the largest value is selected as a robot i_tail. For example, FIG. 16 shows an example in which the robots i_move_11 and i_move_12 are moved in the a_head direction in the procedure 1 of mode 1 in the state of FIG. 11, the void exists at the position void_end, and the robot i_tail is set.

(4)ロボットi_tailを用いて以下の位置void_startを求める。ロボットi_tailの位置から、隣接したロボットをたどって同じ方向に2マス以上移動可能であり、2マス以上移動した先にあるロボットの所属するロボット単位が、ロボット単位j_headではなく、かつ、内に４つのロボットが所属しているという条件を満たす方向を探す。その方向と同じ方向に移動する行動をa_taiｌとする(図１６参照)。続いて、行動a_tailの方向に2マス以上移動した先にあるロボットの所属するロボット単位の番号j_startを特定する。言い換えると、ロボットi_tailの位置から、同じ方向dに2台以上ロボットが連続して隣接しており、その方向dにおいて、ロボットi_tailから2マス以上離れたマス（２つ以上離れた位置）にあり、連続して隣接するロボットの何れかが所属するロボット単位が、ロボット単位j_headではなく、かつ、内に４つのロボットが所属しているという条件を満たすとき、方向dに移動する行動をa_taiｌとする。上述の条件（ロボットi_tailから2つ以上離れた位置にあり、連続して隣接するロボットの何れかが所属し、ロボット単位j_headではなく、かつ、内に４つのロボットが所属しているという条件）を満たすロボット単位の番号j_startを特定する。なお、上述の条件を満たすロボット単位は、複数個存在しうるので、その中から任意のロボット単位を、ロボット単位j_startとして特定しうる。 (4) The following position void_start is obtained using the robot i_tail. It is possible to move 2 squares or more in the same direction by tracing the adjacent robot from the position of the robot i_tail, and the robot unit to which the robot that has moved 2 squares or more belongs is not the robot unit j_head, but within 4 Look for a direction that satisfies the condition that two robots belong. An action moving in the same direction as that direction is a_tail (see FIG. 16). Subsequently, the robot unit number j_start to which the robot that has moved two squares or more in the direction of the action a_tail belongs is specified. In other words, two or more robots are adjacent to each other in the same direction d from the position of the robot i_tail, and in that direction d, the robot i_tail is at a square (two or more positions away). When the robot unit to which any one of the adjacent robots belongs is not the robot unit j_head and the condition that four robots belong to the condition, the action to move in the direction d is a_tail To do. The above-mentioned condition (a condition that two or more robots that are located at two or more positions away from the robot i_tail belong continuously, are not robot units j_head, and four robots belong to them) The robot unit number j_start that satisfies the condition is specified. Since there can be a plurality of robot units that satisfy the above conditions, an arbitrary robot unit can be specified as the robot unit j_start.

行動a_tailとロボット単位j_startの位置(Xr_unit[j_start],Yr_unit[j_start])とから、以下のようにから位置void_startを設定する。
a_tail=1のとき：void_startのX座標を2 ×Xr_unit[j_start] + 1、Y座標をYr[i_tail]とする。
a_tail=2のとき：void_startのX座標をXr[i_tail]、Y座標を2 × Yr_unit[j_start] + 1、とする。
a_tail=3のとき：void_startのX座標を2 × Xr_unit[j_start] 、Y座標をYr[i_tail]とする。
a_tail=4のとき：void_startのX座標をXr[i_tail]、Y座標を2 × Yr_unit[j_start] 、とする。 From the action a_tail and the position of the robot unit j_start (Xr_unit [j_start], Yr_unit [j_start]), the position void_start is set as follows.
When a_tail = 1: The X coordinate of void_start is 2 × Xr_unit [j_start] +1, and the Y coordinate is Yr [i_tail].
When a_tail = 2: The X coordinate of void_start is Xr [i_tail], and the Y coordinate is 2 × Yr_unit [j_start] +1.
When a_tail = 3: The X coordinate of void_start is 2 × Xr_unit [j_start], and the Y coordinate is Yr [i_tail].
When a_tail = 4: The X coordinate of void_start is Xr [i_tail], and the Y coordinate is 2 × Yr_unit [j_start].

(5)位置void_startから、位置void_endにいたるボイドの移動経路を計算する。以下、この処理をボイド経路計算処理Void_Trajectory_Decisionともいう。 (5) The movement path of the void from the position void_start to the position void_end is calculated. Hereinafter, this processing is also referred to as void path calculation processing Void_Trajectory_Decision.

(6) (5)で計算されたボイドの移動経路を、void_endから_void_startに逆にたどり、ボイドを埋めていく。以下、この処理をボイド移動処理Void_Motionともいう。 (6) Follow the void movement path calculated in (5) from void_end to _void_start and fill the void. Hereinafter, this processing is also referred to as void movement processing Void_Motion.

なお、処理(4)は、尾部ロボット単位に属するロボットを動かす際に、ロボットの接続状態を維持するためには、「ボイドの位置が、尾部ロボット単位に隣接するロボット単位に含まれるロボットのうち、尾部ロボット単位に接するロボットの位置であってはならない」ことを考慮したものである。よって、制御システムは、尾部ロボット単位に隣接するロボット単位に含まれるロボットのうち、尾部ロボット単位に接するロボットの位置にボイドが存在しないように、ｐ台のロボットの行動を制御する。なお、前述の通り、各ロボット動作用動作計画においては、図１６の端数のロボットi_tailは、尾部ロボット単位に所属するロボットとして扱われる。図１６において、ボイドが移動経路NGを用いた場合、ロボットi_tailが上下左右方向において隣接するロボットがなくなってしまい、p台のロボット全てが一つの群れを成すことができず、ロボット同士の接続を維持することができない。そこで、位置void_endにあるボイドを一旦、位置void_startに移動させ、その後、一気にロボットi_tailの位置に移動させるように制御する。このような行動制御を行うことで、p台のロボット全てが一つの群れを成し、ロボット同士の接続を維持することができる。以下、このような制御を実現するためのロボット序列計算処理Robot_Joretu_Decision、ボイド経路計算処理Void_Trajectory_Decision、ボイド移動処理Void_Motionについて説明する。 In order to maintain the connection state of the robot when the robot belonging to the tail robot unit is moved, the process (4) is described as follows: “The position of the void is a robot included in the robot unit adjacent to the tail robot unit. The position of the robot in contact with the tail robot unit must not be considered. Therefore, the control system controls the behavior of the p robots so that no void exists at the position of the robot in contact with the tail robot unit among the robots included in the robot units adjacent to the tail robot unit. As described above, in each robot operation plan, the fractional robot i_tail in FIG. 16 is treated as a robot belonging to the tail robot unit. In FIG. 16, when the void uses the movement path NG, the robot i_tail is no longer adjacent to the robot in the vertical and horizontal directions, and all of the p robots cannot form one group, and the robots are connected to each other. It cannot be maintained. Therefore, control is performed so that the void at the position void_end is once moved to the position void_start and then moved to the position of the robot i_tail at a stroke. By performing such behavior control, all the p robots form one group, and the connection between the robots can be maintained. Hereinafter, robot order calculation processing Robot_Joretu_Decision, void path calculation processing Void_Trajectory_Decision, and void movement processing Void_Motion for realizing such control will be described.

(ロボット序列計算処理Robot_Joretu_Decision)
ロボット序列計算処理Robot_Joretu_Decisionは、ダイクストラ法などを用いることができる。例えば以下のとおりである。 (Robot order calculation processing Robot_Joretu_Decision)
The robot order calculation processing Robot_Joretu_Decision can use the Dijkstra method or the like. For example:

(1)各ロボットにて、上下左右４方向に隣接するロボットの番号を調べる。ロボットiの行動aの移動方向にロボットi_nextが隣接して存在する場合、next[a][i]変数の値をi_nextに設定する。ロボットが存在しないときは、next[a][i]変数の値をp(ロボットの台数を表す)に設定する。 (1) For each robot, check the number of the robot that is adjacent in the four directions. When the robot i_next exists adjacent to the movement direction of the action a of the robot i, the value of the next [a] [i] variable is set to i_next. When there is no robot, the value of the next [a] [i] variable is set to p (representing the number of robots).

(2)状態空間内（Xr,Yr）各位置の序列変数値Joretu[Xr][Yr]の値をpに初期化する(Joretu[Xr][Yr]←p)。 (2) The value of the ordinal variable value Joretu [Xr] [Yr] at each position in the state space (Xr, Yr) is initialized to p (Joretu [Xr] [Yr] ← p).

(3)void_endの位置の序列変数値Joretuの値を0とする(Joretu[Xr][Yr]←0)。さらに、本処理を実行する直前に行動a_headによって移動した全てのロボットの位置の序列変数値Joretuの値を-1に初期化する。さらに、それが所属するロボット単位が、4台のロボットで充填されていないロボットの位置の序列変数値Joretuの値を-1に初期化する。さらに、void_startの位置から行動a_tailの逆向きに隣接するロボット位置のJoretu値を-1にする。 (3) The value of the ordinal variable value Joretu at the position of void_end is set to 0 (Joretu [Xr] [Yr] ← 0). Furthermore, the value of the ordering variable value Joretu of the positions of all the robots moved by the action a_head immediately before executing this process is initialized to -1. In addition, the robot unit to which it belongs is initialized to -1 for the ordering variable value Joretu of the position of the robot that is not filled with four robots. Further, the Joretu value of the robot position adjacent in the opposite direction of the action a_tail from the position of void_start is set to -1.

(4)void_endの位置と、void_endから、行動a_headの移動方向に隣接してたどれるロボットの位置以外の、各ロボットiの、全行動aについて、next[a][j]の値がpではない場合の、Joretu[Xr[next[a][i]]][Yr[next[a][i]]]の値を調べ、Joretu[Xr[next[a][i]]][Yr[next[a][i]]]の値が-1でない場合で、かつ、その最小値に１を加えた値が、現在のJoretu[Xr[i]][Yr[i]]よりも小さい場合は、その値をJoretu[Xr[i]][Yr[i]]に代入する（Joretu[Xr[i]][Yr[i]]←Joretu[Xr[next[a][i]]][Yr[next[a][i]]]+1）。 (4) The value of next [a] [j] is not p for all actions a of each robot i, except for the position of void_end and the position of the robot that is adjacent to the movement direction of action a_head from void_end Joretu [Xr [next [a] [i]]] [Yr [next [a] [i]]] and Joretu [Xr [next [a] [i]]] [Yr [next If the value of [a] [i]]] is not -1 and the minimum value plus 1 is less than the current Joretu [Xr [i]] [Yr [i]] , Assign the value to Joretu [Xr [i]] [Yr [i]] (Joretu [Xr [i]] [Yr [i]] ← Joretu [Xr [next [a] [i]]] [Yr [next [a] [i]]] + 1).

(5) (4)の処理にて、Joretu値の更新がなくなるまで、(4)を繰り返す。 (5) Repeat (4) until Joretu value is no longer updated in the process of (4).

この処理により、(a)行動a_headによって移動した全てのロボットの位置、(b)そのロボットが所属するロボット単位が4台のロボットで充填されていない場合には、そのロボット単位に所属する他のロボットの位置、(c)void_startの位置から行動a_tailの逆向きに隣接するロボットの位置を除き、位置void_endから離れた位置の序列変数値Joretu[Xr[i]][Yr[i]]ほど大きな値となるように更新される。図１６の角括弧内の数値は、更新がなくなったときの序列変数値Joretuの値の例を示す。 By this process, (a) the position of all robots moved by action a_head, (b) if the robot unit to which the robot belongs is not filled with 4 robots, other robots belonging to that robot unit The position of the robot, (c) excluding the position of the robot that is adjacent to the position of void_start in the opposite direction of action a_tail, and the order variable value Joretu [Xr [i]] [Yr [i]] that is away from the position void_end Updated to be a value. The numerical values in the square brackets in FIG. 16 indicate examples of values of the ordinal variable value Joretu when there is no update.

(ボイド経路計算処理Void_Trajectory_Decision)
ボイド経路計算処理Void_Trajectory_Decisionは、以下のとおりである。 (Void path calculation processing Void_Trajectory_Decision)
The void path calculation process Void_Trajectory_Decision is as follows.

(1)位置void_startの位置をボイド軌道(X,Y)=(void_trj_x[h], void_trj_y[h])(h=0,1,2,3…)の最初の位置とする。
(2)軌道番号h_trj←0とする。
(3)現ボイド軌道位置（void_trj_x[h_trj],void_trj_y[h_trj]）から、全方向にて隣接する位置にあるロボットiのうち、そのロボットiの位置における序列値Joretu[Xr[i]][Yr[i]]が、-1以外であって、かつ、Joretu[void_trj_x[h_trj]][void_trj_y[h_trj]]よりも小さいロボットiの位置を次のボイド軌道位置（void_trj_x[h_trj+1],void_trj_y[h_trj+1]）とする。その際の、ボイドの移動の方向と同じ行動の番号をボイド行動履歴a_void[h_trj]に格納する。h_trjをインクリメントする。
(4)現ボイド軌道位置が、位置void_endに等しくなるまで(3)を繰り返す。 (1) Position void_start is the first position of void trajectory (X, Y) = (void_trj_x [h], void_trj_y [h]) (h = 0,1,2,3...).
(2) Set orbital number h_trj ← 0.
(3) From the current void trajectory position (void_trj_x [h_trj], void_trj_y [h_trj]), among the robots i that are adjacent in all directions, the order value Joretu [Xr [i]] [ Yr [i]] is other than -1 and the position of robot i smaller than Joretu [void_trj_x [h_trj]] [void_trj_y [h_trj]] is the next void trajectory position (void_trj_x [h_trj + 1], void_trj_y [h_trj + 1]). At this time, the same action number as the movement direction of the void is stored in the void action history a_void [h_trj]. Increment h_trj.
(4) Repeat (3) until the current void trajectory position is equal to the position void_end.

この処理により、位置void_endから位置void_startまでのボイドの経路を計算する。
(ボイド移動処理Void_Motion)
ボイド移動処理Void_Motionは、例えば、以下のとおりである。計算されたボイド軌道に沿って、同じ方向に動くロボットを一度に移動させている。もちろん、一度に移動させず一台ずつ動かしてもよい。 By this processing, a void path from the position void_end to the position void_start is calculated.
(Void_Motion)
The void movement process Void_Motion is, for example, as follows. A robot moving in the same direction is moved at once along the calculated void trajectory. Of course, you may move one unit at a time without moving it at once.

(1)h_trj_buffer←h_trj-1とする。
(2)h_trjをデクリメントし、a_void[h_trj]の値を、a_void_bufferに格納する。
(3)h_trj=0でなければ、h_trjをデクリメントし、a_void[h_trj]= a_void_bufferであるか否かを判定し、a_void[h_trj]= a_void_bufferのとき、(3)を繰りかえす。そうでなければ(a_void[h_trj]≠ a_void_bufferのとき)、ボイド軌道(void_trj_x[h_trj_buffer], void_trj_y[h_trj_buffer])から、(void_trj_x[h_trj+1], void_trj_y[h_trj+1])までの位置にあるロボットを皆、a_void_bufferに格納された行動値で移動させ(4)にいく。この処理により、同じ方向に動くロボットを一度に移動させている。h_trj=0のとき、ボイド軌道(void_trj_x[h_trj_buffer], void_trj_y[h_trj_buffer])から、(void_trj_x[0], void_trj_y[0])までの位置にあるロボットをa_void[0]に格納された行動値で移動させ(5)にいく。
(4)h_trjをインクリメントし、(1)に戻る。
(5)ロボットi_tailから行動a_tailの向きに隣接し、位置void_startに至るまでの位置にあるロボットを皆、行動a_tailで移動させる。なお、この移動は、一台ずつではなく、全て一緒に移動させる必要がある。この処理により位置void_startにあるボイドをロボットi_tailの位置に一気に移動させ、ロボットi_tailに隣接するロボットが無くなることを防ぐ。 (1) Set h_trj_buffer ← h_trj-1.
(2) Decrement h_trj and store the value of a_void [h_trj] in a_void_buffer.
(3) If h_trj = 0 is not satisfied, h_trj is decremented and it is determined whether a_void [h_trj] = a_void_buffer. If a_void [h_trj] = a_void_buffer, (3) is repeated. Otherwise (when a_void [h_trj] ≠ a_void_buffer), the void trajectory (void_trj_x [h_trj_buffer], void_trj_y [h_trj_buffer]) to (void_trj_x [h_trj + 1], void_trj_y [h_trj + 1]) Move all robots with action values stored in a_void_buffer and go to (4). By this process, the robot moving in the same direction is moved at a time. When h_trj = 0, the robot in the position from void trajectory (void_trj_x [h_trj_buffer], void_trj_y [h_trj_buffer]) to (void_trj_x [0], void_trj_y [0]) is the action value stored in a_void [0] Move to go to (5).
(4) Increment h_trj and return to (1).
(5) All robots that are adjacent to the robot i_tail in the direction of the action a_tail and reach the position void_start are moved by the action a_tail. In addition, this movement needs to move all together, not one by one. By this process, the void at the position void_start is moved to the position of the robot i_tail at once, and the robot adjacent to the robot i_tail is prevented from being lost.

GUがロボットで満たされるまで、移動先隊列決定用動作計画処理Next_Formation_Decisionの後、モード１（手順１〜手順１２）を使用してロボットを移動させる処理を繰り返す（図８、図１７〜図１９参照）。GUがロボットで満たされるとモード２に移行する。 Until the GU is filled with the robot, after the movement planning process Next_Formation_Decision for determining the movement destination platoon, the process of moving the robot using mode 1 (procedure 1 to procedure 12) is repeated (see FIGS. 8 and 17 to 19). ). When GU is satisfied with the robot, it shifts to mode 2.

（モード２）
続いて、モード２のGUがロボットで満たされている場合の各ロボット動作決定用動作計画の動作について述べる。 (Mode 2)
Next, the operation of each robot motion determination motion plan when the mode 2 GU is satisfied by the robot will be described.

(i_startの選択)
モード２では、先頭のロボットを選択する際に、頭部ロボット単位は使用しない。先頭ロボットの番号をi_startとする。start_select[i]変数をロボットiの先頭ロボットとしての選択回数を記憶する変数とし、ロボットの移動開始前に０にあらかじめ初期化する（start_select[i]←0）。 (Select i_start)
In mode 2, the head robot unit is not used when selecting the head robot. The number of the first robot is i_start. The start_select [i] variable is a variable for storing the number of times the robot i is selected as the first robot, and is initialized to 0 before the robot starts moving (start_select [i] ← 0).

まず、前ステップで本モード２の処理が実行されたときの先頭ロボットi_startが前ステップでの移動を行った後の位置から、前ステップでのロボットi_startの移動方向a_startの移動する方向に隣接していて、「GUに属さず、Gに属す目標位置であって、そこに他のロボットが存在していない位置」があるとき、引き続きロボットi_startを先頭ロボットとして選択する。start_select[i_start]をインクリメントする。 First, the first robot i_start when the processing of this mode 2 is executed in the previous step is adjacent to the moving direction a_start of the robot i_start in the previous step from the position after the movement in the previous step. If there is a “position that does not belong to GU but belongs to G and does not have any other robot”, robot i_start is continuously selected as the first robot. Increment start_select [i_start].

そうでない場合は、これまで、先頭ロボットとして選択されたことのないロボットで、かつ、「GUに属さず、Gに属す目標位置で、ロボットが存在していない位置」に隣接するロボットiのうちで、その隣接する目標位置に移動する行動をa_startとしたとき、以下の条件１を満たさないロボットを先頭ロボットi_startとして選択する。start_select[i_start]をインクリメントする。なお、ロボットi_startの選択は、目標位置への入口位置単位PeUに属するものから優先的に選択する。（G外に残留するロボット数が残り少なくなって4つ以下になったときに、入口位置単位PeUからi_startを選ぶと、ロボットi_startが口述するモード２内手順３で計算されるロボット単位j_start内に含まれてしまうこととなり、本実施形態ではそのような場合に対応できないからである。）
条件１：あるロボットiについて、そのロボットが所属するロボット単位ｊに行動a_startの方向で隣接するロボット単位の位置の中にGUに属さずGに属する位置が3個であり、その中で、ロボット単位jに隣接しない位置が2つ以上ある。 If not, the robot i that has not been selected as the first robot so far, and that is adjacent to the "position that does not belong to GU and belongs to G, where no robot exists" Thus, when the action to move to the adjacent target position is a_start, the robot that does not satisfy the following condition 1 is selected as the first robot i_start. Increment start_select [i_start]. The robot i_start is preferentially selected from those belonging to the entrance position unit PeU to the target position. (When the number of robots remaining outside G decreases to 4 or less, if i_start is selected from the entry position unit PeU, the robot unit j_start calculated in step 3 in mode 2 dictated by robot i_start (This is because the present embodiment cannot cope with such a case.)
Condition 1: For a robot i, there are three positions belonging to G that do not belong to GU among the positions of robot units adjacent to robot unit j to which the robot belongs in the direction of action a_start. There are two or more positions that are not adjacent to the unit j.

図２０に条件１を満たす例を示す。なお、図２０のロボット単位j₁のロボットi₁は条件１を満たすため選択されない。一方、図２０のロボット単位j₂に属するロボットi₂やi₃は条件２を満たさない(ロボット単位jに隣接しない位置が1つしかない)ため、選択されうる。 FIG. 20 shows an example satisfying the condition 1. Note that the robot i ₁ of the robot unit j ₁ in FIG. On the other hand, since the robots i ₂ and i ₃ belonging to the robot unit j ₂ in FIG. 20 do not satisfy the condition 2 (there is only one position that is not adjacent to the robot unit j), they can be selected.

モード２の動作は以下のとおりである。
（手順１）ロボットi_startの選択が2回目の場合（start_select[i_start]=2）、ロボットi_startとロボットi_startと行動a_startの逆の方向で隣接するロボットと、さらにそのロボットに行動a_startの逆の方向で隣接するロボットをa_start方向に移動する。ロボットi_startの選択が１回目の場合、ロボットi_startとロボットi_startと行動a_startの逆の方向で隣接するロボットをa_start方向に移動する。なお、目標隊列エリアGが目標隊列エリア単位GUから飛び出している部分は目標隊列エリア単位GUに隣接するマス単位に含まれることを前提としているため、start_select[i_start]は１または２となる。
（手順２）手順１で生じたボイド（手順１で最後尾で移動したロボットの元あった位置）を位置void_endとする。
（手順３）位置void_endを、手順１にて行動a_startで移動したロボット以外のロボットで埋めていくことで、尾部ロボット単位i_tailに所属するロボットのうち、もっとも位置void_endから離れた位置のロボットを現在位置より、1マス分位置void_endに近づける。 The operation in mode 2 is as follows.
(Procedure 1) When the robot i_start is selected for the second time (start_select [i_start] = 2), the robot i_start, the robot i_start, and the robot a_start are adjacent to each other, and the robot a_start is the opposite direction. To move the adjacent robot in the a_start direction. When the robot i_start is selected for the first time, the robot i_start, the robot i_start, and the action a_start are moved in the direction opposite to the a_start direction. Note that since the portion where the target row area G protrudes from the target row area unit GU is assumed to be included in the mass unit adjacent to the target row area unit GU, start_select [i_start] is 1 or 2.
(Procedure 2) The void generated in Procedure 1 (the original position of the robot that moved at the end in Procedure 1) is defined as a position void_end.
(Procedure 3) By filling the position void_end with a robot other than the robot moved by action a_start in step 1, among the robots belonging to the tail robot unit i_tail, the robot farthest from the position void_end Move closer to the position void_end by 1 square from the position.

手順３はモード１と同じものである。 Procedure 3 is the same as mode 1.

以上の処理を、G内にロボットが充填されるまで繰り返す（図１９、図２１、図２２参照）。 The above processing is repeated until the robot is filled in G (see FIGS. 19, 21, and 22).

＜第一実施形態に係る行動制御システム１００＞
第一実施形態に係る行動制御システム１００は、以上に説明した各処理によって構成される全体動作について以下にまとめる。 <Action control system 100 according to the first embodiment>
The behavior control system 100 according to the first embodiment summarizes the overall operation configured by the processes described above.

(1)移動先隊列決定用動作計画のために使用する価値関数Q（s,a）をMDPの動的計画法で計算する。
(2)開始位置で、４つのロボットが収まっているロボット単位位置を特定し、Gaのロボット単位の要素として設定する。
(3)以下(4)，(5)を全G内にロボットが充填されるまで繰り返す。
(4)GUがロボットで充填されていない場合、移動先隊列決定用動作計画処理Next_Formation_Decisionを実行する。充填されているならなにもしない。
(5)GUがロボットで充填されていない場合、各ロボット動作決定用動作計画処理のモード１を実行する。GUがロボットで充填されている場合、モード２を実行する。 (1) The value function Q (s, a) used for the movement plan for determining the movement target column is calculated by the dynamic programming method of MDP.
(2) At the start position, specify the robot unit position where the four robots are housed, and set it as an element of Ga robot unit.
(3) Repeat (4) and (5) below until the robot is filled in all G.
(4) If the GU is not filled with a robot, the movement plan determining movement plan process Next_Formation_Decision is executed. Do nothing if filled.
(5) When the GU is not filled with a robot, mode 1 of each robot motion determination motion plan process is executed. If GU is filled with robot, execute mode 2.

以下、これらの処理を実行する構成について説明する。 Hereinafter, a configuration for executing these processes will be described.

図２３は第一実施形態に係る行動制御システム１００の機能ブロック図を、図２４はその処理フローの例を示す。行動制御システム１００は、図２３に示すように、動作計画部１１０と、行動選択部１２０と、隣接状態判定部１２４と、位置更新部１２３と、位置判定部１２６と、記憶部１４０と、通信部１５０と、入力部１６０とを含む。行動選択部１２０は移動先隊列決定用動作計画部１２１と各制御対象物動作用動作計画部１２２とを含む。 FIG. 23 is a functional block diagram of the behavior control system 100 according to the first embodiment, and FIG. 24 shows an example of the processing flow. As shown in FIG. 23, the behavior control system 100 includes an operation planning unit 110, a behavior selection unit 120, an adjacent state determination unit 124, a position update unit 123, a position determination unit 126, a storage unit 140, and communication. Part 150 and input part 160. The action selection unit 120 includes a movement platoon determination operation planning unit 121 and each control target object operation planning unit 122.

以下では、制御の対象となる制御対象物が、ロボットである場合を例に挙げて説明する。もちろん、制御対象物は、制御の対象となり得るものであれば、ロボット以外であってもよい。 Hereinafter, a case where the control target to be controlled is a robot will be described as an example. Of course, the control object may be other than the robot as long as it can be a control target.

本実施形態では、行動制御システム１００は、p台のロボットの行動を制御し、p台のロボットの内の１つのロボット上に実装される。なお、行動制御システム１００が実装されていないp-１台のロボットについても、通信部１５０と、隣接状態判定部１２４とを含む。 In the present embodiment, the behavior control system 100 controls the behavior of p robots and is mounted on one of the p robots. Note that the p-1 robot on which the behavior control system 100 is not mounted also includes the communication unit 150 and the adjacent state determination unit 124.

＜動作計画部１１０＞
動作計画部１１０は、価値関数Q(s,a)の値を、動的計画法により、ロボットの任務行動開始前に事前に計算し（Ｓ１１０）、記憶部１４０に格納する。ここで、動作計画部１１０の計算は、一つのロボット単位を使用した強化学習（例：Q学習）に置き換えてもよい。なお、別装置で価値関数Q(s,a)を計算しておき、ロボットの任務行動開始前に事前に記憶部１４０に格納しておけば、行動制御システム１００は、動作計画部１１０を備えなくともよい。 <Operation Planning Unit 110>
The motion planning unit 110 calculates the value of the value function Q (s, a) in advance by dynamic programming before starting the mission action of the robot (S110) and stores it in the storage unit 140. Here, the calculation of the motion planning unit 110 may be replaced with reinforcement learning (eg, Q learning) using one robot unit. If the value function Q (s, a) is calculated by a separate device and stored in the storage unit 140 in advance before starting the robot's mission behavior, the behavior control system 100 includes the motion planning unit 110. Not necessary.

なお、価値関数Q(s,a)の計算については、[移動先隊列決定用動作計画のためのMDP状態空間]で説明した通りである。 Note that the calculation of the value function Q (s, a) is as described in [MDP state space for movement plan for movement target column determination].

＜入力部１６０＞
入力部１６０には、p台のロボットiのそれぞれの初期位置(Xr0[i],Xr0[i])及びp個の目標位置の集合G={(Xre[0],Yre[0]),(Xre[1],Yre[1]),…,(Xre[p-1],Yre[p-1])}が入力され、記憶部１４０に記憶される。 <Input unit 160>
The input unit 160 includes an initial position (Xr0 [i], Xr0 [i]) and a set of p target positions G = {(Xre [0], Yre [0]), (Xre [1], Yre [1]),..., (Xre [p-1], Yre [p-1])} are input and stored in the storage unit 140.

なお、目標位置は、所定の入口位置Peを含むとする。この入口位置Peについての情報も、入力部１６０から入力され、記憶部１４０に記憶されるとする。 Note that the target position includes a predetermined entrance position Pe. Information about the entrance position Pe is also input from the input unit 160 and stored in the storage unit 140.

＜記憶部１４０＞
記憶部１４０には、位置s及びa∈{0,1,2,3,4}の組み合わせのそれぞれについての価値関数Q(s,a)が記憶されているとする。sの取りうる範囲は、対象となる二次元平面上の領域内のロボット単位ｊが存在しうる全ての座標である。 <Storage unit 140>
It is assumed that the storage unit 140 stores a value function Q (s, a) for each of the combinations of the position s and aε {0,1,2,3,4}. The range that can be taken by s is all the coordinates where the robot unit j in the region on the target two-dimensional plane can exist.

各位置sの報酬r(s)についても、記憶部１４０に記憶されているとする。各位置sの報酬r(s)についての情報は、例えば入力部１６０から入力される。 It is assumed that the reward r (s) at each position s is also stored in the storage unit 140. Information about the reward r (s) at each position s is input from the input unit 160, for example.

＜通信部１５０＞
行動制御システム１００が実装されているロボットも含め、全てのロボットは、通信部１５０を介して、二次元平面上の上下左右方向(以下「４方向」ともいう)において隣接する他のロボットと通信することができる。 <Communication unit 150>
All robots including the robot on which the behavior control system 100 is mounted communicate with other robots that are adjacent in the vertical and horizontal directions on the two-dimensional plane (hereinafter also referred to as “four directions”) via the communication unit 150. can do.

＜行動選択部１２０＞
行動選択部１２０は、記憶部１４０から価値関数Qを取り出す。 <Action selection unit 120>
The action selection unit 120 extracts the value function Q from the storage unit 140.

行動選択部１２０は、上述の方法で説明した方法で、p台のロボットを制御する（ｓ１２０）。 The action selection unit 120 controls the p robots by the method described in the above method (s120).

行動選択部１２０は、まず、p台のロボットの位置から4つのロボットが収まっているロボット単位位置を特定し、隊列Gaに含まれるロボット単位として設定する。なお、各ロボットの位置を必要とするが、各ロボットの位置の初期位置(Xr0[i],Xr0[i])については、前述の通り、入力部１６０を介して入力され記憶部１４０に記憶されているものを利用し、移動後の各ロボットの位置については、後述する位置判定部１２６で求め記憶部１４０に記憶されているものを利用すればよい。 The action selection unit 120 first identifies the robot unit position where the four robots are accommodated from the positions of the p robots, and sets it as the robot unit included in the formation Ga. Although the position of each robot is required, the initial position (Xr0 [i], Xr0 [i]) of each robot is input via the input unit 160 and stored in the storage unit 140 as described above. What is necessary is just to use what was calculated | required in the position determination part 126 mentioned later and memorize | stored in the memory | storage part 140 about the position of each robot after moving.

行動選択部１２０は、位置判定部１２６で求め記憶部１４０に記憶されている各ロボットの位置と、目標隊列エリア単位GUとを照合し、目標隊列エリア単位GUが全てロボットで充填されているか否かを判定する（ｓ１２０−１）。なお、目標隊列エリア単位GU及び入口位置単位PeUについては、記憶部１４０からp個の目標位置の集合G及び入口位置Peを取り出し、集合G及び入口位置Peから求めてもよいし、価値関数Q(s,a)を求める際に用いたものを用いてもよい。 The action selection unit 120 compares the position of each robot obtained by the position determination unit 126 and stored in the storage unit 140 with the target platoon area unit GU, and whether or not the target platoon area unit GU is all filled with robots. Is determined (s120-1). For the target platoon area unit GU and the entrance position unit PeU, the set G and the entrance position Pe of p target positions may be extracted from the storage unit 140, and may be obtained from the set G and the entrance position Pe. You may use what was used when calculating | requiring (s, a).

充填されていない場合には、行動選択部１２０の移動先隊列決定用動作計画部１２１は、価値関数Q(s,a)を用いて、移動先隊列決定用動作計画処理Next_Formation_Decision（または、簡略化された移動先隊列決定用動作計画処理）を行い（ｓ１２１）、開始位置の集合に配置されたp台のロボットを目標位置の集合に移動させる際に、頭部ロボット単位が移動の先頭を務めるように、頭部ロボット単位の位置を出力する。 If not filled, the movement plan determining action planning unit 121 of the action selecting unit 120 uses the value function Q (s, a) to determine the movement plan determining movement plan Next_Formation_Decision (or simplification). (S121), the head robot unit serves as the head of the movement when moving the p robots arranged in the set of start positions to the set of target positions. Thus, the position of the head robot unit is output.

充填されていない場合、さらに、行動選択部１２０の各制御対象物動作用動作計画部１２２は、頭部ロボット単位の位置を受け取り、各ロボット動作用動作計画のモード１を行い（ｓ１２２−１）、頭部ロボット単位の位置に４台のロボットが位置するように、各ロボットに対して制御信号を送る。各ロボットは制御信号に従って行動する。 If not filled, each motion target motion planning unit 122 of the behavior selection unit 120 receives the position of each head robot unit and performs mode 1 of each robot motion motion plan (s122-1). A control signal is sent to each robot so that the four robots are positioned at the position of the head robot unit. Each robot behaves according to the control signal.

充填されている場合、行動選択部１２０の各制御対象物動作用動作計画部１２２は、各ロボット動作用動作計画のモード２を行い（ｓ１２２−２）、目標エリア単位GUに属さず、目標エリアGに属す目標位置で、ロボットが存在していない位置に隣接するロボットを選択し、目標エリア単位GUに属さず、目標エリアGに属す目標位置の先端（そのロボット位置の最大２マス先の位置）にそのロボットが移動するように、各ロボットに対して制御信号を送る。各ロボットは制御信号に従って行動する。例えば、行動選択部１２０は、記憶部１４０に格納されている目標エリアGと目標エリア単位GUとを照合し、目標エリア単位GUに属さず、目標エリアGに属す目標位置を予め求めておき、移動後の各ロボットの位置から、目標エリア単位GUに属さず、目標エリアGに属す目標位置で、ロボットが存在していない位置に隣接するロボットを選択すればよい。 When it is filled, each motion target motion planning unit 122 of the action selection unit 120 performs mode 2 of each robot motion motion plan (s122-2), does not belong to the target area unit GU, and does not belong to the target area. Select a robot that is adjacent to a target position that belongs to G and does not exist. The tip of the target position that does not belong to the target area unit GU and belongs to the target area G (maximum 2 squares ahead of the robot position) ), A control signal is sent to each robot so that the robot moves. Each robot behaves according to the control signal. For example, the action selection unit 120 collates the target area G and the target area unit GU stored in the storage unit 140, and obtains a target position that does not belong to the target area unit GU but belongs to the target area G in advance. From the position of each robot after movement, a robot that does not belong to the target area unit GU but is adjacent to the position where the robot does not exist at the target position belonging to the target area G may be selected.

なお、一回の行動制御により、モード１では１回の移動先隊列決定用動作計画処理Next_Formation_Decisionで決定された頭部ロボット単位を充填する処理が行われ、モード２では、ロボットを、目標エリア単位GUに属さず、目標エリアGに属す目標位置の先端（最大２マス）まで移動させる処理が行われる。 In addition, in mode 1, the process of filling the head robot unit determined in one movement plan determination action plan process Next_Formation_Decision is performed in mode 1, and in mode 2, the robot is moved to the target area unit. A process of moving to the tip of the target position belonging to the target area G (maximum 2 squares) not belonging to the GU is performed.

移動先隊列決定用動作計画処理Next_Formation_Decision（または、簡略化された移動先隊列決定用動作計画処理）において行われる頭部決定処理Head_Robot_Decisionの中で用いられる行動方策Policy_Gにおいて、各ロボット単位の位置や障害物位置を必要とするが、各ロボット単位の位置ついては前述の通り、取得可能である。また、障害物の位置に関しては、価値関数Q(s,a)を学習する際に利用した障害物の位置を利用してもよいし、後述する隣接状態判定部１２４で判定した判定結果を用いて得られるものを利用してもよい。 In the action policy Policy_G used in the head determination process Head_Robot_Decision performed in the movement plan process Next_Formation_Decision (or simplified movement plan process for determination of the movement target party sequence), the position and obstacle of each robot unit Although an object position is required, the position of each robot unit can be acquired as described above. As for the position of the obstacle, the position of the obstacle used when learning the value function Q (s, a) may be used, or the determination result determined by the adjacent state determination unit 124 described later is used. You may use what is obtained.

＜位置更新部１２３＞
位置更新部１２３は、各i=0,1,…,p-1について、i番目のロボットの現在の位置（Xr[i],Yr[i]）において、行動選択部１２０で決定した行動を実行した場合のロボットの移動後（行動後）の位置（Xr'[i],Yr'[i]）を計算し、計算された（Xr'[i],Yr'[i]）で記憶部１４０に格納されたi番目のロボットの位置を更新する（Ｓ１２５）。言い換えれば、位置更新部１２３は、行動選択部１２０で決定した行動に基づいて、ロボットが行動した場合に想定される位置（以下、「想定位置」ともいう）を計算し、ロボットの位置を更新し記憶部１４０に格納する。具体的には、図１０のｓ８（位置の更新）を行う。 <Location update unit 123>
For each i = 0, 1,..., P−1, the position update unit 123 performs the action determined by the action selection unit 120 at the current position (Xr [i], Yr [i]) of the i-th robot. Calculates the position (Xr '[i], Yr' [i]) after movement (after action) of the robot when executed, and stores the calculated (Xr '[i], Yr' [i]) The position of the i-th robot stored in 140 is updated (S125). In other words, the position update unit 123 calculates a position assumed when the robot behaves (hereinafter also referred to as “assumed position”) based on the action determined by the action selection unit 120 and updates the position of the robot. Stored in the storage unit 140. Specifically, s8 (update of position) in FIG. 10 is performed.

＜隣接状態判定部１２４＞
隣接状態判定部１２４は、ロボットの２次元平面上の上下左右の隣接する位置に、障害物または他のロボットが存在するか否かを判定し、（Ｓ１２１），判定結果を記憶部１４０に格納する。 <Adjacent state determination unit 124>
The adjacent state determination unit 124 determines whether there is an obstacle or another robot at adjacent positions in the top, bottom, left, and right on the two-dimensional plane of the robot (S121), and stores the determination result in the storage unit 140. To do.

なお、上述の通り、行動制御システム１００が実装されていないp-１台のロボットについても、通信部１５０と、隣接状態判定部１２４とを含む。そのため、各ロボットiは隣接状態判定部１２４において、自身の上下左右方向に障害物または他のロボットがいるかどうかを判定し、通信部１５０を介して判定結果を行動制御システム１００に出力することができる。行動制御システム１００は、通信部１５０を介して各ロボットiから判定結果を受け取り、行動制御システム１００に含まれる隣接状態判定部１２４の判定結果と一緒に記憶部１４０に格納する。なお、p台のロボットは、各ロボットの隣り合う位置（４方向）に必ず、他のロボットが存在し、隣り合うロボット同士がなす群れは、一塊なので、各ロボットiは通信部１５０を介してp-1個の判定結果を直接、または、他のロボットを介して、行動制御システム１００に送信することができる。また、行動制御システム１００は、通信部１５０を介して、直接、または、他のロボットを介して、各ロボットiに選択した行動を実行するように制御信号を送信することができる。また、他の情報もp台のロボット間で送受信可能となる。 Note that, as described above, the p-1 robot on which the behavior control system 100 is not mounted also includes the communication unit 150 and the adjacent state determination unit 124. Therefore, each robot i can determine whether there is an obstacle or another robot in its up / down / left / right direction in the adjacent state determination unit 124 and output the determination result to the behavior control system 100 via the communication unit 150. it can. The behavior control system 100 receives the determination result from each robot i via the communication unit 150 and stores it in the storage unit 140 together with the determination result of the adjacent state determination unit 124 included in the behavior control system 100. In addition, the p robots always have other robots in adjacent positions (4 directions) of each robot, and the group formed by the adjacent robots is a lump, so each robot i is connected via the communication unit 150. The p-1 determination results can be transmitted to the behavior control system 100 directly or via another robot. In addition, the behavior control system 100 can transmit a control signal so as to execute the selected behavior to each robot i directly or via another robot via the communication unit 150. In addition, other information can be transmitted and received between the p robots.

なお、ロボットが移動するように制御したとしても、何らかのトラブル（感知できなかった障害物の存在や、機器の故障等）により、行動選択部１２０の制御通りに移動できるとは限らない。一方、動かなかったロボットの位置は、制御する前と変わらない。よって、少なくとも１台のロボットが動かないように制御することで、動かなかったロボットの位置を基準として、隣接状態判定部１２４による判定結果を用いて、移動するように制御されたロボットの、実際に行動した後の位置（以下「行動後位置」ともいう）（Xr"[i],Yr"[i]）を求めることができる。また、GPS等の少なくとも一台のロボットがGPSを備えれば、そのロボットを基準として、同様に移動後位置を求めることができる。 Even if the robot is controlled to move, it may not be able to move as controlled by the action selection unit 120 due to some trouble (the presence of an obstacle that could not be detected, equipment failure, etc.). On the other hand, the position of the robot that did not move is the same as before the control. Therefore, by controlling so that at least one robot does not move, the actual state of the robot controlled to move using the determination result by the adjacent state determination unit 124 based on the position of the robot that did not move is used as a reference. The position after acting (hereinafter also referred to as “post-behavior position”) (Xr "[i], Yr" [i]) can be obtained. If at least one robot such as GPS is equipped with GPS, the post-movement position can be obtained in the same manner with reference to the robot.

＜位置判定部１２６＞
位置判定部１２６は、隣接状態判定部１２４の判定結果を用いて、行動後位置を求め、行動後位置（Xr"[i],Yr"[i]）と想定位置（Xr'[i],Yr'[i]）とが一致するか否かを判定する（Ｓ１２６）。なお、一致しない場合には、移動するように制御されたロボットが何らかのトラブルにより、制御通りに移動できなかったと考えられる。この場合、行動後位置（Xr"[i],Yr"[i]）と想定位置（Xr'[i],Yr'[i]）との少なくとも一方を補正すればよい。補正方法としては様々な手法が考えられる。例えば、移動した全てのロボットに対して、制御前の位置に戻るように指示し、行動後位置（Xr"[i],Yr"[i]）を補正してもよいし、想定位置（Xr'[i],Yr'[i]）を行動後位置（Xr"[i],Yr"[i]）に合わせて補正してもよい。 <Position determination unit 126>
The position determination unit 126 obtains a post-behavior position using the determination result of the adjacent state determination unit 124, and determines a post-behavior position (Xr "[i], Yr" [i]) and an assumed position (Xr '[i], It is determined whether or not Yr ′ [i]) matches (S126). If they do not match, it is considered that the robot controlled to move could not move as controlled due to some trouble. In this case, at least one of the post-behavior position (Xr "[i], Yr" [i]) and the assumed position (Xr '[i], Yr' [i]) may be corrected. Various methods can be considered as the correction method. For example, all the robots that have moved may be instructed to return to the pre-control position, and the post-behavior position (Xr "[i], Yr" [i]) may be corrected, or the assumed position (Xr '[i], Yr' [i]) may be corrected according to the post-action position (Xr "[i], Yr" [i]).

各時刻ステップごとに、すべてのロボットがG内にあるかどうかを判定し（Ｓ１２７）、すべてのロボットがG内にあるときは、任務を終了する。そうでないときは、任務を継続する。 At each time step, it is determined whether or not all robots are in G (S127). If all robots are in G, the task is terminated. If not, continue the mission.

例えば、図示しない目標位置到達判定部において、各i=0,1,…,p-1について、位置判定部１２６から出力された行動後位置（Xr"[i],Yr"[i]）∈Gであるか否かを判定し、全てのiについて（Xr"[i],Yr"[i]）∈Gである場合には、任務を終了する。少なくとも１つ以上のiについて（Xr"[i],Yr"[i]）∈Gを満たさない場合には、行動選択部１２０を再度実行するよう制御する。 For example, in the target position arrival determination unit (not shown), the post-action position (Xr "[i], Yr" [i]) ∈ output from the position determination unit 126 for each i = 0, 1,. It is determined whether or not G, and if (Xr "[i], Yr" [i]) ∈ G for all i, the mission is terminated. If (Xr ″ [i], Yr ″ [i]) εG is not satisfied for at least one i, the behavior selection unit 120 is controlled to be executed again.

以上に述べた処理ｓ１２０〜ｓ１２７を毎時刻ステップごとに行う。 The processes s120 to s127 described above are performed for each time step.

＜効果＞
このような構成により、一つのロボット単位に必要な分だけのマルコフ状態空間を用意し、それを用いて動的計画法を利用して各位置でのロボット単位の行動方策を計算し、その行動方策を利用することで、ロボット群に任意の隊列形状と、任務環境内の任意の障害物形状に対応した障害物回避アルゴリズムを獲得でき、ロボット同士が接した状態を維持したうえでの多数ロボットのための隊列形成アルゴリズムを獲得することができる。すなわち、計算負荷がロボット数の階乗に比例せず、実時間で実行可能な低計画計算負荷での自己位置座標定義型隊列形成アルゴリズム獲得ができる。また、静止しているロボットに対する相対的な位置を判定することで、絶対的な位置を取得することができるため、付加的な位置計測用の装備を必要としない。または少なくとも一台のGPSを備えるロボットか、開始状態にて位置の定義されたロボットに対する相対的な位置を判定することで、絶対的な位置を取得することができるため、全てのロボットがGPS等を備えなくともよい。 <Effect>
With such a configuration, as many Markov state spaces as necessary for one robot unit are prepared, and using it, the action policy for each robot at each position is calculated using dynamic programming, By using the strategy, you can acquire an obstacle formation algorithm corresponding to an arbitrary formation shape and an arbitrary obstacle shape in the mission environment for the robot group, and keep many robots in contact with each other. You can get a formation algorithm for. That is, the calculation load is not proportional to the factorial of the number of robots, and a self-position coordinate definition type formation formation algorithm can be obtained with a low plan calculation load that can be executed in real time. Moreover, since an absolute position can be acquired by determining a relative position with respect to a stationary robot, no additional position measurement equipment is required. Or, since the absolute position can be obtained by determining the relative position of the robot with at least one GPS or the robot whose position is defined in the start state, all robots have GPS etc. It is not necessary to have.

また、ロボット単位で制御することで、ロボットの群れの中のいずれのロボットが一台欠けても、各ロボットはお互いに一つの辺で接しあう位置関係を崩さずに済む。ロボットの移動量の誤差による隊列崩れも起こりにくい。 In addition, by controlling in units of robots, even if one of the robots in the group of robots is missing, the robots do not lose the positional relationship that they touch each other on one side. It is also difficult for the formation to collapse due to an error in the amount of movement of the robot.

＜変形例＞
本実施形態では、各格子（マス）は、正方形であるが、他の形状であってもよい。格子は左右方向及び上下方向に連続して配置される。また、各格子は左右方向で他の二つの格子と隣接し、上下方向で他の二つの格子と隣接する。言い換えると、各格子は、ロボットの移動できる方向と同じ方向においてのみ、他の格子と隣接する。この条件を満たせば、各格子はどのような形状であってもよい。また、「直交」とは、厳密に「垂直に交わること」を意味しなくともよく、例えば、図２５のように、各格子は、菱形であってもよく、各格子が他の二つの格子と隣接する方向の一方を上下方向とし、他方を左右方向とすればよく、このとき、上下方向と左右方向とは直交するものとする。 <Modification>
In the present embodiment, each lattice (mass) is a square, but may have other shapes. The lattice is continuously arranged in the left-right direction and the up-down direction. Each lattice is adjacent to the other two lattices in the left-right direction and adjacent to the other two lattices in the vertical direction. In other words, each grid is adjacent to other grids only in the same direction that the robot can move. Each lattice may have any shape as long as this condition is satisfied. Further, “orthogonal” does not necessarily mean strictly “vertically intersecting”. For example, as shown in FIG. 25, each lattice may be a rhombus, and each lattice is the other two lattices. One of the adjacent directions may be the vertical direction and the other is the horizontal direction. In this case, the vertical direction and the horizontal direction are orthogonal to each other.

別の言い方をすると、制御対象物は、二次元平面上の、第一方向（例えば右方向）、第一方向に対して平行でない方向である第二方向（例えば上方向）、第一方向に対して反対方向である第三方向（例えば左方向）、第二方向に対して反対方向である第四方向（例えば下方向）に移動可能であり、一回の行動制御により、現在いる領域（格子、マス）から、現在いる領域に対して、第一方向、第二方向、第三方向、第四方向において隣接する領域の何れかに移動するように制御される。また、この場合、ロボットの２次元平面上の、第一方向において隣接する位置を第一位置、第二方向において隣接する位置を第二位置、第三方向において隣接する位置を第三位置、第四方向において隣接する位置を第四位置、第一位置に第二方向において隣接する位置を第五位置、第二位置に第三方向において隣接する位置を第六位置、第三位置に第四方向において隣接する位置を第七位置、第四位置に第一方向において隣接する位置を第八位置と呼んでもよい。 In other words, the control object is in the first direction (for example, the right direction), the second direction (for example, the upward direction) that is not parallel to the first direction, and the first direction on the two-dimensional plane. On the other hand, it is possible to move in the third direction (for example, the left direction) opposite to the second direction, and in the fourth direction (for example, the downward direction) opposite to the second direction. Control is performed so that the current area is moved to one of the adjacent areas in the first direction, the second direction, the third direction, and the fourth direction from the grid. In this case, on the two-dimensional plane of the robot, the position adjacent in the first direction is the first position, the position adjacent in the second direction is the second position, the position adjacent in the third direction is the third position, Positions that are adjacent in the four directions are the fourth position, positions that are adjacent to the first position in the second direction are the fifth positions, positions that are adjacent to the second position in the third direction are the sixth positions, and positions that are adjacent to the third position are the fourth direction May be referred to as the seventh position, and the position adjacent to the fourth position in the first direction as the eighth position.

本実施形態では、行動制御システム１００が、p台のロボットの内の１つのロボット上に実装される例を示したが、コンピュータ上の制御対象物に対して行動制御を行ってもよい。その場合、意図的にトラブルを発生させない限り、制御した通りに制御対象物は移動するため、想定位置（Xr'[i],Yr'[i]）と移動後位置（Xr"[i],Yr"[i]）とが一致する。そのような状態で、各制御対象物i毎の理想的な行動計画を行ってもよい。その結果得られた任務を終了するまでの行動計画を実ロボットに実行させる。実ロボットは、通信部１５０と隣接状態判定部１２４を備え、各行動を終えた後に、隣接状態を判定し、判定結果を用いて、行動後位置を求め、行動後位置（Xr"[i],Yr"[i]）と想定位置（Xr'[i],Yr'[i]）とが一致するか否かを判定する。なお、一致しない場合には、移動するように制御されたロボットが何らかのトラブルにより、制御通りに移動できなかったと考えられる。その後の処理については第一実施形態と同様でもよいし、他の処理を行ってもよい。このような構成により、少なくとも制御通りに移動できなかったことを検知することができる。 In the present embodiment, the example in which the behavior control system 100 is mounted on one of the p robots has been shown, but behavior control may be performed on a control target on a computer. In that case, unless a trouble occurs intentionally, the controlled object moves as controlled, so the assumed position (Xr '[i], Yr' [i]) and the post-movement position (Xr "[i], Yr "[i]) matches. In such a state, an ideal action plan for each control object i may be performed. The actual robot is made to execute the action plan until the mission obtained as a result is completed. The real robot includes a communication unit 150 and an adjacent state determination unit 124. After each action is completed, the adjacent state is determined, the post-action position is obtained using the determination result, and the post-action position (Xr "[i] , Yr "[i]) and the assumed position (Xr '[i], Yr' [i]) are determined to match. If they do not match, it is considered that the robot controlled to move could not move as controlled due to some trouble. Subsequent processing may be the same as in the first embodiment, or other processing may be performed. With such a configuration, it is possible to detect at least that movement was not possible as controlled.

本実施形態では、一回の行動制御に対して、一回の位置判定を行っているが、一回の行動制御において、複数のロボットを移動させる場合には、各ロボットが移動の度に位置判定を行う構成としてもよい。 In this embodiment, one position determination is performed for one action control. However, when a plurality of robots are moved in one action control, each robot is moved each time it is moved. It is good also as a structure which performs determination.

本実施形態では、ロボット単位は、２×２の４台のロボットからなるが、M×N台のロボットからなってもよい。ただし、M及びNはそれぞれ２以上の整数の何れかである。なお、ロボットの台数pはロボット単位に含まれるロボットの台数M×N以上とする。このような状態で移動を行った場合にも、２×２のロボット単位のときと同様に、ロボットの群れの中のいずれのロボットが一台欠けても、各ロボットはお互いに一つの辺で接しあう位置関係を崩さずに済む。M×Nの台数を大きくすると、価値関数Q(s,a)の学習において、行動できるパターンが減るため、価値関数Q(s,a)の学習の計算量が減る。しかし、価値関数Q(s,a)の学習において、M×N個のマスを一つの単位とするマス単位の中に1つでも障害物が含まれる場合には、そのマス単位の状態sは障害物として扱うため、M×Nの台数を大きくすると、障害物の範囲が多くなり、行動できる範囲が小さくなる。そもそも一つのロボット単位に必要な分だけのマルコフ状態空間を用意し、それを用いて動的計画法を利用して各位置でのロボットの行動方策を計算すればよいため、従来技術に比べれば、十分に価値関数Q(s,a)の学習の計算量が減っている。そのため、行動できる範囲を大きく保つために２×２台のロボットを一つの単位とするロボット単位とすることが望ましい。 In the present embodiment, the robot unit is composed of four 2 × 2 robots, but may be composed of M × N robots. However, M and N are each an integer of 2 or more. Note that the number of robots p is greater than or equal to the number of robots M × N included in the robot unit. Even when moving in such a state, as in the case of a 2 × 2 robot unit, even if one of the robots in the group of robots is missing, each robot is on one side of each other. There is no need to break the positional relationship that touches. If the number of M × N is increased, the number of patterns that can be acted on in learning of the value function Q (s, a) is reduced, so that the amount of calculation of learning of the value function Q (s, a) is reduced. However, in the learning of the value function Q (s, a), if at least one obstacle is included in the square unit with M × N squares as one unit, the state s of the square unit is Since it is handled as an obstacle, if the number of M × N is increased, the range of obstacles increases, and the range in which action can be performed decreases. In the first place, as many Markov state spaces as needed for one robot unit are prepared, and it is only necessary to calculate the action policy of the robot at each position using dynamic programming. The amount of learning of the value function Q (s, a) is sufficiently reduced. For this reason, in order to keep a large range in which the robot can act, it is desirable to use 2 × 2 robots as a robot unit.

＜第二実施形態＞
第一実施形態との違いを中心に説明する。第一実施形態の制御方法を三次元空間に拡張する。本実施形態では、
(1)ロボット隊列を3次元化する。 <Second embodiment>
The difference from the first embodiment will be mainly described. The control method of the first embodiment is extended to a three-dimensional space. In this embodiment,
(1) Three-dimensional robot platoon.

(2)ボイドコントロールを並列処理化して隊列形成動作を高速化する。 (2) Speed up the formation process by parallelizing void control.

(3)動作計画の計算量をO(n²)からO(n)に軽減する。 (3) Reduce the computational complexity of the motion plan from O (n ² ) to O (n).

(4)頭部ロボット単位Dの選択方法を改良して、24個以上のロボット単位で構成される任意形状のロボット群への適用を保証する。 (4) Improve the selection method of head robot unit D and guarantee application to robots of arbitrary shape composed of 24 or more robot units.

本実施形態では、多数のロボットの存在を考慮しつつも、計画計算に必要な計算時間をロボットの台数の一乗に比例するように抑え、また、他のロボットの面の上でのスライディング動作しかできないロボット同士が接したまま状態を維持しつつ任意の隊列から、別の任意の隊列へ障害物のある環境にて変形動作を行うことを可能とする。 In this embodiment, while considering the existence of a large number of robots, the calculation time required for the planned calculation is limited to the first power of the number of robots, and only sliding operations on the surfaces of other robots are possible. It is possible to perform a deformation operation from an arbitrary platoon to another arbitrary platoon in an environment where there is an obstacle while maintaining a state in which the robots that cannot be contacted.

また、本実施形態に拠れば、詳しくは後述するが、ロボットが変形を行う空間に含まれる格子数に比例した計算量で、ロボットの動作計画が可能であり、その動作計画の結果を利用することで、ロボット台数の2乗に比例した実行時間で、任意形状の障害物が存在する環境において、ロボットに任意の形状から任意の形状への変形をさせることが可能である。 Further, according to the present embodiment, as will be described in detail later, it is possible to plan an operation of the robot with a calculation amount proportional to the number of grids included in the space in which the robot deforms, and use the result of the operation plan. Thus, it is possible to cause the robot to deform from an arbitrary shape to an arbitrary shape in an environment where an obstacle of an arbitrary shape exists in an execution time proportional to the square of the number of robots.

[問題設定]
多数のロボットが協調して開始位置における隊列形成状態から、各ロボットが接した状態を維持しつつ移動を行い、目標位置での隊列形成を行う任務は、例えば図２６に例示するような、互いに接する面同志をスライドさせて移動していくことが可能な立方体型のロボットの使用を想定する。図２７に示すように、壁で区切られた部屋（ただし図中、壁を省略する）においての開始位置から目標位置まで複数のロボットの移動によって実現するものである。 [Problem settings]
The task of performing the formation at the target position by moving the robot while maintaining the state in which each robot is in contact from the formation state at the start position in cooperation with a number of robots, Assume the use of a cube-type robot that can move by moving the faces that touch each other. As shown in FIG. 27, this is realized by the movement of a plurality of robots from a start position to a target position in a room partitioned by walls (however, walls are omitted in the figure).

ロボットについては、例えば図２６に示すように、ロボットの周囲縦横高さ方向（以下「上下左右前後方向」ともいう）６マスのうち一つに他のロボットが存在している状態を維持しながら移動をするものとする。この手法では１つのロボット自身が、一台のロボットのサイズ分の距離を移動することで、一回の動作の移動量を正確に測ることができるというメリットがある。また、一つの面を共有する隣り合うロボットとの相対的な位置を計測しあうことで、ロボットの群れ全体の中での各ロボットの位置も容易に知ることができる。このため、ロボットの移動量の誤差によって、隊列が崩れるといった問題を起こしにくい。また、複数のロボットを連結したように、同時に複数のロボットを移動させていくことが可能である。 As for the robot, for example, as shown in FIG. 26, while maintaining the state in which another robot exists in one of six squares in the vertical and horizontal height directions of the robot (hereinafter also referred to as “vertical, horizontal and vertical directions”). It shall be moved. This method has an advantage that one robot itself can accurately measure the movement amount of one operation by moving the distance corresponding to the size of one robot. In addition, by measuring the relative positions of adjacent robots that share one surface, the position of each robot in the entire group of robots can be easily known. For this reason, it is difficult to cause a problem that the formation is collapsed due to an error in the movement amount of the robot. Further, it is possible to move a plurality of robots at the same time as if a plurality of robots were connected.

なお、ロボットは、隣の位置に他のロボットが存在しているか否か、障害物があるか否か、そして、自身が目標位置上にいるかどうかを知ることができるものとする。 It is assumed that the robot can know whether another robot is present at the adjacent position, whether there is an obstacle, and whether the robot is on the target position.

任務を行うロボットは、p台（p≧192=24×8）であり、各ロボットは、隣接するロボットと一面以上を共有しつつ、三次元空間におけるX-Y-Z軸方向に移動可能とする。図２６の各立方体は、それぞれのロボットの位置を示すものである。各立方体にはロボットは一台しか存在することができない。それぞれのロボットは、移動しようとする方向に障害物か他のロボットがある場合には、静止をするものと仮定する。なお、ロボットが存在しうる立方体状の空間をマス、または、格子ともいう。図２７において、グレーに塗りつぶされたマスはロボットが存在する位置を示し、実線で囲まれた白抜きのマスは障害物が存在する位置を示す。図２７Ａのロボットが存在する位置はロボットの開始位置の集合を示し、図２７Ｂのロボットが存在する位置はロボットの目標位置の集合を示す。目標位置の集合で表される領域を目標隊列エリアともいう。このように、各開始位置及び各目標位置は、それぞれ縦横高さ方向の少なくとも何れかの方向において他の開始位置及び目標位置と隣接し、ロボットの開始位置及び目標位置での隊列形状はそれぞれ一塊の任意の形状である。 The robots that perform the mission are p units (p ≧ 192 = 24 × 8), and each robot can move in the X-Y-Z-axis direction in the three-dimensional space while sharing one or more surfaces with adjacent robots. Each cube in FIG. 26 indicates the position of each robot. Each cube can have only one robot. Each robot is assumed to be stationary if there are obstacles or other robots in the direction of movement. Note that a cubic space where the robot can exist is also referred to as a mass or a lattice. In FIG. 27, a square filled with gray indicates a position where the robot is present, and a white square surrounded by a solid line indicates a position where an obstacle is present. The position where the robot of FIG. 27A exists indicates a set of start positions of the robot, and the position where the robot of FIG. 27B exists indicates a set of target positions of the robot. An area represented by a set of target positions is also called a target platoon area. As described above, each start position and each target position are adjacent to other start positions and target positions in at least one of the vertical and horizontal height directions, and the formations at the start position and the target position of the robot are each in a lump. Any shape.

[ロボットの座標設定]
それぞれのロボットi（iはロボット番号を表すi=0,1,2,3,,,p-1）の初期位置を（Xr0[i],Yr0[i],Zr0[i]）とし、目標位置を（Xre[i],Yre[i],Zre[i]）とするとき、本問題は、初期位置に配置されたロボットが、目標位置まで移動するための行動計画を求めることと定義できる。目標位置（Xre[i],Yre[i],Zre[i]）の集合をGとする。 [Robot coordinate settings]
The initial position of each robot i (i is the robot number i = 0,1,2,3 ,,, p-1) is (Xr0 [i], Yr0 [i], Zr0 [i]) When the position is (Xre [i], Yre [i], Zre [i]), this problem can be defined as finding an action plan for the robot placed at the initial position to move to the target position. . Let G be a set of target positions (Xre [i], Yre [i], Zre [i]).

[任務空間の定義]
iをロボット番号としたとき、ロボットiの各状態（ロボットの位置と行動）は離散値で表現される。部屋をX,Y,Zの直交座標系からなる3次元空間で表すと、X軸、Y軸、Z軸をそれぞれ離散化表現した値により各位置を表現する。つまり、部屋（３次元空間）は格子で区切られ、各格子が各位置に対応する。また、各格子において、障害物の「ある／なし」が予め設定されている。 [Definition of mission space]
When i is a robot number, each state (robot position and action) of the robot i is expressed by discrete values. When a room is represented in a three-dimensional space composed of an orthogonal coordinate system of X, Y, and Z, each position is represented by a discrete representation of the X, Y, and Z axes. That is, the room (three-dimensional space) is divided by a lattice, and each lattice corresponds to each position. In each grid, “present / none” of the obstacle is set in advance.

[ロボット動作の定義]
また、行動主体は部屋に配置されている各ロボットとなる。ロボットi（iはロボット番号）の行動aは、静止、縦横高さ方向への１格子分の移動、の計７種類のうちのいずれかを取る。例えば、a∈{0,1,2,3,4,5,6}として、
0: 静止
1: 三次元空間内でX軸正方向に１格子だけ移動する
2: 三次元空間内でY軸正方向に１格子だけ移動する
3: 三次元空間内でX軸負方向に１格子だけ移動する
4: 三次元空間内でY軸負方向に１格子だけ移動する
5: 三次元空間内でZ軸正方向に１格子だけ移動する
6: 三次元空間内でZ軸負方向に１格子だけ移動する
とする。 [Definition of robot motion]
The action subject is each robot arranged in the room. The action a of the robot i (i is a robot number) takes one of a total of seven types: stationary and movement of one lattice in the vertical and horizontal height directions. For example, if a∈ {0,1,2,3,4,5,6}
0: stationary
1: Move one grid in the positive direction of X axis in 3D space
2: Move one grid in the positive Y-axis direction in 3D space
3: Move one grid in the negative X-axis direction in 3D space
4: Move one grid in the negative Y-axis direction in 3D space
5: Move one grid in the positive Z-axis direction in 3D space
6: Suppose that one grid moves in the negative Z-axis direction in the three-dimensional space.

[探索計算上の問題点]
このような任務環境における状態空間は、ロボット数×3の次元数の状態を持ち、かつ選択可能な行動数は、ロボットの行動（＝７通り）のロボット数乗だけ存在する。例えば、ロボット数が５０で、部屋の縦横高さ方向の格子数がそれぞれ２０であるとすれば状態数は２０の150乗個にもなり、探索計算に要する資源の量は膨大なものとなる。さらにロボット数が１台増えるごとに、その状態数は8000倍増加していくことになる。本実施形態の[問題設定]の項で説明したように、ロボット同士が接しているという拘束条件を取り入れる場合、ロボットのお互いの移動を考慮したうえで探索計算行わなければならないために、根本的な計算量の削減は難しく、複数ロボットを使用する場合の大きな問題となっている。 [Problems in search calculation]
The state space in such a task environment has a state of the number of dimensions of the number of robots × 3, and the number of actions that can be selected exists by the robot power (= 7 ways) of the number of robots. For example, if the number of robots is 50 and the number of grids in the vertical and horizontal height directions of the room is 20, the number of states becomes 20 to the 150th power, and the amount of resources required for the search calculation becomes enormous. . As the number of robots further increases, the number of states will increase by 8000 times. As described in the [Problem Setting] section of this embodiment, when the constraint condition that the robots are in contact with each other is taken in, the search calculation must be performed in consideration of the mutual movement of the robots. It is difficult to reduce the amount of calculation, which is a big problem when using multiple robots.

[二つの動作計画の導入]
そこで、第一実施形態の場合と同様に二つの動作計画を導入する。本実施形態で必要とされる探索計算は、大きく二つにわけられる。ここでそのための準備として、ある隊列Gaをなしているロボットが、次になすべき隊列の形状としてGbを考える。隊列Gaと隊列Gbの間には、隊列Gaにて存在し、隊列Gbにて存在しない位置にあるロボットと、隊列Gbにて存在し隊列Gaにて存在しない位置にあるロボットがある。前者を尾部ロボットとし、後者を頭部ロボットとする。本実施形態では、ちょうど、頭部ロボット位置にあるロボットが、ロボット群の移動において先頭を務め、尾部ロボットが、移動の最後尾を務めるロボットになるように、隊列Gaと隊列Gbを定義する。 [Introduction of two action plans]
Therefore, two operation plans are introduced as in the case of the first embodiment. The search calculation required in this embodiment is roughly divided into two. Here, as a preparation for this, a robot forming a certain column Ga considers Gb as the shape of the next column to be formed. Between the formations Ga and Gb, there are robots that exist in the formation Ga and that do not exist in the formation Gb, and robots that exist in the formation Gb and do not exist in the formation Ga. The former is a tail robot and the latter is a head robot. In the present embodiment, the formation Ga and the formation Gb are defined so that the robot at the head robot position is the first robot in the movement of the robot group, and the tail robot is the last robot in the movement.

この定義の元、一つ目の探索計算は、隊列Gaをなすロボットの群が、障害物を回避しつつも、目標位置まで移動するためには、次の隊列Gbとしてどの隊列を選択すればよいかを決定するための（図２８）動作計画計算（移動先隊列決定用動作計画）である。この探索計算は、事実上、頭部ロボット位置に移動するロボットとして、隊列Gaのどのロボットを選択するかと、隊列Gaにおける尾部ロボットを決定するための計算である。 Based on this definition, the first search calculation shows that the group of robots that form the formation Ga is selected as the next formation Gb in order to move to the target position while avoiding obstacles. This is an operation plan calculation (operation plan for determining a movement destination platoon) for determining whether it is acceptable (FIG. 28). This search calculation is actually a calculation for determining which robot of the formation Ga is selected as the robot moving to the head robot position and the tail robot in the formation Ga.

二つ目の探索計算は、ロボットが互いに接した位置関係を維持しつつも、隊列Gaから隊列Gbへ変形する際の各ロボットの動作を決定するための動作計画計算（各ロボット動作用動作計画）である。これは、事実上、移動先隊列決定用動作計画により決定済みの頭部ロボット及び尾部ロボットに対し、尾部ロボットの位置からロボットを追い出し、隊列Gbにおける頭部ロボットの位置へロボットを誘導するための探索計算である。 The second search calculation is an action plan calculation (an action plan for each robot action) that determines the action of each robot when transforming from formation Ga to formation Gb while maintaining the positional relationship where the robots are in contact with each other. ). This is in effect to drive the robot from the position of the tail robot to the position of the head robot in the formation Gb, with respect to the head robot and tail robot that have already been determined by the movement plan for the movement target row determination. Search calculation.

本実施形態では、これらの探索計算を、目標位置の集合G内に一点の入口位置を設定し、その入口位置からの各ロボット位置の全体ロボット構造内でのマンハッタン距離を計算することで行う。 In this embodiment, these search calculations are performed by setting a single entrance position in the set G of target positions and calculating the Manhattan distance in the entire robot structure of each robot position from the entrance position.

[ボイド制御]
上記二つの探索計算負荷の軽減のため、本実施形態においても、ボイド制御の考え方を導入する。まず、ここでいうボイドとは、あるロボットが別の位置に移動した後に、元いた位置にできる空隙のことである。こうした群ロボットの隊列形成問題においては、複数のロボットの動作に着目するがゆえに、その探索計算量が爆発してしまうが、視点を変えて、ボイドの動きに着目すれば、多数のロボットの動作計画の問題を単一のボイドの動作計画として考えることができ、探索計算負荷の軽減に適している。 [Void control]
In order to reduce the above two search calculation loads, the concept of void control is also introduced in this embodiment. First, the term “void” as used herein refers to a gap that can be restored to the original position after a robot has moved to another position. In such a group robot formation problem, the amount of search calculation explodes because it focuses on the movements of multiple robots. The planning problem can be considered as a single void motion plan, which is suitable for reducing the search calculation load.

まず、移動先群決定用動作計画において、ロボット群の移動の先頭を務める頭部ロボットの動作について考える。本実施形態では、目標位置の集合Gに一か所の入口位置Peを設けて、他の位置からは頭部ロボットがGに入ることはできないこととする。また、一度G内に入った全てのロボットは、二度とG外には出られないこととする。この条件のもと、G外にいる頭部ロボットが入口位置Peに至るまでのロボットの経路決定問題は、一台のロボットを入口位置Peまで誘導するための探索問題を解けばよく、得られた解を頭部ロボットの選択と誘導のために使用すればよい。目標位置の集合G内の頭部ロボットの動作については、ボイドの動きに着目して、ボイドの誘導について同様のことを行う。すなわち、目標位置の集合G内の頭部ロボットに望まれる動作とは、適切に頭部ロボットが入口位置Peから遠ざかり、目標位置の集合G内の各場所に散らばっていきつつ目標位置の集合Gを埋めていくことであるが、これをボイドの動きとしてみれば、ロボットが入口位置Peから、目標位置の集合G内に入るたびに、ボイドはそれに伴って、入口位置Peから目標位置の集合Gの外に出ていくのである。このとき、目標位置の集合G内のボイドは、すべて一点の入口位置Peに向かって移動していくのが理想的な動作である。つまり、目標位置の集合G内の頭部ロボットを目標位置の集合Gに適切に散らばるように誘導するためには、目標位置の集合G内に位置する各ボイドが入口位置Peに至るための経路を探索計算すればよく、そうして求められたボイドの動作を頭部ロボットを動かすことで実現するようにすればよい。この考え方ならば、すべてのロボットが、一点、入口位置Peを目指して動作するための動作計画を一回行えばよいので、計算負荷が劇的に少なくなる。 First, consider the behavior of the head robot that acts as the head of the movement of the robot group in the movement plan for determining the movement destination group. In the present embodiment, one entrance position Pe is provided in the set G of target positions, and the head robot cannot enter G from other positions. Also, all robots that have entered G once cannot go outside G again. Under this condition, the robot routing problem until the head robot outside G reaches the entrance position Pe can be obtained by solving the search problem for guiding one robot to the entrance position Pe. Can be used for head robot selection and guidance. As for the operation of the head robot in the set G of target positions, attention is paid to the movement of the void and the same is performed for the guidance of the void. That is, the desired behavior of the head robot in the target position set G is that the head robot appropriately moves away from the entrance position Pe and is scattered in each place in the target position set G. If this robot is seen as the movement of a void, each time the robot enters the target position set G from the entrance position Pe, the void is accompanied by a set of target positions from the entrance position Pe. Go out of G. At this time, it is an ideal operation that all the voids in the set G of target positions move toward the single entrance position Pe. In other words, in order to guide the head robot in the target position set G so as to be properly scattered in the target position set G, the path through which each void located in the target position set G reaches the entrance position Pe. It is only necessary to search for and calculate the void motion obtained by moving the head robot. With this concept, all the robots need only perform an operation plan for one point of movement toward the entrance position Pe, so the calculation load is dramatically reduced.

続いて、隊列Gaを維持しているロボット群が、隊列Gbに変形することを考える。このとき、ロボット群が隊列Gaから隊列Gbに変形する動作とは、頭部ロボットの位置にいた一つのボイドが、尾部ロボットの位置に移動していくプロセスとしてとらえることができる（図２８及び図２９参照）。このようなボイドの動作は、頭部ロボット位置をスタート位置とし、尾部ロボット位置をゴール位置とした、一つのボイドの動作計画によって計算可能であり、計算負荷も小さい。この計算において考慮すべきボイドの移動の拘束条件としては、ボイドが移動する際に、ロボットが存在する位置をたどっていくという制限を設けるだけでよい。 Next, consider that the robot group that maintains the formation Ga transforms into the formation Gb. At this time, the movement of the robot group from the formation Ga to the formation Gb can be regarded as a process in which one void at the position of the head robot moves to the position of the tail robot (FIGS. 28 and 28). 29). Such a motion of the void can be calculated by an operation plan of one void with the head robot position as the start position and the tail robot position as the goal position, and the calculation load is small. As a constraint condition for the movement of the void to be considered in this calculation, it is only necessary to provide a restriction that the robot follows the position where the robot exists when the void moves.

[8マスロボット単位の導入]
さらに、本実施形態では、図３０に示すように、８つの田の字状に隣接したロボットを一つの単位とし（ロボット単位）、ロボットは、この田の字型のロボット単位を維持しつつ移動を行うとする。言い換えると、８台毎に１つのロボット単位を構成し、１つのロボット単位を構成する８台のロボットはそれぞれ３つの方向において１つのロボット単位を構成する他のロボットと隣接した状態を維持しつつ移動を行う。このロボット単位の集団は、互いにロボット単位ごとに一面を共有し、接しながら移動をするように制御される。 [Introduction of 8 mass robot units]
Furthermore, in this embodiment, as shown in FIG. 30, the robot adjacent to the eight-field shape is defined as one unit (robot unit), and the robot moves while maintaining this field-shaped robot unit. Suppose that In other words, every eight robots constitute one robot unit, and each of the eight robots constituting one robot unit maintains a state adjacent to another robot constituting one robot unit in three directions. Move. This group of robot units is controlled so that each robot unit shares one surface and moves while touching.

このような８つのロボットを一つの単位とした移動を行う理由としては、このような状態で移動を行う限り、各ロボット単位の中のいずれのロボットが一台のみ欠けても、各ロボット単位はお互いに一つの面で接しあう位置関係を崩さずに済むからである。すなわちこれは、隊列形態の維持を考量しなければならない各ロボット動作用動作計画において、ロボット同士の接続を考慮するための計算負荷を軽減することにつながるからである。各ロボット単位内のボイドが１つ以内であれば、全ロボット単位内にボイドが存在してもよいことになるので、一度に複数のボイドをロボット群の中に並列で存在させて、ロボットの隊列変形動作を高速化することも可能である。 The reason why such eight robots are moved as one unit is that, as long as movement is performed in such a state, each robot unit is This is because it is not necessary to destroy the positional relationship of contacting each other on one surface. In other words, this is because the calculation load for considering the connection between the robots is reduced in each robot motion operation plan in which it is necessary to consider maintaining the formation of the formation. If there is no more than one void in each robot unit, there may be voids in all robot units, so multiple robots can exist in parallel in the robot group at the same time. It is also possible to speed up the formation deformation operation.

先に述べた移動先隊列決定用動作計画は、８台のロボットによるロボット単位がなす群れの移動について行う。先に述べた頭部ロボット、尾部ロボットは、それぞれ８台のロボットで構成される頭部ロボット単位、尾部ロボット単位となる。各ロボット動作用動作計画は、頭部ロボット単位に含まれる８つのロボットの位置に、ロボットを誘導し、尾部ロボット単位の位置に含まれるロボットから、８つのロボットを追い払うための動作を探索計算する。 The movement plan for determining the movement destination corps described above is performed for a movement of a group of eight robots. The head robot and tail robot described above are in units of head robot and tail robot each composed of eight robots. Each robot operation plan guides the robot to the positions of the eight robots included in the head robot unit, and searches for and calculates the operation for driving away the eight robots from the robots included in the position of the tail robot unit. .

ここでは８台のロボットがなすロボット単位が一つのマスの単位（本実施形態では、以下、この単位を「マス単位」または「位置単位」ともいう）であるとし、一つのマス単位を一状態として状態空間を組む。ロボット単位の位置を（Xr_unit[j],Yr_unit[j], Zr_unit[j]）（j=0,1,2,…j_max-1）としたとき、そのロボット単位jに所属するロボットをi1,i2,i3,i4,i5,i6,i7,i8とすれば、
Xri1 = 2 x Xr_unit[j]
Yri1 = 2 x Yr_unit[j]
Zri1 = 2 x Zr_unit[j]
Xri2 = 2 x Xr_unit[j] + 1
Yri2 = 2 x Yr_unit[j]
Zri2 = 2 x Zr_unit[j]
Xri3 = 2 x Xr_unit[j]
Yri3 = 2 x Yr_unit[j] + 1
Zri3 = 2 x Zr_unit[j]
Xri4 = 2 x Xr_unit[j] + 1
Yri4 = 2 x Yr_unit[j] + 1
Zri4 = 2 x Zr_unit[j]
Xri5 = 2 x Xr_unit[j]
Yri5 = 2 x Yr_unit[j]
Zri5 = 2 x Zr_unit[j] + 1
Xri6 = 2 x Xr_unit[j] + 1
Yri6 = 2 x Yr_unit[j]
Zri6 = 2 x Zr_unit[j] + 1
Xri7 = 2 x Xr_unit[j]
Yri7 = 2 x Yr_unit[j] + 1
Zri7 = 2 x Zr_unit[j] + 1
Xri8 = 2 x Xr_unit[j] + 1
Yri8 = 2 x Yr_unit[j] + 1
Zri8 = 2 x Zr_unit[j] + 1
である。 Here, it is assumed that the robot unit formed by eight robots is one unit of mass (in the present embodiment, this unit is hereinafter also referred to as “mass unit” or “position unit”), and one mass unit is in one state. As a state space. When the position of the robot unit is (Xr_unit [j], Yr_unit [j], Zr_unit [j]) (j = 0,1,2, ... j_max-1), the robot belonging to the robot unit j is i1, i2, i3, i4, i5, i6, i7, i8
Xri1 = 2 x Xr_unit [j]
Yri1 = 2 x Yr_unit [j]
Zri1 = 2 x Zr_unit [j]
Xri2 = 2 x Xr_unit [j] + 1
Yri2 = 2 x Yr_unit [j]
Zri2 = 2 x Zr_unit [j]
Xri3 = 2 x Xr_unit [j]
Yri3 = 2 x Yr_unit [j] + 1
Zri3 = 2 x Zr_unit [j]
Xri4 = 2 x Xr_unit [j] + 1
Yri4 = 2 x Yr_unit [j] + 1
Zri4 = 2 x Zr_unit [j]
Xri5 = 2 x Xr_unit [j]
Yri5 = 2 x Yr_unit [j]
Zri5 = 2 x Zr_unit [j] + 1
Xri6 = 2 x Xr_unit [j] + 1
Yri6 = 2 x Yr_unit [j]
Zri6 = 2 x Zr_unit [j] + 1
Xri7 = 2 x Xr_unit [j]
Yri7 = 2 x Yr_unit [j] + 1
Zri7 = 2 x Zr_unit [j] + 1
Xri8 = 2 x Xr_unit [j] + 1
Yri8 = 2 x Yr_unit [j] + 1
Zri8 = 2 x Zr_unit [j] + 1
It is.

頭部ロボット単位決定用の開始位置の集合をSU、目標位置の集合をGUとする。開始位置の集合SU,目標位置の集合GUは、ロボット単位と同様に8つのロボット位置が一単位となったマスで構成される。GU内には一つの入口位置単位PeUが設けられ、各ロボットは、PeU内の位置を経由してGU内に入る。なお、移動先隊列決定用動作計画にて使用するロボット単位の総数は、開始位置の集合SU及び目標位置の集合GU内のロボット単位の数と同じでなければならない。よって、ロボットの全体数pを8の倍数とし、開始位置の集合SU及び目標位置の集合GUにおいて、全てのロボット単位は8台のロボットが充填され、端数のロボットは生じないものとする。また目標位置の集合GU及び開始位置の集合SUはそれぞれR個の位置単位からなる一塊の任意の形状を成す。 The set of start positions for head robot unit determination is SU, and the set of target positions is GU. The start position set SU and the target position set GU are composed of cells in which eight robot positions become one unit, similar to a robot unit. One entrance position unit PeU is provided in the GU, and each robot enters the GU via a position in the PeU. It should be noted that the total number of robot units used in the movement plan for the movement destination platoon determination must be the same as the number of robot units in the start position set SU and the target position set GU. Therefore, the total number p of robots is a multiple of 8, and in the set SU of start positions and the set GU of target positions, all robot units are filled with 8 robots, and no fractional robots are generated. The set of target positions GU and the set of start positions SU each have an arbitrary shape made up of R position units.

[移動先隊列決定用動作計画]
本実施形態では、ロボットが動作を開始する前にあらかじめ、２つの動作計画のために使用する任務空間内の各位置単位の入口位置単位PeUからのマンハッタン距離の計算を行う。そのために、まず、任務空間内の各位置単位（X,Y,Z）にて、入口位置単位PeUからの各位置単位へのマンハッタン距離δ[X][Y][Z]を以下の計算手続きで求める。なお、以下において、説明を簡単にするため「位置単位」を単に「位置」ともいい、「入口位置単位PeU」を単に「入口位置Pe」ともいう。 [Movement plan for moving to move to]
In this embodiment, the Manhattan distance from the entrance position unit PeU of each position unit in the mission space used for two operation plans is calculated in advance before the robot starts the operation. For this purpose, first, for each position unit (X, Y, Z) in the mission space, the Manhattan distance δ [X] [Y] [Z] from the entrance position unit PeU to each position unit is calculated as follows: Ask for. In the following, for simplicity of explanation, the “position unit” is also simply referred to as “position”, and the “entrance position unit PeU” is also simply referred to as “entrance position Pe”.

[マンハッタン距離δの計算]
(1)各位置（X,Y,Z）において、next[a][X][Y][Z]を用意し、隣接する行動aの方向に障害物があるか、もしくは位置（X,Y,Z）が入口位置Pe以外の目標位置の集合GU内の位置の場合に、隣接する行動aの方向の位置が目標位置の集合GUの外であるか、もしくは位置（X,Y,Z）が目標位置の集合GU外の場合に、隣接する行動aの方向の位置が入口位置Pe以外の目標位置の集合GU内の位置である場合には、next[a][X][Y][Z]←0とし、それ以外の場合はnext[a][X][Y][Z]←１とする。すなわち、next[a][X][Y][Z]=0は位置（X,Y,Z）において行動aを実行できないことを、next[a][X][Y][Z]=1は位置（X,Y,Z）において行動aを実行できることを意味する。
(2)状態空間内の各位置（X,Y,Z）のマンハッタン距離δ[X][Y][Z]を状態空間内の格子数より大きな値s_maxに初期化する。
(3)入口位置Peのマンハッタン距離δを0に初期化する。
(4)入口位置Pe以外の、各位置（X,Y,Z）の全行動aについて、next[a][X][Y][Z]の値が0ではない場合の、行動aによって位置(X,Y,Z)から移動した先の位置（X’,Y’,Z’）でのマンハッタン距離δ[X’][Y’][Z’]を調べ、その最小値に１を加えた値が、現在のδ[X][Y][Z]よりも小さい場合は、その値をδ[X][Y][Z]に代入する。
(5)上述の(4)の処理にて、δ[X][Y][Z]値の更新がなくなるまで、(4)を繰り返す。 [Calculation of Manhattan distance δ]
(1) Next [a] [X] [Y] [Z] is prepared at each position (X, Y, Z), and there is an obstacle in the direction of the adjacent action a or the position (X, Y , Z) is a position in the target position set GU other than the entrance position Pe, the position of the adjacent action a is outside the target position set GU, or the position (X, Y, Z) Is outside the target position set GU, and the position in the direction of the adjacent action a is a position in the target position set GU other than the entrance position Pe, next [a] [X] [Y] [ Z] ← 0, otherwise, next [a] [X] [Y] [Z] ← 1. That is, next [a] [X] [Y] [Z] = 0 indicates that the action a cannot be executed at the position (X, Y, Z). Means that the action a can be executed at the position (X, Y, Z).
(2) The Manhattan distance δ [X] [Y] [Z] at each position (X, Y, Z) in the state space is initialized to a value s_max larger than the number of lattices in the state space.
(3) The Manhattan distance δ at the entrance position Pe is initialized to zero.
(4) For all actions a at each position (X, Y, Z) other than the entrance position Pe, the position is determined by action a when the value of next [a] [X] [Y] [Z] is not 0. Check the Manhattan distance δ [X '] [Y'] [Z '] at the previous position (X', Y ', Z') moved from (X, Y, Z), and add 1 to the minimum value. If the obtained value is smaller than the current δ [X] [Y] [Z], the value is substituted into δ [X] [Y] [Z].
(5) Repeat (4) until there is no more update of the δ [X] [Y] [Z] value in the processing of (4) above.

以上の処理で計算したマンハッタン距離δ[X][Y][Z]を使用し、続いて開始位置の集合SUと目標位置の集合GUをつなぐパスPを決定する。パスPの決定は以下の処理によってなされる。パスPをなす位置をp[ip](ip=0,1,2,3…)とする。 The Manhattan distance δ [X] [Y] [Z] calculated by the above processing is used, and then a path P connecting the start position set SU and the target position set GU is determined. The path P is determined by the following process. The position forming the path P is assumed to be p [ip] (ip = 0, 1, 2, 3 ...).

[パスPの決定]
(1)開始位置の集合SU内にて、最もマンハッタン距離δ[X][Y][Z]値の小さな位置をPsとする。
(2)位置Psに隣接し、位置Psにおけるマンハッタン距離δよりも１だけ小さなマンハッタン距離δを持つ開始位置の集合SUの外の位置をパスPの始点p[0]とする。
(3)ip←0とする。
(4)位置p[ip]に隣接する位置の中で、位置p[ip]におけるマンハッタン距離δよりも１だけ小さなマンハッタン距離δを持つ位置をp[ip+1]とする。もし、p[ip+1]が入口位置Peに一致しないならば、ipをインクリメントし、(4)を繰りかえす。一致する場合は(5)へ移行する。
(5)目標位置の集合GU及びパスPの何れにも含まれない全ての位置のマンハッタン距離δをs_maxに再設定する。
(6)開始位置の集合SU内の全ての位置(X,Y,Z)の全行動aについて、next[a][X][Y][Z]の値が0ではない場合の、行動aによって位置(X,Y,Z)から移動した先の位置（X’,Y’,Z’）でのマンハッタン距離δ[X’][Y’][Z’]を調べ、その最小値に１を加えた値が、現在のδ[X][Y][Z]よりも小さい場合は、その値をδ[X][Y][Z]に代入する。
(7)上述の(6)の処理にて、δ[X][Y][Z]値の更新がなくなるまで、(6)を繰り返す。
(8)目標位置の集合GU内の全ての点のマンハッタン距離δの符号を負にする。この処理により、目標位置の集合GU内では、入口位置Peより離れるほどマンハッタン距離δが小さくなる。 [Determine path P]
(1) A position having the smallest Manhattan distance δ [X] [Y] [Z] value in the start position set SU is defined as Ps.
(2) A position outside the start position set SU adjacent to the position Ps and having a Manhattan distance δ that is smaller by one than the Manhattan distance δ at the position Ps is defined as a starting point p [0] of the path P.
(3) Set ip ← 0.
(4) Among positions adjacent to the position p [ip], a position having a Manhattan distance δ smaller by 1 than the Manhattan distance δ at the position p [ip] is defined as p [ip + 1]. If p [ip + 1] does not coincide with the entry position Pe, ip is incremented and (4) is repeated. If they match, go to (5).
(5) Reset the Manhattan distance δ of all positions not included in either the target position set GU or the path P to s_max.
(6) Action a when the value of next [a] [X] [Y] [Z] is not 0 for all actions a at all positions (X, Y, Z) in the start position set SU The Manhattan distance δ [X '] [Y'] [Z '] at the previous position (X', Y ', Z') moved from the position (X, Y, Z) is checked, and the minimum value is 1 When the value obtained by adding is smaller than the current δ [X] [Y] [Z], the value is substituted into δ [X] [Y] [Z].
(7) Repeat (6) until there is no more update of the δ [X] [Y] [Z] value in the processing of (6) above.
(8) The sign of the Manhattan distance δ of all points in the set GU of target positions is made negative. As a result of this processing, the Manhattan distance δ decreases as the distance from the entrance position Pe increases within the set GU of target positions.

以上の処理で計算されたマンハッタン距離δは、SU+P+GUの中のみでのロボットの移動を考慮したマンハッタン距離の値となる。マンハッタン距離δを使用して、ある隊列Gaをなすロボット単位の群が、つぎに移動すべき隊列Gbの頭部ロボットを決定する方法は以下のとおりである。本処理を、頭部決定処理Head_Robot_Decisionとする。。なお、以上の処理で計算されたマンハッタン距離δが第一実施形態の価値関数に相当し、ロボット単位がそのロボット単位の現在の位置単位において各行動を取ったときの適切さを表す。具体的には、マンハッタン距離δが小さくなる方向に移動するようにロボット単位を制御する。よって、第一実施形態の価値関数に相当する価値関数（ロボット単位がそのロボット単位の現在の位置単位において各行動を取ったときの適切さを表す１個の価値関数）が、ある位置単位から入口位置単位までのマンハッタン距離を用いて得られると言える。 The Manhattan distance δ calculated by the above processing is a value of the Manhattan distance considering the movement of the robot only in SU + P + GU. Using the Manhattan distance δ, a group of robot units that form a certain column Ga determines the head robot of the column Gb to be moved next as follows. This processing is referred to as head determination processing Head_Robot_Decision. . The Manhattan distance δ calculated by the above processing corresponds to the value function of the first embodiment, and represents the appropriateness when the robot unit takes each action in the current position unit of the robot unit. Specifically, the robot unit is controlled so as to move in a direction in which the Manhattan distance δ decreases. Therefore, a value function corresponding to the value function of the first embodiment (one value function representing appropriateness when the robot unit takes each action in the current position unit of the robot unit) is obtained from a certain position unit. It can be said that it is obtained using the Manhattan distance to the entrance position unit.

[Head_Robot_Decision]
(1)各ロボット単位の位置（Xr_unit[j],Yr_unit[j],Zr_unit[j]）において、位置（Xr_unit[j],Yr_unit[j],Zr_unit[j]）に隣接し、かつロボットが存在しておらず、かつパスP及び目標位置の集合GUの何れかに属する位置の中で、マンハッタン距離δ[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]の値より1小さい、マンハッタン距離δをもつ位置を求め、その位置のマンハッタン距離δをδ_neighbor_max[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]とする。
(2)全ロボット単位中で最大のδ_neighbor_max[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]の値を持つロボット単位H(なお、ロボット単位Hの位置を（Xrh,Yrh,Zrh）とする)を選択し、ロボット単位Hに隣接し、かつロボットが存在しておらず、かつパスP及び目標位置の集合GUの何れかに属する位置の中で、マンハッタン距離δがロボット単位Hのそれより１小さいマンハッタン距離δをもつ位置を頭部ロボット単位Dの位置とする。ロボット単位Hから頭部ロボット単位Dの位置に向かう行動の方向をa_headとする。 [Head_Robot_Decision]
(1) The position of each robot unit (Xr_unit [j], Yr_unit [j], Zr_unit [j]) is adjacent to the position (Xr_unit [j], Yr_unit [j], Zr_unit [j]) and the robot is 1 from the value of the Manhattan distance δ [Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]] among the positions that do not exist and belong to either the path P or the target position set GU A position having a small Manhattan distance δ is obtained, and the Manhattan distance δ at the position is set as δ_neighbor_max [Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]].
(2) Robot unit H with the maximum value of δ_neighbor_max [Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]] among all robot units (Note that the position of robot unit H is (Xrh, Yrh, Zrh)), the robot is adjacent to the robot unit H, the robot does not exist, and the Manhattan distance δ is the robot unit among the positions belonging to either the path P or the target position set GU. A position having a Manhattan distance δ that is one smaller than that of H is defined as the position of head robot unit D. The direction of action from the robot unit H toward the position of the head robot unit D is a_head.

以上の処理では、目標位置の集合GU外にしかロボット単位が存在しないときに、ロボットの群れがパスPを通して入口位置Peに近づいていくか、もしくは、目標位置の集合GU内にロボット単位が存在するときには、入口位置Peに近い位置から優先的にロボットを埋めていけるように、頭部ロボット単位Dを選択している。 In the above processing, when the robot unit exists only outside the target position set GU, the group of robots approaches the entrance position Pe through the path P, or the robot unit exists within the target position set GU. When doing so, the head robot unit D is selected so that the robot can be preferentially filled from a position close to the entrance position Pe.

続いて、尾部ロボット単位位置を決定するための手法Tail_Robot_Decisionを述べる。 Next, a method Tail_Robot_Decision for determining the tail robot unit position will be described.

[Tail_Robot_Decision]
(1)各ロボット単位位置（Xr_unit[j],Yr_unit[j],Zr_unit[j]）において、
δ'[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]
=δ[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]-δ[Xrh][Yrh][Zrh]
を計算する。
(2)各ロボット単位位置（Xr_unit[j],Yr_unit[j],Zr_unit[j]）において、隣接する全てのロボット単位N（なお、ロボット単位Nの位置を(Xrn,Yrn,Zrn)とする）について、δ'[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]≧δ'[Xrn][Yrn][Zrn]を満たし、かつ、δ'[Xr_unit[j]][Yr_unit[j]][Zr_unit[j]]≧3であるロボット単位の何れか一つを尾部ロボット単位Tとして選択する。つまり、尾部ロボット単位Tは、隣接する全てのロボット単位N以上にロボット単位Hから離れており、かつ、ロボット単位Hから3マス単位以上離れている（ロボット単位Hと尾部ロボット単位Tとのマンハッタン距離が３マス単位以上離れている）。 [Tail_Robot_Decision]
(1) At each robot unit position (Xr_unit [j], Yr_unit [j], Zr_unit [j])
δ '[Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]]
= δ [Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]]-δ [Xrh] [Yrh] [Zrh]
Calculate
(2) In each robot unit position (Xr_unit [j], Yr_unit [j], Zr_unit [j]), all adjacent robot units N (note that the position of the robot unit N is (Xrn, Yrn, Zrn) ) For δ '[Xr_unit [j]] [Yr_unit [j]] [Zr_unit [j]] ≧ δ' [Xrn] [Yrn] [Zrn] and δ '[Xr_unit [j]] [Yr_unit Any one of the robot units satisfying [j]] [Zr_unit [j]] ≧ 3 is selected as the tail robot unit T. That is, the tail robot unit T is far from the robot unit H more than all the adjacent robot units N, and is more than 3 square units away from the robot unit H (Manhattan between the robot unit H and the tail robot unit T). The distance is more than 3 square units).

以上の処理では、隊列Gaより取り払ったとしても、隊列Gaの連続性を失わないロボット単位を尾部ロボット単位として設定している。 In the above processing, the robot unit that does not lose the continuity of the formation Ga even if it is removed from the formation Ga is set as the tail robot unit.

以上の処理をまとめて、以下のように、移動先隊列決定用動作計画処理（Next_Formation_Decision）が行われる。
(1)頭部決定処理Head_Robot_Decisionを行う。
(2)尾部ロボット決定処理Tail_robot_Decisionを行う。
(3)上述の(2)にて特定した尾部ロボット単位Tを、現在の隊列Gaから削除し、隊列Gaに頭部ロボット単位Dを追加し、隊列Gbとする。 The above processing is summarized and the movement plan determination operation plan process (Next_Formation_Decision) is performed as follows.
(1) The head determination process Head_Robot_Decision is performed.
(2) Perform tail robot decision processing Tail_robot_Decision.
(3) The tail robot unit T identified in the above (2) is deleted from the current formation Ga, and the head robot unit D is added to the formation Ga to form a formation Gb.

[各ロボット動作決定用動作計画]
移動先隊列決定用動作計画において頭部ロボット単位Dと尾部ロボット単位Tが決定され、次の隊列Gbが決定されたので、現隊列Gaから次隊列Gbへの変形を各ロボットの移動によって実現するための探索計算を各ロボット動作決定用動作計画にて行う。 [Operation plan for determining each robot operation]
Since the head robot unit D and the tail robot unit T are determined in the movement plan for the movement target column determination, and the next column Gb is determined, the transformation from the current column Ga to the next column Gb is realized by the movement of each robot. Search calculation is performed in each robot motion determination motion plan.

本実施形態では、８つのロボットを組にしたロボット単位を維持しつつロボット群が移動を行うようにしているので、ボイド制御を行う際の、各ロボットの接続維持を考える際には、尾部ロボット単位Tと頭部ロボット単位D付近でのボイドの動きにさえ気を配れば、それ以外は、ボイドが各ロボット単位内に１つ以内であるようにすればよい。 In this embodiment, the robot group moves while maintaining a robot unit of eight robots. Therefore, when considering the connection maintenance of each robot when performing void control, the tail robot As long as attention is paid to the movement of the void in the vicinity of the unit T and the head robot unit D, the number of voids may be within one in each robot unit.

以下、各ロボット動作決定用動作計画の動作について述べる。ロボット動作決定用動作計画においては、ロボット単位Hから頭部ロボット単位Dへロボット単位を移動させるたびに、８つのボイドが発生するので、まずは、それら発生した各々のボイドについてそのボイドがロボット単位Hから尾部ロボット単位Tまでたどるべき経路を計算する。 The operation of each robot motion determination motion plan will be described below. In the robot motion determination action plan, every time the robot unit is moved from the robot unit H to the head robot unit D, eight voids are generated. First, for each of the generated voids, the void is the robot unit H. To calculate the path to follow from T to the tail robot unit T.

まず、ロボット単位Hに所属するロボットのうち、頭部ロボット単位Dに接した位置にある４つのロボットをロボットi_move_11,i_move_21,i_move_31,i_move_41とし、ロボットi_move_11,i_move_21,i_move_31,i_move_41と行動a_headの移動方向の逆方向に接しているロボットをそれぞれ、ロボットi_move_12,i_move_22, i_move_32,i_move_42と順序付けする。 First, of the robots belonging to robot unit H, the four robots that are in contact with head robot unit D are robots i_move_11, i_move_21, i_move_31, i_move_41, and movement of robots i_move_11, i_move_21, i_move_31, i_move_41 and action a_head The robots in contact with the opposite direction of the direction are ordered as robots i_move_12, i_move_22, i_move_32, and i_move_42, respectively.

ロボット単位Hから頭部ロボット単位Dへの移動においては、ロボット単位H内のロボットi_move_r1(rは1,2,3,4の何れか)とロボットi_move_r2がペアとなって、a_head方向にスライドすることで、ロボットが移動する。そのため、ロボット単位H内にて発生するボイドの位置は、かならずロボットi_move_12,i_move_22, i_move_32,i_move_42のいずれかとなる。ロボットが開始位置の集合SUから目標位置の集合GUへの隊列形成動作を開始してからの発生ボイドの通し番号をiv(=0,1,2,3…)で管理し、フル充填されたロボット単位Hから頭部ロボット単位Dへのロボットの移動が開始されたときに、それまでに発生しているボイドの数をiv_prevとすれば、iv_prevは、(8の倍数-1)である。ロボット単位Hから頭部ロボット単位Dへの移動で発生するボイドをvoid[iv_prev+k](k=1,2,3,4,5,6,7,8)とする。各ボイドの軌道を( void_trj_x[iv_prev+k][t], void_trj_y[iv_prev+k][t] , void_trj_z[iv_prev+k][t] ) (t=0,1,2,3…)で表現する。 When moving from robot unit H to head robot unit D, robot i_move_r1 (where r is one of 1, 2, 3, or 4) and robot i_move_r2 in robot unit H are paired and slide in the a_head direction. As a result, the robot moves. Therefore, the position of the void generated in the robot unit H is always one of the robots i_move_12, i_move_22, i_move_32, and i_move_42. Fully-filled robots that manage the serial number of the voids that have been generated since the robot started the formation process from the set SU at the start position to the set GU at the target position with iv (= 0, 1, 2, 3 ...) When the movement of the robot from the unit H to the head robot unit D is started, assuming that the number of voids generated so far is iv_prev, iv_prev is (multiple of 8-1). Let void [iv_prev + k] (k = 1, 2, 3, 4, 5, 6, 7, 8) be a void generated when moving from robot unit H to head robot unit D. Represent each void trajectory as (void_trj_x [iv_prev + k] [t], void_trj_y [iv_prev + k] [t], void_trj_z [iv_prev + k] [t]) (t = 0,1,2,3…) To do.

各ボイドの軌道計算処理（Void_Trajectory_Decision）は、以下の手順で行われる。 The trajectory calculation process (Void_Trajectory_Decision) for each void is performed according to the following procedure.

[Void_Trajectory_Decision]
(1)各ボイドの発生位置を以下のようにする。
void[iv_prev+1]、void[iv_prev+2]の発生位置→ロボットi_move_12の位置
void[iv_prev+3]、void[iv_prev+4]の発生位置→ロボットi_move_22の位置
void[iv_prev+5]、void[iv_prev+6]の発生位置→ロボットi_move_32の位置
void[iv_prev+7]、void[iv_prev+8]の発生位置→ロボットi_move_42の位置
(2)各ボイドの発生位置を各ボイドの軌道の終点void_end[iv_prev]とする。
(3)上述の(1)で発生位置を設定した８つのボイドがロボット単位Hの位置から尾部ロボット単位Tの位置までたどるべき位置単位の道筋を(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu]) (tu=0,1,2,3…)とし、後述するVoid_Unit_Trajectory_Decisionにて(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])を決定する。tuの値は、Tから近い方の道筋上の点で０としてHに近づくにつれて大きくなるようにボイドの動きと逆向きに設定する。
(4)(void_unit_trj_x[1], void_unit_trj_y[1], void_unit_trj_z[1])をT'、 (void_unit_trj_x[2], void_unit_trj_y[2], void_unit_trj_z[2])をT"とする。T'は尾部ロボット単位Tに隣接するロボット単位であり、T"はロボット単位T'に隣接するロボット単位である。
(5)ロボット単位T'内にあり、尾部ロボット単位T内の位置に接しない位置にある４つのロボットをロボットi_tail_0,i_tail_1,i_tail_2,i_tail_3とし、各ボイドの目標位置を以下のように設定する。
void[iv_prev+1]、void[iv_prev+2]の目標位置→ロボットi_tail_0の位置
void[iv_prev+3]、void[iv_prev+4]の目標位置→ロボットi_tail_1の位置
void[iv_prev+5]、void[iv_prev+6]の目標位置→ロボットi_tail_2の位置
void[iv_prev+7]、void[iv_prev+8]の目標位置→ロボットi_tail_3の位置
(6)ロボット単位T"内における各ボイドの発生位置と同じ位置（ロボット単位Hをロボット単位T"の位置に平行移動させた場合のロボット単位H内の各ボイドの発生位置）から、ロボット単位T'内の各ボイド目標位置までの軌道( void_trj_x[iv_prev+k][t], void_trj_y[iv_prev+k][t] , void_trj_z[iv_prev+k][t] )を後述するVoid_Local_Trajectroy_Decisionで決定する。tの値は、ロボット単位T'内の目標位置で０としてロボット単位T"内各ボイド発生位置に近づくにつれて大きくなるようにボイドの動きと逆向きに設定する。
(7)上述の(6)で決定された軌道( void_trj_x[iv_prev+k][t], void_trj_y[iv_prev+k][t] , void_trj_z[iv_prev+k][t] )の終点(t=tsとする。)から、ロボット単位Hの位置内のボイド発生位置に至る軌道を(8)以下の処理で設定する。
(8)it←2とする。
(9)(void_unit_trj_x[2+it/2-1], void_unit_trj_y[2+it/2-1], void_unit_trj_z[2+it/2-1])がロボット単位Hの位置に一致していないならば、(void_unit_trj_x[2+(it)/2-1], void_unit_trj_y[2+(it)/2-1], void_unit_trj_z[2+(it)/2-1])から(void_unit_trj_x[2+it/2], void_unit_trj_y[2+it/2], void_unit_trj_z[2+it/2])へ移動する方向に、( void_trj_x[iv_prev+k][ts+it-2], void_trj_y[iv_prev+k][ts+it-2] , void_trj_z[iv_prev+k][ts+it-2] )から１ステップ移動した位置を( void_trj_x[iv_prev+k][ts+it-1], void_trj_y[iv_prev+k][t+it-1] , void_trj_z[iv_prev+k][t+it-1] )とし、２ステップ移動した位置を( void_trj_x[iv_prev+k][ts+it], void_trj_y[iv_prev+k][ts+it] , void_trj_z[iv_prev+k][ts+it] )とし、itを2回インクリメントする。(void_unit_trj_x[2+it/2-1], void_unit_trj_y[2+it/2-1], void_unit_trj_z[2+it/2-1])がロボット単位Hの位置に一致するまで(9)を繰り返す。(void_unit_trj_x[2+it/2-1], void_unit_trj_y[2+it/2-1], void_unit_trj_z[2+it/2-1])がロボット単位Hの位置に一致しているならば、軌道内点数trj_num[iv_prev+k]←ts+it-2として終了する。 [Void_Trajectory_Decision]
(1) The occurrence position of each void is as follows.
Generation location of void [iv_prev + 1], void [iv_prev + 2] → position of robot i_move_12
Position where void [iv_prev + 3] and void [iv_prev + 4] occur → Position of robot i_move_22
Position where void [iv_prev + 5] and void [iv_prev + 6] occur → Position of robot i_move_32
Position where void [iv_prev + 7] and void [iv_prev + 8] occur → Position of robot i_move_42
(2) The occurrence position of each void is defined as the end point void_end [iv_prev] of the trajectory of each void.
(3) The eight voids whose occurrence positions are set in (1) above should follow the path of the position unit to be followed from the position of robot unit H to the position of tail robot unit T (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) (tu = 0,1,2,3 ...) and (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) are determined by Void_Unit_Trajectory_Decision described later. The value of tu is set to 0 at the point on the path closer to T, and is set in the opposite direction to the movement of the void so as to increase as H approaches.
(4) (void_unit_trj_x [1], void_unit_trj_y [1], void_unit_trj_z [1]) is T ', and (void_unit_trj_x [2], void_unit_trj_y [2], void_unit_trj_z [2]) is T ". T' is a tail robot. A robot unit adjacent to the unit T, and T ″ is a robot unit adjacent to the robot unit T ′.
(5) The robots i_tail_0, i_tail_1, i_tail_2, and i_tail_3 are the four robots that are in the robot unit T ′ and are not in contact with the position in the tail robot unit T, and set the target position of each void as follows: .
Target position of void [iv_prev + 1], void [iv_prev + 2] → position of robot i_tail_0
Target position of void [iv_prev + 3], void [iv_prev + 4] → position of robot i_tail_1
Target position of void [iv_prev + 5], void [iv_prev + 6] → position of robot i_tail_2
Target position of void [iv_prev + 7], void [iv_prev + 8] → position of robot i_tail_3
(6) From the same position as the generation position of each void in the robot unit T "(the generation position of each void in the robot unit H when the robot unit H is translated to the position of the robot unit T") Trajectories (void_trj_x [iv_prev + k] [t], void_trj_y [iv_prev + k] [t], void_trj_z [iv_prev + k] [t]) to each void target position in T ′ are determined by Void_Local_Trajectroy_Decision described later. The value of t is set to 0 at the target position in the robot unit T ′, and is set in the opposite direction to the movement of the void so as to increase as it approaches each void generation position in the robot unit T ″.
(7) End point (t = ts) of the trajectory (void_trj_x [iv_prev + k] [t], void_trj_y [iv_prev + k] [t], void_trj_z [iv_prev + k] [t]) determined in (6) above And the trajectory from the robot unit H to the void generation position is set by the following process (8).
(8) It ← 2.
(9) If (void_unit_trj_x [2 + it / 2-1], void_unit_trj_y [2 + it / 2-1], void_unit_trj_z [2 + it / 2-1]) does not match the position of robot unit H , (Void_unit_trj_x [2+ (it) / 2-1], void_unit_trj_y [2+ (it) / 2-1], void_unit_trj_z [2+ (it) / 2-1]) to (void_unit_trj_x [2 + it / 2 ], void_unit_trj_y [2 + it / 2], void_unit_trj_z [2 + it / 2]), (void_trj_x [iv_prev + k] [ts + it-2], void_trj_y [iv_prev + k] [ts + it-2], void_trj_z [iv_prev + k] [ts + it-2]) to the position moved one step (void_trj_x [iv_prev + k] [ts + it-1], void_trj_y [iv_prev + k] [t + it-1], void_trj_z [iv_prev + k] [t + it-1]), and the position moved two steps is (void_trj_x [iv_prev + k] [ts + it], void_trj_y [iv_prev + k] [ts + it ], void_trj_z [iv_prev + k] [ts + it]) and increment it twice. Repeat (9) until (void_unit_trj_x [2 + it / 2-1], void_unit_trj_y [2 + it / 2-1], void_unit_trj_z [2 + it / 2-1]) matches the position of robot unit H. If (void_unit_trj_x [2 + it / 2-1], void_unit_trj_y [2 + it / 2-1], void_unit_trj_z [2 + it / 2-1]) matches the robot unit H position, Finish as score trj_num [iv_prev + k] ← ts + it-2.

[Void_Local_Trajectroy_Decision]
Void_Local_Trajectroy_Decisionの処理は、以下のとおりである。まず、ロボットが動作を開始する前に、あらかじめ、以下の6つの位置関係にある尾部ロボット単位T及びロボット単位T'において、ロボット単位T'に5通りのロボット単位T"[kt](kt=1,2,3,4,5)が接しているとして、ロボット単位T'内の尾部ロボット単位Tに接しない位置とロボット単位T"の全ての位置について、ロボット単位T'内の尾部ロボット単位Tに接しない4つの位置からのマンハッタン距離δlocal[vs][al](X,Y,Z)(vs=0,1,2,3。 al=1,2,3,4,5,6)を全て計算する。ここでは、ロボット単位ではなく、ロボットのマンハッタン距離を求める。なお、alはロボット単位Tの位置からロボット単位T'の位置に向かう６つの方向の行動であり、vsはロボット単位T'内の尾部ロボット単位Tに接しない4つのロボット(その位置)のインデックスである。 [Void_Local_Trajectroy_Decision]
The processing of Void_Local_Trajectroy_Decision is as follows. First, before the robot starts operation, five robot units T "[kt] (kt = 1, 2, 3, 4 and 5) are in contact with each other, and the position of the tail robot unit in the robot unit T ′ for all positions in the robot unit T ′ and the position not in contact with the tail robot unit T in the robot unit T ′. Manhattan distance δlocal [vs] [al] (X, Y, Z) (vs = 0,1,2,3. Al = 1,2,3,4,5,6) from four positions not touching T Are all calculated. Here, the Manhattan distance of the robot is obtained instead of the robot unit. Here, al is an action in six directions from the position of the robot unit T to the position of the robot unit T ′, and vs is an index of four robots (that position) that do not touch the tail robot unit T in the robot unit T ′. It is.

(1)al=1, T=(0,0,0), T'=(1,0,0), T"[1]=(1,1,0), T"[2]=(1,-1,0) , T"[3]=(2,0,0), T"[4]=(1,0,1), T"[5]=(1,0,-1)
(2)al=2, T=(0,0,0), T'=(0,1,0) , T"[1]=(1,1,0), T"[2]=(-1,1,0) ,T"[3]=(0,2,0), T"[4]=(0,1,1), T"[5]=(0,1,-1)
(3)al=3, T=(0,0,0),T'=(-1,0,0) ,T"[1]=(-1,1,0), T"[2]=(-1,-1,0) , T"[3]=(-2,0,0), T"[4]=(-1,0,1), T"[5]=(-1,0,-1)
(4)al=4, T=(0,0,0),T'=(0,-1,0) ,T"[1]=(1,-1,0), T"[2]=(-1,-1,0) , T"[3]=(0,-2,0), T"[4]=(0,-1,1), T"[5]=(0,-1,-1)
(5)al=5, T=(0,0,0),T'=(0,0,1) ,T"[1]=(0,1,1), T"[2]=(0,-1,1) , T"[3]=(0,0,2), T"[4]=(1,0,1), T"[5]=(-1,0,1)
(6)al=6, T=(0,0,0),T'=(0,0,-1) ,T"[1]=(0,1,-1), T"[2]=(0,-1,-1) , T"[3]=(0,0,-2), T"[4]=(1,0,-1), T"[5]=(-1,0,-1)
以下、δlocal[vs][al](X,Y,Z)の計算方法を示す。 (1) al = 1, T = (0,0,0), T '= (1,0,0), T "[1] = (1,1,0), T" [2] = (1 , -1,0), T "[3] = (2,0,0), T" [4] = (1,0,1), T "[5] = (1,0, -1)
(2) al = 2, T = (0,0,0), T '= (0,1,0), T "[1] = (1,1,0), T" [2] = (- 1,1,0), T "[3] = (0,2,0), T" [4] = (0,1,1), T "[5] = (0,1, -1)
(3) al = 3, T = (0,0,0), T '= (-1,0,0), T "[1] = (-1,1,0), T" [2] = (-1, -1,0), T "[3] = (-2,0,0), T" [4] = (-1,0,1), T "[5] = (-1, (0, -1)
(4) al = 4, T = (0,0,0), T '= (0, -1,0), T "[1] = (1, -1,0), T" [2] = (-1, -1,0), T "[3] = (0, -2,0), T" [4] = (0, -1,1), T "[5] = (0,- 1, -1)
(5) al = 5, T = (0,0,0), T '= (0,0,1), T "[1] = (0,1,1), T" [2] = (0 , -1,1), T "[3] = (0,0,2), T" [4] = (1,0,1), T "[5] = (-1,0,1)
(6) al = 6, T = (0,0,0), T '= (0,0, -1), T "[1] = (0,1, -1), T" [2] = (0, -1, -1), T "[3] = (0,0, -2), T" [4] = (1,0, -1), T "[5] = (-1, (0, -1)
Hereinafter, the calculation method of δlocal [vs] [al] (X, Y, Z) is shown.

[Calculate_delta_local]
(1)ロボット単位T'内の尾部ロボット単位Tに接しない位置とロボット単位T"の全ての位置（X,Y,Z）において、next_local[al][a][X][Y][Z]を用意し、位置（X,Y,Z）に隣接するロボット単位T'内の尾部ロボット単位Tに接しない位置か、ロボット単位T"内の位置があるならば、next_local[al][a][X][Y][Z]←１とし、それ以外の場合はnext_local[al][a][X][Y][Z]←0とする。
(2)ロボット単位T'内の尾部ロボット単位Tに接しない位置とロボット単位T"の全ての位置(X,Y,Z）について、各位置(X,Y,Z）のδlocal[vs][al](X,Y,Z)値を44(T'内4つ、T"内8x5つの位置の数)より大きな値s_maxに初期化する。
(3)各vs,alについて、以下の位置を目標位置target_local[vs][al]とし、目標位置でのマンハッタン距離δlocal[vs][al] (24通り)を0に初期化する。
target_local[0][1]＝（3,0,0）
target_local[1][1]＝（3,0,1）
target_local[2][1]＝（3,1,0）
target_local[3][1]＝（3,1,1）
target_local[0][2]＝（0,3,0）
target_local[1][2]＝（0,3,1）
target_local[2][2]＝（1,3,0）
target_local[3][2]＝（1,3,1）
target_local[0][3]＝（-2,0,0）
target_local[1][3]＝（-2,0,1）
target_local[2][3]＝（-2,1,0）
target_local[3][3]＝（-2,1,1）
target_local[0][4]＝（0,-2,0）
target_local[1][4]＝（0,-2,1）
target_local[2][4]＝（1,-2,0）
target_local[3][4]＝（1,-2,1）
target_local[0][5]＝（0,0,3）
target_local[1][5]＝（0,1,3）
target_local[2][5]＝（1,0,3）
target_local[3][5]＝（1,1,3）
target_local[0][6]＝（0,0,-2）
target_local[1][6]＝（0,1,-2）
target_local[2][6]＝（1,0,-2）
target_local[3][6]＝（1,1,-2）
(4)目標位置target_local[vs][al]以外の、各位置（X,Y,Z）の全行動aについて、next_local[al][a][X][Y][Z]の値が0ではない場合の、δlocal[vs][al](X,Y,Z)の値を調べ、その最小値に１を加えた値が、現在のδlocal[vs][al](X,Y,Z)よりも小さい場合は、その値をδlocal[vs][al](X,Y,Z)に代入する。δlocal[vs][al](X,Y,Z)値の更新がなくなるまで、(4)を繰り返す。 [Calculate_delta_local]
(1) next_local [al] [a] [X] [Y] [Z at the position (X, Y, Z) of the robot unit T 'that does not touch the tail robot unit T and all the positions (X, Y, Z) of the robot unit T " ], And if there is a position not in contact with the tail robot unit T in the robot unit T ′ adjacent to the position (X, Y, Z) or a position in the robot unit T ″, next_local [al] [a ] [X] [Y] [Z] ← 1, otherwise, next_local [al] [a] [X] [Y] [Z] ← 0.
(2) δlocal [vs] [of each position (X, Y, Z) for all positions (X, Y, Z) of robot unit T 'and all positions (X, Y, Z) of robot unit T "within robot unit T' al] (X, Y, Z) values are initialized to a value s_max greater than 44 (number of 4 positions in T ′, 8 × 5 positions in T ″).
(3) For each vs and al, the following positions are set as the target position target_local [vs] [al], and the Manhattan distance δlocal [vs] [al] (24 ways) at the target position is initialized to 0.
target_local [0] [1] = (3,0,0)
target_local [1] [1] = (3,0,1)
target_local [2] [1] = (3,1,0)
target_local [3] [1] = (3,1,1)
target_local [0] [2] = (0,3,0)
target_local [1] [2] = (0,3,1)
target_local [2] [2] = (1,3,0)
target_local [3] [2] = (1,3,1)
target_local [0] [3] = (-2,0,0)
target_local [1] [3] = (-2,0,1)
target_local [2] [3] = (-2,1,0)
target_local [3] [3] = (-2,1,1)
target_local [0] [4] = (0, -2,0)
target_local [1] [4] = (0, -2,1)
target_local [2] [4] = (1, -2,0)
target_local [3] [4] = (1, -2,1)
target_local [0] [5] = (0,0,3)
target_local [1] [5] = (0,1,3)
target_local [2] [5] = (1,0,3)
target_local [3] [5] = (1,1,3)
target_local [0] [6] = (0,0, -2)
target_local [1] [6] = (0,1, -2)
target_local [2] [6] = (1,0, -2)
target_local [3] [6] = (1,1, -2)
(4) The value of next_local [al] [a] [X] [Y] [Z] is 0 for all actions a at each position (X, Y, Z) except for the target position target_local [vs] [al] If not, the value of δlocal [vs] [al] (X, Y, Z) is examined, and the value obtained by adding 1 to the minimum value is the current δlocal [vs] [al] (X, Y, Z) If it is smaller than), the value is assigned to δlocal [vs] [al] (X, Y, Z). Repeat (4) until the δlocal [vs] [al] (X, Y, Z) value is no longer updated.

Void_Local_Trajectory_Decisionの処理は以下のとおりである。 Processing of Void_Local_Trajectory_Decision is as follows.

[Void_Local_Trajectroy_Decision]
(1)ロボット単位T'の位置を（XT',YT',ZT'）、ロボット単位T"の位置を（XT",YT",ZT"）として、T->T’に向かう方向の行動をal値とする。。i_tail_0の位置をtarget_local[0][al](vs=0とする), i_tail_1の位置をtarget_local[1][al] (vs=1とする), i_tail_2の位置をtarget_local[2][al] (vs=2とする), i_tail_3の位置をtarget_local[3][al] (vs=3とする)とする。
(2)各ボイドvoid[iv_prev+k]の発生位置を(xv[iv_prev+k],yv[iv_prev+ｋ],zv[iv_prev+k])として、ロボット単位T"内の位置vt"を以下のように計算する。 [Void_Local_Trajectroy_Decision]
(1) With the position of robot unit T 'as (XT', YT ', ZT') and the position of robot unit T "as (XT", YT ", ZT"), the action in the direction toward T-> T ' Is the al value. . The location of i_tail_0 is target_local [0] [al] (vs = 0), the location of i_tail_1 is target_local [1] [al] (vs = 1), and the location of i_tail_2 is target_local [2] [al] ( vs = 2), and the position of i_tail_3 is assumed to be target_local [3] [al] (vs = 3).
(2) The occurrence position of each void void [iv_prev + k] is (xv [iv_prev + k], yv [iv_prev + k], zv [iv_prev + k]), and the position vt "in the robot unit T" is Calculate as follows.

ロボット単位T'からロボット単位T"への移動量ベクトルの値を(Xdt,Ydt,Zdt)= (XT",YT",ZT")-(XT',YT',ZT')として、
If al=1,
vt"←(xv[iv_prev+k]-xrh*2+2+Xdt*2, yv[iv_prev+k]-yrh*2+Ydt*2, zv[iv_prev+k]-zrh*2+Zdt*2)
If al=2,
vt"←(xv[iv_prev+k]-xrh*2+Xdt*2, yv[iv_prev+k]-yrh*2+2+Ydt*2, zv[iv_prev+k]-zrh*2+Zdt*2)
If al=3,
vt"←(xv[iv_prev+k]-xrh*2-2+Xdt*2, yv[iv_prev+k]-yrh*2+Ydt*2, zv[iv_prev+k]-zrh*2+Zdt*2)
If al=4,
vt"←(xv[iv_prev+k]-xrh*2+Xdt*2, yv[iv_prev+k]-yrh*2-2+Ydt*2, zv[iv_prev+k]-zrh*2+Zdt*2)
If al=5,
vt"←(xv[iv_prev+k]-xrh*2+Xdt*2, yv[iv_prev+k]-yrh*2+Ydt*2, zv[iv_prev+k]-zrh*2+2+Zdt*2)
If al=6,
vt"←(xv[iv_prev+ｋ]-xrh*2+Xdt*2, yv[iv_prev+ｋ]-yrh*2+Ydt*2, zv[iv_prev+ｋ]-zrh*2-2+Zdt*2)
(3)tl←0とし、vt"の位置を(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])とする。
(4)位置(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])が、target_local[vs][al]に一致しないとき、
位置(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])に隣接し、δlocal[vs][al]の値が(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])の位置のδlocal[vs][al]の値より小さい位置を(void_local_x[iv_prev+k][tl+1], void_local_y[iv_prev+k][tl+1], void_local_z[iv_prev+k][tl+1])とし、tlをインクリメントする。(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])が、target_local[vs][al]に一致するまで(4)を繰り返す。位置(void_local_x[iv_prev+k][tl], void_local_y[iv_prev+k][tl], void_local_z[iv_prev+k][tl])が、target_local[vs][al]に一致するとき、(5)へ移行する。
(5)( void_trj_x[iv_prev+k][t], void_trj_y[iv_prev+k][t] , void_trj_z[iv_prev+k][t] )← (void_local_x[iv_prev+k][tl-t], void_local_y[iv_prev+k][tl-t], void_local_z[iv_prev+k][tl-t])とし、ts←tlとする。 The value of the movement vector from the robot unit T 'to the robot unit T "is (Xdt, Ydt, Zdt) = (XT", YT ", ZT")-(XT', YT ', ZT')
If al = 1,
vt "← (xv [iv_prev + k] -xrh * 2 + 2 + Xdt * 2, yv [iv_prev + k] -yrh * 2 + Ydt * 2, zv [iv_prev + k] -zrh * 2 + Zdt * 2 )
If al = 2,
vt "← (xv [iv_prev + k] -xrh * 2 + Xdt * 2, yv [iv_prev + k] -yrh * 2 + 2 + Ydt * 2, zv [iv_prev + k] -zrh * 2 + Zdt * 2 )
If al = 3,
vt "← (xv [iv_prev + k] -xrh * 2-2 + Xdt * 2, yv [iv_prev + k] -yrh * 2 + Ydt * 2, zv [iv_prev + k] -zrh * 2 + Zdt * 2 )
If al = 4,
vt "← (xv [iv_prev + k] -xrh * 2 + Xdt * 2, yv [iv_prev + k] -yrh * 2-2 + Ydt * 2, zv [iv_prev + k] -zrh * 2 + Zdt * 2 )
If al = 5,
vt "← (xv [iv_prev + k] -xrh * 2 + Xdt * 2, yv [iv_prev + k] -yrh * 2 + Ydt * 2, zv [iv_prev + k] -zrh * 2 + 2 + Zdt * 2 )
If al = 6,
vt "← (xv [iv_prev + k] -xrh * 2 + Xdt * 2, yv [iv_prev + k] -yrh * 2 + Ydt * 2, zv [iv_prev + k] -zrh * 2-2 + Zdt * 2 )
(3) tl ← 0, and the position of vt "is (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]).
(4) When the position (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]) does not match target_local [vs] [al]
Adjacent to the location (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]) and the value of δlocal [vs] [al] is (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]) position less than δlocal [vs] [al] value (void_local_x [iv_prev + k] [tl + 1], void_local_y [iv_prev + k] [tl + 1], void_local_z [iv_prev + k] [tl + 1]) and increment tl. (4) is repeated until (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]) matches target_local [vs] [al]. Go to (5) when the position (void_local_x [iv_prev + k] [tl], void_local_y [iv_prev + k] [tl], void_local_z [iv_prev + k] [tl]) matches target_local [vs] [al] Transition.
(5) (void_trj_x [iv_prev + k] [t], void_trj_y [iv_prev + k] [t], void_trj_z [iv_prev + k] [t]) ← (void_local_x [iv_prev + k] [tl-t], void_local_y [ iv_prev + k] [tl-t], void_local_z [iv_prev + k] [tl-t]), and ts ← tl.

Void_Unit_Trajectory_Decisionの処理は以下のとおりである。 The processing of Void_Unit_Trajectory_Decision is as follows.

[Void_Unit_Trajectory_Decision]
(1)尾部ロボット単位Tの位置を(void_unit_trj_x[0], void_unit_trj_y[0], void_unit_trj_z[0])とする。tu←１とし、尾部ロボット単位Tよりも小さなマンハッタン距離δを持つ隣接のロボット単位の位置を(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])とする。
(2)もし、(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])が入口位置Peでもロボット単位Hでなければ、tuをインクリメントし、位置(void_unit_trj_x[tu-1], void_unit_trj_y[tu-1], void_unit_trj_z[tu-1])にあるロボット単位よりも小さなマンハッタン距離δを持つ隣接のロボット単位の位置を(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])とする。(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])が入口位置Peまたはロボット単位Hの位置になるまで(2)を繰り返す。 [Void_Unit_Trajectory_Decision]
(1) The position of the tail robot unit T is (void_unit_trj_x [0], void_unit_trj_y [0], void_unit_trj_z [0]). It is assumed that tu ← 1 and the position of an adjacent robot unit having a Manhattan distance δ smaller than the tail robot unit T is (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]).
(2) If (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) is not robot unit H even at the entrance position Pe, increment tu and position (void_unit_trj_x [tu-1], void_unit_trj_y [tu -1], void_unit_trj_z [tu-1]), the position of an adjacent robot unit having a smaller Manhattan distance δ than the robot unit is (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]). Repeat (2) until (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) reaches the entrance position Pe or the robot unit H position.

もし、(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])が入口位置Pe(Hでなく)の場合は、(3)へ移行する。 If (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) is the entrance position Pe (not H), the process proceeds to (3).

もし、(void_unit_trj_x[tu], void_unit_trj_y[tu], void_unit_trj_z[tu])がロボット単位Hの位置の場合は、Void_Unit_Trajectory_Decisionを終了する。
(3)ロボット単位Hの位置を(void_unit_trj_rev_x[0], void_unit_trj_rev_y[0], void_unit_trj_rev_z[0])とする。tr←１とし、ロボット単位Hよりも大きなマンハッタン距離δを持つ隣接のロボット単位の位置を(void_unit_trj_rev_x[tr], void_unit_trj_rev_y[tr], void_unit_trj_rev_z[tr])とする。
(4)もし、(void_unit_trj_rev_x[tr], void_unit_trj_rev_y[tr], void_unit_trj_rev_z[tr])が入口位置Peでなければ、trをインクリメントし、位置(void_unit_trj_rev_x[tr-1], void_unit_trj_rev_y[tr-1], void_unit_trj_rev_z[tr-1]) にあるロボット単位よりも大きなマンハッタン距離δを持つ隣接のロボット単位の位置を(void_unit_trj_rev_x[tr], void_unit_trj_rev_y[tr], void_unit_trj_rev_z[tr])とする。(void_unit_trj_rev_x[tr], void_unit_trj_rev_y[tr], void_unit_trj_rev_z[tr])が入口位置Peになるまで(4)を繰り返す。 If (void_unit_trj_x [tu], void_unit_trj_y [tu], void_unit_trj_z [tu]) is the position of the robot unit H, Void_Unit_Trajectory_Decision is terminated.
(3) The position of the robot unit H is (void_unit_trj_rev_x [0], void_unit_trj_rev_y [0], void_unit_trj_rev_z [0]). It is assumed that tr ← 1, and the position of an adjacent robot unit having a Manhattan distance δ larger than the robot unit H is (void_unit_trj_rev_x [tr], void_unit_trj_rev_y [tr], void_unit_trj_rev_z [tr]).
(4) If (void_unit_trj_rev_x [tr], void_unit_trj_rev_y [tr], void_unit_trj_rev_z [tr]) is not the entrance position Pe, tr is incremented and the position (void_unit_trj_rev_x [tr-1], void_unit_trj_rev_y [tr-1], Let (void_unit_trj_rev_x [tr], void_unit_trj_rev_y [tr], void_unit_trj_rev_z [tr]) be the position of an adjacent robot unit having a Manhattan distance δ larger than the robot unit in void_unit_trj_rev_z [tr-1]). Repeat (4) until (void_unit_trj_rev_x [tr], void_unit_trj_rev_y [tr], void_unit_trj_rev_z [tr]) reaches the entrance position Pe.

もし、(void_unit_trj_rev_x[tr], void_unit_trj_rev_y[tr], void_unit_trj_rev_z[tr])が入口位置Peの場合は、(5)へ移行する。
(5)it=1,,,trとして、(void_unit_trj_x[tu+it], void_unit_trj_y[tu+it], void_unit_trj_z[tu+it])←(void_unit_trj_rev_x[tr-it], void_unit_trj_rev_y[tr-it], void_unit_trj_rev_z[tr-it])とする。 If (void_unit_trj_rev_x [tr], void_unit_trj_rev_y [tr], void_unit_trj_rev_z [tr]) is the entrance position Pe, the process proceeds to (5).
(5) As it = 1 ,,, tr, (void_unit_trj_x [tu + it], void_unit_trj_y [tu + it], void_unit_trj_z [tu + it]) ← (void_unit_trj_rev_x [tr-it], void_unit_trj_rev_y [tr-it], void_unit_trj_rev_z [tr-it]).

[ボイド移動制御]
以下、上記の各ロボット動作決定用動作計画を使用して、ボイドをロボット単位Hの位置から尾部ロボット単位Tの位置に移動させる制御方法を示す。図３１に示すように、ロボット単位H内の頭部ロボット単位Dに接する位置のロボットと、ロボット単位T'内の尾部ロボット単位Tに接する位置にあるロボット位置に空白が生じないように、ロボット単位Hでのボイド発生と尾部ロボット単位Tでのボイド追い出しをしていることで、ロボット群全体の接続は保たれている。ロボット単位Hとロボット単位T'が直接、接しないように、尾部ロボット単位Tの選択アルゴリズムにおいて、尾部ロボット単位Tの位置のδ'(ロボット単位Hと尾部ロボット単位Tとのマンハッタン距離)が3以上という条件を設定している。本制御方法で、各ロボット単位内において、ボイドが一つ以内に収まるように制御されている。 [Void movement control]
Hereinafter, a control method for moving the void from the position of the robot unit H to the position of the tail robot unit T using the above-described operation plans for determining the robot movement will be described. As shown in FIG. 31, the robot is positioned so that there is no blank between the robot in the position in contact with the head robot unit D in the robot unit H and the robot position in the position in contact with the tail robot unit T in the robot unit T ′. The connection of the entire robot group is maintained by generating voids in unit H and expelling voids in tail robot unit T. In the selection algorithm of the tail robot unit T, the δ ′ of the position of the tail robot unit T (Manhattan distance between the robot unit H and the tail robot unit T) is 3 so that the robot unit H and the robot unit T ′ do not directly touch each other. The above conditions are set. With this control method, control is performed so that the number of voids is within one in each robot unit.

以下を目標位置の集合GUが満たされるまで、繰り返す。 The following is repeated until the target position set GU is satisfied.

[Void_Control]
(1)Void_Trajectory_Decisionを実行する。trj_cnt[iv_prev+1],,, trj_cnt[iv_prev+8],を-1にする。iv_prevの値を8回インクリメントする。
(2)以下をvoid[iv_prev]がロボット単位Hの外に出るまで繰り返す。 [Void_Control]
(1) Execute Void_Trajectory_Decision. Set trj_cnt [iv_prev + 1] ,, trj_cnt [iv_prev + 8], to -1. Increment the value of iv_prev 8 times.
(2) Repeat the following until void [iv_prev] comes out of robot unit H.

発生番号の大きいもの（iv=iv_prev）から以下の処理をする。 The following processing is performed from the one with the largest occurrence number (iv = iv_prev).

(2-i)trj_cnt[iv]=-10のとき→void[iv]について何もしない。 (2-i) When trj_cnt [iv] =-10 → Do nothing for void [iv].

(2-ii)trj_cnt[iv]< trj_num[iv] && trj_cnt[iv]>=0のとき→
i= trj_cnt[iv]からtrj_num[iv]まで以下を行う：
位置nv(void_trj_x[trj_num[iv]-i-1],void_trj_y[trj_num[iv]-i-1], void_trj_z[trj_num[iv]-i-1])が含まれるロボット単位Nvに他のvoidがないこと、自分より番号の小さいvoidが、すべてロボット単位Nvよりもマンハッタン距離δの大きいロボット単位に所属していることを確認する。確認できたなら、nvの位置にあるロボットを位置(void_trj_x[trj_num[iv]-i],void_trj_y[trj_num[iv]-i],void_trj_z[trj_num[iv]-i])に移動させる。 (2-ii) When trj_cnt [iv] <trj_num [iv] && trj_cnt [iv]> = 0 →
i = From trj_cnt [iv] to trj_num [iv] do the following:
There is another void in the robot unit Nv containing the position nv (void_trj_x [trj_num [iv] -i-1], void_trj_y [trj_num [iv] -i-1], void_trj_z [trj_num [iv] -i-1]) Confirm that all voids with numbers smaller than themselves belong to the robot unit whose Manhattan distance δ is larger than the robot unit Nv. If confirmed, the robot at the position of nv is moved to the position (void_trj_x [trj_num [iv] -i], void_trj_y [trj_num [iv] -i], void_trj_z [trj_num [iv] -i]).

trj_cnt[iv]をインクリメントし、確認できなくなるまで処理を繰り返し、確認できないならなにもせず、ループ終了する。 It increments trj_cnt [iv] and repeats the process until it cannot be confirmed. If it cannot be confirmed, nothing is done and the loop ends.

(2-iii)trj_cnt[iv]＝trj_num[iv]のとき→
現在のvoid[iv]の位置から、alの方向の逆方向に隣接していて、たどれるロボットを全て、al方向に1ステップ移動させて、trj_cnt[iv]←‐10とする。 (2-iii) When trj_cnt [iv] = trj_num [iv] →
All robots that are adjacent in the direction opposite to the direction of al from the current position of void [iv] and that are traced are moved one step in the direction of al, and trj_cnt [iv] ← −10.

(2-iv)trj_cnt[iv]=‐1のとき→
ロボット単位Hが８つのロボットで満たされている場合、void[iv]の発生位置にあるロボットと、そのロボットに行動a_headの方向で隣接しているロボット、さらにその先にa_headの方向で隣接しているロボットがあればそのロボットを皆まとめて、a_headの方向に1ステップ移動させる。trj_cnt[iv]←0とする。満たされてない場合なにもしない。 (2-iv) When trj_cnt [iv] =-1 →
When robot unit H is filled with 8 robots, the robot in the void [iv] generation position, the robot adjacent to the robot in the direction of action a_head, and further to the robot in the direction of a_head If there are any robots, move them all together and move them one step in the direction of a_head. Set trj_cnt [iv] ← 0. If you are not satisfied, do nothing.

以上の手法によるロボットの隊列形成は、24個以上のロボット単位で構成される任意の形状の開始位置の集合SU、、目標位置の集合GUのロボット群に適用可能である。 The formation of a robot row by the above method can be applied to a robot group of a start position set SU and a target position set GU having an arbitrary shape composed of 24 or more robot units.

＜第二実施形態に係る行動制御システム２００＞
第二実施形態に係る行動制御システム２００は、以上に説明した各処理によって構成される全体動作について以下にまとめる。
(1)移動先隊列決定用動作計画のために使用するマンハッタン距離δ（各位置単位から入口位置単位PeUまでのマンハッタン距離）を計算する。尾部ロボット単位近辺のロボット動作計画に使用するマンハッタン距離δlocalをCalculate_delta_localを実行し計算する。
(2)以下(3),(4)を全GU内にロボットが充填されるまで繰り返す。
(3)移動先隊列決定用動作計画処理Next_Formation_Decisionを実行する。
(4)各ロボット動作決定用動作計画処理を実行する。 <Behavior Control System 200 According to Second Embodiment>
The behavior control system 200 according to the second embodiment summarizes the overall operation configured by the processes described above.
(1) Calculate the Manhattan distance δ (Manhattan distance from each position unit to the entrance position unit PeU) used for the movement plan for the movement destination platoon determination. Calculate_delta_local is used to calculate the Manhattan distance δlocal used for robot motion planning near the tail robot unit.
(2) Repeat (3) and (4) below until all GUs are filled with robots.
(3) Execute the movement plan process Next_Formation_Decision for determining the movement target platoon.
(4) The robot motion determination motion planning process is executed.

図３２は第一実施形態に係る行動制御システム２００の機能ブロック図を、図３３はその処理フローの例を示す。行動制御システム２００は、図３２に示すように、動作計画部２１０と、行動選択部２２０と、隣接状態判定部１２４と、位置更新部１２３と、位置判定部１２６と、記憶部２４０と、通信部１５０と、入力部１６０とを含む。行動選択部２２０は移動先隊列決定用動作計画部２２１と各制御対象物動作用動作計画部２２２とを含む。第一実施形態と異なる部分である動作計画部２１０と、行動選択部２２０と、記憶部２４０を中心に説明する。他の部分については、第一実施形態と同様の処理を行う。ただし、通信部１５０及び隣接状態判定部１２４では、二次元平面上の上下左右方向ではなく、三次元空間上の上下左右前後方向(以下「６方向」ともいう)において、通信を行ったり、隣接状態を判定したりする。 FIG. 32 is a functional block diagram of the behavior control system 200 according to the first embodiment, and FIG. 33 shows an example of the processing flow. As shown in FIG. 32, the behavior control system 200 includes an operation planning unit 210, a behavior selection unit 220, an adjacent state determination unit 124, a position update unit 123, a position determination unit 126, a storage unit 240, and communication. Part 150 and input part 160. The action selection part 220 includes a movement plan determination action planning part 221 and each control object action action planning part 222. The operation plan unit 210, the action selection unit 220, and the storage unit 240, which are different from the first embodiment, will be mainly described. About the other part, the process similar to 1st embodiment is performed. However, the communication unit 150 and the adjacent state determination unit 124 perform communication in the up / down / left / right front / rear direction (hereinafter also referred to as “six directions”) in the three-dimensional space instead of the up / down / left / right direction on the two-dimensional plane. To determine the state.

本実施形態では、行動制御システム２００は、p(pは、192以上の整数の何れかであって、８の倍数)台のロボットの行動を制御し、p台のロボットの内の１つのロボット上に実装される。 In this embodiment, the behavior control system 200 controls the behavior of p robots (p is an integer of 192 or more and a multiple of 8), and one robot among the p robots. Implemented above.

＜動作計画部２１０＞
動作計画部２１０は、上述の[マンハッタン距離δの計算]及び[パスPの決定]で説明した方法により、開始位置の集合SU、目標位置の集合GU及びパスPの何れかの各位置から入口位置Peまでのマンハッタン距離δを、ロボットの任務行動開始前に事前に計算し（Ｓ２１０）、記憶部１４０に格納する。ただし、目標位置の集合の各位置から入口位置Peまでのマンハッタン距離δは、その符号を負とする。また、開始位置の集合SU、目標位置の集合GU及びパスPの何れにも含まれない全ての位置のマンハッタン距離δをs_max（状態空間内の格子数より大きな値）として記憶部１４０に格納する。また、同様にロボットの任務行動開始前にCalculate_delta_localを実行し、尾部ロボット単位近辺のロボット動作計画に使用するマンハッタン距離δlocalを計算する。なお、別装置でマンハッタン距離δを計算しておき、ロボットの任務行動開始前に事前に記憶部１４０に格納しておけば、行動制御システム１００は、動作計画部２１０を備えなくともよい。 <Operation Planning Unit 210>
The motion planning unit 210 enters the entrance from each position of the start position set SU, the target position set GU, and the path P by the method described in [Calculation of Manhattan distance δ] and [Determination of path P]. The Manhattan distance δ to the position Pe is calculated in advance before the robot's mission action starts (S210) and stored in the storage unit 140. However, the sign of the Manhattan distance δ from each position of the set of target positions to the entrance position Pe is negative. Further, the Manhattan distance δ of all positions not included in any of the start position set SU, the target position set GU, and the path P is stored in the storage unit 140 as s_max (a value larger than the number of grids in the state space). . Similarly, Calculate_delta_local is executed before the robot mission action is started, and the Manhattan distance δlocal used for the robot motion plan near the tail robot unit is calculated. If the Manhattan distance δ is calculated by a separate device and stored in the storage unit 140 in advance before starting the robot mission behavior, the behavior control system 100 may not include the motion planning unit 210.

＜記憶部２４０＞
記憶部２４０には、各位置sから入口位置Peまでのマンハッタン距離δと、尾部ロボット単位近辺のロボット動作計画用マンハッタン距離δlocalが記憶されているとする。sの取りうる範囲は、対象となる三次元空間上の領域内のロボット単位が存在しうる全ての座標である。 <Storage unit 240>
It is assumed that the storage unit 240 stores a Manhattan distance δ from each position s to the entrance position Pe and a robot operation planning Manhattan distance δlocal near the tail robot unit. The range that s can take is all the coordinates where the robot unit in the region in the target three-dimensional space can exist.

＜行動選択部２２０＞
行動選択部２２０は、記憶部２４０からマンハッタン距離δ、δlocalを取り出す。 <Action selection unit 220>
The action selection unit 220 extracts the Manhattan distances δ and δlocal from the storage unit 240.

行動選択部２２０は、上述の方法で説明した方法で、p台のロボットを制御する（ｓ２２０）。 The action selection unit 220 controls the p robots by the method described in the above method (s220).

行動選択部２２０の移動先隊列決定用動作計画部２２１は、マンハッタン距離δを用いて、移動先隊列決定用動作計画処理Next_Formation_Decisionを行い（ｓ２２１）、開始位置の集合に配置されたp台のロボットを目標位置の集合に移動させる際に、頭部ロボット単位が移動の先頭を務めるように、各位置単位におけるマンハッタン距離δを用いてある時刻の隊列Gaに対して隊列Gbを決定し、頭部ロボット単位及び尾部ロボット単位の位置を出力する。 The movement plan determining action planning unit 221 of the action selecting unit 220 uses the Manhattan distance δ to perform the moving plan determining action planning process Next_Formation_Decision (s221), and p robots arranged in the set of starting positions. When moving the head to the set of target positions, the head row robot unit determines the row Gb for the row Ga at a certain time using the Manhattan distance δ in each position unit so that the head robot unit acts as the head of the movement. Outputs the position of robot unit and tail robot unit.

さらに、行動選択部２２０の各制御対象物動作用動作計画部２２２は、頭部ロボット単位及び尾部ロボット単位の位置を受け取り、マンハッタン距離δlocalを使って各ロボット動作用動作計画を行い（ｓ２２２−１）、頭部ロボット単位の位置に８台のロボットが位置するように、各ロボットに対して制御信号を送り、各ロボットの動作を制御する。各ロボットは制御信号に従って行動する。 Further, the motion planning unit 222 for each controlled object motion of the behavior selection unit 220 receives the position of the head robot unit and the tail robot unit, and performs a motion plan for each robot motion using the Manhattan distance δlocal (s222-1). ) A control signal is sent to each robot to control the operation of each robot so that eight robots are positioned at the position of the head robot unit. Each robot behaves according to the control signal.

各ロボットは制御信号に従って行動する。 Each robot behaves according to the control signal.

なお、一回の行動制御により、１回の移動先隊列決定用動作計画処理Next_Formation_Decisionで決定された頭部ロボット単位を充填する処理が行われる。 In addition, the process which fills the head robot unit determined by one movement plan decision operation plan process Next_Formation_Decision by one action control is performed.

移動先隊列決定用動作計画処理Next_Formation_Decisionにおいて行われる頭部決定処理Head_Robot_Decisionの中で、各ロボット単位の位置や障害物位置を必要とするが、各ロボット単位の位置ついては前述の通り、取得可能である。また、障害物の位置に関しては、マンハッタン距離δを計算する際に利用した障害物の位置を利用してもよいし、隣接状態判定部１２４で判定した判定結果を用いて得られるものを利用してもよい。 In the head determination process Head_Robot_Decision that is performed in the movement plan process Next_Formation_Decision for determining the movement target platoon, the position of each robot unit and the position of the obstacle are required, but the position of each robot unit can be acquired as described above. . As for the position of the obstacle, the position of the obstacle used when calculating the Manhattan distance δ may be used, or the one obtained using the determination result determined by the adjacent state determination unit 124 is used. May be.

本実施形態により制御され、図２７Ａの開始位置の集合から図２７Ｂの目標位置の集合に変形する際のロボット群の遷移状態の例を図３４に示す。 FIG. 34 shows an example of the transition state of the robot group that is controlled by this embodiment and is transformed from the set of start positions in FIG. 27A to the set of target positions in FIG. 27B.

＜効果＞
このような構成により、第一実施形態と同様の効果を得ることができる。また、2次元平面上だけでなく、三次元空間上での制御対象物の制御が可能となる。 <Effect>
With such a configuration, the same effect as that of the first embodiment can be obtained. In addition, the control object can be controlled not only on the two-dimensional plane but also on the three-dimensional space.

＜変形例＞
本実施形態では、各格子（マス）は、立方体であるが、他の形状であってもよい。格子は左右方向、上下方向及び前後方向に連続して配置される。また、各格子は左右方向で他の二つの格子と隣接し、上下方向で他の二つの格子と隣接し、前後方向で他の二つの格子と隣接する。言い換えると、各格子は、ロボットの移動できる方向と同じ方向においてのみ、他の格子と隣接する。この条件を満たせば、各格子はどのような形状であってもよい。また、「直交」とは、厳密に「垂直に交わること」を意味しなくともよく、例えば、各格子は、平行六面体であってもよく、各格子が他の二つの格子と隣接する方向の一方を上下方向とし、他方を左右方向とすればよく、上下方向及び左右方向とからなる平面に対して平行でない方向を前後方向とすればよい。 <Modification>
In this embodiment, each lattice (mass) is a cube, but may have other shapes. The lattice is continuously arranged in the left-right direction, the up-down direction, and the front-back direction. Each lattice is adjacent to the other two lattices in the left-right direction, is adjacent to the other two lattices in the vertical direction, and is adjacent to the other two lattices in the front-rear direction. In other words, each grid is adjacent to other grids only in the same direction that the robot can move. Each lattice may have any shape as long as this condition is satisfied. Further, the term “orthogonal” does not necessarily mean strictly “perpendicularly”. For example, each lattice may be a parallelepiped, and each lattice is adjacent to the other two lattices. One side may be the up-down direction and the other side may be the left-right direction, and a direction that is not parallel to the plane formed by the up-down direction and the left-right direction may be the front-back direction.

別の言い方をすると、制御対象物は、三次元空間上の、第一方向（例えば右方向）、第一方向に対して平行でない方向である第二方向（例えば上方向）、第一方向に対して反対方向である第三方向（例えば左方向）、第二方向に対して反対方向である第四方向（例えば下方向）、第一方向及び第二方向の成す平面に対して平行でない方向を第五方向（例えば前方向）、第五方向に対して反対方向である第六方向に移動可能であり、一回の行動制御により、現在いる領域（格子、マス）から、現在いる領域に対して、第一方向、第二方向、第三方向、第四方向、第五方向、第六方向において隣接する領域の何れかに移動するように制御される。また、この場合、ロボットの２次元平面上の、第一方向において隣接する位置を第一位置、第二方向において隣接する位置を第二位置、第三方向において隣接する位置を第三位置、第四方向において隣接する位置を第四位置、第五方向において隣接する位置を第五位置、第六方向において隣接する位置を第六位置と呼んでもよい。 In other words, the control object is in the first direction (for example, the right direction), the second direction (for example, the upper direction) that is not parallel to the first direction, and the first direction in the three-dimensional space. A third direction (for example, the left direction) opposite to the second direction, a fourth direction (for example, the downward direction) opposite to the second direction, a direction not parallel to the plane formed by the first direction and the second direction. Can be moved in the fifth direction (for example, the forward direction) and the sixth direction, which is the opposite direction to the fifth direction, and from the current region (lattice, square) to the current region by one action control. On the other hand, it is controlled to move to any of the adjacent areas in the first direction, the second direction, the third direction, the fourth direction, the fifth direction, and the sixth direction. In this case, on the two-dimensional plane of the robot, the position adjacent in the first direction is the first position, the position adjacent in the second direction is the second position, the position adjacent in the third direction is the third position, A position adjacent in the four directions may be referred to as a fourth position, a position adjacent in the fifth direction as a fifth position, and a position adjacent in the sixth direction as a sixth position.

本実施形態では、ロボット単位は、２×２×２の８台のロボットからなるが、M×N×Q台のロボットからなってもよい。ただし、M、N及びQはそれぞれ２以上の整数の何れかである。なお、ロボットの台数pはロボット単位に含まれるロボットの台数M×N×Qの24以上の倍数とする。つまり、p=M×N×Q×Rであり、Rは24以上の整数の何れかである。このような状態で移動を行った場合にも、２×２×２のロボット単位のときと同様に、ロボットの群れの中のいずれのロボットが一台欠けても、各ロボットはお互いに一つの辺で接しあう位置関係を崩さずに済む。M×N×Qの台数を大きくすると、ロボット単位の個数が減り、行動できるパターンが減るため、マンハッタン距離δの計算量が減る。しかし、M×N×Q個のマスを一つの単位とするマス単位の中に1つでも障害物が含まれる場合には、そのマス単位には移動できないものとして扱うため、M×N×Qの台数を大きくすると、移動できない範囲が多くなり、行動できる範囲が小さくなる。そもそも動作計画において、各位置から入口位置Peまでのマンハッタン距離を調べさえすればよいため、従来技術に比べれば、十分に計算量が減っている。そのため、行動できる範囲を大きく保つために２×２×２台のロボットを一つの単位とするロボット単位とすることが望ましい。 In this embodiment, the robot unit is composed of 8 robots of 2 × 2 × 2, but may be composed of M × N × Q robots. However, M, N, and Q are each an integer of 2 or more. The number p of robots is a multiple of 24 or more of the number M × N × Q of robots included in the robot unit. That is, p = M × N × Q × R, and R is any integer of 24 or more. Even when moving in such a state, as in the case of a 2 × 2 × 2 robot unit, even if any robot in the group of robots is missing, each robot It is not necessary to break the positional relationship that touches the sides. When the number of M × N × Q is increased, the number of robot units decreases, and the number of patterns that can be acted on decreases, so the amount of calculation of the Manhattan distance δ decreases. However, if even one obstacle is included in a square unit with M x N x Q squares as one unit, it will be treated as being unable to move to that square unit, so M x N x Q Increasing the number of vehicles increases the range in which movement is not possible and the range in which action can be reduced. In the first place, since it is only necessary to check the Manhattan distance from each position to the entrance position Pe in the operation plan, the amount of calculation is sufficiently reduced as compared with the prior art. For this reason, in order to keep a large range in which the robot can act, it is desirable to use 2 × 2 × 2 robots as a robot unit.

＜その他の変形例＞
本発明は上記の実施形態及び変形例に限定されるものではない。例えば、上述の各種の処理は、記載に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されてもよい。その他、本発明の趣旨を逸脱しない範囲で適宜変更が可能である。 <Other variations>
The present invention is not limited to the above-described embodiments and modifications. For example, the various processes described above are not only executed in time series according to the description, but may also be executed in parallel or individually as required by the processing capability of the apparatus that executes the processes. In addition, it can change suitably in the range which does not deviate from the meaning of this invention.

＜プログラム及び記録媒体＞
また、上記の実施形態及び変形例で説明した各装置における各種の処理機能をコンピュータによって実現してもよい。その場合、各装置が有すべき機能の処理内容はプログラムによって記述される。そして、このプログラムをコンピュータで実行することにより、上記各装置における各種の処理機能がコンピュータ上で実現される。 <Program and recording medium>
In addition, various processing functions in each device described in the above embodiments and modifications may be realized by a computer. In that case, the processing contents of the functions that each device should have are described by a program. Then, by executing this program on a computer, various processing functions in each of the above devices are realized on the computer.

この処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、例えば、磁気記録装置、光ディスク、光磁気記録媒体、半導体メモリ等どのようなものでもよい。 The program describing the processing contents can be recorded on a computer-readable recording medium. As the computer-readable recording medium, for example, any recording medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory may be used.

また、このプログラムの流通は、例えば、そのプログラムを記録したＤＶＤ、ＣＤ−ＲＯＭ等の可搬型記録媒体を販売、譲渡、貸与等することによって行う。さらに、このプログラムをサーバコンピュータの記憶装置に格納しておき、ネットワークを介して、サーバコンピュータから他のコンピュータにそのプログラムを転送することにより、このプログラムを流通させてもよい。 The program is distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Further, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.

このようなプログラムを実行するコンピュータは、例えば、まず、可搬型記録媒体に記録されたプログラムもしくはサーバコンピュータから転送されたプログラムを、一旦、自己の記憶部に格納する。そして、処理の実行時、このコンピュータは、自己の記憶部に格納されたプログラムを読み取り、読み取ったプログラムに従った処理を実行する。また、このプログラムの別の実施形態として、コンピュータが可搬型記録媒体から直接プログラムを読み取り、そのプログラムに従った処理を実行することとしてもよい。さらに、このコンピュータにサーバコンピュータからプログラムが転送されるたびに、逐次、受け取ったプログラムに従った処理を実行することとしてもよい。また、サーバコンピュータから、このコンピュータへのプログラムの転送は行わず、その実行指示と結果取得のみによって処理機能を実現する、いわゆるＡＳＰ（Application Service Provider）型のサービスによって、上述の処理を実行する構成としてもよい。なお、プログラムには、電子計算機による処理の用に供する情報であってプログラムに準ずるもの（コンピュータに対する直接の指令ではないがコンピュータの処理を規定する性質を有するデータ等）を含むものとする。 A computer that executes such a program first stores, for example, a program recorded on a portable recording medium or a program transferred from a server computer in its storage unit. When executing the process, this computer reads the program stored in its own storage unit and executes the process according to the read program. As another embodiment of this program, a computer may read a program directly from a portable recording medium and execute processing according to the program. Further, each time a program is transferred from the server computer to the computer, processing according to the received program may be executed sequentially. Also, the program is not transferred from the server computer to the computer, and the above-described processing is executed by a so-called ASP (Application Service Provider) type service that realizes the processing function only by the execution instruction and result acquisition. It is good. Note that the program includes information provided for processing by the electronic computer and equivalent to the program (data that is not a direct command to the computer but has a property that defines the processing of the computer).

また、コンピュータ上で所定のプログラムを実行させることにより、各装置を構成することとしたが、これらの処理内容の少なくとも一部をハードウェア的に実現することとしてもよい。 In addition, although each device is configured by executing a predetermined program on a computer, at least a part of these processing contents may be realized by hardware.

Claims

M and N are each an integer greater than or equal to 2, p is an integer greater than or equal to M × N, and p control objects arranged in a set of start positions include target positions including a predetermined entrance position A behavior control system for performing behavior control to move to a set of
The direction not parallel to the first direction is the second direction, the opposite direction to the first direction is the third direction, the opposite direction to the second direction is the fourth direction, and each start position and each The target positions are adjacent to other start positions and target positions in at least one of the first direction to the fourth direction, respectively, and the set of the target positions and the set of the start positions each have an arbitrary shape in a lump. And
The control object includes a first position adjacent in the first direction on the two-dimensional plane of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, and a fourth position. It is assumed that the fourth position is adjacent in the direction and is controlled to move to any one of the first to fourth positions on the two-dimensional plane.
The p control objects constitute one control object unit for each M × N units, and the M × N control objects constituting one control object unit each have two or more directions. Adjacent to other control objects constituting the control object unit,
When a set of target position units consisting of position units consisting of N × N positions is set from the set of target positions, and the control object unit takes each action a at the current position s of the control object unit It is controlled based on one value function that represents the appropriateness of the image, and is stationary or moved to a position unit composed of N × N positions in any of the first to fourth directions on the two-dimensional plane. And shall be controlled as
A storage unit for storing the value function;
If there is no control object in the set of all target position units, Ga is the formation that consists of multiple control object units, Gb is the other formation that consists of multiple control object units, and exists in formation Gb. A control target unit that does not exist in the formation Ga is a head control target unit, and when moving the p control targets arranged in the start position set to the target position set, the head control target The movement plan determination operation planning unit for determining the column Gb for the column Ga at a certain time using the value function so that the object unit serves as the head of movement,
Each control object that controls the operation of each control object so that M × N control objects are located at the position of the head control object unit when there is no control object in the set of all target position units An operation planning section for object movement,
When a control object exists in a set of all target position units, each of the control object operation motion planning units is included in the set of target position units but is not included in the set of target positions. To control each object to move,
Behavior control system.

The behavior control system according to claim 1,
M = 2, N = 2,
Behavior control system.

M and N are each an integer greater than or equal to 2, p is an integer greater than or equal to M × N, and p control objects arranged in a set of start positions include target positions including a predetermined entrance position A behavior control method for performing behavior control to move to a set of
The direction not parallel to the first direction is the second direction, the opposite direction to the first direction is the third direction, the opposite direction to the second direction is the fourth direction, and each start position and each The target positions are adjacent to other start positions and target positions in at least one of the first direction to the fourth direction, respectively, and the set of the target positions and the set of the start positions each have an arbitrary shape in a lump. And
The control object includes a first position adjacent in the first direction on the two-dimensional plane of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, and a fourth position. It is assumed that the fourth position is adjacent in the direction and is controlled to move to any one of the first to fourth positions on the two-dimensional plane.
The p control objects constitute one control object unit for each M × N units, and the M × N control objects constituting one control object unit each have two or more directions. Adjacent to other control objects constituting the control object unit,
When a set of target position units consisting of position units consisting of N × N positions is set from the set of target positions, and the control object unit takes each action a at the current position s of the control object unit It is controlled based on one value function that represents the appropriateness of the image, and is stationary or moved to a position unit composed of N × N positions in any of the first to fourth directions on the two-dimensional plane. And shall be controlled as
If the movement planning unit for determining the movement target platoon has no control object in the set of all target position units, Ga is defined as a platoon in which multiple control object units are formed, and other control object units are formed. The convoy is Gb, the control object unit that is present in the convoy Gb and not in the convoy Ga is the head control object unit, and the p control objects arranged in the set of the start positions are the set of the target positions. When moving to, movement target corporal decision operation planning step for determining the platoon Gb for the constellation Ga at a certain time using the value function, so that the head control object unit serves as the head of the movement,
When there is no control object in the set of all target position units, each motion target motion planning unit is configured so that M × N control objects are located at the position of the head control object unit. Controlling the operation of the control object;
When each control target object motion planning unit includes a control target object in a set of all target position units, each control target object motion planning step is included in the target position unit set. A step of controlling each control object to move to a position not included in the set of target positions,
Behavior control method.

The behavior control method according to claim 3,
M = 2, N = 2,
Behavior control method.

A program for causing a computer to function as the behavior control system according to claim 1.

M, N, and Q are each an integer greater than or equal to 2, R is any integer greater than or equal to 24, p is M × N × Q × R, and control of p units arranged in the set of start positions A behavior control system that performs behavior control for moving an object to a set of target positions including a predetermined entrance position,
The direction that is not parallel to the first direction is the second direction, the direction opposite to the first direction is the third direction, the direction opposite to the second direction is the fourth direction, the first direction and the first direction The direction that is not parallel to the plane formed by the two directions is the fifth direction, the opposite direction to the fifth direction is the sixth direction, and one position unit consisting of M × N × Q positions, The start position and each target position are adjacent to other start positions and target positions in at least one of the first direction to the sixth direction, respectively, and the set of target positions and the set of start positions are each R pieces. It forms an arbitrary shape consisting of a plurality of position units,
The control object includes a first position adjacent in the first direction on the three-dimensional space of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, and a fourth direction. Adjacent fourth position, fifth position adjacent in the fifth direction, sixth position adjacent in the sixth direction, either stationary, or any of the first to sixth positions in the three-dimensional space Shall be controlled to move,
The above-mentioned control objects of p units constitute one control object unit for each M × N × Q units, and there are three control objects of M × N × Q units constituting one control object unit. Adjacent to other control objects constituting the control object unit in the above direction,
The controlled object unit is controlled based on one value function that represents the appropriateness when each action is taken in the current position unit of the controlled object unit, and the controlled object unit is stationary or in a three-dimensional space. It shall be controlled to move to a position unit consisting of N × N × Q positions in any of the first to sixth directions,
A storage unit for storing the value function;
A convoy that consists of multiple control object units is Ga, another convoy that consists of multiple control object units is Gb, and a control object unit that exists in the convoy Gb but does not exist in the convoy Ga is a head control object unit. And the value function is used so that the head control object unit acts as a head of movement when moving the p control objects arranged in the set of start positions to the set of target positions. The movement plan determining unit for determining the column Gb for the column Ga at a certain time,
An operation planning unit for each control object operation that controls the operation of each control object such that M × N × Q control objects are located at the position of the head control object unit,
The value function is obtained using a Manhattan distance from a certain position unit to a position unit including the predetermined entrance position.
Behavior control system.

The behavior control system according to claim 6,
M = 2, N = 2, Q = 2,
Behavior control system.

M, N, and Q are each an integer greater than or equal to 2, R is any integer greater than or equal to 24, p is M × N × Q × R, and control of p units arranged in the set of start positions A behavior control method for performing behavior control for moving an object to a set of target positions including a predetermined entrance position,
The direction that is not parallel to the first direction is the second direction, the direction opposite to the first direction is the third direction, the direction opposite to the second direction is the fourth direction, the first direction and the first direction The direction that is not parallel to the plane formed by the two directions is the fifth direction, the opposite direction to the fifth direction is the sixth direction, and one position unit consisting of M × N × Q positions, The start position and each target position are adjacent to other start positions and target positions in at least one of the first direction to the sixth direction, respectively, and the set of target positions and the set of start positions are each R pieces. It forms an arbitrary shape consisting of a plurality of position units,
The control object includes a first position adjacent in the first direction on the three-dimensional space of the control object, a second position adjacent in the second direction, a third position adjacent in the third direction, and a fourth direction. Adjacent fourth position, fifth position adjacent in the fifth direction, sixth position adjacent in the sixth direction, either stationary, or any of the first to sixth positions in the three-dimensional space Shall be controlled to move,
The above-mentioned control objects of p units constitute one control object unit for each M × N × Q units, and there are three control objects of M × N × Q units constituting one control object unit. Adjacent to other control objects constituting the control object unit in the above direction,
The controlled object unit is controlled based on one value function that represents the appropriateness when each action is taken in the current position unit of the controlled object unit, and the controlled object unit is stationary or in a three-dimensional space. It shall be controlled to move to a position unit consisting of N × N × Q positions in any of the first to sixth directions,
The movement planning unit for determining the movement target platoon sets Ga as a platoon consisting of multiple control target units, and Gb as another platoon consisting of multiple control target units. When the object unit is a head control object unit and the p control objects arranged in the start position set are moved to the target position set, the head control object unit is the head of the movement. To determine the formation Gb for the formation Ga at a certain time using the value function,
For each control object operation, the operation planning unit for each control object operation controls the operation of each control object so that M × N × Q control objects are located at the position of the head control object unit. An action planning step,
The value function is obtained using a Manhattan distance from a certain position unit to a position unit including the predetermined entrance position.
Behavior control method.

The behavior control method according to claim 8,
M = 2, N = 2, Q = 2,
Behavior control method.

A program for causing a computer to function as the behavior control system according to claim 6 or 7.