JP2017091460A

JP2017091460A - Compute node network system

Info

Publication number: JP2017091460A
Application number: JP2015224988A
Authority: JP
Inventors: Koji Nakano (中野 浩嗣); Satoshi Fujita (藤田 聡); Michihiro Koibuchi (鯉渕 道紘); Kazutake Fujiwara (藤原 一毅)
Original assignee: Hiroshima University NUC; Research Organization of Information and Systems
Current assignee: Hiroshima University NUC; Research Organization of Information and Systems
Priority date: 2015-11-17
Filing date: 2015-11-17
Publication date: 2017-05-25

Abstract

[Problem] To provide a computing node network system whose network diameter is as small as possible. [Solution] A computing node network system (100A) comprises a plurality of computing nodes (10) arranged two-dimensionally so as to fit within a rectangular region (30), and a plurality of communication links (20) each connecting an arbitrary pair of the computing nodes (10). Each computing node (10) is placed at a lattice point of a virtual first grid (40), and each communication link (20) is routed along a virtual second grid (50). The first grid (40) and the second grid (50) have the same orientation, and both are inclined at 45° with respect to the rectangular region (30). [Selected Figure] FIG. 1

Description

The present invention relates to a computing node network system in which a plurality of computing nodes are connected by communication links.

As processors become many-core and computer systems become massively parallel (on-chip: tens of cores; chip-to-chip: on the order of 100,000 nodes), the communication delay of the interconnection network, which connects components of a computer system such as memory, storage, and processor cores, accounts for a growing share of the performance of the whole computing system. For example, many multi-core parallel applications on next-generation high-performance systems are predicted to require MPI (Message Passing Interface) communication latencies of a few hundred nanoseconds to 1 microsecond. For the (packet) network inside a many-core processor chip, the communication delay between processor cores and the communication delay to the L2 cache are required to be at most a few cycles. These designs face the question of how cores should be interconnected efficiently, that is, the problem of designing the network configuration (hereinafter called the network topology).

For both chip-to-chip and on-chip networks, one effective way to reduce the communication delay of the interconnection network is to adopt a network topology with a small diameter and a small average shortest path length. Simulation and analysis results have in fact shown that adopting such a topology, on-chip or chip-to-chip, improves the performance of many parallel applications. Interestingly, the leading graphs at present are graphs with randomness (e.g., topologies that add random shortcut links to the existing links).

When a random topology is actually used in an on-chip network or a supercomputer, the communication link length is constrained. For example, in an on-chip network, the link delay of metal wiring is ideally 50 ps/mm or less in a 65 nm process, so the wire length must be kept to one suited to the operating frequency. In chip-to-chip networks, link bandwidths of 40 Gbps or more are becoming standard. In that case, electrical cables are limited to 7 m for InfiniBand and 5 m for Ethernet (registered trademark). If soft error rates such as bit corruption are a concern, cables must be shorter still. Non-Patent Document 1 below discloses an algorithm for generating random topologies.

Non-Patent Document 1: Daisuke Takafuji, Satoshi Fujita, Koji Nakano, Kazutake Fujiwara, Michihiro Koibuchi, "Improvement of Random Topology Generation Algorithm" (「ランダムトポロジの生成アルゴリズムの改良」), IEICE Technical Report, IEICE, August 2015, vol. 115, no. 174, pp. 217-221

FIG. 7 is a plan view schematically showing a conventional computing node network system. In the conventional system, a plurality of computing nodes 10 are placed at lattice points of a virtual grid 40 in a rectangular region 30, and arbitrary pairs of computing nodes 10 are connected by communication links 20 routed along a virtual grid 50 (substantially the same grid as grid 40). In FIG. 7, for example, 8 × 8 = 64 computing nodes 10 are arranged in a lattice.

In general, a computing node network system has the problem that the link distance between diagonally located computing nodes is the longest. In the system shown in FIG. 7, for example, the link distance between the upper-left computing node 10 and the lower-right computing node 10 is the longest. Therefore, when the communication link length is tightly limited, the hop count between these two nodes is very likely to determine the diameter (also called the network diameter). Because the maximum communication delay grows with the network diameter, the diameter should be as small as possible.

In view of the above problem, an object of the present invention is to provide a computing node network system whose network diameter is as small as possible.

A computing node network system according to one aspect of the present invention comprises a plurality of computing nodes arranged two-dimensionally so as to fit within a rectangular region, and a plurality of communication links each connecting an arbitrary pair of the computing nodes. Each computing node is placed at a lattice point of a virtual first grid, and each communication link is routed along a virtual second grid. At least two of the following three differ from each other: the vertical and horizontal directions of the rectangular region, the orientation of the first grid, and the orientation of the second grid.

With this configuration, the longest link distance between computing nodes can be made shorter than in the conventional system, and the network diameter can be reduced.

For example, the first grid and the second grid have the same orientation, and both are inclined at 45° with respect to the rectangular region.

Alternatively, the orientation of the first grid coincides with the vertical and horizontal directions of the rectangular region, and the second grid is inclined at 45° with respect to the rectangular region and the first grid.

Alternatively, the orientation of the second grid coincides with the vertical and horizontal directions of the rectangular region, and the first grid is inclined at 45° with respect to the rectangular region and the second grid.

Preferably, the plurality of communication links are obtained by first building a network at random and then repeatedly replacing arbitrary communication links locally so that the network diameter and the average hop count become smaller.

According to the present invention, a computing node network system whose network diameter is as small as possible can be realized. This reduces the communication delay between computing nodes in the computing node network system.

FIG. 1: Plan view schematically showing a computing node network system according to a first embodiment of the present invention
FIG. 2: Plan view schematically showing a computing node network system according to a second embodiment of the present invention
FIG. 3: Plan view schematically showing a computing node network system according to a third embodiment of the present invention
FIG. 4: Plan view schematically showing a computing node network system according to a fourth embodiment of the present invention
FIG. 5: Flowchart of the wiring optimization process
FIG. 6: Diagram explaining the swapping of two communication links
FIG. 7: Plan view schematically showing a conventional computing node network system

Embodiments are described in detail below with reference to the drawings as appropriate. Unnecessarily detailed explanation may be omitted, however. For example, detailed descriptions of well-known matters and duplicate descriptions of substantially identical configurations may be omitted. This is to keep the following description from becoming needlessly redundant and to make it easier for those skilled in the art to understand.

The inventors provide the accompanying drawings and the following description so that those skilled in the art can fully understand the present invention; they are not intended to limit the subject matter recited in the claims. The size of each element and the detailed shapes shown in the drawings may differ from the actual ones.

≪First Embodiment≫
FIG. 1 is a plan view schematically showing a computing node network system according to the first embodiment of the present invention. The computing node network system 100A according to this embodiment comprises a plurality of computing nodes 10 arranged two-dimensionally so as to fit within a rectangular region 30, and a plurality of communication links 20 each connecting an arbitrary pair of these computing nodes 10. For convenience, FIG. 1 shows only a few communication links 20; in practice many more are wired. A communication link 20 drawn passing over a computing node 10 is not connected to that computing node 10.

Specifically, the computing node network system 100A is a supercomputer, a NoC (network on chip), or the like. In a supercomputer, a computing node 10 corresponds to an individual rack housing system boards carrying CPUs and memory, magnetic disk devices, power supply units, I/O system boards (switch devices), a cooling mechanism, and so on. A supercomputer is built by regularly arranging several hundred such racks in a computer room. In a NoC, a computing node 10 corresponds to an individual processor core implementing arithmetic circuits, logic circuits, cache memory, communication ports (switch circuits), and so on. A NoC is built by regularly arranging several hundred such processor cores on a semiconductor substrate.

In the computing node network system 100A, each computing node 10 (specifically, a rack or a processor core) is placed at a lattice point of a virtual grid 40. Here, the grid 40 is inclined at 45° with respect to the vertical and horizontal directions of the rectangular region 30.

The communication links 20 are routed along a virtual grid 50. In a supercomputer, a communication link 20 corresponds to an optical fiber, an electrical cable, or the like. In a NoC, a communication link 20 corresponds to metal wiring or the like. Here, the grid 50 is also inclined at 45° with respect to the vertical and horizontal directions of the rectangular region 30, and the grid 50 and the grid 40 are substantially the same grid.

That is, in the computing node network system 100A, the orientation of the grid 40, which is the reference for placing the computing nodes 10, coincides with the orientation of the grid 50, which is the reference for routing the communication links 20, and both the grid 40 and the grid 50 are inclined at 45° with respect to the rectangular region 30.

According to this embodiment, the two diagonally opposite computing nodes 10 that are farthest apart in the computing node network system 100A can be connected along the diagonal. This reduces the longest link distance between computing nodes to 1/√2 (about 70%) of that in the conventional system.
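The 1/√2 figure can be checked numerically. The following is a minimal sketch, assuming a square region and assuming that a link routed along a grid has a length equal to the L1 (Manhattan) distance measured along that grid's axes. The function name `wire_length` and the side length of 8 units are illustrative and do not come from the source.

```python
import math

def wire_length(p, q, grid_angle_deg):
    """Length of a link from p to q routed along a square grid that is
    rotated by grid_angle_deg relative to the rectangular region.
    Assumption: the link follows the grid axes, so its length is the
    L1 distance measured in the rotated coordinate frame."""
    t = math.radians(grid_angle_deg)
    dx, dy = q[0] - p[0], q[1] - p[1]
    # Project the displacement onto the two grid axes.
    u = dx * math.cos(t) + dy * math.sin(t)
    v = -dx * math.sin(t) + dy * math.cos(t)
    return abs(u) + abs(v)

side = 8.0  # illustrative side length of a square region
corner_a, corner_b = (0.0, 0.0), (side, side)

conventional = wire_length(corner_a, corner_b, 0.0)  # wiring grid parallel to the region
tilted = wire_length(corner_a, corner_b, 45.0)       # wiring grid tilted by 45 degrees

print(conventional, tilted, tilted / conventional)
```

Under these assumptions the corner-to-corner link shrinks from twice the side length to √2 times the side length, a ratio of about 0.707. For a non-square rectangle the ratio would differ.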

In this embodiment, a communication link 20 can also be connected to a vertical or horizontal side of every computing node 10, each of which is substantially square in plan view, that is, to a lateral side of the computing node 10. Because the outlets for communication links 20 are generally located on the sides of a computing node 10, it is desirable that communication links 20 can be connected to the lateral sides in this way.

≪Second Embodiment≫
FIG. 2 is a plan view schematically showing a computing node network system 100B according to the second embodiment. The computing node network system 100B according to this embodiment comprises a plurality of computing nodes 10 arranged two-dimensionally so as to fit within a rectangular region 30, and a plurality of communication links 20 each connecting an arbitrary pair of these computing nodes 10. For convenience, FIG. 2 shows only a few communication links 20; in practice many more are wired. A communication link 20 drawn passing over a computing node 10 is not connected to that computing node 10.

Specifically, the computing node network system 100B is a supercomputer, a NoC, or the like. In the computing node network system 100B, each computing node 10 (specifically, a rack or a processor core) is placed at a lattice point of a virtual grid 40. Here, the orientation of the grid 40 coincides with the vertical and horizontal directions of the rectangular region 30.

The communication links 20 are routed along a virtual grid 50. Here, the grid 50 is a grid that connects the lattice points of the grid 40 diagonally, and it is inclined at 45° with respect to the rectangular region 30 and the grid 40.

That is, in the computing node network system 100B, the vertical and horizontal directions of the rectangular region 30 in which the computing nodes 10 are placed coincide with the orientation of the grid 40, which is the reference for placing the computing nodes 10, and the grid 50, which is the reference for routing the communication links 20, is inclined at 45° with respect to the rectangular region 30 and the grid 40.

In this embodiment, the communication links 20 must be connected to the corners of the computing nodes 10, so connecting a communication link 20 to a computing node 10 is more complicated than in the first embodiment. Nevertheless, the two diagonally opposite computing nodes 10 that are farthest apart in the computing node network system 100B can be connected along the diagonal. This reduces the longest link distance between computing nodes to 1/√2 (about 70%) of that in the conventional system.

In particular, because the placement of the computing nodes 10 in this embodiment is the same lattice as in the conventional configuration, a conventional computing node network system can be converted into the computing node network system 100B of this embodiment simply by changing the wiring pattern of the communication links 20 while leaving the arrangement of the computing nodes 10 unchanged. For example, in an existing supercomputer, the longest link distance between computing nodes can be reduced just by routing the communication links 20 as in this embodiment, without moving any racks.

≪Third Embodiment≫
FIG. 3 is a plan view schematically showing a computing node network system according to the third embodiment of the present invention. The computing node network system 100C according to this embodiment comprises a plurality of computing nodes 10 arranged two-dimensionally so as to fit within a rectangular region 30, and a plurality of communication links 20 each connecting an arbitrary pair of these computing nodes 10. For convenience, FIG. 3 shows only a few communication links 20; in practice many more are wired. A communication link 20 drawn passing over a computing node 10 is not connected to that computing node 10.

Specifically, the computing node network system 100C is a supercomputer, a NoC, or the like. In the computing node network system 100C, each computing node 10 (specifically, a rack or a processor core) is placed at a lattice point of a virtual grid 40. Here, the grid 40 is inclined at 45° with respect to the vertical and horizontal directions of the rectangular region 30.

The communication links 20 are routed along a virtual grid 50. Here, the grid 50 is a grid that connects the lattice points of the grid 40 diagonally, and the orientation of the grid 50 coincides with the vertical and horizontal directions of the rectangular region 30.

That is, in the computing node network system 100C, the vertical and horizontal directions of the rectangular region 30 in which the computing nodes 10 are placed coincide with the orientation of the grid 50, which is the reference for routing the communication links 20, and the grid 40, which is the reference for placing the computing nodes 10, is inclined at 45° with respect to the rectangular region 30 and the grid 50.

In this embodiment, the communication links 20 must be connected to the corners of the computing nodes 10, so connecting a communication link 20 to a computing node 10 is more complicated than in the first embodiment. In return, the number of computing nodes 10 that can be reached with the minimum communication link length increases. In both this embodiment and the first embodiment, eight computing nodes 10 surround any given computing node 10. In the first embodiment only four of them can be reached with the minimum communication link length, whereas in this embodiment all eight surrounding computing nodes 10 can be reached with the minimum communication link length. This reduces the average hop count.
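The four-versus-eight count can be reproduced with a short sketch. It assumes, as an illustration only, that a link's length is the L1 distance measured along the wiring grid, and it expresses positions in units of the node-grid pitch. The names `wire_length` and `count_at_minimum` are hypothetical.

```python
import math

def wire_length(dx, dy, angle_deg):
    """L1 length of the displacement (dx, dy) measured along a wiring grid
    rotated by angle_deg relative to the node grid (assumed routing model)."""
    t = math.radians(angle_deg)
    u = dx * math.cos(t) + dy * math.sin(t)
    v = -dx * math.sin(t) + dy * math.cos(t)
    return abs(u) + abs(v)

# The eight nodes surrounding a node, in units of the node-grid pitch.
neighbors = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def count_at_minimum(angle_deg):
    """Number of surrounding nodes reachable with the minimum link length."""
    lengths = [wire_length(dx, dy, angle_deg) for dx, dy in neighbors]
    shortest = min(lengths)
    return sum(math.isclose(length, shortest, rel_tol=1e-9) for length in lengths)

print(count_at_minimum(0.0))   # wiring grid aligned with the node grid: 4
print(count_at_minimum(45.0))  # wiring grid at 45 degrees to the node grid: 8
```

With the wiring grid at 45° to the node grid, the four side neighbors and the four diagonal neighbors all lie at a wiring distance of √2 pitches, which is why all eight become equally short to reach.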

≪Fourth Embodiment≫
FIG. 4 is a plan view schematically showing a computing node network system according to the fourth embodiment of the present invention. The computing node network system 100D according to this embodiment comprises a plurality of computing nodes 10 arranged two-dimensionally so as to fit within a rectangular region 30, and a plurality of communication links 20 each connecting an arbitrary pair of these computing nodes 10. For convenience, FIG. 4 shows only a few communication links 20; in practice many more are wired. A communication link 20 drawn passing over a computing node 10 is not connected to that computing node 10.

Specifically, the computing node network system 100D is a supercomputer, a NoC, or the like. In the computing node network system 100D, each computing node 10 (specifically, a rack or a processor core) is placed at a lattice point of a virtual grid 40. Here, the grid 40 is inclined at approximately 18.5° with respect to the vertical and horizontal directions of the rectangular region 30.

The communication links 20 are routed along a virtual grid 50. Here, the grid 50 is a grid that connects the lattice points of the grid 40 diagonally; it is inclined at 45° with respect to the grid 40 and at approximately 26.5° with respect to the rectangular region 30.

That is, in the computing node network system 100D, all three of the following differ from one another: the vertical and horizontal directions of the rectangular region 30 in which the computing nodes 10 are placed, the orientation of the grid 40, which is the reference for placing the computing nodes 10, and the orientation of the grid 50, which is the reference for routing the communication links 20.
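The two stated angles are mutually consistent, which a small check makes visible. The sketch relies only on the fact that a square grid looks the same after a 90° turn, so grid orientations can be compared modulo 90°. The helper name `grid_tilt` is hypothetical.

```python
def grid_tilt(angle_deg):
    """Smallest angle between a square grid's axes and the region's axes.
    A square grid is unchanged by a 90-degree turn, so orientations are
    reduced modulo 90 degrees before taking the smaller of the two gaps."""
    a = angle_deg % 90.0
    return min(a, 90.0 - a)

node_grid = 18.5                # grid 40 relative to the rectangular region (from the text)
wiring_grid = node_grid + 45.0  # grid 50 is tilted 45 degrees relative to grid 40

print(grid_tilt(node_grid))    # 18.5
print(grid_tilt(wiring_grid))  # 26.5
```

Rotating by 45° in the opposite direction gives the same result, since `grid_tilt(18.5 - 45.0)` is also 26.5.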

In this embodiment, as in the first embodiment, a communication link 20 can be connected to a vertical or horizontal side of every computing node 10, each of which is substantially square in plan view, that is, to a lateral side of the computing node 10. In addition, the number of computing nodes 10 that can be reached with the minimum communication link length is larger than in the first embodiment. In both this embodiment and the first embodiment, eight computing nodes 10 surround any given computing node 10. In the first embodiment only four of them can be reached with the minimum communication link length, whereas in this embodiment all eight surrounding computing nodes 10 can be reached with the minimum communication link length. This reduces the average hop count.

As described above, each of the first to fourth embodiments makes it possible to shorten the longest link distance between computing nodes or to reduce the average hop count in the computing node network. The network diameter of the computing node network system can thereby be made as small as possible.

≪Wiring Optimization≫
In general, a computing node network system is defined by its computing nodes and communication links, which graph theory represents as nodes and edges respectively. The performance of a computing node network system is evaluated by the network diameter and the ASPL (Average Shortest Path Length, also called the average hop count). The diameter is the maximum, over all node pairs, of the minimum hop count between the two nodes. The average hop count is the mean of the minimum hop counts over all node pairs.

Because a long communication link increases communication delay, an upper limit is placed on the communication link length. For example, to achieve a transmission rate of 40 Gbps over electrical cable in a supercomputer, the communication link length is said to be limited to 7 m. Distant nodes that are not directly connected by a communication link must therefore communicate via other nodes (hops). The more intermediate nodes there are (the larger the hop count), however, the larger the node-to-node delay and the power consumption become. It is therefore desirable to reduce both the diameter and the average hop count of a computing node network system.

A wiring optimization algorithm that reduces both the diameter and the average hop count of a computing node network system is therefore disclosed. FIG. 5 is a flowchart of the wiring optimization process. The process can be executed on a computer (not shown).

First, communication links are set up at random in the computing node network system (S1), without exceeding the upper limit on communication link length or the number of ports of each computing node. Such a random network is generally known to be an excellent wiring when link length is unrestricted, but it performs considerably worse when link length is tightly restricted.
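The text does not say how the random links in step S1 are drawn. The following sketch shows one possible approach, assuming Manhattan link lengths: it shuffles all node pairs that satisfy the length limit and adds each pair while both nodes still have a free port. The function names, the seed, and the 4 × 4 layout are illustrative, and this simple procedure does not guarantee a connected network.

```python
import random

def wire_length(p, q):
    """Assumed routing model: Manhattan length along an axis-aligned wiring grid."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def random_links(positions, max_length, ports, seed=0):
    """Step S1 sketch: add links in random order, skipping any link that
    would exceed max_length or use more than `ports` ports on a node."""
    rng = random.Random(seed)
    nodes = list(positions)
    candidates = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]
                  if wire_length(positions[a], positions[b]) <= max_length]
    rng.shuffle(candidates)
    degree = {n: 0 for n in nodes}
    links = set()
    for a, b in candidates:
        if degree[a] < ports and degree[b] < ports:
            links.add((a, b))
            degree[a] += 1
            degree[b] += 1
    return links

# Illustrative 4 x 4 layout with unit pitch, links up to 3 units, 4 ports per node.
positions = {(x, y): (x, y) for x in range(4) for y in range(4)}
links = random_links(positions, max_length=3, ports=4)
print(len(links))
```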

Once the random network has been built, two arbitrary communication links are selected (S2). Two communication links are then set as replacement candidates for the two selected links (S3).

FIG. 6 is a diagram explaining the swapping of two communication links. Suppose that two communication links L1 and L2 are selected in step S2. Communication link L1 connects computing node A and computing node B. Communication link L2 connects computing node C and computing node D. In this case, two communication links L3 and L4 are set as replacement candidates. Communication link L3 connects computing node A and computing node C. Communication link L4 connects computing node B and computing node D.

Returning to FIG. 5, it is determined whether the communication link length restriction is still satisfied when the two communication links selected in step S2 are replaced with the two replacement candidates set in step S3 (S4). Each computation node also has a limit on the number of communication links that can be connected to it, that is, on its number of ports. However, when links are replaced as shown in FIG. 6, the number of links connected to each computation node is the same before and after the replacement, so the port limit does not need to be checked.
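Both points in this paragraph can be checked on a small case. The sketch below assumes nodes are grid coordinates and that link length is the Manhattan distance; `link_length`, `satisfies_length_limit`, and `degrees` are hypothetical helper names, and the coordinates are made up for the example.

```python
from collections import Counter

def link_length(link):
    # Assumption: length is measured along the grid (Manhattan distance).
    (x1, y1), (x2, y2) = link
    return abs(x1 - x2) + abs(y1 - y2)

def satisfies_length_limit(candidates, max_len=4):
    """S4: every replacement candidate must respect the length limit."""
    return all(link_length(link) <= max_len for link in candidates)

def degrees(links):
    """Number of links attached to each node, i.e. ports in use."""
    count = Counter()
    for a, b in links:
        count[a] += 1
        count[b] += 1
    return count

A, B, C, D = (0, 0), (3, 0), (0, 2), (3, 2)
before = [(A, B), (C, D)]   # L1, L2
after = [(A, C), (B, D)]    # L3, L4

ports_unchanged = degrees(before) == degrees(after)
length_ok = satisfies_length_limit(after)
```

Each of A, B, C, and D loses one link and gains one, which is why the port count per node is unchanged in this example.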

If the length restriction is not satisfied, that is, if replacing the selected links with the two replacement candidates would exceed the upper limit on communication link length (NO in S4), the process returns to step S2 and selects another two communication links. If the length restriction is satisfied (YES in S4), the network diameter and the average hop count are calculated for the computation node network system as it would be after the replacement (S5).
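The calculation in step S5 is not detailed in the text. One common way to obtain both quantities is a breadth-first search from every node, sketched below. The function names are hypothetical, and the average hop count is assumed here to be the mean shortest-path hop count over all ordered pairs of distinct nodes.

```python
from collections import deque

def hop_counts_from(source, adjacency):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average_hops(nodes, links):
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    longest, total = 0, 0
    for s in nodes:
        dist = hop_counts_from(s, adjacency)
        if len(dist) < len(nodes):
            # A disconnected network is treated as infinitely bad.
            return float("inf"), float("inf")
        longest = max(longest, max(dist.values()))
        total += sum(dist.values())
    pairs = len(nodes) * (len(nodes) - 1)
    return longest, total / pairs

# A 4-node ring: opposite nodes are 2 hops apart.
nodes = [0, 1, 2, 3]
links = [(0, 1), (1, 2), (2, 3), (3, 0)]
diameter, avg_hops = diameter_and_average_hops(nodes, links)
# diameter == 2, avg_hops == 4/3
```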

It is then determined whether the network diameter and average hop count calculated in step S5 are smaller than those of the computation node network system before the replacement (S6). If they are not (NO in S6), the process returns to step S2 and selects another two communication links. If the replacement makes both the network diameter and the average hop count smaller than before (YES in S6), the two communication links selected in step S2 (L1 and L2 in the example of FIG. 6) are deleted, the two replacement candidates set in step S3 (L3 and L4 in the example of FIG. 6) are adopted, and the process returns to step S2 to select two new communication links.

By repeating the above steps, the diameter and the average hop count of the computation node network system converge to a local minimum.
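Steps S1 to S6 can be put together as a simple local search. The following is a sketch under several assumptions rather than the implementation used for the patent: all function names are hypothetical, link length is taken to be the Manhattan distance, the loop runs for a fixed number of iterations instead of testing for convergence, and swaps that share an endpoint or would duplicate an existing link are skipped (the text does not discuss these cases).

```python
import random
from collections import deque
from itertools import combinations

def length(a, b):
    # Assumption: link length is the Manhattan distance between grid points.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def random_links(nodes, max_len, max_ports, rng):
    # S1: add random links while respecting the length and port limits.
    pairs = [p for p in combinations(nodes, 2) if length(*p) <= max_len]
    rng.shuffle(pairs)
    ports = {n: 0 for n in nodes}
    links = []
    for a, b in pairs:
        if ports[a] < max_ports and ports[b] < max_ports:
            links.append((a, b))
            ports[a] += 1
            ports[b] += 1
    return links

def metrics(nodes, links):
    # S5: (diameter, average hop count) via breadth-first search from every node.
    adj = {n: [] for n in nodes}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    worst, total = 0, 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf"), float("inf")  # disconnected network
        worst = max(worst, max(dist.values()))
        total += sum(dist.values())
    return worst, total / (len(nodes) * (len(nodes) - 1))

def optimize(nodes, max_len=4, max_ports=4, iterations=300, seed=0):
    rng = random.Random(seed)
    links = random_links(nodes, max_len, max_ports, rng)
    present = set(links)
    initial = best = metrics(nodes, links)
    for _ in range(iterations):
        i, j = rng.sample(range(len(links)), 2)                    # S2
        (a, b), (c, d) = links[i], links[j]
        new1, new2 = tuple(sorted((a, c))), tuple(sorted((b, d)))  # S3
        # Assumption: skip swaps with a shared endpoint or a duplicate link.
        if len({a, b, c, d}) < 4 or new1 in present or new2 in present:
            continue
        if length(*new1) > max_len or length(*new2) > max_len:     # S4
            continue
        trial = links.copy()
        trial[i], trial[j] = new1, new2
        candidate = metrics(nodes, trial)                          # S5
        if candidate[0] < best[0] and candidate[1] < best[1]:      # S6
            links, present, best = trial, set(trial), candidate
    return links, initial, best

nodes = [(x, y) for x in range(5) for y in range(5)]
links, initial, final = optimize(nodes)
```

The acceptance test in S6 follows the wording of the description literally: a swap is kept only when both the diameter and the average hop count decrease.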

Next, examples are shown in which computation node network systems of each type, namely the conventional configuration (see FIG. 7), the first embodiment, and the second embodiment, are wired using the wiring optimization algorithm described above. In each example, the length of a communication link connecting two adjacent computation nodes is taken as 1 unit, and the upper limit on communication link length is 4 units. The maximum number of ports per computation node is 4.

≪Comparative Example≫
In the comparative example, a computation node network system of the conventional type (see FIG. 7) was configured using 32 × 32 = 1024 computation nodes. When a random network was formed in this system, the diameter was 23 and the average hop count was 9.576506. Applying the wiring optimization algorithm described above to this random network reduced the diameter to 16 and the average hop count to 6.685486. This diameter is the optimum value for a computation node network system consisting of 32 × 32 = 1024 computation nodes.

≪Example 1≫
In Example 1, a computation node network system of the type of the first embodiment was configured using 23 × 46 = 1058 computation nodes. When this system was wired using the wiring optimization algorithm described above, the diameter was 12 and the average hop count was 6.790274.

Comparing Example 1 with the comparative example, the diameter is smaller even though the number of computation nodes is slightly larger than in the comparative example. This diameter is the optimum value for a computation node network system of the type of the first embodiment consisting of 23 × 46 = 1058 computation nodes.

≪Example 2≫
In Example 2, a computation node network system of the type of the second embodiment was configured using 32 × 32 = 1024 computation nodes. When this system was wired using the wiring optimization algorithm described above, the diameter was 11 and the average hop count was 6.604497. Strictly speaking, because the communication links in Example 2 are wired diagonally, the upper limit on communication link length is 3√2 (approximately 4.2) units.
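As a quick check of the figure quoted above, and assuming (this reading is not stated in the text) that the limit corresponds to three steps along the 45° grid, where one step between diagonally adjacent computation nodes measures √2 units:

```latex
3\sqrt{2} = \sqrt{18} \approx 4.243
```

This is slightly above the 4-unit limit used in the other examples, which is consistent with the "approximately 4.2" stated in the text.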

Comparing Example 2 with the comparative example, the diameter is smaller even though the number of computation nodes is the same as in the comparative example. This diameter is the optimum value for a computation node network system of the type of the second embodiment consisting of 32 × 32 = 1024 computation nodes.
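The three reported results can be set side by side with a few lines of arithmetic. The sketch below only restates the figures given in the text and derives the relative reduction in diameter against the comparative example; the dictionary layout and variable names are illustrative.

```python
results = {
    "comparative example": {"nodes": 32 * 32, "diameter": 16, "avg_hops": 6.685486},
    "example 1": {"nodes": 23 * 46, "diameter": 12, "avg_hops": 6.790274},
    "example 2": {"nodes": 32 * 32, "diameter": 11, "avg_hops": 6.604497},
}

baseline = results["comparative example"]["diameter"]
# Relative reduction in diameter with respect to the comparative example.
reductions = {
    name: (baseline - r["diameter"]) / baseline for name, r in results.items()
}

for name, r in results.items():
    print(f"{name}: {r['nodes']} nodes, diameter {r['diameter']} "
          f"({reductions[name]:.1%} below the comparative example), "
          f"average hops {r['avg_hops']}")
```

On these figures the diameter falls by 25% in Example 1 and by 31.25% in Example 2, while the average hop count is slightly higher in Example 1 (which has more nodes) and slightly lower in Example 2.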

The embodiments have been described above as examples of the technology of the present invention. The accompanying drawings and the detailed description have been provided for that purpose.

Accordingly, the components described in the accompanying drawings and the detailed description may include not only components that are essential for solving the problem but also components that are not essential for solving the problem and are included only to illustrate the above technology. The mere fact that such non-essential components appear in the accompanying drawings or the detailed description should therefore not be taken to mean that they are essential.

For example, the shape of the computation node 10 in plan view need not be substantially square and may be rectangular. The computation nodes 10 also need not be arranged at equal intervals; for example, the column spacing may be wider than the row spacing. The inclination of the grid 40 therefore need not be 45 degrees, and may be changed according to the plan-view shape and arrangement intervals of the computation nodes 10.

Furthermore, because the embodiments described above are intended to illustrate the technology of the present invention, various changes, replacements, additions, omissions, and the like can be made within the scope of the claims or their equivalents.

100A, 100B, 100C  Computation node network system
10  Computation node
20  Communication link
30  Rectangular area
40  Grid (first grid)
50  Grid (second grid)

Claims

A computation node network system comprising:
a plurality of computation nodes arranged two-dimensionally so as to fit within a rectangular area; and
a plurality of communication links connecting arbitrary ones of the plurality of computation nodes to each other, wherein
each of the plurality of computation nodes is arranged at a lattice point of a virtual first grid,
each of the plurality of communication links is wired along a virtual second grid, and
at least two of the following three orientations differ from each other: the vertical and horizontal directions of the rectangular area, the orientation of the first grid, and the orientation of the second grid.

The computation node network system according to claim 1, wherein
the orientation of the first grid matches the orientation of the second grid, and
the first grid and the second grid are inclined at 45° with respect to the rectangular area.

The computation node network system according to claim 1, wherein
the vertical and horizontal directions of the rectangular area coincide with the orientation of the first grid, and
the second grid is inclined at 45° with respect to the rectangular area and the first grid.

The computation node network system according to claim 1, wherein
the vertical and horizontal directions of the rectangular area coincide with the orientation of the second grid, and
the first grid is inclined at 45° with respect to the rectangular area and the second grid.

The computation node network system according to claim 4, wherein the plurality of communication links are obtained by randomly forming a network and then repeatedly replacing arbitrary communication links locally so that the network diameter and the average hop count are reduced.

The computation node network system according to claim 1, wherein each computation node is an individual rack constituting a supercomputer.

The computation node network system according to claim 1, wherein each computation node is an individual processor core in a NoC (network on chip).