JPH11328134A

JPH11328134A - Data transmission / reception method between computers

Info

Publication number: JPH11328134A
Application number: JP10132364A
Authority: JP
Inventors: Tsuneyuki Imaki; 常之今木; Nobutoshi Sagawa; 暢俊佐川
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1998-05-14
Filing date: 1998-05-14
Publication date: 1999-11-30
Anticipated expiration: 2018-05-14
Also published as: JP3628514B2

Abstract

(57) [Abstract]
[Problem] To make message-passing stream communication usable from a TCP/IP application program.
[Solution] An emulation library 112 provided in each computer determines whether the communication partner process specified by a communication request from an application program 107 is inside or outside the parallel computer 101. If it is inside, the library starts the high-speed communication library 135 and transmits and receives data by the message-passing communication method over the high-speed internal communication network 105. This data transmission and reception is executed so as to realize stream communication. When the destination is an external computer 104 outside the parallel computer 101, the library 112 requests the TCP/IP processing routine 114 to communicate according to TCP/IP.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method for transmitting and receiving data between computers in a computer system having a plurality of computers connected by a plurality of types of communication networks, and in particular to such a method suited to executing message-passing communication.

[0002]

[Prior Art] TCP/IP is very widely used as a communication protocol between computers. A program configured to communicate with other programs using TCP/IP is hereinafter called a TCP/IP application program. There are also computer systems in which a TCP/IP application program can use a communication protocol other than TCP/IP. That is, systems have been proposed in which the computers are connected by a plurality of networks of differing performance, and in which, when one computer communicates with another, TCP/IP or another protocol is used depending on which of these networks carries the communication. For example, the system developed by Steven H. Rodrigues et al. at UC Berkeley consists of a wide-area network usable with the TCP/IP protocol and a local network that covers a smaller area and uses a simpler, lower-overhead protocol. Computers connected to the local network communicate with each other using this protocol, and computers connected to the wide-area network communicate according to TCP/IP. See, for example, "High-Performance Local-Area Communication With Fast Socket", USENIX '97 Annual Technical Conference, pp. 257-274.

[0003] This computer system provides a TCP/IP emulation library so that TCP/IP application programs can also use this protocol. In the above system, this TCP/IP emulation library is called the fast socket (FastSocket).

[0004] As its simple protocol, the fast socket uses the active message (ActiveMessage) scheme, which was developed for workstation clusters. See, for example, T. von Eicken, D. E. Culler, S. C. Goldstein, K. E. Schauser, "Active Messages: a Mechanism for Integrated Communication and Computation", in Proceedings of the 19th International Symposium on Computer Architecture, May 1992, pp. 256-266. With active messages, data is exchanged as follows: the application program on the sending side interrupts the application program on the receiving side, and the receiving side performs data reception processing triggered by that interrupt.

[0005] A parallel computer is a computer in which a plurality of computers cooperate, communicating with one another, to solve a single problem. To serve this purpose, the computers inside a parallel computer are generally interconnected by a high-speed internal communication network, and at least some of them are additionally connected to external computers by a wider-area network such as a LAN.

[0006] The protocol used on the wide-area network is mainly TCP/IP. To use the internal high-speed communication network, each computer of a parallel computer is provided with high-speed communication hardware for communicating with the other internal computers over that network, and with a high-speed communication library for using that hardware. The communication scheme currently adopted by many parallel computers is message passing. In message-passing communication, a transfer takes place when a send command issued by the sending application program and a receive command issued by the receiving application program are matched one-to-one. In many cases this scheme suits the high-speed communication hardware inside a parallel computer. The high-speed communication library used to implement message passing is most often one known as a remote memory write library or PUT library.

[0007] Using high-speed communication hardware considerably reduces the communication overhead inside a parallel computer. With the interrupt-driven scheme used by active messages, the interrupt overhead then becomes conspicuous. Message passing, by contrast, does not presuppose interrupts, and is therefore better suited than active messages to communication inside a parallel computer.

[0008]

[Problems to Be Solved by the Invention] As described above, the communication scheme for using the internal high-speed network of a parallel computer is generally message passing. However, most business application programs used when a parallel computer is applied in the business field are written to use TCP/IP. Such application programs therefore cannot use message passing as they stand. Moreover, no method is known for realizing stream communication on top of message passing. Nor can the method used to realize stream communication over interrupt-driven active messages be applied unchanged to stream communication over message passing.

[0009] Accordingly, an object of the present invention is to provide an inter-computer data transmission and reception method that enables stream communication between a plurality of application programs running on computers configured to execute message-passing communication.

[0010] A more specific object of the present invention is to provide an inter-computer data transmission and reception method that enables a plurality of application programs, configured to communicate over a first computer network using a protocol different from message passing, such as TCP/IP, to execute message-passing communication over a communication network faster than that computer network.

[0011] Another object of the present invention concerns an application program that runs on a computer connected to a first communication network and to a faster second communication network, and that is configured to communicate with other application programs according to a first protocol such as TCP/IP. The object is to provide an inter-computer data transmission and reception method that allows this application program to communicate according to that protocol with application programs running on other computers connected to the first network, and also to communicate at high speed over the second network with application programs running on other computers connected to the second network.

[0012] A further specific object of the present invention is to provide an inter-computer data transmission and reception method in which the communication over the second communication network can be message-passing communication.

[0013]

[Means for Solving the Problems] To achieve the above objects, in the inter-computer data transmission and reception method according to the present invention, each of a plurality of pieces of transmission data, specified by a plurality of send commands issued by a first application program running on a first computer, is received by message-passing communication in response to a plurality of receive commands issued by a second application program running on a second computer. This reception is controlled by an emulation library provided on the second computer.

[0014] Furthermore, each piece of received transmission data is processed so as to realize stream communication, in which the single continuous run of data formed by concatenating the pieces of transmission data is divided into portions of the sizes specified by the receive commands, and each portion is stored in the buffer specified by the corresponding receive command. This processing is controlled by the emulation library.

[0015] More specifically, the inter-computer data transmission and reception method according to the present invention executes the following processing.

[0016] (a) The receiving-side emulation library detects the length of the data that the sending application is about to send with one send command.
(b) The receiving-side emulation library compares this send length with the receive length that the application has specified in one receive command.
(c) If the comparison shows that the send length is longer than the receive length, the receiving-side emulation library first receives all of the data into a buffer area allocated in memory and copies from it to the application area only as much data as the application asked for. If the send length is less than or equal to the receive length, the receiving-side emulation library receives the data directly into the application area.

[0017] (d) If data remains in the buffer area when the application issues a receive command, data is copied from there to the application area.
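Steps (a) to (d) can be sketched in C as follows. This is a minimal single-process model under stated assumptions, not the library of the invention: the names `emu_receiver`, `emu_read`, `mp_receive`, and `STAGING_CAPACITY` are hypothetical, the pending message is passed in as an argument instead of arriving over a network, and `mp_receive` is a plain `memcpy` standing in for a message-passing receive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_CAPACITY 4096  /* hypothetical size of the temporary buffer area */

/* Hypothetical receiver state: the unread tail of a message that was longer
 * than the read that triggered its reception. */
typedef struct {
    unsigned char staging[STAGING_CAPACITY];
    size_t start;  /* first unread byte in staging */
    size_t end;    /* one past the last valid byte in staging */
} emu_receiver;

/* Stand-in for a message-passing receive: delivers the whole message to dst. */
static void mp_receive(void *dst, const void *msg, size_t msg_len)
{
    memcpy(dst, msg, msg_len);
}

/* One emulated read. msg/msg_len model the next pending message from the
 * sender (msg_len == 0 means none is pending). If leftover data exists, the
 * pending message is not consumed by this call. Returns bytes delivered. */
size_t emu_read(emu_receiver *r, void *app_buf, size_t want,
                const void *msg, size_t msg_len)
{
    /* (d) Data left in the buffer area is copied out first. */
    if (r->end > r->start) {
        size_t n = r->end - r->start;
        if (n > want)
            n = want;
        memcpy(app_buf, r->staging + r->start, n);
        r->start += n;
        return n;
    }
    if (msg_len == 0)
        return 0;

    /* (a), (b) Compare the send length with the requested receive length. */
    if (msg_len <= want) {
        /* (c) Short enough: receive directly into the application area. */
        mp_receive(app_buf, msg, msg_len);
        return msg_len;
    }

    /* (c) Too long: receive everything into the buffer area, then copy only
     * the requested amount to the application area. */
    assert(msg_len <= STAGING_CAPACITY);
    mp_receive(r->staging, msg, msg_len);
    memcpy(app_buf, r->staging, want);
    r->start = want;
    r->end = msg_len;
    return want;
}
```

With a 10-byte message and a 4-byte read, the first call delivers 4 bytes and stages the rest; the next read is served from the staged 6 bytes without touching the network.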

[0018] Still more specifically, a TCP/IP emulation library is provided to implement the data transmission and reception method according to the present invention. This library has the same interface as the TCP/IP socket application program interface and selects the communication path: if the communication partner is outside the parallel computer it uses the conventional system calls, and if the partner is inside it uses the high-speed communication network through a message-passing scheme for parallel computers such as MPI. The TCP/IP emulation library is given the following features.

[0019] (1) For communication inside the parallel computer, it provides a stream communication service equivalent to that of TCP/IP, using a communication procedure suited to message passing.

[0020] (2) It selects the communication method using means appropriate to a TCP/IP emulation library.

[0021] (3) It detects external-communication data and internal-communication data with a spin loop, or alternatively detects them in separate threads.
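The inside/outside decision that the emulation library makes can be illustrated with a short C sketch. The text at this point does not say how the library recognizes an internal partner, so the lookup below is an assumption: `internal_nodes` is a hypothetical table of addresses of computers inside the parallel computer, and `choose_transport` and the `transport` enum are illustrative names only.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

typedef enum { TRANSPORT_TCPIP, TRANSPORT_MESSAGE_PASSING } transport;

/* Hypothetical table: IPv4 addresses of the computers inside the parallel
 * computer. A real library would obtain this from its configuration. */
static const char *const internal_nodes[] = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };

/* Returns which path a connection to `peer` should take: the high-speed
 * message-passing network for internal partners, ordinary TCP/IP otherwise. */
transport choose_transport(const struct sockaddr_in *peer)
{
    assert(peer != NULL && peer->sin_family == AF_INET);
    for (size_t i = 0; i < sizeof internal_nodes / sizeof internal_nodes[0]; i++) {
        struct in_addr node;
        if (inet_pton(AF_INET, internal_nodes[i], &node) == 1 &&
            node.s_addr == peer->sin_addr.s_addr)
            return TRANSPORT_MESSAGE_PASSING;
    }
    return TRANSPORT_TCPIP;
}
```

An emulated `connect` could call such a function once per connection and remember the result with the socket descriptor, so that later reads and writes on that descriptor go straight to the chosen path.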

[0022]

[Embodiments of the Invention] <Prior art and its problems> Before describing embodiments of the present invention, the prior art and its problems are explained.

[0023] (1) TCP/IP
FIG. 13 shows the TCP/IP layers. TCP/IP defines four layers, from the bottom: link layer 301, IP layer 302, TCP layer 303, and application layer 306. A physical layer actually exists below link layer 301, but it is not discussed here for simplicity. In what follows, the group 304 of layers below the application layer (303, 302, and 301) is collectively called the TCP/IP layer. The programs needed to communicate by the procedures that TCP/IP defines (hereinafter called the TCP/IP processing routine) are generally included in the OS. To communicate by those procedures, each of the layers making up TCP/IP layer 304 performs its own specific processing. In this specification, the processing performed by these layers, or the program routines that perform it, are collectively called the TCP/IP processing routine.

[0024] A socket application programming interface (socket API) 111 is generally used as the interface through which an application program uses the TCP/IP processing routine. For this purpose, a group of system calls called the socket library is provided as part of the OS.

[0025] In this specification, calling a system call or function for receiving data, or one for sending data, is sometimes described as issuing a receive command or a send command, respectively. The socket application program interface is defined as the application program interface of the socket library. An example of an application program that communicates over TCP/IP and is written to the socket application program interface (hereinafter called a TCP/IP application program) is shown below.

[0026]
    (Server side)                          (Client side)
    sa=socket(AF_INET,SOCK_STREAM,0);      sb0=socket(AF_INET,SOCK_STREAM,0);
    bind(sa, server, slen);
    listen(sa, 5);
    sa1=accept(sa, &client, &clen);        connect(sb0, server, slen);
    read(sa1, buffer1, length1);           write(sb0, buffer0, length0);
    read(sa1, buffer2, length2);           write(sb0, buffer3, length3);
    ・・・・・・
    close(sa1);
    close(sa);                             close(sb0);

First, the server-side and client-side TCP/IP application programs that are about to communicate each call the system call socket, which is included in the TCP/IP processing routine of the computer on which the program is running. The socket system call creates an object called a socket, which serves as a communication endpoint, and returns a socket descriptor. In the example above, the server-side and client-side programs receive the returned descriptors as sa and sb0, respectively.

[0027] A socket descriptor is an integer identifier (ID) that uniquely identifies a socket within an application program. All operations on a created socket are performed by specifying its descriptor. Below, unless stated otherwise, socket and socket descriptor are treated as synonymous. The argument AF_INET of the above system call denotes the Internet address family and indicates that the socket is used for communication over the Internet. The argument SOCK_STREAM requests stream communication. The last argument specifies the protocol; when it is 0, the protocol is determined by the two preceding arguments, which in this case selects TCP/IP. After the sockets are created, the server-side and client-side TCP/IP application programs connect them by different procedures.

[0028] The server side calls the system call bind, included in the TCP/IP processing routine. bind associates the name given as an argument with the socket sa given as an argument. A name is the combination of an IP address and a port number. In the example above, socket identifier sa is associated with the name formed by the IP address and port number stored in the structure server, whose size is given by slen. The server side then calls the system call listen, also included in the TCP/IP processing routine.
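As an illustration of the name that bind receives, the following C sketch fills an IPv4 `struct sockaddr_in` with an address and port. The helper `make_name` is a hypothetical name introduced here; the structure fields, `htons`, and `inet_pton` are the standard socket API.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Fills `name` with an IPv4 address and port number.
 * Returns 0 on success, -1 if the address text cannot be parsed. */
int make_name(struct sockaddr_in *name, const char *ip, uint16_t port)
{
    assert(name != NULL && ip != NULL);
    memset(name, 0, sizeof *name);
    name->sin_family = AF_INET;
    name->sin_port = htons(port);  /* port number in network byte order */
    return inet_pton(AF_INET, ip, &name->sin_addr) == 1 ? 0 : -1;
}
```

In real code the listing's `bind(sa, server, slen)` would then be written as `bind(sa, (struct sockaddr *)&server, sizeof server)`, since the API takes a pointer to the generic address structure and its size.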

[0029] The system call listen sets the socket sa given as its argument to accept connection requests. Its second argument gives the requested size of the queue used to hold pending connection requests temporarily; here the call requests a queue able to hold five connection requests.

[0030] As a result of the above processing, a communication path between the server side and the client side has, logically, been determined.

[0031] The server side then calls the system call accept, included in the TCP/IP processing routine. This call puts the socket sa given as its argument into a state of waiting for a connection request. Its second and third arguments represent the IP address of the client issuing the awaited connection request and the length of that address.

[0032] Meanwhile, the client side calls the system call connect, included in the TCP/IP processing routine. This call connects the socket sb0 given as an argument to the name given as an argument, here the name assigned to the server-side socket sa (the combination of IP address sin_addr and port number sin_port stored in the structure server).

[0033] On the server-side computer, the accept system call issued earlier receives this connection request and newly creates a socket, obtained as sa1 in the listing, for communicating with this client. A communication path is thus established between server-side socket sa1 and client-side socket sb0.

[0034] When the client side sends data to the server side over the established connection, it calls the system call write, included in the TCP/IP processing routine. The arguments specify the address buffer0 of the buffer holding the data to send and the length length0 of that buffer. write sends the data in this buffer through the socket sb0 specified in the call. The server side calls the system call read, included in the TCP/IP processing routine. read receives the data transferred through the socket sa1 given as an argument and writes it to the buffer at address buffer1 given as an argument. Data is transferred between the two application programs in this way. If necessary, the client side calls write several times to send further pieces of data, and the server side calls read one or more times to receive them.

[0035] A socket descriptor is defined as a kind of file descriptor, as used for file input/output and standard input/output. The socket application program interface therefore lets data be sent and received through the same interface as input/output on files and the like.

[0036] When communication between the server-side and client-side programs ends, the server side calls the system call close to close the socket sa1 given as its argument. The server side then calls close again, this time closing socket sa. The client side likewise calls close to close socket sb0.
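The call sequence walked through above can be exercised in one process over the loopback interface, as in the following C sketch. It is an illustration under stated assumptions, not part of the patent: `loopback_round_trip` is a hypothetical name, both ends run in the same process, the OS chooses the port (port 0 followed by `getsockname`), and connect is issued before accept, which works because the kernel holds the pending connection in the listen queue. The variable names sa, sa1, and sb0 follow the listing.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Runs the server and client ends in one process over loopback.
 * Returns the number of bytes the server read, or -1 on error. */
ssize_t loopback_round_trip(const char *msg, char *out, size_t out_len)
{
    struct sockaddr_in server, client;
    socklen_t slen = sizeof server, clen = sizeof client;
    ssize_t n = -1;
    int sa = -1, sa1 = -1, sb0 = -1;

    assert(msg != NULL && out != NULL);

    sa = socket(AF_INET, SOCK_STREAM, 0);   /* server: listening socket */
    sb0 = socket(AF_INET, SOCK_STREAM, 0);  /* client socket */
    if (sa < 0 || sb0 < 0)
        goto done;

    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = 0;                    /* let the OS pick a free port */
    if (bind(sa, (struct sockaddr *)&server, slen) < 0)
        goto done;
    if (getsockname(sa, (struct sockaddr *)&server, &slen) < 0)
        goto done;                          /* learn the chosen port */
    if (listen(sa, 5) < 0)
        goto done;

    /* Client connects; the request waits in the listen queue until accept. */
    if (connect(sb0, (struct sockaddr *)&server, slen) < 0)
        goto done;
    sa1 = accept(sa, (struct sockaddr *)&client, &clen);
    if (sa1 < 0)
        goto done;

    if (write(sb0, msg, strlen(msg)) < 0)   /* client sends */
        goto done;
    n = read(sa1, out, out_len);            /* server receives */

done:
    if (sa1 >= 0) close(sa1);
    if (sa >= 0) close(sa);
    if (sb0 >= 0) close(sb0);
    return n;
}
```

Because the connection is a byte stream, a single read is not guaranteed to return everything a write sent; a production receiver would loop until it has the bytes it expects.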

[0037] (2) Switching communication methods in the fast socket
In conventional TCP/IP, the system call bind associates a name (server), consisting of an IP address and port number, with a socket (sa). The fast socket additionally creates, at this point, a socket dedicated to high-speed communication, as follows. First, a hash function is applied to the port number of the name that the application specifies when calling bind (the port number sin_port stored in the structure server) to derive a new shadow port number. Next, the system call socket is called to create a new shadow socket. Finally, the bind system call is called to associate the shadow port number with the shadow socket. For high-speed communication, when the server and client call accept and connect to establish a connection, they use this shadow socket to set up a communication path dedicated to high-speed communication.
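The derivation of a shadow port number can be sketched in C as below. The text does not state which hash function the fast socket applies, so the multiplicative hash and the target range here are assumptions chosen only for illustration; `shadow_port`, `SHADOW_BASE`, and the constants are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_BASE 49152u  /* hypothetical: lowest port used for shadow sockets */

/* Maps an application port to a shadow port in 49152..65535.
 * Hypothetical hash; the function actually used is not given in the text. */
uint16_t shadow_port(uint16_t port)
{
    uint32_t h = (uint32_t)port * 2654435761u;  /* multiplicative hash, wraps mod 2^32 */
    uint32_t shadow = SHADOW_BASE + (h >> 18);  /* top 14 bits give 0..16383 */
    assert(shadow <= 65535u);
    return (uint16_t)shadow;
}
```

Both ends must use the same function so that the client can compute the server's shadow port from the ordinary port it was given; the shadow socket itself is then created with socket and bound to that port with bind, as described above.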

[0038] This method requires special processing when the system calls bind and listen are called.

[0039] (3) Stream communication by the TCP/IP processing routine
The TCP/IP socket library provides stream communication. Stream communication is a communication method in which the pieces of data that the sending application program sends with a series of write system calls are treated as one continuous data stream, and the receiving application program receives that stream cut into one or more pieces of arbitrary length, as specified by the one or more read system calls it issues.

[0040] The operation of conventional stream communication is described with reference to FIG. 14, for the case where sending application program 801 sends data to receiving application program 802. Sending application program 801 first calls a first write system call to send the 50 kilobytes (KB) of data in its buffer 803 (805). The unit KB is omitted in FIG. 14 for simplicity. The sending-side TCP/IP processing routine copies the data in buffer 803 to a buffer allocated in the OS (not shown), splits it into packets, and sends them to the receiving-side OS. A packet is normally 40 to 1500 bytes. The receiving-side OS receives the packets into buffers allocated in the OS (not shown) and links them into a list to reconstruct the data stream. Sending application program 801 then issues a second write system call to send the 80 KB of data in another of its buffers, 804 (806). The sending-side and receiving-side TCP/IP processing routines operate in the same way, and this data is held in the OS buffers described above, combined with the previously sent data into one data stream.

【００４１】In the figure, 809 schematically represents this data stream held in the OS 900. The leading data 807 and the following data 808 are the 50 KB and 80 KB of data sent by the first and second write system calls, respectively. In stream communication, the receiving application program sees these data as a single continuous 130 KB data stream 809. The data in buffers 803 and 804 of the sending application program 801 correspond to the data portions 807 and 808 of the data stream 809 in the OS 900, respectively. The receiving application program 802 issues a first read system call requesting that 30 KB of the 50 KB leading data 807 in the stream data 809 be received into its 30 KB buffer 812 (814). In response to this read system call, the receiving-side TCP/IP processing routine copies the first 30 KB of data 810 from the data stream 809 into buffer 812. When the receiving application program 802 issues a second read system call, the remaining 100 KB of data 811 in the stream data 809 is copied, according to the length that system call specifies, into the buffer 813 that it specifies (815).
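The FIG. 14 behaviour can be illustrated with a small sketch. This is not code from the patent: `stream_t`, `stream_write` and `stream_read` are hypothetical names, and a single flat array stands in for the list of OS buffers that hold data stream 809. The point it shows is that write boundaries are not preserved, so a read may end inside one write's data or span two writes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the OS-side data stream 809: a flat byte queue. */
#define STREAM_CAP (256 * 1024)

typedef struct {
    unsigned char data[STREAM_CAP];
    size_t head;   /* index of the next unread byte */
    size_t tail;   /* index one past the last written byte */
} stream_t;

/* Append len bytes to the stream; the write boundary is not recorded. */
size_t stream_write(stream_t *s, const void *src, size_t len)
{
    assert(s->tail + len <= STREAM_CAP);   /* the sketch never wraps around */
    memcpy(s->data + s->tail, src, len);
    s->tail += len;
    return len;
}

/* Copy out up to len bytes from the front of the stream. */
size_t stream_read(stream_t *s, void *dst, size_t len)
{
    size_t avail = s->tail - s->head;
    size_t n = len < avail ? len : avail;
    memcpy(dst, s->data + s->head, n);
    s->head += n;
    return n;
}
```

With a 50 KB write followed by an 80 KB write, a 30 KB read returns the start of the first write, and a following 100 KB read returns the last 20 KB of the first write together with all 80 KB of the second, which matches data 810 and 811 in the figure.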

【００４２】(4) Stream communication in active-message communication
The system using the high-speed sockets described above realizes stream communication while adopting active-message communication. In this system, when the sending application program issues a send request, the request is executed without waiting for the receiving application program to issue a receive request. The send request raises an interrupt on the receiving computer, the interrupt starts an interrupt handler, and the transmitted data is first received into a buffer in the interrupt handler. When the receiving application program issues a receive request, the interrupt handler transfers, out of the data already received, the amount of data the request asks for into the receiving application program's buffer. If the size requested by the receiving application program is smaller than the size of the data already received, processing of the receive request ends there. The remaining data held by the interrupt handler is transferred to the application program's buffer when the receiving application program issues a new receive request.

【００４３】Conversely, if the size requested by the receiving application program is larger than the size of the data already received, the interrupt handler supplies the missing data to the receiving application program when the sending application program later sends it. Likewise, when the receiving application program issues a subsequent receive request before the sending application program has issued a send request, the receiving-side interrupt handler supplies the subsequent data to the receiving application program once the sending application program sends it. If this subsequent data is larger than the shortfall of the first receive request, or larger than the size requested by the subsequent receive request, the interrupt handler holds the surplus for further receive requests.

【００４４】In this way, the multiple pieces of data sent by the multiple send requests issued by the sending application program are supplied to the receiving application program in response to the series of receive requests it issues. This method therefore realizes stream communication by temporarily holding the transmitted data in a buffer in the interrupt handler.
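The two cases above (request smaller than the buffered data, request larger than the buffered data) can be sketched as follows. This is an illustrative model only, not the implementation of the system being described: `handler_t`, `on_arrival` and `on_receive_request` are hypothetical names, a plain function call stands in for the interrupt, and at most one unfinished receive request is tracked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGE_CAP 4096

typedef struct {
    unsigned char staged[STAGE_CAP]; /* data that arrived before it was requested */
    size_t staged_len;
    unsigned char *want_dst;         /* where an unfinished receive request continues */
    size_t want_len;                 /* bytes that request is still missing */
} handler_t;

/* Data arrival (the "interrupt"): feed a waiting request first, keep the surplus. */
void on_arrival(handler_t *h, const unsigned char *src, size_t len)
{
    size_t n = len < h->want_len ? len : h->want_len;
    if (n > 0) {
        memcpy(h->want_dst, src, n);
        h->want_dst += n;
        h->want_len -= n;
    }
    assert(h->staged_len + (len - n) <= STAGE_CAP);
    memcpy(h->staged + h->staged_len, src + n, len - n);
    h->staged_len += len - n;
}

/* Receive request: returns the bytes delivered at once; any shortfall stays pending. */
size_t on_receive_request(handler_t *h, unsigned char *dst, size_t len)
{
    assert(h->want_len == 0);        /* one outstanding request in this sketch */
    size_t n = len < h->staged_len ? len : h->staged_len;
    memcpy(dst, h->staged, n);
    memmove(h->staged, h->staged + n, h->staged_len - n);
    h->staged_len -= n;
    h->want_dst = dst + n;
    h->want_len = len - n;
    return n;
}
```

A real handler would also have to signal completion to the waiting application and guard against concurrent access; both are left out here to keep the buffering logic visible.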

【００４５】However, message-passing communication that does not use interrupts cannot realize stream communication in this way.

【００４６】The inter-computer data transmission and reception method according to the present invention is described below in more detail with reference to several embodiments shown in the drawings. In the following, the same reference numerals denote the same or similar items. For the second and subsequent embodiments, the description is mainly limited to the differences from the first embodiment.

【００４７】<Embodiment 1 of the Invention> (1) Outline of the apparatus
FIG. 1 shows an example of a computer system for executing the inter-computer transmission and reception method according to the present invention. In the figure, two computers 102 and 103 inside a parallel computer 101 and one external computer 104 are assumed to be connected to one another by communication networks. In practice, the numbers of computers inside and outside the parallel computer 101 are arbitrary. The internal computers 102 and 103 are connected by an internal high-speed communication network 105, and each of them has network interface hardware 119, 120 dedicated to the internal high-speed communication network 105. The internal high-speed communication network 105 is a network that can transfer multiple packets in parallel and at high speed, for example a hyper-crossbar switch. All of the internal computers 102, 103 and the external computer 104 are also connected to a global communication network 106, and each computer has network interface hardware 121, 122, 123 dedicated to the communication network 106.

【００４８】Data is sent and received between the application programs 107, 108, 109 loaded in the memories 124, 125, 126 of the respective computers. In addition to the application programs, OSs 127, 128, 129 are loaded in the memories, and each OS contains TCP/IP processing routines 114, 115, 110. To communicate by the procedure that TCP/IP defines, each of the layers making up the TCP/IP layer 304 performs its own processing. "TCP/IP processing routines" is a collective name for the processing these layers perform in order to communicate by that procedure. The TCP/IP processing routines 114, 115, 110 themselves are the same as known ones and, as already described, include multiple functions that can be invoked by system calls. For the purpose of wide-area communication, the TCP/IP processing routines 114, 115, 110 communicate through the network interface hardware 121, 122, 123 dedicated to the global communication network 106.

【００４９】The memory 124 of computer 102 is further loaded with a TCP/IP emulation library 112, a message-passing library 140, and a high-speed communication library 135. Similarly, the memory 125 of computer 103 is loaded with a TCP/IP emulation library 113, a message-passing library 141, and a high-speed communication library 136.

【００５０】The high-speed communication libraries 135, 136 on computers 102, 103 are libraries for communicating through the network interface hardware 119, 120 dedicated to the internal high-speed communication network 105, for the purpose of high-speed communication inside the parallel computer 101. Libraries of the kind called remote memory write libraries or PUT libraries are commonly used for this. This embodiment also uses such a PUT library as the high-speed communication libraries 135, 136. The present invention is not limited to this library, however; other libraries, for example one called a PUT/GET library, can also be used.

【００５１】In general, communication through the high-speed internal communication network 105 is far faster than communication through the global communication network 106. The TCP/IP emulation libraries 112, 113 are therefore configured so that, when application programs on internal computers try to perform TCP/IP communication with each other, message-passing communication is carried out using the message-passing libraries 140, 141, the high-speed communication libraries 135, 136, and the internal high-speed communication network 105 instead of the TCP/IP processing routines 114, 115, thereby speeding up communication.

【００５２】The message-passing libraries 140, 141 are libraries that invoke the high-speed communication library 135 or 136 in response to a request from the TCP/IP emulation library 112 or 113. In general terms, the message-passing libraries 140, 141 are libraries that present a message-passing interface to an application program (in this embodiment, to the TCP/IP emulation libraries 112, 113).

【００５３】The TCP/IP emulation libraries 112, 113 further realize, in this message-passing communication as well, the same stream communication that the conventional TCP/IP processing routines provided, and thereby speed up communication. Because the TCP/IP processing routines 114, 115 are functions of the OSs 127, 128, in the prior art a context-switch overhead always occurs whenever an application program uses them. In this embodiment the high-speed communication libraries 135, 136 are used, so the OS is bypassed and this overhead is avoided, and a further increase in communication speed can be expected.

【００５４】(2) Logical configuration and interfaces between components
FIG. 1(B) represents the above hardware configuration as a logical configuration. All parts not relevant to the following description are omitted from this figure. The high-speed communication libraries 135, 136 and the hardware 119, 120 dedicated to internal high-speed communication are grouped together and shown as high-speed communication mechanisms 116, 117. The interfaces 111 and 118 between components are newly shown. The rectangles representing the computers 102, 103, 104 and the parallel computer 101 each indicate that the logical components inside them are executed on a single computer or on the parallel computer 101.

【００５５】On computer 104, as in the conventional case, the application program 109 is linked to TCP/IP 110 through the socket application program interface 111. The application programs 107, 108 running on the computers 102, 103 inside the parallel computer 101 are linked to the TCP/IP emulation libraries 112, 113 through the socket application program interface 111. The TCP/IP emulation libraries 112, 113 are in turn linked to the conventional TCP/IP processing routines 114, 115, which are OS functions, through the socket application program interface 111, and at the same time to the high-speed communication mechanisms 116, 117 through an interface 118 conforming to the MPI specification.

【００５６】(3) Application programs 107, 108
In this embodiment, when an application program, for example 107, running on one of the computers (102) in the parallel computer 101 communicates with another application program running on some other computer, it uses either the high-speed communication mechanism 116 or the TCP/IP processing routine 114, depending on whether the other application program is one running on computer 103 inside the parallel computer 101, for example 108, or one running on computer 104 outside the parallel computer 101, for example 109. For two application programs 107, 108 running on different computers 102, 103 in the parallel computer 101 to communicate with each other using the high-speed communication mechanisms 116, 117, special processing is needed so that the mechanisms are used not only when data is actually sent and received between the application programs but also when the sockets for the two application programs are connected to each other. The TCP/IP emulation library 112 or 113 contains multiple functions that perform this special processing. In this embodiment, the names of the functions provided in the TCP/IP emulation libraries 112, 113 carry the prefix EMU_ to distinguish them from the names of the aforementioned functions provided in the TCP/IP processing routines 114, 115, or 110.

【００５７】Specifically, among the application programs 107 and 108 running on the computers 102, 103 in the parallel computer 101, the one acting as the server side and the one acting as the client side are written to execute the following programs, respectively.

【００５８】
(Server side)
    sa = socket(AF_INET, SOCK_STREAM, 0);
    bind(sa, server, slen);
    listen(sa, 5);
    sa1 = EMU_accept(sa, &client, &clen);
    EMU_read(sa1, buffer1, length1);
    EMU_read(sa1, buffer1, length1);
    ・・・・・・
    close(sa1);
    close(sa);

(Client side)
    sb0 = socket(AF_INET, SOCK_STREAM, 0);
    EMU_connect(sb0, server, slen);
    EMU_write(sb0, buffer0, length0);
    EMU_write(sb0, buffer0, length0);
    ・・・・・・
    close(sb0);

The server-side application program calls the socket, bind, and listen system calls as in the conventional case. These system calls are processed by the corresponding TCP/IP processing routines as before. The client-side application program likewise issues the socket system call as before, and this system call is also processed by the corresponding TCP/IP processing routine as before. Sockets sa and sb0 are thus created for the server side and the client side in the conventional way.

【００５９】The server-side application program then calls, instead of the conventional accept system call, the function EMU_accept provided in the TCP/IP emulation library on the computer where that application program is running. The client-side application program calls, instead of the conventional connect function, the function EMU_connect provided in the TCP/IP emulation library on the computer where it is running. Furthermore, the server-side application program calls the function EMU_read provided in the TCP/IP emulation library instead of the conventional read system call, and the client-side application program calls the function EMU_write provided in the corresponding TCP/IP emulation library instead of the conventional write system call. The processing performed by these new functions is described below.

【００６０】(4) Socket connection by the TCP/IP emulation library
In FIG. 2, when the server-side and client-side application programs call the EMU_accept and EMU_connect functions described above (processes 501, 502), these calls invoke the function EMU_accept provided in the server-side TCP/IP emulation library and the function EMU_connect provided in the client-side TCP/IP emulation library. The arguments of these calls are the same as those used when the server-side and client-side application programs call the accept and connect system calls to request the TCP/IP processing routines 114, 115 to connect the sockets sa and sb0.

【００６１】The invoked functions EMU_accept and EMU_connect first issue the accept system call and the connect system call, respectively (processes 503, 504). The arguments passed to EMU_accept and EMU_connect are used unchanged as the arguments of these system calls. This invokes the accept system call provided in the server-side TCP/IP processing routine and the connect system call provided in the client-side TCP/IP processing routine. As in the conventional case, the accept system call receives the connection request issued by the connect system call, a socket sa1 is created, and the sockets sa1 and sb0 become connected via the communication path 106.

【００６２】Next, the functions EMU_accept and EMU_connect each check whether the peer's IP address is the address of a computer inside the parallel computer 101 (processes 506, 507). If the peer is inside the parallel computer 101, the socket descriptor is registered in the internal table 410 or 411 (processes 510, 511). In the present case, the socket sa1 for the server-side application program 107 and the socket sb0 for the client-side application program are registered in the internal tables 410 and 411, respectively. In addition, the TCP/IP emulation libraries 112, 113 exchange the data needed to use the high-speed communication mechanisms 116, 117, such as the identifier of each computer within the parallel computer 101. For this data exchange, the application programs 107, 108 each issue read and write system calls (processes 512, 513, 515, 516). The exchange uses the already connected sockets sa1 and sb0 and is carried out, as in the conventional case, via the TCP/IP processing routines 114, 115 and the communication path 106. Control then returns to the server-side and client-side application programs 107, 108, respectively (processes 518, 519). At this point, the server-side TCP/IP emulation library 112 returns the identifier of the created socket sa1 to the application program 107.
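The check and registration steps (processes 506/507 and 510/511) can be sketched as below. The patent does not say how the set of internal addresses is represented, so `node_list_t`, `socket_table_t`, `is_internal_peer` and `classify_connection` are hypothetical, and addresses are plain 32-bit values in host byte order. The preceding accept/connect system calls and the later exchange of computer identifiers over TCP/IP are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES   8
#define MAX_SOCKETS 16

/* Hypothetical configuration: IPv4 addresses (host byte order) of the
   computers inside the parallel computer. */
typedef struct {
    uint32_t addr[MAX_NODES];
    size_t count;
} node_list_t;

/* Hypothetical internal table (410 or 411): descriptors whose peer is internal. */
typedef struct {
    int fd[MAX_SOCKETS];
    size_t count;
} socket_table_t;

/* Processes 506/507: is the peer one of the internal computers? */
int is_internal_peer(const node_list_t *nodes, uint32_t peer)
{
    for (size_t i = 0; i < nodes->count; i++)
        if (nodes->addr[i] == peer)
            return 1;
    return 0;
}

/* Processes 510/511: register the descriptor only for an internal peer.
   Returns 1 if the socket was registered, 0 if it stays TCP/IP-only. */
int classify_connection(const node_list_t *nodes, socket_table_t *table,
                        int fd, uint32_t peer)
{
    if (!is_internal_peer(nodes, peer))
        return 0;
    assert(table->count < MAX_SOCKETS);
    table->fd[table->count++] = fd;
    return 1;
}
```

Both sides run the same check independently after the ordinary TCP/IP connection is up, which is why each library ends up with its own table.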

【００６３】If processes 506, 507 find that the peer is not a computer inside the parallel computer 101, the function EMU_connect returns to the client-side application program 108, and the function EMU_accept returns the socket sa1 to the server-side application program 107 and returns to that program (processes 508, 509). The socket sa1 for the server-side application program and the socket sb0 for the client-side application program are thus connected via their corresponding TCP/IP processing routines and the communication path 106.

【００６４】FIG. 3 shows one state in which the sockets created by the application programs 107, 108, 109 according to this embodiment are connected to one another. Here, socket SA1 (402) of application program 107 and socket SB0 (403) of application program 108 are connected (407), socket SB1 (404) of application program 108 and socket SC0 (405) of application program 109 are connected (408), and socket SC1 (406) of application program 109 and socket SA0 (401) of application program 107 are connected (409). The TCP/IP emulation libraries 112, 113 hold internal socket tables 410, 411. When the peer of a socket is on an internal computer, the descriptor of that socket is registered in the internal socket table 410 or 411. For example, socket SB0 (403), the peer of socket SA1 (402) of application program 107, is a socket of application program 108 running on the internal computer 103, so SA1 is registered in the internal socket table 410. Similarly, socket SA1, the peer of socket SB0, is a socket of application program 107 running on the internal computer 102, so SB0 is registered in the internal socket table 411. In contrast, socket SC1 (406), the peer of socket SA0 (401), is a socket of application program 109 on the external computer, so SA0 is not registered in the internal socket table 410. Likewise, SB1, which is connected to SC0, is not registered in the internal socket table 411.

【００６５】(5) Method of distinguishing internal communication from external communication
The server-side and client-side application programs then begin data communication. One of the two programs calls the function EMU_write to send data and the other calls the function EMU_read to receive it. In the example client-side and server-side programs shown earlier, the server-side application program calls EMU_read and the client-side application program calls EMU_write. The arguments specified in these calls are the same as the arguments of the write and read system calls included in the TCP/IP processing routines 114, 115.

【００６６】Referring to FIG. 4, when EMU_read and EMU_write are called (processes 701, 702), each function determines whether the socket descriptor of the socket sa1 or sb0 specified by its argument is registered in the corresponding internal socket table 410 or 411 (processes 703, 704). If the socket descriptor is registered in the internal socket table 410 or 411, message-passing communication is performed using the high-speed communication mechanisms 116, 117 and the high-speed internal communication network 105, by a procedure described in detail later (processes 707, 708). If the socket descriptor is not registered in the internal socket table 410 or 411, the conventional read or write system call is issued unchanged on the specified socket (processes 705, 706). These system calls are processed by the corresponding TCP/IP processing routines 114, 115, which carry out the communication over the global communication path 106.
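The branch taken at processes 703/704 amounts to a table lookup on the socket descriptor. The sketch below shows only that decision; `socket_table_t`, `table_register`, `choose_path` and the `path_t` values are hypothetical names, and the two transports themselves (the message-passing path and the ordinary read/write system calls) are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOCKETS 16

/* Hypothetical internal socket table (410 or 411). */
typedef struct {
    int fd[MAX_SOCKETS];
    size_t count;
} socket_table_t;

typedef enum { PATH_TCPIP, PATH_FAST } path_t;

/* Record a descriptor whose peer is inside the parallel computer. */
void table_register(socket_table_t *t, int fd)
{
    assert(t->count < MAX_SOCKETS);
    t->fd[t->count++] = fd;
}

/* Decision made at processes 703/704 for the descriptor passed to
   EMU_read or EMU_write. */
path_t choose_path(const socket_table_t *t, int fd)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->fd[i] == fd)
            return PATH_FAST;    /* processes 707/708: message passing */
    return PATH_TCPIP;           /* processes 705/706: read/write system call */
}
```

In the FIG. 3 state, a table holding only SA1 would select the fast path for SA1 and the TCP/IP path for SA0. A linear scan is used here for brevity; the source does not specify the table's data structure.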

【００６７】 FIG. 5 is an overall configuration diagram that shows the data flow when the application programs 107, 108, and 109 communicate with one another in the socket connection state shown in FIG. 3, and illustrates the separation method described above. When application program 107 and application program 108 communicate, socket sa1 (402) and socket sb0 (403) are used as the communication endpoints. These sockets are registered in the internal socket tables 410 and 411 held by the TCP/IP emulation libraries 112 and 113. The TCP/IP emulation libraries 112 and 113 therefore use the high-speed communication mechanisms 116 and 117, rather than the TCP/IP processing routines 114 and 115, for data communication (601). When application program 108 and application program 109 communicate, on the other hand, socket SB1 (404) and socket SC0 (405) are used as the communication endpoints. SB1 is not registered in the internal socket table 411 held by the TCP/IP emulation library 113 linked with application program 108. The TCP/IP emulation library 113 therefore uses the TCP/IP processing routine 115 as is for data communication (602). This makes data exchange with the external TCP/IP processing routine 110 possible.
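
The separation rule of FIG. 5 can be illustrated with a minimal C sketch. It is not the library's actual code: the type `socket_table`, the enum `transport`, and the functions `table_register`, `table_contains`, and `select_transport` are hypothetical names, and the internal socket table is assumed to be a small fixed-size array of descriptors.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_CAPACITY 16  /* assumed capacity, for illustration only */

/* Hypothetical stand-in for an internal socket table (410, 411). */
typedef struct {
    int    fds[TABLE_CAPACITY];
    size_t count;
} socket_table;

typedef enum {
    TRANSPORT_FAST_INTERNAL,  /* high-speed communication mechanism (601) */
    TRANSPORT_TCP_IP          /* conventional TCP/IP processing routine (602) */
} transport;

/* Register a descriptor; returns false when the table is full. */
bool table_register(socket_table *t, int fd)
{
    if (t->count >= TABLE_CAPACITY)
        return false;
    t->fds[t->count++] = fd;
    return true;
}

/* Linear search: true if the descriptor was registered earlier. */
bool table_contains(const socket_table *t, int fd)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->fds[i] == fd)
            return true;
    return false;
}

/* Registered sockets use the internal network; all others fall back to TCP/IP. */
transport select_transport(const socket_table *t, int fd)
{
    return table_contains(t, fd) ? TRANSPORT_FAST_INTERNAL : TRANSPORT_TCP_IP;
}
```

Under these assumptions, a descriptor registered at accept or connect time is routed to the internal network, and any other descriptor is handed to the ordinary TCP/IP path unchanged.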

【００６８】 With this scheme, the bind and listen system calls of the conventional TCP/IP processing routines can be used unchanged while internal communication is still separated from external communication.

【００６９】 (6) Message-passing type libraries 140 and 141
The communication hardware 119 and 120 and the high-speed communication libraries 135 and 136 that each computer of the parallel computer uses to access the internal high-speed communication network 105 are often vendor-specific, so the way they are used also varies from machine to machine. An application program written to use a machine's high-speed communication hardware and high-speed communication library therefore inevitably became a machine-specific program with little portability. In response, there are active efforts worldwide to improve the portability of application programs by providing a library for communicating over the communication hardware inside a parallel computer and by standardizing that library's application program interface.

【００７０】 The interface currently in wide use as a general-purpose interface for this message-type communication is the Message Passing Interface, known as MPI. See, for example, "MPI: Message Passing Interface Standard version 1.1", MPI Forum, University of Tennessee, 1995. This interface presupposes the existence of the high-speed communication library described above for implementing message-passing communication. Even when this interface is used, message-passing communication is still basically carried out by that high-speed communication library.

【００７１】 Many parallel computer vendors provide high-speed communication libraries that allow MPI-compliant application programs to use the high-speed communication hardware inside the parallel computer.

【００７２】 In this specification, an interface for using message-type communication is called a message-passing type interface, and a library having such an interface is called a message-passing type library. In particular, an interface conforming to the MPI specification may be called MPI, the MPI interface, or the message passing interface, and a library having that interface may be called an MPI library or a message passing interface library.

【００７３】 In this embodiment, the message-passing type libraries 140 and 141 are libraries that exchange commands or data through the interface defined by the standard MPI specification, which is available on many parallel computers. This improves the portability of the TCP/IP emulation libraries 112 and 113. Interfaces of other specifications may, however, be used instead.

【００７４】 A description example of a conventional MPI-compliant application program is shown below. In this embodiment, the TCP/IP emulation libraries 112 and 113 contain the program portion shown below for the part that invokes the message-passing type libraries 140 and 141. This program portion is discussed in more detail later.

【００７５】
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if ( rank == 0 ) /* receiver */
        MPI_Recv(buffer, length, type, 1, tag, comm, status);
    else if ( rank == 1 ) /* sender */
        MPI_Send(buffer, length, type, 0, tag, comm);
    MPI_Finalize();
In MPI, all processes that are to communicate are started simultaneously. At that time, a process identifier called a rank is assigned to each process. The example above shows the case where two processes, a sending process (sender) and a receiving process (receiver), are started. Ranks 0 and 1 are assigned to these processes. Each process first initializes MPI with the MPI initialization function MPI_Init and obtains its own rank with the MPI rank function MPI_Comm_rank. Each process then performs the processing for its rank. In the example above, the rank 0 process receives, with the MPI receive function MPI_Recv, the data that the rank 1 process sends with the MPI send function MPI_Send.

【００７６】 (7) Message-passing type communication using the high-speed communication mechanisms
Before the details of the data reception process 707 and the data transmission process 708, which use the high-speed communication mechanisms 116 and 117, are described, this section outlines the message-passing type communication over the high-speed communication mechanisms 116 and 117 that those processes rely on.

【００７７】 In general, message-passing type communication is carried out by having the sending computer execute a send function to transmit data and having the receiving computer execute a receive function to receive that data. These functions are executed repeatedly to transfer subsequent data between the computers. In this embodiment, message-passing communication under the MPI specification is described as a concrete instance of message-passing type communication.

【００７８】 In message-passing communication under the MPI specification, the sending computer executes the send function MPI_Send and the receiving computer executes the receive function MPI_Recv. In the present case, when transmission process 708 is started, the sending-side TCP/IP emulation library 113 calls the send function MPI_Send, as in the MPI application program example given earlier. Functions such as MPI_Init and MPI_Comm_rank must be called as initialization before MPI_Send is called. These calls are made at the same time as the internal table registration processes (510, 511) in the functions EMU_accept and EMU_connect described earlier. The receiving-side TCP/IP emulation library 112, for its part, calls the receive function MPI_Recv in order to use the internal high-speed communication network 105.

【００７９】 The buffer address buffer and the buffer length length given as arguments to MPI_Send, which the sending-side TCP/IP emulation library 113 calls within transmission process 708, are set equal to the corresponding arguments of the function EMU_write issued by the sending-side application program 108. The buffer address buffer and the buffer length length given as arguments to MPI_Recv, which the receiving-side TCP/IP emulation library 112 calls within reception process 707, are set, as explained later, either to the buffer address and buffer length specified by the arguments of the function EMU_read issued by the receiving-side application program 107, or to the address and size of a specific buffer provided inside the receiving-side TCP/IP emulation library 112.

【００８０】 When invoked, the function MPI_Send in the sending-side message-passing type library 141 requests the corresponding high-speed communication library 136 to transfer data of the length specified by the MPI_Send arguments, from the buffer at the address specified by those arguments, to the receiving-side computer 102 over the internal high-speed communication network 105. When invoked, the function MPI_Recv in the receiving-side message-passing type library 140 requests the corresponding high-speed communication library 135 to receive the transferred data over the internal high-speed communication network 105 and to write data of the length specified by the MPI_Recv arguments into the buffer at the address specified by those arguments.

【００８１】 The invoked sending-side high-speed transfer library 136 and receiving-side high-speed transfer library 135 send and receive the requested data over the internal high-speed communication network 105 by a method that is itself well known. Specifically, the transfer proceeds as follows. A command generally called a remote memory write command or PUT command is used. Below, this command is called the PUT command. Data transfer over the internal high-speed communication network 105 is carried out with three PUT commands.

【００８２】 First, when MPI_Send is called from the sending-side TCP/IP emulation library 113, the sending-side message-passing type library 141 issues a first PUT command. This command requests the sending-side high-speed transfer library 136 to send a header containing attribute information of the data, including the buffer length specified by the argument (that is, the length of the data to be sent). The high-speed transfer library 136 sends this header to the receiving-side computer 102. The dedicated internal high-speed communication hardware 119 in computer 102 writes the header directly to a predetermined location in the receiving-side memory 124.

【００８３】 When the receive call MPI_Recv is issued from the receiving-side TCP/IP emulation library 112, the receiving-side message-passing type library 140 first checks whether a header has already arrived from the sending-side computer 103. The data check instruction MPI_Probe, described later, is used for this. If no header has arrived, execution of the receive call ends. If a header has already arrived, the library determines from the buffer length in the header whether the transmitted data can be received, that is, whether the length of the transmitted data is no greater than the buffer length specified by the MPI_Recv call. Not only in MPI message-passing communication but in message-passing type communication generally, the receiving computer does not receive the data when the data length specified by the send call exceeds the receive buffer size specified by the receive call. This is because, if the receiving computer accepted data exceeding the size of its receive buffer, memory outside that buffer could be corrupted by the received data. When the receiving-side message-passing type library 140 determines that the data can be received, it issues a second PUT command requesting the receiving-side high-speed communication library 135 to send the address of the receive buffer specified by the MPI_Recv arguments. In response to this command, the high-speed communication library 135 sends the buffer address to the sending-side computer 103. The dedicated internal high-speed communication hardware 119 in computer 102 writes this address to a predetermined location in the sending-side memory 125. If the data cannot be received, the receiving-side message-passing type library does not return a buffer address and instead reports an error to the sending-side computer.

【００８４】 Finally, the sending-side message-passing type library issues a third PUT command requesting the sending-side high-speed transfer library 136 to send the data together with the buffer address it received. The high-speed transfer library 136 sends the data and the buffer address to the receiving-side computer 102. The dedicated internal high-speed communication hardware 119 in the receiving-side computer 102 receives the buffer address and the data and writes the received data directly into the buffer at that address. This completes the data transfer.
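
The three-PUT exchange can be modeled in plain C as a rough sketch. The model below is an assumption-laden illustration, not the vendor library: remote memory writes are simulated by ordinary stores into a `mailbox` structure, and the names `mailbox`, `handshake_status`, `put_header`, `offer_buffer`, and `put_data` are hypothetical. Unlike the text above, the sketch leaves an oversized header pending instead of reporting an error to the sender.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-computer area that the peer writes into with PUT. */
typedef struct {
    size_t header_len;    /* set by PUT 1: length of the pending data */
    bool   header_valid;
    char  *remote_buf;    /* set by PUT 2: address of the receive buffer */
    bool   addr_valid;
} mailbox;

typedef enum { HS_OK, HS_NO_HEADER, HS_TOO_LONG } handshake_status;

/* PUT 1: the sender writes the header into the receiver's memory. */
void put_header(mailbox *receiver, size_t data_len)
{
    receiver->header_len = data_len;
    receiver->header_valid = true;
}

/* PUT 2: the receiver checks the announced length against its buffer and,
 * if the data fits, writes the buffer address into the sender's memory. */
handshake_status offer_buffer(mailbox *receiver, mailbox *sender,
                              char *recv_buf, size_t recv_buf_len)
{
    if (!receiver->header_valid)
        return HS_NO_HEADER;            /* nothing has been announced yet */
    if (receiver->header_len > recv_buf_len)
        return HS_TOO_LONG;             /* data would overrun the buffer */
    receiver->header_valid = false;     /* header consumed */
    sender->remote_buf = recv_buf;
    sender->addr_valid = true;
    return HS_OK;
}

/* PUT 3: the sender writes the data directly into the advertised buffer. */
bool put_data(mailbox *sender, const char *data, size_t data_len)
{
    if (!sender->addr_valid)
        return false;                   /* no buffer address received yet */
    memcpy(sender->remote_buf, data, data_len);
    sender->addr_valid = false;
    return true;
}
```

The length check in `offer_buffer` is the step that protects memory outside the receive buffer. The data itself moves only in the third step, after the receiver has agreed to a destination.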

【００８５】 The data communication method over the internal high-speed communication network 105 is not limited to the method above, and other methods may be used. On some parallel computers, for example, the high-speed communication library can execute, in addition to the PUT command, a remote memory read command, or GET command. On such a parallel computer, instead of the sending-side message-passing type library 141 issuing the third PUT command, the receiving-side message-passing type library 140 issues a GET command, and the receiving-side high-speed communication library 135 reads the data out of the sending-side memory 125.

【００８６】 (8) Stream communication in message-passing type communication
The data reception process 707 and the transmission process 708 shown in FIG. 4 are described in detail below, with reference as appropriate to the concrete examples shown in FIGS. 7 and 8. In FIG. 4, as already described, when the sending-side application program 108 calls the function EMU_write (702) and transmission process 708 is started as a result, the function MPI_Send contained in the message-passing type library 141 is called (781). In the example shown in FIG. 7, the arguments of the EMU_write call (702) are assumed to specify the address Sbuff of buffer 803 and a buffer size of 50 kilobytes (KB). The buffer address and buffer size specified by the arguments of the MPI_Send call (781) are set equal to these values. Descriptions of the other arguments of the functions used are omitted below for simplicity. The unit of the specified buffer size is assumed to be KB, and the unit is omitted when function arguments are shown in the figures. In response to function call 781, the message-passing type library 141 instructs the high-speed communication mechanism 117, as already described, to send the header and then to send the data.

【００８７】 In FIG. 4, as already described, when the receiving-side application program 107 calls the function EMU_read (701) and reception process 707 is started as a result, reception process 707 performs the processing shown in FIG. 6. In the example of FIG. 7, the buffer address specified by the arguments of the EMU_read call 701 is Rbuff and the buffer size is 30 KB, which is assumed to be smaller than the buffer size specified by the EMU_write call 702.

【００８８】 In reception process 707, an instruction is first issued to determine whether a specific receive buffer provided inside the receiving-side TCP/IP emulation library 112 (buffer 901, shown later) still holds data that has been received but not yet passed to the receiving-side application program (process 202). This instruction is not shown in FIG. 7. When EMU_read is executed for the first time, as assumed here, the result of this determination is negative. The data detection instruction MPI_Probe 771 is then issued to determine whether there is data about to be sent to the receiving-side application program (process 206). Specifically, this instruction determines whether the header already described, for data to be sent to application program 107, has been received by the high-speed communication mechanism 116. If the header has not yet been sent from the sending-side application program 108, reception process 707 ends and returns to the receiving-side application program 107 (process 207). The receiving-side application program 107 then performs its processing for a failed receive. For example, it calls function 701 repeatedly until the receive succeeds.

【００８９】 Assuming that the header has already been sent from the sending-side application program 108, it is determined whether the data described by the header fits in buffer 812 of the receiving-side application program (process 208). In the example of FIG. 7, the data size is 50 KB and the size of the receiving-side buffer 812 is 30 KB, so the result of this determination is negative. In such a case, if the receiving-side TCP/IP emulation library 112 issued an MPI_Recv call to receive this data into buffer 812 of the receiving-side application program 107, the receiving-side message-passing type library 140 would normally, under the MPI specification, either treat the call as an error or discard the portion of the data that does not fit in buffer 812.

【００９０】 To prevent this, in this embodiment a special buffer 901 is provided inside the receiving-side TCP/IP emulation library 112 linked with the receiving-side application program 107, and the receiving-side TCP/IP emulation library 112 requests that the transmitted data first be received into buffer 901. That is, it calls the function MPI_Recv specifying the address Ebuff of buffer 901 and the full data size of 50 KB (772).

【００９１】 In response to function call 772, the receiving-side high-speed communication mechanism 116 and the sending-side high-speed communication mechanism 117 transfer the data as already described, and the receiving-side high-speed communication mechanism 116 writes the data into buffer 901. The receiving-side TCP/IP emulation library 112 then issues a memory copy instruction (MEMCPY(Rbuff, Ebuff, 30KB)) 773 to copy, out of the received data, the 30 KB specified by the arguments of the EMU_read call into buffer 812 of the receiving-side application program 107 (process 209). Processing then returns to the application program (process 211). The exchange between the application programs thus completes with part 906 of the received data remaining in buffer 901 in the TCP/IP emulation library 112.

【００９２】 Subsequently, when the sending-side application program 108 calls the function EMU_write(Sbuff', 80KB) (790) (FIG. 7(B)) to send 80 KB of data from buffer 804 at address Sbuff', transmission process 708 is executed in the same way, and the function MPI_Send(Sbuff', 80) (791) is called within it. When the receiving-side application program 107 likewise calls the function EMU_read(Rbuff', 100KB) (775) to receive 100 KB of data into buffer 813 at address Rbuff', reception process 707 is executed in the same way.

【００９３】 In this reception process 707, the first determination, process 202, finds that untransferred data 906 remains in the receive buffer 901 in the receiving-side TCP/IP emulation library 112. As a result, the memory copy instruction memcpy(Rbuff', Ebuff+30KB, 20KB) (776) is issued to copy data 906 into the leading area 907 of buffer 813 in the receiving-side application program 107 (process 203). The length of the data copied is the length of the data held in buffer 901, limited so that it does not exceed the buffer length specified by this second EMU_read call. In the present case, buffer 901 holds 20 KB, which is less than the 100 KB requested, so all 20 KB are copied.
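
The copy-length rule of process 203 amounts to taking the smaller of the held length and the requested length. A minimal C sketch follows. The structure `staged` and the function `drain_staged` are hypothetical names, and byte counts stand in for the kilobyte figures of the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the library-internal receive buffer (901). */
typedef struct {
    const char *data;   /* start of the buffer */
    size_t      start;  /* offset of the first byte not yet delivered */
    size_t      end;    /* offset one past the last byte held */
} staged;

/* Copy min(held, requested) bytes to dst and advance the read offset.
 * Returns the number of bytes copied. */
size_t drain_staged(staged *s, char *dst, size_t requested)
{
    size_t held = s->end - s->start;
    size_t n = held < requested ? held : requested;
    memcpy(dst, s->data + s->start, n);
    s->start += n;
    return n;
}
```

With `start` at 30 and `end` at 50, a request for 100 bytes copies the 20 held bytes and empties the buffer, which mirrors the memcpy(Rbuff', Ebuff+30KB, 20KB) step above.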

【００９４】 Process 206 is then executed to receive the remaining 80 KB requested by the receiving-side application program 107. As already described, process 206 issues the instruction MPI_Probe() to check whether there is data to receive, specifically, whether a header for transmitted data has been received. Assuming here that the sending-side function call 791 has already been executed, 80 KB of data is found to be pending. It is then determined whether this data fits in buffer 813 specified by the receiving-side application program 107 (process 208). In the present case the remaining area 908 of buffer 813 of the receiving-side application program is 80 KB, so the data fits in area 908 of buffer 813.

【００９５】 As a result, the function MPI_Recv(Rbuff'+20KB, 80KB) (778) is called. This call specifies the address of the remaining area 908 of buffer 813 and the data size of 80 KB. The data is thus received directly into area 908 of buffer 813 (process 210). It is then determined whether all the requested data has been received (process 212). In the present case the result is affirmative, so reception process 707 ends and processing returns to the application program (process 211).

【００９６】 By the above procedure, stream communication is achieved in which the data specified by multiple EMU_write calls (702, 790) is received as one continuous data stream by multiple EMU_read calls (701, 775).
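
The whole receive flow (processes 202 to 212) can be exercised without MPI in a small C simulation. Everything here is an assumption made for illustration: the FIFO `channel` stands in for the message-passing layer, `emu_write` and `emu_read` are simplified stand-ins for the functions EMU_write and EMU_read, byte counts replace kilobytes, and returning a short count when no message is pending is a choice of this sketch.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MSGS  8
#define STAGE_CAP 256

/* Hypothetical stand-in for the message-passing layer: a FIFO of whole messages. */
typedef struct { const char *data; size_t len; } message;
typedef struct { message q[MAX_MSGS]; size_t head, tail; } channel;

/* Receiver state: library-internal buffer (901) and its undelivered range. */
typedef struct { char stage[STAGE_CAP]; size_t start, end; } emu_reader;

/* One write queues one whole message (plays the role of MPI_Send). */
int emu_write(channel *ch, const char *buf, size_t len)
{
    if (ch->tail - ch->head == MAX_MSGS)
        return -1;
    ch->q[ch->tail % MAX_MSGS].data = buf;
    ch->q[ch->tail % MAX_MSGS].len = len;
    ch->tail++;
    return 0;
}

/* Returns the number of bytes delivered into buf (at most want). */
size_t emu_read(emu_reader *r, channel *ch, char *buf, size_t want)
{
    size_t got = 0;
    size_t held = r->end - r->start;

    if (held > 0) {                              /* processes 202, 203 */
        size_t n = held < want ? held : want;
        memcpy(buf, r->stage + r->start, n);
        r->start += n;
        got = n;
    }
    while (got < want) {                         /* process 212 */
        if (ch->head == ch->tail)                /* process 206: nothing pending */
            break;                               /* process 207 */
        message m = ch->q[ch->head % MAX_MSGS];
        if (m.len <= want - got) {               /* process 208: message fits */
            memcpy(buf + got, m.data, m.len);    /* process 210: direct receive */
            ch->head++;
            got += m.len;
        } else {
            size_t n = want - got;
            if (m.len > STAGE_CAP)               /* limit of this sketch only */
                break;
            memcpy(r->stage, m.data, m.len);     /* receive whole message into 901 */
            ch->head++;
            memcpy(buf + got, r->stage, n);      /* process 209: copy what fits */
            r->start = n;
            r->end = m.len;
            got += n;
        }
    }
    return got;
}
```

Replaying FIG. 7 with bytes, a 50-byte write followed by a 30-byte read leaves 20 bytes staged, and a later 80-byte write followed by a 100-byte read returns those 20 bytes followed by the 80 new ones. The internal buffer is refilled only after it has been fully drained, so a staged message is never overwritten.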

【００９７】 Stream communication is also easily achieved, as follows, when the size of the data specified by the sending-side application program is smaller than the size of the buffer specified by the receiving-side application program. As an example, stream communication in which the sending-side application program repeatedly requests transmission of 50 KB of data and the receiving-side application program requests reception of 100 KB of data is described with reference to FIG. 8.

[0098] In the transmission process 708 (FIG. 4) for the function EMU_write (702) called by the sending application program, the function MPI_send (781) is called. This call specifies the address Sbuff of the buffer 803 of the sending application program and its size, 50 KB.

[0099] The reception process 707 (FIG. 4) for the function EMU_read (701) called by the receiving application program is also carried out according to FIG. 6, as already described. Under the present assumption, the determination in process 202 fails. Assume that, when the data check instruction 771 is executed in process 206, it is determined that transmission data exists. In this case the receiving-side buffer 812 is larger than the sending-side buffer, so the result of the determination in process 208 is affirmative, and process 210 is executed. In this process, the function MPI_recv (774) is called to receive the transmission data directly into the buffer 812 specified by the receiving application program. This call specifies the start address Rbuff of the buffer 812 of the receiving application program and the size, 50 KB, of the sending-side buffer 803. All of the data in the sending-side buffer 803 is thus written into the area 907, the first 50 KB of the receiving-side buffer 812. Process 212 is executed next. In this case the size of the data requested for reception is 100 KB, whereas only 50 KB has been received so far, so part of the requested data has not yet been received. The result of determination 212 is therefore negative, and process 206 is executed again to receive the remaining data.

[0100] If the sending application program next calls the function EMU_write (790), the function MPI_send (791) is likewise called in the corresponding transmission process 708 (FIG. 4). This call specifies the address Sbuff' of the next buffer 804 of the sending application program and its size, 50 KB.

[0101] If, when process 206 is repeated, the sending application program has already called the next function EMU_write (790), so that MPI_send (791) has been issued, the result of the determination in process 206 is affirmative and control moves to determination process 208. In this case the size of the remaining area 910 of the receiving-side buffer 812 equals the size of the data about to be transmitted, so the result of this determination is affirmative. Process 210 is therefore executed, and the second reception function 779 is called to write the transmission data directly into the remaining area 910 of the receiving-side buffer 812. This call specifies the address Rbuff+50KB of the remaining area 910 of the buffer 812 of the receiving application program and the size of the transmission data, 50 KB. Process 212 then determines that all of the requested data has been received, and the reception process 707 completes.

If no transmission data exists when process 206 is repeated, the reception process 707 ends and control returns to the receiving application program. If the result of the determination is negative when process 208 is repeated, process 209 is executed; its content is the same as already described.

As described above, stream communication is realized even when the buffer size specified by the transmission instruction EMU_write issued by the sending application program differs from the buffer size specified by the reception instruction EMU_read issued by the receiving application program, and even when the number of EMU_write instructions issued by the sending application program differs from the number of EMU_read instructions issued by the receiving application program.

[0102] As is clear from the above description, according to this embodiment, when applications running on computers provided with a high-speed message-passing communication mechanism, as inside a parallel computer, exchange data using TCP/IP, high-speed communication that exploits the features of that mechanism becomes possible. Communication with applications running on other computers is guaranteed to proceed over TCP/IP as before. Users do not need to change existing TCP/IP applications at all.

[0103] <Modifications of the First Embodiment of the Invention> The present invention is not limited to the content of the first embodiment and can be carried out in various forms, including the modifications exemplified below and other modifications.

[0104] (1) TCP/IP was used as the communication protocol of the wide area network, but another communication protocol can be used instead. In that case, the TCP/IP processing routine and the emulation library naturally need to be changed.

[0105] (2) The first embodiment used a message-passing library, but it is also possible not to use one. In that case, the emulation library calls the high-speed communication library directly.

[0106] (3) It is further possible to eliminate this communication library as well. For example, a dedicated circuit can be used in its place.

[0107] (4) The first embodiment assumed that all internal computers are connected to the wide area communication network. The present invention can, however, be applied in the same way when only some of the internal computers are connected to the wide area communication network.

[0108] (5) The first embodiment presupposed the use of both the wide area communication network and the internal high-speed communication network. The stream communication according to the present invention is, however, itself applicable between any computers capable of message-passing communication. It can therefore also be applied to communication between application programs that use only message-passing communication and no TCP/IP communication. In that case, a plurality of types of communication network need not be used. A modification in which the sending-side emulation library is substantially unused is then also possible.

[0109] <Second Embodiment of the Invention> The TCP/IP processing routine has conventionally provided a select function that application programs can use. A socket descriptor is defined as a kind of file descriptor, and the select function is provided as a system call for checking whether data can be obtained from the object designated by a file descriptor. For example, the select function can check whether data that can be received from a given socket has been sent (or is about to be sent) by the sender. Specifically, it can determine whether the application program to which a given socket is assigned has issued the send system call write. In the select function, the file descriptors to be watched are specified by a bit string. Each bit of the bit string corresponds to one file descriptor, and a file descriptor is designated by setting its bit to 1. Setting several bits to 1 lets a single select call watch several file descriptors at once. The select function blocks until data can be received from any of the watched file descriptors.

[0110] The TCP/IP emulation library used in the first embodiment does not use the conventional socket library for communication inside the parallel computer, so the select system call cannot check whether data can be received from a socket used for internal communication. This embodiment therefore allows application programs to use the select function as before, even in a computer system that uses the TCP/IP emulation library as in the first embodiment.

[0111] In this embodiment, a select function EMU_select is provided in the TCP/IP emulation library 113. It separates the descriptors into two groups: for socket descriptors registered in the internal socket tables 410, 411, and so on, it performs processing equivalent to select that is dedicated to internal communication; for all other file descriptors, it uses the conventional select system call unchanged.

[0112] The internal select and the select system call must, however, be executed concurrently. If the two were executed one after the other, then while, for example, the internal select was waiting for data, it would be impossible to detect that a socket for external communication, or standard input/output, had become ready to receive data.

[0113] One conceivable way to simulate concurrent execution is to issue the internal select and the select system call one after the other in non-blocking mode and to repeat this in a spin loop. With this method, however, the process occupies the computer until data becomes receivable, and other processes running on the same computer get no processing time.

[0114] This embodiment instead inserts, in every iteration of the loop, an instruction that yields processing to other processes. This prevents the spin loop from monopolizing the computer.

[0115] FIG. 9 shows, for the connection state of FIG. 3, application programs 108 and 109 issuing transmission instructions 1002 and 1003 to send data to application program 107, while application program 107 watches for the arrival of that data with a select instruction 1001. The bit string specified by the select instruction 1001 is assumed to correspond to sockets SA0 and sa1 (1004, 1005). Of these, sa1 is registered in the internal socket table and is therefore watched by the internal select (1008). SA0 is not registered in the internal socket table and is therefore watched by the select system call (1009). The select system call issued by the TCP/IP emulation library does not need to watch sa1, so it is given the bit string specified by the select instruction 1001 of application program 107 with the bit corresponding to sa1 cleared to 0. Processes 1008 and 1009 are issued in non-blocking mode and repeated alternately (1010, 1011). During the repetition, however, processing is temporarily yielded to other processes.

[0116] To make the select instruction and the internal select function of FIG. 9 available to application programs, an emulation select function EMU_select is provided in the TCP/IP emulation libraries 112, 113, and so on, and application programs call it. When calling the EMU_select function, an application program specifies, as before, a bit string ap_bits in which each bit corresponds to one socket. In response to this call, the TCP/IP emulation library executes the processing shown in FIG. 10.

[0117] First, the bit string ap_bits (process 1101) is masked so that the bits corresponding to the socket descriptors registered in the internal socket table become 0 (process 1102). The mask in_mask is a bit string in which the bits corresponding to all socket descriptors registered in the internal socket table are 0 and all other bits are 1. The bit string ex_bits obtained by masking ap_bits with in_mask (a bitwise AND) therefore designates the file descriptors specified in ap_bits minus the socket descriptors for internal communication. Next, the select system call with ex_bits as its argument and the internal select processing are each executed once in non-blocking mode (processes 1103, 1104). If any of the file descriptors examined is ready to receive data, the function returns (process 1106). Otherwise, processing is temporarily yielded to other processes (process 1107) and the select processing is repeated (process 1108).

[0118] Thus, according to this embodiment, application programs can use the select function even in a computer system that also uses an internal high-speed communication mechanism, as in the first embodiment.

[0119] <Third Embodiment of the Invention> The spin-loop implementation of the select function described above avoids monopolizing the computer by inserting process 1107, which yields processing to other processes, into the spin loop. If a process running on the same computer has a low priority, however, processing may still never be passed to it. This can be avoided by using a blocking wait, which sleeps until data arrives, but a blocking internal select and a blocking select system call cannot be executed simultaneously in a single process with a single thread.

[0120] This embodiment instead first creates two threads and executes the internal select on one and the select system call on the other. Because the internal select and the select system call then run independently on their own threads, both can block and watch for data at the same time.

[0121] FIG. 11 shows the same communication state as FIG. 9. What differs is how concurrent execution of the internal select and the select system call is achieved. In FIG. 11 the two select operations are executed on separate threads (1203, 1204). As in process 1009, the select system call on thread 1203 is given a bit string in which the bit corresponding to sa1 is cleared to 0.

[0122] Referring to FIG. 12, in this embodiment the processing up to the creation of ex_bits is the same as processes 1101 and 1102 of FIG. 10 (processes 1301, 1302): the bit string ap_bits specified when the application program issues the EMU_select function is masked so that the bits corresponding to the socket descriptors registered in the internal socket table become 0. Next, the select system call (process 1303) and the internal select processing (process 1304) are each executed once in non-blocking mode. If any of the file descriptors examined is ready to receive data, the function returns. Otherwise, two threads are created (process 1307), and the select system call (process 1308) and the internal select (process 1309) are executed on them, one on each. Each of these blocks until data becomes receivable. Whichever unblocks first cancels the other thread (processes 1310, 1311) and returns (processes 1314, 1315). This cancellation does not forcibly terminate the thread; it marks the thread as no longer needed. When a thread unblocks, it checks whether it has been marked (processes 1316, 1317); if so, it has been cancelled and simply exits (processes 1318, 1319).

[0123] In the above select processing, the non-blocking select operations of processes 1303 and 1304 are each performed once before the threads are split, for the following reason. If both an internal socket and some other file descriptor are already ready to receive data before process 1301 is issued, the select function must be able to detect both. If the threads were split immediately and the internal sockets and the other file descriptors were watched separately, whichever thread detected its data slightly earlier would cancel the other, so only one thread's state could be detected. Executing processes 1303 and 1304 ensures that the states of both the internal sockets and the other descriptors, as they were before process 1301 was issued, are examined.

[0124] Because this method sleeps while waiting for data to arrive, data is detected later than with the spin-loop wait of the second embodiment, but the method avoids obstructing even low-priority processes.

[0125]

[Effects of the Invention] According to the present invention, stream communication can be realized by means of message-passing communication.

[0126] According to another aspect of the present invention, an application program running on a computer connected to a first communication network and to a faster second communication network can communicate, based on a first communication protocol, with another application program running on another computer connected to the first communication network, and can also perform high-speed communication over the second communication network with another application program running on another computer connected to the second communication network.

[0127] More specifically, the TCP/IP communication protocol can be used as the first communication protocol, and the communication over the second communication network can be message-passing communication.

[Brief description of the drawings]

FIG. 1 is an overall configuration diagram of an embodiment of the present invention.

FIG. 2 is a flowchart of socket connection.

FIG. 3 is a diagram for explaining socket connection.

FIG. 4 is a flowchart of the method for separating internal and external communication.

FIG. 5 is a diagram for explaining the method for separating internal and external communication.

FIG. 6 is a flowchart of stream communication.

FIG. 7 is a diagram showing an example of an instruction sequence used for transmission and reception.

FIG. 8 is a diagram showing another example of an instruction sequence used for transmission and reception.

FIG. 9 is a diagram for explaining a method of realizing the select function with a spin loop.

FIG. 10 is a flowchart of the method of realizing the select function with a spin loop.

FIG. 11 is a diagram for explaining a method of realizing the select function with threads.

FIG. 12 is a flowchart of the method of realizing the select function with threads.

FIG. 13 is a layer diagram of the conventional TCP/IP communication protocol.

FIG. 14 is a diagram for explaining conventional stream communication.

[Explanation of symbols]

105: internal communication network of the parallel computer
106: global communication network
111: socket application program interface
118: MPI-specification interface
119, 120, 121, 122, 123: network interface hardware
140, 141: message-passing library
401, 402, 403, 404, 405, 406: sockets
407, 408, 409: socket connections
410, 411: internal socket tables
601: data path for communication inside the parallel computer
602: data path for communication with an external computer
803, 804, 812, 813: application program buffers
809: stream
901: buffer in the TCP/IP emulation library

Claims

[Claims]

1. An inter-computer data transmission/reception method in which a plurality of transmission data, specified by a plurality of transmission instructions issued by a first application program executed on a first computer, are received by message-passing communication, under the control of an emulation library provided on a second computer, in response to a plurality of reception instructions issued by a second application program executed on the second computer, and in which each received transmission data is processed under the control of the emulation library so that the data formed by concatenating the plurality of transmission data is divided into portions of the sizes specified by the plurality of reception instructions and stored in the plurality of buffers specified by the respective reception instructions.

2. An inter-computer data transmission/reception method in which: in response to a plurality of transmission instructions issued by a first application program running on a first computer having a first communication library for executing message-passing communication, a first emulation library provided in the first computer issues, to the first communication library, a plurality of transmission instructions requesting transmission of the plurality of transmission data held in the plurality of buffers specified by the respective transmission instructions; in response to a plurality of reception instructions issued by a second application program running on a second computer having a second communication library for executing message-passing communication, a second emulation library provided in the second computer issues, to the second communication library, a plurality of reception instructions for receiving the plurality of transmission data by message-passing communication; and the second emulation library controls the storage positions, in the second computer, of the plurality of transmission data received according to the plurality of reception instructions issued by the second emulation library, and the movement of each transmission data within the second computer after its reception, so that the series of data formed by concatenating the plurality of transmission data specified by the plurality of transmission instructions issued by the first emulation library is divided into portions of the sizes specified by the plurality of reception instructions and stored in the plurality of buffers specified by the respective reception instructions.

3. An inter-computer data transmission/reception method in which: in response to a plurality of transmission instructions issued by a first application program executed on a first computer having a first communication library for executing message-passing communication, a first emulation library provided in the first computer issues, to the first communication library, a plurality of transmission instructions requesting transmission of the plurality of transmission data held in the plurality of buffers specified by the respective transmission instructions; in response to a plurality of reception instructions issued by a second application program executed on a second computer having a second communication library for executing message-passing communication, a second emulation library provided in the second computer issues, to the second communication library, a plurality of reception instructions for receiving the plurality of transmission data by message-passing communication; so that the series of data formed by concatenating the plurality of transmission data specified by the plurality of transmission instructions issued by the first emulation library is divided into portions of the sizes specified by the plurality of reception instructions issued by the second application program and stored in the plurality of buffers specified by the respective reception instructions, the second emulation library (a) sets the address of the buffer for storing received data, specified by each of the plurality of reception instructions issued by the second emulation library, to an address belonging to the buffer specified by the reception instruction of the second application program currently being processed or to an address belonging to a buffer usable by the second communication library, and (b) controls the movement, within the second computer, of transmission data received by a reception instruction issued by the second emulation library that specifies the address of the buffer usable by the second communication library; the plurality of reception instructions issued by the second emulation library are each issued in correspondence with one of the plurality of transmission instructions issued by the first emulation library; and the size of the received data specified by each of the plurality of reception instructions issued by the second emulation library is set equal to the size of the transmission data specified by the corresponding one of the plurality of transmission instructions issued by the first emulation library.

4. An inter-computer data transmission/reception method in which:
(a) within the transmission-side computer, a transmission command defined by the message-passing type communication is issued, requesting transmission of any transmission data specified by the transmission-side application;
(b) the transmission data length specified by the transmission command is detected, before the transmission data is transferred, by a reception-side emulation library operating on the reception-side computer;
(c) the specified reception data length and the transmission data length are compared by the reception-side emulation library;
(d) in accordance with the result of the comparison, the reception-side emulation library issues a reception command defined by the message-passing type communication, requesting that the transmission data be received either in an emulation buffer in the reception-side computer or in the application buffer; and
(e) in response to the transmission command and the reception command issued by the reception-side emulation library, the transmission data is transferred from the transmission-side computer to the emulation buffer or the application buffer by the transmission-side computer and the reception-side computer.

5. The method according to claim 4, further comprising the step of:
(f) when the transmission data has been transferred to the emulation buffer, copying, by the reception-side emulation library, data corresponding to the reception data length from the emulation buffer into the application buffer designated by the reception command.

6. The method according to claim 5, further comprising the steps of:
(g) when the transmission data has been transferred into the application buffer in step (e), waiting for the transmission-side computer to execute step (a) for subsequent transmission data; and
(h) when step (a) has been executed for the subsequent transmission data, executing the processing from step (b) onward on the subsequent transmission data and the remaining portion of the reception data length requested by the reception command.

7. The inter-computer data transmission/reception method according to claim 6, further comprising the step of:
(i) before executing step (a), determining whether data that has not yet been transferred to any application buffer remains in the emulation buffer and, when such data remains, copying it from the emulation buffer to the application buffer within a range not exceeding the reception data length.
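The reception-side behaviour of steps (b) through (i) can be illustrated with a minimal sketch. It is not the patented implementation: the class `ReceiverEmulation`, the method `recv_into`, and the use of a `deque` of byte strings in place of the message-passing library are all assumptions made for illustration. Reading the length of the message at the head of the queue stands in for detecting the transmission data length before transfer.

```python
from collections import deque


class ReceiverEmulation:
    """Hypothetical reception-side emulation library (illustrative only)."""

    def __init__(self, incoming):
        # Stand-in for the message-passing layer: one entry per send command.
        self.incoming = deque(incoming)
        # Emulation buffer holding data not yet handed to the application.
        self.emu_buf = bytearray()

    def recv_into(self, app_buf: bytearray) -> int:
        want = len(app_buf)          # reception data length
        filled = 0

        # Step (i): hand over leftovers from the emulation buffer first.
        if self.emu_buf:
            n = min(want, len(self.emu_buf))
            app_buf[:n] = self.emu_buf[:n]
            del self.emu_buf[:n]
            filled = n

        while filled < want:
            if not self.incoming:
                break                # a real library would wait here, step (g)
            send_len = len(self.incoming[0])      # step (b): length known first
            remaining = want - filled
            msg = self.incoming.popleft()
            if send_len <= remaining:             # steps (c), (d)
                # Fits: receive directly into the application buffer, step (e).
                app_buf[filled:filled + send_len] = msg
                filled += send_len                # step (h): loop for the rest
            else:
                # Too long: receive into the emulation buffer, then copy
                # only the requested part to the application, step (f).
                self.emu_buf += msg
                app_buf[filled:want] = self.emu_buf[:remaining]
                del self.emu_buf[:remaining]
                filled = want
        return filled
```

In this sketch a message that fits in the remaining space is placed in the application buffer without an intermediate copy, and only an oversized message passes through the emulation buffer.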

8. In a computer system having a plurality of computers interconnected by a first communication network, at least some of which are also interconnected by a second communication network, a data transmission/reception method wherein:
(a) a communication path used for data transmission/reception between a first application program operating on a first computer among said some of the computers and any second application program operating on a second computer is determined by a processing routine that operates in accordance with a communication protocol defined for using the first communication network;
(b) when an emulation library processes a connection request from the first application program for the determined communication path, it is determined whether the second communication network can be used for communication with the second application program;
(c) when it is determined that the second communication network can be used, the communication path is registered in the emulation library;
(d) when the first application program thereafter issues a data transmission/reception instruction, it is determined whether the communication path specified by the transmission/reception instruction is registered in the emulation library;
(e) when the specified communication path is registered in the emulation library, the transmission/reception processing requested by the transmission/reception instruction is executed by the emulation library using the second communication network; and
(f) when the specified communication path is not registered in the emulation library, the emulation library requests the processing routine to execute the transmission/reception processing requested by the transmission/reception instruction, and the processing routine executes that processing using the first communication network.
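The registration and dispatch of steps (b) through (f) can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `EmulationLibrary`, the parameters `fast_peers`, `fast_send`, and `tcp_send`, and the use of a plain identifier for the communication path are hypothetical and do not come from the patent text.

```python
class EmulationLibrary:
    """Hypothetical dispatcher between two networks (illustrative only)."""

    def __init__(self, fast_peers, fast_send, tcp_send):
        self.fast_peers = set(fast_peers)  # peers reachable on the second network
        self.fast_send = fast_send         # stand-in for the message-passing library
        self.tcp_send = tcp_send           # stand-in for the protocol processing routine
        self.registered = set()            # communication paths using the second network

    def connect(self, path_id, peer):
        # Steps (b), (c): register the path only if the second network is usable.
        if peer in self.fast_peers:
            self.registered.add(path_id)

    def send(self, path_id, data):
        # Steps (d), (e), (f): look up the path and pick the network.
        if path_id in self.registered:
            return self.fast_send(path_id, data)
        return self.tcp_send(path_id, data)
```

The application calls `connect` and `send` in the same way for every peer; which network carries the data depends only on whether the path was registered.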

9. The data transmission/reception method according to claim 8, wherein the communication protocol is TCP/IP, and the determination of the communication path includes a process of generating a socket and naming the socket.

10. The data transmission/reception method according to claim 8, wherein the communication using the second communication network is message-passing type communication.

11. In a computer system having a plurality of computers interconnected by a first communication network, at least some of which are also connected to a second communication network capable of transferring data at a higher speed than the first communication network, wherein, for data transmission/reception between a first application program operating on a first computer among said some of the computers and any second application program operating on a second computer, the second communication network is used when it is available and the first communication network is used otherwise, a transfer data detection method wherein:
when detecting the presence or absence of data transmitted from another application program and addressed to the first application program, the process of detecting data transmitted via the second communication network and the process of detecting data transmitted via the first communication network are repeated alternately in a non-blocking manner, and processing is handed over to another process running on the first computer.
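The alternating non-blocking detection can be sketched as below. This is an illustration only: the two `queue.Queue` objects stand in for the two networks, the function name `detect_incoming` and the `max_rounds` limit are assumptions, and `time.sleep(0)` is used as a portable way to give up the processor to other work between rounds.

```python
import queue
import time


def detect_incoming(fast_q, tcp_q, max_rounds=1000):
    """Poll both sources without blocking; return (source, data) or None."""
    for _ in range(max_rounds):
        # Check the second (fast) network first, then the first network.
        for name, q in (("fast", fast_q), ("tcp", tcp_q)):
            try:
                return name, q.get_nowait()   # non-blocking check
            except queue.Empty:
                pass
        # Nothing found on either network in this round:
        # yield the processor so other work on this computer can run.
        time.sleep(0)
    return None
```

Because neither check blocks, data arriving on one network is never held up by a wait on the other.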

12. In a computer system having a plurality of computers interconnected by a first communication network, at least some of which are also connected to a second communication network capable of transferring data at a higher speed than the first communication network, wherein, for data transmission/reception between a first application program operating on a first computer among said some of the computers and any second application program operating on a second computer, the second communication network is used when it is available and the first communication network is used otherwise, a transfer data detection method wherein:
when detecting the presence or absence of data transmitted from another application program and addressed to the first application program, two threads are generated, and the process of detecting data transmitted via the second communication network and the process of detecting data transmitted via the first communication network are performed independently on the respective threads.
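The two-thread variant can be sketched in the same style. Again this is a hedged illustration rather than the claimed implementation: the queues stand in for the two networks, and the names `detect_with_threads`, `watch`, and `timeout` are hypothetical.

```python
import queue
import threading


def detect_with_threads(fast_q, tcp_q, timeout=0.2):
    """Run one blocking detector per network; return the list of detections."""
    found = queue.Queue()

    def watch(name, q):
        # Each thread may block on its own source without delaying the other.
        try:
            found.put((name, q.get(timeout=timeout)))
        except queue.Empty:
            pass

    threads = [
        threading.Thread(target=watch, args=(name, q))
        for name, q in (("fast", fast_q), ("tcp", tcp_q))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    results = []
    while not found.empty():
        results.append(found.get())
    return sorted(results)
```

Compared with alternating polling, each detector here can use a blocking wait, at the cost of creating and synchronising two threads.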