JPH06202867A - Parallel computer

Info

Publication number: JPH06202867A
Application number: JP34843092A
Authority: JP
Inventors: Chikako Nakanishi; 知嘉子中西
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1992-12-28
Filing date: 1992-12-28
Publication date: 1994-07-22
Anticipated expiration: 2013-11-05
Also published as: JP2821328B2

Abstract

(57) [Abstract]
[Purpose] To obtain a parallel computer that processes data efficiently and thereby achieves improved processing capability.
[Constitution] When the instruction code OP specifies a read, the execution stage 6' uses the memory address comparator 10 to compare the address SMA held in the memory data storage unit 9 with the operation result MA, which is the read address calculated by the operation executor 13. When the match signal GET output by the memory address comparator 10 is at the H level, indicating that MA and SMA match, the same data that the next stage, the memory access stage 7, would read from the data memory 3 can be obtained by reading the data DATA, which is based on the data MD in the memory data storage unit 9.
[Effect] Pipeline processing can be performed more efficiently, and processing capability can be improved.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a parallel computer that executes instructions by pipelining, and more particularly to a pipelined reduced instruction set computer (RISC).

[0002]

[Prior Art] A RISC adopts a load/store architecture in which memory access is limited to load (read) instructions and store instructions. All arithmetic instructions operate on data stored in internal registers. A RISC generally has the following features.

[0003] (1) Each instruction is executed in one machine cycle.

[0004] (2) Instructions have a fixed length. (3) Only load and store instructions access memory; all other instructions operate on registers.

[0005] (4) Several instructions are processed in parallel at the same time by pipelining.

[0006] (5) Processing that raises performance is done in hardware, and complex processing is done in software.

[0007] FIG. 9 shows an example of the general functional configuration of a RISC. In FIG. 9, the RISC comprises an instruction memory 1 that stores instructions, a register file 2 consisting of a plurality of registers for temporarily storing data, a data memory 3 for storing data, and five pipeline stages 4 to 8.

[0008] The instruction fetch stage (IF) 4 includes a program counter (not shown) and supplies the address signal generated by the program counter to the instruction memory 1. The instruction specified by that address signal is fetched and sent to the instruction decode stage 5.

[0009] The instruction decode stage (ID) 5 receives an instruction from the instruction memory 1 via the instruction fetch stage 4 and decodes it. If the decode result shows that the instruction can be executed in the next stage, the execution stage 6, without waiting, the instruction decode stage 5 reads from the register file 2 the data that the execution stage 6 needs to execute the instruction and supplies that data to the execution stage 6.

[0010] The execution stage (EXC) 6 executes the instruction when the decoded instruction is an arithmetic instruction. When the decoded instruction is a memory access instruction (a load or store instruction), it calculates the effective address in the data memory 3 and supplies it to the memory access stage (MEM) 7.

[0011] The memory access stage 7 accesses the data memory 3 according to the effective address obtained from the execution stage 6 and writes or reads data.

[0012] The write-back stage (WB) 8 writes operation results and data read from the data memory 3 into the register file 2.

[0013] The RISC operates in response to an externally supplied two-phase non-overlapping clock signal. FIG. 10 shows an example of the two-phase non-overlapping clock signals φ1 and φ2. The RISC is pipelined and fetches a new instruction in each clock cycle. In the RISC shown in FIG. 9, five cycles are required to complete the execution of one instruction. However, because the RISC, as a parallel computer, is pipelined so that a new instruction can be started in every clock cycle, a new instruction can be started before the current instruction completes.

[0014] FIG. 11(a) shows the pipeline operation of the RISC. Instructions 1 to 3 are each fetched in the instruction fetch stage 4 and then pass through the instruction decode stage 5, the execution stage 6, the memory access stage 7, and the write-back stage 8.

[0015] As shown in FIG. 11(a), in cycle T2 the decoding of instruction 1 and the fetching of instruction 2 take place simultaneously. In cycle T3, the execution of instruction 1, the decoding of instruction 2, and the fetching of instruction 3 take place simultaneously. Because instructions are executed in parallel in this pipelined fashion, the machine as a whole can execute one instruction per machine cycle.
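The overlap described for FIG. 11(a) can be tabulated with a short sketch. The function name `stage_at` and the stage abbreviations are illustrative choices, not taken from the patent. The sketch assumes an ideal five-stage pipeline with no stalls, in which instruction n is fetched in cycle Tn.

```python
STAGES = ["IF", "ID", "EXC", "MEM", "WB"]  # pipeline stages 4 to 8 of FIG. 9


def stage_at(instruction, cycle):
    """Return the stage occupied by `instruction` in `cycle` (both 1-based), or None."""
    index = cycle - instruction  # assumption: instruction n enters IF in cycle Tn
    return STAGES[index] if 0 <= index < len(STAGES) else None


# Print one row per cycle, one column per instruction (1, 2, 3).
for cycle in range(1, 8):
    row = [stage_at(i, cycle) or "--" for i in (1, 2, 3)]
    print(f"T{cycle}: " + "  ".join(f"{s:>3}" for s in row))
```

In this model each instruction still takes five cycles from fetch to write-back, but one instruction leaves the pipeline per cycle once it is full.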

[0016] Memory access instructions in a RISC include, in addition to instructions that handle data in word units, instructions that handle data in byte or half-word units. In general, memory is read in units of one word (several bytes), and when the instruction operates on a byte or a half word, the unneeded data is discarded.

[0017]

[Problems to Be Solved by the Invention] Suppose that, as shown in FIG. 11(b), three instructions are supplied to the conventional RISC configured as above and processed in the order instruction 1 (a byte load (read) instruction), instruction 2 (a byte load (read) instruction), and instruction 3 (an arithmetic instruction). Suppose further that instruction 2 reads part of the data that instruction 1 read from the data memory 3, and that the data read by instruction 2 is used in the processing of instruction 3.

[0018] In this case, as shown in FIG. 11(b), instruction 1 is executed first and runs to completion. That is, instruction 1 is processed in the instruction fetch stage 4 in period T1 and in the instruction decode stage 5 in period T2. Instruction 1 is then processed in the execution stage 6, the memory access stage 7, and the write-back stage 8 in periods T3, T4, and T5, respectively.

[0019] Instruction 2, meanwhile, is processed in the instruction fetch stage 4 in period T2 and in the instruction decode stage 5 in period T3. It is processed in the execution stage 6, the memory access stage 7, and the write-back stage 8 in periods T4, T5, and T6, respectively.

[0020] Instruction 3 is processed in the instruction fetch stage 4 in period T3 and in the instruction decode stage 5 in period T4. From period T5 onward, however, its processing in the execution stage 6, the memory access stage 7, and the write-back stage 8 is suspended. As stated above, instruction 3 uses the data that instruction 2 reads from the data memory 3, so instruction 3 cannot execute until instruction 2 has finished, that is, until the write-back stage 8 has stored the data in the register file 2.

[0021] In other words, instruction 3 is in a wait state (pipeline interlock) during periods T5 and T6, and its execution resumes in period T7, the period following period T6 in which instruction 2 completes. The resumed instruction 3 is processed in the instruction decode stage 5 in period T7, and then in the execution stage 6, the memory access stage 7, and the write-back stage 8 in periods T8, T9, and T10, respectively.
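The interlock timing described for the conventional RISC can be reproduced with a small sketch. The function `schedule` and its parameters are hypothetical names introduced here for illustration. It assumes, as the text states, that the decode stage is repeated once the wait state ends.

```python
def schedule(fetch_cycle, wait_cycles=0):
    """Return a dict mapping cycle number -> stage name for one instruction.

    `wait_cycles` is the number of interlock cycles spent after the first decode.
    """
    timeline = {fetch_cycle: "IF", fetch_cycle + 1: "ID"}
    cycle = fetch_cycle + 2
    if wait_cycles:
        for _ in range(wait_cycles):
            timeline[cycle] = "wait"
            cycle += 1
        timeline[cycle] = "ID"  # decode is performed again when execution resumes
        cycle += 1
    for stage in ("EXC", "MEM", "WB"):
        timeline[cycle] = stage
        cycle += 1
    return timeline


instr2 = schedule(fetch_cycle=2)                 # completes in T6
instr3 = schedule(fetch_cycle=3, wait_cycles=2)  # waits in T5 and T6
print(instr3)
```

Without the interlock, instruction 3 would complete in period T7; with it, completion slips to period T10.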

[0022] Thus, instruction 1 and instruction 2 read the same data in succession, and even though the data that instruction 2 is to read has already been read from the data memory 3 by the execution of instruction 1, instruction 2 wastefully repeats the same read operation as instruction 1.

[0023] Furthermore, when instruction 3 uses the data read by instruction 2, that data must first be written to the register file 2 before instruction 3 can be executed in the execution stage 6, because instruction 3 obtains it by accessing the register file 2. Instruction 3 must therefore wait during periods T5 and T6, while the read processing of instruction 2 is in progress.

[0024] However, as shown in FIG. 11(b), the data to be read by instruction 2 has already been read from the data memory 3 by period T4. Nevertheless, because that data is discarded during the processing of instruction 2, instruction 3, which refers to it, must wait for the execution of instruction 2 to finish.

[0025] As a result, the extra wait time lengthens the time needed to complete the instructions and lowers the processing capability of parallel computers such as a RISC.

[0026] The present invention has been made to solve the above problems, and its object is to obtain a parallel computer that processes data efficiently and thereby achieves improved processing capability.

[0027]

[Means for Solving the Problems] The parallel computer according to claim 1 of the present invention comprises: a register; data storage means that stores data needed for instruction execution; instruction decoding means that receives an instruction code and decodes it; instruction execution means that executes the instruction determined from the decoding result of the instruction decoding means, using data stored in the register as needed, and that, when the instruction specifies a read from the data storage means, calculates and outputs a read address for the data storage means; external data reading means that, when the instruction specifies the read, reads the data stored at the read address from the data storage means; and register writing means that, when the instruction specifies the read, writes the read data to the register. The instruction decoding means, the instruction execution means, the external data reading means, and the register writing means can each operate in parallel in a pipelined manner. The parallel computer further comprises temporary storage means that, when a read from the data storage means is executed, stores the read address as a temporary storage address and the read data as temporary storage data. When the read address matches the temporary storage address, the instruction execution means can read the temporary storage data from the temporary storage means and execute the instruction.

[0028] Preferably, as in the parallel computer according to claim 2, when the read address matches the temporary storage address, the instruction execution means changes the read specification of the instruction so as to disable the read of the data storage means by the external data reading means.

[0029]

[Operation] When a read from the data storage means is executed, the temporary storage means of the parallel computer according to claim 1 stores the read address as a temporary storage address and the read data as temporary storage data. When the read address matches the temporary storage address, the instruction execution means can read the temporary storage data from the temporary storage means and execute the instruction.

[0030] Therefore, when a first instruction, executed earlier, specifies a read from the data storage means and a second instruction, executed later, specifies processing that uses the data read by the first instruction, the second instruction can be executed by reading the data held in the temporary storage means, without waiting for the data read by the first instruction to be stored in the register.

[0031]

[Embodiments]

<First Embodiment> FIG. 1 is a block diagram showing in detail the internal configuration of a RISC according to a first embodiment of the present invention. The RISC shown in FIG. 1 comprises an instruction memory 1 that stores instructions, a register file 2 consisting of a plurality of registers for temporarily storing data, a data memory 3 for storing data, and five pipeline stages: an instruction fetch stage 4, an instruction decode stage 5, an execution stage 6', a memory access stage 7, and a write-back stage 8. In addition, a memory data storage unit 9 is provided so as to be accessible from the execution stage 6'.

[0032] The RISC reads from the data memory 3 in units of one word (4 bytes). The RISC operates in response to the externally supplied two-phase non-overlapping clock signals φ1 and φ2. Their basic operation is the same as that of the conventional signals φ1 and φ2 shown in FIG. 10, so its description is omitted.

[0033] The instruction fetch stage 4 includes a program counter 40 and supplies the address signal generated by the program counter 40 to the instruction memory 1. The instruction fetch stage 4 fetches the instruction specified by that address signal and sends it to the instruction decode stage 5.

[0034] The instruction decode stage 5 receives an instruction 50 from the instruction memory 1 via the instruction fetch stage 4 and decodes it. If the decode result shows that the instruction can be executed in the next stage, the execution stage 6', without waiting, the instruction decode stage 5 reads from the register file 2, as needed, the data that the execution stage 6' uses to execute the instruction and supplies that data to the execution stage 6'.

[0035] The execution stage 6' comprises: a memory address comparator 10 that compares the operation result MA output from the operation executor 13 with the address SMA held in the memory data storage unit 9; a register file address comparator 11 that, according to the instruction code OP of the preceding instruction, compares the storage address D of the preceding instruction with each of the source addresses S31 and S32 and outputs the comparison results as selection signals S1 and S2; a register file data selector 12 that selects data in response to the selection signals S1 and S2 and the match signal GET; an operation executor 13 that performs an operation on the selected data; and registers 16, 19, and 21. The instruction 50 supplied from the decode stage 5 contains an instruction code OP, two source addresses S31 and S32, and a storage address D.

[0036] The register file address comparator 11 receives the storage address D of the preceding instruction via the register 19, together with the source addresses S31 and S32 of the current instruction. It compares each of the source addresses S31 and S32 with the storage address D, detects a match or mismatch, and outputs the selection signals S1 and S2 to the register file data selector 12. Details are given later.

[0037] The storage register 16 holds the instruction code OP supplied from the decode stage 5. The held instruction code OP is supplied to the register 17 in the memory access stage 7 and to the memory address comparator 10.

[0038] The storage register 19 holds the storage address D supplied from the decode stage 5. The held storage address D is supplied to the register file address comparator 11 and to the register 20 in the memory access stage 7.

[0039] The register file data selector 12 receives the two data items R1 and R2 supplied from the register file 2, and also receives the memory data DATA obtained from the memory address comparator 10. The register file data selector 12 operates in response to the selection signals S1 and S2 supplied from the register file address comparator 11 and to the match signal GET. Details are given later.

[0040] The operation executor 13 is connected to the register file data selector 12 via data buses 23 and 24 and uses the supplied data to perform the operation specified by the instruction code OP. The operation result MA is supplied to the register 21, the memory address comparator 10, and the data memory 3. Details are given later.

[0041] The memory address comparator 10 receives the instruction code OP, the memory address SMA and memory data MD stored in the memory data storage unit 9, and the operation result MA calculated by the operation executor 13. For a read, this operation result MA is the read address. The memory address comparator 10 compares the address SMA with the operation result MA and detects a match or mismatch. Details are given later.

[0042] The memory access stage 7 includes a register 17 for holding the instruction code, a register 20 for holding the storage address, and a data register 22 for holding the operation result data. In the write-back stage 8, the data in the register 22, which holds the execution result, is written to the register file 2 at the storage address supplied from the register 20.

[0043] FIG. 2 shows an example configuration of the register file address comparator 11 shown in FIG. 1. The register file address comparator 11 consists of a match detector 111 and a match detector 112.

[0044] The match detector 111 outputs to the register file data selector 12 a selection signal S1 that is at the H or L level according to whether the storage address D of the preceding instruction, obtained from the register 19, matches the source address S31. The match detector 112 outputs to the register file data selector 12 a selection signal S2 that is at the H or L level according to whether the storage address D of the preceding instruction matches the source address S32.
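The behavior of the two match detectors can be written as a minimal sketch. The function and parameter names are hypothetical, and the H and L levels are modelled as `True` and `False`, which is an assumption made for illustration.

```python
def register_file_address_compare(d_prev, s31, s32):
    """Return (S1, S2) for the register file address comparator 11.

    d_prev: storage address D of the preceding instruction (from register 19)
    s31, s32: source addresses of the current instruction
    """
    s1 = d_prev == s31  # match detector 111
    s2 = d_prev == s32  # match detector 112
    return s1, s2


# Example: the preceding instruction writes register 5 and the current
# instruction reads registers 5 and 2, so only S1 is at the H level.
print(register_file_address_compare(5, 5, 2))
```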

[0045] FIG. 3 shows an example configuration of the register file data selector 12. The register file data selector 12 comprises tri-state buffers 121 and 122, whose outputs are connected to the data bus 23; tri-state buffers 123 and 124, whose outputs are connected to the data bus 24; and AND gates 125 and 126.

[0046] The AND gate 125 receives the match signal GET from the memory address comparator 10, which indicates by H or L whether the desired data has been obtained, and the selection signal S1 from the register file address comparator 11, and outputs the result as a control signal GS1 to the tri-state buffers 121 and 122. The AND gate 126 receives the match signal GET and the selection signal S2 and outputs the result as a control signal GS2 to the tri-state buffers 123 and 124.

[0047] The tri-state buffer 121 is activated or deactivated according to the H or L level of the inverse of the control signal GS1 and receives at its input the data R1 supplied from the register file 2. The tri-state buffer 123 is activated or deactivated according to the H or L level of the inverse of the control signal GS2 and receives at its input the data R2 supplied from the register file 2.

[0048] The tri-state buffer 122 is activated or deactivated according to the H or L level of the control signal GS1 and receives at its input the data MD obtained from the memory data storage unit 9. The tri-state buffer 124 is activated or deactivated according to the H or L level of the control signal GS2 and receives at its input the data MD obtained from the memory data storage unit 9.

[0049] In the register file data selector 12 configured in this way, when the selection signal S1 and the match signal GET are both at the H level, the control signal GS1 goes to the H level, so the tri-state buffer 122 becomes active and outputs the data MD to the data bus 23. The tri-state buffer 121 is inactive at this time, so its output is in the high-impedance state. Likewise, when the selection signal S2 and the match signal GET are both at the H level, the control signal GS2 goes to the H level, so the tri-state buffer 124 becomes active and outputs the data MD to the data bus 24. The tri-state buffer 123 is inactive at this time, so its output is in the high-impedance state.

[0050] Conversely, when either the selection signal S1 or the match signal GET is at the L level, the control signal GS1 goes to the L level, so the tri-state buffer 121 becomes active and outputs the data R1 to the data bus 23. The tri-state buffer 122 is inactive at this time, so its output is in the high-impedance state. Likewise, when either the selection signal S2 or the match signal GET is at the L level, the control signal GS2 goes to the L level, so the tri-state buffer 123 becomes active and outputs the data R2 to the data bus 24. The tri-state buffer 124 is inactive at this time, so its output is in the high-impedance state.
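The selector behavior described above amounts to two independent two-way multiplexers. The following sketch models it under stated assumptions: H and L levels are `True` and `False`, the tri-state buffer pairs are collapsed into conditional expressions, and the forwarded memory data is passed in as a single value named `forwarded`. All names are hypothetical.

```python
def select_operands(r1, r2, forwarded, s1, s2, get):
    """Return the values driven onto data buses 23 and 24.

    r1, r2: data read from the register file 2
    forwarded: data supplied from the memory data storage side
    s1, s2: selection signals from the register file address comparator 11
    get: match signal GET from the memory address comparator 10
    """
    gs1 = get and s1  # AND gate 125
    gs2 = get and s2  # AND gate 126
    bus23 = forwarded if gs1 else r1  # buffer 122 drives when GS1 is H, otherwise buffer 121
    bus24 = forwarded if gs2 else r2  # buffer 124 drives when GS2 is H, otherwise buffer 123
    return bus23, bus24


print(select_operands(r1=1, r2=2, forwarded=99, s1=True, s2=False, get=True))
```

Because each control signal enables exactly one buffer of a pair, only one driver is active on each bus at any time.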

[0051] FIG. 4 shows an example configuration of the operation executor 13. The operation executor 13 consists of registers 131 and 132, which hold the data obtained via the data buses 23 and 24, respectively, and an arithmetic unit 133, which uses the data held in the registers 131 and 132 to perform the operation specified by the instruction code OP. Thus, when the instruction code OP specifies a read operation, the operation result MA of the arithmetic unit 133 is the read address.

[0052] FIG. 6 shows an example configuration of the memory address comparator 10. The memory address comparator 10 consists of a match detector 101 and a data creation unit 102.

[0053] The match detector 101 receives the memory address SMA stored in the memory data storage unit 9, the operation result MA calculated by the operation executor 13, and the instruction code OP. When the instruction code OP is a memory read instruction, the match detector 101 detects whether the held memory address SMA matches the operation result MA, which is the read address, and generates a match signal GET that goes to H or L according to the match or mismatch, where H indicates that the data has been obtained. In addition, based on what the instruction code OP specifies to be read, the match detector 101 outputs to the data creation unit 102 byte selection signals B1 to B4, each of which indicates by H or L whether the matching portion is the corresponding byte of the one-word (4-byte) data MD.
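One possible reading of the match detector 101 is sketched below. Several points are assumptions not stated in the text: only byte loads are modelled, the held address SMA is a word address with its two low-order bits cleared, and byte offset 0 corresponds to the upper byte (B1). The names `match_detect` and `WORD_MASK` are hypothetical.

```python
WORD_MASK = ~0x3  # assumption: 4-byte words, so the low 2 address bits pick the byte


def match_detect(is_load, sma, ma):
    """Return (GET, (B1, B2, B3, B4)) with H/L modelled as True/False.

    is_load: True when the instruction code OP is a memory read
    sma: word address held in the memory data storage unit 9, or None if empty
    ma: read address computed by the operation executor 13
    """
    get = is_load and sma is not None and (ma & WORD_MASK) == sma
    offset = ma & 0x3
    # assumption: offset 0 selects the upper byte of MD (B1), offset 3 the lower byte (B4)
    selects = tuple(get and offset == i for i in range(4))
    return get, selects


print(match_detect(True, 0x100, 0x101))
```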

[0054] The data creation unit 102 receives the byte selection signals B1 to B4 and the data MD stored in the memory data storage unit 9. For example, when the selection signal B1 is at the H level and the selection signals B2 to B4 are at the L level, it shifts the most significant byte of the data MD down to the least significant byte position to generate the data signal DATA. The match signal GET and the data signal DATA are supplied to the register file data selector 12.
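The byte extraction performed by the data creation unit 102 can be illustrated with a short sketch. Only the B1 case (most significant byte moved to the least significant position) is stated in the text. The handling of B2 to B4, the zero extension of the result, and the treatment of "all four signals high" as a full-word read are assumptions, as is the function name `create_data`.

```python
from typing import Sequence

WORD_MASK = 0xFFFFFFFF  # one word is 4 bytes


def create_data(md: int, byte_select: Sequence[bool]) -> int:
    """Build DATA from the held word MD and the byte selection signals B1..B4."""
    if len(byte_select) != 4:
        raise ValueError("expected exactly four byte selection signals")
    # Assumption: all four signals high means the whole word is wanted.
    if all(byte_select):
        return md & WORD_MASK
    for index, selected in enumerate(byte_select):
        if selected:
            # B1 (index 0) is the most significant byte, so it is shifted by 24 bits;
            # B4 (index 3) is already in the least significant position.
            shift = 8 * (3 - index)
            return (md >> shift) & 0xFF
    raise ValueError("no byte selected")
```

With `md = 0xAABBCCDD`, selecting B1 alone yields `0xAA` in the low byte, which corresponds to the example given in the paragraph above.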

[0055] FIG. 5 shows an example configuration of the memory data storage unit 9. The memory data storage unit 9 consists of an address memory 91, a primary memory that temporarily holds the address of a load (read) instruction, and a data memory 92, a primary memory that temporarily holds memory data. The address memory 91 receives the instruction code OP and the memory address MA. When the instruction code OP is a load (read) instruction, the address memory 91 holds the memory address MA with its low-order bits masked. When the instruction code OP is a store instruction, the address memory 91 is reset.

[0056] The data memory 92 is connected to the memory bus and holds the data LD output from the data memory 3 as the data MD. It supplies the held data MD to the memory address comparator 10.
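The memory data storage unit 9 can be modelled as a pair of single-entry latches, one for the address and one for the data. The sketch below is an illustration under stated assumptions: the class and method names are hypothetical, the masked low-order bits are taken to be the two bits that address a byte within a 4-byte word, and the reset state is represented by `None`.

```python
from typing import Optional


class MemoryDataStorage:
    """Single-entry model of the address memory and the data memory."""

    def __init__(self) -> None:
        self.sma: Optional[int] = None  # held address SMA (None = reset, an assumption)
        self.md: Optional[int] = None   # held word MD

    def observe_instruction(self, op: str, ma: int) -> None:
        """Update the address memory from the instruction code OP and address MA."""
        if op == "LOAD":
            # Hold the address with the assumed two low-order bits masked off.
            self.sma = ma & ~0x3
        elif op == "STORE":
            # A store resets the held address, as the text describes.
            self.sma = None
        # Other instructions leave the held address unchanged (assumption).

    def latch_data(self, ld: int) -> None:
        """Hold the word LD that the data memory put on the memory bus."""
        self.md = ld & 0xFFFFFFFF
```

A load from 0x1003 followed by `latch_data(0x11223344)` leaves `sma == 0x1000` and `md == 0x11223344`; a later store clears `sma` so that no subsequent read can match it.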

[0057] The operation of the first embodiment is described below with reference to FIG. 11(c). In the following description, the RISC shown in FIG. 1 is assumed to execute instruction 1 and instructions 2 and 3, which were described with reference to FIG. 11(b).

[0058] That is, three instructions are issued and processed in order: instruction 1 (a byte load (read) instruction), instruction 2 (a byte load (read) instruction), and instruction 3 (an arithmetic instruction). In addition, it is assumed that instruction 2 reads part of the data that instruction 1 read from the data memory 3, and that the data read by instruction 2 is used in the processing of instruction 3.

[0059] Instruction 1 is processed by the instruction fetch stage 4 in period T1 and by the instruction decode stage 5 in period T2. Its processing in the execution stage 6', the memory access stage 7, and the write-back stage 8 takes place in periods T3, T4, and T5, respectively.

[0060] Instruction 2, meanwhile, is processed by the instruction fetch stage 4 in period T2 and by the instruction decode stage 5 in period T3.

[0061] In period T4, the operation executor 13 calculates the operation result MA, which is the memory address, and the memory address comparator 10 compares it with the memory address SMA held in the memory data storage unit 9.

[0062] As a result, because the operation result MA matches the address SMA stored in the memory data storage unit 9, which is the memory address read by instruction 1, the memory address comparator 10 generates an H-level match signal GET.

[0063] Instruction 3 is processed by the instruction fetch stage 4 in period T3 and by the instruction decode stage 5 in period T4. By period T5, the data for instruction 2 is already available in the memory data storage unit 9. Therefore, in period T5 the operation executor 13 can perform the operation of instruction 3 in the execution stage 6' using the data DATA output from the memory address comparator 10. This point is described in detail below.

[0064] In the execution stage 6' of instruction 2, the memory address comparator 10 compares the memory address SMA stored in the memory data storage unit 9, which is the address of the data read by instruction 1, with the operation result MA, which is the memory address of instruction 2. The comparator refers to the instruction code OP of instruction 2; because OP is a load (read) instruction and the addresses match, it generates an H-level match signal GET. The memory address comparator 10 also extracts the required portion from the data stored in the memory data storage unit 9 and outputs it as the data DATA. The required data is thus transferred to the register file data selector 12 as the data DATA.

[0065] In the execution stage 6' of instruction 3, the register file address comparator 11 compares the storage (destination) address D of instruction 2 with the source addresses S31 and S32 contained in instruction 3 and outputs the selection signals S1 and S2. Because instruction 3 performs an operation that uses the data read by instruction 2, at least one of the selection signals S1 and S2 goes to the H level.

[0066] Further, the register file data selector 12 supplies the data DATA to a data bus in response to the selection signals S1 and S2 and the H-level match signal GET, which indicates that the memory data has been obtained. That is, because the selection signal S1 or S2 is at the H level and the match signal GET is at the H level, the selector supplies the data DATA to data bus 23 or data bus 24, depending on which selection signal is active.
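The interplay between the register file address comparator 11 and the register file data selector 12 can be sketched as two small functions. This is a simplified model, not the circuits of FIG. 2 and FIG. 3. The function names are hypothetical, the mapping of S1 to data bus 23 and S2 to data bus 24 is an assumption, and so is the fallback to the values read from the register file when no forwarding applies.

```python
from typing import Tuple


def compare_register_addresses(dest: int, src1: int, src2: int) -> Tuple[bool, bool]:
    """Return (S1, S2): whether each source address equals the earlier destination D."""
    return dest == src1, dest == src2


def select_bus_data(s1: bool, s2: bool, get: bool,
                    data: int, reg1: int, reg2: int) -> Tuple[int, int]:
    """Return the values driven onto data buses 23 and 24.

    DATA replaces a register file value only when the matching selection
    signal and the match signal GET are both at the H level (True).
    """
    bus23 = data if (s1 and get) else reg1
    bus24 = data if (s2 and get) else reg2
    return bus23, bus24


# Instruction 2 writes register 5; instruction 3 reads registers 5 and 7.
s1, s2 = compare_register_addresses(dest=5, src1=5, src2=7)
buses = select_bus_data(s1, s2, get=True, data=0xAA, reg1=0x11, reg2=0x22)
```

In this example only S1 is high, so the forwarded byte appears on the first bus while the second bus keeps the value read from the register file.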

[0067] Therefore, the data read by instruction 2 can be used immediately by instruction 3, and instruction 3 does not need the wait state required in the conventional example shown in FIG. 11(b). That is, referring to FIG. 11(c), the data read by instruction 2 is obtained by the end of period T4 and is supplied to instruction 3 as the data DATA, so the execution stage 6' can perform the operation of instruction 3 in period T5.
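The stall-free timing described for the first embodiment can be tabulated with a few lines of code. The sketch assumes one instruction issued per period and five stages labelled IF, ID, EX, MEM and WB, which are shorthand introduced here for stages 4, 5, 6', 7 and 8. Periods later than T5 follow from the same pattern and are an extrapolation, since the text only discusses T1 to T5.

```python
from typing import Dict

STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # shorthand for stages 4, 5, 6', 7 and 8


def schedule(num_instructions: int) -> Dict[int, Dict[str, int]]:
    """Return {instruction: {stage: period}} for a pipeline with no wait states."""
    return {
        instr: {stage: instr + k for k, stage in enumerate(STAGES)}
        for instr in range(1, num_instructions + 1)
    }


def format_table(table: Dict[int, Dict[str, int]]) -> str:
    """Render one row per instruction and one column per period T1, T2, ..."""
    last = max(max(periods.values()) for periods in table.values())
    rows = []
    for instr, periods in table.items():
        by_period = {period: stage for stage, period in periods.items()}
        cells = [by_period.get(t, "-").ljust(3) for t in range(1, last + 1)]
        rows.append(f"instruction {instr}: " + " ".join(cells))
    return "\n".join(rows)


timing = schedule(3)
print(format_table(timing))
```

In this model instruction 2 is in EX during T4 and instruction 3 is in EX during T5, which matches the sequence described above: the byte obtained at the end of T4 is consumed in T5 without a wait state.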

[0068] As a result, instruction 3 is processed efficiently, which improves the processing capability of the RISC.

[0069] <Second Embodiment> FIG. 7 is a block diagram showing in detail the internal configuration of a RISC according to a second embodiment of the present invention. As shown in the figure, the execution stage 6'' of the second embodiment has, in addition to the components of the first embodiment, an instruction changer 25 that changes the instruction in response to the match signal GET, which indicates that the data has been obtained. FIG. 8 shows an example configuration of the instruction changer 25. The instruction changer 25 consists of a register 251, which holds a register file write instruction, and a selection circuit 252. The selection circuit 252 receives the match signal GET from the memory address comparator 10 and the instruction code OP from the register 16, and is controlled by the match signal GET. For example, when the match signal GET is at the H level, the selection circuit 252 selects the instruction code held in the register 251, which invalidates the read instruction (the change instruction), and writes it to the register 17 of the memory access stage 7. When the match signal GET is at the L level, it writes the instruction code OP to the register 17 unchanged.

[0070] The rest of the configuration and the basic operation are the same as in the first embodiment, so their description is omitted. Only the operation that differs from the first embodiment is described below.

[0071] In response to the match signal GET, the instruction changer 25 selects either the instruction code OP obtained from the register 16 or the change instruction stored in the register 251, which invalidates the read instruction. That is, when the match signal GET is at the H level, the selection circuit 252 selects the change instruction held in the register 251 and writes it to the register 17.

[0072] As a result, when a read instruction such as instruction 2 has already obtained its data from the memory data storage unit 9 in the execution stage 6'', the instruction can be changed so that the following memory access stage 7 does not read the data memory 3 again, which prevents a needless memory access.
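The effect of the instruction changer 25 on the following stage can be sketched as a two-way selection. The encoding `REG_WRITE_ONLY` for the instruction held in register 251 is hypothetical, as are the function names and the simplified memory access stage, which is assumed to pass the already obtained data through unchanged when it receives the changed instruction.

```python
from typing import Dict, Tuple

CHANGE_INSTRUCTION = "REG_WRITE_ONLY"  # hypothetical encoding of the instruction in register 251


def instruction_changer(get: bool, op: str) -> str:
    """Model of selection circuit 252: choose what is written to register 17."""
    # GET at the H level means the data was already obtained in the execution stage.
    return CHANGE_INSTRUCTION if get else op


def memory_access_stage(op: str, address: int,
                        data_memory: Dict[int, int], data: int) -> Tuple[int, bool]:
    """Return (data passed to write-back, whether the data memory was read)."""
    if op == "LOAD":
        return data_memory[address], True
    # Assumption: the changed instruction carries the obtained data forward
    # without touching the data memory.
    return data, False


memory = {0x1000: 0x11223344}
hit = memory_access_stage(instruction_changer(True, "LOAD"), 0x1000, memory, 0x11)
miss = memory_access_stage(instruction_changer(False, "LOAD"), 0x1000, memory, 0)
```

Here `hit` reports that no memory access took place, while `miss` falls back to the ordinary read, mirroring the H and L cases of GET described above.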

【００７３】[0073]

[Effects of the Invention] In the parallel computer according to claim 1 of the present invention, when a read from the data storage means is executed, the temporary storage means stores the read address as a temporary storage address and the read data as temporary storage data, and the instruction execution means can read the temporary storage data from the temporary storage means and execute the instruction when the read address matches the temporary storage address.

[0074] Therefore, when a first instruction, executed earlier, directs a read from the data storage means and a second instruction, executed later, directs processing that uses the data read by the first instruction, the second instruction can be executed by reading the data stored in the temporary storage means, without waiting for the data read by the first instruction to be stored in the register.

[0075] As a result, the time required to process the second instruction is shorter than in the conventional design, so data is processed efficiently and processing capability improves.

[0076] Also, when a first instruction, executed earlier, directs a read from a first address of the data storage means and a second instruction, executed later, directs a read from the same first address, the second instruction can read the data stored in the temporary storage means without accessing the data storage means. Accordingly, when the read address matches the temporary storage address (that is, when the read from the temporary storage means succeeds), the instruction execution means of the parallel computer according to claim 2 changes the instruction given to the external data reading means so as to invalidate the read instruction. This eliminates the wasted read from the data storage means by the external data reading means.

[0077] Cases in which the first and second instructions read the same address arise relatively often in parallel computers such as a RISC, in which even an instruction that reads data in byte or half-byte units always reads a multi-byte word and extracts the required part. The invention is therefore very effective at improving processing efficiency in such machines.

[Brief description of drawings]

FIG. 1 is a block diagram showing the internal configuration of a RISC according to a first embodiment of the present invention.

FIG. 2 is a block diagram showing an example configuration of the register file address comparator shown in FIG. 1.

FIG. 3 is a circuit diagram showing an example configuration of the register file data selector shown in FIG. 1.

FIG. 4 is an explanatory diagram showing an example configuration of the operation executor shown in FIG. 1.

FIG. 5 is an explanatory diagram showing an example internal configuration of the memory data storage unit shown in FIG. 1.

FIG. 6 is a block diagram showing an example configuration of the memory address comparator shown in FIG. 1.

FIG. 7 is a block diagram showing the internal configuration of a RISC according to a second embodiment of the present invention.

FIG. 8 is an explanatory diagram showing an example configuration of the instruction changer shown in FIG. 7.

FIG. 9 is a block diagram showing the configuration of a conventional RISC.

FIG. 10 is a timing diagram showing the two-phase clock that controls the RISC.

FIG. 11 is an explanatory diagram showing the progress of pipeline processing in the conventional RISC and in the RISC of the embodiments.

[Explanation of symbols]

2: register file
3: data memory
6': execution stage
6'': execution stage
9: memory data storage unit
10: memory address comparator
11: register file address comparator
12: register file data selector
13: operation executor
25: instruction changer

Claims

[Claims]

1. A parallel computer comprising: a register; data storage means for storing data necessary for executing instructions; instruction decoding means for receiving an instruction code and decoding the instruction code; instruction execution means for executing an instruction, based on the decoding result of the instruction decoding means, using the data stored in the register, and for calculating and outputting a read address of the data storage means when the instruction directs a read from the data storage means; external data reading means for reading the data stored at the read address from the data storage means when the instruction directs the read; and register writing means for writing the read data to the register when the instruction directs the read; wherein the instruction decoding means, the instruction execution means, the external data reading means, and the register writing means can each operate in parallel in a pipelined manner; the parallel computer further comprising temporary storage means which, when a read from the data storage means is executed, stores the read address as a temporary storage address and the read data as temporary storage data; wherein the instruction execution means can read the temporary storage data from the temporary storage means and execute the instruction when the read address matches the temporary storage address.

2. The parallel computer according to claim 1, wherein, when the read address matches the temporary storage address, the instruction execution means changes the read instruction so as to invalidate the read processing performed on the data storage means by the external data reading means.