JPH06332700A - Information processing equipment

Info

Publication number: JPH06332700A
Application number: JP5122524A
Authority: JP
Inventors: Kozo Kimura; 浩三木村; Hiroaki Hirata; 博章平田
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1993-05-25
Filing date: 1993-05-25
Publication date: 1994-12-02

Abstract

(57)【要約】【目的】複数のレジスタファイルを有するプロセッサ
において、レジスタファイルの一部を共用することで、
ハードウェア量を削減しコストを低減でき、高速通信機
能を有し高性能化を図れる情報処理装置を提供すること
を目的とする。【構成】複数の解読ユニット11,13と、複数の演算実
行を行なうロードストアユニット20、整数演算ユニット
21、浮動小数点ユニット部22と、各命令流に対応してア
クセスできるレジスタファイル25,26と、どの命令流か
らでもアクセスできるレジスタ27を備え、命令を解読ユ
ニット11,13で解読する際に、レジスタの番号によって
命令流独自のレジスタ25,26にアクセスするか共用レジ
スタ27にアクセスするかを決定し、それに応じて演算ユ
ニットに対応するデータを出力する。
(57) [Abstract] [Purpose] To provide an information processing apparatus that, in a processor having a plurality of register files, shares part of the register files and thereby reduces the amount of hardware and the cost, while offering a high-speed communication function and higher performance. [Configuration] The apparatus comprises a plurality of decoding units 11 and 13; a load/store unit 20, an integer operation unit 21, and a floating-point unit 22 that execute a plurality of operations; register files 25 and 26, each accessible by its corresponding instruction stream; and a register 27 accessible from any instruction stream. When the decoding units 11 and 13 decode an instruction, the register number determines whether the stream-specific registers 25 and 26 or the shared register 27 is accessed, and the corresponding data is output to the arithmetic units accordingly.

Description

Detailed Description of the Invention

【０００１】[0001]

【産業上の利用分野】本発明は複数の命令ストリームの
命令を並列に発行することによって、複数の演算ユニッ
トを効率よく使用する情報処理装置に関する。
[Industrial Field of Application] The present invention relates to an information processing apparatus that uses a plurality of arithmetic units efficiently by issuing instructions from a plurality of instruction streams in parallel.

【０００２】[0002]

【従来の技術】従来の情報処理装置の例としては、一つ
のプロセッサ内で複数の命令流を同時に処理するマルチ
スレッド・プロセッサがある。この方式については、
「アマルチスレテッドプロセッサーアーキテクチ
ャーウイズシマルテイニアスインストラクション
イシュイング」（ "Ａ Multithreaded Processor Ａrch
itecture with Simultaneous Instruction Issuing," I
n Proc. of ISS'91:International Symposium on Super
computing, Fukuoka, Japan, pp.87-96, November 1991
）に詳細に述べられている。
[Prior Art] One example of a conventional information processing apparatus is a multithreaded processor, which processes a plurality of instruction streams simultaneously within a single processor. This approach is described in detail in "A Multithreaded Processor Architecture with Simultaneous Instruction Issuing," in Proc. of ISS'91: International Symposium on Supercomputing, Fukuoka, Japan, pp. 87-96, November 1991.

【０００３】この従来の情報処理装置の構成を図３に示
す。図３において、２００は命令キャッシュ、２０１は
命令フェッチユニット、２０２は解読ユニット、２０３
はスタンバイステーション、２０４は命令スケジュール
ユニット、２０５は機能ユニット、２０６はレジスタセ
ットである。以上のように構成された従来例の情報処理
装置について、その動作を説明する。
FIG. 3 shows the configuration of this conventional information processing apparatus. In FIG. 3, 200 is an instruction cache, 201 is an instruction fetch unit, 202 is a decoding unit, 203 is a standby station, 204 is an instruction scheduling unit, 205 is a functional unit, and 206 is a register set. The operation of the conventional apparatus configured in this way is described below.

【０００４】まず、命令フェッチユニット２０１はそれ
ぞれ異なる命令流の命令を命令キャッシュ２００から読
み込む。解読ユニット２０２はそれぞれの命令流の命令
を解読し、命令を処理可能な機能ユニット２０５に接続
されているスタンバイステーション２０３に格納する。
命令スケジューリングユニット２０４はスタンバイステ
ーション２０３から適当な命令を選択し、機能ユニット
２０５に送る。機能ユニット２０５はレジスタセット２
０６を用いて実行する。このプロセッサの特徴は複数の
命令流を演算器を共有して実行することである。
First, the instruction fetch units 201 each read instructions of a different instruction stream from the instruction cache 200. The decoding units 202 decode the instructions of their respective streams and store each instruction in a standby station 203 connected to a functional unit 205 capable of processing it. The instruction scheduling unit 204 selects a suitable instruction from the standby stations 203 and sends it to a functional unit 205. The functional unit 205 executes it using the register set 206. The distinguishing feature of this processor is that multiple instruction streams are executed on shared arithmetic units.

【０００５】既存のスーパースカラ処理方式のプロセッ
サは機能ユニットのみの多重化（複数化）のため、同時
に処理可能な命令ストリームは１つで、命令間の依存関
係によりパイプラインインタロックが頻繁に発生する。
その結果、機能ユニットの使用効率は上がらず性能向上
が困難であった。従来例のプロセッサでは複数の命令ス
トリームの命令を並列に実行することにより命令レベル
の並列性を増加し、各機能ユニットの使用効率を上げ、
性能向上を実現できる。
Because existing superscalar processors replicate only the functional units, they can process just one instruction stream at a time, and pipeline interlocks occur frequently because of dependencies between instructions. As a result, functional-unit utilization stays low and performance is difficult to improve. The conventional processor described above executes instructions from multiple instruction streams in parallel, which increases instruction-level parallelism, raises the utilization of each functional unit, and improves performance.

【０００６】[0006]

【発明が解決しようとする課題】しかしながら上記従来
の構成では、下記の問題点を有していた。まず第１に
は、各命令流に対応するレジスタファイルが独立に存在
するため、ハードウェア量が大きくプロセッサの面積が
巨大化してしまうことである。これよりＬＳＩ製造時の
歩留まりが下がりプロセッサのコストが増加してしま
う。例えば、画像生成において、画面を複数に分割して
同時に生成しようとする場合、このプロセッサの命令流
に同じ画像生成のプログラムを走らせ、データはそれぞ
れ分割した画面用のデータを割り付けることが可能で有
る。この時、各命令流で共通なデータを各レジスタに持
つような場合、各命令流に対応するレジスタに共通して
同じデータを格納するのは無駄である。
[Problems to Be Solved by the Invention] However, the conventional configuration described above has the following problems. First, because a separate register file exists for each instruction stream, the amount of hardware is large and the processor area becomes very large. This lowers the yield in LSI manufacturing and raises the cost of the processor. For example, when an image is generated by dividing the screen into several regions and generating them simultaneously, the same image generation program can be run on each instruction stream of this processor, with each stream assigned the data for its own screen region. If data common to all the instruction streams is then held in registers, storing the same data in the registers of every instruction stream is wasteful.

【０００７】第２に各命令流間で高速な通信ができない
ことである。もちろん、外部メモリを使用した場合には
可能であるが、メモリアクセスに伴うオーバーヘッドを
生じてしまう。
Second, high-speed communication between instruction streams is not possible. Communication is of course possible through external memory, but it incurs the overhead of memory accesses.

【０００８】本発明は上記問題点に鑑み、複数の命令流
を同時実行するため、またはコンテキストの切り替えを
高速にするため複数のレジスタファイルを有するプロセ
ッサにおいて、ハードウェア量を削減しコストを低減す
る、および命令流間の高速通信機能を有し高性能化を実
現する情報処理装置を提供することを目的とする。
In view of the above problems, an object of the present invention is to provide an information processing apparatus that, in a processor having a plurality of register files for executing a plurality of instruction streams simultaneously or for speeding up context switching, reduces the amount of hardware and the cost, and provides a high-speed communication function between instruction streams to achieve higher performance.

【０００９】[0009]

【課題を解決するための手段】上記目的を達するため、
第１の発明の情報処理装置は、コンテキストが独立であ
る複数のレジスタファイルの一部を共用することを特徴
としている。
[Means for Solving the Problems] To achieve the above object, the information processing apparatus of the first invention is characterized in that a plurality of register files with independent contexts share a part of their registers.

【００１０】第２の発明の情報処理装置は、一部を共用
する複数のレジスタファイルと、演算ユニットと複数の
レジスタファイルを接続するオペランド用スイッチを備
え、オペランド用スイッチは、レジスタファイルからデ
ータを読み出す際に、レジスタ番号によって共用してい
る部分か否かを判断し、それに応じて演算ユニットとレ
ジスタファイルを接続することを特徴としている。
The information processing apparatus of the second invention comprises a plurality of register files that share a part of their registers, and an operand switch that connects the arithmetic units to the register files. When data is read from a register file, the operand switch determines from the register number whether the shared part is being accessed, and connects the arithmetic unit to the register file accordingly.

【００１１】第３の発明の情報処理装置は、一部を共用
する複数のレジスタファイルと、演算ユニットと複数の
レジスタファイルを接続するオペランド用スイッチを備
え、オペランド用スイッチは、レジスタファイルからデ
ータを書き込む際に、レジスタ番号によって共用してい
る部分か否かを判断し、それに応じて演算ユニットとレ
ジスタファイルを接続することを特徴としている。
The information processing apparatus of the third invention comprises a plurality of register files that share a part of their registers, and an operand switch that connects the arithmetic units to the register files. When data is written to a register file, the operand switch determines from the register number whether the shared part is being accessed, and connects the arithmetic unit to the register file accordingly.

【００１２】第４の発明の情報処理装置は、一部を共用
する複数のレジスタファイルと、演算ユニットと複数の
レジスタファイルを接続するオペランド用スイッチと、
レジスタ読み出しアービターを備え、レジスタ読み出し
アービターは、レジスタファイルからデータを読み出す
際に、レジスタ番号によって共用している部分か否かを
判断し、共用レジスタのポート数以上に読み出し要求が
ある場合には、ポート数の範囲で実行できるように命令
発行を制御すると共に、オペランド用スイッチに対して
演算ユニットとレジスタファイルを接続させることを特
徴としている。
The information processing apparatus of the fourth invention comprises a plurality of register files that share a part of their registers, an operand switch that connects the arithmetic units to the register files, and a register read arbiter. When data is read from a register file, the register read arbiter determines from the register number whether the shared part is being accessed. If the number of read requests exceeds the number of ports of the shared register, the arbiter controls instruction issue so that execution stays within the available ports, and causes the operand switch to connect the arithmetic units to the register files.
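The port-limiting behavior of the read arbiter can be sketched as follows. This is only an illustration under stated assumptions: the function name `arbitrate_shared_reads`, the fixed priority order of the request list, and the rule that a stalled instruction simply waits are hypothetical, since the text does not specify how the arbiter chooses which instruction to hold back. The 16-register shared range (R0 to R15) is borrowed from the embodiment.

```python
def arbitrate_shared_reads(requests, shared_ports, shared_count=16):
    """Decide which instructions may issue this cycle.

    requests: list of (stream, [source register numbers]) in priority
              order (the priority order is an assumption of this sketch).
    shared_ports: number of read ports on the shared register file.
    Returns (issued_streams, stalled_streams).
    """
    ports_left = shared_ports
    issued, stalled = [], []
    for stream, sources in requests:
        # Only register numbers in the shared range consume shared ports.
        needed = sum(1 for reg in sources if reg < shared_count)
        if needed <= ports_left:
            ports_left -= needed
            issued.append(stream)
        else:
            stalled.append(stream)   # issue is held back to a later cycle
    return issued, stalled


# Stream A reads two shared registers, stream B reads one shared register.
print(arbitrate_shared_reads([("A", [3, 4]), ("B", [6, 21])], shared_ports=2))
# (['A'], ['B'])
```

With three shared read ports both instructions would issue in the same cycle, and requests that touch only private registers never compete for shared ports.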

【００１３】第５の発明の情報処理装置は、一部を共用
する複数のレジスタファイルと、演算ユニットと複数の
レジスタファイルを接続するオペランド用スイッチと、
レジスタ書き込みアービターを備え、レジスタ書き込み
アービターは、演算ユニットからレジスタファイルにデ
ータを書き込む際に、レジスタ番号によって共用してい
る部分か否かを判断し、共用レジスタのポート数以上に
書き込み要求がある場合には、ポート数の範囲で実行で
きるように演算ユニットを制御すると共に、オペランド
用スイッチに対して演算ユニットとレジスタファイルを
接続させる構成である。
The information processing apparatus of the fifth invention comprises a plurality of register files that share a part of their registers, an operand switch that connects the arithmetic units to the register files, and a register write arbiter. When data is written from an arithmetic unit to a register file, the register write arbiter determines from the register number whether the shared part is being accessed. If the number of write requests exceeds the number of ports of the shared register, the arbiter controls the arithmetic units so that execution stays within the available ports, and causes the operand switch to connect the arithmetic units to the register files.

【００１４】[0014]

【作用】第１、第２、第３の発明に係る情報処理装置に
おいては、レジスタの一部を共用することにより、ハー
ドウェア量を削減しコストを低減できる、とともに各命
令流で同一のレジスタをアクセスできることにより、高
速通信機能を有し高性能化を実現できる。
[Operation] In the information processing apparatus according to the first, second, and third inventions, sharing part of the registers reduces the amount of hardware and the cost, and because every instruction stream can access the same registers, a high-speed communication function is available and higher performance can be achieved.

【００１５】第４、第５の発明に係る情報処理装置にお
いては、レジスタに対して書き込みおよび読み出しアー
ビターを設けることにより、共用レジスタのポート数を
削減でき、よりハードウェア量を削減できる。また、レ
ジスタを共用する場合としない場合をプログラム毎に自
由に使い分けることが可能になり、高性能化を図ること
が可能になる。
In the information processing apparatus according to the fourth and fifth inventions, providing write and read arbiters for the registers makes it possible to reduce the number of ports of the shared register, which reduces the amount of hardware further. In addition, each program can freely choose whether or not to share registers, which makes higher performance possible.

【００１６】[0016]

【実施例】以下本発明に係る情報処理装置の一実施例に
ついて、図面を参照しながら説明する。図１は本発明の
第１の実施例における情報処理装置の構成を示すもので
ある。
[Embodiments] An embodiment of the information processing apparatus according to the present invention is described below with reference to the drawings. FIG. 1 shows the configuration of the information processing apparatus according to the first embodiment of the present invention.

【００１７】図１において、１０はフェッチしてくる命
令のアドレス計算およびアドレス管理を行なう命令フェ
ッチ制御部、１１及び１３は命令を解読する解読ユニッ
ト、１２及び１４は解読ユニット１１及び１３と対応し
て解読中の命令のアドレスを格納する解読プログラムカ
ウンタ（以下解読ＰＣと略）である。２０、２１および
２２は、演算部を構成する演算ユニットである。演算ユ
ニットの機能は特に限定する必要はないが、説明を簡単
にするため、かつ一般的な構成を考えると、２０は、メ
モリアクセス命令を処理するロードストアユニット、２
１は、整数演算を処理する整数演算ユニット、２２は、
浮動小数点の四則や整数と浮動小数点間の変換を行なう
浮動小数点ユニットである。演算ユニットは命令流のＩ
Ｄ番号も保持している。
In FIG. 1, 10 is an instruction fetch control unit that calculates and manages the addresses of the instructions to be fetched; 11 and 13 are decoding units that decode instructions; and 12 and 14 are decode program counters (hereinafter, decode PCs) that correspond to decoding units 11 and 13 and store the address of the instruction being decoded. 20, 21, and 22 are the arithmetic units that make up the arithmetic section. The functions of the arithmetic units need not be restricted, but to simplify the description and reflect a typical configuration, 20 is a load/store unit that processes memory access instructions, 21 is an integer operation unit that processes integer operations, and 22 is a floating-point unit that performs the four basic floating-point arithmetic operations and conversions between integer and floating-point values. Each arithmetic unit also holds the ID number of the instruction stream.

【００１８】２３は、命令の解読結果であるオペレーシ
ョン及び即値などを処理対象の演算ユニットに接続する
オペレーション用スイッチ、２４はレジスタファイル２
５、２６および２７から出力されたオペランドを演算ユ
ニット２０、２１、および２２に接続するオペランドス
イッチ、２５、２６および２７はレジスタファイルであ
る。レジスタファイルＡ２５は解読ユニットＡ１１に対
応し、レジスタファイルＢ２６は解読ユニットＢ１３に
対応する。共用レジスタ２７は解読ユニットＡ１１およ
び解読ユニットＢ１３の両方に対応する。
23 is an operation switch that connects the operations, immediate values, and other results of instruction decoding to the arithmetic unit that is to process them; 24 is an operand switch that connects the operands output from register files 25, 26, and 27 to arithmetic units 20, 21, and 22; and 25, 26, and 27 are register files. Register file A25 corresponds to decoding unit A11, and register file B26 corresponds to decoding unit B13. The shared register 27 corresponds to both decoding unit A11 and decoding unit B13.

【００１９】２８は演算結果を対応するレジスタに接続
する書き込み用スイッチ、２９はスコアボード３０の状
態と命令よりデータ依存が発生しているか否かを調べる
データ依存チェック部、３０は命令実行中でデータが確
定していないレジスタの番号を示すスコアボードであ
る。
28 is a write switch that connects operation results to the corresponding registers; 29 is a data dependence check unit that determines, from the state of the scoreboard 30 and the instruction, whether a data dependence exists; and 30 is a scoreboard that records the numbers of registers whose data is not yet determined because an instruction is still executing.

【００２０】説明を分かりやすくするために、同じ機能
を持つ機構（命令バッファや解読ユニットなど）につい
ては図１に示すように名前の最後に適当なアルファベッ
トを付加して区別する。同じアルファベットが付加され
たものは、同一の命令流を扱うと考えてよい。命令解読
ユニット１１および１３をそれぞれ解読ユニットＡ１
１、解読ユニットＢ１３とする。レジスタファイル２５
および２６については、それぞれレジスタファイルＡ２
５、レジスタファイルＢ２６とする。
To make the description easier to follow, mechanisms with the same function (instruction buffers, decoding units, and so on) are distinguished by appending a letter to their names, as shown in FIG. 1. Mechanisms with the same letter handle the same instruction stream. Instruction decoding units 11 and 13 are called decoding unit A11 and decoding unit B13, respectively. Register files 25 and 26 are called register file A25 and register file B26, respectively.

【００２１】まず、動作を説明する前に、データ（命令
も含む）や構成に関する前提条件について述べる。命令
流（スレッド）は２個とし説明を分かりやすくするため
に、命令流Ａ、命令流Ｂとする。それに対応する解読ユ
ニットを解読ユニットＡ、解読ユニットＢおよび対応す
るレジスタファイルをレジスタファイルＡ、レジスタフ
ァイルＢとする。従って、同時実行の可能な命令流は２
個である。各命令流はそれぞれ３２本のレジスタ（R0〜
R31とする）をコンテクストとして持つが、そのうち命
令流に独立に持つものは１６本（R16〜R31）、命令流に
共通に持つものが共用レジスタ１６本（R0〜R15）とす
る。
Before the operation is described, the assumptions about data (including instructions) and configuration are stated. There are two instruction streams (threads), called instruction stream A and instruction stream B for clarity. The corresponding decoding units are decoding unit A and decoding unit B, and the corresponding register files are register file A and register file B. Two instruction streams can therefore execute simultaneously. Each instruction stream has 32 registers (R0 to R31) as its context; of these, 16 (R16 to R31) are private to each instruction stream, and the other 16 (R0 to R15) are shared registers common to the instruction streams.
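The register split assumed above can be sketched as a small lookup. This is a minimal illustration, not the patent's circuit: the function names and the string labels `shared27`, `fileA25`, and `fileB26` are hypothetical names chosen to match the reference numerals in FIG. 1.

```python
NUM_REGS = 32        # R0..R31 visible to each instruction stream
SHARED_COUNT = 16    # R0..R15 are shared; R16..R31 are private


def select_register_file(stream: str, reg: int) -> str:
    """Return which physical register file a register number maps to."""
    if not 0 <= reg < NUM_REGS:
        raise ValueError(f"register number out of range: R{reg}")
    if reg < SHARED_COUNT:
        return "shared27"                       # same file for every stream
    return {"A": "fileA25", "B": "fileB26"}[stream]


def physical_register_count(num_streams: int) -> int:
    """Physical registers needed when R0..R15 are shared by all streams."""
    return SHARED_COUNT + num_streams * (NUM_REGS - SHARED_COUNT)


print(select_register_file("A", 20))   # fileA25
print(select_register_file("B", 6))    # shared27
print(physical_register_count(2))      # 48, versus 2 * 32 = 64 without sharing
```

Because the decision depends only on the register number, the instruction encoding needs no extra field to name the shared registers.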

【００２２】演算ユニットはパイプライン化されている
が、パイプライン段数を始め詳細な構成等については本
発明とは直接関係しないので特に規定しない。また、命
令解読ユニットは命令流１個につき１命令を解読し、一
度に発行できる命令も１個とする。ロードストアユニッ
ト２０および命令フェッチ制御部１０はメモリに接続さ
れている。
The arithmetic units are pipelined, but details such as the number of pipeline stages are not directly related to the present invention and are not specified. Each instruction decoding unit decodes one instruction per instruction stream at a time and can issue only one instruction at a time. The load/store unit 20 and the instruction fetch control unit 10 are connected to memory.

【００２３】演算ユニットでは命令流Ａまたは命令流Ｂ
の命令が実行されており、処理する命令が無い場合には
アイドル状態になっている。スコアボード３０では実行
中により結果が確定していないレジスタ番号が命令流毎
に管理されている。但し、スレッドに共用されている共
用レジスタ２７については、スコアボード３０も共用で
あり、命令流Ａおよび命令流Ｂどちらからもセットリセ
ット可能である。詳細な内容については実施例の動作で
も機能を説明する。
The arithmetic units execute instructions of instruction stream A or instruction stream B, and are idle when there is no instruction to process. The scoreboard 30 tracks, for each instruction stream, the numbers of registers whose results are not yet determined because execution is in progress. For the shared register 27, which is shared by the threads, the scoreboard 30 is also shared, and its entries can be set and reset from both instruction stream A and instruction stream B. Its function is explained further in the description of the operation of the embodiment.

【００２４】以上のように構成された本実施例の情報処
理装置において、まずレジスタＡ、レジスタＢを使用す
る場合について、以下図１を用いて、その動作を説明す
る。命令フェッチ制御部１０は命令流Ａおよび命令流Ｂ
の命令をフェッチし、解読ユニットＡ１１、解読ユニッ
トＢ１３に転送する。この命令フェッチはそれぞれの命
令流を交互に取りに行き、バッファリングしておけばよ
い。この機能およびバッファ等は本発明とは特に関係し
ないので省略する。また、分岐が発生した場合には分岐
先命令のアドレス計算や分岐先命令のフェッチなどの動
作もあるが、本発明とは特に関係しないので省略する。
In the information processing apparatus of this embodiment configured as described above, the operation when register A and register B (that is, register files A25 and B26) are used is described first, with reference to FIG. 1. The instruction fetch control unit 10 fetches the instructions of instruction stream A and instruction stream B and transfers them to decoding unit A11 and decoding unit B13. The instruction fetch can simply alternate between the instruction streams and buffer the fetched instructions. This function and the buffers are not particularly related to the present invention, so their description is omitted. When a branch occurs, operations such as calculating the address of the branch target instruction and fetching it are also performed, but these too are not particularly related to the present invention and are omitted.

【００２５】解読ユニットＡ１１および解読ユニットＢ
１３より以下のステージの説明は動作を判り易くするた
めに、命令流Ａおよび命令流Ｂの命令から動作を説明す
る。解読ユニットＡ１１は命令フェッチ制御部１０から
ＬＯＡＤ命令（メモリからレジスタへのロード命令：ME
M(R19)→R20）を取り出し、解読ユニットＢ１３は命令
フェッチ制御部１０からＡＤＤ命令（レジスタ間加算命
令：R20 + R21→R22）を取り出し、それぞれ解読しオペ
レーションを作成するとともに、そのオペレーションを
処理すべき演算ユニットを決定する。
To make the operation easier to follow, the stages from decoding unit A11 and decoding unit B13 onward are described using concrete instructions from instruction stream A and instruction stream B. Decoding unit A11 takes a LOAD instruction (a load from memory to a register: MEM(R19)→R20) from the instruction fetch control unit 10, and decoding unit B13 takes an ADD instruction (a register-to-register addition: R20 + R21→R22) from the instruction fetch control unit 10. Each unit decodes its instruction, creates an operation, and determines the arithmetic unit that should process that operation.

【００２６】同時に同一命令流内でデータ依存関係が発
生していないかをチェックする。依存関係のチェック
は、スコアボード３０とデータ依存チェック部２９が行
なう。具体的には現在実行中のためにレジスタの値が確
定していないレジスタ番号がスコアボードに登録されて
おり、解読ユニットからこれから読み出すレジスタ番号
と比較し、一致すればデータ依存発生を解読ユニットに
返す。これから実行する命令が、結果を反映していない
レジスタの値を使用することを防ぐためである。レジス
タ番号の登録は解読ユニット１１および１３が命令を演
算部に発行するときに登録し、レジスタ番号の解除は各
演算ユニット２０、２１および２２が命令実行の終了と
ともに行なう。データ依存関係については信学技報ＣＰ
ＳＹ−９０−５４（’９０．７）「ＳＩＭＰ（単一命令
流／多重命令パイプライン）方式に基づくスーパースカ
ラ・プロセッサの改良方針」に詳細に解説されている。
At the same time, each decoding unit checks whether a data dependence exists within its own instruction stream. The check is performed by the scoreboard 30 and the data dependence check unit 29. Specifically, the numbers of registers whose values are not yet determined because an instruction is still executing are registered in the scoreboard; these are compared with the register numbers that the decoding unit is about to read, and if any match, the occurrence of a data dependence is reported to the decoding unit. This prevents an instruction that is about to execute from using a register value that does not yet reflect a pending result. A register number is registered when decoding unit 11 or 13 issues an instruction to the arithmetic section, and is released by arithmetic unit 20, 21, or 22 when execution of the instruction completes. Data dependences are explained in detail in IEICE Technical Report CPSY-90-54 (July 1990), "Improvement policy of superscalar processor based on SIMP (single instruction stream / multiple instruction pipeline) method."
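The register-and-release behavior of the scoreboard can be sketched as follows. This is a simplified software model under stated assumptions: the class name `Scoreboard`, its method names, and the set-based storage are hypothetical, and only the read-side check described in the text is modeled. Entries for R0 to R15 are kept in a single shared slot so that either instruction stream can set or reset them, as stated for the shared register 27.

```python
class Scoreboard:
    """Tracks registers whose results are still pending."""

    def __init__(self, shared_count=16):
        self.shared_count = shared_count
        self.pending = set()   # entries are (owner, register number)

    def _key(self, stream, reg):
        # Shared registers have one entry regardless of the stream.
        owner = "shared" if reg < self.shared_count else stream
        return (owner, reg)

    def register(self, stream, dest):
        """Called when a decoding unit issues an instruction."""
        self.pending.add(self._key(stream, dest))

    def release(self, stream, dest):
        """Called when an arithmetic unit finishes the instruction."""
        self.pending.discard(self._key(stream, dest))

    def has_dependence(self, stream, sources):
        """True if any register about to be read is still pending."""
        return any(self._key(stream, r) in self.pending for r in sources)


sb = Scoreboard()
sb.register("A", 20)                      # stream A: LOAD ... -> R20 in flight
print(sb.has_dependence("A", [20, 21]))   # True: stream A must wait
print(sb.has_dependence("B", [20, 21]))   # False: B's R20 is a different register
```

A pending write to a shared register such as R5, by contrast, would stall a reader in either stream until it is released.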

【００２７】データ依存が発生している場合には依存関
係が解除されるまで命令の発行を停止し、発生していな
い場合には、解読ユニットはオペレーションを演算部に
発行すると同時に、レジスタファイルへ読みだし要求を
出す。命令流ＡはレジスタファイルＡに対応しており、
かつ読みだすレジスタ番号はR16〜R31なので、解読ユニ
ットＡ１１はレジスタファイルＡ２５へレジスタ番号を
送出する。命令流ＢはレジスタファイルＢ２６に対応し
ており、かつレジスタ番号はR16〜R31なので、解読ユニ
ットＢ１３はレジスタファイルＢ２６へ送出する。
If a data dependence exists, instruction issue is stalled until the dependence is resolved. If not, the decoding unit issues the operation to the arithmetic section and at the same time issues a read request to the register file. Instruction stream A corresponds to register file A, and the register number to be read is in the range R16 to R31, so decoding unit A11 sends the register number to register file A25. Instruction stream B corresponds to register file B26, and its register numbers are also in the range R16 to R31, so decoding unit B13 sends them to register file B26.

【００２８】オペランド用スイッチ２４はレジスタファ
イルＡ２５とロードストアユニット２５を、レジスタフ
ァイルＢ２６と整数演算ユニット２１を接続する。オペ
レーション用スイッチ２３は解読結果より解読ユニット
と演算ユニットを接続する。本実施例の場合は、解読ユ
ニットＡ１１からＬＯＡＤ命令が、解読ユニットＢ１３
からＡＤＤ命令が発行されるので、オペレーション用ス
イッチ２３は解読ユニットＡ１１とロードストアユニッ
ト２０、解読ユニットＢ１３と整数演算ユニット２１を
接続する。
The operand switch 24 connects register file A25 to the load/store unit 20, and register file B26 to the integer operation unit 21. The operation switch 23 connects the decoding units to the arithmetic units according to the decoding results. In this example, decoding unit A11 issues the LOAD instruction and decoding unit B13 issues the ADD instruction, so the operation switch 23 connects decoding unit A11 to the load/store unit 20 and decoding unit B13 to the integer operation unit 21.

【００２９】ロードストアユニット２０にはレジスタフ
ァイルＡ２５から読み出したオペランド値を、整数演算
ユニット２１にはレジスタファイルＢ２６から読み出し
たオペランド値を格納する。そして解読ユニットＡ１１
が発行したオペレーションをロードストアユニット２０
に、解読ユニットＢ２４が発行したオペレーションを整
数演算ユニット２１に、それぞれ格納する。
The operand value read from register file A25 is stored in the load/store unit 20, and the operand values read from register file B26 are stored in the integer operation unit 21. The operation issued by decoding unit A11 is stored in the load/store unit 20, and the operation issued by decoding unit B13 is stored in the integer operation unit 21.

【００３０】以下、各演算ユニットはオペレーションや
オペランドを使用して実行し、結果をそれぞれレジスタ
やメモリ等に格納する。ロードストアユニット２０にお
いては、命令流ＡのＬＯＡＤ命令が実行される。命令流
ＡはレジスタファイルＡ２５に対応しており、かつ書き
込み先のレジスタ番号はR16〜R31なので、計算したメモ
リアドレスでメモリからデータをフェッチし、レジスタ
ファイルＡ２５内のレジスタR20に格納する。整数演算
ユニット２１においては、ＡＤＤ命令が実行される。
Each arithmetic unit then executes using the operation and operands, and stores the result in a register, memory, or the like. The load/store unit 20 executes the LOAD instruction of instruction stream A. Instruction stream A corresponds to register file A25 and the destination register number is in the range R16 to R31, so data is fetched from memory at the calculated memory address and stored in register R20 of register file A25. The integer operation unit 21 executes the ADD instruction.

【００３１】命令流ＢはレジスタファイルＢ２６に対応
しており、かつ書き込み先のレジスタ番号はR16〜R31な
ので、加算結果をレジスタファイルＢ２６内のレジスタ
R22に格納する。ロードストアユニット２０とレジスタ
ファイルＡ２５を、整数演算ユニット２１とレジスタフ
ァイルＢ２６を接続するように書き込み用スイッチ２８
は制御される。各演算ユニットは演算が終了しレジスタ
への書き込みが終了すると、スコアボード３０に登録さ
れたレジスタ番号をクリアする。
Instruction stream B corresponds to register file B26 and the destination register number is in the range R16 to R31, so the addition result is stored in register R22 of register file B26. The write switch 28 is controlled so as to connect the load/store unit 20 to register file A25 and the integer operation unit 21 to register file B26. When an arithmetic unit has finished its operation and the write to the register is complete, it clears the register number registered in the scoreboard 30.

【００３２】また、命令解読ユニットから発行されたオ
ペレーションが同じ演算ユニットを使用する場合、本実
施例の構成ではどちらか一方のオペレーションが待たせ
るための機構が必要になるが、本発明とは直接関係無い
ので、その機構は省略する。演算ユニットの構成が変わ
った場合には待ち合わせ機構も不要になる可能性もあ
る。同様に、同じ命令流の２命令が同時に演算が終了し
た場合には、レジスタへの書き込みを待たせる、または
レジスタファイルの書き込みポートを複数設けるなどの
対策が必要であるが、本発明とは直接関係無いのでその
説明は省略する。
If the operations issued from the instruction decoding units require the same arithmetic unit, the configuration of this embodiment needs a mechanism to make one of the operations wait; this is not directly related to the present invention, so the mechanism is omitted. With a different arithmetic unit configuration, such a waiting mechanism may be unnecessary. Similarly, if two instructions of the same instruction stream finish at the same time, some measure is needed, such as delaying one register write or providing multiple write ports on the register file; this too is not directly related to the present invention and is omitted.

【００３３】続いて、共用レジスタ２７を使用する場合
について動作を説明する。命令フェッチ制御部１０から
命令を読み込む部分は前述したので省略する。解読ユニ
ットＡ１１は命令フェッチ制御部１０からＬＯＡＤ命令
（メモリからレジスタへのロード命令：MEM(R１９)→R
5）を、解読ユニットＢ１３は命令フェッチ制御部１０
からＡＤＤ命令（レジスタ間加算命令：R６＋ R２１→R
２２）を取り出し、それぞれ解読しオペレーションを作
成するとともに、そのオペレーションを処理すべき演算
ユニットを決定する。同時に同一命令流内でデータ依存
関係が発生していないかをチェックする。
Next, the operation when the shared register 27 is used is described. Reading instructions from the instruction fetch control unit 10 was described above and is omitted. Decoding unit A11 takes a LOAD instruction (a load from memory to a register: MEM(R19)→R5) from the instruction fetch control unit 10, and decoding unit B13 takes an ADD instruction (a register-to-register addition: R6 + R21→R22) from the instruction fetch control unit 10. Each unit decodes its instruction, creates an operation, and determines the arithmetic unit that should process that operation. At the same time, each checks whether a data dependence exists within its own instruction stream.

【００３４】依存関係のチェックは、スコアボード３０
とデータ依存チェック部２９が行なう。具体的には現在
実行中のためにレジスタの値が確定していないレジスタ
番号がスコアボード３０に登録されており、解読ユニッ
トからこれから読み出すレジスタ番号と比較し、一致す
ればデータ依存発生を解読ユニットに返す。これから実
行する命令が、結果を反映していないレジスタの値を使
用することを防ぐためである。レジスタ番号の登録は解
読ユニット１１および１３が命令を演算部に発行すると
きに登録し、レジスタ番号の解除は各演算ユニット２
０、２１および２２が命令実行の終了とともに行なう。
The dependence check is performed by the scoreboard 30 and the data dependence check unit 29. Specifically, the numbers of registers whose values are not yet determined because an instruction is still executing are registered in the scoreboard 30; these are compared with the register numbers that the decoding unit is about to read, and if any match, the occurrence of a data dependence is reported to the decoding unit. This prevents an instruction that is about to execute from using a register value that does not yet reflect a pending result. A register number is registered when decoding unit 11 or 13 issues an instruction to the arithmetic section, and is released by arithmetic unit 20, 21, or 22 when execution of the instruction completes.

【００３５】データ依存が発生している場合には依存関
係が解除されるまで命令の発行を停止し、発生していな
い場合には、解読ユニットはオペレーションを演算部に
発行すると同時に、レジスタファイルへ読みだし要求を
出す。命令流ＡはレジスタファイルＡ２５に対応してお
り、かつ読みだすレジスタ番号はR16〜R31なので、解読
ユニットＡ１１はレジスタファイルＡ２５へレジスタ番
号を送出する。命令流ＢはレジスタファイルＢ２６に対
応しており、一方のレジスタ番号はR０〜R１５であり、
もう一方のレジスタ番号はR１６〜R３１なので、解読ユ
ニットＢ１３は共用レジスタ２７にレジスタ番号R６
を、レジスタファイルＢ２６へはR２１を送出する。
If a data dependence exists, instruction issue is stalled until the dependence is resolved. If not, the decoding unit issues the operation to the arithmetic section and at the same time issues a read request to the register file. Instruction stream A corresponds to register file A25 and the register number to be read is in the range R16 to R31, so decoding unit A11 sends the register number to register file A25. Instruction stream B corresponds to register file B26; one of its source register numbers is in the range R0 to R15 and the other is in the range R16 to R31, so decoding unit B13 sends register number R6 to the shared register 27 and R21 to register file B26.
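The read routing for these two instructions can be reproduced with a short sketch. The helper `route_reads` and the labels `shared27`, `fileA25`, and `fileB26` are hypothetical names used for illustration; the only rule taken from the text is that R0 to R15 go to the shared register and R16 to R31 go to the register file of the issuing instruction stream.

```python
SHARED_COUNT = 16                                  # R0..R15 are shared
PRIVATE_FILE = {"A": "fileA25", "B": "fileB26"}    # per-stream register files


def route_reads(stream, sources):
    """Return (register file, register number) for each source operand."""
    return [
        ("shared27" if reg < SHARED_COUNT else PRIVATE_FILE[stream], reg)
        for reg in sources
    ]


# Stream A: LOAD MEM(R19) -> R5 reads only R19.
print(route_reads("A", [19]))      # [('fileA25', 19)]
# Stream B: ADD R6 + R21 -> R22 reads one shared and one private register.
print(route_reads("B", [6, 21]))   # [('shared27', 6), ('fileB26', 21)]
```

A single instruction can therefore draw its operands from two different register files in the same cycle, which is why the operand switch must be able to connect both to one arithmetic unit.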

【００３６】オペランド用スイッチ２４はレジスタファ
イルＡ２５とロードストアユニット２０を、整数演算ユ
ニット２１にはレジスタファイルＢ２６と共用レジスタ
２７の両方を接続する。オペレーション用スイッチ２３
は解読結果より解読ユニットと演算ユニットを接続す
る。本実施例の場合は、解読ユニットＡ１１からＬＯＡ
Ｄ命令が、解読ユニットＢ１３からＡＤＤ命令が発行さ
れるので、オペレーション用スイッチ２３は解読ユニッ
トＡ１１とロードストアユニット２０、解読ユニットＢ
１３と整数演算ユニット２１を接続する。
The operand switch 24 connects register file A25 to the load/store unit 20, and connects both register file B26 and the shared register 27 to the integer operation unit 21. The operation switch 23 connects the decoding units to the arithmetic units according to the decoding results. In this example, decoding unit A11 issues the LOAD instruction and decoding unit B13 issues the ADD instruction, so the operation switch 23 connects decoding unit A11 to the load/store unit 20 and decoding unit B13 to the integer operation unit 21.

【００３７】ロードストアユニット２０にはレジスタフ
ァイルＡ２５から読み出したオペランド値を、整数演算
ユニット２１にはレジスタファイルＢ２６と共用レジス
タ２７から読み出したオペランド値を格納する。そして
解読ユニットＡ１１が発行したオペレーションをロード
ストアユニット２０に、解読ユニットＢ２４が発行した
オペレーションを整数演算ユニット２１に、それぞれ格
納する。
The operand value read from register file A25 is stored in the load/store unit 20, and the operand values read from register file B26 and the shared register 27 are stored in the integer operation unit 21. The operation issued by decoding unit A11 is stored in the load/store unit 20, and the operation issued by decoding unit B13 is stored in the integer operation unit 21.

【００３８】以下、各演算ユニットはオペレーションや
オペランドを使用して実行し、結果をそれぞれレジスタ
やメモリ等に格納する。ロードストアユニット２０にお
いては、命令流ＡのＬＯＡＤ命令が実行される。命令の
書き込み先のレジスタ番号はR０〜R１５なので、計算し
たメモリアドレスでメモリからデータをフェッチし、共
用レジスタ２７内のレジスタR５に格納する。整数演算
ユニット２１においてはＡＤＤ命令が実行される。命令
流ＢはレジスタファイルＢ２６に対応しており、かつ書
き込み先のレジスタ番号はR16〜R31なので、加算結果を
レジスタファイルＢ２６内のレジスタR２２に格納す
る。ロードストアユニット２０と共用レジスタ２７を、
整数演算ユニット２１とレジスタファイルＢ２６を接続
するように書き込み用スイッチ２８は制御される。各演
算ユニットは演算が終了しレジスタへの書き込みが終了
すると、スコアボード３０に登録されたレジスタ番号を
クリアする。
Each arithmetic unit then executes using the operation and operands, and stores the result in a register, memory, or the like. The load/store unit 20 executes the LOAD instruction of instruction stream A. The destination register number of the instruction is in the range R0 to R15, so data is fetched from memory at the calculated memory address and stored in register R5 of the shared register 27. The integer operation unit 21 executes the ADD instruction. Instruction stream B corresponds to register file B26 and the destination register number is in the range R16 to R31, so the addition result is stored in register R22 of register file B26. The write switch 28 is controlled so as to connect the load/store unit 20 to the shared register 27 and the integer operation unit 21 to register file B26. When an arithmetic unit has finished its operation and the write to the register is complete, it clears the register number registered in the scoreboard 30.
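The effect of writing to a shared register can be illustrated with a small software model. This is a sketch under stated assumptions, not the hardware described above: the class `RegisterFiles`, its methods, and the values 42 and 7 are hypothetical. It shows that a value written by one instruction stream to R0 to R15 is visible to the other stream without a memory access, while R16 to R31 remain separate per stream.

```python
class RegisterFiles:
    """Two private register files plus one shared register file."""

    def __init__(self, streams=("A", "B"), shared_count=16, num_regs=32):
        self.shared_count = shared_count
        self.shared = [0] * shared_count
        self.private = {s: [0] * (num_regs - shared_count) for s in streams}

    def write(self, stream, reg, value):
        if reg < self.shared_count:
            self.shared[reg] = value                    # visible to all streams
        else:
            self.private[stream][reg - self.shared_count] = value

    def read(self, stream, reg):
        if reg < self.shared_count:
            return self.shared[reg]
        return self.private[stream][reg - self.shared_count]


rf = RegisterFiles()
rf.write("A", 5, 42)       # stream A writes shared register R5
rf.write("A", 20, 7)       # stream A writes its private register R20
print(rf.read("B", 5))     # 42: stream B sees the shared value
print(rf.read("B", 20))    # 0: stream B's own R20 is untouched
```

In a real program the reader would also need to know when the value is ready; in the embodiment the shared scoreboard entry plays that role for in-flight instructions.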

[0039] Next, another embodiment of the information processing apparatus according to the present invention will be described with reference to the drawings. FIG. 2 is a configuration diagram of the register file of an information processing apparatus according to another embodiment of the present invention.

[0040] In FIG. 2, reference numerals 40 and 41 denote decoding units that decode instructions, and 50, 51 and 52 denote arithmetic units. The functions of the arithmetic units need not be restricted in any particular way, so they are not described in detail. Reference numeral 57 denotes an operand switch that connects the operands output from register files 53, 54 and 55 to the arithmetic units 50, 51 and 52. Register files 53, 54 and 55 are three-port register files, each with two read ports and one write port. Register A53 of the register file corresponds to decoding unit A40, and register B54 corresponds to decoding unit B41.

[0041] The shared register 55 corresponds to both decoding unit A40 and decoding unit B41. Reference numeral 56 denotes a write switch that connects operation results to the corresponding registers, 58 denotes a read arbiter that decides which data in the register files is read, and 59 denotes a write arbiter that decides which of the arithmetic units' results is written to the registers. The remaining components and their connections are shown in FIG. 1 and are therefore omitted.

[0042] The operation of the information processing apparatus of this embodiment, configured as above, is described below with reference to FIG. 2. Parts of the operation that are the same as described earlier are omitted, and only the differences are described.

[0043] First, consider reading registers. Suppose decoding unit A40 decodes a LOAD instruction (memory-to-register load: MEM(R0 + R1) → R18) and decoding unit B41 decodes an ADD instruction (register-to-register add: R2 + R3 → R22). All the source register numbers fall within R0 to R15, so all four operands must be supplied from the shared register 55. However, the shared register 55 has only two read ports, so they cannot all be read at once.

[0044] The read arbiter 58 therefore receives the register numbers to be read from the instruction decoding units 40 and 41. If reads of three or more different registers of the shared register 55 are requested, it returns executability to the decoding units so that the requests are limited to two registers, and reads the corresponding registers from the shared register 55. At the same time, the read arbiter 58 controls the operand switch 57 so that the contents read from the shared register are output to the corresponding arithmetic units.
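The read arbitration just described can be illustrated with a minimal Python sketch. The function name `arbitrate_reads`, the constants, and in particular the priority policy (decoding units are granted in the order they are listed) are assumptions made for illustration; the description states only that requests are limited to the number of read ports and that executability is returned to the decoding units.

```python
# Assumptions for illustration: R0-R15 are shared, and the shared register has two read ports.
SHARED_LIMIT = 16
SHARED_READ_PORTS = 2


def arbitrate_reads(requests: dict[str, list[int]],
                    ports: int = SHARED_READ_PORTS) -> dict[str, bool]:
    """Return, per decoding unit, whether its instruction may be issued this cycle.

    `requests` maps a decoding unit name to the source register numbers it wants
    to read. Units are considered in insertion order (a hypothetical fixed priority).
    """
    granted: set[int] = set()      # distinct shared registers already given a port
    executable: dict[str, bool] = {}
    for unit, regs in requests.items():
        shared = {r for r in regs if r < SHARED_LIMIT}
        if len(granted | shared) <= ports:
            # Still within the port budget: grant and reserve the ports.
            granted |= shared
            executable[unit] = True
        else:
            # Would need more distinct shared registers than ports: hold the unit back.
            executable[unit] = False
    return executable


# Example from the text: LOAD reads R0 and R1, ADD reads R2 and R3.
result = arbitrate_reads({"A40": [0, 1], "B41": [2, 3]})
```

Under the assumed priority, decoding unit A40 is allowed to issue and decoding unit B41 is held back, so only two distinct shared registers are read in the cycle. Reads from R16 to R31 do not count against the shared ports because they go to the private register files.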

[0045] Conversely, consider writing registers. Suppose decoding unit A40 decodes a LOAD instruction (memory-to-register load: MEM(R0 + R17) → R0) and decoding unit B41 decodes an ADD instruction (register-to-register add: R2 + R18 → R1). Both destination register numbers fall within R0 to R15, so both results must be written to the shared register 55. However, the shared register 55 has only one write port, so both cannot be written at once.

[0046] The write arbiter 59 therefore receives from the arithmetic units the numbers of the registers to be written. If writes to two or more different registers of the shared register 55 are requested, it returns executability to the arithmetic units so that the requests are limited to one register, and writes to the corresponding register in the shared register 55. At the same time, the write arbiter 59 controls the write switch 56 so that the arithmetic unit can write to the shared register.
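A corresponding sketch for the write side is given below. Again this is only an illustration: `arbitrate_writes`, the unit names, and the first-come priority are hypothetical, and for simplicity every write to the shared range is treated as occupying the single write port.

```python
# Assumptions for illustration: R0-R15 are shared, and the shared register has one write port.
SHARED_LIMIT = 16
SHARED_WRITE_PORTS = 1


def arbitrate_writes(dests: dict[str, int],
                     ports: int = SHARED_WRITE_PORTS) -> dict[str, bool]:
    """Return, per arithmetic unit, whether it may write its result this cycle.

    `dests` maps an arithmetic unit name to its destination register number.
    Units are considered in insertion order (a hypothetical fixed priority).
    """
    used = 0                       # shared write ports already taken
    executable: dict[str, bool] = {}
    for unit, reg in dests.items():
        if reg >= SHARED_LIMIT:
            # Private register file: no contention on the shared write port.
            executable[unit] = True
        elif used < ports:
            used += 1
            executable[unit] = True
        else:
            # Shared write port already taken: this unit must wait.
            executable[unit] = False
    return executable


# Example from the text: LOAD writes R0 and ADD writes R1, both in the shared range.
result = arbitrate_writes({"unit50": 0, "unit51": 1})
```

Here the first unit writes to the shared register and the second is told to wait. If the ADD had instead targeted R22, both writes could proceed in the same cycle, since only one of them would use the shared write port.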

[0047] Although this embodiment has been described with two register files, three or more may be used. Also, although this embodiment has been described using a processor that executes a plurality of instruction streams simultaneously in parallel, a processor that executes only a single instruction stream at a time but has a plurality of register files may also be used.

[0048] Further, although the register file has 32 registers in this embodiment, there is no particular limit on that number, nor on the number of shared registers. In addition, the shared registers need not be provided as general-purpose registers.

[0049]

[Effects of the Invention] As described above, the present invention provides the following advantageous effects.
(1) Sharing a part of the registers reduces the amount of hardware and lowers cost.
(2) Because each instruction stream can access the same registers, a high-speed communication function is provided and higher performance can be achieved.
(3) Providing write and read arbiters for the registers reduces, in addition to effects (1) and (2), the number of ports of the shared register, further reducing the amount of hardware.
(4) Each program can freely choose whether or not to share registers, which makes higher performance possible.

[Brief description of drawings]

FIG. 1 is a configuration diagram of an information processing apparatus according to an embodiment of the present invention.

FIG. 2 is a configuration diagram of the registers according to another embodiment of the present invention.

FIG. 3 is a configuration diagram of a conventional information processing apparatus.

[Explanation of symbols]

10 instruction fetch control unit
11, 13 decoding units
12, 14 program counters (PC)
20 load/store unit
21 integer arithmetic unit
22 floating-point unit
23 operation switch
24 operand switch
25 register file A
26 register file B
27 shared register
28 write switch
29 data dependency check unit
30 scoreboard

Claims

[Claims]

1. An information processing apparatus having a plurality of register files, a part of which is shared.

2. An information processing apparatus comprising a plurality of arithmetic units, a plurality of register files a part of which is shared, and an operand switch that connects the arithmetic units and the plurality of register files, wherein, when data is read from the register files, the operand switch determines from the register number whether the shared part is addressed and connects the arithmetic units and the register files accordingly.

3. An information processing apparatus comprising a plurality of arithmetic units, a plurality of register files a part of which is shared, and an operand switch that connects the arithmetic units and the plurality of register files, wherein, when data is written to the register files, the operand switch determines from the register number whether the shared part is addressed and connects the arithmetic units and the register files accordingly.

4. An information processing apparatus comprising a plurality of arithmetic units, a plurality of register files a part of which is shared, an operand switch that connects the arithmetic units and the plurality of register files, and a register read arbiter, wherein, when data is read from the register files, the register read arbiter determines from the register number whether the shared part is addressed and, if there are more read requests than the shared register has ports, controls instruction issue so that execution stays within the number of ports, and causes the operand switch to connect the arithmetic units and the register files.

5. An information processing apparatus comprising a plurality of arithmetic units, a plurality of register files a part of which is shared, an operand switch that connects the arithmetic units and the plurality of register files, and a register write arbiter, wherein, when data is written from the arithmetic units to the register files, the register write arbiter determines from the register number whether the shared part is addressed and, if there are more write requests than the shared register has ports, controls the arithmetic units so that execution stays within the number of ports, and causes the operand switch to connect the arithmetic units and the register files.