JP2000081982A

JP2000081982A - Compiler, processor and recording medium

Info

Publication number: JP2000081982A
Application number: JP10250754A
Authority: JP
Inventors: Masato Suzuki; 正人鈴木
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1998-09-04
Filing date: 1998-09-04
Publication date: 2000-03-21
Anticipated expiration: 2018-09-04
Also published as: JP3692793B2

Abstract

(57) [Abstract]
[Problem] Each slot of a VLIW processor instruction specifies an operation for one of the processor's multiple arithmetic units. Because of dependencies between operations and similar constraints, however, the compiler cannot always schedule as many operations as there are slots available for parallel execution, and the nop codes placed in instructions increase the program size. Moreover, the higher the degree of instruction parallelism, the more nop codes are inserted, and code efficiency deteriorates further.
[Solution] The compiler fills nop code positions with valid operations, dividing them where necessary, and the processor accumulates these operations and then executes them. In addition, an n-parallel-execution operation is embedded in n nop codes of m-slot instructions (m < n), or (n − m) operations of an n-parallel-execution operation are embedded in (n − m) nop codes of m-slot instructions.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a compiler, a processor, and a storage medium, and in particular includes a technique for improving the executable code efficiency of processors of the single-instruction type or the VLIW (Very Long Instruction Word) type.

[0002]

[Prior Art] With the development of electronic technology in recent years, high-performance processors have become widespread and are used in every field. Such processors achieve high performance through parallel processing of instructions. The architecture called VLIW is one form of such parallel processing. A processor that adopts the VLIW architecture (hereinafter, a "VLIW processor") contains a plurality of arithmetic units and executes, simultaneously and in parallel, the operations specified in a plurality of fields, called slots, placed in one instruction. A machine instruction program for such a VLIW processor is generated after a compiler has detected the operation-level parallelism in a program written in a high-level language and scheduled the operations accordingly. A machine instruction program is also called executable code.

[0003] (First Prior Art) FIG. 21 is a block diagram of a processor according to the first prior art.

[0004] The processor of the first prior art executes two operations in parallel. A program consisting of a sequence of instructions, each made up of a first and a second slot as shown in FIG. 5, is stored in ROM 1. The operations written in the slots are decoded by the first instruction decoder 4 and the second instruction decoder 5 and are then executed by the first arithmetic unit 13 and the second arithmetic unit 14.

[0005] (Second Prior Art) FIG. 22 is a block diagram of a processor according to the second prior art.

[0006] The processor of the second prior art executes three operations in parallel, but its basic concept is the same as that of the processor of the first prior art. A program consisting of a sequence of instructions, each made up of first to third slots as shown in FIG. 14, is stored in ROM 41. The operations written in the slots are decoded by the first instruction decoder 45 through the third instruction decoder 47 and are then executed by the first arithmetic unit 58 through the third arithmetic unit 60. In other words, only the number of slots making up one instruction has increased.

[0007]

[Problems to Be Solved by the Invention] However, both of the above prior arts have the problem that no-operation codes (nop codes) placed in instructions increase the program size. An increase in program size can also be described as a decrease in code efficiency. Each slot of a VLIW processor instruction specifies an operation for one of the processor's arithmetic units, but because of dependencies between operations and similar constraints, it is not always possible to schedule as many operations as there are slots available for parallel execution. When no valid operation can be placed in a slot, the compiler generates a nop code in that slot.

[0008] In the first prior art, as shown for example in FIG. 5, instruction 2 can specify two valid operations, B and C, but in instruction 1 no valid operation can be specified for the second slot, which is therefore a nop. In the second prior art, as shown for example in FIG. 14, no valid operation can be specified for the second and third slots of instruction 1 or for the third slot of instruction 2, which are therefore nops. VLIW processors thus generally have the problem that the number of inserted nop codes grows as the degree of instruction parallelism increases, so that code efficiency deteriorates further. This is because the probability that the compiler can schedule valid operations into all slots is inversely proportional to the degree of parallelism.

[0009] The present invention has been made in view of these points, and its first object is to provide a compiler and a processor that reduce the wasted area in instructions.

[0010] A second object of the present invention is to provide a compiler and a processor that mitigate the increase in nop codes that accompanies a higher degree of instruction parallelism in a VLIW processor.

[0011]

[Means for Solving the Problems] The compiler of the present invention generates, from a high-level language program, instructions in a long-word instruction format that hold a plurality of operations the processor can execute simultaneously in parallel. It then replaces a nop contained in an instruction with a valid operation that is executed after that nop, deletes the valid operation from its original position, and adds information indicating that the replacement was made. Nops can thereby be replaced with valid operations, and the code size can be reduced.

[0012] A processor that executes such instructions temporarily stores an instruction in an accumulation buffer according to the value of the accumulation bit in the instruction and then executes the instruction accumulated in the accumulation buffer.

[0013]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0014] (Embodiment 1) In Embodiment 1, instructions in which valid operations have been placed in lieu of nops are first accumulated and then executed, thereby reducing the code size.

[0015] 1. Compiler
FIG. 1 is a block diagram showing the configuration of the compiler.

[0016] The compiler 102 translates a C language program 101 written by the user and outputs a machine instruction program 112.

[0017] The compiler 102 consists of: a file reading unit 103 that reads the C language program 101 into a read buffer 104; a syntax analysis unit 105 that analyzes the syntax and semantics of the C language program held in the read buffer 104, generates intermediate code, and writes it to an intermediate code buffer 106; a machine instruction generation unit 107 that takes the intermediate code stored in the intermediate code buffer 106, schedules instructions for two-way parallel execution, generates an uncompressed machine instruction program, and writes it to a provisional output buffer 108; a machine instruction compression unit 109 that compresses the uncompressed machine instruction program stored in the provisional output buffer 108 to generate the target machine instruction program and writes it to an output buffer 110; and a file output unit 111 that outputs the machine instruction program stored in the output buffer 110 to a file. Here, "compression of the machine instruction program" means replacing the nop codes contained in the instructions of the machine instruction program with valid operations. Every element other than the machine instruction compression unit 109, which performs this compression, can be built using known techniques, so their description is omitted here. The machine instruction compression unit 109 is described in detail below; it operates on the following principle.

[0018] The uncompressed machine instruction program is searched in instruction order, and each first-slot nop code is paired with the second-slot nop code of the same ordinal position. The first and second slots of this nop code pair are replaced, respectively, with the first-slot and second-slot operations of the first pair of valid operations that appears after the nop pair. The replacement is marked, and the pair of valid operations used for the replacement is deleted. In this way an instruction containing two valid operations is placed in lieu of nops that precede it, and the number of nops is reduced.
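The principle in paragraph [0018] can be sketched in a few lines of Python. This is a simplified illustration, not the flowchart procedure of FIGS. 2 to 4: it ignores the buffer-depth limit discussed later, and the data layout (`None` for a nop, a tuple per instruction) and the function name `compress` are assumptions made for the example. The input reproduces the slot contents of FIG. 5 as they are given in the walkthrough that follows.

```python
from typing import List, Optional, Tuple

Instr = Tuple[Optional[str], Optional[str]]  # (slot 1, slot 2); None stands for nop


def compress(program: List[Instr]):
    """Pair the k-th slot-1 nop with the k-th slot-2 nop and fill both from
    the first unused fully valid instruction that follows the pair."""
    ops = [list(ins) for ins in program]   # mutable copy of the operations
    bits = [[0, 0] for _ in program]       # accumulation bit per slot
    deleted = [False] * len(program)

    nops1 = [i for i, (a, b) in enumerate(program) if a is None and b is not None]
    nops2 = [i for i, (a, b) in enumerate(program) if a is not None and b is None]

    for i1, i2 in zip(nops1, nops2):
        start = max(i1, i2) + 1            # donor must follow both nops
        donor = next(
            (j for j in range(start, len(program))
             if not deleted[j] and None not in program[j]),
            None,
        )
        if donor is None:
            break                          # no later valid pair is left
        ops[i1][0], bits[i1][0] = program[donor][0], 1
        ops[i2][1], bits[i2][1] = program[donor][1], 1
        deleted[donor] = True              # the donor instruction disappears

    return [((o[0], b[0]), (o[1], b[1]))
            for o, b, d in zip(ops, bits, deleted) if not d]


FIG5 = [("A", None), ("B", "C"), ("D", None), (None, "E"),
        ("F", "G"), (None, "H"), ("I", "J")]

for slot1, slot2 in compress(FIG5):
    print(slot1, slot2)
```

For this input the sketch yields five instructions, matching the result that the detailed walkthrough reaches for FIG. 6.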

[0019] FIGS. 2 to 4 are flowcharts showing the processing flow of the machine instruction compression unit 109.

[0020] The processing flow of the machine instruction compression unit 109 is described in detail using the following operation example.

[0021] 1.1 Operation Example of the Machine Instruction Compression Unit 109
FIG. 5 shows an example of an uncompressed machine instruction program, generated by the machine instruction generation unit 107 in the manner of the first prior art described above.

[0022] Each instruction consists of a first and a second slot. The symbols A to J indicate that a valid operation has been generated, and nop indicates that a nop code has been generated.
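FIG. 5 itself is not reproduced in this text, but its slot contents can be recovered from paragraph [0008] and from the step-by-step walkthrough below. The following sketch writes them down as data and counts the wasted slots. The tuple representation with `None` for a nop and the helper name `nop_count` are assumptions made for illustration.

```python
# Slot contents of FIG. 5 as stated in the walkthrough: (slot 1, slot 2), None = nop
FIG5 = [
    ("A", None),   # instruction 1
    ("B", "C"),    # instruction 2
    ("D", None),   # instruction 3
    (None, "E"),   # instruction 4
    ("F", "G"),    # instruction 5
    (None, "H"),   # instruction 6
    ("I", "J"),    # instruction 7
]


def nop_count(program):
    """Return (number of nop slots, total number of slots)."""
    total = sum(len(ins) for ins in program)
    nops = sum(op is None for ins in program for op in ins)
    return nops, total


nops, total = nop_count(FIG5)
print(f"{nops} of {total} slots hold a nop")
```

With ten valid operations spread over seven two-slot instructions, four of the fourteen slots carry no useful work, which is the waste the compression step targets.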

[0023] FIG. 6 shows an example of a compressed machine instruction program, obtained when the machine instruction compression unit 109 compresses the uncompressed machine instruction program of FIG. 5 by the following procedure.

[0024] Each instruction consists of a first and a second slot, and each slot consists of a 1-bit accumulation bit and an operation (OP) field. As in FIG. 5, the symbols A to J indicate valid operations.
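As a rough illustration of the slot layout in paragraph [0024], the sketch below packs an accumulation bit and an OP field into one integer. The text specifies only that each slot has a 1-bit accumulation bit and an OP field; the 15-bit field width, the placement of the accumulation bit above the OP field, and the function names are all hypothetical choices made for this example.

```python
OP_BITS = 15  # hypothetical OP-field width; the text does not give one


def pack_slot(accumulate: int, opcode: int) -> int:
    """Pack one slot as [accumulation bit | OP field] (assumed bit order)."""
    if accumulate not in (0, 1):
        raise ValueError("accumulation bit must be 0 or 1")
    if not 0 <= opcode < (1 << OP_BITS):
        raise ValueError("opcode does not fit in the OP field")
    return (accumulate << OP_BITS) | opcode


def unpack_slot(word: int):
    """Split a packed slot back into (accumulation bit, opcode)."""
    return word >> OP_BITS, word & ((1 << OP_BITS) - 1)


def pack_instruction(slot1: int, slot2: int) -> int:
    """Concatenate two packed slots into one two-slot instruction word."""
    return (slot1 << (OP_BITS + 1)) | slot2


word = pack_instruction(pack_slot(0, 0x01), pack_slot(1, 0x07))
print(hex(word))
```

Under these assumptions the marker costs one bit per slot, which is the overhead paid for being able to tell relocated operations from ordinary ones.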

[0025] The operation of the machine instruction compression unit 109 when the program of FIG. 5 is given as input is described below with reference to FIGS. 2 to 6.

[0026] As shown in FIG. 2, initialization is performed first. The instruction pointer N is set to the first instruction (instruction 1 of FIG. 5), the instruction look-ahead counter m is set to 1, the first-slot nop counter C1 and the second-slot nop counter C2 are set to 0, and the first-slot buffer counter B1 and the second-slot buffer counter B2 are set to 0 (step S201). N, m, C1, C2, B1, and B2 are parameters created internally by the machine instruction compression unit 109.
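The initial state set up in step S201 can be written down as a small record. This is only a notational aid; the class name `CompressorState` and the use of a dataclass are assumptions, while the field names and initial values follow the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class CompressorState:
    """Working parameters of the compression unit after step S201."""
    N: int = 1    # instruction pointer, starts at instruction 1
    m: int = 1    # instruction look-ahead counter
    C1: int = 0   # first-slot nop counter
    C2: int = 0   # second-slot nop counter
    B1: int = 0   # first-slot buffer counter
    B2: int = 0   # second-slot buffer counter


state = CompressorState()
print(state)
```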

[0027] Next, the type of the instruction indicated by N, namely instruction 1 of FIG. 5, is evaluated. In instruction 1 the first slot holds the valid operation A and the second slot holds a nop code, so the instruction is of the "OP(1):nop(2)" type and control jumps to process A (step S202). Here (1) and (2) denote the first slot and the second slot.
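The type test of step S202 can be illustrated as follows. The representation of an instruction as a pair with `None` for a nop and the names `instruction_type` and `DISPATCH` are assumptions for this sketch. The three dispatch targets are the ones named in the walkthrough; the handling of an instruction whose two slots are both nops is not described in this part of the text, so the sketch only labels that case.

```python
def instruction_type(instr):
    """Classify a two-slot instruction; None stands for a nop code."""
    op1, op2 = instr
    if op1 is not None and op2 is None:
        return "OP(1):nop(2)"
    if op1 is None and op2 is not None:
        return "nop(1):OP(2)"
    if op1 is not None and op2 is not None:
        return "OP(1):OP(2)"
    return "nop(1):nop(2)"  # not covered by the walkthrough


# Where step S202 sends each type, as stated in the walkthrough
DISPATCH = {
    "OP(1):nop(2)": "process A",
    "nop(1):OP(2)": "process B",
    "OP(1):OP(2)": "step S205",
}

print(DISPATCH[instruction_type(("A", None))])
```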

[0028] In process A, shown in FIG. 3, the second-slot nop counter C2 is first incremented to C2 = 1 and the second-slot buffer counter B2 is incremented to B2 = 1 (step S301). Next, the value of the first-slot nop counter C1 is assigned to C1X, a parameter created internally by the machine instruction compression unit 109, giving C1X = 0 (step S302). The type of the instruction indicated by (N+m), namely instruction 2 of FIG. 5, is then evaluated. The first and second slots of instruction 2 hold the valid operations B and C, so the instruction is of the "OP(1):OP(2)" type and control jumps to step S312 (step S303). At this point B2 = 1, which satisfies B2 ≤ 2, but the condition C1X ≥ 1 of step S307 is not satisfied (C1X = 0), so control jumps to step S305. The condition C1X ≥ 1 is imposed to prevent a new replacement target from being created by replacing only the second slot, that is, by turning OP(1):OP(2) into OP(1):nop(2). With this condition, an OP(1):OP(2) instruction is ultimately turned into nop(1):nop(2) and deleted. The condition B2 ≤ 2 is imposed because the processor shown in FIG. 7, described later, has two sets of buffers for each of the first and second slots, and replacements beyond that capacity must be prevented. Since the instruction indicated by (N+m), instruction 2 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 2 to advance to the next instruction, and control returns to step S303 (step S306).

[0029] Next, the type of the instruction indicated by (N+m), this time instruction 3 of FIG. 5, is evaluated. In instruction 3 the first slot holds the valid operation D and the second slot holds a nop code, so the instruction is of the "OP(1):nop(2)" type and control jumps to step S305 (step S303). Since the instruction indicated by (N+m), instruction 3 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 3 to advance to the next instruction, and control returns to step S303 (step S306).

[0030] Next, the type of the instruction indicated by (N+m), this time instruction 4 of FIG. 5, is evaluated. In instruction 4 the first slot holds a nop code and the second slot holds the valid operation E, so the instruction is of the "nop(1):OP(2)" type and control jumps to step S304 (step S303). There C1X is incremented by 1, giving C1X = 1 (step S304). Since the instruction indicated by (N+m), instruction 4 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 4 to advance to the next instruction, and control returns to step S303 (step S306).

[0031] Next, the type of the instruction indicated by (N+m), namely instruction 5 of FIG. 5, is evaluated. The first and second slots of instruction 5 hold the valid operations F and G, so the instruction is of the "OP(1):OP(2)" type and control jumps to step S312 (step S303). Now B2 = 1 satisfies B2 ≤ 2, and the condition C1X ≥ 1 of step S307 is also satisfied (C1X = 1), so control jumps to step S308. Since OP(2) is still valid, control jumps to step S309 (step S308). In the instruction indicated by N, namely instruction 1 of FIG. 5, the accumulation bit of the first slot is set to "0", the accumulation bit of the second slot is set to "1", and the OP field of the second slot is filled with operation G in place of the nop. The validity of OP(2) is checked because an OP(2) that is still present may already have been placed in lieu of an earlier nop, in which case it is in effect an operation that no longer exists. Instruction 1 of FIG. 6 is thus generated. Next, OP(2) of the instruction indicated by (N+m), instruction 5 of FIG. 5, is invalidated (step S309). OP(1) is still valid at this point, so process A ends and control jumps to step S206 (step S310). As described later, when OP(1) is invalid (that is, already used for a replacement), the instruction is deleted in step S311.

[0032] On return from process A, the instruction indicated by N, namely instruction 1 of FIG. 5, is not the last instruction, so control moves to step S207 (step S206). The instruction pointer N is advanced to the next instruction, instruction 2 of FIG. 5, the instruction look-ahead counter m is reset to 1, and control returns to step S202 (step S207).

[0033] Next, the type of the instruction indicated by N, namely instruction 2 of FIG. 5, is evaluated. As noted above, instruction 2 is of the "OP(1):OP(2)" type, so control moves to step S205 (step S202). There the accumulation bits of the first and second slots of the instruction indicated by N, instruction 2 of FIG. 5, are set to "0". Instruction 2 of FIG. 6 is thus generated. The instruction indicated by N, instruction 2 of FIG. 5, is not the last instruction, so control moves to step S207 (step S206). The instruction pointer N is advanced to the next instruction, instruction 3 of FIG. 5, the instruction look-ahead counter m is reset to 1, and control returns to step S202 (step S207).

[0034] Next, the type of the instruction indicated by N, namely instruction 3 of FIG. 5, is evaluated. As noted above, instruction 3 is of the "OP(1):nop(2)" type, so control jumps to process A (step S202).

[0035] In process A, the second-slot nop counter C2 is first incremented to C2 = 2 and the second-slot buffer counter B2 is incremented to B2 = 2 (step S301). Next, the value of the first-slot nop counter C1 is assigned to the parameter C1X, giving C1X = 0 (step S302). The type of the instruction indicated by (N+m), namely instruction 4 of FIG. 5, is then evaluated. As noted above, instruction 4 is of the "nop(1):OP(2)" type, so control jumps to step S304 (step S303). There C1X is incremented by 1, giving C1X = 1 (step S304). Since the instruction indicated by (N+m), instruction 4 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 2 to advance to the next instruction, and control returns to step S303 (step S306).

[0036] Next, the type of the instruction indicated by (N+m), namely instruction 5 of FIG. 5, is evaluated. As noted above, instruction 5 is of the "OP(1):OP(2)" type, so control jumps to step S307 (step S303). Since C1X = 1, the condition C1X ≥ 1 is satisfied and control jumps to step S308. OP(2) was invalidated earlier, so control jumps to step S305 (step S308). Since the instruction indicated by (N+m), instruction 5 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 3 to advance to the next instruction, and control returns to step S303 (step S306).

[0037] Next, the type of the instruction indicated by (N+m), namely instruction 6 of FIG. 5, is evaluated. In instruction 6 the first slot holds a nop code and the second slot holds the valid operation H, so the instruction is of the "nop(1):OP(2)" type and control jumps to step S304 (step S303). There C1X is incremented by 1, giving C1X = 2 (step S304). Since the instruction indicated by (N+m), instruction 6 of FIG. 5, is not the last instruction, control moves to step S306 (step S305), the instruction look-ahead counter m is set to 4 to advance to the next instruction, and control returns to step S303 (step S306).

[0038] Next, the type of the instruction indicated by (N+m), namely instruction 7 of FIG. 5, is evaluated. The first and second slots of instruction 7 hold the valid operations I and J, so the instruction is of the "OP(1):OP(2)" type and control jumps to step S312 (step S303). Now B2 = 2 satisfies B2 ≤ 2, and the condition C1X ≥ 1 of step S307 is also satisfied (C1X = 2), so control jumps to step S308. Since OP(2) is still valid, control jumps to step S309 (step S308). In the instruction indicated by N, namely instruction 3 of FIG. 5, the accumulation bit of the first slot is set to "0", the accumulation bit of the second slot is set to "1", and the OP field of the second slot is filled with operation J in place of the nop. Instruction 3 of FIG. 6 is thus generated. Next, OP(2) of the instruction indicated by (N+m), instruction 7 of FIG. 5, is invalidated (step S309). OP(1) is still valid at this point, so process A ends and control jumps to step S206 (step S310).

[0039] On return from process A, the instruction indicated by N, namely instruction 3 of FIG. 5, is not the last instruction, so control moves to step S207 (step S206). The instruction pointer N is advanced to the next instruction, instruction 4 of FIG. 5, the instruction look-ahead counter m is reset to 1, and control returns to step S202 (step S207).

[0040] Next, the type of the instruction indicated by N, namely instruction 4 of FIG. 5, is evaluated. As noted above, instruction 4 is of the "nop(1):OP(2)" type, so control jumps to process B (step S202).

[0041] In process B, the first-slot nop counter C1 is first incremented to C1 = 1 and the first-slot buffer counter B1 is incremented to B1 = 1 (step S401). Next, the value of the second-slot nop counter C2 is assigned to C2X, a parameter created internally by the machine instruction compression unit 109, giving C2X = 2 (step S402). The type of the instruction indicated by (N+m), namely instruction 5 of FIG. 5, is then evaluated. As noted above, instruction 5 is of the "OP(1):OP(2)" type, so control jumps to step S412 (step S403). Now B1 = 1 satisfies B1 ≤ 2, and the condition C2X ≥ 1 of step S407 is also satisfied (C2X = 2), so control jumps to step S408. Since OP(1) is still valid, control jumps to step S409 (step S408). In the instruction indicated by N, namely instruction 4 of FIG. 5, the accumulation bit of the second slot is set to "0", the accumulation bit of the first slot is set to "1", and the OP field of the first slot is filled with operation F in place of the nop. Instruction 4 of FIG. 6 is thus generated. Next, OP(1) of the instruction indicated by (N+m), instruction 5 of FIG. 5, is invalidated (step S409). OP(2) was invalidated earlier, so control jumps to step S411 (step S410). There the instruction indicated by (N+m), instruction 5 of FIG. 5, is deleted; the first-slot nop counter C1 and the second-slot nop counter C2 are decremented to C1 = 0 and C2 = 1; and the first-slot buffer counter B1 and the second-slot buffer counter B2 are decremented to B1 = 0 and B2 = 1 (step S411). Process B then ends and control jumps to step S206.

[0042] On return from process B, the instruction indicated by N, namely instruction 4 of FIG. 5, is not the last instruction, so control moves to step S207 (step S206). The instruction pointer N is advanced to the next instruction, which is instruction 6 of FIG. 5 because instruction 5 has been deleted, the instruction look-ahead counter m is reset to 1, and control returns to step S202 (step S207).

[0043] Next, the type of the instruction indicated by N, namely instruction 6 of FIG. 5, is evaluated. As noted above, instruction 6 is of the "nop(1):OP(2)" type, so control jumps to process B (step S202).

[0044] In process B, the first-slot nop counter C1 is first incremented to C1 = 1 and the first-slot buffer counter B1 is incremented to B1 = 1 (step S401). Next, the value of the second-slot nop counter C2 is assigned to the parameter C2X, giving C2X = 1 (step S402). The type of the instruction indicated by (N+m), namely instruction 7 of FIG. 5, is then evaluated. As noted above, instruction 7 is of the "OP(1):OP(2)" type, so control jumps to step S412 (step S403). Now B1 = 1 satisfies B1 ≤ 2, and the condition C2X ≥ 1 of step S407 is also satisfied (C2X = 1), so control jumps to step S408. Since OP(1) is still valid, control jumps to step S409 (step S408). In the instruction indicated by N, namely instruction 6 of FIG. 5, the accumulation bit of the second slot is set to "0", the accumulation bit of the first slot is set to "1", and the OP field of the first slot is filled with operation I in place of the nop. Instruction 5 of FIG. 6 is thus generated. Next, OP(1) of the instruction indicated by (N+m), instruction 7 of FIG. 5, is invalidated (step S409). OP(2) was invalidated earlier, so control jumps to step S411 (step S410). There the instruction indicated by (N+m), instruction 7 of FIG. 5, is deleted; the first-slot nop counter C1 and the second-slot nop counter C2 are decremented to C1 = 0 and C2 = 0; and the first-slot buffer counter B1 and the second-slot buffer counter B2 are decremented to B1 = 0 and B2 = 0 (step S411). Process B then ends and control jumps to step S206.

【００４５】Returning from process B, the instruction indicated by N, that is, instruction 6 of FIG. 5, is now the last instruction (instruction 7 has been deleted), so all processing ends (step S206).

【００４６】As described above, the uncompressed machine instruction program of FIG. 5 is converted into the compressed machine instruction program shown in FIG. 6. Some steps in FIGS. 3 and 4 were not traversed in the above operation example, but because FIGS. 3 and 4 are complementary with respect to the two slots, their description is omitted.
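The conversion from FIG. 5 to FIG. 6 can be illustrated with a minimal Python sketch. It is not the flowchart of FIGS. 3 and 4: it is a simplified greedy pass that omits the counter checks (C1, C2, B1, B2) and any dependency analysis, and the name `compress` and the tuple encoding are hypothetical. The seven-instruction input is an assumed reconstruction of FIG. 5 inferred from the walkthrough and from the FIG. 6 program described in the timing example; the figure itself is not reproduced in this section.

```python
NOP = None  # stands for a nop code in a slot


def compress(program):
    """Greedy nop filling for a 2-slot program (simplified sketch).

    program: list of (op1, op2) tuples, with NOP marking an empty slot.
    Returns a list of ((op1, acc1), (op2, acc2)); acc = 1 marks an
    operation the processor must accumulate instead of executing.
    """
    out = [[[a, 0], [b, 0]] for a, b in program]
    free = ([], [])   # positions of still-unfilled nops, per slot
    deleted = set()   # instructions whose operations were moved away
    for i, (a, b) in enumerate(program):
        if a is not NOP and b is not NOP and free[0] and free[1]:
            # Split the valid pair across the oldest free nop of each slot.
            for slot, op in enumerate((a, b)):
                out[free[slot].pop(0)][slot] = [op, 1]
            deleted.add(i)
            continue
        if a is NOP:
            free[0].append(i)
        if b is NOP:
            free[1].append(i)
    return [(tuple(x), tuple(y))
            for k, (x, y) in enumerate(out) if k not in deleted]


# Assumed reconstruction of the uncompressed program of FIG. 5.
FIG5 = [("A", NOP), ("B", "C"), ("D", NOP), (NOP, "E"),
        ("F", "G"), (NOP, "H"), ("I", "J")]

for number, instruction in enumerate(compress(FIG5), start=1):
    print(number, instruction)
```

Under these assumptions the seven instructions shrink to five, with operations F, G, I and J carrying an accumulation bit of 1.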

【００４７】2. Processor
FIG. 7 is a schematic configuration diagram of the processor.

【００４８】This processor has a three-stage pipeline structure consisting of an instruction fetch stage (hereinafter, IF stage), a decode and register read stage (hereinafter, DEC stage), and an execution stage (hereinafter, EX stage).

【００４９】In FIG. 7, 1 is a ROM that stores the machine language program. 2 and 3 are the I1 latch and the I2 latch, which store the contents of the first slot and the second slot, respectively, of a machine language instruction (hereinafter abbreviated as instruction). 4 and 5 are the first instruction decoder and the second instruction decoder, which decode the contents of the first slot and the second slot of the instruction held in the I1 latch 2 and the I2 latch 3, respectively, and control each part of the processor. 6 is a register file that stores operands. 7 and 8 are the D1 selector and the D2 selector, each of which selects one of two inputs: a part of the contents of the I1 latch 2 or the I2 latch 3, respectively, or the output of the register file 6. 9 and 10 are the D11 latch and the D12 latch, which store the outputs of the D1 selector 7 and the D2 selector 8, respectively. 11 and 12 are the D21 latch and the D22 latch, which store the output of the register file 6. 13 is the first arithmetic unit, which performs arithmetic and logic operations using the contents of the D11 latch 9 and the D21 latch 11, and 14 is the second arithmetic unit, which performs arithmetic and logic operations using the contents of the D12 latch 10 and the D22 latch 12. The outputs of the first arithmetic unit 13 and the second arithmetic unit 14 are both connected to the register file 6. 15 and 16 are the IB11 buffer and the IB12 buffer, which hold the contents of the first slot and the second slot of the instruction held in the I1 latch 2 and the I2 latch 3, respectively; together they are referred to as the IB1 buffer. 17 and 18 are the IB21 buffer and the IB22 buffer, which likewise hold the contents of the first slot and the second slot of the instruction held in the I1 latch 2 and the I2 latch 3, respectively; together they are referred to as the IB2 buffer. Contents are taken into the IB1 buffer and the IB2 buffer when the accumulation bit of the corresponding slot is "1". 23 and 24 are selectors that select and output either the IB1 buffer or the IB2 buffer. 19 is the I1 selector, which selects either the contents of the first slot of the instruction read from the ROM 1 or the output of the selector 23 and outputs it to the I1 latch 2. 20 is the I2 selector, which selects either the contents of the second slot of the instruction read from the ROM 1 or the output of the selector 24 and outputs it to the I2 latch 3. 21 and 22 are nop generators, which output nop (No Operation) when the accumulation bit of the data stored in the I1 latch 2 and the I2 latch 3, respectively, is "1". 25 and 26 are write signal generators, which output write signals that toggle between "0" and "1" each time the accumulation bit becomes "1", and which output "0" when the accumulation bit is "0". 27 and 28 are AND circuits that detect completion of instruction accumulation. 29 is an OR circuit that generates, among other signals, the signal that stops instruction fetch when an accumulated instruction is to be decoded and executed. 30 and 31 are clocked buffers. The nop generators 21 and 22 are each composed of AND circuits that compute the logical product of each bit of the output of the I1 latch 2 or the I2 latch 3 and the inverted accumulation bit; when the accumulation bit is "1", they output (00...0)₂, which means nop. The write signal generators 25 and 26 each consist of a T-type flip-flop and AND circuits. The output of the AND circuit that takes the logical product of the non-inverted output and the trigger input of the T-type flip-flop (the accumulation bit of the I1 latch 2 or the I2 latch 3) serves as the write signal to the IB11 buffer 15 or the IB12 buffer 16, and the output of the AND circuit that takes the logical product of the inverted output and the trigger input of the T-type flip-flop serves as the write signal to the IB21 buffer 17 or the IB22 buffer 18.
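The gate-level behaviour of the nop generators 21, 22 and the write signal generators 25, 26 can be sketched in Python as follows. This is an illustrative model only: the 16-bit slot width, the reset state of the flip-flop (chosen so that the first accumulation is steered to the IB1 side) and the names `nop_generator` and `WriteSignalGenerator` are assumptions not stated in the text.

```python
def nop_generator(word: int, acc_bit: int, width: int = 16) -> int:
    """AND every bit of the latched slot with the inverted accumulation bit.

    The slot width of 16 bits is an assumption for illustration.
    """
    mask = (1 << width) - 1 if acc_bit == 0 else 0
    return word & mask  # all zeros, i.e. nop, when acc_bit is 1


class WriteSignalGenerator:
    """T-type flip-flop plus two AND gates steering writes to IB1x or IB2x."""

    def __init__(self):
        # Assumed reset state: the first accumulation goes to the IB1 side.
        self.q = 1

    def step(self, acc_bit: int):
        write_ib1 = self.q & acc_bit        # non-inverted output AND trigger
        write_ib2 = (self.q ^ 1) & acc_bit  # inverted output AND trigger
        if acc_bit:                         # the trigger toggles the flip-flop
            self.q ^= 1
        return write_ib1, write_ib2


gen = WriteSignalGenerator()
for bit in (1, 0, 1, 1):
    print(bit, gen.step(bit))
```

In this model successive accumulations in one slot alternate between the IB1 side and the IB2 side, and a slot with accumulation bit "0" produces no write at all.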

【００５０】The register file 6 contains the general-purpose registers R0 through R7 and has four read ports and two write ports. That is, it allows four registers (duplicates allowed) to be read and two registers (duplicates not allowed) to be written at the same time. The D1 selector 7 and the D2 selector 8, under the direction of the first instruction decoder 4 and the second instruction decoder 5 respectively, select the constant value when the instruction carries a constant such as an immediate value.
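The port constraints of the register file and the role of the D1 and D2 selectors can be sketched as below. The class and function names are hypothetical, and raising an exception on a port violation is an assumption made for illustration; the text does not say how the hardware behaves when the constraints are violated.

```python
class RegisterFile:
    """Eight general-purpose registers with 4 read ports and 2 write ports."""

    def __init__(self):
        self.regs = [0] * 8  # R0 through R7

    def read(self, *indices):
        # Up to four registers per cycle; the same register may repeat.
        if len(indices) > 4:
            raise ValueError("at most four read ports")
        return tuple(self.regs[i] for i in indices)

    def write(self, *writes):
        # Up to two (index, value) pairs per cycle, with distinct targets.
        if len(writes) > 2:
            raise ValueError("at most two write ports")
        targets = [index for index, _ in writes]
        if len(set(targets)) != len(targets):
            raise ValueError("duplicate write target")
        for index, value in writes:
            self.regs[index] = value


def select_operand(register_value, immediate=None):
    """D1/D2 selector: pass the constant when the instruction carries one."""
    return register_value if immediate is None else immediate


rf = RegisterFile()
rf.write((1, 5), (2, 7))
print(rf.read(1, 2, 1, 2), select_operand(rf.regs[1], immediate=3))
```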

【００５１】This processor is based on instructions in the so-called VLIW (Very Long Instruction Word) format, in which a single instruction defines two operations, such as arithmetic operations. The operation in the first slot is stored in the I1 latch 2, decoded by the first instruction decoder 4, and executed by the first arithmetic unit 13. The operation in the second slot is stored in the I2 latch 3, decoded by the second instruction decoder 5, and executed by the second arithmetic unit 14. Because two operations are executed simultaneously in this way, a VLIW processor is highly efficient.

【００５２】2.1 Example of Processor Operation
The operation of the processor configured as above when the machine instruction program of FIG. 6 is stored in the ROM 1 is described below with reference to FIG. 8.

【００５３】FIG. 8 is an operation timing chart of the processor when the machine instruction program of FIG. 6 is stored in the ROM 1. For each timing, called a machine cycle, the chart shows the instruction read from the ROM 1 in the IF stage of the pipeline, the instruction decoded in the DEC stage, the instruction executed in the EX stage, and the instructions held in the IB1 buffer and the IB2 buffer. The operation is described below for each timing in chronological order. In the figure, ":" marks the boundary between slots, with the first slot on the left and the second slot on the right, and "−" indicates that no valid operation is held or that the unit is not operating.

【００５４】As the initial state, the IB11 buffer 15, the IB12 buffer 16, the IB21 buffer 17, and the IB22 buffer 18 are assumed to have been reset.

【００５５】(Timing t1)
・IF stage: instruction 1
Instruction 1 is read from the ROM 1. Its first slot (accumulation bit "0", operation A) is stored in the I1 latch 2, and its second slot (accumulation bit "1", operation G) is stored in the I2 latch 3. Because no operation has yet been accumulated in the IB buffers (their accumulation bits are not "1"), the I1 selector 19 and the I2 selector 20 both select and output the output of the ROM 1.

【００５６】(Timing t2)
・DEC stage: instruction 1
The contents of the I2 latch 3, whose accumulation bit is "1" (operation G), are taken into the IB12 buffer 16. Specifically, because the accumulation bit is "1", the write signal generator 26 enables the write signal of the IB12 buffer 16, and the contents of the I2 latch 3 are accumulated in the IB12 buffer 16. In addition, because the accumulation bit of the second slot of instruction 1 stored in the I2 latch 3 is "1", the nop generator 22 outputs nop, (00...0)₂, and the second instruction decoder 5 outputs a decode result that causes substantially no operation in the EX stage.

【００５７】Meanwhile, the first slot of instruction 1 stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation A. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D11 latch 9 and the D21 latch 11.
・IF stage: instruction 2
Instruction 2 is read from the ROM 1. Its first slot (accumulation bit "0", operation B) is stored in the I1 latch 2, and its second slot (accumulation bit "0", operation C) is stored in the I2 latch 3.

【００５８】(Timing t3)
・EX stage: instruction 1
The operands stored in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation A. The result is stored in a general-purpose register of the register file 6 as required. Operation G, on the other hand, has an accumulation bit of "1" and has been invalidated to nop by the nop generator 22, so the second arithmetic unit 14 does not operate.
・DEC stage: instruction 2
The first slot of instruction 2 stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation B. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Meanwhile, the second slot of instruction 2 stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation C. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D12 latch 10 and the D22 latch 12. Because the accumulation bits of operations B and C are both "0", the write signal of none of the IB buffers is enabled, and no write takes place.
・IF stage: instruction 3
Instruction 3 is read from the ROM 1. Its first slot (accumulation bit "0", operation D) is stored in the I1 latch 2, and its second slot (accumulation bit "1", operation J) is stored in the I2 latch 3.

【００５９】(Timing t4)
・EX stage: instruction 2
The operands stored in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation B. The result is stored in a general-purpose register of the register file 6 as required. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation C. The result is stored in a general-purpose register of the register file 6 as required.
・DEC stage: instruction 3
The contents of the I2 latch 3, whose accumulation bit is "1" (operation J), are taken into the IB22 buffer 18. Specifically, because the accumulation bit is "1", data is to be written to either the IB12 buffer 16 or the IB22 buffer 18; since data has already been written to the IB12 buffer 16, the write signal generator 26 enables the write signal of the IB22 buffer 18. In addition, because the accumulation bit of the second slot of instruction 3 stored in the I2 latch 3 is "1", the nop generator 22 outputs nop, and the second instruction decoder 5 outputs a decode result that causes substantially no operation in the EX stage.

【００６０】Meanwhile, the first slot of instruction 3 stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation D. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D11 latch 9 and the D21 latch 11.
・IF stage: instruction 4
Instruction 4 is read from the ROM 1. Its first slot (accumulation bit "1", operation F) is stored in the I1 latch 2, and its second slot (accumulation bit "0", operation E) is stored in the I2 latch 3.

【００６１】(Timing t5)
・EX stage: instruction 3
The operands stored in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation D. The result is stored in a general-purpose register of the register file 6 as required. Operation J, on the other hand, has an accumulation bit of "1" and has been invalidated to nop by the nop generator 22, so the second arithmetic unit 14 does not operate.
・DEC stage: instruction 4
The contents of the I1 latch 2, whose accumulation bit is "1" (operation F), are taken into the IB11 buffer 15. Specifically, because the accumulation bit is "1", the write signal generator 25 enables the write signal of the IB11 buffer 15, and the contents of the I1 latch 2 are accumulated in the IB11 buffer 15. In addition, because the accumulation bit of the first slot of instruction 4 stored in the I1 latch 2 is "1", the nop generator 21 outputs nop, and the first instruction decoder 4 outputs a decode result that causes substantially no operation in the EX stage.

【００６２】Meanwhile, the second slot of instruction 4 stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation E. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D12 latch 10 and the D22 latch 12.
・IF stage: IB1 buffer accumulated instruction
Because the accumulation bits of the IB11 buffer 15 and the IB12 buffer 16 are both "1", the AND circuit 27 outputs "1", indicating that an instruction has been accumulated in the accumulation buffer, and the OR circuit 29 in turn outputs "1" to suspend instruction fetch. Instruction fetch is thereby suspended. At the same time, because the accumulation bit of the IB11 buffer 15 is "1", the selectors 23 and 24 select and output the IB1 buffer. Further, in response to the output of the OR circuit 29, the I1 selector 19 and the I2 selector 20 select the IB11 buffer 15 and the IB12 buffer 16, respectively, and the accumulated instruction is stored in the I1 latch 2 and the I2 latch 3. Since the instruction stored in the IB11 buffer 15 and the IB12 buffer 16 has now been used, the clocked buffer 30 adjusts the timing, and the contents of the IB11 buffer 15 and the IB12 buffer 16 are reset, setting their accumulation bits to "0". Although the buffers themselves are reset here, it is also acceptable to set only the accumulation bits to "0". Although omitted from the drawing, when the I1 selector 19 and the I2 selector 20 select an accumulated instruction, they set its accumulation bits to "0" before outputting it to the I1 latch 2 and the I2 latch 3. This prevents the nop generators 21 and 22 from invalidating the accumulated instruction to nop. The switching signal of the selectors 23 and 24 is taken from the accumulation bit of the IB11 buffer 15 alone for two reasons. First, whenever an accumulated instruction is executed, the accumulation bits of both the IB11 buffer 15 and the IB12 buffer 16 (or of both the IB21 buffer 17 and the IB22 buffer 18) are "1", so there is no need to look at the accumulation bit of the IB12 buffer 16 as well. Second, executing the instruction accumulated in the IB1 buffer implies that any instruction accumulated in the IB2 buffer is not yet to be executed. For the same reason, the switching signal need not be the accumulation bit of the IB11 buffer 15; the value of any one of the accumulation bits can serve as the switching signal.
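The fetch-control logic described for timing t5 can be written out as a small Python function. This is a hedged sketch: the function name `fetch_control` and the string labels are hypothetical, and pairing the AND circuit 28 with the IB2 buffer is an assumption, since the text names only the AND circuit 27 when describing the detection.

```python
def fetch_control(ib11: int, ib12: int, ib21: int, ib22: int):
    """Model of AND circuits 27/28, OR circuit 29, and selectors 23/24.

    Each argument is the accumulation bit (0 or 1) held with that buffer.
    Returns (stall, source): stall is the OR-circuit output that suspends
    fetch from the ROM, and source names what reaches the I1/I2 latches.
    """
    ib1_full = ib11 & ib12       # AND circuit 27
    ib2_full = ib21 & ib22       # assumed to be AND circuit 28
    stall = ib1_full | ib2_full  # OR circuit 29
    if not stall:
        return 0, "ROM"
    # Selectors 23 and 24 look only at the IB11 accumulation bit.
    return 1, ("IB1" if ib11 else "IB2")


# Timing t5: F is in IB11, G in IB12, J in IB22.
print(fetch_control(1, 1, 0, 1))
# Timing t7: IB1 has been reset; I is in IB21, J in IB22.
print(fetch_control(0, 0, 1, 1))
```

Using a single accumulation bit as the switching signal relies on the ordering argument given in the text, namely that an instruction in the IB1 buffer is always consumed before a complete instruction in the IB2 buffer is selected.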

【００６３】(Timing t6)
・EX stage: instruction 4
Operation F has an accumulation bit of "1" and has been invalidated to nop by the nop generator 21, so the first arithmetic unit 13 does not operate. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation E. The result is stored in a general-purpose register of the register file 6 as required.
・DEC stage: IB1 buffer accumulated instruction
The first slot stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation F. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Meanwhile, the second slot stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation G. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D12 latch 10 and the D22 latch 12.
・IF stage: instruction 5
Instruction 5 is read from the ROM 1. Its first slot (accumulation bit "1", operation I) is stored in the I1 latch 2, and its second slot (accumulation bit "0", operation H) is stored in the I2 latch 3.

【００６４】(Timing t7)
・EX stage: IB1 buffer accumulated instruction
The operands stored in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation F. The result is stored in a general-purpose register of the register file 6 as required. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation G. The result is stored in a general-purpose register of the register file 6 as required.
・DEC stage: instruction 5
The contents of the I1 latch 2, whose accumulation bit is "1" (operation I), are taken into the IB21 buffer 17. Specifically, because the accumulation bit is "1", data is to be written to either the IB11 buffer 15 or the IB21 buffer 17; since data has already been written to the IB11 buffer 15, the write signal generator 25 enables the write signal of the IB21 buffer 17. In addition, because the accumulation bit of the first slot of instruction 5 stored in the I1 latch 2 is "1", the nop generator 21 outputs nop, and the first instruction decoder 4 outputs a decode result that causes substantially no operation in the EX stage.

【００６５】Meanwhile, the second slot of instruction 5 stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation H. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D12 latch 10 and the D22 latch 12.
・IF stage: IB2 buffer accumulated instruction
Because the accumulation flags of the IB21 buffer 17 and the IB22 buffer 18 are both "1", the AND circuit 27 outputs "1", indicating that an instruction has been accumulated in the accumulation buffer, and the OR circuit 29 in turn outputs "1" to suspend instruction fetch. Instruction fetch is thereby suspended. At the same time, because the accumulation bit of the IB11 buffer 15 is "0" (which means an instruction accumulated in the IB2 buffer may exist), the selectors 23 and 24 select and output the IB2 buffer. Further, in response to the output of the OR circuit 29, the I1 selector 19 and the I2 selector 20 select the IB21 buffer 17 and the IB22 buffer 18, respectively, and the accumulated instruction is stored in the I1 latch 2 and the I2 latch 3. Since the instruction stored in the IB21 buffer 17 and the IB22 buffer 18 has now been used, the clocked buffer 31 adjusts the timing, and the contents of the IB21 buffer 17 and the IB22 buffer 18 are reset, setting their accumulation flags to "0".

【００６６】(Timing t8)
・EX stage: instruction 5
Operation I has an accumulation bit of "1" and has been invalidated to nop by the nop generator 21, so the first arithmetic unit 13 does not operate. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation H. The result is stored in a general-purpose register of the register file 6 as required.
・DEC stage: IB2 buffer accumulated instruction
The first slot stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation I. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Meanwhile, the second slot stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation J. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or a constant value in the instruction, are stored in the D12 latch 10 and the D22 latch 12.

【００６７】(Timing t9)
・EX stage: IB2 buffer accumulated instruction
The operands stored in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation I. The result is stored in a general-purpose register of the register file 6 as required. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation J. The result is stored in a general-purpose register of the register file 6 as required.
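The nine timings t1 through t9 can be reproduced with a short cycle-level model in Python. This is a sketch under stated assumptions rather than a description of the actual circuit: the function name `simulate`, the tuple encoding of the FIG. 6 program (taken from the operations and accumulation bits named in the walkthrough), and the per-cycle ordering of DEC before IF are all illustrative choices. `None` stands for a slot in which no arithmetic unit operates.

```python
def simulate(rom):
    """Return, per machine cycle, the pair of operations in the EX stage.

    rom: list of ((op1, acc1), (op2, acc2)) instructions.
    """
    buffers = [[None, None], [None, None]]  # IB1 = [IB11, IB12], IB2 = [IB21, IB22]
    next_buf = [0, 0]   # per slot: buffer that receives the next accumulation
    pc = 0
    latch = None        # contents of the I1/I2 latches (fetched last cycle)
    decoded = None      # operations decoded last cycle (enter EX this cycle)
    trace = []
    while pc < len(rom) or latch is not None or decoded is not None:
        executing = decoded
        # DEC stage: accumulate slots whose bit is 1, decode the others.
        now_decoded = None
        if latch is not None:
            ops = []
            for slot, (op, acc) in enumerate(latch):
                if acc:
                    buffers[next_buf[slot]][slot] = op
                    next_buf[slot] ^= 1   # alternate like the T flip-flop
                    ops.append(None)      # nop generator output
                else:
                    ops.append(op)
            now_decoded = tuple(ops)
        # IF stage: a complete buffer pre-empts the fetch from ROM.
        full = next((b for b in buffers if None not in b), None)
        if full is not None:
            latch = ((full[0], 0), (full[1], 0))  # accumulation bits cleared
            full[0] = full[1] = None              # buffer reset after use
        elif pc < len(rom):
            latch = rom[pc]
            pc += 1
        else:
            latch = None
        decoded = now_decoded
        trace.append(executing)
    return trace


# FIG. 6 program as described for timings t1 to t6.
FIG6 = [(("A", 0), ("G", 1)), (("B", 0), ("C", 0)), (("D", 0), ("J", 1)),
        (("F", 1), ("E", 0)), (("I", 1), ("H", 0))]

for t, ex in enumerate(simulate(FIG6), start=1):
    print(f"t{t}: EX = {ex}")
```

In this model the EX stage is idle at t1 and t2 and then executes A, B:C, D, E, F:G, H and I:J at t3 through t9, matching the order given in the walkthrough.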

【００６８】3. Recording Medium
Embodiments of the recording medium of the present invention include a magnetic disk (such as a floppy disk or a hard disk), an optical disk (such as a CD-ROM or a PD), a magneto-optical disk, and a semiconductor memory (such as a ROM or a flash memory) on which the machine instruction program 112 of FIG. 6 is recorded.

[0069] As described above, according to this embodiment, the machine instruction compression unit 109 of the compiler extracts pairs consisting of a first-slot nop code and a second-slot nop code taken in the same order of appearance, replaces the first and second slots of each such nop pair with the first-slot and second-slot operations of the first pair of valid operations that appears after it, and deletes the pair of valid operations used for the replacement. This reduces the wasted space in instructions and therefore the program size.

[0070] According to the processor of this embodiment, an IB1 buffer and an IB2 buffer are provided to accumulate the valid operations embedded at the scattered positions formerly occupied by nop codes, and the accumulated operations are executed as soon as two valid operations are present in either the IB1 buffer or the IB2 buffer. The compressed machine instruction program can therefore be executed while maintaining the conventional processing performance.

[0071] Furthermore, because this embodiment is based on the idea of filling the position of a nop code with a valid operation from the same slot as that nop code, operations never need to be transferred between the first slot and the second slot, which simplifies the configuration of the processor. Specifically, an operation in the I1 latch 2 is accumulated only in the IB11 buffer 15 or the IB21 buffer 17, and operations accumulated there need only be returned to the I1 latch 2. Conversely, an operation in the I2 latch 3 is accumulated only in the IB12 buffer 16 or the IB22 buffer 18, and operations accumulated there need only be returned to the I2 latch 3. Neither a transfer path nor transfer control means between the first slot and the second slot is required.

[0072] In the processor of this embodiment, the I1 selector 19 and the I2 selector 20 are provided on the input side of the I1 latch 2 and the I2 latch 3, respectively. They may instead be provided on the output side of the I1 latch 2 and the I2 latch 3 so as to select the inputs of the first instruction decoder 4 and the second instruction decoder 5. In that case, the inputs to the IB1 buffer and the IB2 buffer must be changed so that they are taken directly from the ROM 1 in the IF stage. Loading into the IB1 and IB2 buffers and the selection performed by the I1 selector 19 and the I2 selector 20 can still be controlled by the value of the accumulation bits, as in this embodiment.

[0073] The processor of this embodiment has two accumulation buffers, the IB1 buffer and the IB2 buffer, but any number may be provided. The more accumulation buffers there are, the more opportunities there are to fill nop codes with valid operations, and the further the program size can be reduced. This is readily seen from the fact that, if the processor of this embodiment had no IB2 buffer, the nop code in the second slot of instruction 3 in FIG. 5 could not be filled with a valid operation.

[0074] (Embodiment 2)
Embodiment 2 increases, relative to Embodiment 1, the freedom with which nop code slots can be filled with valid operations.

[0075] 1. Compiler
The configuration of the compiler is the same as that described in Embodiment 1 except for the operation of the machine instruction compression unit 109. The machine instruction compression unit 109 is shown in FIGS. 10 to 12 and operates on the following principle.

[0076] The uncompressed machine instruction program is searched in instruction order, and two nop codes that are consecutive in order of appearance are extracted, regardless of whether each is in the first slot or the second slot. The slots of these nop codes are replaced with the first-slot and second-slot operations, respectively, of the first pair of valid operations that appears after the two nop codes, and the replacement is marked. The pair of valid operations used for the replacement is then deleted, and the deletion is marked in either the first slot or the second slot of the instruction immediately preceding the deleted pair.
In other words, whereas the compiler of Embodiment 1 removed nops slot by slot, the compiler of this embodiment disregards slots and replaces nops with valid operations in order of appearance. Nops can therefore be replaced with valid operations even when they are concentrated in one of the slots.
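The principle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the flowcharts of FIGS. 10 to 12: the tuple-based program representation, the names `compress_in_order`, `stored` and `run_buffer_after`, and the sample program are all introduced here for illustration, and the sketch does not model how many accumulation buffers the processor has.

```python
NOP = None  # marker for an empty slot (hypothetical representation)

def compress_in_order(program):
    """Fill nop slots, in order of appearance, from later fully valid pairs.

    program: list of (slot1, slot2) tuples; NOP marks a nop code.
    Returns the kept instructions as dicts (illustrative format only).
    """
    kept = []      # instructions that survive compression
    pending = []   # (index into kept, slot index) of nops not yet filled
    for slot1, slot2 in program:
        full_pair = slot1 is not NOP and slot2 is not NOP
        # A pair is moved only if two nops are waiting and the preceding
        # instruction is not already marking another deleted pair.
        if full_pair and len(pending) >= 2 and not kept[-1]["run_buffer_after"]:
            for op in (slot1, slot2):
                index, slot = pending.pop(0)
                kept[index]["ops"][slot] = op
                kept[index]["stored"][slot] = True   # "replaced" marking
            kept[-1]["run_buffer_after"] = True       # "deleted" marking
            continue                                  # the pair's instruction is dropped
        kept.append({"ops": [slot1, slot2],
                     "stored": [False, False],
                     "run_buffer_after": False})
        for slot, op in enumerate((slot1, slot2)):
            if op is NOP:
                pending.append((len(kept) - 1, slot))
    return kept

# Hypothetical seven-instruction program with four scattered nops.
program = [("A", NOP), ("B", "C"), ("D", NOP), (NOP, "E"),
           ("F", "G"), (NOP, "H"), ("I", "J")]
compressed = compress_in_order(program)
for instruction in compressed:
    print(instruction["ops"], instruction["run_buffer_after"])
```

With this sample input the seven instructions shrink to five. A real implementation would also have to respect the number of accumulation buffers available in the processor.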

[0077] 1.1 Operation example of the machine instruction compression unit 109
FIG. 9 shows an example of a compressed machine instruction program, obtained when the machine instruction compression unit 109 compresses the uncompressed machine instruction program of FIG. 5 by the procedure described above. Each compressed instruction consists of two slots, a first and a second, and each slot consists of accumulation bits, a position bit, and an operation (OP) field. The symbols A to J denote valid operations. The accumulation bits and the position bit are encoded as follows.
00, 01: do nothing
10: accumulate in the IB1 buffer
11: accumulate in the IB2 buffer
Specifically, operation F and operation G of instruction 5 in FIG. 5 are embedded in the nop code slots of instruction 1 and instruction 3, operation I and operation J of instruction 7 are embedded in the nop code slots of instruction 4 and instruction 6, the accumulation bits of the slots filled in this way are set to 01, and instruction 5 and instruction 7 are deleted. Operations F, G, I, and J are assumed to be accumulated, in this order, in the first slot of the IB1 buffer, the second slot of the IB1 buffer, the first slot of the IB2 buffer, and the second slot of the IB2 buffer. The accumulation bits of the second slot of instruction 4, which immediately precedes the deleted instruction 5, are set to 10, and those of the second slot of instruction 6, which immediately precedes the deleted instruction 7, are set to 11. The accumulation bits of all other slots are 00. The machine instruction program generated in this way is the one shown in FIG. 9. Note that instruction 5 of FIG. 9 is generated from instruction 6 of FIG. 5.

[0078] FIG. 10 differs from FIG. 2 in that there is only one nop counter (S501) and in that a position bit is set (S505). A single nop counter suffices because, unlike Embodiment 1, this embodiment does not need to distinguish between slots. This nop counter, however, serves an entirely different purpose from the nop counters of FIG. 2: it alternates between "0" and "1" each time a nop appears, in order to determine the value of the position bit.

[0079] FIGS. 11 and 12 are basically the same as FIGS. 3 and 4, but differ significantly in that the value of the position bit is determined by the nop counter (S609, S709). They also differ in that, because of the purpose of the nop counter described above, C is set to 0 when an instruction is deleted (S611, S711).
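The behaviour of the single nop counter can be illustrated with a short Python sketch. It is only a reading of the two paragraphs above: the event list, the function name `assign_position_bits`, and the choice that the first nop receives position bit 0 are assumptions made here, not details taken from the flowcharts.

```python
def assign_position_bits(events):
    """Return the position bit given to each nop, in order of appearance.

    events: sequence of "nop" (a nop code was found) or
            "delete" (an instruction was deleted).
    """
    counter = 0            # the single nop counter C
    bits = []
    for event in events:
        if event == "nop":
            bits.append(counter)
            counter ^= 1   # alternate 0, 1, 0, 1, ...
        elif event == "delete":
            counter = 0    # C = 0 after a deletion (S611, S711)
        else:
            raise ValueError(f"unknown event: {event!r}")
    return bits

bits = assign_position_bits(["nop", "nop", "nop", "delete", "nop", "nop"])
print(bits)
```

Resetting the counter on deletion keeps the position bits aligned with the start of the next buffered pair even when an odd number of nops preceded the deletion.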

[0080] 2. Processor
FIG. 13 is a schematic configuration diagram of the IF stage portion of the processor.

[0081] The portions of the DEC stage and the EX stage that are not shown have the same configuration as in FIG. 7, and components identical to those of Embodiment 1 bear the same reference numerals. The difference from FIG. 7 is the provision of selectors 32 and 33. Depending on the value of the position bit, an operation held in the I1 latch 2 can be accumulated in the IB12 buffer 16 or the IB22 buffer 18, and an operation held in the I2 latch 3 can be accumulated in the IB11 buffer 15 or the IB21 buffer 17, so that nops can be reduced further than in Embodiment 1. The other operations are the same as in Embodiment 1, and their description is omitted.

[0082] 3. Recording medium
Embodiments of the recording medium of the present invention include a magnetic disk (such as a floppy disk or a hard disk), an optical disc (such as a CD-ROM or a PD), a magneto-optical disk, and a semiconductor memory (such as a ROM or a flash memory) on which the machine instruction program of FIG. 9 is recorded.

[0083] As described above, according to this embodiment, the machine instruction compression unit 109 of the compiler extracts two nop codes that are consecutive in order of appearance, regardless of whether each is in the first slot or the second slot, replaces the slots of these nop codes with the first-slot and second-slot operations of the first pair of valid operations that appears after the two nop codes, and deletes the pair of valid operations used for the replacement. This reduces the wasted space in instructions and therefore the program size.

[0084] According to the processor of this embodiment, an IB1 buffer and an IB2 buffer are provided to accumulate the valid operations embedded at the scattered positions formerly occupied by nop codes, and the accumulated operations are executed when the accumulation bits in the instruction immediately preceding the position at which they are to be executed designate either the IB1 buffer or the IB2 buffer. The compressed machine instruction program can therefore be executed while maintaining the conventional processing performance.

[0085] Furthermore, because this embodiment is based on the idea of filling nop codes with valid operations in order of appearance regardless of slot position, there is no need to identify whether a nop code is in the first slot or the second slot, and the configuration of the compiler is simpler than that of Embodiment 1.

[0086] In the processor of this embodiment, the I1 selector 19 and the I2 selector 20 are provided on the input side of the I1 latch 2 and the I2 latch 3, respectively. They may instead be provided on the output side of the I1 latch 2 and the I2 latch 3 so as to select the inputs of the first instruction decoder 4 and the second instruction decoder 5. In that case, the inputs to the IB1 buffer and the IB2 buffer must be changed so that they are taken directly from the ROM 1 in the IF stage, and the IB1 selector 31 and the IB2 selector 32 must be changed so that they are controlled by the accumulation bits of the instruction read from the ROM 1. Loading into the IB1 and IB2 buffers and the selection performed by the I1 selector 19 and the I2 selector 20 can still be controlled by the value of the accumulation bits, as in this embodiment.

[0087] The processor of this embodiment has two accumulation buffers, the IB1 buffer and the IB2 buffer, but any number may be provided. The more accumulation buffers there are, the more opportunities there are to fill nop codes with valid operations, and the further the program size can be reduced. This is readily seen from the fact that, if the processor of this embodiment had no IB2 buffer, the nop code in the first slot of instruction 4 in FIG. 5 could not be filled with a valid operation.

[0088] (Embodiment 3)
Embodiment 3 is a compiler and a processor for a VLIW architecture that executes three operations in parallel using instructions that have only two slots.

[0089] 1. Compiler
The configuration of the compiler is the same as that described in Embodiment 1 except for the operations of the machine instruction generation unit 107 and the machine instruction compression unit 109. The machine instruction generation unit 107 takes the intermediate code stored in the intermediate code buffer 106, schedules instructions for three-way parallel execution (Embodiment 1 used two-way parallel execution), generates an uncompressed machine instruction program, and writes it to the provisional output buffer 108. The machine instruction compression unit 109 operates on the following principle.

[0090] The uncompressed machine instruction program is searched in instruction order, and three nop codes that are consecutive in order of appearance are extracted, regardless of whether each is in the first slot or the second slot (the third slot is excluded). The slots of these nop codes are replaced with the first-slot to third-slot operations, respectively, of the first instruction after the three nop codes in which three valid operations are specified, and the replacement is marked. The instruction specifying the three valid operations used for the replacement is then deleted, and the deletion is marked in either the first slot or the second slot of the instruction immediately preceding the deleted instruction.
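A minimal Python sketch of this three-operation variant follows. The data layout, the names `compress_three_wide`, `stored` and `run_buffer_after`, and the sample program are assumptions made for illustration; the sketch also assumes that an instruction which is not fully valid has a nop in its third slot, a case the text above does not discuss.

```python
NOP = None  # marker for an empty slot (hypothetical representation)

def compress_three_wide(program):
    """program: list of (slot1, slot2, slot3) tuples scheduled three-wide.
    Returns two-slot instructions as dicts (illustrative format only)."""
    kept = []      # surviving two-slot instructions
    pending = []   # (index into kept, slot index) of unfilled slot-1/slot-2 nops
    for slot1, slot2, slot3 in program:
        all_valid = NOP not in (slot1, slot2, slot3)
        if all_valid and len(pending) >= 3 and not kept[-1]["run_buffer_after"]:
            # Spread the three operations over the three oldest nop slots.
            for op in (slot1, slot2, slot3):
                index, slot = pending.pop(0)
                kept[index]["ops"][slot] = op
                kept[index]["stored"][slot] = True   # "replaced" marking
            kept[-1]["run_buffer_after"] = True       # "deleted" marking
            continue                                  # the instruction is dropped
        if slot3 is not NOP:
            # Limitation of this sketch, not a statement about the invention.
            raise ValueError("this sketch cannot keep a third-slot operation")
        kept.append({"ops": [slot1, slot2],
                     "stored": [False, False],
                     "run_buffer_after": False})
        for slot, op in enumerate((slot1, slot2)):
            if op is NOP:
                pending.append((len(kept) - 1, slot))
    return kept

# Hypothetical uncompressed program: four sparse instructions, then F:G:H.
program = [("A", NOP, NOP), ("B", "C", NOP), ("D", NOP, NOP),
           (NOP, "E", NOP), ("F", "G", "H")]
compressed = compress_three_wide(program)
for instruction in compressed:
    print(instruction["ops"], instruction["run_buffer_after"])
```

Third-slot nops simply disappear because the compressed instruction format has only two slots; only the nops in the first and second slots are candidates for filling.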

[0091] 1.1 Operation example of the machine instruction compression unit 109
FIG. 15 shows an example of a compressed machine instruction program, obtained when the machine instruction compression unit 109 compresses the uncompressed machine instruction program of FIG. 14 by the procedure described above. Each compressed instruction consists of two slots, a first and a second, and each slot consists of two control bits and an operation (OP) field. The symbols A to H denote valid operations. The two bits, an accumulation bit (left) and an execution bit (right), are encoded as follows.
00: do nothing
10: the operation is a replacement; accumulate it in the IB buffer sequentially, in the order of first, second, and third slot
01: the immediately following instruction has been deleted; execute the instruction in the IB buffer
11: (unused)
Specifically, operation F, operation G, and operation H of instruction 5 in FIG. 14 are embedded in the nop code slots at the second slot of instruction 1, the second slot of instruction 3, and the first slot of instruction 4, the accumulation bits of the slots filled in this way are set to 01, and instruction 5 is deleted. Operations F, G, and H are assumed to be accumulated, in this order, in the first slot, the second slot, and the third slot of the IB buffer. In the second slot of instruction 4, which immediately precedes the deleted instruction 5, the accumulation bit is set to "1" and the execution bit to "0". In all other slots the accumulation bit is set to "0" and the execution bit to "0". The machine instruction program generated in this way is the one shown in FIG. 15. The "IB buffer" is described next.
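The two-bit encoding listed above (00, 10, 01, with 11 unused) can be written down as a small Python decoder. The bit layout is an assumption made here: the two control bits are taken to be the high-order bits of a slot word, and `op_width`, `SlotControl` and `decode_slot` are hypothetical names, since the width of the OP field is not given in this section.

```python
from enum import Enum

class SlotControl(Enum):
    NONE = 0b00             # ordinary slot, nothing special to do
    EXECUTE_BUFFER = 0b01   # next instruction was deleted: run the IB buffer
    ACCUMULATE = 0b10       # operation is a replacement: store it in the IB buffer
    # 0b11 is unused in the encoding table

def decode_slot(word, op_width):
    """Split a slot word into (control, op).

    op_width is a hypothetical OP-field width in bits.
    """
    control = SlotControl(word >> op_width)   # raises ValueError for the unused 0b11
    op = word & ((1 << op_width) - 1)
    return control, op

print(decode_slot(0b100101, 4))
```

Treating the unused code as an error is a design choice of this sketch; the text does not say how a processor should react to it.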

[0092] 2. Processor
FIG. 16 is a schematic configuration diagram of the processor.

[0093] Compared with FIG. 7, in order to execute three operations in parallel using instructions that have only two slots, this processor accumulates operations from two-slot instructions in three buffers, including the IB3 buffer 41, and thereby converts them internally into a three-slot instruction. It differs from FIG. 7 in having an I3 latch 38, a nop generator 39, and a third instruction decoder 40 for supplying the operation of the third slot, and a D3 selector 34, a D13 latch 35, a D23 latch 36, and a third arithmetic unit 37 for executing the operation of the third slot. In addition, a ring counter 42 enables the write signals of the IB1 buffer 15, the IB2 buffer 16, and the IB3 buffer 41 in turn.
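The role of the ring counter 42 can be sketched as a three-bit one-hot counter in Python. The state values used here, (001) as the initial state followed by (100), (010) and (001), are the ones quoted in the operating example of this embodiment; the class name, the method name and the mapping of each bit to a buffer's write enable are assumptions for illustration.

```python
class RingCounter:
    """Three-bit one-hot counter selecting which IB buffer is written next."""

    # Assumed mapping from one-hot state to the enabled buffer.
    WRITE_ENABLE = {0b100: "IB1", 0b010: "IB2", 0b001: "IB3"}

    def __init__(self):
        self.state = 0b001   # initial state quoted in the operating example

    def advance(self):
        # Rotate right by one with wrap-around: 001 -> 100 -> 010 -> 001.
        self.state = (self.state >> 1) | ((self.state & 1) << 2)
        return self.WRITE_ENABLE[self.state]

counter = RingCounter()
order = [counter.advance() for _ in range(4)]
print(order)
```

Because the counter returns to (001) after the third write, the next group of three accumulated operations again starts at the IB1 buffer without any explicit reset.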

[0094] 2.1 Operation example of the processor
The operation of the processor configured as above when the machine instruction program of FIG. 15 is stored in the ROM 1 is described below with reference to FIG. 17.
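Before the cycle-by-cycle description, the overall effect can be previewed with a small behavioural model in Python. It is a sketch under stated assumptions: it models only the order in which operations reach the three arithmetic units, not the overlap of the IF, DEC and EX stages; it follows the encoding table of the compiler section (10 accumulates, 01 runs the IB buffer after the current instruction); and the slot contents of the sample program are inferred from the description of FIG. 15 rather than copied from the figure. In the output, ":" separates the three slots and "-" marks a unit that does nothing.

```python
def issue_trace(compressed):
    """compressed: list of two-slot instructions; each slot is (bits, op)
    with bits "00", "10" (accumulate) or "01" (run the IB buffer next)."""
    ib = []        # operations accumulated in IB1, IB2, IB3 order
    trace = []     # one "slot1:slot2:slot3" string per issued instruction
    for instruction in compressed:
        lanes = []
        run_buffer = False
        for bits, op in instruction:
            if bits == "10":
                ib.append(op)        # goes to the buffer, the slot becomes a nop
                lanes.append("-")
            else:
                lanes.append(op)
                run_buffer = run_buffer or bits == "01"
        lanes.append("-")            # third unit is idle for fetched instructions
        trace.append(":".join(lanes))
        if run_buffer:
            trace.append(":".join((ib + ["-"] * 3)[:3]))
            ib = []                  # the buffer is reset once issued
    return trace

# Assumed content of the compressed program (four two-slot instructions).
program = [[("00", "A"), ("10", "F")],
           [("00", "B"), ("00", "C")],
           [("00", "D"), ("10", "G")],
           [("10", "H"), ("01", "E")]]
trace = issue_trace(program)
for line in trace:
    print(line)
```

The four stored instructions produce five issue slots, the last of which is the three-wide instruction F:G:H rebuilt from the IB buffer.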

[0095] FIG. 17 is an operation timing chart of the processor when the machine instruction program of FIG. 15 is stored in the ROM 1. For each timing unit, called a machine cycle, the chart shows the instruction read from the ROM 1 in the IF stage of the pipeline, the instruction decoded in the DEC stage, the instruction executed in the EX stage, and the instruction held in the IB buffer. The operation is described below timing by timing, in order of elapsed time. In the chart, ":" separates slots, with the first slot on the left, the second slot in the centre, and the third slot on the right, and "-" indicates that no valid operation is held or that the unit does not act.

[0096] (Timing t1)
In the initial state, the IB1 buffer 15, the IB2 buffer 16, and the IB3 buffer 41 are assumed to have been reset, each holding (0...00)₂. The ring counter 42 is likewise set to (001)₂ in the initial state; when the first operation whose accumulation bit is "1" is stored in the I1 latch 2 or the I2 latch 3, the counter becomes (100)₂ and the operation is accumulated in the IB1 buffer 15.
IF stage: instruction 1
Instruction 1 is read from the ROM 1; its first slot (operation A) is stored in the I1 latch 2 and its second slot (operation F) in the I2 latch 3. The I3 latch 38 receives (0...00)₂ from the IB3 buffer 41.

[0097] (Timing t2)
DEC stage: instruction 1
The content of the I2 latch 3 (operation F), whose accumulation bit is "1", is taken into the IB1 buffer 15. Specifically, because this is the first operation whose accumulation bit is "1", the ring counter 42 outputs (100)₂, which enables the write signal of the IB1 buffer 15, and the content of the I2 latch 3 is accumulated in the IB1 buffer 15.

[0098] The first slot of instruction 1 held in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation A. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or constants contained in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Meanwhile, because the accumulation bit of the second slot of instruction 1 held in the I2 latch 3 is "1", the nop generator 22 outputs a nop, and the second instruction decoder 5 outputs a decode result that causes effectively no operation in the EX stage. In addition, because the third arithmetic unit 37 need not operate except when a three-slot instruction is executed, the nop generator 39 outputs a nop whenever the execution bit is "0".
IF stage: instruction 2
Instruction 2 is read from the ROM 1; its first slot (operation B) is stored in the I1 latch 2 and its second slot (operation C) in the I2 latch 3. The I3 latch 38 again receives (0...00)₂ from the IB3 buffer 41.

[0099] (Timing t3)
EX stage: instruction 1
The operands held in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation A. The result is written to a general-purpose register of the register file 6 as required. The second arithmetic unit 14 and the third arithmetic unit 37 do not act, because they have been disabled by the nop generators 22 and 39.
DEC stage: instruction 2
The first slot of instruction 2 held in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation B. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or constants contained in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Likewise, the second slot of instruction 2 held in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation C. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or constants contained in the instruction, are stored in the D12 latch 10 and the D22 latch 12. Because the execution bit of the I3 latch 38 is "0", the nop generator 39 outputs a nop, and the third instruction decoder 40 outputs a decode result that causes effectively no operation in the EX stage.
IF stage: instruction 3
Instruction 3 is read from the ROM 1; its first slot (accumulation bits (00)₂, operation D) is stored in the I1 latch 2 and its second slot (accumulation bits (01)₂, operation G) in the I2 latch 3. The I3 latch 38 again receives (0...00)₂ from the IB3 buffer 41.

[0100] (Timing t4)
EX stage: instruction 2
The operands held in the D11 latch 9 and the D21 latch 11 are input to the first arithmetic unit 13, which performs operation B. The result is written to a general-purpose register of the register file 6 as required. Meanwhile, the operands held in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation C. The result is written to a general-purpose register of the register file 6 as required. The third arithmetic unit 37 does not act, because it has been disabled by the nop generator 39.
DEC stage: instruction 3
The content of the I2 latch 3 (operation G), whose accumulation bits are (10)₂, is taken into the IB2 buffer 16. Specifically, the operation is almost the same as at timing t1, but because operation F has already been accumulated in the IB1 buffer 15, the ring counter 42 outputs (010)₂, which enables the write signal of the IB2 buffer 16, and the operation is accumulated in the IB2 buffer 16.

[0101] The first slot of instruction 3 held in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation D. Based on this decoding, general-purpose registers are read from the register file 6, and the values read, or constants contained in the instruction, are stored in the D11 latch 9 and the D21 latch 11. Meanwhile, because the accumulation bit of the second slot of instruction 3 held in the I2 latch 3 is "1", the nop generator 22 outputs a nop, and the second instruction decoder 5 outputs a decode result that causes effectively no operation in the EX stage. Because the execution flag is "0", the nop generator 39 outputs a nop, and the third instruction decoder 40 outputs a decode result that causes effectively no operation in the EX stage.
IF stage: instruction 4
Instruction 4 is read from the ROM 1; its first slot (operation H) is stored in the I1 latch 2 and its second slot (operation E) in the I2 latch 3. The I3 latch 38 again receives (0...00)₂ from the IB3 buffer 41.

[0102] (Timing t5)
EX stage: instruction 3. The operands stored in the D11 latch 9 and the D21 latch 55 are input to the first arithmetic unit 13, which performs operation D. The result is stored in a general-purpose register of the register file 6 as needed. The second arithmetic unit 14 and the third arithmetic unit 37, on the other hand, do nothing because they have been disabled by the nop generators 22 and 39.
DEC stage: instruction 4. The contents of the I1 latch 2 (operation H), whose accumulation bit is "1", are taken into the IB3 buffer 41. Because operations have already been accumulated in the IB1 buffer 15 and the IB2 buffer 16 at this point, the ring counter 42 outputs (001)₂, which enables the write signal of the IB3 buffer 41, and the operation is accumulated in the IB3 buffer 41. In addition, since the accumulation bit of the first slot of instruction 4 stored in the I1 latch 2 is "1", the nop generator 21 outputs a nop, and the first instruction decoder 4 outputs a decoding result that causes substantially no operation in the EX stage.
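The write-enable selection described for the DEC stage can be modelled in a few lines. The following Python sketch is an illustration only, under stated assumptions: the class name `RingCounter`, the one-hot codes `0b100` and `0b010` for the IB1 and IB2 buffers, the wrap-around after the third write, and the slot sequence fed to `accumulate` are all hypothetical. The text states only that the ring counter outputs (001)₂ to enable the IB3 buffer once the IB1 and IB2 buffers already hold operations.

```python
class RingCounter:
    """One-hot 3-bit ring counter selecting which IB buffer is written next."""

    def __init__(self):
        self.state = 0b100  # assumption: the IB1 buffer is written first

    def advance(self):
        # Shift the single set bit one position to the right, wrapping around.
        self.state = (self.state >> 1) or 0b100


def accumulate(slots):
    """Route every slot whose accumulation bit is 1 into the next IB buffer."""
    counter = RingCounter()
    buffers = {0b100: None, 0b010: None, 0b001: None}  # IB1, IB2, IB3
    enables = []
    for accumulation_bit, operation in slots:
        if accumulation_bit == 1:
            enables.append(counter.state)        # write enable for this slot
            buffers[counter.state] = operation
            counter.advance()
    return buffers, enables


# Hypothetical slot stream in fetch order: F and G are accumulated before
# operation H of instruction 4 arrives, as in the walkthrough.
slots = [(0, "A"), (1, "F"), (0, "B"), (0, "C"), (0, "D"), (1, "G"), (1, "H"), (0, "E")]
buffers, enables = accumulate(slots)
print([format(e, "03b") for e in enables])  # ['100', '010', '001']
print(buffers[0b001])                       # H
```

A one-hot counter is used here because each counter bit can drive one buffer's write enable directly, which matches the way the text ties the (001)₂ output to the IB3 write signal.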

[0103] Meanwhile, the second slot of instruction 4 stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation E. Based on this decoding, general-purpose registers are read from the register file 6, and the values read or the constant values in the instruction are stored in the D12 latch 11 and the D22 latch 12. Since the execution flag is "0", the nop generator 39 outputs a nop, and the third instruction decoder 40 outputs a decoding result that causes substantially no operation in the execution stage.
IF stage: IB-buffer-accumulated instruction. Since the execution bit of the second slot of instruction 4 stored in the I2 latch 3 is "1", the instruction fetch control unit suspends instruction fetch. At the same time, the I1 selector 19 and the I2 selector 20 select the IB1 buffer 15 and the IB2 buffer 16, respectively, and the contents of the IB1 buffer 15, the IB2 buffer 16, and the IB3 buffer 41 are stored in the I1 latch 2, the I2 latch 3, and the I3 latch 38. When the execution bit of the I3 latch 38 becomes "1", the contents of the IB buffers are reset.

[0104] (Timing t6)
EX stage: instruction 4. The first arithmetic unit 13 and the third arithmetic unit 37 do nothing because they have been disabled by the nop generator 21 and the nop generator 39. Meanwhile, the operands stored in the D12 latch 10 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation E. The result is stored in a general-purpose register of the register file 6 as needed.
DEC stage: IB-buffer-accumulated instruction. The first slot stored in the I1 latch 2 is decoded by the first instruction decoder 4 and found to be operation F. Based on this decoding, general-purpose registers are read from the register file 6, and the values read or the constant values in the instruction are stored in the D11 latch 9 and the D21 latch 55. The second slot stored in the I2 latch 3 is decoded by the second instruction decoder 5 and found to be operation G. Based on this decoding, general-purpose registers are read from the register file 6, and the values read or the constant values in the instruction are stored in the D12 latch 11 and the D22 latch 12. The third slot stored in the I3 latch 38 is decoded by the third instruction decoder 40. That is, since the execution bit is "1", the nop generator 39 passes the contents of the I3 latch 38 through unchanged, and the operation is found to be operation H. Based on this decoding, general-purpose registers are read from the register file 6, and the values read or the constant values in the instruction are stored in the D13 latch 35 and the D23 latch 36.

[0105] (Timing t7)
EX stage: IB-buffer-accumulated instruction. The operands stored in the D11 latch 9 and the D21 latch 55 are input to the first arithmetic unit 13, which performs operation F. The result is stored in a general-purpose register of the register file 6 as needed. The operands stored in the D12 latch 11 and the D22 latch 12 are input to the second arithmetic unit 14, which performs operation G. The result is stored in a general-purpose register of the register file 6 as needed. The operands stored in the D13 latch 35 and the D23 latch 36 are input to the third arithmetic unit 37, which performs operation H. The result is stored in a general-purpose register of the register file 6 as needed.
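The sequence from timing t5 to t7 can be summarised as a small behavioural model. The Python sketch below is not the circuit of the embodiment: the slot encoding `(accumulation_bit, execution_bit, operation)`, the function name `run`, and the contents of instructions 1 and 2 are assumptions chosen to be consistent with the walkthrough (instruction 3 holds operation D, instruction 4 holds operations H and E, and the IB buffers deliver F, G, and H).

```python
NOP = "nop"


def run(program):
    """Return the operations issued to the three arithmetic units per EX cycle.

    Each instruction is a list of two slots, and each slot is a tuple
    (accumulation_bit, execution_bit, operation).
    """
    ib = []      # contents of the IB1..IB3 buffers, in fill order
    trace = []
    for instruction in program:
        issued = []
        run_ib = False
        for accumulation_bit, execution_bit, operation in instruction:
            if accumulation_bit:
                ib.append(operation)  # stored for later, not executed now
                issued.append(NOP)    # the nop generator masks this slot
            else:
                issued.append(operation)
            run_ib = run_ib or bool(execution_bit)
        issued.append(NOP)            # third unit idles for a two-slot instruction
        trace.append(issued)
        if run_ib:
            # Fetch is suspended, the IB buffers are issued as one
            # three-slot instruction, and the buffers are then reset.
            trace.append(list(ib))
            ib.clear()
    return trace


program = [
    [(0, 0, "A"), (1, 0, "F")],  # assumed contents of instruction 1
    [(0, 0, "B"), (0, 0, "C")],  # assumed contents of instruction 2
    [(0, 0, "D"), (1, 0, "G")],  # instruction 3: second slot is accumulated
    [(1, 0, "H"), (0, 1, "E")],  # instruction 4: H accumulated, execution bit set
]
trace = run(program)
for cycle in trace:
    print(cycle)
```

The last three entries of the trace correspond to the EX stages at t5, t6, and t7: operation D alone, operation E alone, and then F, G, and H in parallel.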

[0106] 3. Recording medium
Embodiments of the recording medium of the present invention include a magnetic disk (such as a floppy disk or hard disk), an optical disk (such as a CD-ROM or PD), a magneto-optical disk, and a semiconductor memory (such as a ROM or flash memory) on which the machine instruction program of FIG. 15 is recorded.

[0107] As described above, according to this embodiment, the machine instruction compression unit 109 of the compiler extracts three nop codes that are consecutive in order of appearance, taken from the first and second slots (the third slot is excluded) regardless of which of the two slots each occupies. It replaces the slots of these nop codes with the first-slot to third-slot operations of the first instruction after the three nop codes in which three valid operations are specified, and deletes that instruction. This reduces wasted space in instructions and shrinks the program. In particular, what conventionally required three-way parallel execution by an instruction of three slots can be executed with instructions of two slots by using slots that would conventionally hold nop codes, so code efficiency is extremely high. In the operation example above, the 3 slots × 5 instructions = 15 slots of FIG. 12 are compressed to 2 slots × 4 instructions = 8 slots in FIG. 15.
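The compression step and the slot arithmetic (15 slots reduced to 8) can be reproduced with a short sketch. This is a simplified illustration, not the flow of the machine instruction compression unit 109: it handles a single fully occupied instruction, the function name `compress` and the dictionary fields `acc` and `exe` are hypothetical, and the uncompressed program is an assumed example that matches the slot counts and operation names in the text rather than a copy of the figure.

```python
NOP = "nop"


def compress(program):
    """Fold one fully occupied 3-slot instruction into three earlier nop slots."""
    # The donor is the first instruction whose three slots all hold valid operations.
    donor = next(i for i, ins in enumerate(program) if NOP not in ins)
    # nop codes in slots 1 and 2 (slot 3 excluded), in order of appearance.
    holes = [(i, s) for i in range(donor) for s in (0, 1) if program[i][s] == NOP]
    if len(holes) < 3:
        raise ValueError("fewer than three nop codes precede the donor")
    # Simplification: every other third slot must already be a nop.
    if any(ins[2] != NOP for i, ins in enumerate(program) if i != donor):
        raise ValueError("a third slot outside the donor holds a valid operation")
    out = [[{"op": op, "acc": 0, "exe": 0} for op in ins[:2]]
           for i, ins in enumerate(program) if i != donor]
    for (i, s), op in zip(holes[:3], program[donor]):
        out[i][s] = {"op": op, "acc": 1, "exe": 0}  # mark as accumulated
    out[donor - 1][1]["exe"] = 1  # run the IB buffers after this instruction
    return out


uncompressed = [
    ["A", NOP, NOP],
    ["B", "C", NOP],
    ["D", NOP, NOP],
    [NOP, "E", NOP],
    ["F", "G", "H"],
]
compressed = compress(uncompressed)
slots_before = sum(len(ins) for ins in uncompressed)
slots_after = sum(len(ins) for ins in compressed)
print(slots_before, slots_after)  # 15 8
```

Placing the execution mark on the second slot of the instruction just before the deleted one follows the walkthrough, where the execution bit of the second slot of instruction 4 triggers the IB buffers.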

[0108] Further, the processor of this embodiment is provided with IB buffers that accumulate the valid operations embedded at the scattered positions of what were conventionally nop codes, and it executes the accumulated operations when the IB buffers are designated by an accumulation bit in the instruction immediately preceding the position at which they are to be executed. As a result, the compressed machine instruction program can be executed while maintaining conventional processing performance.

[0109] In the processor of this embodiment, the I1 selector 19 and the I2 selector 20 are provided on the input sides of the I1 latch 2 and the I2 latch 43, respectively. They may instead be provided on the output sides of the I1 latch 2 and the I2 latch 3, respectively, so as to select the inputs of the first instruction decoder 4 and the second instruction decoder 5. In that case, the input to the IB buffers must be changed so that it is taken directly from the ROM 1 in the IF stage, and the IB selector 66 must be controlled by the value of the accumulation bit of the instruction read from the ROM 1. Loading into the IB buffers and selection by the I1 selector 19 and the I2 selector 20 may be controlled by the value of the accumulation bit, as in this embodiment.

[0110] In the processor of this embodiment, a single accumulation buffer, the IB buffer, is provided, but a plurality may be provided. The more accumulation buffers there are, the more opportunities there are to fill nop codes with valid operations, and the program size can be reduced further.

[0111] Furthermore, the processor of this embodiment has three instruction decoders and three arithmetic units and achieves up to three-way parallel execution, but four of each may be provided for four-way parallel execution, or more. For four-way parallel execution, valid operations may be embedded using four slots of two-slot instructions that would be nop codes if uncompressed, as in this embodiment, or using four slots of three-slot instructions that would be nop codes if uncompressed. In the former case, however, the IB buffers must be enlarged by one more slot. The former is more effective than the latter when a very large number of slots would be nop codes before compression, and a considerable improvement in code efficiency can be expected. In this way, even as the instruction parallelism of a VLIW processor increases, the growth in nop codes can be greatly reduced.

[0112] (Embodiment 4) Embodiment 4 is a modification of Embodiment 3 in which only the operation of the third slot is embedded in a nop-code slot of the first or second slot.

[0113] 1. Compiler
The configuration of the compiler is the same as that described in Embodiment 3 except for the operation of the machine instruction compression unit 109. The machine instruction compression unit 109 operates on the following principle.

[0114] The uncompressed machine instruction program is searched in instruction order, and one nop code is extracted from the first or second slot (the third slot is excluded), regardless of which of the two it occupies. The slot of this nop code is replaced with the third-slot operation of the first subsequent instruction that specifies a valid operation in its third slot, and the replacement is marked. The third slot of the instruction that supplied the valid operation is deleted, and the deletion is marked in either the first slot or the second slot of that instruction.

[0115] 1.1 Operation example of the machine instruction compression unit 109
FIG. 18 shows an example of a compressed machine instruction program, obtained when the machine instruction compression unit 109 compresses the uncompressed machine instruction program of FIG. 12 by the above procedure. A compressed instruction consists of two slots, the first and the second, and each slot consists of two accumulation bits and an operation (OP) field. The symbols A through H denote valid operations, and nop denotes a nop code, which is not a valid operation. The two accumulation bits are encoded as follows.
00: Do nothing.
01: The operation is a replacement and is to be accumulated in the IB buffer.
10: The third slot has been deleted, so the operation in the IB buffer is to be executed as the third slot.
11: (Unused)
Specifically, operation H of instruction 5 in FIG. 14, which is placed in the third slot, is embedded in the nop-code slot that is the second slot of instruction 1, the accumulation bits of the filled slot are set to 01, and the third slot of instruction 5 is deleted. Operation H is assumed to be accumulated in the IB buffer, and the accumulation bits of the second slot of instruction 5, whose third slot has been deleted, are set to 10 (the accumulation bits of the first slot could be used instead). The accumulation bits of all other slots are 00. The machine instruction program generated in this way is the one shown in FIG. 18. Here, the nop codes in the second slot of instruction 3 and the first slot of instruction 4 remain without being replaced. The "IB buffer" is described next.
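The marking scheme of this embodiment can be sketched as follows. The Python code is a minimal illustration under assumptions, not the procedure of the machine instruction compression unit 109 itself: the names `compress_third_slot`, `KEEP`, `ACCUMULATE`, and `RUN_IB` are hypothetical, the rule that the single IB buffer may hold only one pending operation is a simplification, and the uncompressed program is an assumed example consistent with the operation names and remaining nop positions described in the text.

```python
NOP = "nop"
KEEP, ACCUMULATE, RUN_IB = 0b00, 0b01, 0b10  # 0b11 is unused


def compress_third_slot(program):
    """Move each valid third-slot operation into an earlier nop slot."""
    out = [[[KEEP, op] for op in ins[:2]] for ins in program]
    holes = [(i, s) for i, ins in enumerate(program)
             for s in (0, 1) if ins[s] == NOP]
    last_use = -1  # instruction that last consumed the single IB buffer
    for j, ins in enumerate(program):
        if ins[2] == NOP:
            continue
        # Earliest nop slot after the previous IB use and before instruction j.
        hole = next((h for h in holes if last_use < h[0] < j), None)
        if hole is None:
            raise ValueError(f"no free nop slot for third-slot operation {ins[2]}")
        holes.remove(hole)
        out[hole[0]][hole[1]] = [ACCUMULATE, ins[2]]  # bits 01: store in IB buffer
        out[j][1][0] = RUN_IB                         # bits 10: run IB as slot 3
        last_use = j
    return out


uncompressed = [
    ["A", NOP, NOP],
    ["B", "C", NOP],
    ["D", NOP, NOP],
    [NOP, "E", NOP],
    ["F", "G", "H"],
]
compressed = compress_third_slot(uncompressed)
for ins in compressed:
    print(" | ".join(f"{bits:02b} {op}" for bits, op in ins))
```

With this input, operation H lands in the second slot of instruction 1 with bits 01, the second slot of instruction 5 carries bits 10, and the nop codes in instruction 3 and instruction 4 remain, giving 2 slots × 5 instructions = 10 slots.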

[0116] 2. Processor
FIG. 19 is a schematic configuration diagram of the IF stage portion of the processor.

[0117] The portions of the DEC stage and the EX stage that are not shown have the same configuration as in FIG. 16, and components identical to those in FIG. 16 are given the same reference numerals. This processor differs from the one shown in FIG. 14 in that it has only one IB buffer 50. Consequently, compared with FIG. 16, not only does a single IB buffer suffice, but the selectors 41 and 42 for accumulating into the three buffers from the left are also unnecessary, so the circuit can be simplified. The operation is the same as in Embodiment 3 except that the accumulation destination is fixed to the IB buffer 50, so its description is omitted.

[0118] 3. Recording medium
Embodiments of the recording medium of the present invention include a magnetic disk (such as a floppy disk or hard disk), an optical disk (such as a CD-ROM or PD), a magneto-optical disk, and a semiconductor memory (such as a ROM or flash memory) on which the machine instruction program of FIG. 18 is recorded.

[0119] As described above, according to this embodiment, the machine instruction compression unit 109 of the compiler extracts one nop code from the first or second slot (the third slot is excluded), regardless of which of the two it occupies, replaces the slot of this nop code with the third-slot operation of the first subsequent instruction that specifies a valid operation in its third slot, and deletes the third slot of the instruction that supplied the valid operation. This reduces wasted space in instructions and shrinks the program. In particular, what conventionally required three-way parallel execution by an instruction of three slots can be executed with instructions of two slots by using slots that would conventionally hold nop codes, so code efficiency is extremely high. In the operation example above, the 3 slots × 5 instructions = 15 slots of FIG. 12 are compressed to 2 slots × 5 instructions = 10 slots in FIG. 18.

[0120] Further, the processor of this embodiment is provided with an IB buffer that accumulates the valid operation embedded at the position of what was conventionally a nop code, and it executes the operations of an instruction and the accumulated operation in parallel when the IB buffer is designated by the accumulation bits in that instruction. As a result, the compressed machine instruction program can be executed while maintaining conventional processing performance.

[0121] In the processor of this embodiment, a single accumulation buffer, the IB buffer, is provided, but a plurality may be provided. The more accumulation buffers there are, the more opportunities there are to fill nop codes with valid operations, and the program size can be reduced further. For example, the nop codes in the second slot of instruction 3 and the first slot of instruction 4 remain without being replaced, but if the uncompressed instruction 5 (FIG. 14) were immediately followed by one or two instructions with a valid operation in the third slot, then one or both of these nop codes, respectively, could be filled with those valid operations.

[0122] Furthermore, the processor of this embodiment has three instruction decoders and three arithmetic units and achieves up to three-way parallel execution, but four of each may be provided for four-way parallel execution, or more. For four-way parallel execution, valid operations may be embedded using two slots of two-slot instructions that would be nop codes if uncompressed, as in this embodiment, or using one slot of three-slot instructions that would be a nop code if uncompressed. In the former case, however, the IB buffer must be enlarged by one more slot. The former is more effective than the latter when a very large number of slots would be nop codes before compression, and a considerable improvement in code efficiency can be expected. In this way, even as the instruction parallelism of a VLIW processor increases, the growth in nop codes can be greatly reduced.

[0123] The compiler and the processor according to the present invention have been described above based on the four embodiments, but the present invention is of course not limited to these embodiments. That is:
(1) In the four embodiments above, the architecture is of the VLIW type, in which two or three operations are specified in one instruction, but it may be a non-VLIW architecture in which one instruction specifies one operation.

[0124] In particular, with fixed-length instructions, many instructions with unused areas are often defined. For example, the "R3000", a processor based on the MIPS RISC architecture, executes 32-bit fixed-length instructions. As shown in FIG. 23(a), an arithmetic instruction of this processor consists of a 12-bit operation field (shown as "op1" and "op2") and three 5-bit register fields (the source operands "rs" and "rt" and the destination operand "rd"), and also has a 5-bit unused area shown as "res". According to the present invention, such wasted space within single-operation instructions is also avoided. Specifically, as shown in FIG. 23(b), the compiler uses the unused areas a through f of six instructions A through F to hold, in pieces, one instruction that is to be executed after instruction F, and deletes that instruction. The processor accumulates these pieces in order in an instruction accumulation register provided in the processor and executes the contents of this register after executing instruction F. This eliminates the wasted space in the program and improves code efficiency. The contents of the instruction accumulation register need not be executed immediately after instruction F; they may be executed after other instructions that follow instruction F, or in parallel with instruction F. The latter idea in particular is useful because it realizes, albeit locally, a VLIW-type architecture that specifies two operations within a non-VLIW architecture in which one instruction specifies one operation. By providing a plurality of such instruction accumulation registers, a VLIW architecture with three-way or greater parallelism can also be realized. Note that the six instructions A through F need not be strictly contiguous.
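One way to picture the splitting is as bit-field packing. The Python sketch below is an assumption-laden illustration rather than the format of FIG. 23(b): it assumes the usual MIPS R-type layout (a 6-bit "op1" in bits 31 to 26, "rs", "rt", "rd", the 5-bit unused field in bits 10 to 6, and a 6-bit "op2" in bits 5 to 0), and it assumes that only the 27 meaningful bits of the hidden instruction are stored, since six 5-bit areas provide 30 bits. The function names and the most-significant-chunk-first order are hypothetical.

```python
RES_SHIFT, RES_MASK = 6, 0x1F  # assumed position and width of the "res" field


def strip_res(word):
    """Drop the 5 unused bits of a 32-bit instruction, leaving 27 payload bits."""
    high = word >> 11   # op1, rs, rt, rd (21 bits)
    low = word & 0x3F   # op2 (6 bits)
    return (high << 6) | low


def restore_res(payload):
    """Rebuild the 32-bit word with a zeroed res field."""
    return ((payload >> 6) << 11) | (payload & 0x3F)


def embed(hosts, hidden):
    """Spread the payload of `hidden` over the res fields of six host words."""
    assert len(hosts) == 6
    payload = strip_res(hidden)  # 27 bits fit in 6 x 5 = 30 bits
    packed = []
    for k, host in enumerate(hosts):
        chunk = (payload >> (5 * (5 - k))) & RES_MASK  # most significant first
        packed.append((host & ~(RES_MASK << RES_SHIFT)) | (chunk << RES_SHIFT))
    return packed


def extract(packed):
    """Model of the instruction accumulation register: collect chunks in order."""
    payload = 0
    for word in packed:
        payload = (payload << 5) | ((word >> RES_SHIFT) & RES_MASK)
    return restore_res(payload)


hidden = 0x00221820            # example word whose res field is zero
hosts = [0x00432020] * 6       # six example host words with empty res fields
packed = embed(hosts, hidden)
recovered = extract(packed)
print(hex(recovered))          # 0x221820
```

Each host word keeps all of its own fields outside the unused area, so instructions A through F still decode as before while the register is filled.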

[0125] (2) In the four embodiments above, the contents of the instruction accumulation register (the IB1 buffer, the IB2 buffer, and the IB buffer correspond to this) are erased at the same time they are read, but they may instead be read multiple times and reused without being erased. For example, Embodiments 3 and 4 do not use the state in which the two accumulation bits are 11, so this state can be used to mean that the IB buffer is executed without being erased. This removes the need to accumulate the same instruction in the IB buffer over and over when, for example, a program executes the same instructions repeatedly, as in a loop, and code efficiency improves further. It is also possible to provide two kinds of instruction accumulation registers: one whose contents are erased immediately after reading, and one whose contents are not erased and can be reused.
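The reuse variant can be illustrated with a tiny model of an instruction accumulation register. The class `IBBuffer`, its method names, and the constant names are hypothetical; the only behaviour taken from the text is that bits 10 execute and erase the buffer, while the otherwise unused bits 11 would execute it without erasing.

```python
RUN_IB, RUN_IB_KEEP = 0b10, 0b11  # 0b11 is the state left unused in Embodiments 3 and 4


class IBBuffer:
    """Single-entry instruction accumulation register."""

    def __init__(self):
        self.operation = None

    def store(self, operation):
        self.operation = operation

    def issue(self, accumulation_bits):
        """Return the stored operation; erase it unless the bits are 11."""
        if accumulation_bits not in (RUN_IB, RUN_IB_KEEP):
            raise ValueError("accumulation bits do not request IB execution")
        operation = self.operation
        if accumulation_bits == RUN_IB:
            self.operation = None  # conventional behaviour: erase on read
        return operation


ib = IBBuffer()
ib.store("H")                                           # accumulated once, before the loop
issued = [ib.issue(RUN_IB_KEEP) for _ in range(3)]      # three loop iterations
issued.append(ib.issue(RUN_IB))                         # final iteration erases the buffer
print(issued, ib.operation)  # ['H', 'H', 'H', 'H'] None
```

In this model the operation is stored once and issued four times, which is the saving the paragraph describes for loops.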

[0126] (3) In the four embodiments above, the machine instruction generation unit 107 of the compiler first generates the same machine instruction program as in the conventional art, and the machine instruction compression unit 109 then compresses it. The two functions may instead be integrated so that the target compressed machine instruction program is generated directly, without first generating the conventional machine instruction program.

[0127] (4) The processors of the four embodiments above are described as having a three-stage pipeline of instruction fetch, decode, and execute, but the pipeline may have any number of stages, or no pipeline need be used at all.

【０１２８】[0128]

[Effects of the Invention] As is apparent from the above description, the present invention reduces the number of nops and thereby reduces code size.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the compiler according to Embodiment 1.

FIG. 2 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 1.

FIG. 3 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 1.

FIG. 4 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 1.

FIG. 5 is an exemplary diagram of an uncompressed machine instruction program.

FIG. 6 is an exemplary diagram of a compressed machine instruction program according to Embodiment 1.

FIG. 7 is a schematic configuration diagram of the processor according to Embodiment 1.

FIG. 8 is an operation timing chart of the processor according to Embodiment 1, corresponding to the machine instruction program of FIG. 6.

FIG. 9 is an exemplary diagram of a compressed machine instruction program according to Embodiment 2.

FIG. 10 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 2.

FIG. 11 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 2.

FIG. 12 is a flowchart showing the processing flow of the machine instruction compression unit 109 of the compiler according to Embodiment 2.

FIG. 13 is a schematic configuration diagram of the IF stage portion of the processor according to Embodiment 2.

FIG. 14 is an exemplary diagram of an uncompressed machine instruction program.

FIG. 15 is an exemplary diagram of a compressed machine instruction program according to Embodiment 3.

FIG. 16 is a schematic configuration diagram of the processor according to Embodiment 3.

FIG. 17 is an operation timing chart of the processor according to Embodiment 3, corresponding to the machine instruction program of FIG. 13.

FIG. 18 is an exemplary diagram of a compressed machine instruction program according to Embodiment 4.

FIG. 19 is a schematic configuration diagram of the IF stage portion of the processor according to the embodiment.

FIG. 20 is an operation timing chart of the processor according to Embodiment 4, corresponding to the machine instruction program of FIG. 16.

FIG. 21 is a schematic configuration diagram of a processor in the first conventional art.

FIG. 22 is a schematic configuration diagram of a processor in the second conventional art.

FIG. 23 is a format diagram of instructions according to another conventional art and another embodiment.

[Explanation of symbols]

1, 41: ROM
2, 42: I1 latch
3, 43: I2 latch
4, 45: First instruction decoder
5, 46: Second instruction decoder
6, 48: Register file
7, 49: D1 selector
8, 50: D2 selector
9, 52: D11 latch
10, 53: D12 latch
11, 55: D21 latch
12, 56: D22 latch
13, 58: First arithmetic unit
14, 59: Second arithmetic unit
15, 33: IB11 buffer
16, 34: IB12 buffer
17, 35: IB21 buffer
18, 36: IB22 buffer
19, 64: I1 selector
20, 65: I2 selector
21, 37, 67, 72: Control circuit
31: IB1 selector
32: IB2 selector
44: I3 latch
47: Third instruction decoder
51: D3 selector
54: D13 latch
57: D23 latch
60: Third arithmetic unit
61: IB1 buffer
62: IB2 buffer
63: IB3 buffer
66: IB selector
71: IB buffer
101: C language program
102: Compiler
103: File reading unit
104: Reading buffer
105: Syntax analysis unit
106: Intermediate code buffer
107: Machine instruction generation unit
108: Provisional output buffer
109: Machine instruction compression unit
110: Output buffer
111: File output unit
112: Machine instruction program

Claims

[Claims]

1. A compiler for generating, from a high-level language program, a machine instruction program in a long-word instruction format having a plurality of slots in which a plurality of operation descriptions are arranged, a plurality of processors capable of simultaneously executing the processor in parallel from the high-level language program. After generating the operation in the long instruction format, the nop included in the instruction is replaced with a valid operation executed after the nop, and the valid operation is deleted. A compiler characterized by being added.

2. A compiler for generating, from a high-level language program, a machine instruction program in a long-word instruction format including a plurality of slots in which a plurality of operation descriptions are arranged, characterized in that, after generating from the high-level language program instructions in the long-word instruction format in which a plurality of operations that the processor can execute simultaneously in parallel are arranged in the respective slots, the compiler replaces a nop included in an instruction with a valid operation that is executed later in the same slot as the nop, and adds information indicating the replacement.

3. A compiler for generating, from a high-level language program, a machine instruction program in a long-word instruction format having a plurality of slots in which a plurality of operation descriptions are arranged, characterized in that, after generating from the high-level language program instructions in the long-word instruction format in which a plurality of operations that a processor can execute simultaneously in parallel are arranged in the respective slots, the compiler replaces a nop included in an instruction with a valid operation that is executed later, regardless of whether that operation is in the same slot as the nop, and adds information indicating the replacement and information on the slot in which the replacing valid operation was located.

4. A compiler for generating, from a high-level language program, a machine instruction program in a long-word instruction format having a plurality of slots in which a plurality of operation descriptions are arranged, characterized in that, after generating from the high-level language program instructions in the long-word instruction format in which a plurality of operations, greater in number than the slots, that a processor can execute simultaneously in parallel are arranged in the respective slots, the compiler replaces nops with the later-executed valid operations that exceed the number of slots among the operations the processor can execute in parallel, and adds information indicating the replacement and information on the slot in which each replacing valid operation was located.

5. A compiler for generating, from a high-level language program, a machine instruction program in a long-word instruction format having a plurality of slots in which a plurality of operation descriptions are arranged, characterized in that, after generating from the high-level language program instructions in the long-word instruction format in which a plurality of operations, greater in number than the slots, that a processor can execute simultaneously in parallel are arranged in the respective slots, the compiler replaces nops with valid operations that are executed later, in order of appearance, and adds information indicating the replacement and information on the slot in which each replacing valid operation was located.

6. A processor for simultaneously executing instructions of a plurality of slots in parallel, characterized in that an instruction is temporarily stored in an accumulation buffer based on the value of an accumulation bit in the instruction, and the instruction stored in the accumulation buffer is executed thereafter.

7. A processor for simultaneously executing instructions of a plurality of slots in parallel, comprising an accumulation buffer for storing the instructions, wherein each instruction has an accumulation bit for controlling whether the instruction is executed immediately or temporarily stored, and the processor temporarily stores the instruction in the accumulation buffer based on the value of the accumulation bit in the instruction and thereafter executes the instruction stored in the accumulation buffer.

8. A processor for simultaneously executing instructions of a plurality of slots in parallel, comprising an accumulation buffer for storing the instructions, wherein each instruction has an accumulation bit for controlling whether the instruction is executed immediately or temporarily stored and a position bit for controlling where in the accumulation buffer the instruction is stored, and the processor temporarily stores the instruction in the accumulation buffer based on the values of the accumulation bit and the position bit in the instruction and thereafter executes the instruction stored in the accumulation buffer.

9. A processor for simultaneously executing instructions of a plurality of slots in parallel, comprising accumulation buffers, greater in number than the slots, for storing the instructions, wherein each instruction has an accumulation bit for controlling whether the instruction is executed immediately or temporarily stored, and the processor temporarily stores the instruction in an accumulation buffer based on the value of the accumulation bit in the instruction and thereafter executes the instruction stored in the accumulation buffer.

10. A recording medium on which machine instructions to be executed by a processor are recorded, characterized in that each machine instruction includes a plurality of operation descriptions and an accumulation bit for controlling whether the instruction is executed immediately or accumulated.

11. A recording medium on which machine instructions to be executed by a processor are recorded, characterized in that each machine instruction includes a plurality of operation descriptions, an accumulation bit for controlling whether the instruction is executed immediately or accumulated, and a position bit indicating where the instruction is to be accumulated.
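
For illustration only, the nop replacement of claim 3 and the accumulate-then-execute behaviour of claims 7 and 8 can be sketched in Python as follows. This is a minimal model under stated assumptions, not the embodiment: the names Slot, compress and run are hypothetical. The rule that buffered operations are issued once every buffer entry is filled is an assumption. Dependency checking is omitted, and only completely filled instructions are moved.

```python
from dataclasses import dataclass
from typing import List

NOP = "nop"

@dataclass
class Slot:
    op: str                   # operation description, or NOP
    accumulate: bool = False  # accumulation bit: True = buffer, False = execute now
    position: int = 0         # position bit(s): buffer entry (the op's original slot)

Instruction = List[Slot]

def compress(program: List[List[str]]) -> List[Instruction]:
    """Hypothetical greedy pass: move full instructions into earlier nops."""
    out: List[Instruction] = []
    free = []  # (instruction index in out, slot index) of nops not yet reused
    for ops in program:
        m = len(ops)
        full = all(op != NOP for op in ops)
        last = len(out) - 1
        # Move only if enough nops exist and the newest one lies in the
        # preceding instruction, so the buffer fills exactly where `ops` stood.
        if full and len(free) >= m > 0 and free[-1][0] == last:
            for (i, s), (pos, op) in zip(free[-m:], enumerate(ops)):
                out[i][s] = Slot(op, accumulate=True, position=pos)
            free = []  # reset so two moved instructions never interleave
        else:
            out.append([Slot(op) for op in ops])
            free += [(len(out) - 1, s) for s, op in enumerate(ops) if op == NOP]
    return out

def run(program: List[Instruction], n_slots: int) -> List[List[str]]:
    """Return the groups of operations issued together, in issue order."""
    buffer = [None] * n_slots
    trace = []
    for instr in program:
        issued = []
        for slot in instr:
            if slot.accumulate:
                buffer[slot.position] = slot.op  # store, do not execute yet
            elif slot.op != NOP:
                issued.append(slot.op)           # execute immediately
        if issued:
            trace.append(issued)
        if all(b is not None for b in buffer):   # assumed trigger: buffer full
            trace.append(buffer)
            buffer = [None] * n_slots
    return trace

source = [["add", "nop"], ["nop", "mul"], ["sub", "ld"]]
packed = compress(source)
print(len(source), "instructions before,", len(packed), "after")
print(run(packed, 2))
```

In this sketch the third instruction disappears from the program because its two operations occupy the nops of the first two instructions, and the issue order seen by the arithmetic units is unchanged. Cycle timing is not modelled.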