JP4111645B2

JP4111645B2 - Memory bus access control method after cache miss

Info

Publication number: JP4111645B2
Application number: JP34101499A
Authority: JP
Inventors: 真一郎多湖; 輝彦上方; 敦浩須賀; 廣岡野; 好正竹部; 泰造佐藤; 恭啓山崎; 斉依田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1999-11-30
Filing date: 1999-11-30
Publication date: 2008-07-02
Anticipated expiration: 2019-11-30
Also published as: JP2001154845A

Description

【０００１】
【発明の属する技術分野】
本発明は、パイプライン処理により命令フェッチ、命令保持、命令デコード、実行を行う情報処理装置のメモリバスアクセス方式に関し、特に分岐成立側命令系列（以下、ターゲット側命令系列）と分岐非成立側命令系列（以下、シーケンシャル側命令系列）を、平行してフェッチするデュアル命令フェッチ型の情報処理システムにおける効率的なメモリバスアクセス方式を提供する。
【０００２】
【従来の技術】
パイプライン処理により命令フェッチ、命令保持、命令デコード、命令実行を行うマイクロプロセッサ（または情報処理装置）は、連続する命令列の命令フェッチを先行して行って、実行ユニットでの実行ステージに空きが発生することをなくし、高速処理を実現する。しかし、命令系列内に分岐命令が存在する場合は、その分岐命令の実行を待ってターゲット側命令系列に分岐するのか、シーケンシャル側命令系列を続けるのかに従って、次にフェッチする命令系列が異なる。その結果、一時的に実行ユニットの実行サイクルに空きが生じる。ここで、ターゲット側命令系列とは、分岐命令を実行した結果、分岐が成立した時に実行される分岐先の命令系列であり、シーケンシャル側命令系列とは、分岐命令を実行した結果、分岐が不成立の時に実行される命令系列である。
【０００３】
かかる事態を防止するために、ターゲット側命令系列とシーケンシャル側命令系列との両方の命令列に対して、ＣＰＵが同時に命令フェッチ要求を出して、ＣＰＵ内の２つの命令バッファにそれぞれ格納するデュアル命令フェッチ型の情報処理装置が提案されている。このデュアル命令フェッチ型であれば、分岐命令の実行結果がターゲット側への分岐または非分岐のいずれであっても、次に実行される命令系列が命令バッファに保持されているので、分岐命令の分岐方向の予測ミスに伴う新たな命令フェッチに伴う実行ステージの遅れをできるだけ少なくすることができる。
【０００４】
また、マイクロプロセッサであるＣＰＵは、命令フェッチを高速化するために、キャッシュメモリを利用する。外部のメモリバスを介してでなければ、命令やデータ等が格納されている外部のメインメモリから、それらの命令やデータをフェッチすることはできない。かかるメモリバスアクセスは、比較的長い時間（多くのパイプラインサイクル）を要するので、メインメモリ内の命令やデータを格納するキャッシュメモリがＣＰＵに隣接して設けられる。通常、ＣＰＵからの命令フェッチには、キャッシュメモリに対して要求され、フェッチされた命令が命令バッファに格納される。キャッシュメモリに格納されておらず、キャッシュミスした場合は、メモリバスを介してメインメモリからフェッチ対象の命令をフェッチし、命令バッファに格納すると共にキャッシュメモリにも格納する。
【０００５】
【発明が解決しようとする課題】
しかしながら、メインメモリから命令フェッチするメモリバスアクセスを頻繁に行うと、メモリバス内のトラフィックが増大する。かかるメモリバスのトラフィックの増大は、メモリバスアクセスの遅延を招く。特に、分岐命令を実行する前の段階で、実際には実行されないかもしれないターゲット側またはシーケンシャル側の命令をメインメモリから取得する結果、分岐命令の実行の結果必要になった命令をメインメモリからフェッチすることに、長時間を要するようになるのは、好ましくない。
【０００６】
そこで、本発明の目的は、過剰なメモリバスアクセスを軽減し、より効率的な命令フェッチを可能にする情報処理装置のメモリバスアクセス方式を提供することにある。
【０００７】
【課題を解決するための手段】
上記の目的を達成するために、本発明の一つの側面は、分岐命令のシーケンシャル側とターゲット側の命令系列の両方をフェッチする命令フェッチ部と、命令フェッチ部からのフェッチ要求に応答してキャッシュメモリまたはメインメモリから命令をフェッチするキャッシュ制御部と、メインメモリへのアクセスを行うメモリバスアクセス部と、フェッチした命令を保持する命令バッファとを有する情報処理装置において、
前記命令バッファに格納される分岐命令の分岐予測を分岐命令の実行に先行して行う分岐予測部を有し、前記キャッシュ制御部は、前記分岐命令の分岐方向が未確定の場合に、分岐予測部からの分岐予測方向に応じて、前記メインメモリへのメモリバスアクセスを行うことを特徴とする。
【０００８】
上記の発明において、より好ましい第１の実施例では、前記分岐命令の分岐方向が未確定の場合に、キャッシュ制御部は、分岐命令の分岐予測方向の命令についてキャッシュミスを起こした場合は、メインメモリへのメモリバスアクセスを行って命令フェッチを行い、分岐予測方向ではない命令についてキャッシュミスを起こした場合は、メモリバスアクセスを行わないで命令フェッチを中止する。
【０００９】
即ち、第１に、分岐命令の分岐予測方向がターゲット側にある場合で、シーケンシャル側の命令についてキャッシュミスを起こした場合は、メモリバスアクセスを行わないで命令フェッチを中止し、第２に、分岐命令の分岐予測方向がシーケンシャル側にある場合で、ターゲット側の命令についてキャッシュミスを起こした場合は、メモリバスアクセスを行わないで命令フェッチを中止する。それ以外の場合は、キャッシュ制御部は、キャッシュミス後にメモリバスアクセスを行って命令フェッチを行う。
【００１０】
上記の発明において、より好ましい第２の実施例では、前記分岐命令の分岐方向が未確定の場合に、キャッシュ制御部は、分岐命令の分岐予測方向がシーケンシャル側にある場合で、ターゲット側の命令についてキャッシュミスを起こした場合は、メモリバスアクセスを行わないで命令フェッチを中止する。それ以外の場合は、キャッシュ制御部は、メモリバスアクセスを行って命令フェッチを行う。従って、上記第１の実施例と異なり、第２の実施例では、分岐予測方向がターゲット側の場合であってシーケンシャル側の命令についてキャッシュミスを起こしたら、メモリバスアクセスにより命令フェッチを行う。シーケンシャル側の命令フェッチがキャッシュミスする確率は低く、そのような頻度の少ないケースにおいてメモリバスアクセスを禁止する必要性が少ないからである。
【００１１】
上記の目的を達成するために、本発明の別の側面は、分岐命令のシーケンシャル側とターゲット側の命令系列の両方をフェッチする命令フェッチ部と、命令フェッチ部からのフェッチ要求に応答してキャッシュメモリまたはメインメモリから命令をフェッチするキャッシュ制御部と、メインメモリへのアクセスを行うメモリバスアクセス部と、フェッチした命令を保持する命令バッファとを有する情報処理装置において、
前記命令バッファに格納される分岐命令の分岐予測を分岐命令の実行に先行して行う分岐予測部を有し、前記キャッシュ制御部は、前記分岐命令の分岐方向が未確定の場合に、キャッシュミスしたらメモリバスアクセスを行わないで命令フェッチを中止し、前記分岐命令が確定している場合に、当該確定した分岐方向の命令についてキャッシュミスしたらメモリバスアクセスを行うことを特徴とする。
【００１２】
上記の発明によれば、分岐確定後の分岐方向の命令についてのみ、キャッシュミス後のメモリバスアクセスを行うことになり、メモリバスのトラフィックスを軽減することができる。即ち、分岐未確定の段階では、使用されるか否か不明であるので、キャッシュミス後のメモリバスアクセスは全面的に禁止する。また、分岐未確定時のターゲット側の命令は、キャッシュメモリに格納されている範囲内で、命令バッファへのプリフェッチが行われる。
【００１３】
【発明の実施の形態】
以下、図面を参照して本発明の実施の形態例を説明する。しかしながら、かかる実施の形態例が、本発明の技術的範囲を限定するものではない。
【００１４】
図１は、本発明の実施の形態例における情報処理装置のシステム図である。図１に示された情報処理装置は、マイクロプロセッサであり、チップ内にＣＰＵ４０と、キャッシュメモリユニット５０と、メモリバスアクセス部６０とを有する。メモリバスアクセス部６０から左側がチップ外であり、外部のメモリバス６２を介してメインメモリ６４に接続される。
【００１５】
ＣＰＵ４０は、命令をデコードしてその命令を実行する命令デコーダ及び命令実行部４９を有する。図１に示されたＣＰＵ４０は、分岐命令のシーケンシャル側とターゲット側との命令を両方、同時にフェッチを行うデュアル命令フェッチ方式の命令フェッチ部４１０，４１１を有する。更に、ＣＰＵ４０は、シーケンシャル側とターゲット側のフェッチされた命令を格納する命令バッファ４７０，４７１を有し、当該命令バッファの命令のうち、セレクタ４８で選択された側の命令が、命令デコーダ４９に供給される。セレクタ４８の選択は、後述する分岐命令の分岐予測信号Ｓ４３０，Ｓ４３１に従って行われる。
【００１６】
命令デコーダでデコードされた命令は、命令実行部４９で実行され、図示しない所定のレジスタなどに実行結果が書き込まれる。命令デコーダ及び命令実行部４９は、分岐命令の分岐先アドレス情報Ｓ１２を分岐側アドレス生成部４６に供給する。分岐側アドレス生成部４６は、その分岐先アドレス情報Ｓ１２に従って、分岐先アドレスＡ１０を生成し、分岐先アドレスバッファ４５に供給する。分岐先アドレスバッファ４５は、その供給されたターゲット側の命令のアドレスである分岐先アドレスを、その後の命令フェッチのために保持する。更に、連続側アドレスバッファ４４は、シーケンシャル側の命令のアドレスをインクリメントして生成し、保持する。
【００１７】
命令フェッチ部４１０，４１１は、それぞれアドレス選択部４２０，４２１を有する。アドレス選択部４２０，４２１には、連続側アドレスバッファ４４からシーケンシャル側のアドレスＡ１が、分岐先アドレスバッファ４５からターゲット側の分岐先アドレスＡ２が、そして、命令実行部４９から命令実行の結果生成したアドレスＡ３がそれぞれ供給され、その中から選択されたアドレスが、キャッシュメモリユニット５０に命令フェッチ要求Ｓ２０と共に供給される。命令フェッチ部４１０，４１１は、命令実行部４９から供給される分岐確定信号Ｓ１０に応答して、一方がシーケンシャル側の命令フェッチ部になり、他方がターゲット側の命令フェッチ部になる。また、分岐確定信号Ｓ１０に従って、命令フェッチが分岐未確定段階のプリフェッチか、分岐が確定したあとのフェッチかの区別を、命令フェッチ要求Ｓ２０に添付して、キャッシュメモリユニットに与える。
【００１８】
キャッシュメモリユニット５０は、キャッシュメモリ５２と、キャッシュ制御部５４，５６を有する。キャッシュ制御部５４，５６は、命令フェッチ部４１０，４１１からのフェッチ要求Ｓ２０に応答してキャッシュメモリ５２またはメインメモリ６４から命令をフェッチする。従って、キャッシュメモリユニット５０は、シーケンシャル側とターゲット側の命令フェッチ要求を同時に受け付けることができる２ポート形式になっている。キャッシュ制御部５４，５６は、キャッシュメモリ５２に対してアドレスＡＤを与えて命令をフェッチするが、その命令フェッチに対してキャッシュヒットしたかキャッシュミスしたかを示すヒット・ミス信号CHMが、キャッシュメモリ５２からそれぞれのキャッシュ制御部５４，５６に返信される。
【００１９】
各キャッシュ制御部５４，５６は、フェッチ要求Ｓ２０に応答してキャッシュメモリに命令フェッチした結果、キャッシュヒットした場合は、そのフェッチした命令を、対応する命令バッファ４７０，４７１に供給して格納する。キャッシュ制御部５４，５６は、キャッシュミスした場合は、後述するアルゴリズムに従って、メインメモリ６４から命令をフェッチするようメモリバスアクセス部６０にメモリバスアクセス要求を行う。但し、本実施の形態例では、このメモリバスアクセスは、分岐未確定の段階では一部制限されている。
【００２０】
メモリバスアクセス部６０は、外部のメモリバス６２を介してメインメモリ６４に接続され、メモリバス６２の制御を行い、キャッシュ制御部５４，５６からのメインメモリ６４へのフェッチ要求に応答して、メモリバスアクセスを行う。メインメモリ６４からフェッチされた命令は、それぞれ対応するキャッシュ制御部５４，５６に供給され、対応する命令バッファ４７０，４７１に格納されると共に、キャッシュメモリ５２にも記憶される。
【００２１】
キャッシュ制御部５４，５６は、フェッチ要求信号Ｓ２０に応答して、キャッシュメモリ５２から命令をフェッチしたか、メモリバスアクセスしてメインメモリ６４から命令をフェッチしたか、或いは命令フェッチを中止したかについての完了通知信号Ｓ２２を、対応するアドレス選択部４２０，４２１に供給する。
【００２２】
図１の情報処理装置は、ＣＰＵ４０内に分岐予測部４３０，４３１を有する。この分岐予測部４３０，４３１は、命令バッファに格納される命令コードが有する分岐予測ビットＳ３０，Ｓ３２に従って、そのフェッチされた分岐命令の分岐予測を行い、分岐予測情報Ｓ４３０，Ｓ４３１を適宜アドレス選択部４２０，４２１に供給する。アドレス選択部４２０，４２１は、フェッチ要求信号Ｓ２０に、その分岐予測情報、フェッチ先アドレス、及び分岐確定か否かの情報を加えて、キャッシュ制御部５４，５６に供給する。
【００２３】
図１に示された情報処理装置は、デュアル命令フェッチ方式であり、命令列のシーケンシャル側の命令列とターゲット側の命令列との両方をフェッチし、命令バッファ４７０，４７１に格納する。かかる命令フェッチは、分岐命令が命令実行部４９で実行されて分岐が確定する前の分岐未確定の段階で行われ、そのプリフェッチされたシーケンシャル側とターゲット側の命令列が、命令バッファ４７０，４７１に格納される。従って、分岐命令が実行された結果、いずれの方向に分岐が確定しても、分岐命令が確定した後の命令のデコードと実行のステージを、パイプラインのサイクルを乱すことなく行うことができる。
【００２４】
更に、図１に示された情報処理装置は、分岐予測部４３０，４３１によってフェッチされた命令の分岐予測を行い、分岐予測結果Ｓ４３０，Ｓ４３１に応じて、命令バッファ４７０，４７１の一方の命令をデコードする。分岐命令が確定する前に、分岐予測に従って命令のデコードをすることにより、分岐確定時におけるパイプライン処理のサイクルの乱れを少なくすることができる。
【００２５】
キャッシュ制御部５４，５６は、一般的には、フェッチ要求に応答して、キャッシュメモリ５２から命令をフェッチし、キャッシュヒットした場合は、そのフェッチした命令を命令バッファに格納し、キャッシュミスした場合は、メモリバスアクセス部６０にメモリバスアクセス要求を出して、メインメモリ６４から命令をフェッチする。
【００２６】
しかしながら、キャッシュメモリユニット５０内のデータバスは高速であるのに対して、外部にあるメモリバス６２は、その動作周波数が遅くまたバス幅も狭い。従って、メモリバスアクセスが頻繁に行われるとメモリバス６２へのトラフィックが増大し、メモリバスアクセス自体が時間を要することになる。従って、外部のメモリバス６２へのアクセス頻度を高くすると、例えば急に必要になった命令のフェッチをメインメモリから行わなければならなくなった時、そのメモリバスアクセスに時間がかかるという課題を有する。
【００２７】
本実施の形態例におけるキャッシュ制御部５４，５６は、後述する通り、分岐が確定していない場合は、必要に応じてまたは全て、キャッシュミスした後のメモリバスアクセスを行わないで命令フェッチを中止する。
【００２８】
第１の実施例では、分岐予測方向でない命令については、上記のキャッシュミス後のメモリバスアクセスを行わないで、命令フェッチを中止する。分岐予測方向でない命令の場合は、その後分岐命令が確定した時点でその命令フェッチが無駄になる可能性が高いので、かかる命令に対するメモリバスアクセスは行わないほうが効率的である。但し、分岐予測方向の命令については、キャッシュミス後にメモリバスアクセスを行う。
【００２９】
第２の実施例では、分岐予測方向がシーケンシャル側であって、ターゲット側の命令についてキャッシュミスを起こした場合は、そのメモリバスアクセスは行わないで命令フェッチを中止する。但し、分岐予測の方向がターゲット側であって、シーケンシャル側の命令についてキャッシュミスを起こした場合は、分岐予測方向と違う側の命令であっても、メモリバスアクセスを行って、命令フェッチを完了させる。その理由は、キャッシュミスをしてメモリバスアクセスされる場合は、その命令と連続するアドレスの命令が一括してキャッシュメモリ５２にフェッチされるので、シーケンシャル側の命令系列がキャッシュミスを起こす可能性は低い。従って、かかる頻度の低いメモリバスアクセスを許可しても、メモリバス６２のトラフィックの増大にはあまりつながらないからである。第２の実施例の場合、分岐予測方向の命令に対しては、キャッシュミス後にメモリバスアクセスを許可する。
【００３０】
第３の実施例としては、分岐命令が未確定の間は、キャッシュヒットした命令のみ命令バッファに格納し、キャッシュミスしたらメモリバスアクセスは行わずに命令フェッチを中止し、分岐命令が確定した後において、キャッシュミスした命令のメモリバスアクセスを行うようにする。この場合でも、以前にフェッチした命令がキャッシュメモリに記録されている限り、デュアル命令フェッチ方式により、両側の命令をプリフェッチして命令バッファに格納することができる。そして、確実に使用される分岐確定後の分岐方向の命令に対してのみメモリバスアクセスを行うので、メモリバスへのアクセス頻度を下げることができる。
【００３１】
図２は、キャッシュ制御部のブロック図である。前述した通り、ＣＰＵ４０からフェッチ要求Ｓ２０Ｂが、フェッチアドレスＳ２０Ａと分岐予測情報Ｓ２０Ｃと共に供給される。アドレスＳ２０Ａはキャッシュメモリ５２に供給されると共に、バスアクセスアドレス保持部７２で保持される。また、フェッチ要求信号Ｓ２０Ｂと分岐予測情報Ｓ２０Ｃとは、バスアクセス要否判定部７０に供給される。
【００３２】
バスアクセス要否判定部７０は、キャッシュメモリ５２からのキャッシュ・ヒット・ミス信号CHMによるキャッシュヒット判定結果と、分岐予測情報Ｓ２０Ｃと、現在シーケンシャル側かターゲット側かのステータスなどに応じて、メモリバスアクセスを要求するか否かを判定する。また、バスアクセス要否判定部７０は、その判定結果を、バスアクセス要求信号Ｓ７１としてバスアクセス制御部７４に供給し、バスアクセス不要信号Ｓ７０を完了通知判定部７８に供給する。
【００３３】
上記の判定でメモリバスアクセスが必要と判定された場合は、バスアクセス制御部７４は、バスアクセス要求信号Ｓ７１に応答して、メモリバスアクセス部６０にバスアクセス要求信号Ｓ７６を送ると共に、バスアクセスアドレス保持部７２に制御信号Ｓ７５を出力して、保持しているフェッチアドレスを出力させる。また、上記の判定でメモリバスアクセスが不要と判定された場合は、バスアクセス制御部７４は、メモリバスアクセスは行わない。この判定は、上記の実施例１，２，３のアルゴリズムに従う。
【００３４】
メモリバスアクセスに応答して、メインメモリ６４からデータが返信されたときは、バスアクセス制御部７４は、メモリバスアクセス部６０からデータ有効信号Ｓ７７を受信し、それに応答して、バスアクセス完了信号Ｓ７４を完了通知判定部７８に供給する。完了通知判定部７８は、バスアクセス完了信号Ｓ７４やバスアクセス不要信号Ｓ７０に従って、命令をキャッシュメモリ５２からフェッチしたのか、命令フェッチを中止したのか、メモリバスアクセスによりメインメモリからフェッチしたのかの完了通知信号Ｓ２２を、ＣＰＵの命令フェッチ部に送る。
【００３５】
メインメモリからフェッチされた命令は、キャッシュ制御部を介して、キャッシュメモリに格納されると共に、命令バッファにも格納される。
【００３６】
以下、上記の第１、第２、第３の実施例におけるメモリバスアクセスを行わないアルゴリズムについて、説明する。
【００３７】
図３は、上記の第１の実施例における命令フェッチの動作を示す図表である。図表に沿ってその命令フェッチの動作を説明する。第１の実施例では、
（1）分岐命令の分岐方向が確定していない場合には、
(1-1)分岐予測部による分岐予測方向がターゲット側の場合には、第１に、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。第２に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスして命令フェッチを完了する。
(1-2)分岐命令実行での分岐予測方向がシーケンシャル側の場合には、第１に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。第２に、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスして命令フェッチを完了する。
（2）分岐命令の分岐方向が確定している場合には、
分岐方向が確定した側（シーケンシャル側、または、ターゲット側）のみを命令フェッチする。その場合は、キャッシュミスを起こしたらメモリバスアクセスして命令フェッチを完了する。
【００３８】
以上の通り、第１の実施例では、分岐方向が未確定の間は、分岐予測方向の命令フェッチについてのみ、キャッシュミス後のメモリバスアクセスを行うことを許可し、分岐予測方向ではない命令フェッチは、キャッシュミス後のメモリバスアクセスは禁止して、無駄になる可能性の高い命令フェッチのためのメモリバスアクセスは行わない。いずれの場合でもキャッシュヒットした場合は、それでフェッチされた命令は命令バッファ内に格納され、命令フェッチは完了する。
【００３９】
また、命令フェッチ部４１０，４１１内のアドレス選択部４２０，４２１は、命令フェッチが完了しなかった命令であって、分岐確定信号Ｓ１０により分岐が確定した方向の命令については、改めて命令フェッチ要求を出す。この時にキャッシュミスが生じたら、メモリバスアクセスを行って必要な命令のフェッチを行う。そのとき、それに連続する命令列もキャッシュメモリ５２に格納される。
【００４０】
図４は、第１の実施例を改良した第２の実施例における命令フェッチの動作を示す図表である。図表に沿ってその命令フェッチの動作を説明する。第２の実施例では、
（1）分岐命令の分岐方向が確定していない場合には、
(1-1)分岐予測部の分岐予測方向がターゲット側の場合には、第１に、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスして、命令フェッチを完了する。第２に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスして、命令フェッチを完了する。
(1-2)分岐命令実行での分岐予測方向がシーケンシャル側の場合には、第１に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。第２に、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたらメモリバスアクセスして、命令フェッチを完了する。
（2）分岐命令の分岐方向が確定している場合には、
分岐方向が確定した側（シーケンシャル側、または、ターゲット側）のみを命令フェッチする。その場合は、キャッシュミスを起こしたらメモリバスアクセスして命令フェッチを完了する。
【００４１】
第２の実施例が第１の実施例と異なるところは、分岐予測方向がターゲット側であってシーケンシャル側の命令フェッチに対してキャッシュミスが生じた場合は、分岐予測方向とは異なる側の命令ではあるが、メモリバスアクセスをして命令フェッチを完了することにある。かかるケースは、極めて可能性が低いので頻度が低く、従って、メモリバスアクセスを許可してもメモリバスのトラフィックを増大することにはならない。
【００４２】
図５は、第３の実施例における命令フェッチの動作を示す図表である。図表に沿ってその命令フェッチの動作を説明する。第３の実施例では、
（1）分岐命令の分岐方向が確定していない場合は、
(1-1)分岐予測部の分岐予測方向がターゲット側の場合には、第１に、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。第２に、ターゲット側の命令フェッチも、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。
（1-2）分岐予測部の分岐予測方向がシーケンシャル側の場合には、第１に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。第２に、シーケンシャル側の命令フェッチも、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。
（2）分岐方向が確定している場合には、
分岐方向が確定した側（シーケンシャル側、または、ターゲット側）のみを命令フェッチする。この場合、キャッシュミスを起こしてもメモリバスアクセスしてメインメモリから命令をフェッチして命令フェッチを完了する。
【００４３】
第３の実施例は、分岐命令が実行されず分岐未確定の間は、一切のメモリバスアクセスを禁止し、分岐方向が確定した命令についてのみメモリバスアクセスを許可する。分岐未確定の場合は、メモリバスアクセスによる命令フェッチが無駄になる可能性があるので、そのメモリバスアクセスを禁止してメモリバスのトラフィックを少なくする。キャッシュメモリには、分岐確定した命令が予め格納されるので、キャッシュミス自体はそれほど高い確率で発生するものではない。従って、キャッシュメモリからの命令フェッチだけでプリフェッチして、命令バッファにシーケンシャル側とターゲット側の両方の命令系列を格納するだけでも、全体のパイプライン動作をあまり乱すことなく命令の実行を行うことが可能である。
【００４４】
最後に、第４の実施例として、上記以外のメモリバスアクセスを減らす方法について説明する。図６は、第４の実施例における命令フェッチの動作を示す図表である。図表に沿ってその命令フェッチの動作を説明する。第４の実施例では、（1）分岐命令の分岐方向が確定していない場合
（1-1）分岐予測部での分岐予測方向がターゲット側の場合には、シーケンシャル側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスしないで、命令フェッチを中止し、メモリバスアクセスをしない。一方で、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたら、メモリバスアクセスして命令フェッチを完了する。
（1-2）分岐予測部での分岐予測方向がシーケンシャル側の場合には、第１に、ターゲット側の命令フェッチは、命令キャッシュミスを起こしたらメモリバスアクセスして、命令フェッチを完了する。第２に、シーケンシャル側の命令も、命令キャッシュミスを起こしたら、メモリバスアクセスして、命令フェッチを完了する。
（2）分岐命令の分岐方向が確定している場合
分岐方向が確定した側（シーケンシャル側、または、ターゲット側）のみを命令フェッチする。この場合は、キャッシュミスに対してメモリバスアクセスを行って命令フェッチを完了する。
【００４５】
上記第４の実施例の場合は、分岐命令の分岐未確定の場合は、少なくとも分岐予測方向がターゲット側であって、シーケンシャル側の命令フェッチでキャッシュミスを起こしたらメモリバスアクセスは行わない。これにより、その分だけメモリバスアクセスの回数を減らすことができる。
【００４６】
上記第４の実施例と同様に、分岐未確定の時に、任意の命令フェッチに対してメモリバスアクセスを禁止するようにしても、その分だけメモリバスアクセスの回数を減らすことはできる。但し、それに伴って分岐予測されている方向の命令プリフェッチができない場合も発生する。メモリバスアクセスの禁止と命令プリフェッチの失敗とのバランスを考慮して、設定することが望ましい。
【００４７】
上記４つの実施例のうち、メモリバスアクセスの禁止と命令プリフェッチの失敗とをある程度バランスさせている第２の実施例の動作について、図１を参照して説明する。前提として、シーケンシャル側の命令フェッチは、ポート０側で行われ、ターゲット側の命令フェッチは、ポート１側で行われると仮定する。
（1）分岐命令の分岐方向が確定していない場合において、
（1-1）分岐予測部430,431での分岐予測方向がターゲット側の場合には、シーケンシャル側の命令フェッチは、ＣＰＵ４０の命令フェッチ部410（Port-0）が命令フェッチ要求S20をキャッシュメモリユニット５０内のキャッシュ制御部５４（Port-0）に供給し、その命令フェッチ要求が命令キャッシュメモリ５２に渡される。この命令フェッチ要求には、フェッチアドレスに加えて、分岐未確定か否かの情報、分岐予測情報なども添付される。
【００４８】
命令キャッシュメモリ５２において、命令キャッシュミスを起こしたら、その信号CHMがキャッシュ制御部５４に返され、キャッシュ制御部５４は、メモリバスアクセス部６０にメモリバスアクセス要求を出す。それに応答して、メモリバスアクセス部６０はメモリバス６２にアクセスして、命令を主記憶６４から読み出して、キャッシュ制御部５４に渡し、キャッシュメモリ５２に書き込み、且つ、ＣＰＵ４０内の命令バッファ(0)４７０に格納して、命令フェッチを完了する。命令フェッチ完了信号Ｓ２２が、命令フェッチ部４１０に返信される。
【００４９】
シーケンシャル側の命令フェッチのキャッシュミスの頻度はそれほど高くないので、この場合にメモリバスアクセスを許可しても全体のメモリバスの効率を低下させることにはあまりならない。
【００５０】
ターゲット側の命令フェッチは、ＣＰＵ４０内の命令フェッチ部４１１から命令フェッチ要求をキャッシュ制御部５６（Port-1）に供給し、その命令フェッチ要求が命令キャッシュメモリ５２に渡される。
【００５１】
命令キャッシュメモリ５２において、命令キャッシュミスを起こしたら、キャッシュ制御部５６（Port-1）がメモリバスアクセス部６０にメモリバスアクセス要求を出し、メモリバスアクセス部６０はメモリバス６２にアクセスして、命令を主記憶６４から読み出して、キャッシュ制御部５６（Port-1）に渡し、キャッシュメモリ５２に書き込み、且つ、CPU４０の命令バッファ(1)４７１に格納して、命令フェッチを完了する。そして、命令フェッチ完了信号が命令フェッチ部４１１に返信される。
【００５２】
この場合は、使用確率が高い分岐予測方向の命令がキャッシュミスしているので、メモリバスアクセスを許可して、プリフェッチを完了することが、分岐後のパイプライン動作の乱れを防止することになる。
（1-2）分岐予測部での分岐予測方向がシーケンシャル側の場合には、ターゲット側の命令フェッチは、CPU４０内の命令フェッチ部４１１から命令フェッチ要求がキャッシュ制御部５６（Port-1）に出され、その命令フェッチ要求が命令キャッシュメモリに渡される。
【００５３】
命令キャッシュメモリ５２において、命令キャッシュミスを起こしても、キャッシュ制御部（Port-１）５６がメモリバスアクセス部６０にメモリバスアクセス要求を出さない。その結果、メモリバスアクセス部６０はメモリバスアクセスしない。そして、キャッシュ制御部５６は、命令フェッチを中止し、アドレス選択部４２１に命令フェッチをキャンセルした結果信号を返信する。
【００５４】
一方、シーケンシャル側の命令フェッチは、CPU４０内の命令フェッチ部４１０から命令フェッチ要求がキャッシュ制御部（Port-0）５４に出され、その命令フェッチ要求が命令キャッシュメモリ５２に渡される。
【００５５】
命令キャッシュメモリ５２において、命令キャッシュミスを起こしたら、キャッシュ制御部５４がメモリバスアクセス部６０にメモリバスアクセス要求を出し、メモリバスアクセス部６０はメモリバス６２にアクセスして、命令を主記憶６４から読み出して、キャッシュ制御部５４に返す。キャッシュ制御部５４は、その命令をキャッシュメモリ５２に書き込み、且つ、CPUの命令バッファ(0)４７０に格納して、命令フェッチを完了する。
（2）分岐命令の実行により分岐方向が確定している場合
命令フェッチ部４１０，４１１は、分岐命令の実行により分岐方向が確定した側（シーケンシャル側、または、ターゲット側）のみを、命令フェッチする。その時、分岐確定方向がシーケンシャル側の場合には、命令フェッチ部４１０が、キャッシュ制御部（Port-0）５４を介して、メモリバスアクセス部６０にバスアクセスを要求する。メモリバスアクセス部６０は、フェッチ要求された命令を主記憶６４から読み出し、キャッシュ制御部５４を介して、命令バッファ（０）４７０とキャッシュメモリ５２に命令を格納して、命令フェッチを完了する。
【００５６】
分岐確定方向がターゲット側の場合には、命令フェッチ部４１１が、キャッシュ制御部（Port-1）５６を介して、メモリバスアクセス部６０にバスアクセスを要求する。メモリバスアクセス部６０が、フェッチ要求された命令を主記憶６４から読み出し、キャッシュ制御部５６を介して、命令バッファ（１）４７１に命令を格納して、命令フェッチを完了する。なお、分岐確定方向がターゲット側になった時点で、ターゲット側はシーケンシャル側に、シーケンシャル側はターゲット側に交代する。
【００５７】
図７は、上記の第１または第２の実施例によりメモリバスアクセスが制限された場合の、具体的なパイプライン動作を示す図表である。この例は、図７の表の下に示したシーケンシャル側の命令列01〜09と分岐命令03に対応するターゲット側の命令列51〜54を例にして、パイプライン動作を示すものである。この例では、分岐命令03についての分岐予測は、分岐しない、つまりシーケンシャル側の方向が予測されている場合である。
【００５８】
パイプライン動作は、次のステージで構成される。
Ｐ：命令フェッチ要求ステージ：ＣＰＵがキャッシュ制御部に命令フェッチ要求をする。この段階では、分岐未確定のプリフェッチか、分岐確定後のフェッチかの区別を付けて命令フェッチ要求される。
Ｔ：フェッチステージ：キャッシュメモリでヒットミス判定を行い命令を取り出す準備をする。
Ｃ：命令バッファステージ：命令バッファに命令を取りこむ。
Ｄ：デコードステージ：命令デコーダが命令を解読し制御信号を生成する。
Ｅ：実行ステージ：デコード結果の制御信号に応答して命令を実行する。
Ｗ：書き込みステージ：命令を実行した結果をレジスタに書き込む。
Ｍ：キャッシュミス：キャッシュミスが発生した。
Ｂ：バスアクセス保持ステージ：メモリバスにアクセスするためアドレスをバスアクセスアドレス保持部にて保持する。
Ｒ：バスアクセス要求ステージ：メモリバスアクセス部に読み出しリクエストを出す。バスアクセスして命令が読み出されるまで１８サイクルを要すると仮定する。
【００５９】
図７に戻り、命令01は、サイクル１の命令フェッチ要求ステージＰ、サイクル２のフェッチステージＴによりキャッシュメモリから命令をフェッチすることができ、サイクル３で命令バッファに命令が取り込まれる（ステージＣ）。そして、サイクル５，６，７の３サイクルで命令が実行される（ステージＥ）。実行後に命令実行結果が各種レジスタに書き込まれる（ステージＷ）。
【００６０】
命令02も、ステージＰ、Ｔ、Ｃを経て、命令が命令バッファに取り込まれる。そして、命令01の実行ステージＥが終了した次のサイクル８で、デコードステージＤで待機していた命令02が実行され（ステージＥ）、実行結果がレジスタに書き込まれる（ステージＷ）。
【００６１】
命令03は、命令バッファステージＣの時点で、分岐予測部により分岐命令であることが判別され、分岐方向はシーケンシャル側であると予測される。従って、サイクル６からターゲット側の命令列51、52，53も命令プリフェッチが開始される。
【００６２】
命令03〜07までは、全てキャッシュヒットしてパイプラインサイクルを乱すことなく、それぞれの実行ステージＥが実行される。そして、命令08〜10がキャッシュミス（ステージＭ）を起こしたとする。また、ターゲット側の命令51〜53もキャッシュミス（ステージＭ）を起こしたとする。
【００６３】
命令08は、サイクル８の時点では分岐命令03の分岐が未確定であり、分岐未確定の命令プリフェッチとして要求される（ステージＰ）。そこで、サイクル１０でキャッシュミスを起こすが、第１または第２の実施例では、分岐予測がシーケンシャル側の時にシーケンシャル側の命令がキャッシュミスを起こすと、そのメモリバスアクセスを許可している。従って、サイクル１１でバスアクセス保持ステージＢ、サイクル１２からバスアクセス要求ステージＲに入る。バスアクセス要求ステージＲは、１８サイクルを要すると仮定したので、サイクル３０でフェッチされた命令が命令バッファに格納され、命令バッファステージＣになる。
【００６４】
命令08のメモリバスアクセスに伴い、それに後続する命令もメインメモリからフェッチされてキャッシュメモリに格納されるので、命令09以降の命令バッファステージＣは、命令08のステージＣに続いて起こることになる。
【００６５】
一方、命令51は、サイクル８の時点でキャッシュミスを起こすが、分岐予測方向がシーケンシャル側であるので、ターゲット側の命令51に対するメモリバスアクセスは禁止される。命令52、53も同様にメモリバスアクセスは禁止される。従って、命令08がメモリバスアクセスを要求したサイクル１２では、メモリバスは空き状態にあり、即メモリバスアクセスを行うことができ、サイクル３２で命令08が実行される（ステージＥ）。
【００６６】
尚、命令11、12は、それぞれ分岐が確定した後に分岐確定後の命令フェッチステージＰを迎えるので、キャッシュミスしてもメモリバスアクセスは実行される。但し、図７の例では、すでに命令08のメモリバスアクセスでキャッシュメモリに命令11、12が格納されているので、キャッシュミスは起こしていない。
【００６７】
図７の例は、第１の実施例でも第２の実施例でも、同様の動作になる。即ち、分岐未確定時の命令08のプリフェッチに対してキャッシュミスを起こしても、分岐予測方向側の命令08に対しては、メモリバスアクセスは許可される。
【００６８】
第３の実施例の場合は、分岐未確定時の命令08のプリフェッチに対して、キャッシュミス後のメモリバスアクセスは禁止される。その場合は、分岐確定後に再度命令フェッチ部からの命令フェッチに応答して、キャッシュミス後にメモリバスアクセスにより命令がフェッチされる。その場合のメモリバスアクセスは、高速に行われる。
【００６９】
図８は、従来例のメモリバスアクセスが制限されていない場合の、具体的なパイプライン動作を示す図表である。この例も、図７の場合と同じ命令列に対するパイプライン動作を示すものである。
【００７０】
この例では、命令51は分岐予測方向ではないが、メモリバスアクセスを許可される。従って、サイクル１０からバスアクセス要求ステージＲになっている。このステージＲは１８サイクルを要するので、命令08がサイクル１０でキャッシュミス（ステージＭ）を起こしても、メモリバスがビジー状態であり、そのメモリバスアクセスＲは、サイクル２８まで待たされることになる。その結果、命令08の実行ステージＥは、サイクル４８まで遅れることになる。
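The cycle counts quoted for FIG. 7 and FIG. 8 can be checked with a little arithmetic. The following Python sketch is only an illustration under simplifying assumptions: the function names are hypothetical, only one competing bus access (instruction 51) is modeled, and the stage order after a miss is taken to be M, B, R (18 cycles), C, D, E as listed in paragraph 【００５８】.

```python
BUS_LATENCY = 18  # cycles spent in stage R, the figure assumed in the description


def bus_request_cycle(miss_cycle, bus_free_from=0):
    """Cycle at which stage R starts: one cycle after stage B, or later if the bus is busy."""
    hold_cycle = miss_cycle + 1               # stage B: the fetch address is held
    return max(hold_cycle + 1, bus_free_from)


def execute_cycle(miss_cycle, bus_free_from=0):
    """Cycle at which an instruction that missed the cache reaches stage E."""
    buffer_cycle = bus_request_cycle(miss_cycle, bus_free_from) + BUS_LATENCY  # stage C
    return buffer_cycle + 2                   # stage D, then stage E


# FIG. 7: instruction 08 misses in cycle 10 and the bus is idle.
restricted = execute_cycle(10)

# FIG. 8: instruction 51 missed in cycle 8 and holds the bus first.
bus_busy_until = bus_request_cycle(8) + BUS_LATENCY
unrestricted = execute_cycle(10, bus_free_from=bus_busy_until)

print(restricted, bus_busy_until, unrestricted)
```

Under these assumptions the sketch reproduces the figures given in the text: execution of instruction 08 in cycle 32 when the target-side access is suppressed, and in cycle 48 when instruction 51 is allowed to occupy the bus until cycle 28.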
【００７１】
このように、従来例に比較して、本実施例では、分岐未確定の段階でのメモリバスアクセスを制限したので、使用可能性が高い命令に対するメモリバスアクセスを効率的に行うことができ、パイプラインサイクルの乱れを最小限に止めることができる。
【００７２】
以上、本発明の保護範囲は、上記の実施の形態例に限定されるものではなく、特許請求の範囲に記載された発明とその均等物にまで及ぶものである。
【００７３】
【発明の効果】
以上、本発明によれば、分岐未確定の場合の命令フェッチに対して、キャッシュミスした時のメインメモリへのアクセスを適宜制限したので、分岐予測方向の命令や分岐確定後の命令に対するメインメモリへのアクセスをより効率的に行うことができる。
【図面の簡単な説明】
【図１】本発明の実施の形態例における情報処理装置のシステム図である。
【図２】キャッシュ制御部のブロック図である。
【図３】第１の実施例における命令フェッチの動作を示す図表である。
【図４】第２の実施例における命令フェッチの動作を示す図表である。
【図５】第３の実施例における命令フェッチの動作を示す図表である。
【図６】第４の実施例における命令フェッチの動作を示す図表である。
【図７】第１または第２の実施例によりメモリバスアクセスが制限された場合の、具体的なパイプライン動作を示す図表である。
【図８】従来例の場合の具体的なパイプライン動作を示す図表である。
【符号の説明】
４０ＣＰＵ
410,411 命令フェッチ部
430,431 分岐予測部
５０キャッシュメモリユニット
５２キャッシュメモリ
５４，５６キャッシュ制御部
６０メモリバスアクセス部
６２メモリバス
６４メインメモリ
[0001]
[Technical field of the invention]
The present invention relates to a memory bus access method for an information processing apparatus that performs instruction fetch, instruction holding, instruction decoding, and execution by pipeline processing. In particular, it provides an efficient memory bus access method for a dual instruction fetch type information processing system that fetches, in parallel, the instruction sequence on the branch-taken side (hereinafter, the target-side instruction sequence) and the instruction sequence on the branch-not-taken side (hereinafter, the sequential-side instruction sequence).
[0002]
[Prior art]
A microprocessor (or information processing apparatus) that performs instruction fetch, instruction holding, instruction decoding, and instruction execution by pipeline processing fetches a continuous instruction sequence ahead of execution so that the execution stage of the execution unit is never left idle, and thereby achieves high-speed processing. However, when the instruction sequence contains a branch instruction, the instruction sequence to be fetched next depends on whether execution of that branch instruction results in a branch to the target-side instruction sequence or in continuation of the sequential-side instruction sequence, and this is not known until the branch instruction has been executed. As a result, the execution cycles of the execution unit are temporarily left idle. Here, the target-side instruction sequence is the branch-destination instruction sequence that is executed when the branch is taken as a result of executing the branch instruction, and the sequential-side instruction sequence is the instruction sequence that is executed when the branch is not taken.
[0003]
To prevent this, a dual instruction fetch type information processing apparatus has been proposed in which the CPU issues instruction fetch requests simultaneously for both the target-side and the sequential-side instruction sequences and stores them in two instruction buffers in the CPU, respectively. With this dual instruction fetch type, whether the branch instruction resolves to a branch to the target side or to no branch, the instruction sequence to be executed next is already held in an instruction buffer, so the execution-stage delay caused by a new instruction fetch after a branch direction misprediction can be kept as small as possible.
[0004]
The CPU, which is a microprocessor, also uses a cache memory to speed up instruction fetch. Instructions and data stored in the external main memory can be fetched only via the external memory bus. Because such a memory bus access takes a relatively long time (many pipeline cycles), a cache memory that stores instructions and data from the main memory is provided adjacent to the CPU. Normally, an instruction fetch from the CPU is issued to the cache memory, and the fetched instruction is stored in the instruction buffer. If the instruction is not in the cache memory and a cache miss occurs, the instruction is fetched from the main memory via the memory bus and stored in both the instruction buffer and the cache memory.
[0005]
[Problems to be solved by the invention]
However, frequent memory bus accesses to fetch instructions from the main memory increase the traffic on the memory bus, and this increased traffic delays memory bus accesses. In particular, it is undesirable that fetching target-side or sequential-side instructions from the main memory before the branch instruction is executed, when those instructions may never actually be executed, should lengthen the time needed to fetch from the main memory the instructions that turn out to be required once the branch instruction has been executed.
[0006]
Accordingly, an object of the present invention is to provide a memory bus access method for an information processing apparatus that reduces excessive memory bus access and enables more efficient instruction fetching.
[0007]
[Means for Solving the Problems]
In order to achieve the above object, one aspect of the present invention is an information processing apparatus having an instruction fetch unit that fetches both the sequential-side and the target-side instruction sequences of a branch instruction, a cache control unit that fetches instructions from a cache memory or a main memory in response to fetch requests from the instruction fetch unit, a memory bus access unit that accesses the main memory, and an instruction buffer that holds the fetched instructions,
the apparatus further having a branch prediction unit that predicts the direction of a branch instruction stored in the instruction buffer before the branch instruction is executed, wherein, while the branch direction of the branch instruction is undetermined, the cache control unit performs memory bus access to the main memory in accordance with the predicted branch direction from the branch prediction unit.
[0008]
In a more preferred first embodiment of the above invention, while the branch direction of the branch instruction is undetermined, the cache control unit performs a memory bus access to the main memory to fetch the instruction when a cache miss occurs for an instruction in the predicted branch direction, and aborts the instruction fetch without performing a memory bus access when a cache miss occurs for an instruction that is not in the predicted branch direction.
[0009]
That is, first, when the predicted branch direction is the target side and a cache miss occurs for a sequential-side instruction, the instruction fetch is aborted without a memory bus access; second, when the predicted branch direction is the sequential side and a cache miss occurs for a target-side instruction, the instruction fetch is aborted without a memory bus access. In all other cases, the cache control unit performs a memory bus access after the cache miss and fetches the instruction.
[0010]
In a more preferred second embodiment of the above invention, while the branch direction of the branch instruction is undetermined, the cache control unit aborts the instruction fetch without performing a memory bus access when the predicted branch direction is the sequential side and a cache miss occurs for a target-side instruction. In all other cases, the cache control unit performs a memory bus access and fetches the instruction. Thus, unlike the first embodiment, the second embodiment fetches the instruction by a memory bus access when the predicted branch direction is the target side and a cache miss occurs for a sequential-side instruction. This is because a sequential-side instruction fetch has a low probability of a cache miss, and there is little need to prohibit memory bus access in such an infrequent case.
[0011]
In order to achieve the above object, another aspect of the present invention is an information processing apparatus having an instruction fetch unit that fetches both the sequential-side and the target-side instruction sequences of a branch instruction, a cache control unit that fetches instructions from a cache memory or a main memory in response to fetch requests from the instruction fetch unit, a memory bus access unit that accesses the main memory, and an instruction buffer that holds the fetched instructions,
the apparatus further having a branch prediction unit that predicts the direction of a branch instruction stored in the instruction buffer before the branch instruction is executed, wherein, while the branch direction of the branch instruction is undetermined, the cache control unit aborts the instruction fetch without performing a memory bus access on a cache miss, and, once the branch has been determined, performs a memory bus access on a cache miss for an instruction in the determined branch direction.
[0012]
According to the above invention, a memory bus access after a cache miss is performed only for instructions in the branch direction after the branch has been determined, so memory bus traffic can be reduced. That is, while the branch is undetermined it is not known whether an instruction will be used, so memory bus access after a cache miss is prohibited entirely. Target-side instructions are still prefetched into the instruction buffer while the branch is undetermined, to the extent that they are stored in the cache memory.
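The access rules of paragraphs [0008] to [0012] amount to a small decision table, which can be sketched as a function. This is a minimal Python illustration, not the patented circuit: `Side`, `Policy`, and `allow_bus_access` are hypothetical names, and the fourth policy is included on the assumption that it matches the fourth embodiment described in paragraph 【００４４】 of the Japanese text.

```python
from enum import Enum


class Side(Enum):
    SEQUENTIAL = "sequential"
    TARGET = "target"


class Policy(Enum):
    PREDICTED_ONLY = 1         # first embodiment
    BLOCK_TARGET_ONLY = 2      # second embodiment
    DETERMINED_ONLY = 3        # second aspect, paragraphs [0011] and [0012]
    BLOCK_SEQUENTIAL_ONLY = 4  # fourth embodiment (assumed from paragraph 0044)


def allow_bus_access(policy, determined, predicted, fetch_side):
    """Return True if a cache miss on fetch_side may go out on the memory bus."""
    if determined:
        # After the branch is determined only that side is fetched,
        # and its cache misses always go to main memory.
        return True
    off_prediction = fetch_side != predicted
    if policy is Policy.PREDICTED_ONLY:
        return not off_prediction
    if policy is Policy.BLOCK_TARGET_ONLY:
        return not (off_prediction and fetch_side is Side.TARGET)
    if policy is Policy.DETERMINED_ONLY:
        return False
    if policy is Policy.BLOCK_SEQUENTIAL_ONLY:
        return not (off_prediction and fetch_side is Side.SEQUENTIAL)
    raise ValueError(f"unknown policy: {policy!r}")
```

For example, `allow_bus_access(Policy.BLOCK_TARGET_ONLY, False, Side.TARGET, Side.SEQUENTIAL)` returns `True`, which is the one case where the second embodiment differs from the first.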
[0013]
[Embodiments of the invention]
Embodiments of the present invention will be described below with reference to the drawings. However, such an embodiment does not limit the technical scope of the present invention.
[0014]
FIG. 1 is a system diagram of an information processing apparatus according to an embodiment of the present invention. The information processing apparatus shown in FIG. 1 is a microprocessor that has a CPU 40, a cache memory unit 50, and a memory bus access unit 60 on a chip. Everything to the left of the memory bus access unit 60 is off-chip, and the unit is connected to the main memory 64 via the external memory bus 62.
[0015]
The CPU 40 has an instruction decoder and instruction execution unit 49 that decodes instructions and executes them. The CPU 40 shown in FIG. 1 has dual instruction fetch type instruction fetch units 410 and 411, which fetch the instructions on both the sequential side and the target side of a branch instruction simultaneously. The CPU 40 further has instruction buffers 470 and 471 that store the fetched sequential-side and target-side instructions; of the instructions in these buffers, those on the side selected by the selector 48 are supplied to the instruction decoder 49. The selector 48 makes its selection according to the branch prediction signals S430 and S431 for the branch instruction, described later.
[0016]
The instruction decoded by the instruction decoder is executed by the instruction execution unit 49, and the execution result is written in a predetermined register (not shown). The instruction decoder and instruction execution unit 49 supplies the branch destination address information S12 of the branch instruction to the branch side address generation unit 46. The branch side address generation unit 46 generates a branch destination address A10 according to the branch destination address information S12 and supplies it to the branch destination address buffer 45. The branch destination address buffer 45 holds the branch destination address, which is the address of the supplied instruction on the target side, for subsequent instruction fetch. Further, the continuous side address buffer 44 generates and holds the address of the sequential instruction by incrementing it.
[0017]
The instruction fetch units 410 and 411 have address selection units 420 and 421, respectively. Each of the address selection units 420 and 421 is supplied with the sequential-side address A1 from the continuous-side address buffer 44, the target-side branch destination address A2 from the branch destination address buffer 45, and an address A3 generated by the instruction execution unit 49 as a result of instruction execution; the address selected from among these is supplied to the cache memory unit 50 together with the instruction fetch request S20. In response to the branch determination signal S10 supplied from the instruction execution unit 49, one of the instruction fetch units 410 and 411 becomes the sequential-side instruction fetch unit and the other becomes the target-side instruction fetch unit. In addition, according to the branch determination signal S10, an indication of whether the instruction fetch is a prefetch made while the branch is undetermined or a fetch made after the branch has been determined is attached to the instruction fetch request S20 and passed to the cache memory unit.
[0018]
The cache memory unit 50 has a cache memory 52 and cache control units 54 and 56. The cache control units 54 and 56 fetch instructions from the cache memory 52 or the main memory 64 in response to the fetch requests S20 from the instruction fetch units 410 and 411. The cache memory unit 50 therefore has a two-port configuration that can accept sequential-side and target-side instruction fetch requests simultaneously. The cache control units 54 and 56 fetch an instruction by supplying an address AD to the cache memory 52, and the cache memory 52 returns to each of the cache control units 54 and 56 a hit/miss signal CHM indicating whether that instruction fetch resulted in a cache hit or a cache miss.
[0019]
When an instruction fetch issued to the cache memory in response to the fetch request S20 results in a cache hit, each of the cache control units 54 and 56 supplies the fetched instruction to the corresponding instruction buffer 470 or 471, where it is stored. On a cache miss, the cache control units 54 and 56 issue a memory bus access request to the memory bus access unit 60 so as to fetch the instruction from the main memory 64, in accordance with an algorithm described later. In the present embodiment, however, this memory bus access is partially restricted while the branch is undetermined.
[0020]
The memory bus access unit 60 is connected to the main memory 64 via the external memory bus 62, controls the memory bus 62, and performs memory bus accesses in response to fetch requests to the main memory 64 from the cache control units 54 and 56. Instructions fetched from the main memory 64 are supplied to the corresponding cache control units 54 and 56, stored in the corresponding instruction buffers 470 and 471, and also stored in the cache memory 52.
[0021]
In response to the fetch request signal S20, the cache control units 54 and 56 supply the corresponding address selection units 420 and 421 with a completion notification signal S22 indicating that an instruction has been fetched from the cache memory 52, that an instruction has been fetched from the main memory 64 by a memory bus access, or that the instruction fetch has been stopped.
[0022]
The information processing apparatus in FIG. 1 includes branch prediction units 430 and 431 in the CPU 40. The branch prediction units 430 and 431 perform branch prediction of a fetched branch instruction according to the branch prediction bits S30 and S32 included in the instruction code stored in the instruction buffer, and supply the branch prediction information S430 and S431 to the address selection units 420 and 421. The address selection units 420 and 421 attach the branch prediction information, the fetch destination address, and information on whether or not the branch has been determined to the fetch request signal S20, and supply it to the cache control units 54 and 56.
[0023]
The information processing apparatus shown in FIG. 1 uses a dual instruction fetch method: it fetches both the sequential-side instruction sequence and the target-side instruction sequence of a branch instruction and stores them in the instruction buffers 470 and 471. This instruction fetch is performed at the branch-undetermined stage, before the branch instruction is executed by the instruction execution unit 49 and the branch is determined, and the prefetched sequential-side and target-side instruction sequences are stored in the instruction buffers 470 and 471. Therefore, whichever direction the branch is determined to take as a result of executing the branch instruction, the instruction decode and execution stages after the determination can proceed without disturbing the pipeline cycle.
[0024]
Further, the information processing apparatus shown in FIG. 1 performs branch prediction on the fetched instructions with the branch prediction units 430 and 431, and decodes the instructions of one of the instruction buffers 470 and 471 according to the branch prediction results S430 and S431. By decoding instructions according to the branch prediction before the branch instruction is determined, disturbance of the pipeline processing cycle at the time of branch determination can be reduced.
[0025]
In general, the cache control units 54 and 56 fetch an instruction from the cache memory 52 in response to a fetch request. When a cache hit occurs, the fetched instruction is stored in the instruction buffer; when a cache miss occurs, a memory bus access request is issued to the memory bus access unit 60 and the instruction is fetched from the main memory 64.
[0026]
However, whereas the data bus in the cache memory unit 50 is fast, the external memory bus 62 has a low operating frequency and a narrow bus width. Consequently, if memory bus access is performed frequently, traffic on the memory bus 62 increases and each memory bus access itself takes longer. As a result, when an urgently needed instruction must be fetched from the main memory, for example, the memory bus access takes a long time.
[0027]
As will be described later, the cache control units 54 and 56 according to the present embodiment stop the instruction fetch, where appropriate, without performing a memory bus access after a cache miss while the branch is not yet determined.
[0028]
In the first embodiment, for instructions that are not in the branch prediction direction, the instruction fetch is stopped without performing the memory bus access after the cache miss. In the case of an instruction that is not in the branch prediction direction, there is a high possibility that the instruction fetch will be wasted when the branch instruction is determined thereafter. Therefore, it is more efficient not to perform memory bus access to such an instruction. However, for the instruction in the branch prediction direction, the memory bus is accessed after a cache miss.
[0029]
In the second embodiment, when the branch prediction direction is the sequential side and a cache miss occurs for an instruction on the target side, the instruction fetch is stopped without performing the memory bus access. However, when the branch prediction direction is the target side and a cache miss occurs for an instruction on the sequential side, the memory bus is accessed and the instruction fetch is completed, even though the instruction is on the side different from the branch prediction direction. The reason is that when the memory bus is accessed because of a cache miss, the instructions at the addresses following that instruction are fetched into the cache memory 52 together, so the probability that a sequential instruction sequence causes a cache miss is low. Therefore, even if such infrequent memory bus accesses are permitted, the traffic on the memory bus 62 does not increase much. In the second embodiment as well, access to the memory bus is permitted after a cache miss for an instruction in the branch prediction direction.
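The rationale given here, that a miss brings a run of consecutive instructions into the cache so that sequential fetches seldom miss, can be illustrated with a small counting model. This is only a sketch: the line size of 8 instructions, the address values, and the function name `count_misses` are assumptions made for illustration and do not appear in the embodiment.

```python
def count_misses(addresses, line_size):
    """Count cache misses when each miss fills one whole line of consecutive addresses.

    Assumes an initially empty cache that never evicts (illustrative only).
    """
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)  # the miss brings in the whole line
    return misses


LINE_SIZE = 8  # hypothetical number of instructions fetched per memory bus access

sequential_run = list(range(0, 32))       # 32 consecutive instruction addresses
scattered_targets = [100, 260, 517, 900]  # 4 branch targets lying in different lines

print(count_misses(sequential_run, LINE_SIZE))     # 4 misses for 32 fetches
print(count_misses(scattered_targets, LINE_SIZE))  # 4 misses for 4 fetches
```

Under these assumptions the sequential run misses once per line, while every scattered target misses, which is the asymmetry the second embodiment relies on.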
[0030]
In the third embodiment, while the branch instruction is undetermined, only cache-hit instructions are stored in the instruction buffer; when a cache miss occurs, the instruction fetch is stopped without performing the memory bus access, and the memory bus access for the cache-missed instruction is performed once the branch instruction has been determined. Even in this case, as long as previously fetched instructions are held in the cache memory, the instructions on both sides can be prefetched and stored in the instruction buffers by the dual instruction fetch method. Further, since the memory bus access is performed only for instructions in the determined branch direction, which are certain to be used, the frequency of access to the memory bus can be reduced.
[0031]
FIG. 2 is a block diagram of the cache control unit. As described above, the fetch request S20B is supplied from the CPU 40 together with the fetch address S20A and the branch prediction information S20C. The address S20A is supplied to the cache memory 52 and held by the bus access address holding unit 72. The fetch request signal S20B and the branch prediction information S20C are supplied to the bus access necessity determination unit 70.
[0032]
The bus access necessity determination unit 70 determines whether to request a memory bus access according to the cache hit determination result based on the cache hit/miss signal CHM from the cache memory 52, the branch prediction information S20C, whether the unit is currently on the sequential side or the target side, and the like. The bus access necessity determination unit 70 supplies the determination result to the bus access control unit 74 as a bus access request signal S71, and supplies the bus access unnecessary signal S70 to the completion notification determination unit 78.
[0033]
If the above determination finds that the memory bus access is necessary, the bus access control unit 74 sends a bus access request signal S76 to the memory bus access unit 60 in response to the bus access request signal S71, and also outputs the control signal S75 to the bus access address holding unit 72 so that the held fetch address is output. If the determination finds that the memory bus access is unnecessary, the bus access control unit 74 does not perform the memory bus access. This determination follows the algorithms of the first, second, and third embodiments.
[0034]
When data is returned from the main memory 64 in response to the memory bus access, the bus access control unit 74 receives the data valid signal S77 from the memory bus access unit 60 and, in response, supplies a bus access completion signal S74 to the completion notification determination unit 78. In accordance with the bus access completion signal S74 or the bus access unnecessary signal S70, the completion notification determination unit 78 sends to the instruction fetch unit of the CPU a completion notification signal S22 indicating whether the instruction was fetched from the cache memory 52 or from the main memory by the memory bus access, or whether the instruction fetch was stopped.
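The signal flow of FIG. 2 can be sketched in software as follows. This is a minimal model, not the circuit itself: the class `FetchRequest`, the function `cache_control`, the status strings, and the dictionary-based cache and main memory are all hypothetical names introduced for illustration; only the signal labels in the comments come from the description. The bus access decision is passed in as a function so that the rule of any embodiment could be substituted.

```python
from dataclasses import dataclass


@dataclass
class FetchRequest:
    address: int             # fetch address (S20A)
    branch_determined: bool  # determined/undetermined flag attached to the request
    predicted_side: str      # "sequential" or "target", from branch prediction information (S20C)
    side: str                # side to which this request belongs


def cache_control(req, cache, main_memory, bus_access_allowed):
    """Return (instruction or None, completion status reported as S22)."""
    if req.address in cache:                # CHM reports a hit
        return cache[req.address], "cache_hit"
    if not bus_access_allowed(req):         # corresponds to S70 (bus access unnecessary)
        return None, "fetch_stopped"
    instruction = main_memory[req.address]  # corresponds to S71/S76, then S77 and S74
    cache[req.address] = instruction        # the fetched instruction is also cached
    return instruction, "fetched_from_main_memory"


# Hypothetical policy: allow bus access only once the branch is determined.
policy = lambda r: r.branch_determined
cache, memory = {0x10: "add"}, {0x10: "add", 0x20: "sub"}

print(cache_control(FetchRequest(0x10, False, "sequential", "target"), cache, memory, policy))
print(cache_control(FetchRequest(0x20, False, "sequential", "target"), cache, memory, policy))
print(cache_control(FetchRequest(0x20, True, "sequential", "sequential"), cache, memory, policy))
```

The three calls exercise the three outcomes that the completion notification distinguishes: a cache hit, a stopped fetch, and a fetch completed through the memory bus.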
[0035]
The instruction fetched from the main memory is stored in the cache memory and also in the instruction buffer via the cache control unit.
[0036]
The algorithm that does not perform memory bus access in the first, second, and third embodiments will be described below.
[0037]
FIG. 3 is a chart showing an instruction fetch operation in the first embodiment. The instruction fetch operation will be described with reference to the chart. In the first embodiment,
(1) If the branch direction of the branch instruction is not fixed,
(1-1) When the branch prediction direction of the branch prediction unit is the target side: first, if the instruction fetch on the sequential side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the target side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch.
(1-2) When the branch prediction direction of the branch prediction unit is the sequential side: first, if the instruction fetch on the target side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the sequential side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch.
(2) If the branch direction of the branch instruction is fixed,
Instruction fetch is performed only on the side where the branch direction is determined (sequential side or target side). In that case, if a cache miss occurs, the memory bus is accessed to complete the instruction fetch.
[0038]
As described above, in the first embodiment, while the branch direction is undetermined, only an instruction fetch in the branch prediction direction is permitted to perform a memory bus access after a cache miss. For an instruction fetch that is not in the branch prediction direction, the memory bus access after a cache miss is prohibited, so that memory bus accesses for instruction fetches that are likely to be wasted are not performed. In either case, when a cache hit occurs, the instruction fetched by the cache hit is stored in the instruction buffer and the instruction fetch is completed.
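The rule of the first embodiment reduces to a small decision function. The sketch below assumes the two sides are represented by the strings "sequential" and "target"; the function name and its parameters are hypothetical and are not part of the embodiment.

```python
def bus_access_allowed_first(branch_determined: bool, predicted_side: str, fetch_side: str) -> bool:
    """First embodiment: may a cache miss on `fetch_side` proceed to the memory bus?"""
    if branch_determined:
        # After determination only the determined side is fetched, and a miss may use the bus.
        return True
    # While the branch is undetermined, only the predicted direction may use the bus.
    return fetch_side == predicted_side


for predicted in ("target", "sequential"):
    for side in ("sequential", "target"):
        allowed = bus_access_allowed_first(False, predicted, side)
        print(f"predicted={predicted:10s} fetch={side:10s} bus access={'yes' if allowed else 'no'}")
```

Printing the four undetermined-branch cases reproduces the rows (1-1) and (1-2) of the chart as described in the text.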
[0039]
Further, once the branch is determined by the branch determination signal S10, the address selection units 420 and 421 in the instruction fetch units 410 and 411 issue a new instruction fetch request for any instruction in the determined direction whose instruction fetch has not been completed. If a cache miss occurs at this time, the memory bus is accessed to fetch the necessary instruction. At that time, the instruction sequence following that instruction is also stored in the cache memory 52.
[0040]
FIG. 4 is a chart showing an instruction fetch operation in the second embodiment which is an improvement of the first embodiment. The instruction fetch operation will be described with reference to the chart. In the second embodiment,
(1) If the branch direction of the branch instruction is not fixed,
(1-1) When the branch prediction direction of the branch prediction unit is the target side, first, when an instruction cache miss occurs in the sequential instruction fetch, the memory bus is accessed and the instruction fetch is completed. Second, in the instruction fetch on the target side, when an instruction cache miss occurs, the memory bus is accessed to complete the instruction fetch.
(1-2) When the branch prediction direction of the branch prediction unit is the sequential side: first, if the instruction fetch on the target side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the sequential side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch.
(2) If the branch direction of the branch instruction is fixed,
Instruction fetch is performed only on the side where the branch direction is determined (sequential side or target side). In that case, if a cache miss occurs, the memory bus is accessed to complete the instruction fetch.
[0041]
The second embodiment differs from the first embodiment in that, when the branch prediction direction is the target side and a cache miss occurs for the instruction fetch on the sequential side, the memory bus is accessed to complete the instruction fetch even though the instruction is on the side different from the branch prediction direction. Such cache misses are infrequent, so permitting the memory bus access in this case does not greatly increase memory bus traffic.
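The difference from the first embodiment can be checked by enumerating the four cases that arise while the branch is undetermined. The following sketch uses hypothetical function names and string labels for the two sides; it only restates the two charts as predicates.

```python
SIDES = ("sequential", "target")


def allowed_first(predicted_side, fetch_side):
    # First embodiment: only the predicted direction may access the memory bus.
    return fetch_side == predicted_side


def allowed_second(predicted_side, fetch_side):
    # Second embodiment: only a target-side miss under a sequential prediction is stopped.
    return not (predicted_side == "sequential" and fetch_side == "target")


differences = [
    (predicted, side)
    for predicted in SIDES
    for side in SIDES
    if allowed_first(predicted, side) != allowed_second(predicted, side)
]
print(differences)  # [('target', 'sequential')]
```

The single differing case is the sequential-side miss under a target-side prediction, which the second embodiment lets through to the memory bus.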
[0042]
FIG. 5 is a table showing an instruction fetch operation in the third embodiment. The instruction fetch operation will be described with reference to the chart. In the third embodiment,
(1) If the branch direction of the branch instruction is not fixed,
(1-1) When the branch prediction direction of the branch prediction unit is the target side: first, if the instruction fetch on the sequential side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the target side causes an instruction cache miss, the instruction fetch is likewise stopped and the memory bus is not accessed.
(1-2) When the branch prediction direction of the branch prediction unit is the sequential side: first, if the instruction fetch on the target side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the sequential side causes an instruction cache miss, the instruction fetch is likewise stopped and the memory bus is not accessed.
(2) If the branch direction is fixed,
Instruction fetch is performed only on the side where the branch direction is determined (sequential side or target side). In this case, if a cache miss occurs, the memory bus is accessed and the instruction is fetched from the main memory to complete the instruction fetch.
[0043]
The third embodiment prohibits all memory bus access while the branch instruction has not been executed and the branch is undetermined, and permits the memory bus access only for instructions in the determined branch direction. While the branch is undetermined, an instruction fetch by memory bus access may be wasted, so memory bus access is prohibited in order to reduce memory bus traffic. Because the instructions in the determined branch direction are stored in the cache memory in advance, a cache miss itself is unlikely. Therefore, even if prefetching is limited to instruction fetches from the cache memory and both the sequential-side and target-side instruction sequences are stored in the instruction decoder, the instructions can be executed without significantly disturbing the overall pipeline operation.
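The behaviour of the third embodiment, where a missed prefetch is abandoned and the fetch is reissued after determination, can be sketched as below. The function `fetch_third` and the dictionary-based cache and main memory are hypothetical stand-ins for illustration only.

```python
def fetch_third(address, cache, main_memory, branch_determined):
    """Third embodiment sketch: a miss reaches the memory bus only after determination."""
    if address in cache:
        return cache[address]              # cache hit: always served
    if not branch_determined:
        return None                        # fetch stopped, no memory bus access
    cache[address] = main_memory[address]  # memory bus access after determination
    return cache[address]


cache = {0: "i0"}
main_memory = {0: "i0", 8: "i8"}

print(fetch_third(8, cache, main_memory, branch_determined=False))  # None: prefetch stopped
print(fetch_third(8, cache, main_memory, branch_determined=True))   # i8: refetched after determination
```

The second call models the new fetch request issued after the branch is determined; only then is the main memory read and the cache filled.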
[0044]
Finally, as a fourth embodiment, a method of reducing memory bus accesses other than those described above will be described. FIG. 6 is a table showing instruction fetch operations in the fourth embodiment. The instruction fetch operation will be described with reference to the chart. In the fourth embodiment,
(1) If the branch direction of the branch instruction is not fixed,
(1-1) When the branch prediction direction of the branch prediction unit is the target side: first, if the instruction fetch on the sequential side causes an instruction cache miss, the instruction fetch is stopped and the memory bus is not accessed. Second, if the instruction fetch on the target side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch.
(1-2) When the branch prediction direction of the branch prediction unit is the sequential side: first, if the instruction fetch on the target side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch. Second, if the instruction fetch on the sequential side causes an instruction cache miss, the memory bus is accessed to complete the instruction fetch.
(2) If the branch direction of the branch instruction is fixed,
Instruction fetch is performed only on the side where the branch direction is determined (sequential side or target side). In that case, if a cache miss occurs, the memory bus is accessed to complete the instruction fetch.
[0045]
In the fourth embodiment, while the branch instruction is undetermined, the memory bus access is withheld at least when the branch prediction direction is the target side and a cache miss occurs in the instruction fetch on the sequential side. The number of memory bus accesses can be reduced accordingly.
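The fourth embodiment stops exactly one of the four undetermined-branch cases, which the sketch below tabulates. The function name `allowed_fourth` and the string labels are hypothetical and are used only to restate the chart.

```python
def allowed_fourth(branch_determined, predicted_side, fetch_side):
    """Fourth embodiment: may a cache miss on `fetch_side` proceed to the memory bus?"""
    if branch_determined:
        return True
    # Only a sequential-side miss under a target-side prediction is stopped.
    return not (predicted_side == "target" and fetch_side == "sequential")


table = {
    (predicted, side): allowed_fourth(False, predicted, side)
    for predicted in ("target", "sequential")
    for side in ("sequential", "target")
}
for (predicted, side), allowed in table.items():
    print(f"predicted={predicted:10s} fetch={side:10s} bus access={'yes' if allowed else 'no'}")
```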
[0046]
As in the fourth embodiment, prohibiting the memory bus access for any chosen class of instruction fetch while the branch is undetermined reduces the number of memory bus accesses by that amount. However, there are then cases where the instruction in the predicted branch direction cannot be prefetched. It is desirable to choose the setting in consideration of the balance between prohibiting memory bus access and failing to prefetch instructions.
[0047]
Of the above four embodiments, the operation of the second embodiment that balances the prohibition of memory bus access and the failure of instruction prefetch to some extent will be described with reference to FIG. As a premise, it is assumed that the instruction fetch on the sequential side is performed on the port 0 side, and the instruction fetch on the target side is performed on the port 1 side.
(1) When the branch direction of the branch instruction is not fixed,
(1-1) When the branch prediction direction in the branch prediction units 430 and 431 is the target side, for the instruction fetch on the sequential side, the instruction fetch unit 410 (Port-0) of the CPU 40 sends the instruction fetch request S20 to the cache memory unit 50, and the instruction fetch request is passed to the instruction cache memory 52. In addition to the fetch address, information indicating whether or not the branch has been determined, branch prediction information, and the like are attached to this instruction fetch request.
[0048]
If an instruction cache miss occurs in the instruction cache memory 52, the signal CHM is returned to the cache control unit 54, and the cache control unit 54 issues a memory bus access request to the memory bus access unit 60. In response, the memory bus access unit 60 accesses the memory bus 62, reads the instruction from the main memory 64, and passes it to the cache control unit 54, which writes it to the cache memory 52 and stores it in the instruction buffer (0) 470 in the CPU 40 to complete the instruction fetch. An instruction fetch completion signal S22 is then returned to the instruction fetch unit 410.
[0049]
Since the frequency of cache misses for sequential instruction fetches is not so high, even if memory bus access is permitted in this case, the efficiency of the entire memory bus is not reduced much.
[0050]
In the instruction fetch on the target side, an instruction fetch request is supplied from the instruction fetch unit 411 in the CPU 40 to the cache control unit 56 (Port-1), and the instruction fetch request is passed to the instruction cache memory 52.
[0051]
When an instruction cache miss occurs in the instruction cache memory 52, the cache control unit 56 (Port-1) issues a memory bus access request to the memory bus access unit 60, and the memory bus access unit 60 accesses the memory bus 62. The instruction is read from the main memory 64, transferred to the cache control unit 56 (Port-1), written to the cache memory 52, and stored in the instruction buffer (1) 471 of the CPU 40, thereby completing the instruction fetch. An instruction fetch completion signal is then returned to the instruction fetch unit 411.
[0052]
In this case, the cache miss is for an instruction in the branch prediction direction, which is likely to be used, so permitting the memory bus access and completing the prefetch prevents the pipeline operation after the branch from being disturbed.
(1-2) When the branch prediction direction in the branch prediction unit is the sequential side, for the instruction fetch on the target side, an instruction fetch request is sent from the instruction fetch unit 411 in the CPU 40 to the cache control unit 56 (Port-1), and the instruction fetch request is passed to the instruction cache memory 52.
[0053]
Even if an instruction cache miss occurs in the instruction cache memory 52, the cache control unit (Port-1) 56 does not issue a memory bus access request to the memory bus access unit 60. As a result, the memory bus access unit 60 does not access the memory bus. Then, the cache control unit 56 stops the instruction fetch and returns a result signal indicating that the instruction fetch is canceled to the address selection unit 421.
[0054]
On the other hand, in the instruction fetch on the sequential side, an instruction fetch request is issued from the instruction fetch unit 410 in the CPU 40 to the cache control unit (Port-0) 54, and the instruction fetch request is passed to the instruction cache memory 52.
[0055]
If an instruction cache miss occurs in the instruction cache memory 52, the cache control unit 54 issues a memory bus access request to the memory bus access unit 60, and the memory bus access unit 60 accesses the memory bus 62, reads the instruction from the main memory 64, and returns it to the cache control unit 54. The cache control unit 54 writes the instruction to the cache memory 52 and stores it in the instruction buffer (0) 470 of the CPU, thereby completing the instruction fetch.
(2) When the branch direction is fixed by executing a branch instruction
The instruction fetch units 410 and 411 fetch instructions only on the side (sequential side or target side) on which the branch direction is determined by execution of the branch instruction. If the determined branch direction is the sequential side, the instruction fetch unit 410 requests bus access from the memory bus access unit 60 via the cache control unit (Port-0) 54. The memory bus access unit 60 reads the requested instruction from the main memory 64, stores it in the instruction buffer (0) 470 and the cache memory 52 via the cache control unit 54, and completes the instruction fetch.
[0056]
When the determined branch direction is the target side, the instruction fetch unit 411 requests bus access from the memory bus access unit 60 via the cache control unit (Port-1) 56. The memory bus access unit 60 reads the requested instruction from the main memory 64, stores it in the instruction buffer (1) 471 via the cache control unit 56, and completes the instruction fetch. Note that when the determined branch direction is the target side, the roles are exchanged: the former target side becomes the sequential side, and the former sequential side becomes the target side.
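The exchange of roles between the two ports after a branch to the target side can be sketched as follows. The dictionary mapping port numbers to roles and the function name `roles_after_determination` are assumptions made for illustration; the description only states that the sequential and target roles are swapped.

```python
def roles_after_determination(roles, determined_side):
    """Return the port roles after the branch is determined.

    roles maps a port number to "sequential" or "target".
    """
    if determined_side == "target":
        swap = {"sequential": "target", "target": "sequential"}
        return {port: swap[role] for port, role in roles.items()}
    return dict(roles)  # branch not taken: roles are unchanged


before = {0: "sequential", 1: "target"}
print(roles_after_determination(before, "target"))      # {0: 'target', 1: 'sequential'}
print(roles_after_determination(before, "sequential"))  # {0: 'sequential', 1: 'target'}
```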
[0057]
FIG. 7 is a chart showing a specific pipeline operation when memory bus access is restricted by the first or second embodiment. This example shows the pipeline operation for the sequential-side instruction sequence 01 to 09 and the target-side instruction sequence 51 to 54 of the branch instruction 03, which are shown below the table in the figure. In this example, the branch prediction for the branch instruction 03 is that the branch is not taken, that is, the sequential direction is predicted.
[0058]
Pipeline operations consist of the following stages.
P: Instruction fetch request stage: The CPU issues an instruction fetch request to the cache control unit. At this stage, an instruction fetch request is made with a distinction between prefetch with branch unconfirmed and fetch after branch confirmation.
T: Fetch stage: A hit/miss determination is made in the cache memory and preparations for fetching an instruction are made.
C: Instruction buffer stage: An instruction is fetched into the instruction buffer.
D: Decode stage: The instruction decoder decodes the instruction and generates a control signal.
E: Execution stage: An instruction is executed in response to the control signal of the decoding result.
W: Write stage: The result of executing the instruction is written to the register.
M: Cache miss: A cache miss occurred.
B: Bus access holding stage: An address is held in the bus access address holding unit for accessing the memory bus.
R: Bus access request stage: A read request is issued to the memory bus access unit. Assume that 18 cycles are required until an instruction is read by bus access.
[0059]
Referring back to FIG. 7, the instruction 01 is fetched from the cache memory through the instruction fetch request stage P in cycle 1 and the fetch stage T in cycle 2, is taken into the instruction buffer in cycle 3 (stage C), and is decoded in cycle 4 (stage D). The instruction is then executed over the three cycles 5, 6, and 7 (stage E). After execution, the execution result is written into the registers (stage W).
[0060]
The instruction 02 is also taken into the instruction buffer through stages P, T, and C. Then, in the next cycle 8 when the execution stage E of the instruction 01 is completed, the instruction 02 waiting in the decode stage D is executed (stage E), and the execution result is written to the register (stage W).
[0061]
The instruction 03 is identified as a branch instruction by the branch prediction unit at the instruction buffer stage C, and the branch direction is predicted to be the sequential side. Accordingly, instruction prefetch of the target-side instruction sequence 51, 52, 53 is also started from cycle 6.
[0062]
The instructions 03 to 07 all hit in the cache, and their execution stages E are executed without disturbing the pipeline cycle. Assume that the instructions 08 to 10 cause a cache miss (stage M). Further, assume that the instructions 51 to 53 on the target side also cause a cache miss (stage M).
[0063]
The instruction 08 is requested in cycle 8 (stage P) as an instruction prefetch while the branch of the branch instruction 03 is still unconfirmed. A cache miss then occurs in cycle 10, but in the first and second embodiments the memory bus access is permitted when a sequential-side instruction causes a cache miss while the branch prediction is the sequential side. Therefore, the bus access holding stage B is entered in cycle 11 and the bus access request stage R is entered from cycle 12. Since the bus access request stage R is assumed to require 18 cycles, the fetched instruction is stored in the instruction buffer in cycle 30 (instruction buffer stage C).
[0064]
Because the memory bus access for the instruction 08 also fetches the subsequent instructions from the main memory and stores them in the cache memory, the instruction buffer stages C of the instruction 09 and later follow the stage C of the instruction 08.
[0065]
On the other hand, the instruction 51 causes a cache miss at the time of the cycle 8, but since the branch prediction direction is the sequential side, the memory bus access to the instruction 51 on the target side is prohibited. Similarly, instructions 52 and 53 are also prohibited from accessing the memory bus. Therefore, in the cycle 12 in which the instruction 08 requests the memory bus access, the memory bus is in an empty state and the memory bus can be accessed immediately, and the instruction 08 is executed in the cycle 32 (stage E).
[0066]
Since the instructions 11 and 12 reach the instruction fetch request stage P after the branch is confirmed, the memory bus access is performed even if a cache miss occurs. In the example of FIG. 7, however, the instructions 11 and 12 have already been stored in the cache memory by the memory bus access for the instruction 08, so no cache miss occurs.
[0067]
The example of FIG. 7 operates similarly in both the first embodiment and the second embodiment. That is, even if a cache miss occurs for the prefetch of the instruction 08 when the branch is not yet confirmed, the memory bus access is permitted for the instruction 08 on the branch prediction direction side.
[0068]
In the case of the third embodiment, the memory bus access after a cache miss is prohibited with respect to the prefetch of the instruction 08 when the branch is not confirmed. In this case, in response to the instruction fetch from the instruction fetch unit again after the branch is confirmed, the instruction is fetched by a memory bus access after a cache miss. In this case, the memory bus access is performed at high speed.
[0069]
FIG. 8 is a chart showing a specific pipeline operation when the memory bus access of the conventional example is not restricted. This example also shows the pipeline operation for the same instruction sequence as in FIG.
[0070]
In this example, the instruction 51 is not in the branch prediction direction, but memory bus access is permitted for it. Therefore, its bus access request stage R starts from cycle 10. Since this stage R requires 18 cycles, the memory bus is busy when the instruction 08 causes a cache miss (stage M) in cycle 10, and the memory bus access R for the instruction 08 is kept waiting until cycle 28. As a result, the execution stage E of the instruction 08 is delayed until cycle 48.
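The cycle counts quoted for FIG. 7 and FIG. 8 are consistent with a simple timing model, sketched below. The model assumes that stage C occurs 18 cycles after stage R begins and that stage E follows stage C by two cycles (one decode cycle in between); this two-cycle offset is inferred from the quoted numbers (stage C in cycle 30, stage E in cycle 32) rather than stated explicitly, and the function and constant names are hypothetical.

```python
BUS_ACCESS_CYCLES = 18  # length of stage R assumed in the description


def buffer_and_execute_cycle(request_cycle, bus_free_cycle=0):
    """Return (cycle of stage C, cycle of stage E) for a cache-missed instruction.

    request_cycle: first cycle in which the instruction could enter stage R.
    bus_free_cycle: first cycle in which the memory bus is free.
    """
    r_start = max(request_cycle, bus_free_cycle)  # wait while the bus is busy
    c_cycle = r_start + BUS_ACCESS_CYCLES
    return c_cycle, c_cycle + 2                   # D at c_cycle + 1, E at c_cycle + 2


# FIG. 7: instruction 08 enters stage R in cycle 12 with the bus free.
print(buffer_and_execute_cycle(12))  # (30, 32)

# FIG. 8: instruction 51 occupies the bus from cycle 10, so the bus is free again in cycle 28.
print(buffer_and_execute_cycle(12, bus_free_cycle=10 + BUS_ACCESS_CYCLES))  # (46, 48)
```

Under this model the unrestricted case delays the execution of instruction 08 by 16 cycles, from cycle 32 to cycle 48.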
[0071]
As described above, compared with the conventional example, the present embodiment restricts memory bus access while the branch is not yet determined. Therefore, memory bus access can be performed efficiently for instructions that are likely to be used, and disturbance of the pipeline cycle can be minimized.
[0072]
As described above, the protection scope of the present invention is not limited to the above-described embodiment, but extends to the invention described in the claims and equivalents thereof.
[0073]
[Effect of the invention]
As described above, according to the present invention, access to the main memory at the time of a cache miss is appropriately restricted for instruction fetches made while the branch is unconfirmed, so that the main memory can be accessed more efficiently for instructions in the branch prediction direction and for instructions fetched after the branch is confirmed.
[Brief description of the drawings]
FIG. 1 is a system diagram of an information processing apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of a cache control unit.
FIG. 3 is a chart showing an instruction fetch operation in the first embodiment.
FIG. 4 is a chart showing an instruction fetch operation in the second embodiment.
FIG. 5 is a chart showing an instruction fetch operation in the third embodiment.
FIG. 6 is a chart showing an instruction fetch operation in the fourth embodiment.
FIG. 7 is a chart showing a specific pipeline operation when memory bus access is restricted according to the first or second embodiment.
FIG. 8 is a chart showing a specific pipeline operation in the case of a conventional example.
[Explanation of symbols]
40 CPU
410,411 Instruction fetch section
430,431 Branch prediction unit
50 cache memory unit
52 cache memory
54, 56 Cache control unit
60 Memory bus access section
62 Memory bus
64 Main memory

Claims

1. An information processing apparatus having an instruction fetch unit that fetches both the sequential-side and target-side instruction sequences of a branch instruction, a cache control unit that fetches an instruction from a cache memory or a main memory in response to a fetch request from the instruction fetch unit, a memory bus access unit that accesses the main memory, and an instruction buffer that holds the fetched instruction,
the apparatus further comprising a branch prediction unit that performs branch prediction of a branch instruction stored in the instruction buffer prior to execution of the branch instruction,
wherein, when the branch direction of the branch instruction is undetermined, the cache control unit fetches an instruction from the cache memory, performs a memory bus access to fetch the instruction when a cache miss occurs for an instruction in the branch prediction direction of the branch instruction, and stops the instruction fetch without performing a memory bus access when a cache miss occurs for an instruction that is not in the branch prediction direction.

2. An information processing apparatus having an instruction fetch unit that fetches both the sequential-side and target-side instruction sequences of a branch instruction, a cache control unit that fetches an instruction from a cache memory or a main memory in response to a fetch request from the instruction fetch unit, a memory bus access unit that accesses the main memory, and an instruction buffer that holds the fetched instruction, the apparatus comprising:
a branch prediction unit that predicts the branch direction of a branch instruction stored in the instruction buffer before the branch instruction is executed,
wherein, while the branch direction of the branch instruction is unconfirmed, the cache control unit fetches instructions from the cache memory and, when the branch prediction direction of the branch instruction is the sequential side, stops the instruction fetch without performing a memory bus access when a cache miss occurs for a target-side instruction.

3. The information processing apparatus according to claim 1 or 2, wherein, when the branch direction of the branch instruction is confirmed, the cache control unit performs a memory bus access for a cache-missed instruction in the confirmed branch direction.

4. The information processing apparatus according to claim 3, wherein, while the branch direction of the branch instruction is unconfirmed, an instruction that hits in the cache is prefetched and stored in the instruction buffer.

5. The information processing apparatus according to claim 3, wherein, of the instructions in the instruction buffer, the sequential-side or target-side instruction is selected and decoded according to the branch direction predicted by the branch prediction unit.
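
As a reading aid, the behavior recited in the dependent claims (prefetching cache hits into the instruction buffer, decoding from the predicted side, and deferring the bus access of a stopped fetch until the branch is confirmed) can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the class `FetchModel`, its methods, and the use of plain addresses in place of instructions are all hypothetical, and one stopped fetch per side is assumed for simplicity.

```python
class FetchModel:
    """Toy model of the cache control unit and the two instruction buffers."""

    def __init__(self, cached_addresses, predicted_side):
        self.cache = set(cached_addresses)      # addresses present in the cache memory
        self.predicted_side = predicted_side    # "sequential" or "target"
        self.buffers = {"sequential": [], "target": []}
        self.stopped = {}                       # side -> address whose fetch was stopped
        self.bus_accesses = []                  # addresses fetched over the memory bus

    def _bus_access(self, side, address):
        # A bus access fills both the cache memory and the instruction buffer.
        self.bus_accesses.append(address)
        self.cache.add(address)
        self.buffers[side].append(address)

    def fetch_unconfirmed(self, side, address):
        """Handle a fetch request while the branch direction is unconfirmed."""
        if address in self.cache:
            self.buffers[side].append(address)  # cache hit: prefetch into the buffer
            return "hit"
        if side == self.predicted_side:
            self._bus_access(side, address)     # miss on the predicted side
            return "bus_access"
        self.stopped[side] = address            # miss on the other side: no bus access
        return "stopped"

    def decode_source(self):
        """Buffer the decoder reads from before the branch is confirmed."""
        return self.buffers[self.predicted_side]

    def confirm_branch(self, confirmed_side):
        """On confirmation, perform the deferred bus access for the confirmed side."""
        address = self.stopped.pop(confirmed_side, None)
        if address is not None:
            self._bus_access(confirmed_side, address)
        self.stopped.clear()                    # a stopped fetch on the other side is dropped
        return self.buffers[confirmed_side]


model = FetchModel(cached_addresses={0x100}, predicted_side="sequential")
model.fetch_unconfirmed("sequential", 0x100)    # hit, stored in the sequential buffer
model.fetch_unconfirmed("target", 0x200)        # miss off the predicted side, stopped
confirmed_buffer = model.confirm_branch("target")  # misprediction: bus access happens now
```

In this run no memory bus access is issued until the branch is confirmed as taken, at which point the single access for the target-side instruction is performed.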