WO2000008555A1 - Data processing device - Google Patents

Data processing device

Info

Publication number
WO2000008555A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
stage
register file
result
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/EP1999/005520
Other languages
English (en)
Inventor
Fransiscus W. Sijstermans
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV
Publication of WO2000008555A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
        • G06F9/30003 Arrangements for executing specific machine instructions
            • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
                • G06F9/30025 Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
                • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
        • G06F9/30098 Register arrangements
            • G06F9/30141 Implementation provisions of register files, e.g. ports
        • G06F9/30181 Instruction operation extension or modification
        • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
            • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
                • G06F9/3858 Result writeback, i.e. updating the architectural state or memory
            • G06F9/3867 Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
            • G06F9/3885 Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units

Definitions

  • the invention relates to a data processing device with an instruction execution pipeline.
  • PCT patent application No. WO 98/11483 teaches a data processing device with an instruction pipeline.
  • the pipeline contains a series of processing stages from a front end to a back end, for performing successive operations during the execution of an instruction.
  • the final stage of the back end writes back a processing result to a register file.
  • the pipeline can process several instructions in parallel, because the front end processing stages can start executing an instruction before the back end processing stages have produced and written back the result of an earlier instruction.
  • the data processing device is described in Claim 1.
  • the invention provides for the possibility to write back results from different processing stages in the pipeline directly after such a processing stage completes processing of an instruction, that is, without passing the entire pipeline and before the entire pipeline has had the opportunity to process the instruction.
  • a first processing stage might perform an arithmetic operation and a second processing stage might perform a clipping operation on the result of the arithmetic operation.
  • one may include two types of instruction in the instruction set of the data processing device, one type for arithmetic operations with clipping and one type for arithmetic operations without clipping. In case of an operation with clipping the result would be written back from the second processing stage (after completion of the clipping operation) and in case of an operation without clipping the result would be written back from the first processing stage (before completion of the clipping operation).
  • the data processor may even write the results of both the first and the second stage (e.g. with and without clipping) in response to some instructions. This means that the result of the first stage is written back to the register file directly after that processing stage produces it, that is, earlier than if the processor had to wait for a time period corresponding to the time needed by the second processing stage.
  • Writing to the register file is normally followed by writing to a register after a predetermined delay, but without deviating from the invention, some types of register file may introduce a variable delay until writing is complete, for example in order to resolve access conflicts.
  • the register file is provided with more than one write port, so that results from different stages of the pipeline can be written back in parallel.
  • different write ports of the register file are assigned to different processing stages, so that the pipeline is connected to more write ports than needed for writing the result of individual instructions, in order to be able to write results of different instructions in the pipeline from different processing stages in parallel.
  • Figure 1 shows an architecture of a data processing device.
  • Figure 2 shows a functional unit.
  • FIG 1 shows the architecture of a data processor.
  • the processor contains a register file 10, a number of functional units 12a-f and an instruction issue unit 14.
  • the instruction issue unit 14 has instruction issue connections to the functional units 12a-f.
  • the functional units 12a-f are connected to the register file 10 via read and write ports.
  • a first one of the functional units 12a has two read ports and two write ports connected to the register file 10.
  • Figure 2 shows the first one of the functional units 70, with a cascade of a first and second sub-unit 72, 74. An output of the first sub-unit 72 is coupled to an input of the second sub-unit 74 and to a write port of the register file 10.
  • the functional unit 70 contains two control units 76, 78 coupled to a control input of the first and second sub-unit 72, 74 respectively.
  • An input of the first control unit 76 is coupled to an output of the instruction issue unit for receiving an opcode.
  • An output of the first control unit 76 is coupled to an input of the second control unit 78.
  • the instruction issue unit 14 fetches successive instruction words from an instruction memory (not shown explicitly). Each instruction word may contain several instructions for different ones of the functional units 12a-f. Normally, each instruction contains fields specifying an opcode, one or more source registers and one result register. When the instruction issue unit 14 has fetched an instruction word from instruction memory, the fields specifying the source registers in a particular instruction are decoded and used to address the register file 10. In response, the register file 10 supplies the content of the source registers to the functional unit 12a-f that will execute the particular instruction.
  • the field specifying the opcode and the content of the source registers are supplied to the functional unit 70.
  • the functional unit 70 operates in successive processing cycles.
  • a control signal for the first sub-unit 72 is generated by the first control unit 76, dependent on the opcode.
  • the first sub-unit 72 generates a result which the first sub-unit may write to the register file via the write port (writing depends on the control signal).
  • the result (and possible additional information) is passed to the second sub-unit 74.
  • a further control signal dependent on the opcode is passed from the first control unit 76 to the second control unit 78.
  • the second sub-unit 74 processes the result generated by the first sub-unit 72 under control of the control signal passed by the second control unit 78.
  • a second result, generated by the second sub-unit 74, may be written to the register file via a write port (writing depends on the control signal from the second control unit 78).
  • the first control unit 76 may already cause the first sub-unit 72 to process a subsequent instruction.
  • processors that have two or more functional units that can start processing different instructions in parallel, such as VLIW processors.
  • These processors can execute further instructions I3 and I4 that use the results of I1 and I2 respectively. Due to the invention such a processor can start I3 and I4 in the same cycle, which makes processing faster.
  • the first sub-unit 72 may be for example an ALU and the second sub-unit 74 may be a clipping unit or a rounding unit.
  • the instruction may be for example an "ADD" instruction.
  • in response to a first type of ADD instruction, the first sub-unit 72 adds the source operands and writes the sum to the register file via its write port, i.e. without involvement of the second sub-unit 74; the second sub-unit 74 refrains from writing to its write port if it receives this first type of ADD instruction.
  • in response to a second type of ADD instruction, the first sub-unit 72 adds the source operands, but it refrains from writing the sum to the register file via its write port; the second sub-unit 74 responds to the second type of ADD instruction e.g. by rounding or clipping the sum, which the second sub-unit 74 receives from the first sub-unit 72. Also in response to the second type of ADD instruction the second sub-unit 74 writes the result of its operation on the sum to the write port of the second sub-unit 74.
  • adding and rounding or clipping are used here merely by way of example. Many other types of operations which produce meaningful intermediate results may be used, e.g. other arithmetic or logic operations or vector operations instead of ADD, and further arithmetic or logic operations on the result of the first sub-unit 72 instead of rounding or clipping.
  • the functional unit may respond to some instructions by writing back from both of the sub-units. This leads to the following pipeline table.
  • each sub-unit 72, 74 itself may contain one or more further subunits, or pipeline stages which process the instruction in successive processing stages.
  • more sub-units for implementing different pipeline stages may be placed in series with the first and second sub-unit 72, 74.
  • more than two of such further sub-units may be connected to their own write ports to the register file 10 for writing a result produced at an intermediate stage in the pipeline.
  • the pipeline table may be
  • forks in the pipeline may be included, where one sub-unit feeds two or more further subunits in parallel, one or more of these sub-units having their own write ports for writing results back to the register file 10.
  • one may include one or more sub-units (not shown) in parallel to the first sub-unit 72, each having its own instruction and operand inputs and its own write port for writing to the register file 10.
  • these one or more sub-units and the first sub-unit 72 may feed a single second sub-unit 74 in parallel via a multiplexer (not shown), the pipelined instructions determining from which of the sub-units the multiplexer passes results to the second sub-unit 74.
  • several instructions may be executed in parallel and a selected one of them may be followed by postprocessing in the second sub-unit 74.
  • a compiler for the processor will have to schedule operations in such a way that results are produced timely, without conflicts about the use of functional units 12a-f or registers.
  • the compiler can treat the functional unit 70 more or less as two or more conceptually different functional units, one for processing instructions without processing by the second sub-unit 74 and one for processing instructions including processing by the second sub-unit 74. These conceptually different functional units have different latencies.
  • the compiler will avoid scheduling instructions simultaneously at the functional unit, but the compiler may schedule the start of a further instruction at a time when the second sub-unit 74 is still working on the previous instruction. Owing to the invention the compiler can schedule instructions that use the result of the further instruction earlier, for example as early as an instruction that uses a result of the previous instruction.
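
The behaviour of the functional unit 70 described above can be illustrated with a small cycle-level model. This is only a sketch under stated assumptions, not the circuit of the patent: the opcode names `ADD`, `ADD_CLIP` and `ADD_BOTH`, the 8-bit signed clipping range, the one-cycle latency per sub-unit and the rule that writes become visible at the end of a cycle are all hypothetical choices made for illustration. The model shows a clipped ADD issued in cycle 0 and a plain ADD issued in cycle 1 writing their results in the same cycle through different write ports.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical opcodes: the source only names "ADD" and describes a second
# type of ADD whose sum is rounded or clipped by the second sub-unit.
ADD = "ADD"            # first sub-unit writes the sum back
ADD_CLIP = "ADD_CLIP"  # second sub-unit writes the clipped sum back
ADD_BOTH = "ADD_BOTH"  # both sub-units write back, to two result registers

CLIP_MIN, CLIP_MAX = -128, 127  # assumed clipping range (8-bit signed)


@dataclass
class Instr:
    opcode: str
    src_a: int
    src_b: int
    dst: int                    # result register
    dst2: Optional[int] = None  # second result register (ADD_BOTH only)


@dataclass
class FunctionalUnit:
    regs: Dict[int, int]
    # Latch between the sub-units: (instruction, result of first sub-unit).
    latch: Optional[Tuple[Instr, int]] = None
    # Write-back log entries: (cycle, port, register, value).
    log: List[Tuple[int, str, int, int]] = field(default_factory=list)

    def step(self, cycle: int, issued: Optional[Instr] = None) -> None:
        writes: List[Tuple[str, int, int]] = []

        # Second sub-unit: post-processes the result latched last cycle and
        # writes via its own port only if the opcode asks for it.
        if self.latch is not None:
            instr, total = self.latch
            if instr.opcode in (ADD_CLIP, ADD_BOTH):
                clipped = max(CLIP_MIN, min(CLIP_MAX, total))
                dst = instr.dst2 if instr.opcode == ADD_BOTH else instr.dst
                writes.append(("port2", dst, clipped))
        self.latch = None

        # First sub-unit: adds the operands of the newly issued instruction
        # and writes via its own port only if the opcode asks for it.
        if issued is not None:
            total = self.regs[issued.src_a] + self.regs[issued.src_b]
            if issued.opcode in (ADD, ADD_BOTH):
                writes.append(("port1", issued.dst, total))
            self.latch = (issued, total)

        # Assumed timing: all writes of a cycle commit at the end of it.
        for port, reg, value in writes:
            self.regs[reg] = value
            self.log.append((cycle, port, reg, value))


def run_example() -> List[Tuple[int, str, int, int]]:
    fu = FunctionalUnit(regs={1: 100, 2: 100, 3: 0, 4: 0})
    fu.step(0, Instr(ADD_CLIP, 1, 2, dst=3))  # clipped add, two stages
    fu.step(1, Instr(ADD, 1, 2, dst=4))       # plain add, one stage
    fu.step(2)                                # drain the pipeline
    return fu.log
```

In this sketch `run_example()` logs two writes in cycle 1, one per port, so instructions that consume either result could be scheduled from the same following cycle. From a scheduler's point of view the unit then looks like two conceptual units with different latencies that share one issue slot.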

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Advance Control (AREA)

Abstract

The invention relates to a data processing device that has an instruction execution pipeline with at least a first and a second processing stage, directly or indirectly in series. The stages perform a first and a second step of execution of the instruction, a first and a second mutually different number of processing cycles after the instruction enters the pipeline. Both the first and the second stage are connected to a register file, so that a processing result obtained in the first and/or the second step can be written to the register file after the first and the second number of processing cycles respectively.
PCT/EP1999/005520 1998-08-06 1999-07-29 Data processing device Ceased WO2000008555A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP98202647.8 1998-08-06
EP98202647 1998-08-06
EP98203425 1998-10-09
EP98203425.8 1998-10-09

Publications (1)

Publication Number Publication Date
WO2000008555A1 (fr) 2000-02-17

Family

ID=26150605

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/005520 Ceased WO2000008555A1 (fr) 1998-08-06 1999-07-29 Data processing device

Country Status (1)

Country Link
WO (1) WO2000008555A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4228497A (en) * 1977-11-17 1980-10-14 Burroughs Corporation Template micromemory structure for a pipelined microprogrammable data processing system
EP0653703A1 (fr) * 1993-11-17 1995-05-17 Sun Microsystems, Inc. Temporary register file for a superpipelined superscalar processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"METHOD TO MAINTAIN PIPELINE THROUGHPUT WHILE PIPELINE DEPTH IS ALLOWED TO VARY", IBM TECHNICAL DISCLOSURE BULLETIN,US,IBM CORP. NEW YORK, vol. 39, no. 5, pages 31-32, XP000584045, ISSN: 0018-8689 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100418645C (zh) * 2003-03-21 2008-09-17 迪纳帕克压紧设备股份公司 Adjusting device for adjusting the eccentric moment of the eccentric shaft of a compactor roller
EP2866138B1 (fr) * 2013-10-23 2019-08-07 Teknologian tutkimuskeskus VTT Oy Pipeline with floating-point support for emulated shared memory architectures
EP2887207B1 (fr) * 2013-12-19 2019-10-16 Teknologian tutkimuskeskus VTT Oy Architecture for long latency operations in emulated shared memory architectures
JP2021168189A (ja) * 2020-07-15 2021-10-21 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Apparatus and method for writing back an instruction execution result, and processing apparatus
EP3940531A1 (fr) * 2020-07-15 2022-01-19 Kunlunxin Technology (Beijing) Company Limited Apparatus and method for writing an instruction execution result, and processing apparatus
JP7229305B2 (ja) 2020-07-15 2023-02-27 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Apparatus and method for writing back an instruction execution result, and processing apparatus
CN118963839A (zh) * 2024-07-30 2024-11-15 中山大学 Pseudo two-stage pipeline processor based on RV32I instructions and control method thereof

Similar Documents

Publication Publication Date Title
US20020169942A1 (en) VLIW processor
EP1658559B1 (fr) Instruction controlled data processing device and method
JP2918631B2 (ja) Decoder
US5404552A (en) Pipeline risc processing unit with improved efficiency when handling data dependency
US7281119B1 (en) Selective vertical and horizontal dependency resolution via split-bit propagation in a mixed-architecture system having superscalar and VLIW modes
JP3881763B2 (ja) Data processing device
CN102063286B (zh) Program flow control
US6260189B1 (en) Compiler-controlled dynamic instruction dispatch in pipelined processors
US6145074A (en) Selecting register or previous instruction result bypass as source operand path based on bypass specifier field in succeeding instruction
JP2002512399A (ja) RISC processor with context switch register sets accessible by an external coprocessor
JPH11224194A5 (fr)
US6154828A (en) Method and apparatus for employing a cycle bit parallel executing instructions
JP2003005958A (ja) Data processing device and control method thereof
JP3578883B2 (ja) Data processing device
JP2874351B2 (ja) Parallel pipeline instruction processing device
WO2000008555A1 (fr) Data processing device
US7111152B1 (en) Computer system that operates in VLIW and superscalar modes and has selectable dependency control
US6099585A (en) System and method for streamlined execution of instructions
JPH08272611A (ja) Microprocessor
US7302555B2 (en) Zero overhead branching and looping in time stationary processors
JP3182591B2 (ja) Microprocessor
US6981130B2 (en) Forwarding the results of operations to dependent instructions more quickly via multiplexers working in parallel
JP2878792B2 (ja) Electronic computer
JP3534987B2 (ja) Information processing device
US6032249A (en) Method and system for executing a serializing instruction while bypassing a floating point unit pipeline

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase