CN118885218A - Instruction processing method and device for out-of-order multi-issue processor - Google Patents
- Publication number
- CN118885218A (application number CN202410917375.8A)
- Authority
- CN
- China
- Prior art keywords
- instruction
- target
- csr
- instructions
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
- G06F9/3832—Value prediction for operands; operand history buffers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3806—Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3838—Dependency mechanisms, e.g. register scoreboarding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3854—Instruction completion, e.g. retiring, committing or graduating
- G06F9/3858—Result writeback, i.e. updating the architectural state or memory
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
- G06F9/3869—Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Advance Control (AREA)
Abstract
An embodiment of the present application provides an instruction processing method and device for an out-of-order multi-issue processor. The method comprises: acquiring and caching a target instruction and its corresponding operands issued by a preset instruction issue queue; storing multiple target instructions and their corresponding operands in a preset first-in-first-out (FIFO) queue; and, for each target instruction and its corresponding operands in the FIFO queue, executing the target instruction and sending the execution result to a write-back module of the out-of-order multi-issue processor once the corresponding first synchronization signal has been sent to a delivery module of the processor and a second synchronization signal sent by the delivery module has been received. In this way, when the instructions in the pipeline may be flushed because of an exception, an interrupt, or a branch misprediction, it is possible to determine which instructions in the out-of-order multi-issue processor may be executed and which may not.
Description
Technical Field
The present application relates to the technical field of out-of-order processors, and in particular to an instruction processing method and apparatus for an out-of-order multi-issue processor, a computer device, and a storage medium.
Background Art
At present, the functions of the pipeline of an out-of-order multi-issue processor include, but are not limited to, instruction fetch and branch prediction, instruction decoding, analyzing data dependencies between instructions and eliminating false data hazards, issuing instructions to functional units, executing instructions in parallel, and updating the machine state in the original program order. During execution, if any instruction encounters a branch misprediction, an interrupt, or an exception, the instructions in the entire pipeline are flushed, which can make it impossible to determine which instructions in the out-of-order multi-issue processor may be executed and which may not.
Summary of the Invention
Embodiments of the present application provide an instruction processing method and apparatus for an out-of-order multi-issue processor, a computer device, and a storage medium.
A first aspect of the embodiments of the present application provides an instruction processing method for an out-of-order multi-issue processor, comprising:
acquiring and caching the CSR instructions and their corresponding operands issued by a preset instruction issue queue;
storing multiple CSR instructions and their corresponding operands in a preset first-in-first-out (FIFO) queue;
for each CSR instruction and its corresponding operands in the FIFO queue, executing the CSR instruction and sending the execution result to the write-back module of the out-of-order multi-issue processor once the corresponding first synchronization signal has been sent to the delivery module of the processor and the second synchronization signal sent by the delivery module has been received, the delivery module sending the delivery result of the CSR instruction to the write-back module of the out-of-order multi-issue processor.
In an optional embodiment of the present application, acquiring and caching the CSR instructions and their corresponding operands issued by the preset instruction issue queue includes:
acquiring the CSR instruction issued by the preset instruction issue queue;
determining the operand corresponding to the CSR instruction;
caching the CSR instruction issued by the preset instruction issue queue together with its corresponding operand.
In an optional embodiment of the present application, determining the operand corresponding to the CSR instruction includes:
obtaining the address of the operand register specified in the CSR instruction, wherein the specified operand register stores the operand corresponding to the CSR instruction;
reading the operand corresponding to the CSR instruction from the specified operand register according to that address.
In an optional embodiment of the present application, storing the multiple CSR instructions and their corresponding operands in the preset FIFO queue includes:
storing the different CSR instructions and their corresponding operands in the preset FIFO queue in the order in which their operands were read.
In an optional embodiment of the present application, the first synchronization signal is used to control the read and write operations of the CSR instruction, and the second synchronization signal is used to deliver (commit) the instruction.
In an optional embodiment of the present application, executing the CSR instruction and sending the execution result to the write-back module of the out-of-order multi-issue processor includes:
updating the value of the CSR instruction, generating a corresponding clear signal, and storing the updated value and the clear signal in a preset register, wherein the clear signal is the signal that flushes the pipeline;
sending the updated value of the CSR instruction, the clear signal, and the storage record of the preset register to the write-back module.
In an optional embodiment of the present application, before the CSR instruction is executed, the method further includes:
when the CSR instruction is a read-write instruction and the destination register is the general integer register x0, writing the CSR without reading it;
when the CSR instruction is a read-and-set instruction or a read-and-clear instruction and the operand register is the general integer register x0, reading the CSR without writing it;
when the CSR instruction is the immediate form of a read-and-set instruction or of a read-and-clear instruction and the immediate is 0, reading the CSR without writing it.
A second aspect of the embodiments of the present application provides an instruction processing device for an out-of-order multi-issue processor, comprising:
a first buffer, configured to acquire and cache the target instructions and their corresponding operands issued by a preset instruction issue queue;
a second buffer, configured to store multiple target instructions and their corresponding operands in a preset first-in-first-out (FIFO) queue;
a transceiver module, configured to, for each target instruction and its corresponding operands in the FIFO queue, execute the target instruction and send the execution result to the write-back module of the out-of-order multi-issue processor once the corresponding first synchronization signal has been sent to the delivery module of the processor and the second synchronization signal sent by the delivery module has been received;
a delivery module, configured to send the delivery result of the target instruction to the write-back module of the out-of-order multi-issue processor.
A third aspect of the embodiments of the present application provides a computer device comprising a memory and a processor, wherein the memory stores a computer program and the processor, when executing the computer program, implements the steps of any of the above methods.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any of the above methods.
Brief Description of the Drawings
The drawings described here provide a further understanding of the present application and form part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 is a flow chart of an instruction processing method for an out-of-order multi-issue processor provided by an embodiment of the present application;
FIG. 2 is a diagram of the flow of CSR instructions in an out-of-order multi-issue processor provided by an embodiment of the present application;
FIG. 3 is a flow chart of an instruction processing method for an out-of-order multi-issue processor provided by another embodiment of the present application;
FIG. 4 is a diagram of the flow of VPU instructions in an out-of-order multi-issue processor provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an instruction processing device for an out-of-order multi-issue processor provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
In the course of implementing the present application, the inventors found that, at present, if any instruction encounters a branch misprediction, an interrupt, or an exception during execution, the instructions in the entire pipeline are flushed, which can make it impossible to determine which instructions in an out-of-order multi-issue processor have been executed and which have not.
To address this problem, an embodiment of the present application provides an instruction processing method for an out-of-order multi-issue processor. The method acquires and caches the CSR (Control and Status Register) instructions and their corresponding operands issued by a preset instruction issue queue, and stores multiple CSR instructions and their corresponding operands in a preset first-in-first-out (FIFO) queue. For each CSR instruction and its corresponding operands in the FIFO queue, once the corresponding first synchronization signal has been sent to the delivery module of the out-of-order multi-issue processor and the second synchronization signal sent by the delivery module has been received, the CSR instruction is executed and the execution result is sent to the write-back module of the processor, and the delivery module sends the delivery result of the CSR instruction to the write-back module. In this way, when the instructions in the pipeline may be flushed because of an exception, an interrupt, or a branch misprediction, it is possible to determine which instructions in the out-of-order multi-issue processor may be executed and which may not.
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list. Where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another.
Referring to FIG. 1, the instruction processing method for an out-of-order multi-issue processor provided by an embodiment of the present application includes the following steps S1 to S4:
S1, acquiring and caching the target instruction and its corresponding operand issued by a preset instruction issue queue.
In an optional embodiment of the present application, in step S1, acquiring and caching the target instruction and its corresponding operand issued by the preset instruction issue queue includes:
acquiring the target instruction issued by the preset instruction issue queue;
determining the operand corresponding to the target instruction;
caching the target instruction issued by the preset instruction issue queue together with its corresponding operand.
In an optional embodiment of the present application, determining the operand corresponding to the target instruction includes:
obtaining the address of the operand register specified in the target instruction, wherein the specified operand register stores the operand corresponding to the target instruction;
reading the operand corresponding to the target instruction from the specified operand register according to that address.
S2, storing multiple target instructions and their corresponding operands in a preset first-in-first-out (FIFO) queue.
In an optional embodiment of the present application, in step S2, storing the multiple target instructions and their corresponding operands in the preset FIFO queue includes:
storing the different target instructions and their corresponding operands in the preset FIFO queue in the order in which their operands were read.
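Step S2 can be pictured with a minimal Python sketch. It is only an illustration: the names `IssuedInstr` and `BoundedFifo`, the fixed depth, and the choice to reject a push when the queue is full are assumptions made here, not details given in the application.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class IssuedInstr:
    name: str      # hypothetical label for an issued target instruction
    operand: int   # operand value already read from its operand register


class BoundedFifo:
    """Fixed-depth FIFO; a push is refused when the queue is full (assumed policy)."""

    def __init__(self, depth: int):
        self.depth = depth
        self._q = deque()

    def push(self, instr: IssuedInstr) -> bool:
        if len(self._q) >= self.depth:
            return False          # upstream would have to hold the instruction
        self._q.append(instr)
        return True

    def head(self):
        return self._q[0] if self._q else None

    def pop(self) -> IssuedInstr:
        return self._q.popleft()


fifo = BoundedFifo(depth=2)
# Instructions are pushed in the order in which their operands were read.
accepted = [fifo.push(IssuedInstr(n, v))
            for n, v in [("csr_a", 1), ("csr_b", 2), ("csr_c", 3)]]
first_out = fifo.pop().name
```

Because entries leave in the order they entered, the operand-read order is preserved all the way to the point where the synchronization in step S3 takes place.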
S3, for each target instruction and its corresponding operands in the FIFO queue, executing the target instruction and sending the execution result to the write-back module of the out-of-order multi-issue processor once the corresponding first synchronization signal has been sent to the delivery module of the processor and the second synchronization signal sent by the delivery module has been received. No particular order is required between sending the first synchronization signal and receiving the second: the first synchronization signal may be sent before the second is received, or the second may be received before the first is sent.
In an optional embodiment of the present application, the first synchronization signal is used to control the read and write operations of the target instruction, and the second synchronization signal is used to deliver (commit) the instruction, wherein an instruction may be delivered only if none of the instructions preceding it has incurred a branch misprediction, an interrupt, or an exception.
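The order-independent condition in step S3 can be sketched as a small latch that opens only when both events have occurred. The class name `SyncGate` and its methods are hypothetical and serve only to illustrate the condition; the application does not describe the signalling at this level.

```python
class SyncGate:
    """Opens once the first sync signal has been sent AND the second has been received."""

    def __init__(self):
        self.first_sent = False        # first synchronization signal sent to the delivery module
        self.second_received = False   # second synchronization signal received from it

    def send_first(self) -> None:
        self.first_sent = True

    def receive_second(self) -> None:
        self.second_received = True

    def may_execute(self) -> bool:
        # The order of the two events does not matter, only that both happened.
        return self.first_sent and self.second_received


# Order A: send first, then receive second.
gate_a = SyncGate()
gate_a.send_first()
blocked_a = not gate_a.may_execute()
gate_a.receive_second()

# Order B: receive second, then send first.
gate_b = SyncGate()
gate_b.receive_second()
blocked_b = not gate_b.may_execute()
gate_b.send_first()
```

In both orders the gate stays closed after the first event and opens after the second, which matches the statement that neither signal has to come first.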
In an optional embodiment of the present application, in step S3, the instruction is a CSR (Control and Status Register) instruction, and executing the target instruction and sending the execution result to the write-back module of the out-of-order multi-issue processor includes:
updating the value of the target instruction, generating a corresponding clear signal (a signal indicating that the pipeline needs to be flushed), and storing the updated value and the clear signal in a preset register;
sending the updated value of the target instruction, the clear signal, and the storage record of the preset register to the write-back module.
In an optional embodiment of the present application, the instruction is a CSR instruction, and before the target instruction is executed, the method further includes:
when the CSR instruction is a read-write instruction and the destination register is the general integer register x0, writing the CSR without reading it. Here xreg denotes the general integer registers, named x0 to x31 or x0 to x15; whether there are 32 or 16 registers depends on the instruction subset in use. The value of x0 is usually reserved as the constant 0, so a write to x0 has no effect;
when the CSR instruction is a read-and-set instruction or a read-and-clear instruction and the operand register is the general integer register x0, reading the CSR without writing it;
when the CSR instruction is the immediate form of a read-and-set instruction or of a read-and-clear instruction and the immediate is 0, reading the CSR without writing it.
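The three rules above can be summarized as a small decision function. The sketch below assumes RISC-V style mnemonics (`csrrw`, `csrrs`, `csrrc` and their immediate forms), which the application does not name explicitly; the treatment of `csrrwi` follows the RISC-V ISA rather than the text above, and the function name `csr_access` is hypothetical.

```python
X0 = 0  # index of the general integer register x0


def csr_access(op: str, rd: int, src: int) -> tuple:
    """Return (read_csr, write_csr) for one CSR instruction.

    op  : mnemonic (RISC-V naming is an assumption of this sketch)
    rd  : destination register index
    src : rs1 register index for register forms, immediate value for immediate forms
    """
    if op in ("csrrw", "csrrwi"):
        # Read-write: the CSR is always written; the read is skipped when rd is x0.
        # (csrrwi is not covered by the text above; this follows the RISC-V ISA.)
        return (rd != X0, True)
    if op in ("csrrs", "csrrc"):
        # Read-and-set / read-and-clear: no write when the operand register is x0.
        return (True, src != X0)
    if op in ("csrrsi", "csrrci"):
        # Immediate forms: no write when the immediate is 0.
        return (True, src != 0)
    raise ValueError(f"not a CSR instruction: {op}")


write_only = csr_access("csrrw", rd=X0, src=5)
read_only_reg = csr_access("csrrs", rd=3, src=X0)
read_only_imm = csr_access("csrrci", rd=3, src=0)
read_and_write = csr_access("csrrs", rd=3, src=4)
```

Making this decision before execution means a suppressed read or write never happens at all, so it cannot trigger any side effect tied to accessing the CSR.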
S4, the delivery module sends the delivery result of the target instruction to the write-back module of the out-of-order multi-issue processor.
Referring to FIG. 2, instructions flow through the instruction processing method for an out-of-order multi-issue processor of the present disclosure as follows:
In FIG. 2, the left side is the execution path of the CSR instruction, and the right side is the path of the delivery module and the write-back module. After a CSR instruction appears in the instruction issue queue Inst Queue (instruction queue), it passes in turn through the instruction queue buffer iqbuf (instruction queue buffer, used to optimize timing; both Inst Queue and iqbuf precede the operand read) and the operand buffer opbuf (opcode buffer; all operands must be obtained at this stage), and then reaches the wrapper module csr_wrapper. The csr_wrapper encapsulates the modules directly related to the CSRs: a first-in-first-out buffer of depth N (N-depth-fifo), the instruction control module csr_ctrl, which controls reads and writes of the CSR registers, and the instruction update module csr_update, which updates the values of the CSR registers. The operands must be obtained before the instruction leaves opbuf; otherwise the instruction stalls there until they are available.
When the CSR instruction enters csr_wrapper, it first enters the N-depth-fifo, which can buffer multiple CSR instructions sent from upstream. At the FIFO exit, the instruction must wait for the csr cmt signal sent by the delivery module in FIG. 2 before it proceeds into csr_ctrl. The csr cmt (CSR commit) signal is issued by the cmt module, that is, the commit (instruction delivery) module, and tells csr_ctrl that the instruction may be delivered; until the signal arrives, the instruction waits. Meanwhile, on the main pipeline on the right side of FIG. 2, the delivery module must wait for the csr rslv signal issued by csr_ctrl, which informs it that the CSR instruction has reached csr_ctrl normally, before the instruction can continue to the downstream modules. Once the CSR instruction enters csr_ctrl, the update of the CSR instruction can be completed, and the corresponding clear signal is generated at the same time as the update. The data that was read is also saved in a register and sent to the write-back module, which writes instruction results back to the general-purpose register file regfile; the data is written back after arbitration.
In this embodiment, csr_ctrl and the delivery module are forcibly synchronized: each side waits for the other until the condition is met. The forced synchronization guarantees that the current CSR instruction can be executed safely and will not be flushed by another instruction. When an instruction preceding the current CSR instruction incurs an exception, an interrupt, or a branch misprediction and the instructions in the whole pipeline are flushed as a result, the execution status of the current CSR instruction, that is, whether it can be executed safely, can still be determined.
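The effect of the forced synchronization can be illustrated with a deliberately simplified Python model. It is a sketch under several assumptions that the application does not state: instructions are plain strings, CSR instructions are those whose name starts with `csr`, the csr rslv signal is modelled as "the instruction is at the FIFO head", the csr cmt signal is modelled as "every older instruction has been delivered without a fault", and a flush simply empties the FIFO. The function name `run` is hypothetical.

```python
from collections import deque


def run(program, faulting=None):
    """Return the CSR instructions that actually execute.

    program  : instruction names in program order
    faulting : name of an instruction that faults when it reaches delivery, or None
    """
    # CSR side: CSR instructions wait in the FIFO in program order.
    fifo = deque(name for name in program if name.startswith("csr"))
    executed = []

    # Delivery side: instructions are delivered strictly in program order.
    for name in program:
        if name == faulting:
            fifo.clear()   # flush: younger CSR instructions are dropped unexecuted
            break
        if name.startswith("csr"):
            csr_rslv = fifo[0] == name   # first sync: this CSR instruction is at the FIFO head
            csr_cmt = True               # second sync: all older instructions were delivered cleanly
            if csr_rslv and csr_cmt:
                executed.append(fifo.popleft())
    return executed


no_fault = run(["add", "csr_a", "csr_b"])
fault_between = run(["add", "csr_a", "ld", "csr_b"], faulting="ld")
fault_on_csr = run(["csr_a"], faulting="csr_a")
```

In this model a CSR instruction changes state only after everything older than it has been delivered, so a flush never has to undo a CSR update; it only discards entries that have not yet executed.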
Referring to FIG. 3, the target instruction is a VPU (Vector Process Unit) instruction, and the instruction processing method for an out-of-order multi-issue processor provided in the embodiment of the present application further includes the following steps S31 to S35:
S31: obtain and buffer the instructions issued by a preset instruction issue queue.
In an optional embodiment of the present application, the instructions sent out by the dispatch module of the execution unit are grouped. Not every group of instructions has its own instruction issue queue; instead, several groups share one instruction issue queue, a choice that weighs buffer area against functionality.
S32: extract the VPU instructions from the instructions, and decode the instructions.
In an optional embodiment of the present application, the decoding information obtained by decoding the instructions relates to the general-purpose integer registers of the VPU instruction and to its operand-register and destination-register indices.
S33: group the extracted VPU instructions according to the decoding information of the instructions.
In an optional embodiment of the present application, the decoding information relates to the general-purpose integer registers of the VPU instruction and to its operand-register and destination-register indices, and grouping the extracted VPU instructions according to the decoding information includes:
dividing the extracted VPU instructions into memory access instructions and non-memory-access instructions according to the decoding information, and splitting each VPU instruction into multiple minimum operation units. The operand-register and destination-register indices of a VPU instruction may refer to general-purpose integer registers or to vector registers.
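The grouping and splitting step can be illustrated with a minimal software sketch. It is not the hardware described in this application: the `VpuInstr` record, the `is_mem` decode bit, the instruction names, and the choice of four elements per minimum operation unit are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VpuInstr:
    name: str        # illustrative mnemonic, not taken from the application
    is_mem: bool     # hypothetical decode bit: True for memory access instructions
    num_elems: int   # number of data elements the instruction operates on

def split_into_uops(instr, elems_per_uop=4):
    """Split one instruction into (name, first_element, element_count) units."""
    uops = []
    for start in range(0, instr.num_elems, elems_per_uop):
        count = min(elems_per_uop, instr.num_elems - start)
        uops.append((instr.name, start, count))
    return uops

def group_vpu(instrs, elems_per_uop=4):
    """Sort decoded instructions into a memory group and a non-memory group."""
    groups = {"mem": [], "non_mem": []}
    for instr in instrs:
        key = "mem" if instr.is_mem else "non_mem"
        groups[key].extend(split_into_uops(instr, elems_per_uop))
    return groups
```

For example, `group_vpu([VpuInstr("vload", True, 8), VpuInstr("vadd", False, 6)])` places two units of the load in the memory group and two units of the add (four elements, then two) in the non-memory group.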
S34: once the operand corresponding to a VPU instruction and the instruction commit signal have been obtained, determine whether the grouping of the VPU instruction is complete.
In an optional embodiment of the present application, obtaining the operand corresponding to the VPU instruction includes:
obtaining the address of the operand register designated by the VPU instruction, wherein the designated operand register stores the operand corresponding to the VPU instruction;
reading the operand corresponding to the VPU instruction from the designated register according to the address of the designated operand register.
In an optional embodiment of the present application, reading the operand corresponding to the VPU instruction from the designated register includes:
when the designated register is a general-purpose integer register and the operand corresponding to the VPU instruction has a dependency on that general-purpose integer register, stalling the read of the operand until the dependency no longer exists.
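The stall-until-resolved behavior can be sketched as follows. This is a simplified model under stated assumptions: the dependency is represented as a set of register indices with a write still in flight (`pending_writes`), and both that set and the function name are hypothetical.

```python
def try_read_xreg(reg_index, regfile, pending_writes):
    """Return (ready, value); the read stalls while a write to the register is pending."""
    if reg_index in pending_writes:
        # Dependency still present: do not read a stale value, try again later.
        return False, None
    return True, regfile[reg_index]
```

A caller would retry in each cycle until `ready` is true, which corresponds to the VPU instruction waiting for the general-purpose integer register dependency to be resolved.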
In an optional embodiment of the present application, obtaining the instruction commit signal corresponding to the VPU instruction includes:
sending the instruction to the commit module of the out-of-order multi-issue processor;
receiving the instruction commit signal that the commit module sends in response to the instruction, wherein the instruction commit signal is generated when no branch misprediction, interrupt, or exception has occurred in the instructions preceding the instruction.
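The condition under which the commit signal is generated can be written as a short predicate. The event names and the representation of each older instruction as a set of events are assumptions made for this sketch.

```python
BLOCKING_EVENTS = {"mispredict", "interrupt", "exception"}

def commit_signal(older_instr_events):
    """True only if no older instruction saw a misprediction, interrupt, or exception.

    older_instr_events: one set of event names per instruction that precedes
    the instruction being committed, in program order.
    """
    return all(not (events & BLOCKING_EVENTS) for events in older_instr_events)
```

With no older instructions in flight the predicate is trivially true, so the oldest instruction can always be committed.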
S35: once the VPU instructions have been grouped and the memory access address corresponding to a memory access instruction obtained by the grouping is available, perform the memory access operation specified by the memory access instruction at that address and write the result back to the designated register through the write-back module; non-memory-access instructions are executed directly.
In an optional embodiment of the present application, directly executing a non-memory-access instruction includes:
for a non-memory-access instruction, executing its corresponding minimum operation units and sending the execution results to the write-back module.
In an optional embodiment of the present application, obtaining the memory access address corresponding to a memory access instruction obtained by the grouping includes:
calculating the offset memory access address in the memory access instruction according to the operand corresponding to the memory access instruction, to obtain the memory access address corresponding to the memory access instruction.
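A minimal sketch of the address calculation, assuming the operand supplies a base address, the instruction supplies an offset, and addresses are 64 bits wide. The text does not fix the address width or the element layout, so the wrap-around mask and the unit-stride element addresses are assumptions.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit address space

def mem_address(base_operand, offset):
    """Memory access address = base operand + offset, wrapped to 64 bits."""
    return (base_operand + offset) & MASK64

def element_addresses(base_operand, offset, num_elems, elem_bytes):
    """Addresses of consecutive elements, assuming a unit-stride access."""
    first = mem_address(base_operand, offset)
    return [(first + i * elem_bytes) & MASK64 for i in range(num_elems)]
```

For instance, `mem_address(0x1000, 16)` yields `0x1010`, and a negative offset such as `-16` yields `0xFF0`.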
Referring to FIG. 4, VPU instructions flow through the instruction processing method of the present disclosure as follows:
A VPU instruction leaves the dispatch module of the pipeline execution unit (which dispatches instructions to the different functional units for execution) along two paths: a commit path through the commit module, and an execution path. Committing an instruction means that the instruction is no longer in the speculative execution state: it has been judged able to actually execute in the processor and to affect processor state. Speculative execution is defined as follows. If the instruction preceding instruction B is a branch instruction A, then B is in the speculative execution state. When A passes through the execute stage and its prediction is found to be wrong, B is cancelled and is not executed. If the prediction is found to be correct, B is not cancelled, may execute, leaves the speculative execution state, and can be truly committed.
After an instruction is dispatched from the dispatch module of the execution unit, the parsed instruction information is sent to the queue used for instruction issue (in an out-of-order multi-issue processor, instructions may be issued out of order). Depending on timing, an instruction leaving the queue may need to enter a queue buffer, which is used to optimize timing. The queue stage and the queue buffer stage are shared by VPU instructions and other instructions such as AGU (Address Generate Unit) instructions; the VPU instructions are separated out afterwards.
The instruction enters the vpu separator (which is in the same stage as the vpu parser and extracts VPU instructions from the mixed instruction stream) and the vpu parser (which decodes the vector-register and index information of VPU instructions so that the downstream instruction dispatch module can split the instructions into uops and group them). It then proceeds to the instruction dispatch module, which groups the VPU instructions and splits them into uops, and at the same time it enters the operand buffer. A uop (micro operation) is the minimum operation unit of an instruction. VPU instructions are SIMD (Single Instruction Multiple Data) instructions, in which one instruction processes multiple data elements; the official instruction set manual requires a concrete implementation to split them, because without splitting a large amount of data cannot be processed at once.
The VPU instruction is obtained in the instruction dispatch module stage, and this stage is strongly aligned with the operand buffer stage. A VPU instruction in the instruction dispatch module stage is sent onward only when two conditions are met. First, if it uses an xreg (a general-purpose integer register), the xreg dependency must have been resolved in the operand buffer stage. Second, the commit information for this VPU instruction itself must have arrived (an instruction can be committed only if none of its preceding instructions suffered a branch misprediction, interrupt, or exception). If the dependency has not been resolved, or the commit information has not arrived, the instruction waits. When a VPU instruction reaches the instruction dispatch module, its dependencies are checked; once they are resolved, the instruction is sent onward and assigned to one of the instruction groups according to the vpu grouping rules.
Instructions assigned to the memory access instruction group are memory access instructions. The memory access instruction group stage must stay strongly aligned with the first vector address generation unit, which calculates the address used for the access from the address offset. The memory access instruction group tells the first vector address generation unit whether grouping is complete, along with related information, and the first vector address generation unit tells the memory access instruction group whether the address and related information are ready. If either side has not finished its operation, the other side waits; when both are ready, they advance together to the second vector address generation unit, which must receive the information from the first vector address generation unit and from the memory access instruction group at the same time. After the memory access instruction reaches the second vector address generation unit, it is sent to the memory access unit, which performs the memory access. Finally, the result is written back to the general-purpose register file through the write-back unit, which writes instruction results back to the general-purpose register file.
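The strong alignment between the memory access instruction group and the first vector address generation unit amounts to a two-sided wait: neither side advances until both are ready. The cycle-level sketch below illustrates only that rule; the function name, the cycle counts, and the `max_cycles` bound are hypothetical.

```python
def aligned_advance(group_ready_cycle, addr_ready_cycle, max_cycles=16):
    """Return the first cycle in which both sides are ready, or None if none is found.

    group_ready_cycle: cycle from which the memory access group reports grouping done.
    addr_ready_cycle:  cycle from which the address generation unit reports address ready.
    """
    for cycle in range(max_cycles):
        group_ready = cycle >= group_ready_cycle
        addr_ready = cycle >= addr_ready_cycle
        if group_ready and addr_ready:
            # Both sides advance together to the next stage in this cycle.
            return cycle
    return None
```

Whichever side finishes first simply waits: `aligned_advance(2, 5)` and `aligned_advance(5, 2)` both advance in cycle 5.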
It should be understood that, although the steps in the flowcharts are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated herein, there is no strict ordering constraint on the steps, and they may be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be executed at different times, and they are not necessarily executed one after another: they may be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
Referring to FIG. 5, an embodiment of the present application provides an instruction processing apparatus for an out-of-order multi-issue processor, including:
a first buffer 11, configured to obtain and buffer the CSR instructions issued by a preset instruction issue queue together with their corresponding operands;
a second buffer 12, configured to store multiple CSR instructions and their corresponding operands into a preset first-in-first-out queue;
a transceiver module 13, configured to, for each CSR instruction and its corresponding operand in the first-in-first-out queue, execute the CSR instruction when the corresponding first synchronization signal has been sent to the commit module of the out-of-order multi-issue processor and the second synchronization signal sent by the commit module is received at the same time, and to send the execution result to the write-back module of the out-of-order multi-issue processor;
a commit module 14, configured to send the CSR instruction commit result to the write-back module of the out-of-order multi-issue processor.
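The interaction between the first-in-first-out queue and the two synchronization signals can be sketched as a small simulation. It assumes that the CSR side asserts its first synchronization signal whenever the queue is non-empty and that the commit side's second synchronization signal is given as one boolean per cycle; both assumptions, and all names, are illustrative.

```python
from collections import deque

def run_csr_fifo(fifo, second_sync_by_cycle):
    """Execute CSR entries in FIFO order, one per cycle in which both sides signal.

    fifo: deque of CSR instructions (any objects), oldest first.
    second_sync_by_cycle: iterable of booleans from the commit side, one per cycle.
    Returns a list of (cycle, instruction) pairs in execution order.
    """
    executed = []
    for cycle, second_sync in enumerate(second_sync_by_cycle):
        if not fifo:
            break
        first_sync = True  # assumed: asserted for the entry at the head of the queue
        if first_sync and second_sync:
            executed.append((cycle, fifo.popleft()))
        # Otherwise the head entry waits; younger entries never overtake it.
    return executed
```

Because only the head of the queue is ever executed, CSR instructions complete in the order in which they were enqueued, even when the commit side is not ready in every cycle.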
In an optional embodiment of the present application, the first buffer is further configured to:
obtain the CSR instructions issued by the preset instruction issue queue;
determine the operand corresponding to each CSR instruction;
buffer the CSR instructions issued by the preset instruction issue queue together with their corresponding operands.
In an optional embodiment of the present application, the first buffer is further configured to:
obtain the address of the operand register designated in the CSR instruction, wherein the designated operand register stores the operand corresponding to the CSR instruction;
read the operand corresponding to the CSR instruction from the designated register according to the address of the designated operand register.
In an optional embodiment of the present application, the second buffer is further configured to:
store the different CSR instructions and their corresponding operands into the preset first-in-first-out queue in the order in which their operands were read.
In an optional embodiment of the present application, in the transceiver module, the first synchronization signal is used to control the read and write operations of the CSR instruction, and the second synchronization signal is used to commit the instruction.
In an optional embodiment of the present application, the transceiver module is further configured to:
update the value of the CSR instruction, generate a corresponding flush signal, and store the updated value of the CSR instruction and the flush signal in a preset register, wherein the flush signal is a signal that flushes the pipeline;
send the updated value of the CSR instruction, the flush signal, and the storage record of the preset register to the write-back module.
In an optional embodiment of the present application, the transceiver module is further configured to:
when the CSR instruction is a read-write instruction and the destination register is the general-purpose integer register x0, write the CSR without reading it;
when the CSR instruction is a read-and-set instruction or a read-and-clear instruction and the operand register is the general-purpose integer register x0, read the CSR without writing it;
when the CSR instruction is the immediate form of a read-and-set instruction or the immediate form of a read-and-clear instruction and the immediate is 0, read the CSR without writing it.
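The three rules above can be collected into one decision function. This is a sketch restricted to the cases the text lists; the operation tags (`"rw"`, `"rs"`, `"rc"`, `"rsi"`, `"rci"`) and the function name are hypothetical labels, not encodings from the application.

```python
def csr_access(op, rd, src):
    """Return (do_read, do_write) for a CSR instruction.

    op:  "rw"  read-write, "rs" read-and-set, "rc" read-and-clear,
         "rsi" / "rci" the immediate forms of read-and-set / read-and-clear.
    rd:  destination register index (0 means x0).
    src: operand register index for "rw", "rs", "rc" (0 means x0);
         the immediate value for "rsi" and "rci".
    """
    if op == "rw":
        # Destination x0: the old CSR value is not needed, so write only.
        return (rd != 0, True)
    if op in ("rs", "rc", "rsi", "rci"):
        # Source x0 or immediate 0: no bits to set or clear, so read only.
        return (True, src != 0)
    raise ValueError(f"unknown CSR operation: {op}")
```

Skipping the unnecessary read or write matters because CSR accesses can have side effects, so a read that is not needed, or a write that changes nothing, should not be performed at all.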
For the specific limitations of the above instruction processing apparatus for an out-of-order multi-issue processor, refer to the limitations of the instruction processing method for an out-of-order multi-issue processor above; they are not repeated here. Each module of the above apparatus may be implemented wholly or partly in software, in hardware, or in a combination of the two. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, whose internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores data. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer program implements the instruction processing method for an out-of-order multi-issue processor described above. That is, the computer device includes a memory and a processor, the memory stores a computer program, and the processor, when executing the computer program, implements any step of the instruction processing method for an out-of-order multi-issue processor described above.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program can implement any step of the instruction processing method for an out-of-order multi-issue processor described above.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art may make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is intended to include them as well.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410917375.8A CN118885218A (en) | 2024-07-09 | 2024-07-09 | Instruction processing method and device for out-of-order multi-issue processor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118885218A (en) | 2024-11-01 |
Family
ID=93223304
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410917375.8A Pending CN118885218A (en) | 2024-07-09 | 2024-07-09 | Instruction processing method and device for out-of-order multi-issue processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118885218A (en) |
2024-07-09: CN application CN202410917375.8A (publication CN118885218A), status: Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119861969A (en) * | 2025-03-21 | 2025-04-22 | 北京微核芯科技有限公司 | Vector instruction processing method and device, electronic equipment and storage medium |
CN119861969B (en) * | 2025-03-21 | 2025-06-20 | 北京微核芯科技有限公司 | Vector instruction processing method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101019224B1 (en) | Data guessing based on addressing patterns identifying dual purpose registers | |
CN100424635C (en) | System and method for verifying a memory file linking speculative results of load operations to register values | |
US6754856B2 (en) | Memory access debug facility | |
US9262160B2 (en) | Load latency speculation in an out-of-order computer processor | |
US7617384B1 (en) | Structured programming control flow using a disable mask in a SIMD architecture | |
CN100407134C (en) | System and method for handling exceptional instructions in a trace cache based processor | |
US9384000B2 (en) | Caching optimized internal instructions in loop buffer | |
US6301654B1 (en) | System and method for permitting out-of-order execution of load and store instructions | |
CN118885218A (en) | Instruction processing method and device for out-of-order multi-issue processor | |
US8799628B2 (en) | Early branch determination | |
CN118467041A (en) | Instruction processing method and device for out-of-order multi-issue processor | |
CN117931293B (en) | Instruction processing method, device, equipment and storage medium | |
CN105824604B (en) | Multiple-input and multiple-output processor pipeline data synchronization unit and method | |
CN118760475B (en) | Instruction issuance method and device for out-of-order processor | |
EP0753810B1 (en) | Computer instruction execution method and apparatus | |
CN117806706B (en) | Storage order violation processing method, storage order violation processing device, electronic equipment and medium | |
CN117827284B (en) | Vector processor memory access instruction processing method, system, equipment and storage medium | |
US20080222392A1 (en) | Method and arrangements for pipeline processing of instructions | |
CN116991480A (en) | Instruction processing method, device, circuit, transmitter, chip, medium and product | |
CN115840593A (en) | Method and device for verifying execution component in processor, equipment and storage medium | |
CN119003002B (en) | Processors, graphics cards, computer equipment, and methods to eliminate dependencies | |
US20230205535A1 (en) | Optimization of captured loops in a processor for optimizing loop replay performance | |
US20240338220A1 (en) | Apparatus and method for implementing many different loop types in a microprocessor | |
KR20070019750A (en) | System and method for validating a memory file that links speculative results of a load operation to register values | |
JP2901573B2 (en) | Super scalar information processing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||