CN114816534B - A branch prediction method and related device for branch instructions - Google Patents
A branch prediction method and related device for branch instructionsInfo
- Publication number
- CN114816534B CN114816534B CN202210478867.2A CN202210478867A CN114816534B CN 114816534 B CN114816534 B CN 114816534B CN 202210478867 A CN202210478867 A CN 202210478867A CN 114816534 B CN114816534 B CN 114816534B
- Authority
- CN
- China
- Prior art keywords
- branch
- ium
- module
- branch instruction
- saturation counter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3861—Recovery, e.g. branch miss-prediction, exception handling
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
Abstract
The application provides a branch prediction method and a related device of a branch instruction, wherein the method is applied to a TAGE branch predictor, the TAGE branch predictor determines a branch prediction table entry of the branch instruction according to instruction addresses of the branch instruction to be predicted and global branch histories of different lengths, the TAGE branch predictor determines whether a saturation counter update value of the branch instruction exists in a preset IUM module according to the branch prediction table entry, and the TAGE branch predictor predicts the direction of the branch instruction by adopting the saturation counter update value under the condition that the saturation counter update value exists in the IUM module. The embodiment of the application is beneficial to improving the prediction accuracy of the TAGE branch predictor.
Description
Technical Field
The present application relates to the field of branch prediction technologies of processors, and in particular, to a branch prediction method and related apparatus for a branch instruction.
Background
At present, the architecture design of most processors adopts an instruction pipeline form to execute instructions in a running way, so that although the instruction throughput of the processors can be increased, when a branch instruction is encountered, whether an instruction needs to jump and which instruction needs to jump to are needed to be judged in time, so that the running of the pipeline is interrupted, the performance loss of the processors is caused, a branch prediction mechanism is needed, whether the branch instruction jumps and the jump target address are predicted in advance through certain algorithms, the continuity of the instruction flow is ensured as much as possible, and the overall performance of the processors is improved. The TAGE (TAggedGEmoetric history length branch prediction) branch predictor is one of branch prediction algorithms with high prediction accuracy, and the accuracy of branch prediction is high in dependence on training data, so that the improvement of the prediction accuracy of the TAGE branch predictor is an urgent problem to be solved at present.
Disclosure of Invention
In view of the above problems, the present application provides a branch prediction method and related device for a branch instruction, which are beneficial to improving the prediction accuracy of a tag branch predictor.
To achieve the above object, a first aspect of an embodiment of the present application provides a branch prediction method of a branch instruction, applied to a tag branch predictor, including:
the TAGE branch predictor determines a branch prediction table entry of the branch instruction according to the instruction address of the branch instruction to be predicted and global branch histories of different lengths;
the TAGE branch predictor determines whether a saturation counter update value of a branch instruction exists in a preset IUM module according to a branch prediction table entry;
The tag branch predictor uses the saturation counter update value to directionally predict a branch instruction in the presence of the saturation counter update value in the IUM module.
With reference to the first aspect, in one possible implementation manner, the tag branch predictor determines whether a saturation counter update value of a branch instruction exists in a preset IUM module according to a branch prediction table entry of the branch instruction, including:
the TAGE branch predictor inputs a branch prediction table entry to the IUM module through a Provider interface of the IUM module, wherein the branch prediction table entry comprises first tag information of a branch instruction;
the TAGE branch predictor acquires a Resp interface returned by the IUM module, determines that a saturation counter update value exists in the IUM module if a valid signal corresponding to the Resp interface indicates that a branch prediction table entry hits in the IUM module, and determines that a saturation counter update value does not exist in the IUM module if a valid signal corresponding to the Resp interface indicates that a branch prediction table entry does not hit in the IUM module.
With reference to the first aspect, in one possible implementation manner, before the predicting the direction of the branch instruction using the saturation counter update value, the method further includes:
The tag branch predictor receives the saturation counter update value returned by the IUM module via the Resp interface and the corresponding fetch target queue pointer for the saturation counter update value.
With reference to the first aspect, in one possible implementation manner, before the predicting the direction of the branch instruction using the saturation counter update value, the method further includes:
the TAGE branch predictor obtains second tag information of the branch instruction sent by the IUM module through a Resp_tag interface returned by the IUM module.
A second aspect of an embodiment of the present application provides a branch prediction method of a branch instruction, applied to an IUM module, the method including:
The IUM module acquires a branch prediction table entry of a branch instruction to be predicted, which is input by a TAGE branch predictor, through a Provider interface, wherein the branch prediction table entry comprises first tag information of the branch instruction;
the IUM module judges whether second tag information matched with the first tag information is stored in the content addressing memory through a enq _tag interface;
And under the condition that the second tag information is stored in the content addressing memory, the IUM module determines that the branch prediction table entry hits in the IUM module, and returns a saturation counter update value of the branch instruction to the TAGE branch predictor through a Resp interface, so that the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction.
With reference to the second aspect, in one possible implementation manner, before the IUM module obtains, through the Provider interface, a branch prediction table entry of a branch instruction to be predicted, which is input by the tag branch predictor, the method further includes:
The IUM module receives a saturation counter value returned when the branch instruction is executed last time by the back-end pipeline and a jump result of the branch instruction, wherein the saturation counter value is the saturation counter value of the branch instruction in a corresponding table entry of the TAGE branch predictor when the branch instruction is determined to have misprediction;
the IUM module calculates a saturation counter update value of the branch instruction according to the saturation counter value and the jump result;
The IUM module calculates second tag information by using the instruction address of the branch instruction and the global branch histories with different lengths, and stores the saturation counter update value, the second tag information and the corresponding instruction fetching target queue pointer.
With reference to the second aspect, in one possible implementation manner, the IUM module includes a content-addressable memory and a register file, stores a saturation counter update value, second tag information, and a corresponding finger target queue pointer, and includes:
The IUM module stores the second tag information in the content-addressed memory and stores the saturation counter update value and the index target queue pointer in the register file.
With reference to the second aspect, in a possible implementation manner, the method further includes:
and the IUM module returns the second tag information to the TAGE branch predictor through a resp_tag interface under the condition that the second tag information is stored in the content-addressable memory.
With reference to the second aspect, in one possible implementation manner, the finger target queue pointer is used to mark a saturation counter update value, and a enqueuing time of the second tag information in the IUM module, and the method further includes:
The IUM module dequeues entries having a enqueue time earlier than the saturation counter update value, the enqueue time of the second tag information, in the event of a branch instruction commit.
A third aspect of the present application provides a branch prediction apparatus for a branch instruction, which is applied to a tag branch predictor, where the apparatus includes a first processing unit and a first transceiver unit;
A first processing unit, configured to determine a branch prediction table entry of a branch instruction according to an instruction address of the branch instruction to be predicted and global branch histories of different lengths;
The first receiving and transmitting unit is used for determining whether a saturation counter update value of a branch instruction exists in a preset IUM module according to the branch prediction table entry;
The first processing unit is further configured to, in a case where a saturation counter update value exists in the IUM module, perform direction prediction on the branch instruction using the saturation counter update value.
A fourth aspect of the present application provides a branch prediction apparatus for a branch instruction, which is applied to an IUM module, and includes a second processing unit and a second transceiver unit;
The second transceiver unit is used for acquiring a branch prediction table entry of the branch instruction to be predicted, which is input by the TAGE branch predictor, through a Provider interface, wherein the branch prediction table entry comprises first tag information of the branch instruction;
The second processing unit is used for judging whether second tag information matched with the first tag information is stored in the content addressing memory through a enq _tag interface;
And the second processing unit is also used for determining that the branch prediction table entry hits in the IUM module under the condition that the second tag information is stored in the content addressing memory, and calling the second receiving and transmitting unit to return a saturation counter update value of the branch instruction to the TAGE branch predictor through the Resp interface so that the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction.
A fifth aspect of the embodiments of the present application provides an electronic device comprising an input device and an output device, further comprising a processor adapted to implement one or more instructions, and a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the method steps as in any of the implementations of the first or second aspects.
A sixth aspect of the embodiments of the present application provides a computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the method steps as in any implementation of the first or second aspects described above.
The scheme of the application at least has the following beneficial effects that when one branch is mispredicted, the branch prediction clears the instruction on the wrong path, but the saturation counter in the TAGE branch predictor is not updated immediately, so that the training data of the TAGE branch predictor is not up-to-date, and the accuracy of the branch prediction is reduced to a certain extent. The application adds an IUM module for the TAGE branch predictor, if the last executed branch instruction has misprediction, the IUM module calculates the saturation counter updating value of the branch instruction, and stores the saturation counter updating value. When the TAGE branch predictor predicts the branch instruction this time, whether relevant table items of the branch instruction exist in the IUM module can be searched at the same time, if yes, the updated value of the saturation counter stored in the IUM module is directly used for predicting the direction of the branch instruction, and the updating of the saturation counter in the TAGE branch predictor is not required to be waited, so that the updating efficiency of training data of the TAGE branch predictor is improved, the TAGE branch predictor adopts the updated data for predicting the direction of the branch instruction, and the prediction accuracy of the TAGE branch predictor is improved.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present application;
FIG. 2 is a flow chart of a branch prediction method of a branch instruction according to an embodiment of the present application;
FIG. 3 is a flow chart of another branch prediction method of a branch instruction according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an IUM table entry according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a branch prediction apparatus for a branch instruction according to an embodiment of the present application;
FIG. 6 is a schematic diagram illustrating another branch prediction apparatus for a branch instruction according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
The terms "comprising" and "having" and any variations thereof, as used in the description, claims and drawings, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. Furthermore, the terms "first," "second," and "third," etc. are used for distinguishing between different objects and not for describing a particular sequential order.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the described embodiments of the application may be combined with other embodiments.
The embodiment of the application provides a branch prediction method of a branch instruction, which can be realized based on a system architecture shown in fig. 1, and as shown in fig. 1, the system architecture can comprise a branch prediction unit (Branch Prediction Unit, BPU), a fetch target Queue (FETCH TARGET Queue, FTQ), a fetch unit (Instruction Fetch Unit, IFU) and a back end (back end), wherein the BPU comprises a tag branch predictor, and a plug-in unit (IUM (IMMEDIATE UPDATE MIMICKER) module is set for the tag branch predictor. The FTQ is used for generating a fetching request, the IFU executes fetching of instructions based on the fetching request of the FTQ, and the BPU predicts the direction of the instructions through the TAGE branch predictor in the instruction fetching stage, namely judges whether the instructions need to jump or not. It should be appreciated that the tag branch predictor does not give instruction specific jump target addresses at branch prediction. When the branch prediction is performed, the TAGE branch predictor judges whether an item of the branch instruction hits in the IUM module, and if so, the saturation counter update value of the branch instruction stored in the IUM module is used for performing the direction prediction on the branch instruction. The BPU sends branch prediction results to the FTQ stores. The FTQ is also configured to send an update signal to the BPU after the back-end instruction commits, and to transmit update data. For example, the BPU may update the relevant entries of the branch prediction table in the TAGE branch predictor after the back-end instruction commits. Based on the branch prediction results stored in the FTQ, the backend, when executing the branch instruction, may determine whether the branch instruction is mispredicted and transmit a misprediction signal back to the FTQ. It can be seen that, based on the system architecture shown in fig. 1, the branch prediction in the instruction fetching stage can find whether there is a related entry of the branch instruction from the IUM module, if so, directly use the saturation counter update value stored in the IUM module to perform direction prediction on the branch instruction, without waiting for the update of the saturation counter in the tag branch predictor.
Referring to fig. 2, fig. 2 is a flowchart illustrating a branch prediction method of a branch instruction according to an embodiment of the present application, where the method is applied to a tag branch predictor, as shown in fig. 2, and includes steps 210-230:
The TAGE branch predictor determines branch prediction table entries for branch instructions based on instruction addresses of branch instructions to be predicted and global branch histories of different lengths.
In the embodiment of the application, the TAGE branch predictor calculates index values (such as exclusive OR operation) by adopting instruction addresses of branch instructions and global branch histories with different lengths. The global branch histories with different lengths correspond to different branch prediction tables of the branch instruction in the tag branch predictor, and an index value calculated by using the global branch history with the longest length and the instruction address can be indexed to the branch prediction table with the longest global branch history, for example, the index value can hit an entry of the branch prediction table with the longest global branch history, where the entry includes tag information (i.e., first tag information) of the branch instruction. Specifically, the tag branch predictor matches the tag information of the branch instruction in different branch prediction tables by using the calculated index value, determines the tag information in the branch prediction table with the longest global branch history and the matched branch prediction table as the first tag information, and determines each table entry related to the first tag information as the branch prediction table entry. It should be appreciated that branch prediction table entries typically include saturation counter values, tag information, and u-bit values for branch instructions, among others.
And 220, determining whether the saturation counter updating value of the branch instruction exists in the preset IUM module or not by the TAGE branch predictor according to the branch prediction table entry.
In the embodiment of the application, the TAGE branch predictor inputs the branch prediction table entry to the IUM module through the Provider interface of the IUM module, the Provider interface is an input interface, a special valid signal exists, the valid signal is used for indicating whether the input data is valid, and the input of the Provider interface is an unsigned integer used for indicating which branch prediction table and the corresponding table entry are matched in the TAGE branch predictor by the branch instruction. The IUM module adopts the first tag information to match with the tag information stored by the IUM module, and if the matched tag information (namely, the second tag information) can be found, the branch prediction table entry is indicated to hit in the IUM module. The IUM module returns a Resp interface to the tag branch predictor, the Resp interface being an output interface, there being a dedicated valid signal, if the valid signal is true, indicating that the branch prediction table entry hits in the IUM module, the hit of the branch prediction table entry in the IUM module indicating that the presence of the valid signal in the IUM module indicates that the branch prediction table entry misses in the IUM module. The IUM module outputs a stored saturation counter update value and a fetch target queue pointer for the branch instruction to the tag branch predictor via the Resp interface, which the tag branch predictor may use to update data in the branch prediction table entry. Wherein the data type of the saturation counter update value and the index target queue pointer is an unsigned integer.
Illustratively, the tag branch predictor obtains the second tag information of the branch instruction sent by the IUM module through the resp_tag interface returned by the IUM module. The IUM module returns second tag information to the tag branch predictor through the resp_tag interface under the condition that the hit of the branch prediction table entry is determined. The tag branch predictor may store the second tag information, where if there is a misprediction of the branch instruction at this time, the second tag information may be used to determine whether an entry corresponding to the index exists in the IUM module when performing direction prediction on the branch instruction next time. The data type output by the resp_tag interface is an unsigned integer.
230 TAGE branch predictor uses the saturation counter update value to directionally predict branch instructions in the presence of the saturation counter update value in the IUM block.
In the embodiment of the application, the TAGE branch predictor calculates the direction prediction result of the branch instruction by adopting the updated value of the saturation counter returned by the IUM module, namely whether the branch instruction jumps, if the branch instruction jumps, the saturation counter value is +1, and if the branch instruction does not jump, the saturation counter value is-1.
Illustratively, the TAGE branch predictor uses the saturation counter value of the branch instruction stored in the TAGE branch predictor to directionally predict the branch instruction in the absence of a saturation counter update value in the IUM block.
Referring to fig. 3, fig. 3 is a flowchart of another branch prediction method of a branch instruction according to an embodiment of the present application, which is applied to an IUM module, wherein the IUM module is a plug-in of a tag branch predictor, and the method includes steps 310-330:
and 310, the IUM module acquires a branch prediction table entry of the branch instruction to be predicted, which is input by the TAGE branch predictor, through a Provider interface, wherein the branch prediction table entry comprises first tag information of the branch instruction.
In the embodiment of the application, when the TAGE branch predictor predicts the direction of a branch instruction to be predicted, an index value is calculated according to the instruction address of the branch instruction to be predicted and the global branch histories of different lengths, then the index value is adopted to search a branch prediction table entry of the branch instruction, the branch prediction table entry is sent to an IUM module through a Provider interface of the IUM module, and the branch prediction table entry comprises the first tag information of the branch instruction. Illustratively, the IUM module further includes a Req interface, which is an input interface, with a dedicated valid signal indicating whether the request data of the tag branch predictor is valid.
And 320, judging whether second tag information matched with the first tag information is stored in the content addressing memory by the ium module through the enq _tag interface.
In the embodiment of the present application, the enq _tag interface is an input interface, and there is a dedicated valid signal, and the data type of the enq _tag interface is an unsigned integer. The IUM module obtains first tag information in the branch prediction table entry through a enq _tag interface, and matches the first tag information with a plurality of tag information stored in a content addressing memory to determine whether second tag information matched with the first tag information exists in the IUM module.
And 330, determining that the branch prediction table entry hits in the IUM module under the condition that the second tag information is stored in the content addressable memory, and returning a saturation counter update value of the branch instruction to the TAGE branch predictor through a Resp interface so that the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction.
In the embodiment of the present application, when the second tag information is stored in the content addressable memory, the IUM module returns a Resp interface to the tag branch predictor, and at this time, the valid signal dedicated to the Resp interface is true, which is used to inform the tag branch predictor that the branch prediction table entry of the branch instruction hits in the IUM module. The IUM module returns a prestored saturation counter update value of the branch instruction and a fetch target queue pointer of a branch instruction related table item to the tag branch predictor through a Resp interface, wherein the fetch target queue pointer is used for marking the saturation counter update value of the branch instruction and the enqueuing time of the second tag information in the IUM module. When the TAGE branch predictor obtains the saturation counter updated value returned by the IUM module, the direction predicted result of the branch instruction is calculated by adopting the saturation counter updated value.
The instructions executed by the processor are typically re-executed next, that is, the branch instruction to be predicted may have been executed last time, for example. When the back-end pipeline executes the branch instruction last time, if the branch instruction is determined to have misprediction, a signal that the branch instruction has misprediction is returned. The IUM module may fetch a signal through the enq _data interface that the last prediction was that there was a misprediction for the branch instruction. The enq _data interface is an input interface, and the data type of the input interface is an unsigned integer. For example, the IUM module may determine whether the branch instruction has a misprediction according to a valid signal specific to the enq _data interface, and if the valid signal is true, it indicates that the branch instruction has a misprediction in the last prediction. The IUM module receives a saturation counter value returned when the back-end pipeline executes the branch instruction last time and a jump result of the branch instruction, wherein the saturation counter value is a saturation counter value of the branch instruction in a corresponding table entry of the tag branch predictor when the branch instruction is determined to have misprediction. The saturation counter value and the branch instruction may be entered by the FTQ. The IUM module calculates a saturation counter update value for the branch instruction using the saturation counter value and a jump result for the branch instruction, and calculates second tag information for the branch instruction using the same calculation logic as the tag branch predictor. The saturation counter update value and the second tag information are stored as entries in the IUM of the branch instruction, and the fetch target queue pointer of the branch instruction in the IUM is stored.
Illustratively, the main memory structure of the IUM module includes a content-addressed memory, a register file, and a set of valid registers. The IUM module stores the second tag information in the content-addressed memory and stores the saturation counter update value and the index target queue pointer in the register file. The second tag information comprises three parts, namely predicting whether the second tag information is derived from a basic table, a first table and a table item, and splicing the three data into unsigned integer data to obtain the second tag information. The valid register is used for marking whether the table entry of the branch instruction in the IUM is valid.
Illustratively, the IUM module returns the second tag information to the tag branch predictor via the resp_tag interface in the case where the second tag information is stored in the content-addressable memory.
Illustratively, the IUM module dequeues entries having a enqueue time earlier than the saturation counter update value, the enqueue time of the second tag information, in the event of a branch instruction commit. The branch instruction commit indicates the stage at which the branch instruction exits from the backend pipeline, and in the event of a branch instruction commit, the FTQ issues an update signal to the BPU that the IUM module will obtain a pool value via the deq interface that instructs the IUM module to perform a dequeue scan. Wherein deq interface is an input interface. Meanwhile, the IUM module acquires a fetch target queue pointer corresponding to the branch instruction through the deq _ ftq _ptr interface, and the IUM module compares the fetch target queue pointer with the fetch target queue pointers of all the entries in the IUM module to dequeue the entry with the enqueuing time earlier than the branch instruction. Specifically, the IUM module sets the valid register of entries that have an enqueue time earlier than the branch instruction to false, indicating that the entries have been invalidated. Wherein deq _ ftq _ptr interface is an input interface. As shown in FIG. 4, each entry, ctr1 and tag1 represent related entries of branch instruction 1, ftq _ptr1 represents a target queue pointer for fetching related entries of branch instruction 1, ctr2 and tag2 represent related entries of branch instruction 2, and ftq _ptr2 represents a target queue pointer for fetching related entries of branch instruction 2. If the enqueue time of the associated entry of branch instruction 2 is earlier than the enqueue time of the associated entry of branch instruction 1, then the associated entry of branch instruction 2 is dequeued.
For example, if the branch instruction has multiple entries in the IUM module, the enqueue time of the last entry in the multiple entries is taken as the enqueue time of the branch instruction.
It can be seen that the present application adds an IUM module to the tag branch predictor, and if it is determined that there is a misprediction for the last executed branch instruction, the IUM module calculates the saturation counter update value of the branch instruction, and stores the saturation counter update value. When the TAGE branch predictor predicts the branch instruction this time, whether relevant table items of the branch instruction exist in the IUM module can be searched at the same time, if yes, the updated value of the saturation counter stored in the IUM module is directly used for predicting the direction of the branch instruction, and the updating of the saturation counter in the TAGE branch predictor is not required to be waited, so that the updating efficiency of training data of the TAGE branch predictor is improved, the TAGE branch predictor adopts the updated data for predicting the direction of the branch instruction, and the prediction accuracy of the TAGE branch predictor is improved.
Based on the description of the branch prediction method embodiments of the branch instruction, the present application further provides a branch prediction apparatus of a branch instruction, where the branch prediction apparatus of a branch instruction may be a computer program (including program code) running in a terminal. The branch prediction apparatus of the branch instruction may perform the method shown in fig. 2. The apparatus is applied to a tag branch predictor, please refer to fig. 5, and the apparatus includes a first processing unit 510 and a first transceiver unit 520, wherein:
a first processing unit 510, configured to determine a branch prediction table entry of a branch instruction according to an instruction address of the branch instruction to be predicted and global branch histories of different lengths;
A first transceiver unit 520, configured to determine whether a saturation counter update value of a branch instruction exists in a preset IUM module according to a branch prediction table entry;
The first processing unit 510 is further configured to, in the case that the saturation counter update value exists in the IUM module, perform direction prediction on the branch instruction using the saturation counter update value.
In one possible implementation manner, in determining whether there is a saturation counter update value of a branch instruction in the preset IUM module according to a branch prediction table entry of the branch instruction, the first transceiver unit 520 is specifically configured to:
inputting a branch prediction table entry to the IUM module through a Provider interface of the IUM module, wherein the branch prediction table entry comprises first tag information of a branch instruction;
and acquiring a Resp interface returned by the IUM module, if the valid signal corresponding to the Resp interface indicates that the branch prediction table item hits in the IUM module, determining that a saturation counter update value exists in the IUM module, and if the valid signal corresponding to the Resp interface indicates that the branch prediction table item does not hit in the IUM module, determining that the saturation counter update value does not exist in the IUM module.
In a possible embodiment, the first transceiver unit 520 is further configured to:
The receiving IUM module returns a saturation counter update value and a corresponding finger target queue pointer of the saturation counter update value over the Resp interface.
In a possible embodiment, the first transceiver unit 520 is further configured to:
And acquiring second tag information of the branch instruction sent by the IUM module through a resp_tag interface returned by the IUM module.
The present application also provides a branch prediction apparatus of a branch instruction, which may perform the method shown in fig. 3. The apparatus is applied to an IUM module, please refer to fig. 6, and the apparatus includes a second processing unit 610 and a second transceiver unit 620, wherein:
A second transceiver unit 620, configured to obtain, through a Provider interface, a branch prediction table entry of a branch instruction to be predicted, where the branch prediction table entry includes first tag information of the branch instruction;
a second processing unit 610, configured to determine, through a enq _tag interface, whether second tag information that matches the first tag information is stored in the content-addressable memory;
The second processing unit 610 is further configured to determine that the branch prediction table entry hits in the IUM module if the second tag information is stored in the content addressable memory, and call the second transceiver unit 620 to return the saturation counter update value of the branch instruction to the tag branch predictor through the Resp interface, so that the tag branch predictor performs direction prediction on the branch instruction using the saturation counter update value.
In a possible implementation, the second processing unit 610 is further configured to:
Receiving a saturation counter value returned when the branch instruction is executed at the last time by the back-end pipeline and a jump result of the branch instruction, wherein the saturation counter value is the saturation counter value of the branch instruction in a corresponding table entry of the TAGE branch predictor when the branch instruction is determined to have misprediction;
calculating a saturation counter update value of the branch instruction according to the saturation counter value and the jump result;
and calculating second tag information by adopting the instruction address of the branch instruction and global branch histories with different lengths, and storing the saturation counter update value, the second tag information and the corresponding instruction fetching target queue pointer.
In one possible implementation, the IUM module includes a content-addressable memory and a register file, and the second processing unit 610 is specifically configured to:
the second tag information is stored in the content-addressed memory and the saturation counter update value and the finger target queue pointer are stored in the register file.
In a possible implementation manner, the second transceiver unit 620 is further configured to:
And returning the second tag information to the TAGE branch predictor through a resp_tag interface under the condition that the second tag information is stored in the content-addressable memory.
In a possible implementation, the finger target queue pointer is used to mark a saturation counter update value, and a enqueuing time of the second tag information in the IUM module, and the second processing unit 610 is further configured to:
In the case of a branch instruction commit, dequeuing entries having a enqueue time earlier than the saturation counter update value, the enqueue time of the second tag information.
According to one embodiment of the present application, each module of the branch prediction apparatus for a branch instruction shown in fig. 5 or fig. 6 may be separately or completely combined into one or several additional units, or some module(s) thereof may be further split into a plurality of units with smaller functions, which may achieve the same operation without affecting the implementation of the technical effects of the embodiments of the present application. The above units are divided based on logic functions, and in practical applications, the functions of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the application, the branch prediction means of the branch instruction may also comprise other units, and in actual practice, these functions may be carried out with the aid of other units, and may be carried out by a plurality of units in cooperation.
According to another embodiment of the present application, a branch prediction apparatus device of a branch instruction as shown in fig. 5 or 6 may be constructed by running a computer program (including program code) capable of executing the steps involved in the respective methods as shown in fig. 2 or 3 on a general-purpose computing device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read only storage medium (ROM), and the like, and a storage element, and a branch prediction method of a branch instruction of an embodiment of the present application is implemented. The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and executed by the above-described computing device via the computer-readable recording medium.
Referring to fig. 7 for a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 7, the electronic device includes at least a processor 710, an input device 720, an output device 730, and a computer storage medium 740. Wherein the processor 710, input device 720, output device 730, and computer storage medium 740 within the electronic device may be connected by a bus or other means.
The computer storage medium 740 may be stored in a memory of an electronic device, the computer storage medium 740 for storing a computer program comprising program instructions, and the processor 710 for executing the program instructions stored by the computer storage medium 740. The processor 710, or CPU (Central Processing Unit ), is a computing core as well as a control core of the electronic device, which is adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement a corresponding method flow or a corresponding function.
In one embodiment, the processor 710 of the electronic device provided by the embodiment of the present application may be configured to perform a branch prediction process for a series of branch instructions:
Determining a branch prediction table entry of the branch instruction according to the instruction address of the branch instruction to be predicted and the global branch histories of different lengths;
determining whether a saturation counter update value of a branch instruction exists in a preset IUM module according to a branch prediction table entry;
in the case that a saturation counter update value exists in the IUM module, the branch instruction is directionally predicted using the saturation counter update value.
In yet another embodiment, processor 710 performs determining whether a saturation counter update value for a branch instruction exists in a predetermined IUM module based on a branch prediction table entry for the branch instruction, including:
inputting a branch prediction table entry to the IUM module through a Provider interface of the IUM module, wherein the branch prediction table entry comprises first tag information of a branch instruction;
and acquiring a Resp interface returned by the IUM module, if the valid signal corresponding to the Resp interface indicates that the branch prediction table item hits in the IUM module, determining that a saturation counter update value exists in the IUM module, and if the valid signal corresponding to the Resp interface indicates that the branch prediction table item does not hit in the IUM module, determining that the saturation counter update value does not exist in the IUM module.
In yet another embodiment, the processor 710 is further configured to, prior to using the saturation counter update value to direction predict the branch instruction, perform:
The receiving IUM module returns a saturation counter update value and a corresponding finger target queue pointer of the saturation counter update value over the Resp interface.
In yet another embodiment, the processor 710 is further configured to, prior to using the saturation counter update value to direction predict the branch instruction, perform:
And acquiring second tag information of the branch instruction sent by the IUM module through a resp_tag interface returned by the IUM module.
In another embodiment, the processor 710 of the electronic device provided by the embodiment of the present application may be further configured to perform a branch prediction process of another series of branch instructions:
acquiring a branch prediction table entry of a branch instruction to be predicted, which is input by a TAGE branch predictor, through a Provider interface, wherein the branch prediction table entry comprises first tag information of the branch instruction;
judging whether second tag information matched with the first tag information is stored in the content addressing memory through enq _tag interface;
under the condition that second tag information is stored in the content-addressable memory, determining that a branch prediction table entry hits in the IUM module, and returning a saturation counter update value of a branch instruction to the TAGE branch predictor through a Resp interface so that the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction.
In yet another embodiment, the processor 710 is further configured to, prior to the IUM module obtaining, via the Provider interface, a branch prediction table entry for a branch instruction to be predicted that is input by the tag branch predictor, execute:
Receiving a saturation counter value returned when the branch instruction is executed at the last time by the back-end pipeline and a jump result of the branch instruction, wherein the saturation counter value is the saturation counter value of the branch instruction in a corresponding table entry of the TAGE branch predictor when the branch instruction is determined to have misprediction;
calculating a saturation counter update value of the branch instruction according to the saturation counter value and the jump result;
and calculating second tag information by adopting the instruction address of the branch instruction and global branch histories with different lengths, and storing the saturation counter update value, the second tag information and the corresponding instruction fetching target queue pointer.
In yet another embodiment, the IUM module includes a content addressable memory and a register file, and the processor 710 executes a store saturation counter update value, second tag information, and corresponding finger target queue pointer, comprising:
The IUM module stores the second tag information in the content-addressed memory and stores the saturation counter update value and the index target queue pointer in the register file.
In yet another embodiment, the processor 710 is further configured to perform:
And returning the second tag information to the TAGE branch predictor through a resp_tag interface under the condition that the second tag information is stored in the content-addressable memory.
In yet another embodiment, the finger target queue pointer is used to mark a saturation counter update value, a enqueue time of the second tag information in the IUM module, and the processor 710 is further configured to perform:
In the case of a branch instruction commit, dequeuing entries having a enqueue time earlier than the saturation counter update value, the enqueue time of the second tag information.
The electronic device may be a computer, a server, a terminal device, or the like, and the server may be a separate physical server, a server cluster, or a distributed system. The electronic devices may include, but are not limited to, a processor 710, an input device 720, an output device 730, and a computer storage medium 740. And may also include memory, power supplies, application client modules, and the like. The input device 720 may be a keyboard, touch screen, radio frequency receiver, etc., and the output device 730 may be a speaker, display, radio frequency transmitter, etc. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of an electronic device and is not limiting of an electronic device, and may include more or fewer components than shown, or certain components may be combined, or different components.
It should be noted that, since the steps in the branch prediction method of the branch instruction are implemented when the processor 710 of the electronic device executes the computer program, the embodiments of the branch prediction method of the branch instruction are applicable to the electronic device, and all achieve the same or similar advantages.
The embodiment of the application also provides a computer storage medium (Memory) which is a Memory device in an information processing device or an information transmitting device or an information receiving device and is used for storing programs and data. It will be appreciated that the computer storage medium herein may include both a built-in storage medium in the terminal and an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores an operating system of the terminal. Also stored in the memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. The computer storage medium may be a high-speed RAM memory, a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory, or at least one computer storage medium located remotely from the processor. In one embodiment, one or more instructions stored in a computer storage medium may be loaded and executed by a processor to implement the corresponding steps in the branch prediction method described above with respect to branch instructions.
The foregoing has outlined rather broadly the more detailed description of embodiments of the application, wherein the principles and embodiments of the application are explained in detail using specific examples, the above examples being provided solely to facilitate the understanding of the method and core concepts of the application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.
Claims (13)
1. A method of branch prediction for a branch instruction, applied to a tag branch predictor, the method comprising:
The TAGE branch predictor determines a branch prediction table entry of a branch instruction according to the instruction address of the branch instruction to be predicted and global branch histories of different lengths;
The TAGE branch predictor determines whether a saturation counter update value of the branch instruction exists in a preset IUM module according to the branch prediction table entry, wherein the saturation counter update value is calculated by the IUM module under the condition that the branch instruction is mispredicted in the last prediction;
And the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction under the condition that the saturation counter update value exists in the IUM module.
2. The method of claim 1, wherein the tag branch predictor determining whether a saturation counter update value for the branch instruction exists in a preset IUM module according to a branch prediction table entry of the branch instruction comprises:
The TAGE branch predictor inputs the branch prediction table entry to the IUM module through a Provider interface of the IUM module, wherein the branch prediction table entry comprises first tag information of the branch instruction;
And the TAGE branch predictor acquires a Resp interface returned by the IUM module, if a valid signal corresponding to the Resp interface indicates that the branch prediction table entry hits in the IUM module, determining that the saturation counter updating value exists in the IUM module, and if the valid signal corresponding to the Resp interface indicates that the branch prediction table entry does not hit in the IUM module, determining that the saturation counter updating value does not exist in the IUM module.
3. The method of claim 2, wherein prior to predicting the direction of the branch instruction using the saturation counter update value, the method further comprises:
The TAGE branch predictor receives the saturation counter updating value returned by the IUM module through the Resp interface and the corresponding finger target queue pointer of the saturation counter updating value.
4. A method according to any of claims 1-3, wherein prior to direction predicting the branch instruction with the saturation counter update value, the method further comprises:
And the TAGE branch predictor acquires second tag information of the branch instruction sent by the IUM module through a Resp_tag interface returned by the IUM module.
5. A method of branch prediction for a branch instruction, applied to an IUM module, the method comprising:
the IUM module obtains a branch prediction table entry of a branch instruction to be predicted, which is input by a TAGE branch predictor, through a Provider interface, wherein the branch prediction table entry comprises first tag information of the branch instruction;
the IUM module judges whether second tag information matched with the first tag information is stored in a content addressing memory through enq _tag interfaces;
And under the condition that the second tag information is stored in the content-addressable memory, the IUM module determines that the branch prediction table entry hits in the IUM module, and returns a saturation counter update value of the branch instruction to the TAGE branch predictor through a Resp interface, so that the TAGE branch predictor adopts the saturation counter update value to conduct direction prediction on the branch instruction, wherein the saturation counter update value is calculated by the IUM module under the condition that the branch instruction is mispredicted in the last prediction.
6. The method of claim 5, wherein prior to the IUM module obtaining, via the Provider interface, a branch prediction table entry for a branch instruction to be predicted that is entered by a tag branch predictor, the method further comprises:
the IUM module receives a saturation counter value returned when the back-end pipeline executes the branch instruction last time and a jump result of the branch instruction, wherein the saturation counter value is a saturation counter value in a corresponding table entry of a TAGE branch predictor when the branch instruction is determined to have misprediction;
The IUM module calculates a saturation counter update value of the branch instruction according to the saturation counter value and the jump result;
the IUM module calculates the second tag information by adopting the instruction address of the branch instruction and the global branch histories with different lengths, and stores the saturation counter update value, the second tag information and the corresponding instruction fetching target queue pointer.
7. The method of claim 6, wherein the IUM module includes the content-addressable memory and a register file, the storing the saturation counter update value, the second tag information, and a corresponding finger target queue pointer, comprising:
The IUM module stores the second tag information in the content-addressed memory and stores the saturation counter update value and the fetch target queue pointer in the register file.
8. The method according to any one of claims 5-7, further comprising:
and the IUM module returns the second tag information to the TAGE branch predictor through a resp_tag interface under the condition that the second tag information is stored in the content-addressable memory.
9. The method of claim 6, wherein the finger target queue pointer is used to mark the saturation counter update value, a enqueue time of the second tag information in the IUM module, the method further comprising:
And the IUM module dequeues the table entry with the enqueue time earlier than the saturation counter update value and the enqueue time of the second tag information under the condition that the branch instruction is submitted.
10. A branch prediction apparatus for a branch instruction, applied to a tag branch predictor, said apparatus comprising a first processing unit and a first transceiver unit;
The first processing unit is configured to determine a branch prediction table entry of a branch instruction according to an instruction address of the branch instruction to be predicted and global branch histories of different lengths;
The first transceiver unit is configured to determine, according to the branch prediction table entry, whether a saturation counter update value of the branch instruction exists in a preset IUM module, where the saturation counter update value is calculated by the IUM module when the branch instruction is mispredicted in the last prediction;
The first processing unit is further configured to, when the saturation counter update value exists in the IUM module, perform direction prediction on the branch instruction using the saturation counter update value.
11. A branch prediction apparatus for a branch instruction, applied to an IUM module, the apparatus comprising a second processing unit and a second transceiver unit;
the second transceiver unit is configured to obtain, through a Provider interface, a branch prediction table entry of a branch instruction to be predicted, where the branch prediction table entry includes first tag information of the branch instruction;
The second processing unit is configured to determine, through a enq _tag interface, whether second tag information that matches the first tag information is stored in the content-addressable memory;
the second processing unit is further configured to determine that the branch prediction table entry hits in the IUM module when the second tag information is stored in the content addressable memory, and call the second transceiver unit to return a saturation counter update value of the branch instruction to the tag branch predictor through a Resp interface, so that the tag branch predictor performs direction prediction on the branch instruction by using the saturation counter update value, where the saturation counter update value is calculated by the IUM module when the branch instruction is mispredicted in the last prediction.
12. An electronic device comprising an input device and an output device, further comprising:
a processor adapted to implement one or more instructions, and
A computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the method of any of claims 1-4 or 5-9.
13. A computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the method of any one of claims 1-4 or 5-9.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210478867.2A CN114816534B (en) | 2022-04-29 | 2022-04-29 | A branch prediction method and related device for branch instructions |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210478867.2A CN114816534B (en) | 2022-04-29 | 2022-04-29 | A branch prediction method and related device for branch instructions |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114816534A CN114816534A (en) | 2022-07-29 |
| CN114816534B true CN114816534B (en) | 2025-09-26 |
Family
ID=82511567
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210478867.2A Active CN114816534B (en) | 2022-04-29 | 2022-04-29 | A branch prediction method and related device for branch instructions |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114816534B (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115562730B (en) * | 2022-09-29 | 2025-12-30 | 杭州中天微系统有限公司 | Branch predictor, related equipment and branch prediction method |
| CN115934171B (en) * | 2023-01-16 | 2023-05-16 | 北京微核芯科技有限公司 | Method and apparatus for scheduling branch predictors for multiple instructions |
| CN119740271A (en) * | 2024-11-25 | 2025-04-01 | 中山大学 | RISC-V-based secure branch predictor and operation method |
| CN120010927B (en) * | 2025-04-22 | 2025-07-18 | 山东云海国创云计算装备产业创新中心有限公司 | New entry allocation method, device, equipment and medium in branch instruction predictor |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7069426B1 (en) * | 2000-03-28 | 2006-06-27 | Intel Corporation | Branch predictor with saturating counter and local branch history table with algorithm for updating replacement and history fields of matching table entries |
| US10838731B2 (en) * | 2018-09-19 | 2020-11-17 | Qualcomm Incorporated | Branch prediction based on load-path history |
| CN113553104B (en) * | 2021-07-22 | 2024-08-27 | 江南大学 | A method for improving prediction accuracy of branch direction predictor |
| CN113703846B (en) * | 2021-08-30 | 2024-10-01 | 江南大学 | A new table entry allocation method for TAGE branch predictor |
| CN113626084B (en) * | 2021-09-03 | 2023-05-19 | 苏州睿芯集成电路科技有限公司 | Method for optimizing TAGE branch prediction algorithm for instruction stream with oversized cycle number |
-
2022
- 2022-04-29 CN CN202210478867.2A patent/CN114816534B/en active Active
Non-Patent Citations (1)
| Title |
|---|
| A new case for the TAGE branch predictor;André Seznec等;《2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)》;20170213;第2-5节 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114816534A (en) | 2022-07-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN114816534B (en) | A branch prediction method and related device for branch instructions | |
| US10613869B2 (en) | Branch target address provision | |
| US4943908A (en) | Multiple branch analyzer for prefetching cache lines | |
| US8751776B2 (en) | Method for predicting branch target address based on previous prediction | |
| US20060242393A1 (en) | Branch target prediction for multi-target branches | |
| US20080209190A1 (en) | Parallel prediction of multiple branches | |
| US9092225B2 (en) | Systems and methods for reducing branch misprediction penalty | |
| US20230350683A1 (en) | Branch prediction method, branch prediction apparatus, processor, medium, and device | |
| CN116048627B (en) | Instruction buffering method, apparatus, processor, electronic device and readable storage medium | |
| JP6683918B2 (en) | Arithmetic processing device and method for controlling arithmetic processing device | |
| JP2006520964A5 (en) | ||
| US11061683B2 (en) | Limiting replay of load-based control independent (CI) instructions in speculative misprediction recovery in a processor | |
| WO2004086219A9 (en) | Method and apparatus for branch prediction based on branch targets | |
| CN106557304B (en) | Instruction fetch unit for predicting the target of a subroutine return instruction | |
| WO2017053111A1 (en) | Method and apparatus for dynamically tuning speculative optimizations based on predictor effectiveness | |
| EP2100220A2 (en) | Associate cached branch information with the last granularity of branch instruction in variable length instruction set | |
| CN117130666A (en) | Configuration methods, branch predictors, instruction recognizers and electronics | |
| EP4031964B1 (en) | Dynamic hammock branch training for branch hammock detection in an instruction stream executing in a processor | |
| US11055101B2 (en) | Processing apparatus and controlling method for processing apparatus | |
| US20090198985A1 (en) | Data processing system, processor and method of data processing having branch target address cache with hashed indices | |
| CN119005083B (en) | Processing module, instruction processing method and chip system | |
| US8683181B2 (en) | Processor and method for distributing load among plural pipeline units | |
| US20090070569A1 (en) | Branch prediction device,branch prediction method, and microprocessor | |
| US20070294518A1 (en) | System and method for predicting target address of branch instruction utilizing branch target buffer having entry indexed according to program counter value of previous instruction | |
| EP4302184B1 (en) | Processor branch prediction circuit employing back-invalidation of prediction cache entries based on decoded branch instructions and related methods |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |