
CN111066006B - Systems, methods and media for facilitating processing within a computing environment - Google Patents

Systems, methods and media for facilitating processing within a computing environment

Info

Publication number
CN111066006B
Authority
CN
China
Prior art keywords
toc
value
instruction
data structure
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880058321.3A
Other languages
Chinese (zh)
Other versions
CN111066006A (en
Inventor
M.K.格施温德
V.萨拉普拉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN111066006A
Application granted
Publication of CN111066006B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815 Cache consistency protocols
    • G06F12/0817 Cache consistency protocols using directory methods
    • G06F12/0826 Limited pointers directories; State-only directories without pointers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30076 Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30101 Special purpose registers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60 Details of cache memory
    • G06F2212/6024 History based prefetching
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/62 Details of cache specific to multiprocessor cache arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)
  • Executing Machine-Instructions (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A Set Table of Contents (TOC) Register instruction. An instruction that provides a pointer to a reference data structure, such as a TOC, is obtained by a processor and executed. The executing includes determining a value of the pointer to the reference data structure and storing the value in a location (e.g., a register) specified by the instruction.

Description

Systems, methods, and media for facilitating processing within a computing environment

Background Art

One or more aspects relate, in general, to processing within a computing environment, and in particular, to facilitating such processing.

Many computing systems use a global offset table (GOT) or a table of contents (TOC) to populate variables within source code. For example, a compiler generates object code from source code without knowing the final address or displacement of the code or data. Specifically, the compiler generates object code that accesses a variable-address reference data structure (e.g., a global offset table or a table of contents) for variable values, without knowing the final size of the data structure or the offsets/addresses of the various data sections. Placeholders for this information are left in the object code and updated by a linker.
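The placeholder mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the patent or of any real toolchain: the names `ObjectCode`, `compile_access`, and `link`, the `load_toc_relative` mnemonic, and the fixed 8-byte TOC entry size are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectCode:
    # Each instruction is (mnemonic, operand); an operand of None is a placeholder.
    instructions: list
    # Relocations map an instruction index to the symbol whose TOC offset it needs.
    relocations: dict = field(default_factory=dict)


def compile_access(symbols):
    """Hypothetical 'compiler': emit one TOC-relative load per symbol and
    leave the offset open, because the final TOC layout is not yet known."""
    obj = ObjectCode(instructions=[])
    for sym in symbols:
        obj.relocations[len(obj.instructions)] = sym
        obj.instructions.append(("load_toc_relative", None))
    return obj


def link(obj, entry_size=8):
    """Hypothetical 'linker': lay out the TOC, then patch every placeholder
    with the offset of its symbol's TOC entry."""
    toc_offsets = {}
    for sym in obj.relocations.values():
        if sym not in toc_offsets:
            toc_offsets[sym] = len(toc_offsets) * entry_size
    patched = list(obj.instructions)
    for index, sym in obj.relocations.items():
        mnemonic, _ = patched[index]
        patched[index] = (mnemonic, toc_offsets[sym])
    return patched, toc_offsets
```

In this sketch the object code stays unchanged after linking, and only the returned copy carries resolved offsets, which mirrors the split between what the compiler knows and what the linker fills in.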

To access the GOT or TOC, a pointer is used. The pointer is typically computed by a sequence of instructions. These instructions often depend on computed registers, which are not always readily available in the processor. Accesses to variables that depend on the TOC (i.e., variables other than local variables) may therefore be delayed.

Summary of the Invention

Shortcomings of the prior art are overcome, and additional advantages are provided, through the provision of a computer program product for facilitating processing within a computing environment. The computer program product includes a computer-readable storage medium readable by a processing circuit and storing instructions for performing a method. The method includes, for example, obtaining, by a processor, an instruction to provide a pointer to a reference data structure. The instruction is executed, and the executing includes determining a value of the pointer to the reference data structure and storing the value in a location specified by the instruction. Using an instruction to provide the pointer value facilitates processing and improves performance by limiting the delays that may occur while waiting for a computed register value.

In one embodiment, determining the value includes performing a look-up of a data structure to determine the value. As examples, the data structure includes a reference data structure pointer cache or a table populated with reference data structure pointer values. Further, in one example, the location includes a register specified by the instruction.

In another embodiment, determining the value includes checking a reference data structure pointer cache for an entry that includes the value, and the storing is performed based on finding the entry. Further, based on the value not being located in the reference data structure pointer cache, the processor raises a trap to a handler. The handler obtains the value from a data structure populated with reference data structure pointer values and performs the storing. In one embodiment, the value is also stored in the reference data structure pointer cache.

In a further embodiment, based on the value not being located in the reference data structure pointer cache, cache miss handling is performed to determine the value and store the value.

In yet another embodiment, based on obtaining the instruction, the processor raises a trap to a handler, and the handler performs the determining and the storing.
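The cache-then-trap embodiment above can be sketched roughly as follows. This is a software model under stated assumptions, not the claimed hardware: the `Processor` class, the use of a module identifier as the cache key, and the `TocCacheMiss` exception standing in for the trap are all hypothetical choices made for illustration.

```python
class TocCacheMiss(Exception):
    """Stands in for the trap raised when the pointer cache has no entry."""


class Processor:
    def __init__(self, toc_table):
        self.registers = {}          # register name -> value
        self.toc_cache = {}          # assumed key: module id -> TOC pointer value
        self.toc_table = toc_table   # table populated with TOC pointer values
        self.traps = 0               # number of traps taken to the handler

    def _lookup_cache(self, module):
        # Check the pointer cache for an entry that includes the value.
        if module not in self.toc_cache:
            raise TocCacheMiss(module)
        return self.toc_cache[module]

    def _trap_handler(self, module):
        # The handler obtains the value from the populated table and,
        # as in one embodiment, also stores it in the pointer cache.
        self.traps += 1
        value = self.toc_table[module]
        self.toc_cache[module] = value
        return value

    def set_toc_register(self, target_register, module):
        """Model of a Set TOC Register execution: determine the pointer
        value, then store it in the location named by the instruction."""
        try:
            value = self._lookup_cache(module)
        except TocCacheMiss:
            value = self._trap_handler(module)
        self.registers[target_register] = value
        return value
```

A first call for a given module takes the trap path and fills the cache; later calls for the same module are served from the cache without a trap.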

Computer-implemented methods and systems relating to one or more aspects are also described and claimed herein. Further, services relating to one or more aspects are also described and may be claimed herein.

Additional features and advantages are realized through the techniques described herein. Other embodiments and aspects are described in detail herein and are considered a part of the claimed aspects.

Brief Description of the Drawings

One or more aspects are particularly pointed out and distinctly claimed as examples in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of one or more aspects will become apparent from the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1A illustrates one example of a computing environment to incorporate and use one or more aspects of the present invention;

FIG. 1B illustrates further details of the processor of FIG. 1A, in accordance with one or more aspects of the present invention;

FIG. 1C illustrates further details of one example of an instruction execution pipeline used in accordance with one or more aspects of the present invention;

FIG. 1D illustrates further details of one example of the processor of FIG. 1A, in accordance with an aspect of the present invention;

FIG. 2 illustrates one example of a Set TOC Register (STR) instruction, in accordance with an aspect of the present invention;

FIG. 3 illustrates one example of processing associated with the Set TOC Register instruction, in accordance with an aspect of the present invention;

FIG. 4 illustrates another example of processing associated with the Set TOC Register instruction, in accordance with an aspect of the present invention;

FIG. 5 illustrates yet another example of processing associated with the Set TOC Register instruction, in accordance with an aspect of the present invention;

FIGS. 6A-6B illustrate embodiments of verifying the setting of a TOC register (also referred to herein as a TOC pointer register), in accordance with an aspect of the present invention;

FIGS. 7A-7B illustrate other embodiments of verifying the setting of the TOC register, in accordance with aspects of the present invention;

FIG. 8 illustrates one embodiment of determining a TOC pointer value (also referred to herein as a TOC value), in accordance with an aspect of the present invention;

FIG. 9 illustrates one example of processing associated with predicting a TOC value responsive to a subroutine branch, in accordance with an aspect of the present invention;

FIG. 10 illustrates one example of TOC value check insertion logic, in accordance with an aspect of the present invention;

FIG. 11 illustrates another example of processing associated with predicting a TOC value responsive to a subroutine branch, in accordance with an aspect of the present invention;

FIG. 12 illustrates another example of TOC value check insertion logic, in accordance with an aspect of the present invention;

FIG. 13 illustrates a further example of TOC value check insertion logic, in accordance with an aspect of the present invention;

FIG. 14A illustrates one example of a TOC pointer cache (also referred to herein as a TOC cache), in accordance with an aspect of the present invention;

FIG. 14B illustrates one example of TOC cache insertion processing, in accordance with an aspect of the present invention;

FIG. 15 illustrates one example of TOC values assigned to dynamic shared objects, in accordance with an aspect of the present invention;

FIG. 16 illustrates another example of a TOC cache, in accordance with an aspect of the present invention;

FIG. 17 illustrates another example of TOC cache insertion processing, in accordance with an aspect of the present invention;

FIG. 18 illustrates one example of storing TOC values into a TOC tracking structure, in accordance with an aspect of the present invention;

FIG. 19 illustrates one example of a TOC referenced by a read-only TOC register, in accordance with an aspect of the present invention;

FIGS. 20A-20C illustrate examples of Load TOC-Relative Long instructions, in accordance with aspects of the present invention;

FIG. 21 illustrates one example of a Load Address TOC-Relative Long instruction, in accordance with an aspect of the present invention;

FIG. 22 illustrates one example of a TOC add immediate shift instruction, in accordance with an aspect of the present invention;

FIG. 23 illustrates one example of an add TOC immediate shifted instruction, in accordance with an aspect of the present invention;

FIG. 24 illustrates one embodiment of processing an instruction that may include a TOC operand, in accordance with an aspect of the present invention;

FIGS. 25-27 illustrate embodiments of obtaining a TOC operand of an instruction, in accordance with aspects of the present invention;

FIG. 28 illustrates one example of a compilation flow associated with using the Set TOC Register instruction, in accordance with an aspect of the present invention;

FIG. 29 illustrates one example of a static linker flow associated with using the Set TOC Register instruction, in accordance with an aspect of the present invention;

FIG. 30 illustrates one example of a compilation flow associated with using a read-only TOC register, in accordance with an aspect of the present invention;

FIGS. 31A-31B illustrate one embodiment of facilitating processing within a computing environment, in accordance with an aspect of the present invention;

FIG. 32A illustrates another example of a computing environment to incorporate and use one or more aspects of the present invention;

FIG. 32B illustrates further details of the memory of FIG. 32A;

FIG. 33 illustrates one embodiment of a cloud computing environment; and

FIG. 34 illustrates one example of abstraction model layers.

Detailed Description

In accordance with an aspect of the present invention, the providing of a pointer to a reference data structure, such as a table of contents (TOC) or a global offset table (GOT), is facilitated. In one example, a Set TOC Register (STR) instruction is provided that loads a value used to access the TOC (e.g., a pointer value) into a register (or other defined location). Although the TOC is used as an example herein, the aspects, features, and techniques described herein are equally applicable to a GOT or other similar types of structures.

TOC pointer value, TOC pointer, TOC value, and pointer to the TOC are used interchangeably herein. The TOC register holds the TOC pointer and may therefore be referred to herein as the TOC pointer register or the TOC register.

Further, TOC pointer cache, TOC pointer tracking structure, TOC pointer table, etc. are also referred to herein as TOC cache, TOC tracking structure, TOC table, etc., respectively. Similarly, reference data structure pointer cache and reference data structure cache are used interchangeably herein. Other examples may also exist.

In a further aspect, instruction sequences typically used to set the TOC register are replaced by the Set TOC Register instruction. As an example, an instruction sequence includes one or more instructions. Further, a verify operation may be used to verify the TOC register value. The TOC register may be, for example, a hardware register or an architected register, such as a general-purpose register (e.g., r2, r12), defined by the architecture or specified by an application binary interface (ABI). Other examples are possible.

In yet a further aspect, the TOC pointer value is predicted responsive to branching to a subroutine.

In yet a further aspect, embodiments of a TOC cache are provided to facilitate processing. A TOC cache (or other reference data structure cache) is, for example, a high-speed in-processor cache that includes various TOC pointer values to be predicted for different locations/modules in recently used programs.

Further, an aspect is provided to prepare and initialize a TOC tracking structure for TOC pointer value prediction. The TOC tracking structure may be, for example, a TOC cache or an in-memory table populated with TOC pointer values to be predicted for different locations/modules in a program.

In a further aspect, a pseudo-register (also referred to herein as a read-only TOC register) is used to provide the pointer value, along with TOC register addressing modes. The pseudo-register is not a hardware or architected register, nor does it have storage associated with it; instead, it is a TOC pointer value obtained, for example, from a TOC cache (e.g., a value that has been produced by STR).

Moreover, in a further aspect, code is generated and/or compiled using the Set TOC Register instruction and/or using the read-only TOC register.

Various aspects are described herein. Further, many variations are possible without departing from the spirit of aspects of the present invention. It should be noted that, unless otherwise inconsistent, each aspect or feature described herein, and variations thereof, may be combined with any other aspect or feature.

One embodiment of a computing environment to incorporate and use one or more aspects of the present invention is described with reference to FIG. 1A. In one example, the computing environment is based on the z/Architecture offered by International Business Machines Corporation, Armonk, New York. One embodiment of the z/Architecture is described in "z/Architecture Principles of Operation," IBM Publication No. SA22-7832-10 (March 2015), which is hereby incorporated herein by reference in its entirety. Z/ARCHITECTURE is a registered trademark of International Business Machines Corporation, Armonk, New York, USA.

In another example, the computing environment is based on the Power Architecture offered by International Business Machines Corporation, Armonk, New York. One embodiment of the Power Architecture is described in "Power ISA™ Version 2.07B," International Business Machines Corporation, April 9, 2015, which is hereby incorporated herein by reference in its entirety. POWER ARCHITECTURE is a registered trademark of International Business Machines Corporation, Armonk, New York, USA.

The computing environment may also be based on other architectures, including, but not limited to, the Intel x86 architectures. Other examples also exist.

As shown in FIG. 1A, a computing environment 100 includes, for example, a computer system 102 shown in the form of a general-purpose computing device. Computer system 102 may include, but is not limited to, one or more processors or processing units 104 (e.g., central processing units (CPUs)), a memory 106 (referred to as main memory or storage, as examples), and one or more input/output (I/O) interfaces 108, coupled to one another via one or more buses and/or other connections 110.

Bus 110 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example and not limitation, such architectures include Industry Standard Architecture (ISA), Micro Channel Architecture (MCA), Enhanced ISA (EISA), Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI).

Memory 106 may include, for example, a cache 120, such as a shared cache, which may be coupled to local caches 122 of processors 104. Further, memory 106 may include one or more programs or applications 130, an operating system 132, and one or more computer-readable program instructions 134. Computer-readable program instructions 134 may be configured to carry out functions of embodiments of aspects of the present invention.

Computer system 102 may also communicate via, for example, I/O interfaces 108 with one or more external devices 140, one or more network interfaces 142, and/or one or more data storage devices 144. Example external devices include a user terminal, a tape drive, a pointing device, a display, etc. Network interface 142 enables computer system 102 to communicate with one or more networks, such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet), providing communication with other computing devices or systems.

Data storage device 144 may store one or more programs 146, one or more computer-readable program instructions 148, and/or data, etc. The computer-readable program instructions may be configured to carry out functions of embodiments of aspects of the present invention.

Computer system 102 may include and/or be coupled to removable/non-removable, volatile/non-volatile computer system storage media. For example, it may include and/or be coupled to a non-removable, non-volatile magnetic medium (typically called a "hard drive"), a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk"), and/or an optical disk drive for reading from or writing to a removable, non-volatile optical disk, such as a CD-ROM, DVD-ROM, or other optical media. It should be understood that other hardware and/or software components could be used in conjunction with computer system 102. Examples include, but are not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems.

Computer system 102 may be operational with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with computer system 102 include, but are not limited to, personal computer (PC) systems, server computer systems, thin clients, thick clients, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems or devices.

Further details regarding one example of processor 104 are described with reference to FIG. 1B. Processor 104 includes a plurality of functional components used to execute instructions. These functional components include, for example, an instruction fetch component 150 to fetch instructions to be executed; an instruction decode unit 152 to decode the fetched instructions and to obtain operands of the decoded instructions; an instruction execute component 154 to execute the decoded instructions; a memory access component 156 to access memory for instruction execution, if necessary; and a write-back component 160 to provide the results of the executed instructions. In accordance with an aspect of the present invention, one or more of these components may be used to execute one or more instructions and/or operations associated with table of contents (TOC) pointer processing 166.

In one embodiment, processor 104 also includes one or more registers 168 to be used by one or more of the functional components. Processor 104 may include additional, fewer, and/or other components than the examples provided herein.

Further details regarding an execution pipeline of processor 104 are described with reference to FIG. 1C. Although various processing stages of the pipeline are depicted and described herein, it should be understood that additional, fewer, and/or other stages may be used without departing from the spirit of aspects of the present invention.

Referring to FIG. 1C, in one embodiment, an instruction is fetched 170 from an instruction queue, and branch prediction 172 and/or decoding 174 of the instruction may be performed. The decoded instruction may be added to a group of instructions 176 to be processed together. The grouped instructions are provided to a mapper 178 that determines any dependencies, assigns resources, and dispatches the group of instructions/operations to the appropriate issue queues. There are one or more issue queues for the different types of execution units, including, for example, branch, load/store, floating point, fixed point, vector, etc. During an issue stage 180, an instruction/operation is issued to the appropriate execution unit. Any registers are read 182 to retrieve their sources, and the instruction/operation executes during an execute stage 184. As indicated, the execution may be, as examples, for a branch, a load (LD) or a store (ST), a fixed point operation (FX), a floating point operation (FP), or a vector operation (VX). Any results are written to the appropriate register(s) during a write-back stage 186. Subsequently, the instruction completes 188. If there is an interrupt or flush 190, processing may return to instruction fetch 170.
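The stage ordering just described, including the return to instruction fetch on a flush, can be pictured with a short trace model. This is only a sketch under stated assumptions: the stage names in `STAGES` are informal labels for reference numerals 170 through 188, branch prediction and decode are shown as always occurring, and `run_instruction` is a hypothetical helper that models at most one flush.

```python
# Informal labels for the pipeline stages 170-188 described in the text.
STAGES = [
    "fetch",           # 170
    "branch_predict",  # 172
    "decode",          # 174
    "group",           # 176
    "map",             # 178
    "issue",           # 180
    "register_read",   # 182
    "execute",         # 184
    "write_back",      # 186
    "complete",        # 188
]


def run_instruction(flush_at=None):
    """Return the list of stages one instruction visits.

    If flush_at names a stage, a single flush (190) is modeled at that
    stage and processing returns to instruction fetch.
    """
    trace = []
    flushed = False
    index = 0
    while index < len(STAGES):
        stage = STAGES[index]
        trace.append(stage)
        if stage == flush_at and not flushed:
            flushed = True
            index = 0  # flush: restart from instruction fetch
            continue
        index += 1
    return trace
```

Calling `run_instruction()` walks the stages once in order, while `run_instruction("execute")` shows the stages up to execute followed by a full second pass starting at fetch.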

Further, in one example, coupled to the decode unit is a register renaming unit 192, which may be used to save and restore registers.

Additional details regarding the processor are described with reference to FIG. 1D. In one example, a processor such as processor 104 is a pipelined processor that may include, for example, prediction hardware, registers, caches, decoders, an instruction sequencing unit, and instruction execution units. The prediction hardware includes, for example, a local branch history table (BHT) 105a, a global branch history table (BHT) 105b, and a global selector 105c. The prediction hardware is accessed through an instruction fetch address register (IFAR) 107, which holds the address of the next instruction fetch.

The same address is also provided to an instruction cache 109, which may fetch a plurality of instructions referred to as a "fetch group." Associated with instruction cache 109 is a directory 111.

The cache and the prediction hardware are accessed at approximately the same time with the same address. If the prediction hardware has prediction information available for an instruction in the fetch group, that prediction is forwarded to an instruction sequencing unit (ISU) 113, which in turn issues instructions to the execution units for execution. The prediction may be used to update IFAR 107 in conjunction with branch target calculation 115 and branch target prediction hardware (such as a link register prediction stack 117a and a count register stack 117b). If no prediction information is available, but one or more instruction decoders 119 find a branch instruction in the fetch group, a prediction is created for that fetch group. Predicted branches are stored in the prediction hardware, such as in a branch information queue (BIQ) 125, and forwarded to ISU 113.

A branch execution unit (BRU) 121 operates in response to instructions issued to it by ISU 113. BRU 121 has read access to a condition register (CR) file 123. Branch execution unit 121 further has access to information stored by the branch scan logic in branch information queue 125 to determine the success of a branch prediction, and is operatively coupled to the instruction fetch address register(s) (IFAR) 107 corresponding to the one or more threads supported by the microprocessor. In accordance with at least one embodiment, BIQ entries are associated with, and identified by, an identifier, such as a branch tag (BTAG). When a branch associated with a BIQ entry is completed, it is so marked. BIQ entries are maintained in a queue, and the oldest queue entry is de-allocated sequentially once it is marked as containing information associated with a completed branch. BRU 121 is further operatively coupled to cause a predictor update when BRU 121 discovers a branch misprediction.

When the instruction is executed, BRU 121 detects whether the prediction is wrong. If so, the prediction is to be updated. To this end, the processor also includes predictor update logic 127. Predictor update logic 127 is responsive to an update indication from branch execution unit 121 and is configured to update array entries in one or more of the local BHT 105a, global BHT 105b, and global selector 105c. The predictor hardware 105a, 105b, and 105c may have write ports distinct from the read ports used by the instruction fetch and prediction operation, or a single read/write port may be shared. Predictor update logic 127 may further be operatively coupled to link stack 117a and count register stack 117b.

Referring now to the condition register file (CRF) 123, CRF 123 is read-accessible by BRU 121 and can be written by the execution units, including but not limited to a fixed point unit (FXU) 141, a floating point unit (FPU) 143, and a vector multimedia extension unit (VMXU) 145. A condition register logic execution unit (CRL execution) 147 (also referred to as the CRU) and special purpose register (SPR) handling logic 149 have read and write access to CRF 123. CRU 147 performs logical operations on the condition registers stored in CRF 123. FXU 141 is able to perform write updates to CRF 123.

Processor 104 further includes a load/store unit 151, various multiplexers 153 and buffers 155, as well as address translation tables 157 and other circuitry.

Processor 104 executes programs (also referred to as applications) that include variables. A variable has an identifier (e.g., a name) and refers to a storage location that holds a value (e.g., information, data). During runtime, a program uses the TOC to determine the addresses of variables that were not known at compile time.

When a subroutine is called, it establishes its own TOC, because if it is in a different module than the function that called it, it has its own data dictionary (i.e., TOC), and a pointer to that dictionary must be established. Establishing such a pointer is expensive.

One example of code used to establish a TOC pointer is shown below, with reference to an example ABI, such as the OpenPOWER ELFv2 ABI.

In accordance with one such example embodiment, the caller initializes one or more registers with the address of the called function, e.g., in accordance with the ABI.

In the following example, two registers, r12 and ctr, are initialized with the address of the called function:

In accordance with the established ABI, the called function initializes the TOC pointer. Various implementations exist. In one embodiment, when a function is called via a register-indirect call, the entry address held in the one or more registers initialized by the caller is used. For example, in accordance with an example ABI, such as the OpenPOWER ELFv2 ABI, the TOC pointer register r2 of a function called "foo" may be initialized as follows, using the callee's function entry address loaded into r12 by the caller:

In accordance with an aspect of the present invention, instead of determining the TOC pointer using, for instance, the code above, which is expensive in many microprocessor implementations, a Set TOC Register (STR) instruction is used. The Set TOC Register instruction loads the value of a pointer to the TOC into a register (or other defined location), e.g., by performing a lookup in the processor. Since the TOC is shared by all (or a set of) functions of a module, only a small number of TOC register values need to be remembered and associated with address ranges. As examples, the Set TOC Register instruction may be implemented as an architected hardware instruction or as an internal operation.

One example of a Set TOC Register (STR) instruction is described with reference to FIG. 2. In one example, a Set TOC Register instruction 200 includes an operation code (opcode) field 202 containing an opcode that indicates a Set TOC Register operation, and a target register (RT) field 204 specifying a location, such as a register, to receive the value of the TOC pointer.

Although one opcode field is shown in this example, in other embodiments there may be a plurality of opcode fields. Other variations are also possible.

As indicated, in one example, target register field 204 identifies the register to be loaded with the TOC pointer value. The STR instruction loads the register specified by field 204 with the TOC pointer value for the current code sequence, where the code sequence corresponds to the code following the address of the STR instruction.

There are a number of possible implementations of the processing associated with the STR instruction, including, for instance, a software implementation, a hardware-assisted implementation, and a hardware implementation. In the software implementation, based on executing the STR instruction, an exception is raised, and the setting of the TOC register is emulated by supervisor code (e.g., the operating system or a hypervisor) or by a user-mode interrupt handler (e.g., using an event-based branch facility as defined by the Power Architecture). In the hardware-assisted implementation, the hardware provides a cache (e.g., a small table or other data structure that stores the most frequently used values) or a predictor for frequent values, and traps to software otherwise. The supervisor code or user-mode interrupt handler then processes the instruction, as described above. In the hardware implementation, the hardware provides a cache or predictor for frequent values and, based on a miss in the cache, searches a table (or other data structure that has been populated by software with TOC pointer values). Further details regarding the implementation choices are described with reference to FIGS. 3-5.

One implementation using software (e.g., supervisor code or a user-mode interrupt handler) is described with reference to FIG. 3. Referring to FIG. 3, in one example, an STR instruction is received by a processor, step 300, and a trap is raised to a handler routine, such as a supervisor (e.g., the operating system (OS) or a hypervisor (HV)) or user-mode interrupt code, step 310. The handler routine is entered, step 320, and the handler looks up, e.g., in a cache or table, the TOC pointer value for the function corresponding to the address of the STR instruction, step 330. The obtained TOC value is loaded into the target register of the STR instruction, step 340. Thereafter, processing returns to the code after the STR instruction to continue execution with the obtained TOC value, step 350.
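The software path of FIG. 3 can be illustrated with a minimal sketch. It is not the patented implementation: the `Cpu` class, the `TOC_TABLE` layout, the example addresses, and the assumption of fixed 4-byte instructions are all hypothetical choices made only for illustration.

```python
# Hypothetical model of FIG. 3: a trap handler emulates STR in software.
# TOC_TABLE, Cpu, and all addresses are illustrative assumptions.

TOC_TABLE = [
    # (first_address, last_address, toc_pointer), one row per loaded module
    (0x1000, 0x1FFF, 0x20000),
    (0x4000, 0x4FFF, 0x58000),
]

INSTRUCTION_BYTES = 4  # assumption: fixed-width 4-byte instructions


class Cpu:
    def __init__(self):
        self.regs = [0] * 32
        self.pc = 0


def str_trap_handler(cpu, rt):
    """Emulate 'STR rt' located at cpu.pc (steps 320 to 350)."""
    for first, last, toc in TOC_TABLE:       # step 330: look up the TOC value
        if first <= cpu.pc <= last:
            cpu.regs[rt] = toc               # step 340: load the target register
            cpu.pc += INSTRUCTION_BYTES      # step 350: resume after the STR
            return toc
    raise LookupError(f"no TOC registered for address {cpu.pc:#x}")
```

In this sketch the lookup is a linear scan because only a small number of modules, and therefore TOC values, is expected per process.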

Another implementation, a hardware-assisted implementation, is described with reference to FIG. 4. Referring to FIG. 4, an STR instruction is received by a processor, step 400, and a TOC cache lookup is performed to locate the TOC value for the function that includes the STR instruction, step 402. A determination is made as to whether a TOC cache entry for the function is found, inquiry 404. If a TOC cache entry is found, the result of the TOC cache lookup is loaded into the STR target register, step 406. Otherwise, a trap to the handler routine is raised, as described above, step 408. For instance, the handler routine is entered, step 410, and a lookup is performed, e.g., in a table, for the TOC value of the function corresponding to the address of the STR instruction, step 412. The TOC value is loaded into the TOC cache, step 414, and the obtained TOC value is loaded into the target register of the STR instruction, step 416. Processing then returns to the instruction after the STR instruction to continue execution with the obtained TOC value, step 418.
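The hit and miss paths of FIG. 4 can be sketched as follows. This is a simplified model under stated assumptions: the `TocCache` class, its capacity, its oldest-first eviction, the `stats` counters, and the table layout are hypothetical and are not taken from the description.

```python
# Hypothetical model of FIG. 4: a small TOC cache with a software fallback.

class TocCache:
    """Small cache keyed by STR instruction address (an assumed organization)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}  # dicts keep insertion order, used for eviction

    def lookup(self, addr):
        return self.entries.get(addr)

    def fill(self, addr, toc):
        if addr not in self.entries and len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))  # evict the oldest entry
        self.entries[addr] = toc


def software_lookup(addr, table):
    """Handler-side lookup over (first, last, toc) rows (steps 410 to 412)."""
    for first, last, toc in table:
        if first <= addr <= last:
            return toc
    raise LookupError(f"no TOC registered for address {addr:#x}")


def execute_str(addr, cache, table, stats):
    """Return the value that STR at 'addr' writes to its target register."""
    toc = cache.lookup(addr)             # step 402
    if toc is not None:                  # inquiry 404: hit
        stats["hits"] += 1
        return toc                       # step 406
    stats["traps"] += 1                  # step 408: trap to the handler
    toc = software_lookup(addr, table)   # steps 410 to 412
    cache.fill(addr, toc)                # step 414
    return toc                           # step 416
```

Executing the same STR twice with this model takes the trap only on the first execution; the second is served from the cache.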

A further implementation, in which hardware performs the processing, is described with reference to FIG. 5. Referring to FIG. 5, an STR instruction is received, step 500, and a TOC cache lookup is performed to locate the TOC value for the function that includes the STR instruction, step 502. A determination is made as to whether a TOC cache entry for the function is found, inquiry 504. If a TOC cache entry is found, the result of the TOC cache lookup is loaded into the STR target register, step 506. However, if no TOC cache entry is found, TOC cache miss handling logic is performed, step 510. This includes, for instance, determining the start of the lookup table, step 512, and looking up, in one or more tables or other data structures, the TOC value for the function corresponding to the address of the STR, step 514. The TOC value that is found (e.g., an address) is loaded into the TOC cache, step 516, and the obtained TOC value is loaded into the target register of the STR, step 518.

In one or more of the above examples, the TOC cache may be implemented in a number of ways. For instance, it may hold pairs (STR address, return value) that associate the address of each STR instruction with the value to be returned. Alternatively, because adjacent functions typically share a TOC, it may hold a range of STR instruction addresses for which a specified value is to be returned, e.g., by storing triples (from_range, to_range, return value) in a table. Further details regarding TOC caches are described below.
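The range-based organization can be sketched as a sorted table of (from_range, to_range, return value) triples searched by binary search. The class name `RangeTocCache`, the inclusive bounds, and the assumption that ranges do not overlap are illustrative choices, not requirements stated in the description.

```python
import bisect


class RangeTocCache:
    """Table of (from_range, to_range, value) triples, sorted by from_range.

    Assumptions for this sketch: bounds are inclusive and ranges never overlap.
    """

    def __init__(self):
        self._starts = []  # sorted from_range values
        self._rows = []    # parallel list of (from_range, to_range, value)

    def insert(self, from_range, to_range, value):
        i = bisect.bisect_left(self._starts, from_range)
        self._starts.insert(i, from_range)
        self._rows.insert(i, (from_range, to_range, value))

    def lookup(self, str_addr):
        """Return the value for the range containing str_addr, or None."""
        i = bisect.bisect_right(self._starts, str_addr) - 1
        if i >= 0:
            lo, hi, value = self._rows[i]
            if lo <= str_addr <= hi:
                return value
        return None
```

Compared with one (STR address, return value) pair per instruction, a single triple covers every STR instruction in a module, so far fewer entries are needed.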

Although in the above embodiments STR is used to load a TOC value, STR may also be used to load other values, such as a magic number (e.g., an identifier in the Executable and Linkable Format (ELF)) or another value that may be associated with a region of code, a specific module, or a specific instruction address of the STR instruction. Many possibilities exist.

In a further aspect, code is scanned for instruction sequences that set the value of the TOC register, and those instruction sequences are replaced with a Set TOC Register instruction. In yet a further aspect, a verify instruction is provided to verify a prediction of the value of the TOC register. As examples, an instruction sequence includes one or more instructions.

In accordance with conventional code generation techniques, the TOC value is commonly computed using a sequence of instructions or loaded from the stack.

For example, the TOC value may be computed using a sequence such as:

In another example, the TOC value is loaded (ld) from memory (e.g., the stack):

ld r2,sp,<stackoffset for TOC>    // sp is the stack pointer

These sequences typically involve interlocks (e.g., the need to wait for a previous store instruction that stored the TOC value to complete) before they can complete. This type of interlock commonly degrades performance. Therefore, in accordance with an aspect of the present invention, the processor instruction decode unit recognizes TOC setting instructions and/or TOC setting instruction sequences and replaces them with an STR instruction. Optionally, a verify instruction is also provided. As used herein, a TOC setting instruction and/or a TOC setting instruction sequence includes one or more instructions used to set the TOC register or to compute the TOC pointer value.

For example, in one embodiment, the following instruction sequence is recognized by the processor (e.g., by the instruction decode unit of the processor):

addis r2,r12,offset@h
addi r2,r2,offset@l

and the sequence is replaced by the following operations, which load the (predicted) TOC value and verify the prediction by comparing it against the sum of register r12 and the offset used in the original code to compute r2:

STR r2
verify r2,r12,offset

In a further example:

ld r2,sp,<stackoffset for TOC>

is replaced by:

STR r2
load-verify r2,sp,<stackoffset>

Examples of the STR instruction are described above, and further details regarding use of the verify operations are described below. For instance, further details associated with using an STR verify internal operation (iop), e.g., verify rx,ry,offset, are described with reference to FIG. 6A.

Referring to FIG. 6A, a verify technique performed by, e.g., a processor is described. Initially, a verify internal operation is received (e.g., internal operation verify rx,ry,offset, with two example register operands and an immediate operand, such as verify r2,r12,offset in the example code above), step 600. A variable a is computed by adding the offset of the verify operation to the value of the base register ry (e.g., r12) of the verify internal operation, step 602. A determination is made as to whether the value in the target register rx (e.g., r2) of the verify internal operation is equal to the computed value a, inquiry 604. If the value of rx is equal to the computed value a, the verification is complete, step 606, and successful.

However, if the value of rx is not equal to a, then a is assigned to the target register rx, step 608, and recovery is initiated, step 610. Recovery includes, for instance, flushing the incorrect uses of rx from the instruction pipeline after the current instruction, or flushing all instructions in the pipeline after the current instruction. Other variations are also possible.
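The verify operation of FIG. 6A reduces to a compare-and-correct step, sketched below. The register file is modelled as a plain dictionary and 64-bit wraparound arithmetic is assumed; the pipeline flush itself is not modelled, so the function only reports whether recovery would be needed.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers with wraparound addition


def verify(regs, rx, ry, offset):
    """Model 'verify rx,ry,offset'. Returns True if the prediction in rx held."""
    a = (regs[ry] + offset) & MASK64   # step 602: base register plus offset
    if regs[rx] == a:                  # inquiry 604
        return True                    # step 606: verification succeeded
    regs[rx] = a                       # step 608: overwrite the bad prediction
    return False                       # caller would initiate recovery, step 610
```

Because a mismatch writes the computed value into rx, the register holds the correct TOC value after the operation regardless of whether the prediction was right.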

In a further embodiment, with reference to FIG. 6B, the computed value (e.g., the TOC pointer, also referred to as the TOC pointer address) is loaded into the TOC cache, step 620.

Other examples of verify techniques performed by, e.g., a processor are described with reference to FIGS. 7A-7B. Referring to FIG. 7A, in one example, a load-verify internal operation is received, step 700. A value for a variable a is computed. For instance, the value at the memory address given by ry (e.g., the stack pointer) plus the offset is assigned to variable a, step 702. A determination is made as to whether the value of the base register rx (e.g., r2) is equal to a, inquiry 704. If the value in rx is equal to the computed value a, the verification is complete and successful, step 706. However, if the computed value a is not equal to the value in rx, then a is assigned to rx, step 708. Further, recovery is initiated, step 710. Recovery includes, for instance, flushing the incorrect uses of rx or flushing all instructions in the pipeline after the current instruction. Other variations are possible.
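The load-verify variant of FIG. 7A differs from the register-based verify only in where the reference value comes from. In the sketch below, memory is modelled as a dictionary from address to value, and the value is read from the address formed by ry plus the offset, which is this sketch's reading of the load form `ld r2,sp,<stackoffset>`; recovery is again only signalled, not modelled.

```python
def load_verify(regs, memory, rx, ry, offset):
    """Model 'load-verify rx,ry,offset'. Returns True if the prediction held."""
    a = memory[regs[ry] + offset]   # step 702: load the saved TOC value
    if regs[rx] == a:               # inquiry 704
        return True                 # step 706
    regs[rx] = a                    # step 708
    return False                    # caller would initiate recovery, step 710
```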

In a further embodiment, with reference to FIG. 7B, the computed value (e.g., the TOC pointer or address) is loaded into the TOC cache, step 720.

In a further embodiment, different execution paths may be taken depending on whether the TOC value is in the TOC cache. One example of this processing is performed by, e.g., a processor and is described with reference to FIG. 8. Initially, a determination is made as to whether there is an opportunity to replace an instruction sequence used to determine the TOC value (e.g., an opportunity to fuse multiple instructions into an iop sequence), inquiry 800. That is, is there an opportunity to replace the instruction sequence with an STR and, optionally, a verify, or to perform some other instruction replacement? If not, conventional processing is performed to determine the TOC value (e.g., using the instruction sequence addis/addi or a load instruction), step 802. However, if there is a TOC value replacement opportunity, a lookup is performed in the TOC cache to determine whether there is a value for the routine that includes the STR, step 804. If there is a TOC cache hit, inquiry 806, the target register of the STR is updated with the TOC value, step 808. Further, in one example, verification is performed, step 810. However, returning to inquiry 806, if there is no TOC cache hit, the TOC value is generated by, e.g., a sequence of compute instructions (e.g., addis, addi) or a load instruction, step 812. The computed value is loaded into the TOC cache, step 814, and the target register is updated, step 816.
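The three paths of FIG. 8 can be summarized in a short sketch. The function name, the use of a dictionary as the TOC cache, the `compute_toc` callback standing in for the original addis/addi or load sequence, and the returned path labels are all hypothetical conveniences.

```python
def resolve_toc(routine_addr, toc_cache, compute_toc, can_replace):
    """Return (toc_value, path) following the decisions of FIG. 8."""
    if not can_replace:                        # inquiry 800
        return compute_toc(), "conventional"   # step 802
    toc = toc_cache.get(routine_addr)          # step 804
    if toc is not None:                        # inquiry 806: hit
        return toc, "predicted"                # step 808; verification (step 810) would follow
    toc = compute_toc()                        # step 812
    toc_cache[routine_addr] = toc              # step 814
    return toc, "computed"                     # step 816
```

The first replaceable encounter of a routine takes the "computed" path and fills the cache, so later encounters take the faster "predicted" path.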

Other implementations and variations are also possible.

In a further aspect, the TOC value is predicted based on entering a subroutine. For instance, the TOC value is predicted when a subroutine call is performed, rather than waiting until an instruction sequence believed to compute the TOC value is found. When an instruction sequence that computes the TOC value is then encountered in the called routine, it is replaced with a TOC check instruction (i.e., an instruction that checks or verifies the predicted TOC value). If the TOC check instruction fails, or if the TOC value is accessed without the prediction having been checked, recovery may be performed.

As one example, the processor predicts the value of the TOC register (e.g., r2) for a subroutine based on previously observed addresses. As examples, the predicted TOC value is entered into a target address register array together with the predicted target address, or into a separate TOC prediction array.

In particular embodiments, the TOC value may be predicted using, for instance, the hardware-assisted technique described with reference to FIG. 4 and/or the hardware technique described with reference to FIG. 5. In a further embodiment, the TOC value is obtained by computing it with the instruction sequence in legacy code and using the result to initialize the TOC cache. Other possibilities also exist.

One embodiment of predicting the TOC value based on a subroutine branch is described with reference to FIG. 9. This processing is performed by, e.g., a processor. Referring to FIG. 9, initially, a determination is made as to whether the subroutine call is a candidate for predicting the TOC value, inquiry 900. For instance, is the subroutine call a register-indirect branch (in which the branch instruction specifies the location of the address of the next instruction to execute, rather than the address itself)? In other embodiments, branches other than those to module-local functions are considered candidates, or filters or other mechanisms may be provided to determine candidacy. If the call is not a candidate, conventional processing is performed, step 902. However, if the subroutine call is a candidate for predicting the TOC value, the subroutine call is performed, step 904. The call may be coupled with prediction of other types of values in addition to the TOC value. Additionally, the old TOC value is saved in a recovery location, such as a register TOCRECOVER, step 906. Further, the TOC value is predicted, step 908. Various techniques may be used to predict the TOC value, as described herein. The predicted TOC value is then loaded into the TOC pointer register (e.g., r2), step 910. As examples, the identity of the TOC register may be hard-coded or may be configurable. Further, in one example, a flag or other indicator maintained in a selected location is set (e.g., to 1) to indicate that a TOC check (i.e., a check of the TOC value) is to be performed before the TOC value is used, step 912.
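The call-time steps of FIG. 9 can be sketched as a small state update. `CoreState`, the `predictor` dictionary (assumed here to hold a predicted TOC value for every call target), and the boolean return value are hypothetical; the branch itself is not modelled.

```python
from dataclasses import dataclass, field

TOC_REG = 2  # r2, the TOC pointer register in the examples above


@dataclass
class CoreState:
    regs: dict = field(default_factory=lambda: {TOC_REG: 0})
    toc_recover: int = 0      # stands in for the TOCRECOVER register
    toc_check: bool = False   # flag: a TOC check is required before any use


def call_with_toc_prediction(state, target, is_register_indirect, predictor):
    """Apply steps 904 to 912 of FIG. 9; returns True if a prediction was made."""
    if not is_register_indirect:                  # inquiry 900: not a candidate
        return False                              # step 902: conventional processing
    # Step 904, the subroutine call itself, is outside the scope of this sketch.
    state.toc_recover = state.regs[TOC_REG]       # step 906: save the old TOC value
    state.regs[TOC_REG] = predictor[target]       # steps 908 to 910: predict and load
    state.toc_check = True                        # step 912: demand a check before use
    return True
```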

Further details regarding the TOC check, and in particular the insertion logic for the TOC check, are described with reference to FIG. 10. In one example, this logic is integrated in the decode unit. Initially, an instruction is obtained and decoded, step 1000. A determination is made as to whether the TOC check flag is set, inquiry 1002. If it is not set, this processing is complete. However, if the TOC check flag is set (e.g., to 1), a further determination is made as to whether the current instruction corresponds to a TOC setting instruction (e.g., a sequence of one or more instructions that sets (e.g., loads, stores, provides, inserts, places) the TOC value in, e.g., the TOC register, such as a load instruction or a sequence of instructions that computes the TOC value), inquiry 1004. If the current instruction corresponds to a TOC setting instruction, a TOC check is inserted in the code, step 1006. For instance, an STR verify or STR load-verify instruction replaces the one or more instructions in the code that compute the TOC value. The parameters of the verify instruction are derived from the computation sequence being replaced, e.g., as in the examples shown above. Thus, an instruction sequence based on compute instructions may be replaced with a verify instruction that computes the address in the same manner as the compute instruction(s), e.g., one or more add instructions are replaced with a verify that performs the corresponding additions; and a load instruction may be replaced with a load-verify instruction that obtains the value to compare against from the same location from which the replaced load instruction would have loaded the TOC register. Additionally, the TOC check flag is turned off (e.g., set to 0), step 1008.

Returning to inquiry 1004, if the current instruction does not correspond to a TOC setting instruction, a further determination is made as to whether the current instruction corresponds to a TOC use instruction (i.e., one or more instructions that use the TOC value or the TOC register), inquiry 1010. If not, processing is complete. Otherwise, recovery may be performed, step 1012. In one embodiment, this is accomplished by copying the value in TOCRECOVER back into the TOC register (e.g., r2). In another embodiment, register renaming may be used. In that embodiment, the predicted TOC value is stored in a new rename register, and during recovery the new rename register is invalidated, or the old TOC value is copied from another rename register into the new rename register. Other implementations and/or embodiments are also possible.
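The decode-time decisions of FIG. 10 can be sketched as a function applied to each decoded instruction. The classification of instructions into the strings "toc_set", "toc_use", and "other", the dictionary-based state, and the returned action labels are assumptions of this sketch. The description does not say whether the check flag is cleared after recovery, so the sketch leaves it unchanged in that case.

```python
def decode_step(state, kind):
    """Apply FIG. 10 to one decoded instruction.

    'kind' is "toc_set", "toc_use", or "other" (a hypothetical classification).
    Returns the action taken: "insert_check", "recover", or "none".
    """
    if not state["toc_check"]:                # inquiry 1002: flag not set
        return "none"
    if kind == "toc_set":                     # inquiry 1004
        state["toc_check"] = False            # step 1008: turn the flag off
        return "insert_check"                 # step 1006: verify replaces the sequence
    if kind == "toc_use":                     # inquiry 1010: use before any check
        state["r2"] = state["toc_recover"]    # step 1012: restore the old TOC value
        return "recover"
    return "none"
```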

Another embodiment of predicting the TOC value based on a subroutine branch is described with reference to FIG. 11. This processing is performed by, e.g., a processor. Referring to FIG. 11, initially, a determination is made as to whether the subroutine call is a candidate for predicting the TOC value, inquiry 1100. In one embodiment, register-indirect branches are predicted. In other embodiments, module-local functions are excluded, and/or a filter may further suppress candidate status based on the called address or on the (caller address, callee address) pair. Other possibilities also exist. If the subroutine call is not a candidate, conventional processing is performed, step 1102.

Returning to inquiry 1100, if the subroutine call is a candidate for predicting the TOC value, the subroutine call is made, step 1104. Optionally, other affiliated values may be predicted in addition to the TOC value. Further, the old TOC value is saved in, e.g., a recovery register, TOCRECOVER, step 1106. An attempt is then made to predict the TOC value using the TOC cache, step 1108. A determination is made as to whether there is a TOC cache hit, inquiry 1110. If there is a TOC cache hit, the obtained TOC value is loaded into the TOC pointer register (e.g., r2), step 1112. Further, the TOC check flag is set (e.g., set to 1), indicating that a check of the TOC value is to be performed before the predicted TOC value is used, and in one embodiment, a TOC capture flag, located in a selected location, is turned off (e.g., set to 0), step 1114. Returning to inquiry 1110, if there is a TOC cache miss, the TOC capture flag is set (e.g., set to 1) to indicate that a TOC capture is to be performed to obtain the TOC value, and the TOC check flag is turned off (e.g., set to 0), step 1116. Other variations are also possible.
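The FIG. 11 flow can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware design: the names `PredictorState` and `on_candidate_call` are hypothetical, and a dictionary keyed by callee address stands in for the TOC cache.

```python
from dataclasses import dataclass, field


@dataclass
class PredictorState:
    toc_reg: int = 0        # TOC pointer register (e.g., r2)
    toc_recover: int = 0    # TOCRECOVER recovery register
    toc_check: int = 0      # TOC check flag
    toc_capture: int = 0    # TOC capture flag
    # Hypothetical stand-in for the TOC cache: callee address -> TOC value
    toc_cache: dict = field(default_factory=dict)


def on_candidate_call(state, callee):
    """Model steps 1106-1116 for a call already accepted as a candidate."""
    state.toc_recover = state.toc_reg            # step 1106: save old TOC
    predicted = state.toc_cache.get(callee)      # step 1108: try the cache
    if predicted is not None:                    # inquiry 1110: hit
        state.toc_reg = predicted                # step 1112
        state.toc_check, state.toc_capture = 1, 0    # step 1114
        return True
    state.toc_capture, state.toc_check = 1, 0    # step 1116: miss
    return False
```

On a hit the predicted value becomes visible in the TOC register immediately, which is why the check flag must stay set until a later TOC setting instruction has been verified.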

Details of the check insertion logic for the embodiment of FIG. 11 are described with reference to FIG. 12. In one embodiment, this logic is integrated into the decode unit. Initially, an instruction is obtained and decoded, step 1200. A determination is made as to whether the current instruction corresponds to a TOC setting instruction, inquiry 1202. If it does not, a determination is made as to whether the current instruction corresponds to a TOC use instruction, inquiry 1204. If not, processing is complete. Otherwise, a further determination is made as to whether the TOC check flag is set, inquiry 1206. If not, processing is again complete. Otherwise, recovery may be performed, step 1208. In one embodiment, recovery includes copying the value in the TOCRECOVER register back into the TOC register (e.g., r2), or using rename registers, as described above. Other variations are possible.

Returning to inquiry 1202, if the current instruction corresponds to a TOC setting instruction, a check is inserted in the code, step 1210. For instance, an STR verify or an STR load-verify is inserted. The TOC check flag is then turned off (e.g., set to 0), step 1212.
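The decision structure of FIG. 12 can be summarized in a few lines. In this sketch the instruction classification (`"toc_set"`, `"toc_use"`, `"other"`), the `flags` dictionary, and the returned action strings are all assumptions made for illustration; the decode unit would act on decoded instructions directly.

```python
def check_insertion_step(kind, flags):
    """Return the action the decode logic of FIG. 12 takes for one instruction.

    kind  -- hypothetical classification: "toc_set", "toc_use", or "other"
    flags -- dict holding the "toc_check" flag (0 or 1)
    """
    if kind == "toc_set":                      # inquiry 1202
        flags["toc_check"] = 0                 # step 1212
        return "insert_check"                  # step 1210 (e.g., STR verify)
    if kind == "toc_use" and flags["toc_check"]:   # inquiries 1204, 1206
        return "recover"                       # step 1208
    return "none"                              # processing complete
```

A TOC use seen while the check flag is still set means the predicted value was never verified, so the old value is restored.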

Another embodiment of TOC check insertion logic is described with reference to FIG. 13. In one example, this logic is integrated into the decode unit. Referring to FIG. 13, an instruction is obtained and decoded, step 1300. A determination is made as to whether the current instruction corresponds to a TOC setting instruction, inquiry 1302. If the current instruction does not correspond to a TOC setting instruction, a further determination is made as to whether the current instruction corresponds to a TOC use instruction, inquiry 1304. If not, processing ends. Otherwise, a determination is made as to whether the TOC capture flag is set, inquiry 1306. If not, processing is complete. Otherwise, the TOC capture flag is turned off (e.g., set to 0), step 1308. In one embodiment, the TOC cache may record that this function does not load a new TOC value, or a filter (e.g., a Bloom filter) may be updated to suppress TOC prediction with the TOC cache. Other variations are also possible.

Returning to inquiry 1302, if the current instruction corresponds to a TOC setting instruction, a check is inserted, which in one example includes a verify instruction that triggers a recovery action, step 1310, and the TOC capture flag is reset (e.g., set to 0), step 1312.
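The FIG. 13 variant, driven by the capture flag rather than the check flag, can be sketched the same way. The sketch assumes that a TOC setting instruction is what triggers check insertion (the other branch having been covered above); the instruction classification and action strings are hypothetical.

```python
def capture_step(kind, flags):
    """Return the action the decode logic of FIG. 13 takes for one instruction.

    kind  -- hypothetical classification: "toc_set", "toc_use", or "other"
    flags -- dict holding the "toc_capture" flag (0 or 1)
    """
    if kind == "toc_set":                          # inquiry 1302
        flags["toc_capture"] = 0                   # step 1312
        return "insert_check"                      # step 1310
    if kind == "toc_use" and flags["toc_capture"]:     # inquiries 1304, 1306
        flags["toc_capture"] = 0                   # step 1308
        # Optional bookkeeping: note that this function sets no new TOC value.
        return "record_no_toc_load"
    return "none"
```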

In one embodiment, the processing associated with both the TOC check flag and the TOC capture flag may be performed, and in one example, the two may be performed in parallel.

Further details regarding the TOC cache are now described with reference to FIG. 14A. In one example, a TOC cache 1400 includes a plurality of columns, including, e.g., a TOC setter address column 1402; a TOC value column 1404 that includes the TOC value for the module of the entry; an optional function-initializes-TOC column 1406; and an optional usage tracking column 1408. The TOC setter address column 1402 includes a TOC setter address, such as the address of an STR, a function start, or a number of other values based on the particular use case. In one or more embodiments, a set-associative table accessed by the TOC setter address is provided. The function-initializes-TOC column 1406 may be used to capture functions that do not initialize the TOC register. In another embodiment, where using a table entry for this is too expensive, a filter mechanism, such as a Bloom filter or another filter mechanism, may be used to identify functions for which the TOC value should not be predicted. Usage tracking provides a way of selecting the entry to be removed when the table is full and another entry is to be used. Various tracking schemes may be used, including, e.g., least recently used, least frequently used, FIFO (first in, first out), number of uses per time period, etc. In at least one embodiment, column 1408 is adapted to store the usage information appropriate to the replacement policy that is implemented.

One embodiment of inserting an entry into the TOC cache is described with reference to FIG. 14B. Initially, a value pair (e.g., callee, TOC value) to be entered into the TOC cache is received, step 1450. An entry in the cache is selected to store the value pair, step 1452. As examples, index bits may be used to select the entry, or the usage tracking information may be used. Optionally, in one embodiment, if an entry is to be evicted, the evicted entry is saved to, e.g., a second-level TOC cache, step 1454. The obtained value pair is stored in the selected entry, step 1456. Other variations are possible.
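The insertion steps of FIG. 14B, combined with a least-recently-used policy for the usage tracking column, can be sketched as follows. The class name `TocCache`, its methods, and the dictionary used as the second-level cache are assumptions made for this example; a hardware table would select entries by index bits and ways rather than by an ordered dictionary.

```python
from collections import OrderedDict


class TocCache:
    """Hypothetical software model of the TOC cache of FIG. 14A with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # TOC setter address -> TOC value, LRU first
        self.secondary = {}            # stand-in for a second-level TOC cache

    def insert(self, setter_addr, toc_value):
        """Steps 1450-1456: store a (setter address, TOC value) pair."""
        if setter_addr in self.entries:
            self.entries.move_to_end(setter_addr)
        elif len(self.entries) >= self.capacity:
            victim, value = self.entries.popitem(last=False)   # step 1452
            self.secondary[victim] = value                     # step 1454
        self.entries[setter_addr] = toc_value                  # step 1456

    def lookup(self, setter_addr):
        """Return the cached TOC value, or None on a miss."""
        if setter_addr not in self.entries:
            return None
        self.entries.move_to_end(setter_addr)   # update usage tracking
        return self.entries[setter_addr]
```

Swapping in FIFO or least-frequently-used replacement would only change how the victim is chosen in `insert`.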

In one embodiment, a single TOC pointer value corresponds to an entire module, i.e., all functions in the module have the same TOC pointer value. Therefore, in accordance with an aspect of the present invention, the processor stores a TOC value in the TOC cache for a range of addresses. As an example, the range of addresses corresponding to the same TOC pointer value is determined dynamically, e.g., by coalescing newly discovered TOC values with pre-existing ranges. In a further embodiment, the extent of the range is provided by the dynamic loader, and the predicted TOC value is associated with the range. Other examples are also possible.

As another example, the TOC may cover a portion of a module, and the range of addresses would then be the range of that portion. Other variations also exist.

As described above, a TOC cache may be used, but in this aspect, the TOC cache has a different format than in FIG. 14A and therefore is managed differently. This enables a more compact and efficient representation of the TOC cache, one that takes advantage of the properties of TOC values.

As shown in FIG. 15, in one example, an application 1500 may include a plurality of modules, including a main program 1502 and one or more dynamically shared objects (DSOs) 1504, such as shared libraries. Each module is associated with a TOC value 1506 that corresponds to the code in the address range into which that module is loaded, e.g., by the dynamic loader. Since each module may have its own TOC value associated therewith, a TOC cache may be implemented to reflect this. For instance, as shown in FIG. 16, a TOC cache 1600 includes, e.g., a TOC range address_from column 1602 and a TOC range address_to column 1604. The TOC range address_from column 1602 indicates the start of the particular module for the TOC value, and the TOC range address_to column 1604 indicates the end of that module. The TOC value for the module is included in a TOC value column 1606. Further, the TOC cache may include a usage tracking column 1608. Other and/or different columns are also possible.

One embodiment of inserting an entry into such a TOC cache is described with reference to FIG. 17. This logic is performed by, e.g., a processor. A value pair (e.g., callee, TOC value) to be inserted into the cache is received, step 1700. An attempt is made to select an entry for storing the TOC value based on the indicated TOC value, step 1702. A determination is made as to whether an entry for the TOC value was found in the TOC cache, inquiry 1704. If no entry is found, an entry within the TOC cache is selected for storing the TOC value, step 1706. This entry may be an empty entry or it may be an entry that holds other information. If there is already a value in the entry to be used, that information may be saved in, e.g., a second-level TOC cache, step 1708. The received value is then stored in the selected entry, step 1710. Additionally, the address_from and address_to columns are both set to the callee address.

Returning to inquiry 1704, if an entry is found, a determination is made as to whether the callee address is less than the address in the address_from column, inquiry 1720. If the callee address is less than the address_from column, the address_from column of the selected entry is updated to the callee address, step 1722. Otherwise, the address_to column of the selected entry is updated to the callee address, step 1724.
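The range-coalescing insertion of FIG. 17 can be illustrated with a short sketch. The class `RangeTocCache` and its field names are hypothetical, the table is unbounded (so the eviction of steps 1706 and 1708 is not modeled), and one deliberate deviation is made: the upper bound is updated with `max`, so that a callee address falling inside an existing range does not shrink it. That safeguard is an assumption of this example, not something the described flow states.

```python
class RangeTocCache:
    """Hypothetical model of the range-based TOC cache of FIG. 16."""

    def __init__(self):
        # Each entry: {"address_from": int, "address_to": int, "toc": int}
        self.entries = []

    def insert(self, callee, toc):
        for entry in self.entries:
            if entry["toc"] == toc:                       # inquiry 1704: found
                if callee < entry["address_from"]:        # inquiry 1720
                    entry["address_from"] = callee        # step 1722
                else:
                    # step 1724, with max() added so the range never shrinks
                    entry["address_to"] = max(entry["address_to"], callee)
                return
        # Not found: new entry with both bounds set to the callee (step 1710)
        self.entries.append(
            {"address_from": callee, "address_to": callee, "toc": toc}
        )

    def lookup(self, address):
        for entry in self.entries:
            if entry["address_from"] <= address <= entry["address_to"]:
                return entry["toc"]
        return None
```

After a few calls into the same module, a single entry covers every address between the lowest and highest callee seen, so later calls to functions in between hit without having been observed individually.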

The above flow assumes one entry per TOC value, such that multiple entries are not found. However, if multiple entries could be found for a particular module, that case would be checked for.

In a further embodiment, candidate selection for TOC prediction may use the TOC table with ranges to determine whether a call is made to the same module, in which case TOC prediction is suppressed. Other variations are possible.

In a further aspect, a TOC tracking structure is prepared and initialized for TOC prediction. As one example, a linker links the program and determines a TOC value, either as an absolute value for the module or as a relative offset, e.g., relative to the module load address. The dynamic loader loads the module and computes the final TOC value. The dynamic loader then loads the TOC value into the TOC tracking structure for use in connection with, e.g., the set TOC register instruction or another predictive instruction.

As examples, the TOC tracking structure may be the TOC cache itself, or it may be a representation of an in-memory table. Other examples are also possible. Further details associated with storing the TOC value into the tracking structure are described with reference to FIG. 18. This processing is performed by, e.g., a loader.

Referring to FIG. 18, the loader receives a request to load a module, step 1800, and computes at least one TOC pointer for the loaded module, step 1802. That TOC value is stored in the TOC tracking structure, e.g., in conjunction with the address range into which the module has been loaded, step 1804. The stored value may then be returned for a particular function, or stored in the TOC cache for later retrieval, etc.
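The loader-side bookkeeping of FIG. 18 amounts to recording one (address range, TOC value) triple per module. The sketch below assumes the linker supplied the TOC as an offset relative to the module load address; the function names and the list used as the tracking structure are hypothetical.

```python
def load_module(tracking, load_address, size, toc_offset):
    """Steps 1800-1804: compute the module's TOC pointer and record it.

    tracking   -- list standing in for the TOC tracking structure
    toc_offset -- TOC position relative to the load address (assumed to be
                  what the linker provided)
    """
    toc_value = load_address + toc_offset                      # step 1802
    tracking.append((load_address, load_address + size - 1, toc_value))  # step 1804
    return toc_value


def toc_for_address(tracking, address):
    """Return the TOC value of the module containing address, or None."""
    for address_from, address_to, toc_value in tracking:
        if address_from <= address <= address_to:
            return toc_value
    return None
```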

In one embodiment, when the tracking structure is, e.g., an in-memory structure and a TOC value is not found in the TOC cache, control is transferred to software using, e.g., an interrupt or a user-mode event-based branch. The software handler then reloads the value, e.g., by accessing the in-memory structure that stores the address range and TOC value corresponding to each module. In another embodiment, the in-memory TOC structure is architecturally defined, and a hardware handler reloads the TOC cache directly from the in-memory structure. In one embodiment, the software handler reloads both the TOC cache and the in-memory structure when a module is loaded. Other variations are possible.

In accordance with a further aspect of the present invention, a read-only TOC register and TOC addressing modes are included in the instruction set architecture (ISA). The read-only TOC register is, e.g., a pseudo or virtual register that provides the TOC value for a given module (e.g., by accessing the TOC cache or an in-memory table). That is, it is not a hardware or architected register and has no storage backing it; instead, it provides the TOC value to be used when, e.g., a selected register number is referenced. The TOC value is initialized, e.g., from a value stored in a TOC base table, which may be loaded in conjunction with module initialization. The TOC base table may correspond to one or more of the TOC caches of FIGS. 14A and 16, or to an in-memory structure. Still other formats may be used to store and provide the TOC base value for a given instruction address, in conjunction with one or more aspects of the present invention.

One example of using a read-only TOC register is described with reference to FIG. 19. As shown, a read-only TOC register 1900 (referred to herein as TOCbase) is a pointer to a location in a TOC 1902. The TOC 1902 includes one or more variable addresses 1904 indicating the locations of the corresponding variables that hold variable values 1906. The read-only TOC register, TOCbase, is referenced by an addressing mode, either implicitly in an instruction or as a prefix. The processor performs a TOC value lookup in response to TOCbase being specified as an addressing mode or as a register of an addressing mode, and the TOC value obtained is used in place of the value that would be provided by a general purpose register specified as the base register.

In one embodiment, when n bits are provided to encode one of 2^n registers in the instruction set, one of the 2^n register numbers is defined to refer to the value of the TOC pointer, and when that register is specified, the value of the TOC pointer is used as the value of the register.

In further aspects, various instructions are provided that may use the read-only register. For instance, various load TOC-relative long instructions are provided, as described with reference to FIGS. 20A-20C, and one or more load address TOC-relative long instructions may be provided, an example of which is described with reference to FIG. 21. Other examples are also possible.

As shown in FIG. 20A, a load TOC-relative long instruction 2000 includes a plurality of operation code (opcode) fields 2002a, 2002b that include an opcode specifying a load TOC-relative long (LTL) operation; a first operand field (R1) 2004 indicating the location (e.g., a register) of the first operand; and a second operand field (RI2) 2008, which is an immediate field whose content is used as a signed binary integer designating a number of bytes, halfwords, words, doublewords, etc., that is added to the value of the TOC pointer at the current instruction address to form the address of the second operand in storage (the TOC being defined by external means, e.g., using the program loader, an STR instruction, a TOC table, a TOC cache, etc.).

Other embodiments of the load TOC-relative long instruction are also provided, as shown in FIGS. 20B-20C. The load TOC-relative long instructions LGTL 2010 (FIG. 20B) and LGFTL 2020 (FIG. 20C) each include opcode fields 2012a, 2012b; 2022a, 2022b; a first operand field (R1) 2014, 2024 indicating the location (e.g., a register) of the first operand; and a second operand field (RI2) 2018, 2028, which is an immediate field whose content is used as a signed binary integer designating a number of bytes, halfwords, words, doublewords, etc., that is added to the value of the TOC pointer at the current instruction address to form the address of the second operand in storage (the TOC being defined by external means, e.g., using the program loader, an STR instruction, a TOC table, a TOC cache, etc.).

The second operand is placed unchanged at the first operand location, except that, for load TOC-relative long (LGFTL), it is sign extended.

For load TOC-relative long (LTL), the operands are, e.g., 32 bits, and for load TOC-relative long (LGTL), the operands are 64 bits. For load TOC-relative long (LGFTL), the second operand is treated as a 32-bit signed binary integer and the first operand is treated as a 64-bit signed binary integer.

When DAT is on, the second operand is accessed using the same addressing-space mode as that used to access the instruction. When DAT is off, the second operand is accessed using a real address.

For load TOC-relative long (LTL, LGFTL), the second operand is to be aligned on a word boundary, and for load TOC-relative long (LGTL), the second operand is to be aligned on a doubleword boundary; otherwise, a specification exception may be recognized.
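The operand handling of the three load forms can be modeled in software. The following is a sketch under several assumptions: storage is a big-endian byte array, the `unit` parameter stands for whichever granularity (bytes, halfwords, and so on) the RI2 count uses, address translation is ignored, and a Python `ValueError` stands in for the specification exception. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign - 1)) - (value & sign)


def load_toc_relative(memory, toc, ri2, mnemonic, unit=1):
    """Model LTL, LGTL and LGFTL: return the value placed in the first operand.

    memory   -- bytes-like object standing in for storage (big-endian assumed)
    toc      -- TOC pointer value for the current instruction address
    ri2      -- signed immediate, counted in `unit`-byte steps (assumption)
    mnemonic -- "LTL", "LGTL" or "LGFTL"
    """
    width = 8 if mnemonic == "LGTL" else 4       # second-operand size in bytes
    address = toc + ri2 * unit
    if address % width:
        raise ValueError("specification exception: misaligned second operand")
    raw = int.from_bytes(memory[address:address + width], "big")
    if mnemonic == "LGFTL":                      # 32-bit operand, sign extended
        return sign_extend(raw, 32) & 0xFFFFFFFFFFFFFFFF
    return raw                                   # LTL / LGTL: placed unchanged
```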

One example of a load address TOC-relative long instruction is described with reference to FIG. 21. As depicted, a load address TOC-relative long instruction 2100 includes a plurality of opcode fields 2102a, 2102b that include an opcode indicating a load address TOC-relative long operation; a first operand field (R1) 2104 indicating the location (e.g., a register) of the first operand; and a second operand field (RI2) 2108, which is an immediate field whose content is a signed binary integer designating a number of bytes, halfwords, words, doublewords, etc., that is added to the value of the TOC pointer at the current address to generate the computed address.

The address specified using the RI2 field is placed in general register R1. The address is obtained by adding the RI2 field to the TOC value at the current address.

In the 24-bit addressing mode, the address is placed in bit positions 40-63, bits 32-39 are set to zeros, and bits 0-31 remain unchanged. In the 31-bit addressing mode, the address is placed in bit positions 33-63, bit 32 is set to zero, and bits 0-31 remain unchanged. In the 64-bit addressing mode, the address is placed in bit positions 0-63.
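The bit placement rules for the three addressing modes can be checked with a short model. Bit 0 is taken to be the most significant bit of the 64-bit register, as the bit ranges above imply; the function name, the `unit` parameter, and the representation of registers as Python integers are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def load_address_toc_relative(old_r1, toc, ri2, mode, unit=1):
    """Return the new content of R1 for addressing mode 24, 31 or 64.

    old_r1 -- previous 64-bit content of general register R1
    toc    -- TOC value at the current address
    ri2    -- signed immediate, counted in `unit`-byte steps (assumption)
    """
    address = (toc + ri2 * unit) & MASK64
    if mode == 64:
        return address                            # bits 0-63
    low = address & ((1 << mode) - 1)             # keep 24 or 31 address bits
    # Bits 0-31 (the high word) are preserved; the rest of the low word is zero.
    return (old_r1 & 0xFFFFFFFF00000000) | low
```

No storage is accessed in this model; only the address arithmetic is performed.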

No storage reference for the operand takes place, and the address is not inspected for access exceptions.

In a further aspect, a TOC add immediate shifted (tocaddis) instruction is provided (e.g., for RISC-style architectures). As shown in FIG. 22, in one example, a TOC add immediate shifted instruction 2200 includes an opcode field 2202 that includes an opcode specifying a TOC add immediate shifted operation; a target return (RT) field 2204 indicating a target return value; and a shifted immediate (SI) field 2206 specifying a shifted immediate amount to be applied to the TOC value.

As one example, tocaddis is defined as follows:

tocaddis RT,SI

RT ← (TOC) + EXTS(SI || ¹⁶0)

The sum TOC + (SI || 0x0000) is placed into register RT. EXTS denotes sign extension, and || denotes concatenation.
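A worked model of this definition may help: SI is a 16-bit field, sixteen zero bits are appended on the right, the 32-bit result is sign extended, and the sum with the TOC value is written to RT. The sketch below assumes 64-bit registers and uses hypothetical function names.

```python
MASK64 = (1 << 64) - 1


def exts(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign - 1)) - (value & sign)


def tocaddis(toc, si):
    """Return the value written to RT: (TOC) + EXTS(SI || 16 zero bits)."""
    shifted = (si & 0xFFFF) << 16        # SI concatenated with 0x0000
    return (toc + exts(shifted, 32)) & MASK64
```

For example, with a TOC value of 0x10000000, an SI of 0x0001 yields 0x10010000, while an SI of 0xFFFF (that is, -1) yields 0x0FFF0000.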

In a further aspect, a TOC-indicating prefix instruction may be provided. For instance, an add TOC immediate shifted instruction, addtocis+, is provided, which is a prefix instruction that provides information for the next instruction. Referring to FIG. 23, in one example, an add TOC immediate shifted instruction 2300 includes an opcode field 2302 having an opcode specifying an add TOC immediate shifted operation; a target register (RT) field 2304 to hold the result; an operand field (RA) 2306; and a shifted immediate (SI) field 2308.

As one example,

addtocis+ RT,RA,SI

if RA = 0 then RT ← (TOC) + EXTS(SI || ¹⁶0) else RT ← (RA) + EXTS(SI || ¹⁶0)

The sum (RA|TOC) + (SI || 0x0000) is provided as the source for references to register RT for the next sequential instruction only. addtocis+ is an instruction prefix, and it modifies the following instruction to use the value computed for RT as input when RT is specified. The instruction indicates that RT becomes unused after the next sequential instruction is executed, and its value will be undefined. If execution is interrupted after the addtocis+ instruction and prior to the next sequential instruction, the state will be updated in a manner that allows execution to resume with the next instruction and produce a correct result (i.e., RT will be written, or another implementation-defined method for retaining the effect of modifying the next sequential instruction's RT source will be used). Note that addtocis+ uses the value of TOCbase, not the contents of GPR0, if RA = 0.
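The value that addtocis+ makes available for RT can be modeled as below. Only the arithmetic is shown; the prefix behavior (forwarding the value to the next instruction and leaving RT undefined afterwards) is not modeled. The register file as a Python list and the function names are assumptions made for this sketch.

```python
MASK64 = (1 << 64) - 1


def exts(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign - 1)) - (value & sign)


def addtocis_value(gprs, toc_base, ra, si):
    """Return (RA|TOC) + EXTS(SI || 16 zero bits).

    gprs     -- list of general register contents (hypothetical register file)
    toc_base -- value supplied by the read-only TOC register, TOCbase
    ra       -- RA field; 0 selects TOCbase rather than the contents of GPR0
    """
    base = toc_base if ra == 0 else gprs[ra]
    return (base + exts((si & 0xFFFF) << 16, 32)) & MASK64
```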

This prefix instruction may have further options, such as a displacement specifier field that indicates whether additional immediate bits are to be used. Additionally, it may include one or more additional immediate fields containing values to be applied to (e.g., added to, ORed with, etc.) operands of the successor instruction.

Other prefix options may be used, including a TOC prefix and/or a TOC prefix with an option to override a selectable one of the operands. For instance, a prefix instruction may be provided that indicates that the TOC value is to be used in place of one of the operands of the successor instruction. In one example, the operand is selectable.

Additionally, aspects of a prefix instruction (e.g., addtocis) and the subsequent instruction may be fused to facilitate processing. For instance, if a prefix instruction with a displacement is specified and the subsequent instruction also includes a displacement, the resulting displacement may correspond to both the shifted immediate and the immediate displacement. Other possibilities exist.

A particular optimization example using addtocis is shown below. In this example, a candidate sequence of n (e.g., 3) instructions includes, for instance: addtocis+ r4,toc,upper; addi r4,r4,lower; and lvx* vr2,r0,r4. This sequence may be represented by the following template:

i1 = addtocis+ <r1>,<r2>,<upper>

i2 = addi <r1>,<r1>,<lower>

i3 = lvx* <vrt>,r0,<r1>

=> and optimized to the following internal operation:

lvd <vrt>,toc_or_gpr(<r2>),combined(<upper>,<lower>)

The addtocis instruction is similar to addis, but introduces the value of the TOC, rather than the constant 0, when the RA field has the value 0. In one example, lvx* is an instruction form that defines the base register (e.g., r4 in this example) as having an unspecified value after execution of the instruction. In one example, lvx* is a form of the lvx (load vector indexed) instruction that indicates the last use of at least one register (e.g., defined herein to be the register indicated as <r1> in the template). lvd is a load vector operation with an implementation-defined displacement. The toc_or_gpr function handles the TOC special case, since lvd would otherwise handle the RA operand like all other RA operands, with the value 0 representing the constant 0 and the other register values representing the logical register.
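The template match and the resulting internal operation can be sketched as a small peephole function. Instructions are represented here as tuples of (mnemonic, operands...), which is purely an illustrative encoding; the combined displacement is assumed to be the sign-extended shifted upper part plus the sign-extended 16-bit lower part, following the usual addis/addi convention. The function names are hypothetical.

```python
def exts(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value & (sign - 1)) - (value & sign)


def fuse(i1, i2, i3):
    """Return the fused lvd internal operation, or None if the template fails.

    Each instruction is a tuple: (mnemonic, operand, operand, operand).
    """
    if (i1[0], i2[0], i3[0]) != ("addtocis+", "addi", "lvx*"):
        return None
    _, r1, r2, upper = i1
    _, rt, ra, lower = i2
    _, vrt, index_a, index_b = i3
    # The same register <r1> must link all three instructions.
    if (rt, ra, index_a, index_b) != (r1, r1, "r0", r1):
        return None
    displacement = exts(upper << 16, 32) + exts(lower, 16)   # combined(upper, lower)
    return ("lvd", vrt, r2, displacement)   # r2 of "toc" selects the TOC value
```

Because lvx* marks the last use of the linking register, the fused operation does not need to write it, which is what allows three instructions to collapse into one.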

Further opportunities may exist to reduce a complex sequence of instructions that includes a TOC instruction or a TOC use instruction to a simpler sequence of instructions.

One embodiment of an execution flow used to manage TOC operands is described with reference to FIG. 24. In one example, a processor performs this logic.

Referring to FIG. 24, in one example, an instruction is received, step 2400. Any non-TOC operands of the instruction are obtained in accordance with the non-TOC operand definition, step 2402. For instance, if an instruction operand specifies a general purpose register, the data is obtained from that register, etc.

A determination is made as to whether a TOC operand is present in the instruction, inquiry 2404. That is, is there an operand in the instruction that explicitly or implicitly uses the TOC pointer? If a TOC operand is present, the TOC operand is obtained, as described below, step 2406. Thereafter, or if no TOC operand is present, any obtained operand values are used in accordance with the instruction definition, step 2408. Optionally, one or more output operands are written, step 2410.

In one example, the TOC operand is obtained from an in-memory TOC structure, as described with reference to FIG. 25. The address of the in-memory TOC tracking structure is obtained, step 2500, and the TOC value for the module that includes the instruction is obtained from the in-memory TOC structure, step 2502. That TOC value is then provided for use by the instruction, step 2504.

In another example, the TOC operand is obtained from a TOC cache that is backed by an in-memory structure, as described with reference to FIG. 26. In this example, the TOC cache is accessed, step 2600, and a determination is made as to whether there is a TOC cache hit, inquiry 2602. That is, is there an entry in the cache for the module that includes the instruction? If there is no TOC cache hit, the TOC cache is reloaded from the in-memory structure that backs it, step 2604. Thereafter, or if there is a TOC cache hit, the TOC value for the module that includes the instruction is obtained from the TOC cache and provided for use by the instruction, step 2606.

Another example of obtaining a TOC operand from a TOC cache backed by an in-memory structure is described with reference to FIG. 27. In this example, the TOC cache is accessed, step 2700, and a determination is made as to whether there is a TOC cache hit, inquiry 2702. If there is a TOC cache hit, the TOC value is retrieved from the TOC cache entry corresponding to the module that includes the instruction and is provided for use by the instruction, step 2704. However, if there is no TOC cache hit, control is transferred to a software handler, step 2706. The software handler determines the TOC value using software (e.g., by obtaining it from the in-memory tracking structure), step 2710, and loads the determined TOC value into the TOC cache, step 2712. The software handler ends, and the instruction is restarted, step 2714.

To load the TOC cache from software, a Load TOC Cache (LTC) instruction may be used. For example, LTC Rfrom, Rto, RTOC may be used to load an entry <MODULE.Rfrom, MODULE.Rto, MODULE.TOC>. The entry is placed in the cache, with Rfrom populating the address_from column, Rto populating the address_to column, and RTOC populating the TOC value. In one embodiment, the entry is selected according to the replacement policy of the particular implementation.
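The software-managed miss path of FIG. 27, together with the LTC instruction, can be modeled roughly as below. The function `ltc` only imitates the described effect of LTC Rfrom, Rto, RTOC; the four-entry cache, round-robin replacement, the `miss_handler` callback type, and the single module known to `demo_handler` are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LINES 4 /* hypothetical cache size */

typedef struct { uint64_t address_from, address_to, toc; bool valid; } entry;
typedef struct { entry e[LINES]; size_t victim; } toc_cache;

/* Models LTC Rfrom, Rto, RTOC: fill one entry chosen by the replacement
 * policy (round-robin here, purely as an assumption). */
void ltc(toc_cache *c, uint64_t rfrom, uint64_t rto, uint64_t rtoc)
{
    assert(c != NULL && rfrom <= rto);
    c->e[c->victim] = (entry){ rfrom, rto, rtoc, true };
    c->victim = (c->victim + 1) % LINES;
}

/* Returns true and writes *toc when a valid entry covers addr. */
static bool probe(const toc_cache *c, uint64_t addr, uint64_t *toc)
{
    for (size_t i = 0; i < LINES; i++) {
        const entry *x = &c->e[i];
        if (x->valid && addr >= x->address_from && addr <= x->address_to) {
            *toc = x->toc;
            return true;
        }
    }
    return false;
}

/* Software handler: determines the TOC value and loads it with ltc(). */
typedef bool (*miss_handler)(toc_cache *c, uint64_t insn_addr);

bool fetch_toc(toc_cache *c, uint64_t insn_addr, miss_handler h, uint64_t *toc)
{
    if (probe(c, insn_addr, toc))
        return true;                 /* 2702, 2704: cache hit */
    if (!h(c, insn_addr))
        return false;                /* handler could not resolve the module */
    return probe(c, insn_addr, toc); /* 2714: restart the access */
}

/* Hypothetical handler that knows a single module at 0x1000..0x1fff. */
bool demo_handler(toc_cache *c, uint64_t insn_addr)
{
    if (insn_addr < 0x1000 || insn_addr > 0x1fff)
        return false;
    ltc(c, 0x1000, 0x1fff, 0xA000);  /* 2710, 2712 */
    return true;
}
```

Restarting the access after the handler returns, rather than having the handler hand back the value directly, mirrors the described flow in which the instruction itself is re-executed and then hits in the cache.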

In another embodiment, a table entry is loaded by loading a plurality of control registers.

One example of a use case is described below:

As defined by the C programming language, the function foo returns a character from the array bar, where the character position is passed to foo in the argument idx.

According to one aspect of the present invention, the compiler translates this program into the following sequence of machine instructions:

According to one or more aspects of the present invention, the Set TOC Register instruction efficiently initializes the register that is to be loaded with the TOC value. Further, according to one aspect of the present invention, because the Set TOC Register instruction is efficient, the TOC register value is not saved and restored in the compiled code. Instead, when a subroutine is called, the TOC register value is discarded. When the called function returns, the TOC value is not reloaded from a saved copy. Rather, a new Set TOC Register instruction is generated to load the TOC register.

One example of compiler-generated code that uses the STR (Set TOC Register) instruction to obtain the correct value of the TOC pointer, corresponding to the C function foo above, is as follows:

One embodiment of initializing a register with the TOC pointer, as performed by a compiler, is described with reference to FIG. 28. Initially, a determination is made as to whether the function accesses the TOC, inquiry 2800. If not, processing is complete. However, if the function accesses the TOC, then before the first use, a register is initialized with the TOC pointer using, for example, an STR instruction, step 2802. For example, an STR instruction is added to the code being compiled and is used to initialize the TOC register. Other variations are possible.
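The compiler step of FIG. 28 can be illustrated with a toy pass over a flat list of operations. The `op_kind` enumeration and the function `insert_str_before_first_use` are invented for this sketch; a production compiler would work on its own intermediate representation and would also account for control flow, which is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation kinds in a straight-line function body. */
typedef enum { OP_OTHER, OP_USES_TOC, OP_CALL, OP_STR } op_kind;

/* Copies in[0..n) to out and inserts one OP_STR immediately before the
 * first operation that uses the TOC (inquiry 2800, step 2802).
 * out must have room for n + 1 elements. Returns the output length. */
size_t insert_str_before_first_use(const op_kind *in, size_t n, op_kind *out)
{
    assert(in != NULL && out != NULL);
    size_t k = 0;
    int initialized = 0;
    for (size_t i = 0; i < n; i++) {
        if (!initialized && in[i] == OP_USES_TOC) {
            out[k++] = OP_STR; /* initialize the TOC register before first use */
            initialized = 1;
        }
        out[k++] = in[i];
    }
    return k; /* equals n when the function never accesses the TOC */
}
```

A function that never touches the TOC is returned unchanged, matching the "processing is complete" branch of the flow.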

In another example, a static linker may initialize the TOC, as described with reference to FIG. 29. In this example, a determination is made as to whether a subroutine call is resolved to a function that may alter the register holding the TOC value, inquiry 2900. If not, processing is complete. Otherwise, the register holding the TOC pointer is reinitialized, for example, with an STR instruction, step 2902. For example, an STR instruction is added to the compiled code and is used to initialize the TOC register. Other variations are possible.
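The linker-side decision of FIG. 29 can be sketched in the same toy style. The `insn` record, its `callee_may_alter_toc` flag, and the function name are hypothetical; modeling the change as insertion into a copied list is a simplification, since an actual linker might instead patch a reserved slot after the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { I_OTHER, I_CALL, I_STR } insn_kind;

typedef struct {
    insn_kind kind;
    bool callee_may_alter_toc; /* meaningful only for I_CALL (assumed flag) */
} insn;

/* Copies in[0..n) to out, adding an I_STR after every call that resolves
 * to a function which may alter the TOC register (inquiry 2900, step 2902).
 * out must have room for 2 * n elements. Returns the output length. */
size_t reinit_toc_after_calls(const insn *in, size_t n, insn *out)
{
    assert(in != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        out[k++] = in[i];
        if (in[i].kind == I_CALL && in[i].callee_may_alter_toc) {
            out[k++] = (insn){ I_STR, false }; /* reinitialize the TOC register */
        }
    }
    return k;
}
```

Calls that resolve to functions known to preserve the register get no extra instruction, which is where the saving over unconditional save/restore sequences comes from.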

An example use case is as follows. According to one aspect of the present invention, this more efficient code is generated:

In addition to generating code with the Set TOC Register instruction, code may be generated that uses a TOC read-only register. This further avoids the need to load the TOC into a general purpose register (GPR), and thereby reduces register pressure as well as the overhead of loading the register or reloading it after a function call.

One example of compiler-generated code that uses the TOC read-only register is as follows:

char bar[MAX];

char foo(int idx)
{
    return bar[idx];
}

As defined by the C programming language, the function foo returns a character from the array bar, where the character position is passed to foo in the argument idx.

According to one aspect of the present invention, the compiler translates this program into the following sequence of machine instructions.

One example of a compilation flow that uses the TOC read-only register to reference the TOC is described with reference to FIG. 30. In this example, a determination is made as to whether a reference to the TOC is requested, inquiry 3000. If not, processing is complete. Otherwise, the TOC is referenced using the TOC read-only register, step 3002. For example, an operation (e.g., an internal operation, an instruction, etc.) is included in the code being compiled and is used to determine the pointer to the TOC. Other variations are possible.

An example use case is as follows. According to one aspect of the present invention, this more efficient code is generated:

One or more aspects of the present invention are inextricably tied to computer technology and facilitate processing within a computer, improving its performance. Further details of one embodiment of facilitating processing within a computing environment, as it relates to one or more aspects of the present invention, are described with reference to FIGS. 31A-31B.

Referring to FIG. 31A, in one embodiment, an instruction that provides a pointer to a reference data structure is obtained by a processor (3100). The instruction is executed (3102), and the executing includes, for instance, determining a value of the pointer to the reference data structure (3104), and storing the value in a location (e.g., a register) specified by the instruction (3106).

In one example, determining the value includes performing a lookup of a data structure (e.g., a reference data structure pointer cache or a table populated with reference data structure pointer values) to determine the value (3108).

As a particular example, determining the value includes, for instance, checking a reference data structure pointer cache for an entry that includes the value (3110), and performing the storing based on finding the entry (3112). Further, in one example, referring to FIG. 31B, the determining includes raising, by the processor, a trap to a handler based on the value not being located in the reference data structure pointer cache (3120), obtaining, by the handler, the value from a data structure populated with reference data structure pointer values (3122), and performing the storing by the handler (3124). Additionally, the value is stored in the reference data structure pointer cache (3126).

In another example, the determining further includes performing cache miss handling, based on the value not being located in the reference data structure pointer cache, to determine the value and store the value (3130).

In one embodiment, based on obtaining the instruction, a trap to a handler is raised by the processor (3132), and the determining and the storing are performed by the handler (3134).

Other variations and embodiments are possible.

Other types of computing environments may also incorporate and use one or more aspects of the present invention, including, but not limited to, emulation environments, an example of which is described with reference to FIG. 32A. In this example, a computing environment 20 includes, for instance, a native central processing unit (CPU) 22, a memory 24, and one or more input/output devices and/or interfaces 26 coupled to one another via, for example, one or more buses 28 and/or other connections. As examples, computing environment 20 may include a PowerPC processor or a pSeries server offered by International Business Machines Corporation, Armonk, New York, and/or other machines based on architectures offered by International Business Machines Corporation, Intel, or other companies.

Native central processing unit 22 includes one or more native registers 30, such as one or more general purpose registers and/or one or more special purpose registers used during processing within the environment. These registers include information that represents the state of the environment at any particular point in time.

Moreover, native central processing unit 22 executes instructions and code that are stored in memory 24. In one particular example, the central processing unit executes emulator code 32 stored in memory 24. This code enables the computing environment configured in one architecture to emulate another architecture. For instance, emulator code 32 allows machines based on architectures other than the z/Architecture, such as PowerPC processors, pSeries servers, or other servers or processors, to emulate the z/Architecture and to execute software and instructions developed based on the z/Architecture.

Further details relating to emulator code 32 are described with reference to FIG. 32B. Guest instructions 40 stored in memory 24 comprise software instructions (e.g., correlating to machine instructions) that were developed to be executed in an architecture other than that of native CPU 22. For example, guest instructions 40 may have been designed to execute on a z/Architecture processor, but instead are being emulated on native CPU 22, which may be, for example, an Intel processor. In one example, emulator code 32 includes an instruction fetching routine 42 to obtain one or more guest instructions 40 from memory 24, and to optionally provide local buffering for the instructions obtained. It also includes an instruction translation routine 44 to determine the type of guest instruction that has been obtained and to translate the guest instruction into one or more corresponding native instructions 46. This translation includes, for instance, identifying the function to be performed by the guest instruction and choosing the native instruction(s) to perform that function.

Further, emulator code 32 includes an emulation control routine 48 to cause the native instructions to be executed. Emulation control routine 48 may cause native CPU 22 to execute a routine of native instructions that emulate one or more previously obtained guest instructions and, at the conclusion of such execution, return control to the instruction fetching routine to emulate the obtaining of the next guest instruction or group of guest instructions. Execution of native instructions 46 may include loading data into a register from memory 24; storing data back to memory from a register; or performing some type of arithmetic or logic operation, as determined by the translation routine.
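The fetch, translate, and execute cycle of the emulator can be sketched as a small interpreter. The three-opcode guest instruction set, the four emulated registers held in memory, and every identifier below are invented for illustration and bear no relation to any real architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A toy guest instruction set (hypothetical). */
typedef enum { G_LOAD_IMM, G_ADD, G_HALT } guest_op;
typedef struct { guest_op op; int reg; int64_t imm; } guest_insn;

/* Emulated guest registers kept in native memory. */
typedef struct { int64_t reg[4]; size_t pc; } guest_state;

/* A "native routine" that carries out one guest instruction. */
typedef void (*native_routine)(guest_state *, const guest_insn *);

static void n_load_imm(guest_state *s, const guest_insn *g) { s->reg[g->reg] = g->imm; }
static void n_add(guest_state *s, const guest_insn *g) { s->reg[g->reg] += g->imm; }

/* Translation routine: identify the guest function, choose the native code. */
static native_routine translate(const guest_insn *g)
{
    switch (g->op) {
    case G_LOAD_IMM: return n_load_imm;
    case G_ADD:      return n_add;
    default:         return NULL; /* G_HALT or unknown: stop emulation */
    }
}

/* Emulation control: fetch, translate, execute, then fetch the next one.
 * Returns the number of guest instructions executed. */
size_t emulate(guest_state *s, const guest_insn *prog, size_t n)
{
    assert(s != NULL && prog != NULL);
    size_t executed = 0;
    while (s->pc < n) {
        const guest_insn *g = &prog[s->pc]; /* instruction fetch */
        native_routine r = translate(g);    /* instruction translation */
        if (r == NULL)
            break;
        assert(g->reg >= 0 && g->reg < 4);
        r(s, g);                            /* execute native routine */
        s->pc++;
        executed++;
    }
    return executed;
}
```

Dispatching through a function pointer per instruction corresponds to the simplest interpretive form of emulation; translating groups of guest instructions at once, as the text also allows, would amortize the dispatch cost.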

Each routine is, for instance, implemented in software, which is stored in memory and executed by native central processing unit 22. In other examples, one or more of the routines or operations are implemented in firmware, hardware, software, or some combination thereof. The registers of the emulated processor may be emulated using registers 30 of the native CPU or by using locations in memory 24. In embodiments, guest instructions 40, native instructions 46, and emulator code 32 may reside in the same memory or may be distributed among different memory devices.

As used herein, firmware includes, e.g., the microcode or millicode of the processor. It includes, for instance, the hardware-level instructions and/or data structures used in the implementation of higher-level machine code. In one embodiment, it includes, for instance, proprietary code that is typically delivered as microcode, which includes trusted software or microcode specific to the underlying hardware and controls operating system access to the system hardware.

A guest instruction 40 that is obtained, translated, and executed is, for instance, one or more of the instructions described herein. The instruction, which is of one architecture (e.g., the z/Architecture), is fetched from memory, translated, and represented as a sequence of native instructions 46 of another architecture (e.g., PowerPC, pSeries, Intel, etc.). These native instructions are then executed.

One or more aspects may relate to cloud computing.

It is to be understood at the outset that, although this disclosure includes a detailed description of cloud computing, implementation of the teachings recited herein is not limited to a cloud computing environment. Rather, embodiments can be implemented in conjunction with any other type of computing environment now known or later developed.

Cloud computing is a model of service delivery for enabling convenient, on-demand network access to a shared pool of configurable computing resources. Configurable computing resources are resources, such as networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services, that can be rapidly provisioned and released with minimal management effort or interaction with the service provider. This cloud model may include at least five characteristics, at least three service models, and at least four deployment models.

The characteristics are as follows:

On-demand self-service: a cloud consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed and automatically, without requiring human interaction with the service provider.

Broad network access: capabilities are available over a network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and personal digital assistants (PDAs)).

Resource pooling: the provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to demand. There is a sense of location independence in that the consumer generally has no control over or knowledge of the exact location of the provided resources, but may be able to specify location at a higher level of abstraction (e.g., country, state, or data center).

Rapid elasticity: capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be obtained in any quantity at any time.

Measured service: cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and the consumer of the utilized service.

The service models are as follows:

Software as a Service (SaaS): the capability provided to the consumer is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure, including networks, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.

Platform as a Service (PaaS): the capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure, including networks, servers, operating systems, or storage, but has control over the deployed applications and possibly over the configuration of the application hosting environment.

Infrastructure as a Service (IaaS): the capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources on which the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure, but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components (e.g., host firewalls).

The deployment models are as follows:

Private cloud: the cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on-premises or off-premises.

Community cloud: the cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be managed by the organizations or a third party and may exist on-premises or off-premises.

Public cloud: the cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.

Hybrid cloud: the cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).

A cloud computing environment is service oriented, with a focus on statelessness, low coupling, modularity, and semantic interoperability. At the heart of cloud computing is an infrastructure comprising a network of interconnected nodes.

Referring now to FIG. 33, an illustrative cloud computing environment 50 is depicted. As shown, cloud computing environment 50 includes one or more computing nodes 10 with which local computing devices used by cloud consumers, such as a personal digital assistant (PDA) or cellular telephone 54A, a desktop computer 54B, a laptop computer 54C, and/or an automobile computer system 54N, may communicate. Nodes 10 may communicate with one another. They may be grouped (not shown) physically or virtually in one or more networks, such as the private, community, public, or hybrid clouds described above, or a combination thereof. This allows cloud computing environment 50 to offer infrastructure, platforms, and/or software as services for which a cloud consumer does not need to maintain resources on a local computing device. It is to be understood that the types of computing devices 54A-N shown in FIG. 33 are intended to be illustrative only, and that computing nodes 10 and cloud computing environment 50 can communicate with any type of computerized device over any type of network and/or network addressable connection (e.g., using a web browser).

Referring now to FIG. 34, a set of functional abstraction layers provided by cloud computing environment 50 (FIG. 33) is shown. It should be understood in advance that the components, layers, and functions shown in FIG. 34 are intended to be illustrative only, and embodiments of the invention are not limited thereto. As depicted, the following layers and corresponding functions are provided:

Hardware and software layer 60 includes hardware and software components. Examples of hardware components include mainframes 61; RISC (Reduced Instruction Set Computer) architecture based servers 62; servers 63; blade servers 64; storage devices 65; and networks and networking components 66. In some embodiments, software components include network application server software 67 and database software 68.

Virtualization layer 70 provides an abstraction layer from which the following examples of virtual entities may be provided: virtual servers 71; virtual storage 72; virtual networks 73, including virtual private networks; virtual applications and operating systems 74; and virtual clients 75.

In one example, management layer 80 may provide the functions described below. Resource provisioning 81 provides dynamic procurement of computing resources and other resources that are utilized to perform tasks within the cloud computing environment. Metering and pricing 82 provides cost tracking as resources are utilized within the cloud computing environment, and billing or invoicing for consumption of these resources. In one example, these resources may include application software licenses. Security provides identity verification for cloud consumers and tasks, as well as protection for data and other resources. User portal 83 provides access to the cloud computing environment for consumers and system administrators. Service level management 84 provides cloud computing resource allocation and management such that required service levels are met. Service level agreement (SLA) planning and fulfillment 85 provides pre-arrangement for, and procurement of, cloud computing resources for which a future requirement is anticipated in accordance with an SLA.

Workloads layer 90 provides examples of functionality for which the cloud computing environment may be utilized. Examples of workloads and functions that may be provided from this layer include: mapping and navigation 91; software development and lifecycle management 92; virtual classroom education delivery 93; data analytics processing 94; transaction processing 95; and table of contents pointer processing 96.

The present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.

The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.

Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.

Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in the flowchart and/or block diagram block or blocks.

The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order shown in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special-purpose hardware and computer instructions.

In addition to the above, one or more aspects may be provided, offered, deployed, managed, or serviced by a service provider that offers management of customer environments. For instance, the service provider can create, maintain, and support computer code and/or a computer infrastructure that performs one or more aspects for one or more customers. In return, the service provider may receive payment from the customer under a subscription and/or fee agreement, as examples. Additionally or alternatively, the service provider may receive payment from the sale of advertising content to one or more third parties.

In one aspect, an application may be deployed for performing one or more embodiments. As one example, the deploying of an application comprises providing computer infrastructure operable to perform one or more embodiments.

As a further aspect, a computing infrastructure may be deployed, comprising integrating computer-readable code into a computing system, in which the code in combination with the computing system is capable of performing one or more embodiments.

As yet a further aspect, a process for integrating computing infrastructure may be provided, comprising integrating computer-readable code into a computer system. The computer system comprises a computer-readable medium, in which the computer medium comprises one or more embodiments. The code in combination with the computer system is capable of performing one or more embodiments.

Although various embodiments are described above, these are only examples. For example, computing environments of other architectures can be used to incorporate and use one or more embodiments. Further, different instructions or operations may be used. Additionally, different registers may be used and/or other types of indications (other than register numbers) may be specified. Many variations are possible.

Further, other types of computing environments can benefit and be used. As an example, a data processing system suitable for storing and/or executing program code is usable that includes at least two processors coupled directly or indirectly to memory elements through a system bus. The memory elements include, for instance, local memory employed during actual execution of the program code, bulk storage, and cache memory, which provides temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.

Input/output or I/O devices (including, but not limited to, keyboards, displays, pointing devices, DASD, tape, CDs, DVDs, thumb drives, and other storage media) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or to remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the available types of network adapters.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of one or more embodiments has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain various aspects and the practical application, and to enable others of ordinary skill in the art to understand various embodiments with various modifications as are suited to the particular use contemplated.

Claims (20)

1. A computer-readable storage medium for facilitating processing within a computing environment, the computer-readable storage medium being readable by a processing circuit and storing instructions for performing a method comprising:
obtaining, by a processor, an instruction to be executed, the instruction to provide a pointer to a reference data structure used to populate variables included in program code; and
executing the instruction, the executing comprising:
determining a value of the pointer to the reference data structure, the value of the pointer being used to access the reference data structure used to populate variables included in program code; and
storing the value in a location specified by the instruction, wherein the determining and the storing are performed as part of executing the instruction.
2. The computer-readable storage medium of claim 1, wherein determining the value comprises performing a lookup of a data structure to determine the value.
3. The computer-readable storage medium of claim 2, wherein the data structure comprises a reference data structure pointer cache.
4. The computer-readable storage medium of claim 2, wherein the data structure comprises a table populated with reference data structure pointer values.
5. The computer-readable storage medium of claim 1, wherein the location comprises a register specified by the instruction.
6. The computer-readable storage medium of claim 1, wherein determining the value comprises:
checking a reference data structure pointer cache for an entry comprising the value; and
performing the storing based on finding the entry.
7. The computer-readable storage medium of claim 6, wherein determining the value further comprises:
causing, by the processor, a trap to a handler based on the value not being located in the reference data structure pointer cache;
obtaining, by the handler, the value from a data structure populated with reference data structure pointer values; and
performing the storing by the handler.
8. The computer-readable storage medium of claim 7, wherein the method further comprises storing the value in the reference data structure pointer cache.
9. The computer-readable storage medium of claim 6, wherein determining the value further comprises performing a cache miss process to determine the value and store the value based on the value not being in the reference data structure pointer cache.
10. The computer-readable storage medium of claim 1, wherein the method further comprises:
causing, by the processor, a trap to a handler based on obtaining the instruction; and
performing the determining and the storing by the handler.
11. A computer system for facilitating processing within a computing environment, the computer system comprising:
a memory; and
a processor in communication with the memory, wherein the computer system is configured to perform a method comprising:
obtaining, by the processor, an instruction to be executed, the instruction to provide a pointer to a reference data structure used to populate variables included within program code; and
executing the instruction, the executing comprising:
determining a value of the pointer to the reference data structure, the value of the pointer being used to access the reference data structure used to populate variables included in program code; and
storing the value in a location specified by the instruction, wherein the determining and the storing are performed as part of executing the instruction.
12. The computer system of claim 11, wherein the location comprises a register specified by the instruction.
13. The computer system of claim 11, wherein determining the value comprises:
checking a reference data structure pointer cache for an entry comprising the value; and
performing the storing based on finding the entry.
14. The computer system of claim 13, wherein determining the value further comprises:
causing, by the processor, a trap to a handler based on the value not being located in the reference data structure pointer cache;
obtaining, by the handler, the value from a data structure populated with reference data structure pointer values; and
performing the storing by the handler.
15. The computer system of claim 11, wherein the method further comprises:
causing, by the processor, a trap to a handler based on obtaining the instruction; and
performing the determining and the storing by the handler.
16. A computer-implemented method for facilitating processing within a computing environment, the method comprising:
obtaining, by a processor, an instruction to be executed, the instruction to provide a pointer to a reference data structure used to populate variables included in program code; and
executing the instruction, the executing comprising:
determining a value of the pointer to the reference data structure, the value of the pointer being used to access the reference data structure used to populate variables included in program code; and
storing the value in a location specified by the instruction, wherein the determining and the storing are performed as part of executing the instruction.
17. The computer-implemented method of claim 16, wherein the location comprises a register specified by the instruction.
18. The computer-implemented method of claim 16, wherein determining the value comprises:
checking a reference data structure pointer cache for an entry comprising the value; and
performing the storing based on finding the entry.
19. The computer-implemented method of claim 18, wherein determining the value further comprises:
causing, by the processor, a trap to a handler based on the value not being located in the reference data structure pointer cache;
obtaining, by the handler, the value from a data structure populated with reference data structure pointer values; and
performing the storing by the handler.
20. The computer-implemented method of claim 16, further comprising:
causing, by the processor, a trap to a handler based on obtaining the instruction; and
performing the determining and the storing by the handler.
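
For illustration only, the flow recited in claims 1 and 5 to 8 can be sketched as a small software model. This is a minimal sketch under stated assumptions, not the claimed implementation: the class `TocMachine`, its method names, the choice to key the pointer cache by instruction address, and the representation of the populated table as address ranges are all hypothetical and do not come from the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TocMachine:
    """Toy model of a processor executing a hypothetical set-pointer instruction."""

    # Table populated with reference data structure pointer values.
    # Assumption: each entry is (start, end, value) for code in [start, end).
    toc_table: List[Tuple[int, int, int]]
    # Reference data structure pointer cache.
    # Assumption: keyed by the address of the executing instruction.
    toc_cache: Dict[int, int] = field(default_factory=dict)
    registers: List[int] = field(default_factory=lambda: [0] * 32)
    traps: int = 0  # counts how often the handler was entered

    def _handler(self, insn_addr: int) -> int:
        """Trap handler: obtain the value from the populated table."""
        self.traps += 1
        for start, end, value in self.toc_table:
            if start <= insn_addr < end:
                return value
        raise LookupError(f"no pointer value known for address {insn_addr:#x}")

    def set_toc_register(self, insn_addr: int, target_reg: int) -> int:
        """Determine the pointer value and store it in the specified register."""
        value = self.toc_cache.get(insn_addr)
        if value is None:
            # Cache miss: trap to the handler, then remember the value.
            value = self._handler(insn_addr)
            self.toc_cache[insn_addr] = value
        # Store the value in the location named by the instruction.
        self.registers[target_reg] = value
        return value


if __name__ == "__main__":
    machine = TocMachine(toc_table=[(0x1000, 0x2000, 0xA000)])
    machine.set_toc_register(0x1004, 2)  # miss: handled by the trap handler
    machine.set_toc_register(0x1004, 2)  # hit: served from the cache
    print(hex(machine.registers[2]), machine.traps)
```

In this model the first execution at a given address misses the cache and enters the handler, while later executions at the same address are served from the cache without a trap. A hardware embodiment could equally resolve the miss without a software handler, as claim 9 contemplates.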
CN201880058321.3A 2017-09-19 2018-09-18 Systems, methods and media for facilitating processing within a computing environment Active CN111066006B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US15/708,207 2017-09-19
US15/708,207 US10884929B2 (en) 2017-09-19 2017-09-19 Set table of contents (TOC) register instruction
US15/822,801 2017-11-27
US15/822,801 US10884930B2 (en) 2017-09-19 2017-11-27 Set table of contents (TOC) register instruction
PCT/IB2018/057132 WO2019058250A1 (en) 2017-09-19 2018-09-18 Set table of contents (toc) register instruction

Publications (2)

Publication Number Publication Date
CN111066006A CN111066006A (en) 2020-04-24
CN111066006B true CN111066006B (en) 2023-08-15

Family

ID=65720248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880058321.3A Active CN111066006B (en) 2017-09-19 2018-09-18 Systems, methods and media for facilitating processing within a computing environment

Country Status (6)

Country Link
US (3) US10884929B2 (en)
JP (1) JP7059361B2 (en)
CN (1) CN111066006B (en)
DE (1) DE112018003586T5 (en)
GB (1) GB2581639B (en)
WO (1) WO2019058250A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10896030B2 (en) 2017-09-19 2021-01-19 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10713050B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing Table of Contents (TOC)-setting instructions in code with TOC predicting instructions
US10725918B2 (en) 2017-09-19 2020-07-28 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10884929B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US11061575B2 (en) * 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US20230052700A1 (en) * 2022-11-04 2023-02-16 Intel Corporation Memory expansion with persistent predictive prefetching

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5822787A (en) * 1994-06-30 1998-10-13 Sun Microsystems, Inc. Application binary interface and method of interfacing binary application program to digital computer including efficient acquistion of global offset table (GOT) absolute base address
US7802080B2 (en) * 2004-03-24 2010-09-21 Arm Limited Null exception handling
US9274769B1 (en) * 2014-09-05 2016-03-01 International Business Machines Corporation Table of contents pointer value save and restore placeholder positioning

Family Cites Families (162)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0337723A (en) 1989-07-05 1991-02-19 Hitachi Ltd information processing equipment
IL98248A0 (en) 1991-05-23 1992-06-21 Ibm Israel Instruction scheduler for a computer
US5581760A (en) 1992-07-06 1996-12-03 Microsoft Corporation Method and system for referring to and binding to objects using identifier objects
US5313634A (en) 1992-07-28 1994-05-17 International Business Machines Corporation Computer system branch prediction of subroutine returns
JP2883784B2 (en) 1993-04-27 1999-04-19 株式会社東芝 Microcomputer
US5604877A (en) 1994-01-04 1997-02-18 Intel Corporation Method and apparatus for resolving return from subroutine instructions in a computer processor
US5590329A (en) 1994-02-04 1996-12-31 Lucent Technologies Inc. Method and apparatus for detecting memory access errors
US6006324A (en) 1995-01-25 1999-12-21 Advanced Micro Devices, Inc. High performance superscalar alignment unit
JP3494736B2 (en) 1995-02-27 2004-02-09 株式会社ルネサステクノロジ Branch prediction system using branch destination buffer
US5896528A (en) 1995-03-03 1999-04-20 Fujitsu Limited Superscalar processor with multiple register windows and speculative return address generation
US6314561B1 (en) 1995-04-12 2001-11-06 International Business Machines Corporation Intelligent cache management mechanism
US5968169A (en) 1995-06-07 1999-10-19 Advanced Micro Devices, Inc. Superscalar microprocessor stack structure for judging validity of predicted subroutine return addresses
US5923882A (en) * 1995-08-29 1999-07-13 Silicon Graphics, Inc. Cross-module optimization for dynamically-shared programs and libraries
US5898864A (en) 1995-09-25 1999-04-27 International Business Machines Corporation Method and system for executing a context-altering instruction without performing a context-synchronization operation within high-performance processors
US5892936A (en) 1995-10-30 1999-04-06 Advanced Micro Devices, Inc. Speculative register file for storing speculative register states and removing dependencies between instructions utilizing the register
US5797014A (en) 1995-12-14 1998-08-18 International Business Machines Corporation Method for reducing processor cycles used for global offset table address computation in a position independent shared library
US5774722A (en) 1995-12-14 1998-06-30 International Business Machines Corporation Method for efficient external reference resolution in dynamically linked shared code libraries in single address space operating systems
US5815700A (en) 1995-12-22 1998-09-29 Intel Corporation Branch prediction table having pointers identifying other branches within common instruction cache lines
US6535903B2 (en) 1996-01-29 2003-03-18 Compaq Information Technologies Group, L.P. Method and apparatus for maintaining translated routine stack in a binary translation environment
US5815719A (en) 1996-05-07 1998-09-29 Sun Microsystems, Inc. Method and apparatus for easy insertion of assembler code for optimization
US5850543A (en) 1996-10-30 1998-12-15 Texas Instruments Incorporated Microprocessor with speculative instruction pipelining storing a speculative register value within branch target buffer for use in speculatively executing instructions after a return
US5996092A (en) 1996-12-05 1999-11-30 International Business Machines Corporation System and method for tracing program execution within a processor before and after a triggering event
US5898885A (en) 1997-03-31 1999-04-27 International Business Machines Corporation Method and system for executing a non-native stack-based instruction within a computer system
US5953736A (en) * 1997-04-23 1999-09-14 Sun Microsystems, Inc. Write barrier system and method including pointer-specific instruction variant replacement mechanism
US6157999A (en) 1997-06-03 2000-12-05 Motorola Inc. Data processing system having a synchronizing link stack and method thereof
US6195734B1 (en) 1997-07-02 2001-02-27 Micron Technology, Inc. System for implementing a graphic address remapping table as a virtual register file in system memory
US5961636A (en) 1997-09-22 1999-10-05 International Business Machines Corporation Checkpoint table for selective instruction flushing in a speculative execution unit
US6370090B1 (en) 1998-06-10 2002-04-09 U.S. Philips Corporation Method, device, and information structure for storing audio-centered information with a multi-level table-of-contents (toc) mechanism and doubling of area-tocs, a device for use with such mechanism and a unitary storage medium having such mechanism
AU4723699A (en) 1998-06-25 2000-01-10 Equator Technologies, Inc. Processing circuit and method for variable-length coding and decoding
US6591359B1 (en) 1998-12-31 2003-07-08 Intel Corporation Speculative renaming of data-processor registers
US6308322B1 (en) 1999-04-06 2001-10-23 Hewlett-Packard Company Method and apparatus for reduction of indirect branch instruction overhead through use of target address hints
US6446197B1 (en) 1999-10-01 2002-09-03 Hitachi, Ltd. Two modes for executing branch instructions of different lengths and use of branch control instruction and register set loaded with target instructions
US6442707B1 (en) 1999-10-29 2002-08-27 Advanced Micro Devices, Inc. Alternate fault handler
US6715064B1 (en) 2000-01-21 2004-03-30 Intel Corporation Method and apparatus for performing sequential executions of elements in cooperation with a transform
US6401181B1 (en) 2000-02-29 2002-06-04 International Business Machines Corporation Dynamic allocation of physical memory space
US6766442B1 (en) 2000-03-30 2004-07-20 International Business Machines Corporation Processor and method that predict condition register-dependent conditional branch instructions utilizing a potentially stale condition register value
US6691220B1 (en) 2000-06-06 2004-02-10 International Business Machines Corporation Multiprocessor speculation mechanism via a barrier speculation flag
US6625660B1 (en) 2000-06-06 2003-09-23 International Business Machines Corporation Multiprocessor speculation mechanism for efficiently managing multiple barrier operations
JP2002014868A (en) 2000-06-28 2002-01-18 Hitachi Ltd Microprocessor having memory reference operation detecting mechanism and compiling method
GB2367654B (en) 2000-10-05 2004-10-27 Advanced Risc Mach Ltd Storing stack operands in registers
US6880073B2 (en) 2000-12-28 2005-04-12 International Business Machines Corporation Speculative execution of instructions and processes before completion of preceding barrier operations
US6886093B2 (en) 2001-05-04 2005-04-26 Ip-First, Llc Speculative hybrid branch direction predictor
JP2003044273A (en) 2001-08-01 2003-02-14 Nec Corp Data processor and data processing method
CA2355989A1 (en) 2001-08-27 2003-02-27 Ibm Canada Limited-Ibm Canada Limitee Compiling source code files having multiple
AU2002361724B2 (en) 2001-09-03 2008-02-14 Kwh Pipe (Danmark) As A method of biologically purifying waste water and a plant preferably a mini purification plant to be used by the method
US7024538B2 (en) * 2001-12-21 2006-04-04 Hewlett-Packard Development Company, L.P. Processor multiple function units executing cycle specifying variable length instruction block and using common target block address updated pointers
US6973563B1 (en) 2002-01-04 2005-12-06 Advanced Micro Devices, Inc. Microprocessor including return prediction unit configured to determine whether a stored return address corresponds to more than one call instruction
US6883086B2 (en) 2002-03-06 2005-04-19 Intel Corporation Repair of mis-predicted load values
US6845442B1 (en) 2002-04-30 2005-01-18 Advanced Micro Devices, Inc. System and method of using speculative operand sources in order to speculatively bypass load-store operations
US7028166B2 (en) 2002-04-30 2006-04-11 Advanced Micro Devices, Inc. System and method for linking speculative results of load operations to register values
EP1387247A3 (en) 2002-07-31 2007-12-12 Texas Instruments Inc. System and method to automatically stack and unstack java local variables
US7089400B1 (en) 2002-08-29 2006-08-08 Advanced Micro Devices, Inc. Data speculation based on stack-relative addressing patterns
US7310799B2 (en) 2002-12-31 2007-12-18 International Business Machines Corporation Reducing load instructions via global data reordering
US7464254B2 (en) 2003-01-09 2008-12-09 Cisco Technology, Inc. Programmable processor apparatus integrating dedicated search registers and dedicated state machine registers with associated execution hardware to support rapid application of rulesets to data
US7024537B2 (en) 2003-01-21 2006-04-04 Advanced Micro Devices, Inc. Data speculation based on addressing patterns identifying dual-purpose register
CN100501848C (en) 2003-02-13 2009-06-17 道格卡森联合公司 Identification tag for tracking layers in a multi-layer optical disc
US6965983B2 (en) 2003-02-16 2005-11-15 Faraday Technology Corp. Simultaneously setting prefetch address and fetch address pipelined stages upon branch
US7017028B2 (en) 2003-03-14 2006-03-21 International Business Machines Corporation Apparatus and method for updating pointers for indirect and parallel register access
US7133977B2 (en) 2003-06-13 2006-11-07 Microsoft Corporation Scalable rundown protection for object lifetime management
US7263600B2 (en) 2004-05-05 2007-08-28 Advanced Micro Devices, Inc. System and method for validating a memory file that links speculative results of load operations to register values
US7296136B1 (en) 2004-06-04 2007-11-13 Hewlett-Packard Development Company, L.P. Methods and systems for loading data from memory
US7412710B2 (en) 2004-11-12 2008-08-12 Red Hat, Inc. System, method, and medium for efficiently obtaining the addresses of thread-local variables
US8706475B2 (en) 2005-01-10 2014-04-22 Xerox Corporation Method and apparatus for detecting a table of contents and reference determination
US8223600B2 (en) 2005-04-06 2012-07-17 Quantum Corporation Network-attachable, file-accessible storage drive
US7366887B2 (en) 2005-07-11 2008-04-29 Lenovo (Singapore) Pte. Ltd. System and method for loading programs from HDD independent of operating system
US8601001B2 (en) 2005-07-28 2013-12-03 The Boeing Company Selectively structuring a table of contents for accessing a database
US20070088937A1 (en) 2005-10-13 2007-04-19 International Business Machines Corporation Computer-implemented method and processing unit for predicting branch target addresses
US7688686B2 (en) 2005-10-27 2010-03-30 Microsoft Corporation Enhanced table of contents (TOC) identifiers
US7890941B1 (en) 2005-11-10 2011-02-15 Oracle America, Inc. Binary profile instrumentation framework
JP4978025B2 (en) 2006-02-24 2012-07-18 株式会社日立製作所 Pointer compression / decompression method, program for executing the same, and computer system using the same
US7590826B2 (en) 2006-11-06 2009-09-15 Arm Limited Speculative data value usage
US7444501B2 (en) 2006-11-28 2008-10-28 Qualcomm Incorporated Methods and apparatus for recognizing a subroutine call
US7953996B2 (en) 2006-12-18 2011-05-31 Hewlett-Packard Development Company, L.P. ACPI to firmware interface
US8370606B2 (en) 2007-03-16 2013-02-05 Atmel Corporation Switching data pointers based on context
US8701187B2 (en) * 2007-03-29 2014-04-15 Intel Corporation Runtime integrity chain verification
JP5085180B2 (en) 2007-04-24 2012-11-28 株式会社東芝 Information processing apparatus and access control method
US8166279B2 (en) 2007-05-03 2012-04-24 International Business Machines Corporation Method for predictive decoding of a load tagged pointer instruction
JP2008299795A (en) 2007-06-04 2008-12-11 Nec Electronics Corp Branch prediction controller and method thereof
US7809933B2 (en) 2007-06-07 2010-10-05 International Business Machines Corporation System and method for optimizing branch logic for handling hard to predict indirect branches
US8364973B2 (en) 2007-12-31 2013-01-29 Intel Corporation Dynamic generation of integrity manifest for run-time verification of software program
US8397014B2 (en) 2008-02-04 2013-03-12 Apple Inc. Memory mapping restore and garbage collection operations
US7882338B2 (en) 2008-02-20 2011-02-01 International Business Machines Corporation Method, system and computer program product for an implicit predicted return from a predicted subroutine
US8600942B2 (en) 2008-03-31 2013-12-03 Thomson Reuters Global Resources Systems and methods for tables of contents
US8078850B2 (en) 2008-04-24 2011-12-13 International Business Machines Corporation Branch prediction technique using instruction for resetting result table pointer
US8639913B2 (en) 2008-05-21 2014-01-28 Qualcomm Incorporated Multi-mode register file for use in branch prediction
CN101763248A (en) 2008-12-25 2010-06-30 世意法(北京)半导体研发有限责任公司 System and method for multi-mode branch predictor
US8150859B2 (en) 2010-02-05 2012-04-03 Microsoft Corporation Semantic table of contents for search results
CA2702354A1 (en) 2010-05-19 2010-10-07 IBM Canada Limited - IBM Canada Limitée Improved setjmp/longjmp for speculative execution frameworks
US8713529B2 (en) 2010-07-30 2014-04-29 Red Hat, Inc. Replacing memory pointers with implicit pointers to be used in compiler-generated debug output
DE102010045800A1 (en) 2010-09-20 2012-03-22 Texas Instruments Deutschland Gmbh Electronic device for data processing, has control stage that controls switch for connecting one of feedback paths if data output of execution unit in operation of execution unit is utilized as operand
US8769539B2 (en) 2010-11-16 2014-07-01 Advanced Micro Devices, Inc. Scheduling scheme for load/store operations
US9552206B2 (en) 2010-11-18 2017-01-24 Texas Instruments Incorporated Integrated circuit with control node circuitry and processing circuitry
US8725989B2 (en) 2010-12-09 2014-05-13 Intel Corporation Performing function calls using single instruction multiple data (SIMD) registers
US8997066B2 (en) 2010-12-27 2015-03-31 Microsoft Technology Licensing, Llc Emulating pointers
US9898291B2 (en) 2011-04-07 2018-02-20 Via Technologies, Inc. Microprocessor with arm and X86 instruction length decoders
US8930657B2 (en) 2011-07-18 2015-01-06 Infineon Technologies Ag Method and apparatus for realtime detection of heap memory corruption by buffer overruns
US8612959B2 (en) 2011-10-03 2013-12-17 International Business Machines Corporation Linking code for an enhanced application binary interface (ABI) with decode time instruction optimization
US9329869B2 (en) 2011-10-03 2016-05-03 International Business Machines Corporation Prefix computer instruction for compatibily extending instruction functionality
WO2013095510A1 (en) 2011-12-22 2013-06-27 Intel Corporation Packed data operation mask concatenation processors, methods, systems, and instructions
US8996788B2 (en) 2012-02-09 2015-03-31 Densbits Technologies Ltd. Configurable flash interface
US9063759B2 (en) 2012-03-28 2015-06-23 International Business Machines Corporation Optimizing subroutine calls based on architecture level of called subroutine
JP5598493B2 (en) 2012-03-30 2014-10-01 富士通株式会社 Information processing device, arithmetic device, and information transfer method
US9058192B2 (en) 2012-08-09 2015-06-16 Advanced Micro Devices, Inc. Handling pointers in program code in a system that supports multiple address spaces
US9471514B1 (en) * 2012-08-23 2016-10-18 Palo Alto Networks, Inc. Mitigation of cyber attacks by pointer obfuscation
WO2014068779A1 (en) 2012-11-05 2014-05-08 株式会社モルフォ Image processing device, image processing method, image processing program, and storage medium
GB201300608D0 (en) 2013-01-14 2013-02-27 Imagination Tech Ltd Indirect branch prediction
WO2015011567A2 (en) 2013-07-24 2015-01-29 Marvell World Trade Ltd Method and system for compiler optimization
US9858081B2 (en) 2013-08-12 2018-01-02 International Business Machines Corporation Global branch prediction using branch and fetch group history
CN104423929B (en) 2013-08-21 2017-07-14 华为技术有限公司 Branch prediction method and related apparatus
JP2015049832A (en) 2013-09-04 2015-03-16 International Business Machines Corporation Method, device, and program for reducing constant load overhead
US9524163B2 (en) 2013-10-15 2016-12-20 Mill Computing, Inc. Computer processor employing hardware-based pointer processing
GB201319525D0 (en) 2013-11-05 2013-12-18 Optibiotix Health Ltd Composition
GB2518022B (en) 2014-01-17 2015-09-23 Imagination Tech Ltd Stack saved variable value prediction
GB2518912B (en) 2014-01-17 2015-08-26 Imagination Tech Ltd Stack pointer value prediction
US9110675B1 (en) 2014-03-12 2015-08-18 International Business Machines Corporation Usage of TOC register as application register
US9021511B1 (en) 2014-03-14 2015-04-28 International Business Machines Corporation Runtime management of TOC pointer save and restore commands
US9256546B2 (en) 2014-03-31 2016-02-09 International Business Machines Corporation Transparent code patching including updating of address translation structures
US9471292B2 (en) 2014-04-18 2016-10-18 Intel Corporation Binary translation reuse in a system with address space layout randomization
US9329875B2 (en) 2014-04-28 2016-05-03 International Business Machines Corporation Global entry point and local entry point for callee function
US9922002B2 (en) 2014-04-28 2018-03-20 Palo Alto Research Center Incorporated Efficient representations of graphs with multiple edge types
US9329850B2 (en) 2014-06-24 2016-05-03 International Business Machines Corporation Relocation of instructions that use relative addressing
US9760496B2 (en) 2014-07-21 2017-09-12 Via Alliance Semiconductor Co., Ltd. Simultaneous invalidation of all address translation cache entries associated with an X86 process context identifier
US20160055003A1 (en) 2014-08-19 2016-02-25 Qualcomm Incorporated Branch prediction using least-recently-used (lru)-class linked list branch predictors, and related circuits, methods, and computer-readable media
US20160062655A1 (en) 2014-08-28 2016-03-03 Endgame, Inc. System and Method for Improved Memory Allocation in a Computer System
US9146715B1 (en) 2014-09-05 2015-09-29 International Business Machines Corporation Suppression of table of contents save actions
US9250881B1 (en) 2014-09-30 2016-02-02 International Business Machines Corporation Selection of an entry point of a function having multiple entry points
EP3012762A1 (en) 2014-10-24 2016-04-27 Thomson Licensing Control flow graph flattening device and method
US9354947B2 (en) 2014-10-28 2016-05-31 International Business Machines Corporation Linking a function with dual entry points
US9395964B2 (en) 2014-10-30 2016-07-19 International Business Machines Corporation Rewriting symbol address initialization sequences
US9858411B2 (en) 2014-12-19 2018-01-02 Intel Corporation Execution profiling mechanism
US9244663B1 (en) 2014-12-22 2016-01-26 International Business Machines Corporation Managing table of contents pointer value saves
US9569613B2 (en) 2014-12-23 2017-02-14 Intel Corporation Techniques for enforcing control flow integrity using binary translation
US9513832B2 (en) 2015-03-25 2016-12-06 International Business Machines Corporation Accessing global data from accelerator devices
US9880833B2 (en) 2015-06-30 2018-01-30 International Business Machines Corporation Initialization status of a register employed as a pointer
GB2540206B (en) * 2015-07-10 2018-02-07 Advanced Risc Mach Ltd Apparatus and method for executing instruction using range information associated with a pointer
US9817729B2 (en) 2015-07-30 2017-11-14 Zerto Ltd. Method for restoring files from a continuous recovery system
GB2540948B (en) 2015-07-31 2021-09-15 Advanced Risc Mach Ltd Apparatus with reduced hardware register set
GB2541714B (en) 2015-08-27 2018-02-14 Advanced Risc Mach Ltd An apparatus and method for controlling instruction execution behaviour
US20170147161A1 (en) 2015-11-24 2017-05-25 Nomad Technologies, Inc. Dynamic Table of Contents of Media
EP3188039A1 (en) 2015-12-31 2017-07-05 Dassault Systèmes Recommendations based on predictive model
US10380025B2 (en) 2016-01-19 2019-08-13 Hewlett Packard Enterprise Development Lp Accessing objects via object references
US9996294B2 (en) 2016-02-02 2018-06-12 International Business Machines Corporation Dynamically managing a table of contents
US10223295B2 (en) 2016-03-10 2019-03-05 Microsoft Technology Licensing, Llc Protected pointers
US9952844B1 (en) 2016-10-24 2018-04-24 International Business Machines Corporation Executing optimized local entry points and function call sites
US20180113689A1 (en) 2016-10-24 2018-04-26 International Business Machines Corporation Local Function Call Site Optimization
US10169016B2 (en) 2016-10-24 2019-01-01 International Business Machines Corporation Executing optimized local entry points
US10108406B2 (en) 2016-10-24 2018-10-23 International Business Machines Corporation Linking optimized entry points for local-use-only function pointers
US10360005B2 (en) 2016-10-24 2019-07-23 International Business Machines Corporation Local function call tailoring for function pointer calls
US10108407B2 (en) 2016-10-24 2018-10-23 International Business Machines Corporation Loading optimized local entry points for local-use-only function pointers
US10108404B2 (en) 2016-10-24 2018-10-23 International Business Machines Corporation Compiling optimized entry points for local-use-only function pointers
US10534593B2 (en) 2016-10-24 2020-01-14 International Business Machines Corporation Optimized entry points and local function call tailoring for function pointers
US10169011B2 (en) 2016-10-24 2019-01-01 International Business Machines Corporation Comparisons in function pointer localization
US10268465B2 (en) 2016-10-24 2019-04-23 International Business Machines Corporation Executing local function call site optimization
US10778795B2 (en) 2017-01-30 2020-09-15 Microsoft Technology Licensing, Llc Synchronization of property values between a client and a server
US10572404B2 (en) 2017-06-30 2020-02-25 Intel Corporation Cyclic buffer pointer fixing
US10725918B2 (en) 2017-09-19 2020-07-28 International Business Machines Corporation Table of contents cache entry having a pointer for a range of addresses
US10896030B2 (en) 2017-09-19 2021-01-19 International Business Machines Corporation Code generation relating to providing table of contents pointer values
US10705973B2 (en) 2017-09-19 2020-07-07 International Business Machines Corporation Initializing a data structure for use in predicting table of contents pointer values
US11061575B2 (en) 2017-09-19 2021-07-13 International Business Machines Corporation Read-only table of contents register
US10884929B2 (en) 2017-09-19 2021-01-05 International Business Machines Corporation Set table of contents (TOC) register instruction
US10620955B2 (en) 2017-09-19 2020-04-14 International Business Machines Corporation Predicting a table of contents pointer value responsive to branching to a subroutine
US10713050B2 (en) 2017-09-19 2020-07-14 International Business Machines Corporation Replacing Table of Contents (TOC)-setting instructions in code with TOC predicting instructions

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5822787A (en) * 1994-06-30 1998-10-13 Sun Microsystems, Inc. Application binary interface and method of interfacing binary application program to digital computer including efficient acquistion of global offset table (GOT) absolute base address
US7802080B2 (en) * 2004-03-24 2010-09-21 Arm Limited Null exception handling
US9274769B1 (en) * 2014-09-05 2016-03-01 International Business Machines Corporation Table of contents pointer value save and restore placeholder positioning

Also Published As

Publication number Publication date
US10884929B2 (en) 2021-01-05
US10884930B2 (en) 2021-01-05
DE112018003586T5 (en) 2020-03-26
US20190377680A1 (en) 2019-12-12
JP2020534596A (en) 2020-11-26
JP7059361B2 (en) 2022-04-25
CN111066006A (en) 2020-04-24
US20190087334A1 (en) 2019-03-21
GB2581639B (en) 2022-01-05
US11138113B2 (en) 2021-10-05
GB2581639A (en) 2020-08-26
WO2019058250A1 (en) 2019-03-28
US20190087336A1 (en) 2019-03-21
GB202005423D0 (en) 2020-05-27

Similar Documents

Publication Publication Date Title
US11138127B2 (en) Initializing a data structure for use in predicting table of contents pointer values
CN111066006B (en) Systems, methods and media for facilitating processing within a computing environment
US11010164B2 (en) Predicting a table of contents pointer value responsive to branching to a subroutine
US10963382B2 (en) Table of contents cache entry having a pointer for a range of addresses
US10831457B2 (en) Code generation relating to providing table of contents pointer values
US10713051B2 (en) Replacing table of contents (TOC)-setting instructions in code with TOC predicting instructions
US11061576B2 (en) Read-only table of contents register
CN110352404B (en) Comparison string processing through micro-operation extension based on inline decoding
US10747532B2 (en) Selecting processing based on expected value of selected character

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant