CN104937569A - Processing device with address translation probing and methods - Google Patents

Info

Publication number
CN104937569A
Authority
CN
China
Prior art keywords
tlb
processing unit
address translation
requested
hierarchy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201380071074.8A
Other languages
Chinese (zh)
Inventor
莉萨·徐
努万·贾亚塞纳
安德鲁·凯格尔
布拉德福德·M·贝克曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced Micro Devices Inc
Original Assignee
Advanced Micro Devices Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced Micro Devices Inc
Publication of CN104937569A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1081 Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/30 Providing cache or TLB in specific location of a processing system
    • G06F2212/302 In image processor or graphics adapter
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/50 Control mechanisms for virtual memory, cache or TLB
    • G06F2212/507 Control mechanisms for virtual memory, cache or TLB using speculative control
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/68 Details of translation look-aside buffer [TLB]
    • G06F2212/681 Multi-level TLB, e.g. microTLB and main TLB
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/68 Details of translation look-aside buffer [TLB]
    • G06F2212/684 TLB miss handling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A data processing device is provided that employs multiple translation look-aside buffers (TLBs) associated with respective processors that are configured to store selected address translations of a page table of a memory shared by the processors. The processing device is configured such that when an address translation is requested by a processor and is not found in the TLB associated with that processor, another TLB is probed for the requested address translation. The probe across to the other TLB may occur in advance of a walk of the page table for the requested address or alternatively a walk can be initiated concurrently with the probe. Where the probe successfully finds the requested address translation, the page table walk can be avoided or discontinued.

Description

Processing device with address translation probing and methods

Cross-Reference to Related Applications

This application claims the benefit of U.S. Application No. 13/723,379, filed December 21, 2012, the contents of which are incorporated herein by reference.

Field

The present disclosure is generally directed to data processing devices that employ translation look-aside buffers (TLBs) in connection with address translations of a memory page table.

Background

The use of virtual addressing is well known in the field of data processing. Conventionally, a page table is provided to maintain a list of virtual addresses together with the corresponding physical memory locations of the data associated with those virtual addresses, i.e., address translations. The process of obtaining, from the page table's list of virtual addresses, the physical memory address at which data is stored is known as a page table walk.

Virtual pages of 4 KB and 8 KB are commonly used, but larger sizes such as 2 MB, 4 MB, or more can be used. A page table walk can take a significant amount of time compared with the operating speed of a processor. To avoid performing a page table walk to obtain every physical address of stored data that corresponds to a virtual address (i.e., every translation), translation look-aside buffers (TLBs) are used to store address translations that the processor is using or is likely to use.
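To make the notion of a translation concrete, the following minimal sketch splits a virtual address into a virtual page number and a page offset, assuming 4 KB pages. The names `PAGE_SIZE`, `split_address`, and `translate`, and the dictionary standing in for a page table, are illustrative assumptions and do not come from the disclosure.

```python
PAGE_SIZE = 4096  # assumed 4 KB page; a larger page only changes this constant


def split_address(vaddr: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return (virtual page number, offset within the page)."""
    return vaddr // page_size, vaddr % page_size


def translate(vaddr: int, page_table: dict[int, int], page_size: int = PAGE_SIZE) -> int:
    """Map a virtual address to a physical address via a flat, hypothetical page table."""
    vpn, offset = split_address(vaddr, page_size)
    frame = page_table[vpn]  # a KeyError here models an unmapped page
    return frame * page_size + offset


toy_page_table = {0x12: 0x7A}  # virtual page 0x12 maps to physical frame 0x7A
physical = translate(0x12ABC, toy_page_table)
```

Only the page number takes part in the translation; the offset is carried over unchanged, which is why a TLB needs one entry per page rather than one per address.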

Conventionally, a processor first looks for a desired address translation in a TLB associated with the memory page table. If the desired address translation is found, the page table does not need to be walked. If it is not found, the page table is walked to find the address translation. In some cases, multiple levels of TLBs, which may serve different functions, are disposed between the processor and the page table, so that the processor descends through the levels of the hierarchy looking for the desired translation and performs a page table walk only if the translation is not found while descending through the TLB hierarchy.
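The conventional lookup order described above can be sketched as follows. This is a simplified model under stated assumptions: each TLB is an unbounded dictionary with no eviction policy, the page table is a flat dictionary, and the class and function names are hypothetical.

```python
from typing import Optional


class TLB:
    """A toy TLB modeled as a dict keyed by virtual page number (no eviction)."""

    def __init__(self, name: str):
        self.name = name
        self.entries: dict[int, int] = {}

    def lookup(self, vpn: int) -> Optional[int]:
        return self.entries.get(vpn)


def translate(vpn: int, hierarchy: list[TLB], page_table: dict[int, int]) -> tuple[int, str]:
    """Descend the TLB levels in order; walk the page table only on a full miss."""
    for tlb in hierarchy:
        frame = tlb.lookup(vpn)
        if frame is not None:
            return frame, tlb.name
    return page_table[vpn], "page table walk"


l1, l2 = TLB("L1"), TLB("L2")
l2.entries[5] = 50
page_table = {5: 50, 9: 90}

hit = translate(5, [l1, l2], page_table)   # found at the second level
miss = translate(9, [l1, l2], page_table)  # falls through to the walk
```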

Page tables and TLBs are generally employed in any type of data processor, including but not limited to central processing units (CPUs), graphics processing units (GPUs), and accelerated processing units (APUs).

Summary of Exemplary Embodiments

A data processing device is provided that employs multiple translation look-aside buffers (TLBs) associated with respective processors, the TLBs being configured to store selected address translations of a page table of a memory shared by the processors. The processing device is configured such that when a processor requests an address translation that is not found in the TLB associated with that processor, another TLB is probed for the requested address translation. The probe of the other TLB may occur before a walk of the page table for the requested address, or the page table walk may be started concurrently with the probe. Where the probe successfully finds the requested address translation, the page table walk can be avoided or discontinued.

In one embodiment, a data processing device includes first and second processing units and a page table of a shared memory. A first translation look-aside buffer (TLB) hierarchy is configured to store selected address translations of the page table for use by the first processing unit. A second TLB hierarchy is configured to store selected address translations of the page table for use by the second processing unit. The processing device is configured to probe the second TLB hierarchy for an address translation requested by the first processing unit on the condition that the requested address translation is not found in the first TLB hierarchy.

Such a data processing device can be configured to walk the page table for the requested address translation on the condition that the requested address translation is not found in the second TLB hierarchy. Alternatively, the data processing device can be configured to start a walk of the page table for the requested address translation concurrently with the probe of the second TLB hierarchy.
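The first of these two configurations, in which the walk happens only after the probe has also missed, can be sketched as below. The sketch assumes dictionary-backed TLB levels and a flat page table; the function name and the `stats` counters are hypothetical and serve only to show which path served each request.

```python
def translate_with_probe(vpn, own_tlbs, peer_tlbs, page_table, stats):
    """Search the requester's own hierarchy, then probe the peer hierarchy, then walk."""
    for tlb in own_tlbs:
        if vpn in tlb:
            stats["own_hit"] += 1
            return tlb[vpn]
    for tlb in peer_tlbs:  # the probe of the other hierarchy
        if vpn in tlb:
            stats["probe_hit"] += 1
            return tlb[vpn]
    stats["walk"] += 1  # the walk starts only after the probe also misses
    return page_table[vpn]


page_table = {1: 11, 2: 22, 3: 33}
first_hierarchy = [{1: 11}]   # holds a translation for page 1
second_hierarchy = [{2: 22}]  # holds a translation for page 2
stats = {"own_hit": 0, "probe_hit": 0, "walk": 0}

results = [
    translate_with_probe(v, first_hierarchy, second_hierarchy, page_table, stats)
    for v in (1, 2, 3)
]
```

In this run, page 1 is served locally, page 2 by the probe, and only page 3 requires a walk.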

In some embodiments, the data processing device is also configured to probe the first TLB hierarchy for an address translation requested by the second processing unit on the condition that the address translation requested by the second processing unit is not found in the second TLB hierarchy. In this case, the data processing device can be configured to walk the page table for the address translation requested by the second processing unit on the condition that the address translation is not found in the probe of the first TLB hierarchy. Alternatively, the data processing device can be configured to start a walk of the page table for the address translation requested by the second processing unit concurrently with the probe of the first TLB hierarchy.

The processing units can be of the same or different types, such as central processing units (CPUs), graphics processing units (GPUs), accelerated processing units (APUs), or some combination thereof. Either or both of the first and second processing units can be one of a plurality of first or second processing units, respectively, associated with the respective TLB hierarchy.

In one example, all of the first processing units are CPUs, and the first TLB hierarchy includes a separate level-one TLB associated with each CPU, which stores selected address translations of the page table for use by that CPU, and a level-two TLB associated with the level-one TLBs. In this example, the processing device is configured such that an unsuccessful search of the level-one TLB associated with the first processing unit and of the level-two TLB for the address translation requested by the first processing unit is the condition for probing the second TLB hierarchy for that address translation.

In another example, all of the first processing units are GPUs, and the first TLB hierarchy includes a single TLB associated with all of the GPUs, which stores selected address translations of the page table for use by the GPUs. In this example, the processing device is configured such that an unsuccessful search of the single TLB for the address translation requested by the first processing unit is the condition for probing the second TLB hierarchy for that address translation.

The data processing device can have a plurality of second processing units that includes the second processing unit, where the second TLB hierarchy includes a separate level-one TLB associated with each of the second processing units, which stores selected address translations of the page table for use by that second processing unit, and a level-two TLB associated with the level-one TLBs. In this case, the processing device is configured to probe the second TLB hierarchy for the address translation requested by the first processing unit by searching the level-one TLB associated with a selected second processing unit and the level-two TLB.

In another embodiment, a data processing device includes a first translation look-aside buffer (TLB) configured to store selected address translations of a page table for use by a first processing unit, and a second TLB configured to store selected address translations of the page table for use by a second processing unit. The processing device is configured to probe the second TLB for an address translation requested by the first processing unit on the condition that the requested address translation is not found in the first TLB.

Such a data processing device can include a level-two TLB, where the first and second TLBs form a first level of a TLB hierarchy configured to store selected address translations of the page table for use by the first and second processing units, and the level-two TLB forms a second level of the TLB hierarchy. In this case, the data processing device can be configured to walk the page table for the requested address translation on the condition that the requested address translation is not found in the probe of the second TLB or in the level-two TLB. As an alternative, the data processing device can be configured to start a walk of the page table for the requested address translation concurrently with the probe of the second TLB, on the condition that the requested address translation is not found in the level-two TLB.

Methods corresponding to the example embodiments and alternatives are also provided.

In addition, another embodiment provides a non-transitory computer-readable storage medium storing a set of instructions for execution by a general-purpose computer to facilitate the manufacture of a selectively designed integrated circuit. The non-transitory computer-readable storage medium contains instructions that are hardware description language (HDL) instructions used for the manufacture of a device including one or more aspects of the embodiments.

Such an integrated circuit can include a first TLB hierarchy configured to store selected address translations of a page table for use by a first processing unit, and a second TLB hierarchy configured to store selected address translations of the page table for use by a second processing unit. In this case, the integrated circuit is configured to probe the second TLB hierarchy for an address translation requested by the first processing unit on the condition that the requested address translation is not found in the first TLB hierarchy.

As an alternative, the integrated circuit can include a first translation look-aside buffer (TLB) configured to store selected address translations of a page table for use by a first processing unit, and a second TLB configured to store selected address translations of the page table for use by a second processing unit. In this case, the integrated circuit is configured to probe the second TLB for an address translation requested by the first processing unit on the condition that the requested address translation is not found in the first TLB.

Brief Description of the Drawings

A more detailed understanding can be had from the following description, given by way of example in conjunction with the accompanying drawings.

FIG. 1 is a block diagram of a processing device in which one or more disclosed embodiments may be implemented.

FIG. 2 is a block diagram of another processing device in which one or more disclosed embodiments may be implemented.

FIG. 3 is a block diagram of yet another processing device in which one or more disclosed embodiments may be implemented.

Detailed Description

Data processing devices are ubiquitous today and are incorporated into a vast number of different types of products. Generally, processors are configured to utilize translation look-aside buffers (TLBs), which are configured to store selected address translations of a memory page table, in order to avoid walking the page table each time the processor requests an address translation stored in the page table.

Referring to FIG. 1, an example data processing device 100 is illustrated in which a plurality of processors share a memory 110 that includes a page table 120. In the example illustrated in FIG. 1, the processing device 100 includes a first TLB hierarchy 130 configured to store selected address translations of the page table 120 for a first group 140 of first processing units 141, 142. The processing device 100 also includes a second TLB hierarchy 150 configured to store selected address translations of the page table 120 for a second group 160 of second processing units 161, 162.

In operation, selected address translations of the page table 120 are stored in the first TLB hierarchy 130 for the first group 140 of first processing units 141, 142. Selected address translations of the page table 120 are also stored in the second TLB hierarchy 150 for the second group 160 of second processing units 161, 162. In connection with data processing, the processing units 141, 142, 161, 162 first look for a desired address translation in their respective TLB hierarchies 130, 150 associated with the memory page table. Where the TLB hierarchies 130, 150 include multiple levels of TLBs, a processing unit descends through the levels of its hierarchy looking for the desired address translation.

If the desired address translation is found, the page table 120 does not need to be walked. If the desired address translation is not found, the page table 120 may need to be walked to find the address translation.

To avoid, where possible, having to wait for the result of a page table walk, the processing device 100 is configured to probe 170 the second TLB hierarchy 150 for an address translation requested by one of the first processing units 141, 142 of the first group 140 on the condition that the requested address translation is not found in the first TLB hierarchy 130. The data processing device 100 is further configured to walk the page table 120 for the requested address translation on the condition that the requested address translation is not found in the probe 170 of the second TLB hierarchy 150. As an alternative, the data processing device 100 can be configured to start a walk of the page table 120 for the requested address translation concurrently with the probe 170 of the second TLB hierarchy 150.
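The alternative in which the walk starts concurrently with the probe can be illustrated with a software analogy. In this hedged sketch, a worker thread stands in for the page table walker and checks a cancellation flag between simulated memory accesses; the names `slow_walk` and `translate_concurrent`, the step count, and the sleep interval are all assumptions chosen for illustration and say nothing about how the hardware is actually built.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_walk(vpn, page_table, cancel, steps=4):
    """Simulated multi-step walk that checks for cancellation between steps."""
    for _ in range(steps):
        if cancel.is_set():
            return None  # walk discontinued
        time.sleep(0.01)  # stand-in for one memory access
    return page_table[vpn]


def translate_concurrent(vpn, peer_tlb, page_table):
    """Start the walk immediately, probe the peer TLB meanwhile, keep the first answer."""
    cancel = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        walk = pool.submit(slow_walk, vpn, page_table, cancel)
        frame = peer_tlb.get(vpn)  # the probe
        if frame is not None:
            cancel.set()   # probe hit: discontinue the walk
            walk.result()  # wait for the walker to exit; its response is ignored
            return frame, "probe"
        return walk.result(), "walk"


page_table = {7: 70, 8: 80}
peer_tlb = {7: 70}

on_probe_hit = translate_concurrent(7, peer_tlb, page_table)
on_probe_miss = translate_concurrent(8, peer_tlb, page_table)
```

Compared with probing first, this ordering removes the probe latency from the miss path at the cost of starting walks that may later be discarded.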

In one embodiment, the data processing device 100 is also configured to probe 172 the first TLB hierarchy 130 for an address translation requested by one of the second processing units 161, 162 on the condition that the address translation requested by the second processing unit 161, 162 is not found in the second TLB hierarchy 150. In this case, the data processing device 100 can be configured to walk the page table 120 for the address translation requested by the second processing unit 161, 162 on the condition that the address translation is not found in the probe 172 of the first TLB hierarchy 130. Alternatively, the data processing device 100 can be configured to start a walk of the page table 120 for the address translation requested by the second processing unit 161, 162 concurrently with the probe 172 of the first TLB hierarchy 130.

The processing units 141, 142, 161, 162 may be of any type, such as central processing units (CPUs), graphics processing units (GPUs), or accelerated processing units (APUs). In one embodiment, the processing units within each group 140, 160 are of the same type, but processors of different types may be included in the same group. Although each group 140, 160 is shown as containing two processing units, this is for illustrative purposes. Groups of processing units of different sizes may be used. In addition, each group 140, 160 may include more than two processors, or one or both of the groups 140, 160 may have only a single processor.

The processing device 100 may include more than two groups of processing units, with respective TLB hierarchies, that share the memory. In this case, a translation request from any processing unit that cannot be satisfied by the TLB hierarchy associated with that processing unit can cause all, or a subset, of the other TLB hierarchies to be probed for the requested address translation. In one embodiment, if a hit is encountered in any probed TLB hierarchy, the page table walk is avoided. Alternatively, the probes can be issued at the same time that the page table walk is started. In this case, if a translation hit is returned by any probe, the page table walk is aborted or its response is ignored.
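Probing all or only a subset of the other hierarchies can be sketched as follows. The hierarchy names (`"gpu"`, `"dsp"`), the dictionary-based TLB levels, and the function name `probe_peers` are hypothetical; how a real device would choose the subset is not specified here.

```python
def probe_peers(vpn, peers, subset=None):
    """Probe every peer hierarchy, or only those named in `subset`.

    `peers` maps a hierarchy name to its list of TLB levels (dicts).
    Returns (frame, name of the hierarchy that hit) or (None, None).
    """
    names = peers.keys() if subset is None else subset
    for name in names:
        for level in peers[name]:
            if vpn in level:
                return level[vpn], name
    return None, None


peers = {
    "gpu": [{4: 40}],           # single-level hierarchy
    "dsp": [{6: 60}, {7: 70}],  # two-level hierarchy
}

probe_all = probe_peers(7, peers)                     # every peer is searched
probe_some = probe_peers(7, peers, subset=["gpu"])    # page 7 is missed
```

A miss from `probe_peers` corresponds to the case in which the page table walk is still required.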

Although two TLB hierarchies 130, 150 are shown, the processing device 100 may include additional TLB hierarchies configured to store selected address translations of the page table 120 for respective additional groups of processing units. In addition, the TLB hierarchies may be configured to store selected address translations of multiple page tables associated with the memory 110 and/or other memories. The TLB hierarchies may also be configured with a single level or with multiple levels.

Referring to FIG. 2, a further, more specific example of an embodiment is illustrated. In FIG. 2, an example data processing device 200 has a shared memory 210 that includes a page table 220. The processing device 200 includes a first TLB hierarchy 230 configured to store selected address translations of the page table 220 for a group 240 of central processing units (CPUs) 241, 242. The TLB hierarchy 230 for the CPU group 240 is a multi-level hierarchy that includes level-one TLBs 231, 232 associated with the respective CPUs 241, 242 of the CPU group 240 and a level-two TLB 280 associated with the level-one TLBs 231, 232.

Variations in the configuration of the TLB hierarchy 230 and/or the CPU group 240 will be apparent to those skilled in the art. For example, where additional CPUs are included in the CPU group 240, the TLB hierarchy 230 may include an additional level-one TLB associated with each additional CPU, together with the level-two TLB 280, as indicated in phantom. In addition, the TLB hierarchy 230 may include more than two levels.

In the example of FIG. 2, the processing device 200 also includes a second TLB hierarchy 250 configured to store selected address translations of the page table 220 for a group 260 of graphics processing units (GPUs) 261, 262. In this example, the TLB hierarchy 250 for the GPU group 260 is a single-level hierarchy having a single TLB associated with all of the GPUs 261, 262 of the GPU group 260.

Variations in the configuration of the TLB hierarchy 250 and/or the GPU group 260 will be apparent to those skilled in the art. For example, where additional GPUs are included in the GPU group 260, each additional GPU is associated with the TLB 250, as indicated in phantom.

In operation, selected address translations of the page table 220 are stored in the first TLB hierarchy 230 for the group 240 of CPUs 241, 242. Selected address translations of the page table 220 are also stored in the second TLB hierarchy 250 for the group 260 of GPUs 261, 262. In connection with data processing, the CPUs 241, 242 and the GPUs 261, 262 first look for a desired address translation in their respective TLB hierarchies 230, 250. With respect to the CPU TLB hierarchy 230, the CPUs 241, 242 first look in their respectively associated level-one TLBs 231, 232 and then look for the desired address translation in the level-two TLB 280.

If the desired address translation is found, the page table 220 does not need to be walked. If the desired address translation is not found, the page table 220 needs to be walked to find the address translation. Such a page table walk can be very time consuming. For example, in a non-virtualized x86 system, up to four memory accesses may be required to retrieve the desired translation. In a virtualized system with nested page table walks, the number of memory accesses can grow to 24, which is very expensive.
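The figures of four and 24 accesses are consistent with a commonly used counting model for nested (two-dimensional) page walks, sketched below. The model assumes that both the guest and the host use four-level page tables and that no intermediate results are cached; the function names are hypothetical.

```python
def native_walk_accesses(levels: int = 4) -> int:
    """One memory access per page table level."""
    return levels


def nested_walk_accesses(guest_levels: int = 4, host_levels: int = 4) -> int:
    """Worst-case accesses for a nested walk under the stated assumptions.

    Each guest level costs one host walk (to locate the guest table in
    physical memory) plus one read of the guest entry itself. The final
    guest physical address then needs one more host walk.
    """
    per_guest_level = host_levels + 1
    final_translation = host_levels
    return guest_levels * per_guest_level + final_translation


native = native_walk_accesses()
nested = nested_walk_accesses()
```

With four levels on each side this gives 4 × 5 + 4 = 24, which is the motivation for avoiding a walk whenever another TLB already holds the translation.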

In a heterogeneous processing device 200 in which the CPUs and the GPUs share the same page table, it is quite likely that a translation needed by a CPU will be found in the GPU hierarchy, or vice versa. For a computer program that shares data between the CPUs and the GPUs, the likelihood that the hierarchy of one processing unit holds a translation needed by the other is even greater.

To avoid, where possible, having to wait for the result of a walk of the page table 220, the processing device 200 is configured to probe 270 the GPU TLB 250 for an address translation requested by one of the CPUs 241, 242 of the CPU group 240 on the condition that the requested address translation is not found in the CPU TLB hierarchy 230. The data processing device 200 is further configured to walk the page table 220 for the requested address translation on the condition that the requested address translation is not found in the probe 270 of the GPU TLB 250. As an alternative, the data processing device 200 can be configured to start a walk of the page table 220 for the requested address translation concurrently with the probe 270 of the GPU TLB 250.

In one embodiment, the data processing device 200 is also configured to probe 272 the CPU TLB hierarchy 230 for an address translation requested by one of the GPUs 261, 262 upon the condition that the address translation requested by the GPU 261, 262 was not found in the GPU TLB 250. In this case, the data processing device 200 may be configured to conduct a walk of the page table 220 for the address translation requested by the GPU 261, 262 upon the condition that the requested address translation was not found in the probe 272 of the CPU TLB hierarchy 230. Alternatively, the data processing device 200 may be configured to initiate the walk of the page table for the address translation requested by the GPU 261, 262 concurrently with the probe 272 of the CPU TLB hierarchy 230.

In a typical case, the CPUs 241, 242 may perform some work on a region of the memory 210, which brings translations of addresses in that region into the TLB hierarchy 230 of the CPUs 241, 242. The CPUs 241, 242 may also offload some work on that memory region to one or more of the GPUs 261, 262. The GPUs 261, 262 then need the appropriate address translations in order to access the memory correctly. If a particular address translation for the work of the GPUs 261, 262 is not found in the GPU TLB 250, it is quite likely to be found in the CPU TLB hierarchy 230, in particular by probing the CPU TLB hierarchy 230 in the same manner in which the CPU that assigned the work to the GPU would search for the address.

For example, CPU 242 may assign work to GPU 261. If GPU 261 requests an address translation for this work but does not find it in the GPU TLB 250, GPU 261 will probe 272 the CPU TLB hierarchy 230 for the requested address translation. In this case, the probe 272 will search the level one TLB 232 associated with CPU 242 and then the level two TLB 280 to find the address translation. If this probe does not yield the address translation, the GPU will wait for the result of a page table walk for the requested address translation.

In the example processing device 200 that includes a CPU group and a GPU group, the CPU TLB hierarchy 230 operates in the context of a memory management unit (MMU), and the GPU TLB 250 operates in the context of an input/output memory management unit (IOMMU) that conducts the page table walks. In this context, when a GPU's translation request misses in the GPU TLB, the IOMMU may be configured to send a probe message 272 that includes both the requested virtual address and the address space ID (ASID) to the CPU MMU to request the needed translation. In the example device 200 with multiple CPUs, the probe may include the ID of the CPU that launched the task for which the GPU is making the address translation request, so that the probe 272 is directed to that particular CPU, thereby saving probe bandwidth. If the probe succeeds, the CPU MMU can return the translation to the IOMMU far more quickly than the IOMMU could complete a full page table walk, thereby satisfying the GPU's address translation request.
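The probe message and the directed probe described above can be modeled roughly as follows. This is a sketch under stated assumptions: the class names `ProbeMessage` and `CpuMmu`, their fields, and the lookup counter are hypothetical illustrations, TLB entries are keyed by (ASID, virtual page), and an undirected probe is assumed to visit every level one TLB before the level two TLB.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

TlbKey = Tuple[int, int]          # (ASID, virtual page number)
Tlb = Dict[TlbKey, int]           # key -> physical frame number


@dataclass(frozen=True)
class ProbeMessage:
    virtual_page: int
    asid: int
    cpu_id: Optional[int] = None  # CPU that launched the task, if known


class CpuMmu:
    def __init__(self, l1_tlbs: Dict[int, Tlb], l2_tlb: Tlb) -> None:
        self.l1_tlbs = l1_tlbs    # one level one TLB per CPU id
        self.l2_tlb = l2_tlb      # shared level two TLB
        self.l1_lookups = 0       # counts probe traffic for illustration

    def handle_probe(self, msg: ProbeMessage) -> Optional[int]:
        """Search the level one TLB(s), then the level two TLB."""
        key = (msg.asid, msg.virtual_page)
        if msg.cpu_id is not None:
            targets = [msg.cpu_id]            # directed probe: one L1 only
        else:
            targets = sorted(self.l1_tlbs)    # undirected: every L1
        for cpu in targets:
            self.l1_lookups += 1
            frame = self.l1_tlbs[cpu].get(key)
            if frame is not None:
                return frame
        return self.l2_tlb.get(key)           # None means the probe missed


if __name__ == "__main__":
    mmu = CpuMmu({241: {}, 242: {(5, 0x10): 0x99}}, {(5, 0x20): 0x77})
    print(mmu.handle_probe(ProbeMessage(0x10, 5, cpu_id=242)), mmu.l1_lookups)
```

Including the launching CPU's ID lets the probe consult a single level one TLB instead of all of them, which is the bandwidth saving the paragraph refers to.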

An alternative embodiment is to probe 272 the CPU MMU directly, before a request is issued to the IOMMU for a page table walk. If the CPU MMU returns the translation, the GPU does not initiate any IOMMU interaction. In this case, an IOMMU request is generated only if the CPU MMU also does not hold the translation.

Referring to Figure 3, a further embodiment is shown. In Figure 3, an example data processing device 300 has a shared memory 310 that includes a page table 320. The processing device 300 includes a TLB hierarchy 330 configured to store selected address translations of the page table 320 for a group 340 of processing units 341, 342. The TLB hierarchy 330 is a multi-level hierarchy that includes level one TLBs 331, 332 associated respectively with each processing unit 341, 342 of the group 340, and a level two TLB 380 associated with the level one TLBs 331, 332.

Variations in the configuration of the TLB hierarchy 330 and/or the group 340 will be readily apparent to those skilled in the art. For example, where additional processing units are included in the group 340, the TLB hierarchy 330 may include additional level one TLBs, associated respectively with each additional processing unit, and the level two TLB 380 indicated in phantom. In addition, the TLB hierarchy 330 may include more than two levels.

In operation, selected address translations of the page table 320 are stored in the TLB hierarchy 330 for the group 340 of processing units 341, 342, with some selected address translations of the page table 320 stored in the level one TLB 331 for processing unit 341 and some stored in the level one TLB 332 for processing unit 342. In connection with data processing, the processing units 341, 342 will look for a required address translation first in their respective level one TLBs 331, 332 and then in the level two TLB 380.

If the required address translation is found, the page table 320 does not need to be walked. If it is not found, a walk of the page table 320 is required to obtain the address translation.

To avoid waiting for the result of a page table walk whenever possible, the processing device 300 is configured to probe 370 the level one TLB 332 for an address translation requested by processing unit 341 upon the condition that the requested address translation was not found in the level one TLB 331 associated with processing unit 341. The data processing device 300 is further configured to conduct a walk of the page table 320 for the requested address translation upon the condition that the requested address translation was not found in the probe 370 of the level one TLB 331. As an alternative embodiment, the data processing device 300 may be configured to initiate the walk of the page table 320 for the requested address translation concurrently with the probe 370 of the level one TLB 332.

In one embodiment, the data processing device 300 is also configured to probe 372 the level one TLB 331 for an address translation requested by processing unit 342 upon the condition that the requested address translation was not found in the level one TLB 332 associated with processing unit 342. The data processing device 300 is further configured to conduct a walk of the page table 320 for the requested address translation upon the condition that the requested address translation was not found in the probe 372 of the level one TLB 332. As an alternative embodiment, the data processing device 300 may be configured to initiate the walk of the page table 320 for the requested address translation concurrently with the probe 372 of the level one TLB 331.

The processing units 341, 342 may be of any type, for example a central processing unit (CPU), a graphics processing unit (GPU), or an accelerated processing unit (APU). The embodiment of Figure 3 is also applicable within the groups shown with respect to Figure 1, to provide additional probing before or concurrently with initiating a page table walk. The processing device 300 may include more than two processing units that share the memory, with respective TLB hierarchies. In that case, a translation request by any processing unit that cannot be satisfied by the level one TLB associated with that processing unit may cause all, or a subset, of the other level one TLBs to be probed for the requested address translation. In one embodiment, the page table walk is avoided if a hit is encountered in any probed level one TLB. Alternatively, the probes may be issued concurrently with initiating the page table walk. In that case, if a translation hit is returned by any probed device, the page table walk is aborted or its response is ignored.
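Probing all, or only a subset, of the other level one TLBs can be sketched as below. This is an illustrative model only: the function names are hypothetical, a third processing unit 343 is invented for the example, the level two TLB is omitted for brevity, and the probes are modeled as a sequential loop rather than parallel hardware messages.

```python
from typing import Dict, Optional, Set, Tuple

Tlb = Dict[int, int]  # virtual page number -> physical frame number


def probe_peers(requester: int, vpn: int, l1_tlbs: Dict[int, Tlb],
                subset: Optional[Set[int]] = None) -> Optional[Tuple[int, int]]:
    """Probe the other units' level one TLBs; return (unit, frame) on a hit."""
    for unit, tlb in l1_tlbs.items():
        if unit == requester:
            continue                      # never probe the requester's own TLB
        if subset is not None and unit not in subset:
            continue                      # restrict the probe to a chosen subset
        if vpn in tlb:
            return unit, tlb[vpn]
    return None


def translate(requester: int, vpn: int, l1_tlbs: Dict[int, Tlb],
              page_table: Tlb,
              subset: Optional[Set[int]] = None) -> Tuple[int, str]:
    """Return (frame, source) where source names what supplied the frame."""
    own = l1_tlbs[requester]
    if vpn in own:
        return own[vpn], "own L1"
    hit = probe_peers(requester, vpn, l1_tlbs, subset)
    if hit is not None:
        unit, frame = hit
        return frame, f"peer {unit}"      # the page table walk is avoided
    return page_table[vpn], "page walk"   # every probed TLB missed


if __name__ == "__main__":
    l1 = {341: {}, 342: {3: 30}, 343: {4: 40}}
    table = {3: 30, 4: 40, 5: 50}
    print(translate(341, 3, l1, table))
    print(translate(341, 4, l1, table, subset={342}))
```

Restricting the probe to a subset trades a lower hit rate for less probe traffic, as the second call shows: the translation held by unit 343 is not seen, so the request falls back to the page table walk.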

It should be understood that the methods described herein may be implemented in any multiprocessor device that employs a shared memory, and that many variations are possible based on the disclosure herein. Although features and elements are described above in particular combinations, each feature or element may be used alone, without the other features and elements, or in various combinations with or without other features and elements.

The methods provided may be implemented in a general purpose computer, a processor, or a processor core. Suitable processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, application specific integrated circuits (ASICs), field programmable gate array (FPGA) circuits, any other type of integrated circuit (IC), and/or a state machine. Such processors may be manufactured by configuring a manufacturing process using the results of processed hardware description language (HDL) instructions and other intermediate data including netlists (such instructions being capable of being stored on a computer-readable medium). The results of such processing may be maskworks that are then used in a semiconductor manufacturing process to manufacture a processor that implements one or more aspects of the disclosed embodiments.

The methods or flow charts provided herein may be implemented in a computer program, software, or firmware incorporated in a non-transitory computer-readable storage medium for execution by a general purpose computer or a processor. Examples of non-transitory computer-readable storage media include a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).

* * * *

Claims (46)

1. A data processing device comprising:
a first processing unit;
a second processing unit;
a page table of a memory shared by the first and second processing units;
a first translation lookaside buffer (TLB) hierarchy configured to store selected address translations of the page table for use by the first processing unit;
a second TLB hierarchy configured to store selected address translations of the page table for use by the second processing unit; and
the processing device configured to probe the second TLB hierarchy for an address translation requested by the first processing unit upon a condition that the requested address translation was not found in the first TLB hierarchy.

2. The data processing device of claim 1, configured to conduct a walk of the page table for the requested address translation upon a condition that the requested address translation was not found in the probe of the second TLB hierarchy.

3. The data processing device of claim 1, configured to initiate a walk of the page table for the requested address translation concurrently with the probe of the second TLB hierarchy.

4. The data processing device of claim 1, configured to probe the first TLB hierarchy for an address translation requested by the second processing unit upon a condition that the address translation requested by the second processing unit was not found in the second TLB hierarchy.

5. The data processing device of claim 4, configured to conduct a walk of the page table for the address translation requested by the second processing unit upon a condition that the address translation requested by the second processing unit was not found in the probe of the first TLB hierarchy.

6. The data processing device of claim 4, configured to initiate a walk of the page table for the address translation requested by the second processing unit concurrently with the probe of the first TLB hierarchy.

7. The data processing device of claim 1, wherein:
the first processing unit is a central processing unit (CPU), a graphics processing unit (GPU), or an accelerated processing unit (APU); and
the second processing unit is a CPU, a GPU, or an APU.

8. The data processing device of claim 1, wherein the first processing unit is a CPU and the second processing unit is a GPU.

9. The data processing device of claim 1, further comprising a plurality of first processing units that includes the first processing unit, wherein:
the first TLB hierarchy is configured to store selected address translations of the page table for use by the first processing units.

10. The data processing device of claim 9, wherein each of the first processing units is a central processing unit (CPU), a graphics processing unit (GPU), or an accelerated processing unit (APU).

11. The data processing device of claim 10, wherein:
all of the first processing units are CPUs;
the first TLB hierarchy includes a separate level one TLB associated with each CPU to store selected address translations of the page table for use by that CPU, and a level two TLB associated with the level one TLBs; and
the condition for probing the second TLB hierarchy for the address translation requested by the first processing unit is that the address translation requested by the first processing unit was not found in a search of the level one TLB associated with the first processing unit and of the level two TLB.

12. The data processing device of claim 10, wherein:
all of the first processing units are GPUs;
the first TLB hierarchy includes a single TLB associated with all of the GPUs to store selected address translations of the page table for use by the GPUs; and
the condition for probing the second TLB hierarchy for the address translation requested by the first processing unit is that the address translation requested by the first processing unit was not found in a search of the single TLB.

13. The data processing device of claim 1, further comprising a plurality of second processing units that includes the second processing unit, wherein:
the second TLB hierarchy includes a separate level one TLB associated with each of the plurality of second processing units to store selected address translations of the page table for use by that second processing unit, and a level two TLB associated with the level one TLBs; and
the processing device is configured to probe the second TLB hierarchy for the address translation requested by the first processing unit by searching the level one TLB associated with a selected second processing unit and the level two TLB to find the address translation requested by the first processing unit.

14. The data processing device of claim 4, further comprising a plurality of first processing units that includes the first processing unit and a plurality of second processing units that includes the second processing unit, wherein:
the first TLB hierarchy is configured to store selected address translations of the page table for use by the first processing units; and
the second TLB hierarchy is configured to store selected address translations of the page table for use by the second processing units.

15. The data processing device of claim 14, wherein:
each of the first processing units is a central processing unit (CPU), a graphics processing unit (GPU), or an accelerated processing unit (APU); and
each of the second processing units is a CPU, a GPU, or an APU.

16. The data processing device of claim 15, wherein:
all of the first processing units are CPUs;
the first TLB hierarchy includes a separate level one TLB associated with each CPU to store selected address translations of the page table for use by that CPU, and a level two TLB associated with the level one TLBs;
the condition for probing the second TLB hierarchy for the address translation requested by the first processing unit is that the address translation requested by the first processing unit was not found in a search of the level one TLB associated with the first processing unit and of the level two TLB;
all of the second processing units are GPUs;
the second TLB hierarchy includes a single TLB associated with all of the GPUs to store selected address translations of the page table for use by the GPUs;
the condition for probing the first TLB hierarchy for the address translation requested by the second processing unit is that the address translation requested by the second processing unit was not found in a search of the single TLB; and
the processing device is configured to probe the first TLB hierarchy for the address translation requested by the second processing unit by searching the level one TLB associated with a selected first processing unit and the level two TLB to find the address translation requested by the second processing unit.

17. The data processing device of claim 16, configured to:
conduct a walk of the page table for the address translation requested by the first processing unit upon a condition that the address translation requested by the first processing unit was not found in the probe of the second TLB hierarchy; and
conduct a walk of the page table for the address translation requested by the second processing unit upon a condition that the address translation requested by the second processing unit was not found in the probe of the first TLB hierarchy.

18. The data processing device of claim 16, configured to:
initiate a walk of the page table for the address translation requested by the first processing unit concurrently with the probe of the second TLB hierarchy; and
initiate a walk of the page table for the address translation requested by the second processing unit concurrently with the probe of the first TLB hierarchy.

19. A data processing device comprising:
a first processing unit;
a second processing unit;
a page table of a memory shared by the first and second processing units;
a first translation lookaside buffer (TLB) configured to store selected address translations of the page table for use by the first processing unit;
a second TLB configured to store selected address translations of the page table for use by the second processing unit; and
the processing device configured to probe the second TLB for an address translation requested by the first processing unit upon a condition that the requested address translation was not found in the first TLB.

20. The data processing device of claim 19, configured to conduct a walk of the page table for the requested address translation upon a condition that the requested address translation was not found in the probe of the second TLB.

21. The data processing device of claim 19, configured to initiate a walk of the page table for the requested address translation concurrently with the probe of the second TLB.

22. The data processing device of claim 19, further comprising a plurality of first processing units that includes the first processing unit and a plurality of second processing units that includes the second processing unit, wherein:
the first TLB is part of a first TLB hierarchy configured to store selected address translations of the page table for use by the first processing units; and
the second TLB is part of a second TLB hierarchy configured to store selected address translations of the page table for use by the second processing units.

23. The data processing device of claim 19, further comprising a level two TLB, wherein:
the first and second TLBs comprise a first level of a TLB hierarchy configured to store selected address translations of the page table for use by the first and second processing units;
the level two TLB comprises a second level of the TLB hierarchy; and
the data processing device is configured to conduct a walk of the page table for the requested address translation upon a condition that the requested address translation was not found in the probe of the second TLB or in the level two TLB.

24. The data processing device of claim 19, further comprising a level two TLB, wherein:
the first and second TLBs comprise a first level of a TLB hierarchy configured to store selected address translations of the page table for use by the first and second processing units;
the level two TLB comprises a second level of the TLB hierarchy; and
the data processing device is configured to initiate a walk of the page table for the requested address translation concurrently with the probe of the second TLB upon a condition that the requested address translation was not found in the level two TLB.

25. A data processing method comprising:
providing a first translation lookaside buffer (TLB) hierarchy configured to store selected address translations of a page table for use by a first processing unit, and a second TLB hierarchy configured to store selected address translations of the page table for use by a second processing unit; and
probing the second TLB hierarchy for an address translation requested by the first processing unit upon a condition that the requested address translation was not found in the first TLB hierarchy.

26. The data processing method of claim 25, further comprising conducting a walk of the page table for the requested address translation upon a condition that the requested address translation was not found in the probing of the second TLB hierarchy.

27. The data processing method of claim 25, further comprising initiating a walk of the page table for the requested address translation concurrently with the probing of the second TLB hierarchy.

28. The data processing method of claim 25, further comprising probing the first TLB hierarchy for an address translation requested by the second processing unit upon a condition that the address translation requested by the second processing unit was not found in the second TLB hierarchy.

29. The data processing method of claim 28, further comprising conducting a walk of the page table for the address translation requested by the second processing unit upon a condition that the address translation requested by the second processing unit was not found in the probing of the first TLB hierarchy.

30. The data processing method of claim 28, further comprising initiating a walk of the page table for the address translation requested by the second processing unit concurrently with the probing of the first TLB hierarchy.

31. The data processing method of claim 25, wherein the first processing unit is a CPU among a plurality of central processing units (CPUs), and the first TLB hierarchy includes a separate level one TLB associated with each CPU to store selected address translations of the page table for use by that CPU, and a level two TLB associated with the level one TLBs, wherein the second TLB hierarchy is probed for the address translation requested by the first processing unit upon a condition that the address translation requested by the first processing unit was not found in a search of the level one TLB associated with the first processing unit and of the level two TLB.

32. The data processing method of claim 25, wherein the first processing unit is a GPU among a plurality of graphics processing units (GPUs), and the first TLB hierarchy includes a single TLB associated with all of the GPUs to store selected address translations of the page table for use by the GPUs, wherein the condition for probing the second TLB hierarchy for the address translation requested by the first processing unit is that the address translation requested by the first processing unit was not found in a search of the single TLB.

33. The data processing method of claim 25, wherein the second processing unit is one of a plurality of second processing units, and the second TLB hierarchy includes a separate level one TLB associated with each of the plurality of second processing units to store selected address translations of the page table for use by that second processing unit, and a level two TLB associated with the level one TLBs, the method further comprising:
probing the second TLB hierarchy for the address translation requested by the first processing unit by searching the level one TLB associated with a selected second processing unit and the level two TLB to find the address translation requested by the first processing unit.

34. The data processing method of claim 28, wherein the first processing unit is a CPU among a plurality of central processing units (CPUs), and the first TLB hierarchy includes a separate level one TLB associated with each CPU to store selected address translations of the page table for use by that CPU, and a level two TLB associated with the level one TLBs; and wherein the second processing unit is a GPU among a plurality of graphics processing units (GPUs), and the second TLB hierarchy includes a single TLB associated with all of the GPUs to store selected address translations of the page table for use by the GPUs, wherein:
the condition for probing the second TLB hierarchy for the address translation requested by the first processing unit is that the address translation requested by the first processing unit was not found in a search of the level one TLB associated with the first processing unit and of the level two TLB; and
the condition for probing the first TLB hierarchy for the address translation requested by the second processing unit is that the address translation requested by the second processing unit was not found in a search of the single TLB.

35. The data processing method of claim 34, further comprising:
The data processing method as claimed in claim 34, further comprising: 在出现一种状况时针对由所述第一处理单元请求的所述地址转换进行所述页表的查询,所述状况为在所述第二TLB层次结构中没找到由所述第一处理单元请求的所述地址转换;以及performing a lookup of the page table for the address translation requested by the first processing unit upon a condition that the address translation requested by the first processing unit is not found in the second TLB hierarchy the address translation requested; and 在出现一种状况时针对由所述第二处理单元请求的所述地址转换进行所述页表的查询,所述状况为在所述第一TLB层次结构的所述探测中没找到由所述第二处理单元请求的所述地址转换。The lookup of the page table is made for the address translation requested by the second processing unit upon the condition that the address translation requested by the first TLB hierarchy is not found in the probe of the first TLB hierarchy The address translation requested by the second processing unit. 36.如权利要求34所述的数据处理方法,其更包括:36. The data processing method as claimed in claim 34, further comprising: 在所述第二TLB层次结构的所述探测的同时针对由所述第一处理单元请求的所述地址转换开始所述页表的查询;以及initiating a lookup of the page table for the address translation requested by the first processing unit concurrently with the probing of the second TLB hierarchy; and 在所述第一TLB层次结构的所述探测的同时针对由所述第二处理单元请求的所述地址转换开始所述页表的查询。A lookup of the page table is initiated concurrently with the probing of the first TLB hierarchy for the address translation requested by the second processing unit. 37.一种数据处理方法,其包括:37. 
A data processing method comprising: 提供第一转换后备缓冲器(TLB),其被配置来存储由第一处理单元使用的页表的选择地址转换;和第二TLB,其被配置来存储由第二处理单元使用的所述页表的选择地址转换;以及providing a first translation lookaside buffer (TLB) configured to store selected address translations for a page table used by the first processing unit; and a second TLB configured to store the page used by the second processing unit the selection address translation of the table; and 在出现一种状况时针对由所述第一处理单元请求的地址转换探测所述第二TLB,所述状况为在所述第一TLB中没找到所述请求的地址转换。The second TLB is probed for an address translation requested by the first processing unit upon a condition that the requested address translation is not found in the first TLB. 38.如权利要求37所述的数据处理方法,其更包括在出现一种状况时针对所述请求的地址转换进行所述页表的查询,所述状况为在所述第二TLB的探测中没找到所述请求的地址转换。38. The data processing method according to claim 37, further comprising performing a lookup of the page table for the requested address translation when a situation occurs, the situation being during detection of the second TLB The requested address translation was not found. 39.如权利要求37所述的数据处理方法,其更包括在所述第二TLB的所述探测的同时针对所述请求的地址转换开始所述页表的查询。39. The data processing method of claim 37, further comprising starting a lookup of the page table for the requested address translation simultaneously with the probing of the second TLB. 40.如权利要求37所述的数据处理方法,其中所述第一TLB是被配置来存储由多个第一处理单元使用的所述页表的选择地址转换的第一TLB层次结构的一部分,并且所述第二TLB是被配置来存储由多个第二处理单元使用的所述页表的选择地址转换的第二TLB层次结构的一部分,其中:40. 
The data processing method of claim 37, wherein the first TLB is part of a first TLB hierarchy configured to store selected address translations of the page tables used by a plurality of first processing units, and said second TLB is part of a second TLB hierarchy configured to store selected address translations of said page tables used by a plurality of second processing units, wherein: 针对由所述第一处理单元请求的地址转换探测所述第二TLB的状况为:在所述第一TLB层次结构中没找到所述请求的地址转换。Probing a condition of the second TLB for an address translation requested by the first processing unit is that the requested address translation is not found in the first TLB hierarchy. 41.如权利要求37所述的数据处理方法,其中第一和第二TLB包括被配置来存储由所述第一和第二处理单元使用的所述页表的选择地址转换的TLB层次结构的第一级,并且二级TLB包括所述TLB层次结构的第二级,其更包括:41. The data processing method according to claim 37, wherein the first and second TLBs comprise a TLB hierarchy configured to store selected address translations of the page tables used by the first and second processing units The first-level, and second-level TLBs include the second level of the TLB hierarchy, which further includes: 在出现一种状况时针对所述请求的地址转换进行所述页表的查询,所述状况为在所述第二TLB或所述二级TLB的所述探测中没找到所述请求的地址转换。Performing the query of the page table for the requested address translation when a condition occurs, the condition is that the requested address translation is not found in the detection of the second TLB or the secondary TLB. . 42.如权利要求37所述的数据处理方法,其中第一和第二TLB包括被配置来存储由所述第一和第二处理单元使用的所述页表的选择地址转换的TLB层次结构的第一级,并且二级TLB包括所述TLB层次结构的第二级,其更包括:42. 
The data processing method according to claim 37, wherein the first and second TLBs comprise a TLB hierarchy configured to store selected address translations of the page tables used by the first and second processing units The first-level, and second-level TLBs include the second level of the TLB hierarchy, which further includes: 在出现一种状况时在所述第二TLB的所述探测的同时针对所述请求的地址转换开始所述页表的查询,所述状况为在所述二级TLB中没找到所述请求的地址转换。Initiating the lookup of the page table for the requested address translation concurrently with the probing of the second TLB when a condition occurs that the requested one is not found in the secondary TLB address translation. 43.一种非暂态计算机可读存储介质,其存储由通用计算机执行以便促进集成电路的制造的指令集,所述非暂态计算机可读存储介质包括:43. A non-transitory computer-readable storage medium storing a set of instructions for execution by a general-purpose computer to facilitate the manufacture of integrated circuits, the non-transitory computer-readable storage medium comprising: 第一处理单元;first processing unit; 第二处理单元;second processing unit; 所述第一和第二处理单元共用的存储器的页表;a page table of a memory shared by the first and second processing units; 第一转换后备缓冲器(TLB)层次结构,其被配置来存储由所述第一处理单元使用的所述页表的选择地址转换;a first translation lookaside buffer (TLB) hierarchy configured to store select address translations for the page tables used by the first processing unit; 第二TLB层次结构,其被配置来存储由所述第二处理单元使用的所述页表的选择地址转换;以及a second TLB hierarchy configured to store select address translations of the page tables used by the second processing unit; and 所述处理装置被配置来在出现一种状况时针对由所述第一处理单元请求的地址转换探测所述第二TLB层次结构,所述状况为在所述第一TLB层次结构中没找到所述请求的地址转换。The processing means is configured to probe the second TLB hierarchy for an address translation requested by the first processing unit when a condition occurs that all address translation for the above request. 44.如权利要求43所述的非暂态计算机可读存储介质,其中所述指令是用于装置制造的硬件描述语言(HDL)指令。44. The non-transitory computer readable storage medium of claim 43, wherein the instructions are hardware description language (HDL) instructions for device manufacture. 
45.一种非暂态计算机可读存储介质,其存储由通用计算机执行以便促进集成电路的制造的指令集,所述非暂态计算机可读存储介质包括:45. A non-transitory computer-readable storage medium storing a set of instructions for execution by a general-purpose computer to facilitate the manufacture of integrated circuits, the non-transitory computer-readable storage medium comprising: 第一处理单元;first processing unit; 第二处理单元;second processing unit; 所述第一和第二处理单元共用的存储器的页表;a page table of a memory shared by the first and second processing units; 第一转换后备缓冲器(TLB),其被配置来存储由所述第一处理单元使用的所述页表的选择地址转换;a first translation lookaside buffer (TLB) configured to store select address translations for the page table used by the first processing unit; 第二TLB,其被配置来存储由所述第二处理单元使用的所述页表的选择地址转换;以及a second TLB configured to store select address translations of the page tables used by the second processing unit; and 所述处理装置被配置来在出现一种状况时针对由所述第一处理单元请求的地址转换探测所述第二TLB,所述状况为在所述第一TLB中没找到所述请求的地址转换。The processing means is configured to probe the second TLB for an address translation requested by the first processing unit when a condition occurs that the requested address is not found in the first TLB convert. 46.如权利要求45所述的计算机可读存储介质,其中所述指令是用于制造装置的硬件描述语言(HDL)指令。46. The computer-readable storage medium of claim 45, wherein the instructions are hardware description language (HDL) instructions for manufacturing a device.
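The probing order recited in the method claims above can be illustrated with a small software model. The Python sketch below is not part of the patent and does not represent the hardware design: the names `TLB`, `TLBHierarchy`, `translate`, and `concurrent_lookup` are hypothetical, the page table is modeled as a plain dictionary, and a probe is assumed to check every TLB in the other hierarchy. It shows a requester searching its own hierarchy first, probing the other hierarchy on a miss, and falling back to the page table either after the probe misses (as in claims 38 and 29) or by starting the lookup concurrently with the probe (as in claims 39 and 30).

```python
from concurrent.futures import ThreadPoolExecutor


class TLB:
    """Toy translation cache: virtual page number -> physical frame number."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def search(self, vpn):
        return self.entries.get(vpn)  # None means a miss


class TLBHierarchy:
    """Optional per-unit level-one TLBs plus an optional shared TLB."""

    def __init__(self, l1_by_unit=None, shared=None):
        self.l1_by_unit = dict(l1_by_unit or {})
        self.shared = shared

    def search(self, unit, vpn):
        # The requester checks its own level-one TLB, then the shared TLB.
        for tlb in (self.l1_by_unit.get(unit), self.shared):
            if tlb is not None:
                pfn = tlb.search(vpn)
                if pfn is not None:
                    return pfn
        return None

    def probe(self, vpn):
        # Assumption: a probe from the other hierarchy checks every TLB here.
        for tlb in [*self.l1_by_unit.values(), self.shared]:
            if tlb is not None:
                pfn = tlb.search(vpn)
                if pfn is not None:
                    return pfn
        return None

    def fill(self, unit, vpn, pfn):
        # Cache the result in the requester's level-one TLB, else the shared TLB.
        target = self.l1_by_unit.get(unit)
        if target is None:
            target = self.shared
        if target is not None:
            target.entries[vpn] = pfn


def translate(unit, vpn, own, other, page_table, concurrent_lookup=False):
    """Return (pfn, source); pfn is None if no mapping exists anywhere."""
    pfn = own.search(unit, vpn)
    if pfn is not None:
        return pfn, "own TLB hierarchy"
    if concurrent_lookup:
        # Begin the page table lookup at the same time as the probe.
        with ThreadPoolExecutor(max_workers=1) as pool:
            lookup = pool.submit(page_table.get, vpn)
            pfn, source = other.probe(vpn), "probe"
            if pfn is None:
                pfn, source = lookup.result(), "page table"
    else:
        # Probe first; consult the page table only if the probe misses.
        pfn, source = other.probe(vpn), "probe"
        if pfn is None:
            pfn, source = page_table.get(vpn), "page table"
    if pfn is not None:
        own.fill(unit, vpn, pfn)
    return pfn, source


# Hypothetical setup: two CPUs with private level-one TLBs and a shared
# level-two TLB, and a GPU group with a single shared TLB.
page_table = {0x10: 0xA0, 0x11: 0xA1, 0x12: 0xA2}
cpus = TLBHierarchy({"cpu0": TLB({0x10: 0xA0}), "cpu1": TLB()}, shared=TLB())
gpus = TLBHierarchy(shared=TLB({0x11: 0xA1}))
print(translate("cpu0", 0x11, cpus, gpus, page_table))
print(translate("gpu0", 0x12, gpus, cpus, page_table, concurrent_lookup=True))
```

In this model the concurrent variant still waits for the page table lookup to finish even when the probe hits, because the thread pool is closed before returning. A hardware implementation would presumably cancel or ignore the unneeded lookup, but the claims shown here do not specify that detail.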
CN201380071074.8A 2012-12-21 2013-12-17 Processing device with address translation probing and methods Pending CN104937569A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/723,379 2012-12-21
US13/723,379 US8984255B2 (en) 2012-12-21 2012-12-21 Processing device with address translation probing and methods
PCT/US2013/075711 WO2014099943A1 (en) 2012-12-21 2013-12-17 Processing device with address translation probing and methods

Publications (1)

Publication Number Publication Date
CN104937569A (en) 2015-09-23

Family

ID=49918882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380071074.8A Pending CN104937569A (en) 2012-12-21 2013-12-17 Processing device with address translation probing and methods

Country Status (6)

Country Link
US (1) US8984255B2 (en)
EP (1) EP2936322B1 (en)
JP (1) JP2016504686A (en)
KR (1) KR20150097711A (en)
CN (1) CN104937569A (en)
WO (1) WO2014099943A1 (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2528842B (en) * 2014-07-29 2021-06-02 Advanced Risc Mach Ltd A data processing apparatus, and a method of handling address translation within a data processing apparatus
US9858198B2 (en) * 2015-06-26 2018-01-02 Intel Corporation 64KB page system that supports 4KB page operations
US10489304B2 (en) 2017-07-14 2019-11-26 Arm Limited Memory address translation
US10534719B2 (en) * 2017-07-14 2020-01-14 Arm Limited Memory system for a data processing network
US10353826B2 (en) 2017-07-14 2019-07-16 Arm Limited Method and apparatus for fast context cloning in a data processing system
US10467159B2 (en) * 2017-07-14 2019-11-05 Arm Limited Memory node controller
US10592424B2 (en) 2017-07-14 2020-03-17 Arm Limited Range-based memory system
US10565126B2 (en) 2017-07-14 2020-02-18 Arm Limited Method and apparatus for two-layer copy-on-write
US10613989B2 (en) 2017-07-14 2020-04-07 Arm Limited Fast address translation for virtual machines
US10866904B2 (en) * 2017-11-22 2020-12-15 Arm Limited Data storage for multiple data types
US10831673B2 (en) * 2017-11-22 2020-11-10 Arm Limited Memory address translation
US10884850B2 (en) 2018-07-24 2021-01-05 Arm Limited Fault tolerant memory system
US11210232B2 (en) 2019-02-08 2021-12-28 Samsung Electronics Co., Ltd. Processor to detect redundancy of page table walk
US12007935B2 (en) * 2019-03-15 2024-06-11 Intel Corporation Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format
US20240320161A1 (en) * 2021-08-20 2024-09-26 Intel Corporation Apparatuses, methods, and systems for a device translation lookaside buffer pre-translation instruction and extensions to input/output memory management unit protocols
WO2025232963A1 (en) * 2024-05-07 2025-11-13 Huawei Technologies Co., Ltd. Memory addressing method, device and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6092172A (en) * 1996-10-16 2000-07-18 Hitachi, Ltd. Data processor and data processing system having two translation lookaside buffers
EP1182559A1 (en) * 2000-08-21 2002-02-27 Texas Instruments Incorporated Improved microprocessor
CN1506849A (en) * 2002-12-12 2004-06-23 国际商业机器公司 Data processing system capable of managing virtual memory processing conception
US20110231612A1 (en) * 2010-03-16 2011-09-22 Oracle International Corporation Pre-fetching for a sibling cache

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04148352A (en) * 1990-10-11 1992-05-21 Fujitsu Ltd Address converting system for information processor equipped with plural processor
JPH05250261A (en) * 1992-03-09 1993-09-28 Nec Corp Address conversion device
US20040225840A1 (en) * 2003-05-09 2004-11-11 O'connor Dennis M. Apparatus and method to provide multithreaded computer processing

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112753024A (en) * 2018-09-25 2021-05-04 Ati科技无限责任公司 External memory based translation look-aside buffer
CN112753024B (en) * 2018-09-25 2023-11-03 Ati科技无限责任公司 External memory-based conversion lookaside buffer
CN114817081A (en) * 2022-03-02 2022-07-29 阿里巴巴(中国)有限公司 Memory access method and device and input/output memory management unit

Also Published As

Publication number Publication date
KR20150097711A (en) 2015-08-26
US20140181460A1 (en) 2014-06-26
EP2936322B1 (en) 2017-03-08
JP2016504686A (en) 2016-02-12
WO2014099943A1 (en) 2014-06-26
US8984255B2 (en) 2015-03-17
EP2936322A1 (en) 2015-10-28

Similar Documents

Publication Publication Date Title
CN104937569A (en) Processing device with address translation probing and methods
TWI531912B (en) Processor having translation lookaside buffer for multiple context compute engine, system and method for enabling threads to access a resource in a processor
CN110059027B (en) Device and method for performing maintenance operations
US20170286315A1 (en) Managing translation invalidation
US10067709B2 (en) Page migration acceleration using a two-level bloom filter on high bandwidth memory systems
US10241925B2 (en) Selecting a default page size in a variable page size TLB
CN104636203A (en) Method and apparatus to represent a processor context with fewer bits
US20230236988A1 (en) Reducing Translation Lookaside Buffer Searches for Splintered Pages
WO2015043376A1 (en) Page access method and page access device, and server
US10831673B2 (en) Memory address translation
KR20210037216A (en) Memory management unit capable of managing address translation table using heterogeneous memory, and address management method thereof
KR102482516B1 (en) memory address conversion
CN110073338A (en) Configurable deflection relevance in Translation Look side Buffer
JPWO2008155849A1 (en) Arithmetic processing device, TLB control method, TLB control program, and information processing device
CN112114934B (en) Method and apparatus for power reduction in multi-threaded mode
CN105359115A (en) Lookup of a data structure containing a mapping between a virtual address space and a physical address space
US10866904B2 (en) Data storage for multiple data types
US20250110893A1 (en) Cache virtualization
US10747681B1 (en) Invalidation of entries in address translation storage
JP3697990B2 (en) Vector processor operand cache
HK1232971B (en) Packet processor forwarding database cache

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 20180914)