CN1842771A - Mechanisms for dynamic configuration of virtual processor resources - Google Patents
Mechanisms for dynamic configuration of virtual processor resources
- Publication number: CN1842771A (application CN200480024801A)
- Authority
- CN
- China
- Prior art keywords
- virtual processing
- context
- virtual
- resource
- processing element
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Executing Machine-Instructions (AREA)
- Multi Processors (AREA)
- Advance Control (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Description
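【Cross-References to Related Patent Applications】
This application claims the benefit of the following U.S. provisional patent applications, which are incorporated herein by reference.
This application is a continuation-in-part of the following co-owned non-provisional U.S. patent applications, each of which has the same assignee and at least one common inventor, and each of which is incorporated herein by reference.
The two co-owned non-provisional U.S. patent applications mentioned above claim the benefit of the following U.S. provisional patent applications.
This application is related to the following co-owned non-provisional U.S. patent applications, each of which is incorporated herein by reference.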
【Technical Field】
The present invention relates generally to the field of virtual multiprocessors, and more particularly to a mechanism for dynamically configuring the resources of a virtual multiprocessor among one or more virtual processing elements.
【Background】
Designers today employ many techniques to increase microprocessor performance. Most microprocessors operate from a clock signal running at a fixed frequency, and on each clock cycle the circuits of the microprocessor perform their respective functions. According to Hennessy and Patterson, the true measure of a microprocessor's performance is the time required to execute a program or collection of programs. From this perspective, performance is a function of the clock frequency, the average number of clock cycles required to execute an instruction (or, stated alternately, the average number of instructions executed per clock cycle), and the number of instructions executed in the program or collection of programs. Semiconductor scientists and engineers continually make it possible for microprocessors to run at faster clock frequencies, chiefly by reducing transistor size, which results in faster switching times within an integrated circuit. The number of instructions executed is largely fixed by the task the program performs, although it is also affected by the instruction set architecture of the microprocessor. Large performance increases, however, have been realized by architectural and organizational techniques that raise the number of instructions executed per clock cycle, in particular by techniques that allow instructions to execute in parallel (that is, parallelism).
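The relationship described in this paragraph is commonly written as a single equation. The form below is a sketch of that standard relationship; the symbol names and the worked numbers are illustrative assumptions and do not appear in the source.

```latex
% Execution time as a function of instruction count, cycles per
% instruction (CPI), and clock frequency. IPC is the reciprocal of CPI.
\[
T_{\text{exec}}
  = \frac{N_{\text{instr}} \times \text{CPI}}{f_{\text{clk}}}
  = \frac{N_{\text{instr}}}{\text{IPC} \times f_{\text{clk}}},
\qquad
\text{IPC} = \frac{1}{\text{CPI}}
\]

% Illustrative numbers only (assumed, not from the source):
% N_instr = 10^9 instructions, CPI = 1.5, f_clk = 500 MHz.
\[
T_{\text{exec}} = \frac{10^{9} \times 1.5}{5 \times 10^{8}\,\text{Hz}} = 3\,\text{s}
\]
```

Halving CPI (doubling IPC) at the same clock frequency halves the execution time, which is why the techniques discussed next concentrate on instructions per clock cycle.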
One form of parallelism that has improved both the instructions per clock cycle and the clock frequency of microprocessors is pipelining. A pipeline overlaps the execution of multiple instructions within its stages, much as the stations of an assembly line do. In the ideal case, each instruction moves down the pipeline to a new stage every clock cycle, and each stage performs a different function on the instruction. Thus, although each individual instruction takes several clock cycles to complete, the average number of clock cycles per instruction is reduced because the cycles of the individual instructions overlap. The performance gain of pipelining is realized to the extent that the instructions in the program permit it, namely to the extent that an instruction does not depend on its predecessors and can therefore execute in parallel with them; this is commonly referred to as instruction-level parallelism. Another form of instruction-level parallelism exploited by contemporary microprocessors is the issuing of multiple instructions in the same clock cycle to different functional units, each of which performs its designated function. Microprocessors that achieve instruction-level parallelism in this way are commonly referred to as "superscalar" microprocessors.
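The effect of overlapping instructions can be illustrated with a cycle count for the ideal case described above. This is a minimal sketch that assumes fully independent instructions and no stalls; the function names are hypothetical.

```python
def unpipelined_cycles(stages: int, instructions: int) -> int:
    """Cycles when each instruction must finish all stages before the next starts."""
    return stages * instructions


def pipelined_cycles(stages: int, instructions: int) -> int:
    """Cycles for an ideal pipeline (assumed: no dependences, no stalls).

    The first instruction needs `stages` cycles to drain through the pipeline;
    every later instruction completes one cycle after its predecessor.
    """
    if instructions == 0:
        return 0
    return stages + instructions - 1


if __name__ == "__main__":
    n, k = 100, 5
    print(unpipelined_cycles(k, n))    # 500
    print(pipelined_cycles(k, n))      # 104
    print(pipelined_cycles(k, n) / n)  # average cycles per instruction: 1.04
```

With 100 independent instructions on a five-stage pipeline, each instruction still takes five cycles, but the average falls to 1.04 cycles per instruction.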
The parallelism discussed above pertains to parallelism at the level of individual instructions. However, the performance improvement achievable by exploiting instruction-level parallelism is limited. The constraints imposed by limited instruction-level parallelism, together with other performance-limiting issues, have recently renewed interest in exploiting parallelism at the level of blocks, sequences, or streams of instructions, commonly referred to as thread-level parallelism. A thread is simply a sequence, or stream, of program instructions. A multithreaded microprocessor concurrently executes multiple threads according to a scheduling policy that governs the fetching and issuing of instructions of the various threads, such as interleaved, blocked, or simultaneous multithreading. A multithreaded microprocessor typically allows the threads to share its functional units concurrently (for example, instruction fetch and decode units, caches, branch prediction units, and load/store, integer, floating-point, and SIMD execution units). However, a multithreaded microprocessor also includes multiple sets of hardware/firmware resources, or thread contexts, that store the unique state of each thread so that the processor can switch quickly between threads to fetch and issue instructions. For example, each thread context includes its own program counter for instruction fetching and its own thread identification information, and typically also its own set of general-purpose registers.
One example of a performance-limiting issue addressed by multithreaded microprocessors is the fact that an access to memory outside the microprocessor, which must be performed on a cache miss, typically has a relatively long latency. In present-day microprocessor-based computer systems, the memory access time is commonly between one and two orders of magnitude greater than the cache hit access time. As a result, while the pipeline is stalled waiting for data from memory, some or all of the pipeline stages of a single-threaded microprocessor may sit idle for many clock cycles, performing no useful work. Multithreaded microprocessors can alleviate this problem by issuing instructions from other threads during the memory fetch latency, thereby enabling the pipeline stages to continue doing useful work, somewhat analogously to, but at a finer level of granularity than, an operating system performing a task switch in response to a page fault. Other examples of performance-limiting issues are pipeline stalls and their accompanying idle cycles caused by a branch misprediction and the attendant pipeline flush, by a data dependence, or by a long-latency instruction such as a divide instruction. Again, the ability of a multithreaded microprocessor to issue instructions from other threads into pipeline stages that would otherwise be idle can significantly reduce the time required to execute the program or collection of programs comprising the threads. Another problem, particularly in embedded systems, is the wasted overhead associated with interrupt servicing. Typically, when an input/output device signals an interrupt to the microprocessor, the microprocessor switches control to an interrupt service routine, which requires saving the current program state, servicing the interrupt, and restoring the program state after the interrupt has been serviced. A multithreaded microprocessor makes it possible for event service code to be its own thread with its own thread context. Consequently, in response to the input/output device signaling an event, the microprocessor can switch to the event service thread very quickly, perhaps within a single clock cycle, thereby avoiding the overhead of a conventional interrupt service routine.
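The latency-hiding effect described above can be sketched with a toy simulation. Everything here is an assumption made for illustration: a single-issue core, a fixed miss latency, a fixed-priority choice among ready threads, and the hypothetical encoding of a thread as a string of `c` (one-cycle compute) and `m` (load that misses the cache). It is not the scheduling mechanism of the patent.

```python
def run(threads, miss_latency):
    """Return the cycle at which the last instruction completes.

    threads: list of strings; 'c' is a one-cycle compute instruction and
             'm' is a load that misses the cache (hypothetical encoding).
    miss_latency: extra cycles a thread is blocked after issuing an 'm'.
    """
    pc = [0] * len(threads)        # index of the next instruction per thread
    ready_at = [0] * len(threads)  # first cycle at which each thread may issue
    cycle = 0
    done = 0
    while any(pc[t] < len(threads[t]) for t in range(len(threads))):
        # Single issue: at most one instruction, from the first ready thread.
        for t in range(len(threads)):
            if pc[t] < len(threads[t]) and ready_at[t] <= cycle:
                op = threads[t][pc[t]]
                pc[t] += 1
                latency = miss_latency if op == "m" else 0
                ready_at[t] = cycle + 1 + latency
                done = max(done, ready_at[t])
                break
        cycle += 1  # if no thread was ready, this cycle is idle
    return done


if __name__ == "__main__":
    one = run(["cmcc"], miss_latency=10)
    two = run(["cmcc", "cmcc"], miss_latency=10)
    print(one)  # 14
    print(two)  # 16
```

Run back to back, the two threads would need 28 cycles; sharing the pipeline, they finish in 16, because the second thread issues instructions while the first waits on its miss.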
Just as the degree of instruction-level parallelism dictates the extent to which a microprocessor can benefit from pipelining and superscalar instruction issue, the degree of thread-level parallelism dictates the extent to which it can benefit from multithreaded execution. An important characteristic of a thread is its independence from the other threads executing on the multithreaded microprocessor. A thread is independent of another thread to the extent that its instructions do not depend on instructions in the other thread. This independence allows the microprocessor to execute the instructions of different threads concurrently; that is, the microprocessor can issue instructions of one thread to the execution units without regard to the instructions issued from other threads. To the extent that threads access common data, the threads themselves must be programmed to synchronize their data accesses with one another to ensure correct operation, so that the instruction issue stage of the microprocessor need not be concerned with such dependences.
As the foregoing shows, a processor with multiple thread contexts that concurrently executes multiple threads can reduce the time required to execute a program or collection of programs comprising those threads. However, the introduction of multiple thread contexts also introduces a new set of problems, particularly for system software, which must manage the multiple instruction streams and their associated thread contexts. The present inventors have identified a further level of parallelism that is needed to increase the parallelism of instruction execution in a microprocessor. In this and related applications, the inventors address the provision of virtual processing elements within a single microprocessor. At this level, a multithreaded virtual processing element, in addition to implementing multiple program counters and thread contexts to allow efficient switching between program threads, implements all of the resources required to provide a single instantiation of a given instruction set architecture and privileged resource architecture, sufficient to execute a per-processor operating system image. In effect, a microprocessor that implements N virtual processing elements (that is, a virtual multiprocessor with N virtual processing elements) appears to operating system software as an N-way symmetric multiprocessor. The practical difference between a virtual multiprocessor according to the present invention and a conventional symmetric multiprocessor is that, in addition to sharing memory and some degree of connectivity, the virtual processing elements of a virtual multiprocessor also share its on-chip resources or attributes, such as instruction fetch and issue logic, address translation logic (that is, translation lookaside buffer logic), functional units such as integer units, floating-point units, multimedia units, media acceleration units, and SIMD units, and coprocessors. In addition, the virtual processing elements must share the performance attributes, or utilization aspects (that is, the bandwidth), of the virtual multiprocessor. These are determined by the number of threads allocated to each virtual processing element, by the degree to which threads associated with one virtual processing element may be given priority, when execution demands it, over threads associated with other virtual processing elements, and by the allocation to that virtual processing element of certain processor-wide resources (for example, load and store buffers).
Consider, for example, an embedded system in which two different kinds of processing take place at once: real-time compression of audio-visual data and operation of a graphical user interface. Using late 20th century technology, these tasks would be accomplished with two different processors: a real-time digital signal processor to handle the multimedia data and an interactive processor core to run a multitasking operating system. The present invention allows both functions to execute on the same virtual multiprocessor. Two virtual processing elements of the virtual multiprocessor would be employed: one dedicated to the multimedia processing task and the other dedicated to the user interface task. Employing two virtual processing elements solves the problem of the coexistence, or co-instantiation, of the two different software paradigms, but it does not by itself guarantee the real-time performance that a dedicated processor would provide. The multimedia virtual processing element and the user interface virtual processing element must share certain resources within the virtual multiprocessor, and the performance of an application executing on a virtual multiprocessor depends, as noted above, on how those resources or attributes are allocated to each virtual processing element.
In a market where multiprocessing applications present a wide and diverse range of resource requirements, it would be very costly to manufacture virtual multiprocessors with resources tailored to one particular multiprocessing application. The present inventors have therefore observed that it is desirable to provide a virtual multiprocessor that can be used across a broad range of multiprocessing applications. They have further observed that it is desirable for the virtual multiprocessor to include a mechanism by which software can configure the resources of its various virtual processing elements. Such a mechanism should allow the virtual multiprocessor to be configured with one or more virtual processing elements, each of which is configured to execute one or more threads. It is also desirable that these resources be dynamically configurable at run time by a trusted virtual processing element, and that a mechanism be provided for revoking configuration privileges.
【Summary of the Invention】
The present invention is directed to solving the problems noted above and to addressing other problems, disadvantages, and limitations of the prior art. It provides a superior mechanism for dynamically configuring the resources of a virtual multiprocessor. In one embodiment, an apparatus is provided for configuring the resources of one or more virtual processing elements in a virtual multiprocessor. The apparatus includes a virtual multiprocessor context, one or more virtual processing element contexts, and configuration logic. The virtual multiprocessor context specifies the resources and controls a configuration state of the virtual multiprocessor. Each of the one or more virtual processing element contexts corresponds uniquely to one of the one or more virtual processing elements. Each virtual processing element context has first logic, which specifies whether the corresponding virtual processing element is allowed to configure the resources, and second logic, which specifies the subset of the resources allocated to that virtual processing element. The configuration logic is coupled to the virtual multiprocessor context and to the one or more virtual processing element contexts. The configuration logic detects whether a virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor has entered the configuration state, and configures the resources by updating a specified virtual processing element context.
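The three parts of the apparatus (a virtual multiprocessor context, per-element contexts, and configuration logic) can be modeled in software to make their roles concrete. The sketch below is illustrative only: all class and field names are hypothetical, thread contexts stand in for "the resources", and leaving the configuration state after the update is an assumption that this passage does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VPEContext:
    may_configure: bool = False  # "first logic": may this element configure resources?
    thread_contexts: int = 0     # "second logic": subset of resources assigned to it


@dataclass
class VMPContext:
    total_thread_contexts: int     # resources the virtual multiprocessor provides
    in_config_state: bool = False  # configuration state of the virtual multiprocessor


class ConfigLogic:
    """Coupled to the multiprocessor context and to every element context."""

    def __init__(self, vmp: VMPContext, vpes: List[VPEContext]):
        self.vmp = vmp
        self.vpes = vpes

    def configure(self, requester: int, target: int, thread_contexts: int) -> None:
        # Step 1: detect whether the requesting element is allowed to configure.
        if not self.vpes[requester].may_configure:
            raise PermissionError(f"element {requester} may not configure resources")
        # Reject requests that exceed what the multiprocessor context specifies.
        others = sum(v.thread_contexts for i, v in enumerate(self.vpes) if i != target)
        if thread_contexts < 0 or others + thread_contexts > self.vmp.total_thread_contexts:
            raise ValueError("requested allocation exceeds available resources")
        # Step 2: mark the multiprocessor as being in the configuration state.
        self.vmp.in_config_state = True
        try:
            # Step 3: configure by updating the specified element context.
            self.vpes[target].thread_contexts = thread_contexts
        finally:
            # Assumption: the configuration state is left once the update is done.
            self.vmp.in_config_state = False


if __name__ == "__main__":
    vmp = VMPContext(total_thread_contexts=4)
    vpes = [VPEContext(may_configure=True, thread_contexts=1), VPEContext()]
    logic = ConfigLogic(vmp, vpes)
    logic.configure(requester=0, target=1, thread_contexts=3)
    print(vpes[1].thread_contexts)  # 3
```

In this model, element 0 is the trusted element: it can assign three thread contexts to element 1, while the same request issued by element 1 raises `PermissionError`.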
One aspect of the present invention provides a resource configuration mechanism for assigning resources to virtual processing elements in a virtual multiprocessor. The mechanism has a virtual multiprocessor register, a virtual processing element register for each virtual processing element, and configuration logic. The virtual multiprocessor register specifies the resources and controls the configuration state of the virtual multiprocessor. Each virtual processing element register specifies whether the corresponding virtual processing element is allowed to assign the resources, and specifies the subset of the resources allocated to that virtual processing element. The configuration logic is coupled to the virtual multiprocessor register and to the virtual processing element registers. It detects whether the corresponding virtual processing element is allowed to assign the resources, updates the virtual multiprocessor register to indicate that the virtual multiprocessor has entered the configuration state, and assigns the resources by updating selected ones of the virtual processing element registers.
Another aspect of the present invention provides a computer program product for use with a computing device. The computer program product includes a computer-usable medium having computer-readable program code embodied in the medium, the code being configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor. The computer-readable program code has first program code, second program code, and third program code. The first program code describes a virtual multiprocessor context, which specifies the resources and controls the configuration state of the virtual multiprocessor. The second program code describes virtual processing element contexts, each of which corresponds exclusively to one virtual processing element, specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to it. The third program code describes configuration logic coupled to the virtual multiprocessor context and to the virtual processing element contexts. The configuration logic detects whether one of the virtual processing elements is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor has entered the configuration state, and configures the resources by updating a specified virtual processing element context.
Another aspect of the present invention provides a computer data signal embodied in a transmission medium. The computer data signal has computer-readable program code configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor. The computer-readable program code includes first program code, second program code, and third program code. The first program code describes a virtual multiprocessor context, which specifies the resources and controls the configuration state of the virtual multiprocessor. The second program code describes virtual processing element contexts, each of which corresponds exclusively to one virtual processing element, specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to it. The third program code describes configuration logic coupled to the virtual multiprocessor context and to the virtual processing element contexts. The configuration logic detects whether the virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor has entered the configuration state, and configures the resources by updating a specified virtual processing element context.
Yet another aspect of the present invention provides a method for configuring resources for virtual processing elements in a virtual multiprocessor. The method includes: via a virtual multiprocessor context, specifying the resources and controlling the configuration state of the virtual multiprocessor; via virtual processing element contexts, each of which corresponds exclusively to one of the virtual processing elements, specifying whether that virtual processing element is allowed to configure the resources and specifying the subset of the resources allocated to it; and, via configuration logic coupled to the virtual multiprocessor context and the virtual processing element contexts, detecting whether one of the virtual processing elements is allowed to configure the resources, updating the virtual multiprocessor context to indicate that the virtual multiprocessor has entered the configuration state, and configuring the resources by updating a specified virtual processing element context.
A further aspect of the present invention provides a virtual multiprocessing system that includes a memory and a virtual multiprocessor. The memory stores program instructions associated with a plurality of program threads. The virtual multiprocessor is coupled to the memory and executes the program instructions on one or more virtual processing elements configured within it. The virtual multiprocessor has a virtual multiprocessor context, which specifies the resources available for configuring the one or more virtual processing elements and controls the configuration state of the virtual multiprocessor. Each of the one or more virtual processing elements includes a virtual processing element context and configuration logic. The virtual processing element context specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to a specified one of the one or more virtual processing elements. The configuration logic is coupled to the virtual multiprocessor context and to the virtual processing element context. It detects whether the virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor has entered the configuration state, and configures the resources by updating the virtual processing element context that corresponds to the specified one of the one or more virtual processing elements.
【Brief Description of the Drawings】
These and other objects, features, and advantages of the present invention will become better understood with regard to the following description and accompanying drawings.
FIG. 1 is a block diagram depicting a multiprocessing environment according to the present invention;
FIG. 2 is a block diagram depicting a virtual multiprocessor pipeline according to the present invention;
FIG. 3 is a block diagram showing a dynamically configurable virtual multiprocessor according to the present invention;
FIG. 4 is a table presenting the virtual multiprocessing context registers of an exemplary embodiment of the present invention;
FIG. 5 is a series of block diagrams depicting an exemplary embodiment of each of the virtual multiprocessing context registers of FIG. 4;
FIG. 6 is a flowchart describing a method according to the present invention for dynamic configuration of virtual processor resources; and
FIG. 7 is a flowchart describing a revocable method according to the present invention for dynamic configuration of virtual processor resources.
【Detailed Description】
The following description is presented to enable one of ordinary skill in the art to make and use the present invention as provided within the context of a particular application and its requirements. Various modifications to the preferred embodiment will be apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. In view of the above background discussion of parallelism and the associated multithreading and multiprocessing techniques employed in present-day processors, a discussion of the present invention will now be presented with reference to FIGS. 1 through 7.
Referring to FIG. 1, a block diagram is shown of a multiprocessing environment 100 according to the present invention. The multiprocessing environment 100 includes a virtual multiprocessor 101 coupled to a system interface controller 105. The system interface controller 105 is coupled to a system memory 106 and to one or more input/output devices 107. Each input/output device 107 provides an interrupt request line 108 to the virtual multiprocessor 101. The virtual multiprocessor 101 includes one or more virtual processing elements 102. Each virtual processing element 102 has a corresponding virtual processing element context 104 and one or more corresponding thread contexts 103. The multiprocessing environment 100 may be, but is not limited to, a general-purpose programmable computer system, server computer, workstation computer, personal computer, notebook computer, personal digital assistant, or embedded system such as, without limitation, a network router or switch, printer, mass storage controller, camera, scanner, or automobile controller.
The system memory 106 may be embodied as memory such as dynamic random access memory (RAM) and read-only memory (ROM), for storing program instructions executed by the virtual multiprocessor 101 and for storing data to be processed by the virtual multiprocessor 101 according to those program instructions. The program instructions may comprise one or more program threads that the virtual multiprocessor 101 executes concurrently. A program thread, or thread, comprises a sequence, or stream, of program instructions together with the associated sequence of state changes, within the corresponding virtual processing element 102 of the virtual multiprocessor 101, that results from executing the instruction sequence. Each thread context 103 comprises the hardware state necessary to support execution of a corresponding program thread. In one embodiment, each thread context includes a set of general-purpose registers, a program counter, and other registers that hold the state of the executing thread, such as multiplier state and coprocessor state. Each virtual processing element 102 provides the resources to support one instantiation of a full instruction set architecture and privileged resource architecture, sufficient to execute a single per-processor operating system image. In one embodiment, each virtual processing element 102 provides the resources to support one instantiation of the full MIPS32/MIPS64 instruction set architecture and privileged resource architecture. Each virtual processing element context 104 comprises the hardware state necessary to support the execution of threads within a corresponding virtual processing element 102. In one embodiment, each virtual processing element context 104 specifies the resources allocated to the corresponding virtual processing element 102, such as address translation logic resources (for example, translation lookaside buffer entries), functional units (for example, integer units, floating-point units, multimedia units, media acceleration units, SIMD units, and coprocessors), and performance attributes. In a particular embodiment, the performance attributes include permission to halt and configure the resources allocated to other virtual processing elements 102, the number of threads, activation or inhibition of the corresponding virtual processing element 102, and the bandwidth-related resources of the virtual multiprocessor 101 (for example, instruction execution bandwidth or priority, and load/store bandwidth) that are allocated to the corresponding virtual processing element 102. The present invention contemplates a variety of bandwidth allocation techniques, including scheduling hints, execution priority assignments, and load/store buffer allocation.
The system interface controller 105 and the virtual multiprocessor 101 are coupled to each other via a processor bus. In one embodiment, the system interface controller 105 includes a memory controller for controlling the system memory 106. In one embodiment, the system interface controller 105 includes a local bus interface controller that provides a local bus, such as a PCI bus, for coupling to the I/O devices 107.
The I/O devices 107 may include, but are not limited to, user input devices such as keyboards, mice, and scanners; display devices such as monitors and printers; storage devices such as disk drives, tape drives, and optical drives; system peripheral devices such as direct memory access controllers (DMAC), clocks, timers, and I/O ports; network devices such as media access controllers (MAC) for Ethernet, fiber-optic networks, InfiniBand, or other high-speed network interfaces; and data conversion devices such as analog-to-digital converters and digital-to-analog converters. The I/O devices 107 assert interrupt signals on the interrupt request lines 108 to request service from the virtual multiprocessor 101. Advantageously, the virtual multiprocessor 101 can concurrently execute many program threads that handle the events indicated on the interrupt request lines 108, without the conventional overhead of saving processor state, transferring control to an interrupt service routine, and restoring state after the interrupt service routine completes.
In one embodiment, the virtual multiprocessor 101 provides two distinct, but not mutually exclusive, multithreading capabilities. First, the virtual multiprocessor includes one or more virtual processing elements (VPEs) 102 that support a corresponding one or more logical processor contexts. Through resource sharing within the virtual multiprocessor 101, each logical processor context appears to an operating system as an independent processing element. To an operating system, a virtual multiprocessor 101 with N VPEs 102 looks like an N-way symmetric multiprocessor (SMP), which allows existing SMP-capable operating systems to manage the one or more VPEs 102. Second, each VPE 102 may include one or more thread contexts 103 for concurrently executing a corresponding one or more program threads. Thus, according to the present invention, the virtual multiprocessor 101 provides a multithreaded programming model in which, in the typical case, program threads can be created and destroyed without operating system intervention, and system service threads can be scheduled in response to external conditions (e.g., I/O service event signals) with minimal interrupt latency.
In one embodiment, each thread context comprises one or more storage elements, such as registers or latches, having fields (e.g., bits) that describe the execution state of the corresponding thread. That is, a given thread context 103 describes the state of its thread that is unique to that thread, rather than state shared with other threads executing concurrently on the virtual processing element 102. A thread, also referred to herein as a program thread, thread of execution, or instruction stream, is a sequence of instructions. Each virtual processing element 102 is capable of processing many threads concurrently. By storing the state of each thread within a thread context 103, each virtual processing element 102 in the virtual multiprocessor 101 is configured to switch quickly between threads when fetching and issuing instructions. Advantageously, the virtual multiprocessor 101 of the present invention is configured to execute instructions that move thread context information between different thread contexts 103, as described in detail in the co-pending U.S. patent application (Docket No. MIPS.0194-00-US) entitled "Mechanisms for Software Management of Multiple Computing Contexts".
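In one embodiment, each VPE context 104 comprises a collection of storage elements, such as registers or latches, having fields (e.g., bits) that describe the execution state of the corresponding VPE 102 and that provide for configuration of the resources of the corresponding VPE 102. These resources include, but are not limited to, address translation resources, coprocessing resources (e.g., floating point processor, media processor), thread capacity and enumeration, permission for a particular VPE 102 to activate or inhibit execution, and permission to configure the resources of a particular VPE 102. In one embodiment, a VPE 102 can configure its own resources by updating its VPE context 104. In addition, a VPE 102 can configure the resources of a different VPE 102 by updating the VPE context 104 that corresponds to that different VPE 102. Thus, a virtual multiprocessor 101 having N VPEs 102 appears to an operating system or other symmetric multiprocessing application as an N-way symmetric multiprocessor. In one embodiment, the VPEs 102 share certain resources of the virtual multiprocessor 101, such as the instruction cache, instruction fetcher, instruction decoder, instruction issuer, instruction scheduler, execution units, coprocessing units, and data cache, in a manner transparent to the operating system. The scope and extent of resource sharing is prescribed by the VPE contexts 104 and can be configured dynamically, at run time or at other times, by updating the VPE contexts 104. For a given VPE 102 to configure its own resources, or the resources prescribed for other VPEs 102, its own VPE context 104 must specify that the given VPE 102 is permitted to configure the resources of the virtual multiprocessor 101, as described in more detail below. Accordingly, if the VPE context 104 of a given VPE 102 indicates that the VPE is permitted to configure resources, that VPE 102 may update any of the VPE contexts 104 to provide for dynamic resource configuration, including modification of resource configuration permissions, which encompasses the ability to revoke configuration permission. In one embodiment, each VPE 102 substantially conforms to a MIPS32 or MIPS64 instruction set architecture (ISA) and a MIPS privileged resource architecture (PRA), and each VPE context 104 includes the MIPS PRA Coprocessor 0 and the system state needed to describe one instantiation thereof. In one embodiment, the VPE context 104 includes the VPECONTROL register 504, VPECONF0 register 505, VPECONF1 register 506, and VPESCHEDULE register 592 described with reference to FIGS. 5D-5G. In one aspect, a VPE 102 may be viewed as an exception domain. That is, when one of the thread contexts 103 of a VPE 102 generates an exception, multithreading on the VPE 102 is suspended (i.e., only instructions of the instruction stream associated with the thread context 103 servicing the exception are fetched and issued), and each VPE context 104 includes the state necessary to service the exception. Once the exception has been serviced, the exception handler may selectively restart multithreading on the VPE 102.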
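Referring now to FIG. 2, a block diagram is presented illustrating a virtual multiprocessor pipeline 200 within a virtual multiprocessor according to the present invention. The pipeline 200 includes a plurality of pipeline stages and additionally includes one or more thread contexts 103. The exemplary embodiment of FIG. 2 shows four thread contexts 103. In one embodiment, each thread context 103 comprises a program counter (PC) 222 for storing the address at which the next instruction in the associated instruction stream is fetched, a general-purpose register (GPR) set 224 for storing intermediate execution results of the instruction stream issued from the thread according to the value of the program counter 222, and other per-thread context 226. In one embodiment, the pipeline 200 includes a multiplier unit (not shown), and the other per-thread context 226 includes registers for storing the multiplier unit results that pertain specifically to multiply instructions in the instruction stream. In one embodiment, the other per-thread context 226 includes information that uniquely identifies each thread context 103. In one embodiment, the thread identification information includes information specifying the execution privilege level of the associated thread, for example whether the thread is a kernel, supervisor, or user level thread. In one embodiment, the thread identification information includes information identifying a task or process that comprises the thread. In particular, the task identification information may be used as an address space identifier (ASID) when translating virtual addresses into physical addresses.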
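The pipeline 200 includes a scheduler 216 for scheduling the multiple threads executed concurrently by the virtual multiprocessor 101. The scheduler 216 is coupled to the VMP context 210, to the VPE contexts 104 of FIG. 1, and to the other per-thread context 226. In particular, the scheduler 216 is responsible for scheduling the fetching of instructions from the program counters 222 of the various thread contexts 103 and for scheduling the issue of the fetched instructions to the execution units 212 of the virtual multiprocessor 101, as described below. The scheduler 216 schedules execution of the threads according to a scheduling policy of the virtual multiprocessor 101. The scheduling policy may include, but is not limited to, any of the following. In one embodiment, the scheduler 216 employs a round-robin, time-division-multiplexed, or interleaved policy that allocates a predetermined number of clock cycles, or instruction issue slots, to each ready thread in rotating order. A round-robin policy is useful in applications where fairness is important and a minimum quality of service is required for certain threads, such as real-time application threads. In one embodiment, the scheduler 216 employs a blocking policy in which it continues to schedule fetching and issuing for the currently executing thread until an event occurs that blocks the thread's further progress, such as a cache miss, a branch misprediction, a data dependency, or a long-latency instruction. In one embodiment, the pipeline 200 is a superscalar pipeline employing multiple execution units 212, and the scheduler 216 schedules the issue of multiple instructions per clock cycle and, in particular, of instructions from multiple threads per clock cycle, commonly referred to as simultaneous multithreading. In other embodiments, the scheduler 216 employs a policy that uses scheduling information supplied via the VPE contexts 104, where the scheduling information indicates the bandwidth and/or bandwidth-related resources allocated to each VPE 102.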
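The round-robin policy described above can be illustrated with a short software model. This is only a sketch under stated assumptions: the function name `round_robin`, the thread labels, and the `slots_per_turn` parameter are hypothetical, and the actual scheduler 216 is hardware, not Python.

```python
from collections import deque
from itertools import islice
from typing import Iterator, List


def round_robin(ready_threads: List[str], slots_per_turn: int) -> Iterator[str]:
    """Yield the owner of each successive issue slot, rotating through ready threads."""
    if slots_per_turn < 1:
        raise ValueError("slots_per_turn must be at least 1")
    queue = deque(ready_threads)
    while queue:
        current = queue[0]
        # The current thread receives a fixed, predetermined number of issue slots.
        for _ in range(slots_per_turn):
            yield current
        queue.rotate(-1)  # the thread that just ran moves to the back


# First eight issue slots for three ready threads, two slots per turn.
schedule = list(islice(round_robin(["T0", "T1", "T2"], 2), 8))
```

Because every ready thread receives the same number of slots per rotation, no thread can starve another, which is the fairness property the text attributes to this policy.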
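The pipeline 200 includes an instruction cache 202 for caching program instructions fetched from system memory. In one embodiment, the pipeline 200 provides virtual memory capability, and the fetch unit 204 includes a translation lookaside buffer (TLB, not shown) for caching virtual-to-physical memory page translations. In this embodiment, resources within the TLB (e.g., entries) are allocated to each VPE 102 sharing the pipeline 200, as prescribed by the VPE contexts 104. In one embodiment, each program or task executing in the pipeline 200 is assigned a unique task ID, or address space ID (ASID), which is used to perform memory accesses and, specifically, memory address translation. A thread context 103 also includes storage for the ASID associated with its thread.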
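The pipeline 200 also includes a fetch unit 204, coupled to the instruction cache 202, for fetching program instructions from the instruction cache 202 and system memory. The fetch unit 204 fetches instructions at the instruction fetch address supplied by a multiplexer 244. The multiplexer 244 receives a plurality of instruction fetch addresses from a corresponding plurality of program counters 222. Each program counter 222 stores the current instruction fetch address for a different program thread. The embodiment of FIG. 2 illustrates four program counters 222 associated with four different threads. The multiplexer 244 selects one of the four program counters 222 according to a select input provided by the scheduler 216. In one embodiment, the different threads executing on the virtual multiprocessor 101 share the fetch unit 204.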
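The pipeline 200 also includes a decode unit 206, coupled to the fetch unit 204, for decoding the program instructions fetched by the fetch unit 204. The decode unit 206 decodes the opcode, operands, and other fields of the instructions. In one embodiment, the different threads executing on the virtual multiprocessor 101 share the decode unit 206.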
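The pipeline 200 also includes execution units 212 for executing instructions. The execution units 212 may include, but are not limited to, one or more integer units for performing integer arithmetic, Boolean, shift, and rotate operations; floating point units for performing floating point operations; load/store units for performing memory accesses, in particular accesses to a data cache 242 coupled to the execution units 212; multimedia acceleration units for performing multimedia operations; and a branch resolution unit for resolving the outcome and target address of branch instructions. In one embodiment, the data cache 242 includes a translation lookaside buffer for caching virtual-to-physical memory page translations. In addition to operands received from the data cache 242, the execution units 212 also receive operands from registers of the general-purpose register sets 224. Specifically, an execution unit 212 receives operands from the register set 224 of the thread context 103 allocated to the thread to which the instruction belongs. A multiplexer 248 selects operands from the appropriate register set 224 for provision to the execution units 212. In addition, the multiplexer 248 receives data from the other per-thread context 226 and the program counter 222 of each thread context 103, for selective provision to the execution units 212 based on the thread context 103 of the instruction being executed. In one embodiment, different execution units 212 can concurrently execute instructions from multiple concurrent threads.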
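The pipeline 200 also includes an instruction issue unit 208, coupled to the scheduler 216 and coupled between the decode unit 206 and the execution units 212, for issuing instructions to the execution units 212 as directed by the scheduler 216 and in response to information about the instructions decoded by the decode unit 206. In particular, the instruction issue unit 208 ensures that instructions are not issued to the execution units 212 if they have data dependencies on other instructions previously issued to the execution units 212. In one embodiment, an instruction queue (not shown) is placed between the decode unit 206 and the instruction issue unit 208 to buffer instructions awaiting issue, which reduces the likelihood that the execution units 212 are starved. In one embodiment, the many threads executing in the pipeline 200 share the instruction issue unit 208.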
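The data-dependency check performed at issue can be sketched in software as follows. This is an illustrative model only: the `Instr` record and the `can_issue` function are hypothetical names, and the sketch assumes that each thread context has its own register set (as the description of register sets 224 suggests), so that only instructions of the same thread can conflict on a register number.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    thread: int                  # thread context that owns the instruction
    dest: Optional[int]          # destination register number, if any
    srcs: Tuple[int, ...] = ()   # source register numbers


def can_issue(candidate: Instr, in_flight: List[Instr]) -> bool:
    """Return False while an earlier in-flight instruction of the same thread
    still has to write a register that the candidate reads or writes."""
    for older in in_flight:
        if older.thread != candidate.thread or older.dest is None:
            continue  # other threads use a separate register set
        if older.dest in candidate.srcs or older.dest == candidate.dest:
            return False
    return True


# An add in thread 0 that writes r3 is still executing.
pending = [Instr(thread=0, dest=3, srcs=(1, 2))]
same_thread_reader = Instr(thread=0, dest=5, srcs=(3,))
other_thread_reader = Instr(thread=1, dest=5, srcs=(3,))
```

In this model the reader in thread 0 must wait, while the instruction from thread 1 can issue immediately, which is one reason a multithreaded pipeline can keep its execution units busy.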
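The pipeline 200 also includes a write-back unit 214, coupled to the execution units 212, for writing instruction results back to the general-purpose register sets 224, the program counters 222, and the other per-thread context 226. A demultiplexer 246 receives the instruction results from the write-back unit 214 and stores them in the appropriate register set 224, program counter 222, and other per-thread context 226 associated with the instruction's thread. Instruction results are also provided for storage into the VPE contexts 104 and a virtual multiprocessor (VMP) context 210.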
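In one embodiment, the VMP context 210 comprises a collection of storage elements, such as registers or latches, having one or more fields (e.g., bits) that describe the execution state of the virtual multiprocessor 101. In particular, the VMP context 210 stores state pertaining to all of the resources of the virtual multiprocessor 101 that are shared among the VPEs 102, as described above. Specifically, the VMP context specifies the resources that may be allocated to the VPEs 102 during configuration, and it also controls whether the virtual multiprocessor 101 is in a configuration state in which those resources can be configured. In one embodiment, the VMP context 210 includes the MVPCONTROL register 501, MVPCONF0 register 502, and MVPCONF1 register 503 of FIGS. 5A-5C, described below.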
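The specific stages 202, 204, 206, 208, 212, 214 of the pipeline 200 shown in FIG. 2 are provided to teach the present invention clearly without obscuring its essential aspects. Those skilled in the art will appreciate that the staging of the pipeline 200 may be modified to improve performance, by increasing or decreasing the number of stages or by assigning different functions to the stages, without departing from the spirit and scope of the present invention.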
Referring to FIG. 3, a block diagram of a dynamically configurable virtual multiprocessor 300 according to the present invention is shown. The multiprocessor 300 includes one or more VPEs 302-304, enumerated as VPE 1 302, VPE 2 303, through VPE N 304. Each VPE 302-304 has a corresponding VPE context 305-307. The VPEs 302-304 and the VMP context 210 are coupled to execution logic 212, as described above with reference to FIG. 2. The execution logic 212 includes VPE configuration logic 310, which is coupled to an exception signal 311. The block diagram also shows one or more resources 322, 324, 326, 328, enumerated as RESOURCE 1 322, RESOURCE 2 324, RESOURCE 3 326, through RESOURCE M 328.
In operation, the resources 322-328 are configured by executing a configuration instruction sequence issued by a VPE 302-304 that is permitted to configure them. In one embodiment, permission to configure the resources 322-328 is prescribed by the VPE context 305-307 of the corresponding VPE 302-304. When a configuration instruction sequence is received by the execution logic 212 in the pipeline 200, the VPE configuration logic 310 accesses the VPE context 305-307 of the VPE 302-304 whose program thread caused the sequence to be fetched, in order to determine whether that VPE is permitted to configure the resources 322-328. If it is not, the configuration logic 310 asserts the exception signal 311 and the configuration instruction sequence is not executed. If the VPE is permitted to configure the resources 322-328, the VPE configuration logic 310 executes the sequence, which directs the virtual multiprocessor 300 into a configuration state and updates one or more specified VPE contexts 305-307, thereby reconfiguring the resources. In one embodiment, the configuration instruction sequence directs the virtual multiprocessor 300 into the configuration state by updating the VMP context 210. In one embodiment, the configuration instruction sequence comprises instructions according to the MIPS32/MIPS64 Multithreading (MT) Application-Specific Extension (ASE) architecture.
The block diagram also shows a specific example of the resources 322-328 as configured by execution of a configuration instruction sequence, and it graphically depicts how particular resources 322-328 can be dynamically configured according to the present invention to optimize the performance of concurrently executing threads in a given multithreaded multiprocessing application. For example, consider that the chart for RESOURCE 1 322 corresponds to address translation resources (e.g., translation lookaside buffer entries). As the chart shows, VPE 1 302 is assigned a portion of the address translation resources that is smaller than the portions allocated to the remaining VPEs 303-304. Perhaps the threads executing on VPE 1 302 are short and repetitive relative to other threads and therefore do not require extensive address translation resources. Consider also that RESOURCE 2 324 represents contexts of a multithreaded coprocessor (e.g., floating point element, media element, SIMD element). VPE 2 303, as specified in its VPE context 306, is allocated fewer contexts than the other VPEs 302, 304, perhaps because the operations directed by the instruction threads issued by VPE 2 303 do not require substantial coprocessing resources. Consider further that RESOURCE 3 326 represents resource configuration permission. As the chart shows, only VPE 2 303 is permitted to configure the resources 322-328 of the virtual multiprocessor 300. Note that a given VPE 302-304 that holds configuration permission (VPE 2 303 in this example) can grant configuration permission to other VPEs 302-304, revoke their configuration permission, or revoke its own. This is accomplished by updating the specified VPE contexts 305-307 as described herein. Finally, consider that RESOURCE M 328 is a bandwidth resource that allocates the bandwidth of the virtual multiprocessor 300 among its VPEs 302-304 according to the implemented scheduling policy, as described above. The chart shows each of the exemplary VPEs 302-304 receiving an equal share of the multiprocessor bandwidth, whether by direct bandwidth allocation, by setting approximately equal execution priorities, or by other techniques for prescribing bandwidth or bandwidth-related resources. One such technique contemplated by the present invention is the allocation of load/store bandwidth to the VPEs 302-304. For example, if the number of memory operation buffers (not shown) in the virtual multiprocessor 300 that are shared among the VPEs 302-304 is less than the number of executing threads, then before executing a memory operation associated with a thread of a given VPE 302-304, the virtual multiprocessor 300 evaluates whether to switch the given thread out, because the operation might exceed the bandwidth-related resource allocation prescribed for the given VPE 302-304. Such a bandwidth allocation scheme advantageously addresses the case in which a small number of threads associated with one of the VPEs 302-304, for example threads generating a long series of cache misses, could monopolize a bandwidth-related resource (the memory operation buffers in this example) and thereby prevent threads from other VPEs 302-304 from executing. By prescribing shares of bandwidth-related resources, the present invention precludes such a situation in the virtual multiprocessor 300.
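The load/store bandwidth example can be modeled as a per-VPE quota on shared memory operation buffers. The sketch below is an assumption-laden illustration: the class `MemoryOpBuffers`, its method names, and the quota values are hypothetical, and the patent does not specify how the check is implemented in hardware.

```python
from typing import Dict


class MemoryOpBuffers:
    """Shared memory operation buffers with a prescribed share per VPE."""

    def __init__(self, quota: Dict[int, int]) -> None:
        self.quota = dict(quota)                 # buffers each VPE may hold
        self.in_use = {vpe: 0 for vpe in quota}  # buffers each VPE holds now

    def try_acquire(self, vpe: int) -> bool:
        """Claim one buffer for a thread of `vpe`.

        Returns False when the VPE has used up its share, which is the point
        at which the thread would be switched out instead of issuing the
        memory operation.
        """
        if self.in_use[vpe] >= self.quota[vpe]:
            return False
        self.in_use[vpe] += 1
        return True

    def release(self, vpe: int) -> None:
        """Return one buffer when a memory operation of `vpe` completes."""
        if self.in_use[vpe] > 0:
            self.in_use[vpe] -= 1


buffers = MemoryOpBuffers({1: 2, 2: 2})
# VPE 1 suffers a burst of cache misses and tries to take three buffers.
vpe1_attempts = [buffers.try_acquire(1) for _ in range(3)]
# VPE 2 can still make progress because its share is untouched.
vpe2_attempt = buffers.try_acquire(2)
```

The third request from VPE 1 is refused while VPE 2 still obtains a buffer, which mirrors the monopolization scenario that the prescribed shares are meant to prevent.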
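Referring to FIG. 4, a table 400 is presented depicting virtual multiprocessing context registers according to an exemplary embodiment of the present invention. The virtual multiprocessing context registers are employed to configure a virtual multiprocessor context 210 or a virtual processing element context 104, as described above. The virtual multiprocessor context comprises the registers MVPCONTROL, MVPCONF0, and MVPCONF1. The virtual processing element context for each VPE within a virtual multiprocessor comprises the registers VPECONTROL, VPECONF0, VPECONF1, and VPESCHEDULE. Table 400 shows the registers in accordance with the Multithreading (MT) Application-Specific Extension (ASE) to the MIPS32/MIPS64 instruction set and privileged resource architecture, in which a CP0 register number and a register select number are assigned to each listed register for accessing its contents. The architecture and contents of these registers are discussed with reference to FIG. 5.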
FIG. 5 is a series of block diagrams depicting exemplary embodiments of each of the virtual multiprocessor context registers 501-506, 592 of FIG. 4. FIGS. 5A-5F include the fields of each register and a table describing the various fields; the particularly relevant fields are discussed in detail here. Each register illustrated in FIG. 5 can be selectively read or written by a VPE whose MVP field 553 in the VPECONF0 register 505 indicates that the VPE has permission to configure resources dynamically. Certain fields in registers 501-506, 592 are writable only by a VPE whose MVP field 553 indicates that it has configuration permission. Otherwise, those fields are read-only, as enforced by the configuration logic 310.
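The MVPCONTROL register 501 has an STLB field 511, a VPC field 512, and an EVP field 513. A VPE 102 with configuration permission, as described above, can update the VPC field 512 and the EVP field 513 to place the virtual multiprocessor 101 in a configuration state for resource configuration. Clearing the VPC field 512 and setting the EVP field 513 causes the new resource values to be latched in the configuration registers 501-506, 592 and virtual processing to resume. A VPE 102 with configuration permission can update the STLB field 511 to share address translation resources.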
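The MVPCONF0 register 502 and the MVPCONF1 register 503 are read-only registers that a VPE 102 with configuration permission reads to determine the number and extent of the configurable resources provided in a given virtual multiprocessor 101. Field TLBS indicates that address translation resources are sharable and that such sharing can be configured by setting the STLB field 511 of the MVPCONTROL register 501. Field PVPE 524 specifies the total number of VPEs 102 provided by the virtual multiprocessor 101; the embodiment of FIG. 5 accommodates up to 16 VPEs 102. Field PTC 525 indicates the total number of thread contexts 103 provided by the virtual multiprocessor 101; in the illustrated embodiment, up to 256 thread contexts 103 may be instantiated. Field C1M 531 indicates whether the allocatable coprocessors support multimedia extensions. Field C1F 532 indicates whether the allocatable coprocessors support floating point. Fields 533-535 indicate the total numbers of other ISA-specific resources available for allocation to VPEs 102.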
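Reading the capacity fields can be sketched as ordinary bit-field extraction. The text above gives only the field names and their ranges (up to 16 VPEs, up to 256 thread contexts); the bit positions, the "count minus one" encoding, and the function name `decode_mvpconf0` below are assumptions made for illustration and should be checked against the register figures before being relied on.

```python
from typing import Dict, Union

# Hypothetical layout: 8 bits cover 256 thread contexts, 4 bits cover 16 VPEs.
PTC_SHIFT, PTC_BITS = 0, 8      # assumed position of field PTC
PVPE_SHIFT, PVPE_BITS = 10, 4   # assumed position of field PVPE
TLBS_SHIFT, TLBS_BITS = 29, 1   # assumed position of field TLBS


def _field(value: int, shift: int, bits: int) -> int:
    """Extract an unsigned bit field from a register value."""
    return (value >> shift) & ((1 << bits) - 1)


def decode_mvpconf0(value: int) -> Dict[str, Union[int, bool]]:
    """Summarize the configurable capacity reported by an MVPCONF0 value.

    Assumes each count is stored as (count - 1), so that an n-bit field
    can represent 1 through 2**n resources.
    """
    return {
        "thread_contexts": _field(value, PTC_SHIFT, PTC_BITS) + 1,
        "vpes": _field(value, PVPE_SHIFT, PVPE_BITS) + 1,
        "tlb_sharable": bool(_field(value, TLBS_SHIFT, TLBS_BITS)),
    }


# Example value: sharable TLB, 4 VPEs, 9 thread contexts (under the assumed layout).
example = decode_mvpconf0((1 << 29) | (3 << 10) | 8)
```

A configuring VPE would use such a summary as the upper bound when it later divides thread contexts and translation entries among the VPEs.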
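Resources are allocated to a specific VPE 102 by writing that VPE's number to the TARGVPE field 334 of the VPECONTROL register 504. In one embodiment, field 334 is written by way of the MIPS MTTR and MFTR instructions noted above.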
The value of field VPA 552 in the VPECONF0 register 505 is set to activate or deactivate a specified VPE 102. Field MVP 553 is set to grant or revoke resource configuration permission. Fields MINTC 554 and MAXTC 555 are updated to allocate the number and enumeration of thread contexts 103 to a specified VPE 102. In the MIPS32/MIPS64 Multithreading Application-Specific Extension embodiment of the present invention, fields NCX 561, NCP2 562, and NCP1 563 are updated to allocate coprocessor resources to a specific VPE 102. As noted above, and as the tables of FIGS. 5E-5F show, the resource allocation fields 552-555, 561-563 are read-only for every VPE 102 that lacks resource configuration permission, as indicated by the state of the MVP bit 553 in its VPECONF0 register 505. For a VPE 102 that has been granted resource configuration permission, the configuration logic 310 enables the noted fields 552-555, 561-563 to be updated (i.e., written).
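One way to picture the thread context allocation is as a partition of thread context numbers among the VPEs. The sketch below assumes that MINTC and MAXTC hold the lowest and highest thread context numbers assigned to a VPE, inclusive; that reading, along with the function name `partition_thread_contexts`, is an assumption for illustration rather than something the text states.

```python
from typing import Dict, List, Tuple


def partition_thread_contexts(
    ranges: Dict[int, Tuple[int, int]], total: int
) -> Dict[int, List[int]]:
    """Expand per-VPE (MINTC, MAXTC) pairs into lists of thread context numbers.

    Raises ValueError if a range falls outside 0..total-1 or if two VPEs
    claim the same thread context.
    """
    owner: Dict[int, int] = {}
    result: Dict[int, List[int]] = {}
    for vpe, (lowest, highest) in ranges.items():
        if not 0 <= lowest <= highest < total:
            raise ValueError(f"VPE {vpe}: invalid range {lowest}..{highest}")
        for tc in range(lowest, highest + 1):
            if tc in owner:
                raise ValueError(
                    f"thread context {tc} claimed by VPE {owner[tc]} and VPE {vpe}"
                )
            owner[tc] = vpe
        result[vpe] = list(range(lowest, highest + 1))
    return result


# Eight thread contexts: VPE 1 gets two, VPE 2 gets four, two stay unassigned.
allocation = partition_thread_contexts({1: (0, 1), 2: (2, 5)}, total=8)
```

Rejecting overlapping ranges is a design choice of this sketch; the description does not say whether hardware detects such a configuration.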
The VPESCHEDULE register 592 includes a scheduler hint field 529 that can be updated to allocate bandwidth resources across the VPEs 102 in the virtual multiprocessor 101.
Although FIGS. 4 and 5 depict an exemplary embodiment of the present invention in which certain resources can be dynamically configured in a MIPS32/MIPS64 Multithreading Application-Specific Extension environment, the inventors note that this exemplary embodiment is presented in terms of a known instruction set architecture in order to teach aspects of the invention. The inventors also note that other architectures are likewise contemplated.
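Referring to FIG. 6, a flowchart 600 is presented illustrating a method according to the present invention for dynamic configuration of virtual processor resources. The method begins at block 602, where a VPE according to the present invention seeks to configure resources dynamically. Flow proceeds to block 604.
At block 604, the VPE context corresponding to the requesting VPE is read. Flow proceeds to decision block 606.
At decision block 606, the VPE context is evaluated to determine whether the requesting VPE is permitted to configure resources dynamically in the virtual multiprocessor. If so, flow proceeds to block 608. If not, flow proceeds to block 607.
At block 607, because the requesting VPE lacks resource configuration permission, an exception is raised and flow proceeds to block 620.
At block 608, virtual processing in the virtual multiprocessor is disabled to allow resource configuration. Flow proceeds to block 610.
At block 610, a configuration state is established in the virtual multiprocessor. Flow proceeds to block 612.
At block 612, a VMP context in the virtual multiprocessor is accessed to determine which resources, and how many, are available for configuration. Flow proceeds to block 614.
At block 614, a target VPE is selected for configuration of its allocated resources. Flow proceeds to block 616.
At block 616, the resources are configured for the selected VPE by updating its corresponding VPE context. Flow proceeds to block 618.
At block 618, the new resource configuration for the selected VPE is latched by exiting the configuration state, and virtual processing in the virtual multiprocessor is re-enabled. Flow proceeds to block 620.
At block 620, the method completes.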
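The flow of blocks 602-620 can be followed in a small software model. This is a sketch under stated assumptions: the dictionary keys (`may_configure`, `processing_enabled`, `config_state`, `available`) and the function name `configure_resources` are hypothetical, and the check against the available amount is an added illustration of how block 612 might be used, not a step the flowchart spells out.

```python
from typing import Any, Dict


def configure_resources(
    vmp: Dict[str, Any], requester: int, target: int, settings: Dict[str, int]
) -> None:
    """Apply `settings` to the target VPE context on behalf of `requester`."""
    requester_ctx = vmp["vpe_contexts"][requester]          # block 604
    if not requester_ctx["may_configure"]:                   # block 606
        # block 607: the request is rejected and nothing is changed
        raise PermissionError(f"VPE {requester} may not configure resources")

    vmp["processing_enabled"] = False                        # block 608
    vmp["config_state"] = True                               # block 610
    try:
        limits = vmp["available"]                            # block 612
        for name, amount in settings.items():
            if amount > limits[name]:
                raise ValueError(f"{name}: {amount} exceeds {limits[name]}")
        vmp["vpe_contexts"][target].update(settings)         # blocks 614, 616
    finally:
        vmp["config_state"] = False                          # block 618
        vmp["processing_enabled"] = True


machine = {
    "processing_enabled": True,
    "config_state": False,
    "available": {"tlb_entries": 16},
    "vpe_contexts": {
        1: {"may_configure": True, "tlb_entries": 8},
        2: {"may_configure": False, "tlb_entries": 8},
    },
}
# VPE 1 holds permission and shrinks the translation share of VPE 2.
configure_resources(machine, requester=1, target=2, settings={"tlb_entries": 4})
```

The `try`/`finally` pairing reflects the idea that the configuration state is always exited and virtual processing always resumes, whether or not the requested values were accepted.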
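FIG. 7 is a flowchart 700 depicting a revocable method according to the present invention for dynamic configuration of virtual processor resources. All blocks 702-720 of the flowchart 700 of FIG. 7 are identical to the corresponding blocks 602-620 of the flowchart 600 of FIG. 6, with the hundreds digit replaced by 7, except for an additional block 717, in which the VPE context of the selected VPE is updated to revoke its permission to configure resources dynamically. The requesting VPE of block 702 may be the same as the selected VPE of block 717, which enables a VPE to revoke its own configuration permission. In that case, once the new configuration is latched in block 718, the requesting VPE can no longer configure resources.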
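The effect of the additional block 717 can be shown with a minimal model of permission revocation. The names `revoke_permission` and `may_configure` are hypothetical, and the sketch deliberately omits the configuration-state steps in order to isolate the permission change.

```python
from typing import Dict


def revoke_permission(
    vpe_contexts: Dict[int, Dict[str, bool]], requester: int, target: int
) -> None:
    """Clear the configuration permission of `target` on behalf of `requester`."""
    if not vpe_contexts[requester]["may_configure"]:
        raise PermissionError(f"VPE {requester} may not configure resources")
    vpe_contexts[target]["may_configure"] = False  # corresponds to block 717


contexts = {
    1: {"may_configure": True},
    2: {"may_configure": True},
}
# VPE 1 first revokes the permission of VPE 2, then gives up its own.
revoke_permission(contexts, requester=1, target=2)
revoke_permission(contexts, requester=1, target=1)
```

After the second call no VPE in this model holds permission, so any further request raises `PermissionError`; the sketch makes visible that self-revocation cannot be undone by the VPE that performed it.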
Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are encompassed by the invention. For example, in addition to implementations of the invention using hardware, the invention can be embodied in software (e.g., computer readable code, program code, instructions, and/or data) disposed in a computer usable (e.g., readable) medium. Such software enables the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods described herein. For example, this can be accomplished through the use of general programming languages (e.g., C, C++, JAVA), GDSII databases, hardware description languages (HDL) including Verilog HDL and VHDL, or other available programs, databases, and/or circuit (i.e., schematic) capture tools. Such software can be disposed in any known computer usable (e.g., readable) medium, including semiconductor memory, magnetic disk, and optical disc (e.g., CD-ROM, DVD-ROM), and as a computer data signal embodied in a computer usable (e.g., readable) transmission medium (e.g., carrier wave or any other medium, including digital, optical, or analog-based media). Such software can be transmitted over communication networks including the Internet and intranets.
The present invention can be implemented in software (e.g., in HDL as part of a semiconductor intellectual property core, such as a microprocessor core, or as a system-level design, such as a system on chip, or SOC) and transformed to hardware as part of the production of integrated circuits. The present invention may also be embodied as a combination of hardware and software.
Finally, those skilled in the art will appreciate that they can readily use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures that carry out the same purposes of the present invention, without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (69)
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US49918003P | 2003-08-28 | 2003-08-28 | |
| US60/499,180 | 2003-08-28 | ||
| US60/502,359 | 2003-09-12 | ||
| US60/502,358 | 2003-09-12 | ||
| US10/684,348 | 2003-10-10 | ||
| US10/684,350 | 2003-10-10 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1842771A (en) | 2006-10-04 |
| CN100538640C (en) | 2009-09-09 |
Family
ID=37031160
Family Applications (4)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN 200480024800 | Pending | CN1842770A (en) | 2003-08-28 | 2004-08-26 | A holistic mechanism for suspending and releasing threads of computation during execution in a processor |
| CNB2004800247988A | Expired - Fee Related | CN100489784C (en) | 2003-08-28 | 2004-08-27 | Multithreading microprocessor and its novel threading establishment method and multithreading processing system |
| CN2004800248529A | Expired - Fee Related | CN1846194B (en) | 2003-08-28 | 2004-08-27 | Method and apparatus for executing parallel program threads |
| CNB2004800248016A | Expired - Fee Related | CN100538640C (en) | 2003-08-28 | 2004-08-27 | Apparatus for dynamically configuring virtual processor resources |
Country Status (1)
| Country | Link |
|---|---|
| CN (4) | CN1842770A (en) |
Also Published As
| Publication number | Publication date |
|---|---|
| CN100538640C (en) | 2009-09-09 |
| CN1846194A (en) | 2006-10-11 |
| CN1842770A (en) | 2006-10-04 |
| CN100489784C (en) | 2009-05-20 |
| CN1846194B (en) | 2010-12-15 |
| CN1842769A (en) | 2006-10-04 |
Legal Events
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C14 | Grant of patent or utility model | |
| GR01 | Patent grant | |
| CP01 | Change in the name or title of a patent holder | Address after: California, USA. Patentee after: Imagination Technologies Ltd. Address before: California, USA. Patentee before: Imagination Technology Co.,Ltd. |
| CP01 | Change in the name or title of a patent holder | Address after: California, USA. Patentee after: Imagination Technology Co.,Ltd. Address before: California, USA. Patentee before: Mips Technologies, Inc. |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2009-09-09. Termination date: 2020-08-27. |