CN1842771A - Mechanisms for dynamic configuration of virtual processor resources - Google Patents

Publication number
CN1842771A
Authority
CN
China
Prior art keywords
virtual processing
context
virtual
resource
processing element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200480024801
Other languages
Chinese (zh)
Other versions
CN100538640C (en)
Inventor
Kevin Kissell (凯文·基塞尔)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Imagination Technologies Ltd
MIPS Tech LLC
Original Assignee
MIPS Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIPS Technologies Inc
Publication of CN1842771A
Application granted
Publication of CN100538640C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Executing Machine-Instructions (AREA)
  • Multi Processors (AREA)
  • Advance Control (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An apparatus for configuring resources for one or more virtual processing elements in a virtual multiprocessor includes a virtual multiprocessor context, one or more virtual processing element contexts, and configuration logic. The virtual multiprocessor context specifies the resources and controls the configuration state of the virtual multiprocessor. Each virtual processing element context corresponds to one of the virtual processing elements and has first logic, which specifies whether that virtual processing element is allowed to configure the resources, and second logic, which specifies the subset of the resources allocated to that virtual processing element. The configuration logic is coupled to the virtual multiprocessor context and to the virtual processing element contexts. It detects whether a virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to place the virtual multiprocessor in a configuration state, and configures the resources by updating a specified virtual processing element context.

Description

Mechanisms for Dynamic Configuration of Virtual Processor Resources

【Cross-Reference to Related Applications】

This application claims the benefit of the following U.S. provisional patent applications, which are hereby incorporated by reference.

Serial No. (Docket No.)         Filing Date   Title
60/499180 (MIPS.0188-00-US)     8/28/03       Multithreading Application Specific Extension
60/502358 (MIPS.0188-02-US)     9/12/03       Multithreading Application Specific Extension to a Processor Architecture
60/502359 (MIPS.0188-03-US)     9/12/03       Multithreading Application Specific Extension to a Processor Architecture

This application is a continuation-in-part of the following co-owned non-provisional U.S. patent applications, each of which has the same assignee and at least one common inventor, and which are hereby incorporated by reference.

Serial No. (Docket No.)         Filing Date   Title
10/684350 (MIPS.0188-01-US)     10/10/03      Mechanisms for Assuring Quality of Service for Programs Executing on a Multithreaded Processor
10/684348 (MIPS.0189-00-US)     10/10/03      Integrated Mechanism for Suspension and Deallocation of Computational Threads of Execution in a Processor

The two co-owned non-provisional U.S. patent applications listed above claim the benefit of the following U.S. provisional patent applications.

Serial No. (Docket No.)         Filing Date   Title
60/499180 (MIPS.0188-00-US)     8/28/03       Multithreading Application Specific Extension
60/502358 (MIPS.0188-02-US)     9/12/03       Multithreading Application Specific Extension to a Processor Architecture
60/502359 (MIPS.0188-03-US)     9/12/03       Multithreading Application Specific Extension to a Processor Architecture

This application is related to the following co-owned non-provisional U.S. patent applications, each of which is hereby incorporated by reference.

Serial No. (Docket No.)         Filing Date   Title
(MIPS.0189-01-US)               8/27/04       Integrated Mechanism for Suspension and Deallocation of Computational Threads of Execution in a Processor
(MIPS.0192-00-US)               8/27/04       Apparatus, Method, and Instruction for Initiation of Concurrent Instruction Streams in a Multithreading Microprocessor
(MIPS.0194-00-US)               8/27/04       Mechanisms for Software Management of Multiple Computational Contexts

【Technical Field】

The present invention relates generally to the field of virtual multiprocessors, and more particularly to a mechanism for dynamically configuring the resources of a virtual multiprocessor among one or more virtual processing elements.

【Background Art】

Designers today employ many techniques to increase microprocessor performance. Most microprocessors operate using a clock signal running at a fixed frequency, and in each clock cycle the circuits of the microprocessor perform their respective functions. According to Hennessy and Patterson, the true measure of a microprocessor's performance is the time required to execute a program or collection of programs. From this perspective, the performance of a microprocessor is a function of its clock frequency, the average number of clock cycles required to execute an instruction (or, stated inversely, the average number of instructions executed per clock cycle), and the number of instructions executed in the program or collection of programs. Semiconductor scientists and engineers continue to make it possible for microprocessors to run at faster clock frequencies, chiefly by reducing transistor size, which results in faster switching times. The number of instructions executed is largely fixed by the task the program performs, although it is also affected by the instruction set architecture of the microprocessor. However, large performance increases have been realized by architectural and organizational techniques that improve the number of instructions executed per clock cycle, in particular techniques that allow instructions to execute in parallel (that is, parallelism).
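The three factors named above are commonly combined into the processor performance equation. The symbols below are labels chosen here for illustration; they are not notation taken from the application.

```latex
T_{\text{exec}} \;=\; N_{\text{instr}} \times \text{CPI} \times \frac{1}{f_{\text{clk}}},
\qquad
\text{CPI} \;=\; \frac{1}{\text{IPC}}
```

Here $T_{\text{exec}}$ is the time to execute the program, $N_{\text{instr}}$ is the number of instructions executed, $\text{CPI}$ is the average number of clock cycles per instruction, $\text{IPC}$ is the average number of instructions per clock cycle, and $f_{\text{clk}}$ is the clock frequency. Raising the clock frequency or lowering CPI shortens execution time for a fixed instruction count.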

One parallel processing technique that has improved both the instructions per clock cycle and the clock frequency of microprocessors is pipelining, which overlaps the execution of multiple instructions within the pipeline stages of the microprocessor, much like the stages of an assembly line. In an ideal situation, each instruction moves down the pipeline to a new stage every clock cycle, and each stage performs a different function on the instruction. Thus, although each individual instruction takes several clock cycles to complete, the average number of clock cycles per instruction is reduced because the cycles of the individual instructions overlap. The performance improvement of pipelining is realized to the extent that the instructions in the program permit it, namely to the extent that an instruction does not depend on its predecessors for its execution and can therefore execute in parallel with them, which is commonly referred to as instruction-level parallelism. Another way in which present-day microprocessors exploit instruction-level parallelism is by issuing multiple instructions for execution in the same clock cycle to different functional units, each of which performs its assigned function. Microprocessors that achieve instruction-level parallelism in this way are commonly referred to as "superscalar" microprocessors.
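The effect of overlapping instruction execution can be illustrated with a small calculation. The sketch below assumes an ideal pipeline with no stalls, in which one instruction enters the pipeline every cycle; the function names are hypothetical and chosen only for this illustration.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction completes every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed by an ideal pipeline with no stalls.

    The first instruction takes num_stages cycles to drain through the
    pipeline; every later instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    n, k = 100, 5  # 100 instructions on a 5-stage pipeline (illustrative numbers)
    print(unpipelined_cycles(n, k))  # prints 500
    print(pipelined_cycles(n, k))    # prints 104
```

Each instruction still takes five cycles from start to finish, but the average cost per instruction approaches one cycle as the instruction count grows. Real pipelines fall short of this ideal whenever an instruction depends on a predecessor.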

The parallelism discussed above is at the level of individual instructions. However, the performance improvement that can be achieved by exploiting instruction-level parallelism is limited. The constraints imposed by limited instruction-level parallelism, together with other performance-limiting issues, have recently renewed interest in exploiting parallelism at the level of blocks, sequences, or streams of instructions, which is commonly referred to as thread-level parallelism. A thread is simply a sequence, or stream, of program instructions. A multithreaded microprocessor concurrently executes multiple threads according to a scheduling policy that governs the fetching and issuing of instructions of the various threads, such as interleaved, blocked, or simultaneous multithreading. A multithreaded microprocessor typically allows the threads to share the functional units of the microprocessor (for example, instruction fetch and decode units, caches, branch prediction units, and load/store, integer, floating-point, and SIMD execution units) in a concurrent fashion. However, a multithreaded microprocessor includes multiple sets of hardware/firmware resources, or thread contexts, for storing the unique state of each thread, so that it can switch quickly between threads when fetching and issuing instructions. For example, each thread context includes its own program counter for instruction fetching and thread identification information, and typically also includes its own general-purpose register set.

One example of a performance-limiting issue addressed by multithreaded microprocessors is the fact that accesses to memory outside the microprocessor, which must be performed on a cache miss, typically have a relatively long latency. In present-day microprocessor-based computer systems, the memory access time is commonly one to two orders of magnitude greater than the cache hit access time. Consequently, while the pipeline is stalled waiting for data from memory, some or all of the pipeline stages of a single-threaded microprocessor may sit idle for many clock cycles without performing any useful work. Multithreaded microprocessors can alleviate this problem by issuing instructions from other threads during the memory fetch latency, so that the pipeline stages can continue to perform useful work. This is somewhat analogous to the task switch an operating system performs in response to a page fault, but at a much finer level of granularity. Other examples are pipeline stalls and their accompanying idle cycles caused by a branch misprediction and the resulting pipeline flush, by a data dependency, or by a long-latency instruction such as a divide instruction. Again, the ability of a multithreaded microprocessor to issue instructions from other threads to pipeline stages that would otherwise be idle can significantly reduce the time required to execute the program or collection of programs comprising the threads.

Another problem, particularly in embedded systems, is the wasted overhead associated with interrupt servicing. Typically, when an input/output device signals an interrupt to the microprocessor, the microprocessor transfers control to an interrupt service routine, which must save the current program state, service the interrupt, and restore the program state once the interrupt has been serviced. A multithreaded microprocessor makes it possible for the event service code to be its own thread with its own thread context. Consequently, in response to the input/output device signaling an event, the microprocessor can switch to the event service thread very quickly, perhaps within a single clock cycle, thereby avoiding the overhead of a conventional interrupt service routine.
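The benefit of issuing instructions from other threads during a memory stall can be estimated with a simple, idealized model. The sketch below is not taken from the application: it assumes that every thread alternates between a fixed run of useful cycles and a fixed stall, and that switching between threads costs nothing. The function name and the cycle counts are hypothetical.

```python
def pipeline_utilization(num_threads: int, run_cycles: int, stall_cycles: int) -> float:
    """Fraction of cycles spent on useful work under an idealized model.

    Assumptions: each thread repeatedly performs run_cycles of useful work
    and then waits stall_cycles (for example on a cache miss), and the
    processor can switch to another ready thread at no cost.
    """
    if num_threads < 1 or run_cycles < 1 or stall_cycles < 0:
        raise ValueError("need num_threads >= 1, run_cycles >= 1, stall_cycles >= 0")
    period = run_cycles + stall_cycles   # one run-then-stall period of a single thread
    busy = num_threads * run_cycles      # useful work all threads can offer per period
    return min(1.0, busy / period)       # the pipeline cannot be more than fully busy


if __name__ == "__main__":
    # Illustrative numbers: 10 useful cycles followed by a 90-cycle memory stall.
    for threads in (1, 4, 10):
        print(threads, pipeline_utilization(threads, run_cycles=10, stall_cycles=90))
```

Under these assumptions a single thread keeps the pipeline busy for only a tenth of the time, and utilization grows with the number of threads until the stalls are fully hidden. Real workloads have irregular stalls and shared-resource contention, so this is an upper-bound intuition rather than a prediction.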

Just as the degree of instruction-level parallelism determines how far a microprocessor can take advantage of pipelining and superscalar instruction issue, the degree of thread-level parallelism determines how far a microprocessor can take advantage of multithreaded execution. An important characteristic of a thread is its independence of the other threads executing on the multithreaded microprocessor. A thread is independent of another thread to the extent that its instructions do not depend on instructions in the other thread. This independence allows the microprocessor to execute instructions of different threads concurrently; that is, the microprocessor can issue instructions of one thread to the execution units without regard to the instructions being issued from other threads. To the extent that threads access common data, the threads themselves must be programmed to synchronize their data accesses with one another to ensure proper operation, so that the instruction issue stage of the microprocessor need not be concerned with such dependencies.

As the foregoing shows, a processor with multiple thread contexts that concurrently executes multiple threads can reduce the time required to execute a program or collection of programs comprising those threads. However, the introduction of multiple thread contexts also introduces a new set of problems, particularly for the system software that must manage the multiple instruction streams and their associated thread contexts.

The present inventor has identified a further level of parallelism that can be exploited in a microprocessor to increase the parallelism associated with instruction execution. In this and related applications, the inventor addresses the provision of virtual processing elements within a single microprocessor. At this level, a multithreaded virtual processing element, in addition to implementing multiple program counters and thread contexts to allow efficient switching among program threads, implements all the resources required to provide a single instantiation of a given instruction set and privileged resource architecture, sufficient to run a per-processor operating system image. In effect, a microprocessor that implements N virtual processing elements (that is, a virtual multiprocessor with N virtual processing elements) appears to operating system software as an N-way symmetric multiprocessor. The practical difference between a virtual multiprocessor according to the present invention and a conventional symmetric multiprocessor is that, in addition to sharing memory and some degree of connectivity, the virtual processing elements within a virtual multiprocessor also share on-chip resources or attributes of the virtual multiprocessor, such as instruction fetch and issue logic, address translation logic (that is, translation lookaside buffer logic), functional units such as integer units, floating-point units, multimedia units, media acceleration units, and SIMD units, and coprocessors. In addition, the virtual processing elements must share the performance attributes or utilization aspects (that is, the bandwidth) of the virtual multiprocessor. These are determined by the number of threads allocated to each virtual processing element, by the degree to which threads associated with one virtual processing element may take priority over threads associated with other virtual processing elements when execution resources are in demand, and by the allocation of certain processor-wide resources (for example, load and store buffers) to that virtual processing element.

For example, consider an embedded system in which two different kinds of processing occur simultaneously: real-time compression of audiovisual data and operation of a graphical user interface. Using late twentieth-century technology, these tasks would be accomplished with two separate processors: a real-time digital signal processor to process the multimedia data and an interactive processor core to run a multitasking operating system. The present invention allows both functions to execute on the same virtual multiprocessor. Two virtual processing elements of the virtual multiprocessor would be employed: one dedicated to the multimedia processing tasks and the other dedicated to the user interface tasks. Using two virtual processing elements solves the problem of co-existence, or co-instantiation, of two different software paradigms, but it does not guarantee the same real-time performance as a dedicated processor, because the multimedia virtual processing element and the user interface virtual processing element must share certain resources within the virtual multiprocessor. As noted above, the performance of applications executing on a virtual multiprocessor depends on how those resources or attributes are allocated to each virtual processing element.

In a market where multiprocessing applications present broad and varied resource requirements, it would be costly to manufacture virtual multiprocessors whose resources are tailored to one particular multiprocessing application. Accordingly, the present inventor has observed that it is desirable to provide a virtual multiprocessor that can be used across a wide range of multiprocessing applications, and further that it is desirable for the virtual multiprocessor to include mechanisms by which software can configure resources for the various virtual processing elements. Such mechanisms should allow the virtual multiprocessor to be configured with one or more virtual processing elements, each of which is configured to execute one or more threads. It is also desirable that these resources be dynamically configurable at run time by a trusted virtual processing element, and that a mechanism be provided for revoking configuration privileges.

【Summary of the Invention】

The present invention is directed to solving the problems noted above and to overcoming other problems, disadvantages, and limitations of the prior art. The present invention provides a superior technique for dynamically configuring the resources of a virtual multiprocessor. In one embodiment, an apparatus is provided for configuring resources for one or more virtual processing elements in a virtual multiprocessor. The apparatus includes a virtual multiprocessor context, one or more virtual processing element contexts, and configuration logic. The virtual multiprocessor context specifies the resources and controls the configuration state of the virtual multiprocessor. Each of the one or more virtual processing element contexts uniquely corresponds to one of the one or more virtual processing elements. Each virtual processing element context has first logic, which specifies whether the corresponding virtual processing element is allowed to configure the resources, and second logic, which specifies the subset of the resources allocated to that virtual processing element. The configuration logic is coupled to the virtual multiprocessor context and to the one or more virtual processing element contexts. The configuration logic detects whether one of the virtual processing elements is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor is in a configuration state, and configures the resources by updating a specified virtual processing element context.
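The relationship between the three components of this embodiment can be sketched as a small software model. This is only an illustrative sketch under stated assumptions, not the hardware described in the application: all class, field, and method names are hypothetical, thread contexts stand in for "the resources", and the rule that a thread context belongs to at most one virtual processing element at a time is an assumption of the model. The `revoke` method illustrates the revocation of configuration privileges mentioned in the background section.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class VMPContext:
    """Processor-wide context: the resource pool and the configuration state."""
    num_thread_contexts: int        # hypothetical resource: a pool of thread contexts
    in_config_state: bool = False   # True while resources may be reassigned


@dataclass
class VPEContext:
    """Per-VPE context, one for each virtual processing element."""
    may_configure: bool = False                              # "first logic": permission
    thread_contexts: Set[int] = field(default_factory=set)   # "second logic": assigned subset


class ConfigLogic:
    """Model of configuration logic coupled to the VMP context and all VPE contexts."""

    def __init__(self, vmp: VMPContext, vpes: List[VPEContext]) -> None:
        self.vmp = vmp
        self.vpes = vpes

    def _permitted(self, requester: int) -> bool:
        # Detect whether the requesting VPE is allowed to configure the resources.
        return self.vpes[requester].may_configure

    def enter_config_state(self, requester: int) -> bool:
        # Update the VMP context to indicate the configuration state.
        if not self._permitted(requester):
            return False
        self.vmp.in_config_state = True
        return True

    def leave_config_state(self, requester: int) -> bool:
        if not self._permitted(requester):
            return False
        self.vmp.in_config_state = False
        return True

    def assign(self, requester: int, target: int, contexts: Set[int]) -> bool:
        # Configure resources by updating the specified (target) VPE context.
        if not (self._permitted(requester) and self.vmp.in_config_state):
            return False
        if not all(0 <= tc < self.vmp.num_thread_contexts for tc in contexts):
            return False
        # Assumption of this model: a thread context belongs to at most one VPE.
        for index, vpe in enumerate(self.vpes):
            if index != target:
                vpe.thread_contexts -= contexts
        self.vpes[target].thread_contexts |= contexts
        return True

    def revoke(self, requester: int, target: int) -> bool:
        # A permitted VPE withdraws the target VPE's configuration privilege.
        if not self._permitted(requester):
            return False
        self.vpes[target].may_configure = False
        return True


if __name__ == "__main__":
    vmp = VMPContext(num_thread_contexts=4)
    vpes = [VPEContext(may_configure=True), VPEContext()]
    logic = ConfigLogic(vmp, vpes)
    logic.enter_config_state(requester=0)
    logic.assign(requester=0, target=1, contexts={2, 3})
    logic.leave_config_state(requester=0)
    print(sorted(vpes[1].thread_contexts))  # prints [2, 3]
```

In this model every mutating operation first checks the requester's permission flag, and assignment additionally requires that the virtual multiprocessor be in the configuration state, which mirrors the detect, update, and configure sequence described above.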

One aspect of the present invention provides a resource configuration mechanism for assigning resources to virtual processing elements in a virtual multiprocessor. The mechanism has virtual multiprocessor registers, a virtual processing element register for each virtual processing element, and configuration logic. The virtual multiprocessor registers specify the resources and control the configuration state of the virtual multiprocessor. Each virtual processing element register specifies whether the corresponding virtual processing element is allowed to assign the resources, and specifies the subset of the resources allocated to that virtual processing element. The configuration logic is coupled to the virtual multiprocessor registers and to the virtual processing element registers. It detects whether the corresponding virtual processing element is allowed to assign the resources, updates the virtual multiprocessor registers to indicate that the virtual multiprocessor is in a configuration state, and assigns the resources by updating selected ones of the virtual processing element registers.

Another aspect of the present invention provides a computer program product for use with a computing device. The computer program product includes a computer-usable medium having computer-readable program code embodied in the medium, the code being configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor. The computer-readable program code has first program code, second program code, and third program code. The first program code describes a virtual multiprocessor context, which specifies the resources and controls the configuration state of the virtual multiprocessor. The second program code describes virtual processing element contexts, each of which uniquely corresponds to one virtual processing element, specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to that virtual processing element. The third program code describes configuration logic, which is coupled to the virtual multiprocessor context and to the virtual processing element contexts. The configuration logic detects whether one of the virtual processing elements is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor is in a configuration state, and configures the resources by updating a specified virtual processing element context.

Another aspect of the present invention provides a computer data signal embodied in a transmission medium. The computer data signal has computer-readable program code configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor. The computer-readable program code includes first program code, second program code, and third program code. The first program code describes a virtual multiprocessor context, which specifies the resources and controls the configuration state of the virtual multiprocessor. The second program code describes virtual processing element contexts, each of which uniquely corresponds to one virtual processing element, specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to that virtual processing element. The third program code describes configuration logic, which is coupled to the virtual multiprocessor context and to the virtual processing element contexts. The configuration logic detects whether the virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor is in a configuration state, and configures the resources by updating a specified virtual processing element context.

Yet another aspect of the present invention provides a method for configuring resources for virtual processing elements in a virtual multiprocessor. The method includes: via a virtual multiprocessor context, first specifying the resources and controlling the configuration state of the virtual multiprocessor; via virtual processing element contexts, each of which uniquely corresponds to one of the virtual processing elements, second specifying whether a virtual processing element is allowed to configure the resources, and third specifying the subset of the resources allocated to that virtual processing element; and, via configuration logic coupled to the virtual multiprocessor context and to the virtual processing element contexts, detecting whether one of the virtual processing elements is allowed to configure the resources, first updating the virtual multiprocessor context to indicate that the virtual multiprocessor is in a configuration state, and configuring the resources by second updating a specified virtual processing element context.

Yet another aspect of the present invention provides a virtual multiprocessing system. The system includes a memory and a virtual multiprocessor. The memory stores program instructions associated with a plurality of program threads. The virtual multiprocessor is coupled to the memory and executes the program instructions on one or more virtual processing elements configured within the virtual multiprocessor. The virtual multiprocessor has a virtual multiprocessor context, which specifies the resources available for configuring the one or more virtual processing elements and controls the configuration state of the virtual multiprocessor. Each of the one or more virtual processing elements includes a virtual processing element context and configuration logic. The virtual processing element context specifies whether that virtual processing element is allowed to configure the resources, and specifies the subset of the resources allocated to a specified one of the one or more virtual processing elements. The configuration logic is coupled to the virtual multiprocessor context and to the virtual processing element context. The configuration logic detects whether that virtual processing element is allowed to configure the resources, updates the virtual multiprocessor context to indicate that the virtual multiprocessor is in a configuration state, and configures the resources by updating the virtual processing element context that corresponds to the specified one of the one or more virtual processing elements.

【Description of the Drawings】

These and other objects, features, and advantages of the present invention will be more readily understood from the following description and the accompanying drawings.

FIG. 1 is a block diagram depicting a multiprocessing environment according to the present invention;

FIG. 2 is a block diagram depicting a virtual multiprocessor pipeline according to the present invention;

FIG. 3 is a block diagram showing a dynamically configurable virtual multiprocessor according to the present invention;

FIG. 4 is a table presenting virtual multiprocessing context registers consistent with an exemplary embodiment of the present invention;

FIG. 5 is a series of block diagrams depicting exemplary embodiments of each of the virtual multiprocessing context registers of FIG. 4;

FIG. 6 is a flowchart depicting a method for dynamic configuration of virtual processor resources according to the present invention; and

FIG. 7 is a flowchart depicting a revocable method for dynamic configuration of virtual processor resources according to the present invention.

【Detailed Description】

The following description is presented to enable those skilled in the art to make and use the present invention within the context of a particular application and its requirements. Various modifications to the preferred embodiment will be apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. In view of the above background discussion of parallel processing and the associated multithreading and multiprocessing techniques employed in present-day processors, the present invention will now be discussed with reference to FIGS. 1 through 7.

Referring to FIG. 1, a block diagram is shown of a multiprocessing environment 100 according to the present invention. The multiprocessing environment 100 includes a virtual multiprocessor 101 coupled to a system interface controller 105. The system interface controller 105 is coupled to a system memory 106 and to one or more input/output (I/O) devices 107. Each I/O device 107 provides an interrupt request line 108 to the virtual multiprocessor 101. The virtual multiprocessor 101 includes one or more virtual processing elements 102. Each virtual processing element 102 has a corresponding virtual processing element context 104 and one or more corresponding thread contexts 103. The multiprocessing environment 100 may be, but is not limited to, a general-purpose programmable computer system, a server computer, a workstation computer, a personal computer, a notebook computer, a personal digital assistant, or an embedded system such as, but not limited to, a network router or switch, a printer, a mass storage controller, a camera, a scanner, or an automobile controller.

The system memory 106 may be embodied as memory such as dynamic random access memory (RAM) and read-only memory (ROM), for storing program instructions executed by the virtual multiprocessor 101 and for storing data to be processed by the virtual multiprocessor 101 according to the program instructions. The program instructions may comprise one or more program threads that the virtual multiprocessor 101 executes concurrently. A program thread, or thread, comprises a sequence, or stream, of program instructions together with the associated sequence of state changes, within the corresponding virtual processing element 102 of the virtual multiprocessor 101, that results from executing the sequence of instructions. Each thread context 103 comprises the hardware state necessary to support execution of a corresponding program thread. In one embodiment, each thread context includes a set of general-purpose registers, a program counter, and other registers that hold the state of the executing thread, for example multiplier state and coprocessor state. Each virtual processing element 102 provides resources to support an instantiation of a full instruction set architecture and privileged resource architecture, sufficient to run an operating system image for a single full processor. In one embodiment, each virtual processing element 102 provides resources to support an instantiation of a full MIPS32/MIPS64 instruction set architecture and privileged resource architecture. Each virtual processing element context 104 comprises the hardware state necessary to support thread execution within a corresponding virtual processing element 102. In one embodiment, each virtual processing element context 104 specifies the resources allocated to a corresponding virtual processing element 102, for example address translation logic resources (such as translation lookaside buffer entries), functional units (such as integer units, floating-point units, multimedia units, media acceleration units, SIMD units, and coprocessors), and performance attributes. In a particular embodiment, the performance attributes include permission to halt and configure resources allocated to other virtual processing elements 102, the number and enumeration of threads, activation/inhibition of the corresponding virtual processing element 102, and the bandwidth-related resources of the virtual multiprocessor 101 (for example, instruction execution bandwidth or priority, load/store bandwidth, and so on) that are allocated to the corresponding virtual processing element 102. The present invention contemplates a variety of bandwidth allocation techniques, including scheduling hints, execution priority assignment, load/store buffer allocation, and the like.
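The relationship between the thread contexts 103 and the virtual processing element contexts 104 described above can be pictured with a small data model. The following Python sketch is illustrative only: the class names, the field names, the choice of 32 general-purpose registers, and the particular resource fields are assumptions made for this example and do not represent the register layout of FIGS. 4 and 5.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThreadContext:
    """Hypothetical model of a thread context 103: state private to one thread."""
    program_counter: int = 0
    # 32 general-purpose registers is an assumption made for illustration.
    gprs: List[int] = field(default_factory=lambda: [0] * 32)


@dataclass
class VPEContext:
    """Hypothetical model of a VPE context 104: resources given to one VPE."""
    tlb_entries: int = 0          # address translation resources
    has_fpu: bool = False         # one example of a functional unit
    may_configure: bool = False   # permission to configure resources
    active: bool = False          # activation/inhibition of the VPE


@dataclass
class VirtualProcessingElement:
    """A VPE 102 pairs one VPE context with one or more thread contexts."""
    context: VPEContext
    threads: List[ThreadContext] = field(default_factory=list)


vpe = VirtualProcessingElement(
    context=VPEContext(tlb_entries=16, may_configure=True, active=True),
    threads=[ThreadContext(program_counter=0x1000), ThreadContext()],
)
```

Keeping the per-thread state separate from the per-VPE state mirrors the distinction drawn in the text: a thread context holds only what is unique to one thread, while the VPE context holds what the whole virtual processing element has been allocated.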

The system interface controller 105 and the virtual multiprocessor 101 are coupled to each other via a processor bus. In one embodiment, the system interface controller 105 includes a memory controller for controlling the system memory 106. In one embodiment, the system interface controller 105 includes a local bus interface controller that provides a local bus, for example a PCI bus, for coupling to the I/O devices 107.

The I/O devices 107 may include, but are not limited to: user input devices such as keyboards, mice, and scanners; display devices such as monitors and printers; storage devices such as disk drives, tape drives, and optical drives; system peripheral devices such as direct memory access controllers (DMACs), clocks, timers, and I/O ports; network devices such as media access controllers (MACs) for Ethernet, fiber-optic, InfiniBand, or other high-speed network interfaces; and data conversion devices such as analog-to-digital converters and digital-to-analog converters. The I/O devices 107 generate interrupt signals 108 to the virtual multiprocessor 101 to request service. Advantageously, the virtual multiprocessor 101 can concurrently execute multiple program threads for processing the events indicated on the interrupt request lines 108, without the conventional overhead associated with saving the state of the microprocessor 102, transferring control to an interrupt service routine, and restoring the state after the interrupt service routine completes.

In one embodiment, the virtual multiprocessor 101 provides two distinct, but not mutually exclusive, multithreading capabilities. First, the virtual multiprocessor includes one or more virtual processing elements (VPEs) 102 that support a corresponding one or more logical processor contexts; through the sharing of resources within the virtual multiprocessor 101, each logical processor context appears to an operating system as an independent processing element. To an operating system, a virtual multiprocessor 101 with N VPEs 102 looks like an N-way symmetric multiprocessor (SMP), which allows existing SMP-capable operating systems to manage the one or more VPEs 102. Second, each VPE 102 may include one or more thread contexts 103 for concurrently executing a corresponding one or more program threads. Consequently, the virtual multiprocessor 101 according to the present invention provides a multithreaded programming model in which, in the typical case, program threads can be created and destroyed without operating system intervention, and in which system service threads can be scheduled in response to external conditions (for example, I/O service event signals) with minimal interrupt latency.

In one embodiment, each thread context includes one or more storage elements, such as registers or latches, having fields (for example, bits) that describe the execution state of the corresponding thread. That is, a given thread context 103 describes the state of its thread that is unique to that thread, rather than state shared with other threads executing concurrently on the virtual processing element 102. A thread, also referred to herein as a program thread, a thread of execution, or an instruction stream, is a sequence of instructions. Each virtual processing element 102 is capable of processing multiple threads concurrently. By storing the state of each thread within a thread context 103, each virtual processing element 102 in the virtual multiprocessor 101 is configured to switch quickly between threads when fetching and issuing instructions. Advantageously, the virtual multiprocessor 101 of the present invention is configured to execute instructions that move thread context information between the various thread contexts 103, as described in detail in the co-pending U.S. patent application (Docket No. MIPS.0194-00-US) entitled "Mechanisms for Software Management of Multiple Computing Contexts."

In one embodiment, each VPE context 104 includes a group of storage elements, such as registers or latches, having fields (for example, bits) that describe the execution state of the corresponding VPE 102 and that provide for configuration of the resources of the corresponding VPE 102, such as, but not limited to, address translation resources, coprocessing resources (for example, floating-point processors, media processors, and so on), thread capacity and enumeration, permission to activate or inhibit execution of a particular VPE 102, and permission for a particular VPE 102 to configure resources. In one embodiment, a VPE 102 may configure its own resources by updating its own VPE context 104. In addition, a VPE 102 may configure the resources of a different VPE 102 by updating the VPE context 104 corresponding to that different VPE 102. Thus, a virtual multiprocessor 101 having N VPEs 102 appears to an operating system, or to other symmetric multiprocessing applications, as an N-way symmetric multiprocessor. In one embodiment, the VPEs 102 share certain resources in the virtual multiprocessor 101, such as the instruction cache, instruction fetcher, instruction decoder, instruction issuer, instruction scheduler, execution units, coprocessing units, and data cache, transparently to the operating system. The scope and degree of resource sharing are specified by the VPE contexts 104 and can be dynamically configured, at run time or at other times, by updating the VPE contexts 104. For a given VPE 102 to configure its own resources, or to specify the resources of other VPEs 102, its own VPE context 104 must specify that the given VPE 102 is allowed to configure the resources of the virtual multiprocessor 101, as described in more detail below. Accordingly, if the VPE context 104 of a given VPE 102 indicates that the given VPE 102 is allowed to configure resources, then the given VPE 102 may update all of the VPE contexts 104 to provide dynamic resource configuration, including modification of resource configuration permissions, which includes the ability to revoke configuration permission. In one embodiment, each VPE 102 substantially conforms to a MIPS32 or MIPS64 instruction set architecture (ISA) and a MIPS privileged resource architecture (PRA), and each VPE context 104 includes the MIPS PRA Coprocessor 0 and the system state necessary to describe an instantiation thereof. In one embodiment, the VPE context 104 includes the VPECONTROL register 504, VPECONF0 register 505, VPECONF1 register 506, and VPESCHEDULE register 592 depicted in FIGS. 5D-5G. In one aspect, a VPE 102 may be regarded as an exception domain. That is, when one of the thread contexts 103 of a VPE 102 generates an exception, multithreading on the VPE 102 is suspended (that is, only instructions of the instruction stream associated with the thread context 103 servicing the exception are fetched and issued), and each VPE context 104 includes the state necessary to service the exception. Once the exception has been serviced, the exception handler may selectively restart multithreading on the VPE 102.

Referring now to FIG. 2, a block diagram is shown illustrating a virtual multiprocessor pipeline 200 within a virtual multiprocessor according to the present invention. The pipeline 200 includes a plurality of pipeline stages and additionally includes one or more thread contexts 103. The exemplary embodiment of FIG. 2 shows four thread contexts 103. In one embodiment, each thread context 103 includes a program counter (PC) 222 for storing the address of the next instruction to be fetched in the associated instruction stream, a general-purpose register (GPR) set 224 for storing intermediate execution results of the instruction stream issued from the thread according to the value of the program counter 222, and other per-thread context 226. In one embodiment, the pipeline 200 includes a multiplier unit (not shown), and the other per-thread context 226 includes registers for storing results of the multiplier unit that are specifically associated with multiply instructions in the instruction stream. In one embodiment, the other per-thread context 226 includes information for uniquely identifying each thread context 103. In one embodiment, the thread identification information includes information specifying the execution privilege level of the associated thread, for example whether the thread is a kernel, supervisor, or user-level thread. In one embodiment, the thread identification information includes information identifying the task or process that comprises the thread. In particular, the task identification information may be used as an address space identifier (ASID) for translating physical addresses into virtual addresses.

The pipeline 200 includes a scheduler 216 for scheduling the multiple threads executed concurrently by the virtual multiprocessor 101. The scheduler 216 is coupled to the VMP context 210, to the VPE contexts 104 of FIG. 1, and to the other per-thread context 226. In particular, the scheduler 216 is responsible for scheduling the fetching of instructions from the program counters 222 of the various thread contexts 103 and for scheduling the issuing of the fetched instructions to the execution units 212 of the virtual multiprocessor 101, as described below. The scheduler 216 schedules execution of the threads according to a scheduling policy of the virtual multiprocessor 101. The scheduling policy may include, but is not limited to, any of the following. In one embodiment, the scheduler 216 employs a round-robin, time-division-multiplexed, or interleaved scheduling policy that allocates a predetermined number of clock cycles, or instruction issue slots, to each ready thread in rotating order. The round-robin policy is useful in applications where fairness is important and a basic quality of service is required for certain threads, such as real-time application threads. In one embodiment, the scheduler 216 employs a blocking scheduling policy, in which the scheduler 216 continues to schedule fetching and issuing for the currently executing thread until an event occurs that blocks further progress of the thread, such as a cache miss, a branch misprediction, a data dependency, or a long-latency instruction. In one embodiment, the pipeline 200 comprises a superscalar pipeline employing multiple execution units 212, and the scheduler 216 schedules the issue of multiple instructions per clock cycle and, in particular, the issue of instructions from multiple threads per clock cycle, commonly referred to as simultaneous multithreading. In other embodiments, the scheduler 216 employs a scheduling policy that uses scheduling information provided via the VPE contexts 104, where the scheduling information indicates the bandwidth and/or bandwidth-related resources allocated to each VPE 102.
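The round-robin policy described above can be sketched in a few lines. The Python function below is a minimal illustration, assuming that threads are named by arbitrary identifiers and that every ready thread receives the same fixed number of issue slots per turn; the function name and its parameters are hypothetical, and the sketch does not model the blocking or simultaneous multithreading policies.

```python
from collections import deque


def round_robin(ready_threads, slots_per_turn, total_slots):
    """Return, slot by slot, which ready thread is granted each issue slot."""
    if slots_per_turn <= 0:
        raise ValueError("slots_per_turn must be positive")
    order = deque(ready_threads)
    schedule = []
    while order and len(schedule) < total_slots:
        thread = order[0]
        # Grant a full turn unless fewer slots remain in the window.
        grant = min(slots_per_turn, total_slots - len(schedule))
        schedule.extend([thread] * grant)
        order.rotate(-1)  # move on to the next ready thread
    return schedule


# Three ready threads, two slots per turn, seven slots in total.
plan = round_robin([0, 1, 2], slots_per_turn=2, total_slots=7)
```

With these inputs each thread is granted two consecutive slots in turn, and the rotation wraps back to the first thread for the final slot, which is the fairness property the text attributes to the round-robin policy.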

The pipeline 200 includes an instruction cache 202 for caching program instructions fetched from a system memory. In one embodiment, the pipeline 200 provides virtual memory capability, and the fetch unit 204 includes a translation lookaside buffer (not shown) for caching physical-to-virtual memory page translations. In this embodiment, resources (for example, entries) within the translation lookaside buffer are allocated to each VPE 102 sharing the pipeline 200, as specified by the VPE contexts 104. In one embodiment, each program, or task, executing within the pipeline 200 is assigned a unique task ID, or address space ID (ASID), which is used to perform memory accesses and, in particular, memory address translation, and a thread context 103 also includes storage for the ASID associated with the thread.

The pipeline 200 also includes a fetch unit 204, coupled to the instruction cache 202, for fetching program instructions from the instruction cache 202 and the system memory. The fetch unit 204 fetches instructions at an instruction fetch address supplied by a multiplexer 244. The multiplexer 244 receives a plurality of instruction fetch addresses from a corresponding plurality of program counters 222. Each program counter 222 stores the current instruction fetch address for a different program thread. The embodiment of FIG. 2 illustrates four different program counters 222 associated with four different threads. The multiplexer 244 selects one of the four program counters 222 based on a select input provided by the scheduler 216. In one embodiment, the various threads executing on the microprocessor 100 share the fetch unit 204.
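The selection of a fetch address among the per-thread program counters can be modeled as follows. This Python sketch is illustrative only: the function name is hypothetical, the fixed 4-byte instruction width is an assumption, and advancing the selected program counter immediately after the fetch is a simplification of what the write-back path would do.

```python
INSTRUCTION_BYTES = 4  # assumed fixed instruction width for this sketch


def fetch_addresses(program_counters, selects):
    """Return the address fetched at each step and advance that thread's PC."""
    fetched = []
    for select in selects:
        if not 0 <= select < len(program_counters):
            raise ValueError(f"select {select} names no program counter")
        fetched.append(program_counters[select])       # multiplexer output
        program_counters[select] += INSTRUCTION_BYTES  # next sequential fetch
    return fetched


pcs = [0x100, 0x200, 0x300, 0x400]  # one program counter per thread
trace = fetch_addresses(pcs, [0, 1, 0, 3])
```

Because each thread keeps its own program counter, switching the select input from one step to the next changes which instruction stream is fetched without saving or restoring any state.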

The pipeline 200 also includes a decode unit 206, coupled to the fetch unit 204, for decoding the program instructions fetched by the fetch unit 204. The decode unit 206 decodes the opcode, operand, and other fields of the instructions. In one embodiment, the various threads executing on the microprocessor 100 share the decode unit 206.

The pipeline 200 also includes execution units 212 for executing instructions. The execution units 212 may include, but are not limited to: one or more integer units for performing integer arithmetic, Boolean operations, shift operations, rotate operations, and the like; floating-point units for performing floating-point operations; load/store units for performing memory accesses and, in particular, accesses to a data cache 242 coupled to the execution units 212; multimedia acceleration units for performing multimedia operations; and a branch resolution unit for resolving the outcome and target address of branch instructions. In one embodiment, the data cache 242 includes a translation lookaside buffer for caching physical-to-virtual memory page translations. In addition to the operands received from the data cache 242, the execution units 212 also receive operands from the registers of the general-purpose register sets 224. Specifically, an execution unit 212 receives operands from the register set 224 of the thread context 103 allocated to the thread to which the instruction belongs. A multiplexer 248 selects operands from the appropriate register set 224 for provision to the execution units 212. In addition, the multiplexer 248 receives data from the other per-thread context 226 and the program counters 222 for selective provision to the execution units 212, based on the thread context 103 of the instruction being executed by the execution unit 212. In one embodiment, the various execution units 212 may concurrently execute instructions from multiple concurrent threads.

The pipeline 200 also includes an instruction issue unit 208, coupled to the scheduler 216 and coupled between the decode unit 206 and the execution units 212, for issuing instructions to the execution units 212 as directed by the scheduler 216 and in response to information about the instructions decoded by the decode unit 206. In particular, the instruction issue unit 208 ensures that an instruction is not issued to the execution units 212 if it has a data dependency on other instructions previously issued to the execution units 212. In one embodiment, an instruction queue (not shown) is interposed between the decode unit 206 and the instruction issue unit 208 to buffer instructions awaiting issue to the execution units 212, reducing the likelihood that the execution units 212 are starved. In one embodiment, the multiple threads executing in the pipeline 200 share the instruction issue unit 208.

The pipeline 200 also includes a write-back unit 214, coupled to the execution units 212, for writing instruction results back to the general-purpose register sets 224, the program counters 222, and the other per-thread context 226. A demultiplexer 246 receives the instruction result from the write-back unit 214 and stores it into the appropriate register set 224, program counter 222, and other per-thread context 226 associated with the instruction's thread. The instruction results are also provided for storage into the VPE contexts 104 and a virtual multiprocessor (VMP) context 210.

In one embodiment, the VMP context 210 includes a set of storage elements, such as registers or latches, having one or more fields (for example, bytes) that describe the execution state of the virtual multiprocessor 101. In particular, the VMP context 210 stores state pertaining to the overall resources of the virtual multiprocessor 101 that are shared among the VPEs 102, as described above. Specifically, the VMP context specifies the resources that may be allocated to the VPEs 102 during configuration, and also controls whether the virtual multiprocessor 101 is in a configuration state in which those resources are configured. In one embodiment, the VMP context 210 includes the MVPCONTROL register 501, MVPCON0 register 502, and MVPCON1 register 503 of FIGS. 5A-5C, described below.

The particular stages 202, 204, 206, 208, 212, 214 of the pipeline 200 shown in FIG. 2 are provided to clearly teach the present invention without obscuring its essential aspects. Those skilled in the art will appreciate that the staging of the pipeline 200 may be modified to improve performance, by increasing or decreasing the number of stages or by assigning different functions to the stages, without departing from the spirit and scope of the present invention.

Referring to FIG. 3, a block diagram is shown of a dynamically configurable virtual multiprocessor 300 according to the present invention. The multiprocessor 300 includes one or more VPEs 302-304, enumerated as VPE 1 302, VPE 2 303, through VPE N 304. Each VPE 302-304 has a corresponding VPE context 305-307. The VPEs 302-304 and the VMP context 210 are coupled to the execution logic 212, as described above with reference to FIG. 2. The execution logic 212 includes VPE configuration logic 310. The VPE configuration logic 310 is coupled to an exception signal 311. The block diagram also shows one or more resources 322, 324, 326, 328, enumerated as RESOURCE1 322, RESOURCE2 324, RESOURCE3 326, through RESOURCE M 328.

In operation, the resources 322-328 are configured by executing a sequence of configuration instructions issued by a VPE 302-304 that is permitted to configure the resources 322-328. In one embodiment, permission to configure the resources 322-328 is specified by the VPE context 305-307 of the corresponding VPE 302-304. When a sequence of configuration instructions is received by the execution logic 212 in the pipeline 200, the VPE configuration logic 310 accesses the VPE context 305-307 of the VPE 302-304 whose program thread caused the sequence to be fetched, in order to determine whether that VPE 302-304 is permitted to configure the resources 322-328. If it is not, the configuration logic 310 causes the exception signal 311 to be asserted, and the sequence of configuration instructions is not executed. If the VPE 302-304 is permitted to configure the resources 322-328, the VPE configuration logic 310 executes the sequence of configuration instructions to direct the virtual multiprocessor 300 into a configuration state and to update one or more specified VPE contexts 305-307, thereby reconfiguring the resources. In one embodiment, the sequence of configuration instructions directs the virtual multiprocessor 300 into the configuration state by updating the VMP context 210. In one embodiment, the sequence of configuration instructions includes instructions conforming to the MIPS32/MIPS64 Multithreading (MT) Application Specific Extension (ASE) architecture.
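
The permission check described above can be modeled in a few lines. The following Python sketch is illustrative only and is not taken from the patent: the names `VPEContext`, `may_configure`, `run_config_sequence`, and `ConfigurationException` are hypothetical, a Python exception stands in for the exception signal 311, and a dictionary stands in for the resource fields of a VPE context.

```python
from dataclasses import dataclass, field


class ConfigurationException(Exception):
    """Stands in for assertion of the exception signal (hypothetical name)."""


@dataclass
class VPEContext:
    may_configure: bool                             # permission held in the VPE context
    resources: dict = field(default_factory=dict)   # resource name -> allocated amount


def run_config_sequence(contexts, issuer, updates):
    """Apply a configuration sequence issued by VPE `issuer`.

    `updates` maps a target VPE number to the resource values to write into
    that VPE's context. Nothing is applied unless the issuer is permitted.
    """
    if not contexts[issuer].may_configure:
        # Not permitted: signal the exception; the sequence is not executed.
        raise ConfigurationException(f"VPE {issuer} may not configure resources")
    for target, values in updates.items():
        contexts[target].resources.update(values)


contexts = {
    1: VPEContext(may_configure=False, resources={"tlb_entries": 8}),
    2: VPEContext(may_configure=True, resources={"tlb_entries": 16}),
}

# VPE 2 holds permission, so it can shrink the allocation of VPE 1.
run_config_sequence(contexts, issuer=2, updates={1: {"tlb_entries": 4}})

# VPE 1 does not hold permission, so its attempt raises and changes nothing.
try:
    run_config_sequence(contexts, issuer=1, updates={2: {"tlb_entries": 1}})
    rejected = False
except ConfigurationException:
    rejected = True
```

The check happens before any update is applied, which mirrors the statement that the sequence of configuration instructions is not executed when permission is missing.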

The block diagram shows one particular configuration of the resources 322-328 resulting from execution of a sequence of configuration instructions, and graphically depicts how particular resources 322-328 can be dynamically configured according to the present invention to optimize the performance of concurrently executing threads in a given multithreaded multiprocessing application. For example, consider that the chart for RESOURCE 1 322 corresponds to address translation resources (e.g., translation lookaside buffer entries). As the chart shows, VPE 1 302 is allocated a portion of the address translation resources that is smaller than the portions allocated to the remaining VPEs 303-304. Perhaps the threads executing on VPE 1 302 are short and repetitive relative to the other threads and therefore do not need extensive address translation resources. Consider also that RESOURCE 2 324 represents contexts corresponding to a multithreaded coprocessor (e.g., a floating point unit, media unit, SIMD unit, etc.). VPE 2 303, as specified in its VPE context 306, is allocated fewer contexts than the other VPEs 302, 304, perhaps because the operations directed by the instruction threads issued by VPE 2 303 do not require substantial coprocessing resources. In addition, consider that RESOURCE 3 326 represents resource configuration permission. As the chart shows, only VPE 2 303 is permitted to configure the resources 322-328 in the virtual multiprocessor 300. It is noted that a given VPE 302-304 that holds configuration permission (VPE 2 303 in this example) can grant configuration permission to other VPEs 302-304, revoke their configuration permission, or revoke its own. This is accomplished by updating the specified VPE contexts 305-307 as described herein. Consider that RESOURCE M 328 is a bandwidth resource that allocates the bandwidth of the virtual multiprocessor 300 among its VPEs 302-304 according to an implemented scheduling policy as described above. Accordingly, the chart shows each of the exemplary VPEs 302-304 being given an equal portion of the multiprocessor bandwidth, whether by direct allocation of execution bandwidth, by setting approximately equal execution priorities, or by other techniques for specifying bandwidth or bandwidth-related resources. One such technique contemplated by the present invention is the allocation of load/store bandwidth to the VPEs 302-304. For example, if the number of memory operation buffers (not shown) shared among the VPEs 302-304 in the virtual multiprocessor 300 is less than the number of executing threads, then before performing a memory operation associated with a thread of a given VPE 302-304, the virtual multiprocessor 300 evaluates whether to switch the given thread out, because the operation might exceed the bandwidth-related resource allocation specified for the given VPE 302-304. Such a bandwidth allocation scheme advantageously addresses the situation in which a small number of threads associated with a VPE 302-304, for example threads generating a long series of store misses, could monopolize the bandwidth-related resource (the memory operation buffers in this example) and thereby prevent threads of other VPEs 302-304 from executing. By specifying shares of bandwidth-related resources, such situations are precluded in a virtual multiprocessor 300 according to the present invention.
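
The load/store bandwidth example can be illustrated with a small model of a shared buffer pool. This is a sketch under stated assumptions rather than the mechanism of the patent: the class `MemoryOpBuffers`, its method names, and the fixed per-VPE quota are hypothetical, and the patent does not say how the share is enforced.

```python
class MemoryOpBuffers:
    """Shared pool of memory operation buffers with a per-VPE quota (hypothetical)."""

    def __init__(self, total, quota):
        if sum(quota.values()) > total:
            raise ValueError("quotas exceed the number of buffers")
        self.quota = dict(quota)                  # VPE number -> max buffers in flight
        self.in_use = {vpe: 0 for vpe in quota}   # VPE number -> buffers currently held

    def try_issue(self, vpe):
        """Return True if the VPE may start a memory operation.

        False means the VPE is at its quota, so the issuing thread would be
        switched out instead of taking another shared buffer.
        """
        if self.in_use[vpe] >= self.quota[vpe]:
            return False
        self.in_use[vpe] += 1
        return True

    def complete(self, vpe):
        """Release one buffer when a memory operation finishes."""
        if self.in_use[vpe] == 0:
            raise RuntimeError("no outstanding operation for this VPE")
        self.in_use[vpe] -= 1


buffers = MemoryOpBuffers(total=4, quota={1: 2, 2: 2})

# VPE 1 suffers a run of misses: two operations hold buffers, the third is refused.
results = [buffers.try_issue(1) for _ in range(3)]

# VPE 2 can still issue because its share was never handed to VPE 1.
vpe2_ok = buffers.try_issue(2)
```

Without the quota, the three requests from VPE 1 could occupy most of the pool, which is the monopolization case the paragraph describes.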

Referring to FIG. 4, a table 400 is presented depicting virtual multiprocessing context registers according to an exemplary embodiment of the present invention. The virtual multiprocessing context registers are employed to configure a virtual multiprocessor context 210 or a virtual processing element context 104, as described above. The virtual multiprocessor context includes registers MVPCONTROL, MVPCONF0, and MVPCONF1. The virtual processing element context for each VPE within a virtual multiprocessor includes registers VPECONTROL, VPECONF0, VPECONF1, and VPESCHEDULE. The registers shown in table 400 conform to the Multithreading Application Specific Extension to the MIPS32/MIPS64 instruction set and privileged resource architecture, in which a CP0 register number and a register select number are specified for each register shown in order to access its contents. The structure and contents of these registers are discussed with reference to FIG. 5.

FIG. 5 is a series of block diagrams depicting exemplary embodiments of each of the virtual multiprocessor context registers 501-506, 592 of FIG. 4. FIGS. 5A-5F show the fields of each register together with a table describing the fields; fields of particular relevance are discussed in detail here. Each register illustrated in FIG. 5 can be selectively read or written by a VPE for which the value of the MVP field 553 in the VPECONF0 register 505 indicates that the VPE has permission to dynamically configure the resources. Certain fields in registers 501-506, 592 are writable only by a VPE whose MVP field 553 indicates that it has configuration permission. Otherwise, those fields are read-only, as controlled by the configuration logic 310.

The MVPCONTROL register 501 has an STLB field 511, a VPC field 512, and an EVP field 513. A VPE 102 with configuration permission as described above can update the VPC field 512 and the EVP field 513 to place the virtual multiprocessor 101 in a configuration state for resource configuration. Clearing the VPC field 512 and setting the EVP field 513 causes the new resource values to be latched into the configuration registers 501-506, 592 and virtual processing to resume. A VPE 102 with configuration permission can update the STLB field 511 to share address translation resources.
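
The role of the VPC and EVP fields can be pictured as a two-flag state machine. The Python sketch below is a toy model, not a register-accurate description: the class `MVPControlModel` is hypothetical, no bit positions are implied, and two points are assumptions not stated in the text, namely that entering the configuration state sets VPC and clears EVP, and that values written while configuring are held separately until they are latched.

```python
class MVPControlModel:
    """Toy model of the VPC and EVP fields (hypothetical, no real bit layout)."""

    def __init__(self, active):
        self.evp = True              # virtual processing enabled
        self.vpc = False             # not in the configuration state
        self.active = dict(active)   # resource values currently in effect
        self._pending = {}           # values written while configuring (assumption)

    def enter_configuration(self):
        self.evp = False             # assumption: processing is stopped on entry
        self.vpc = True

    def write(self, name, value):
        if not self.vpc:
            raise RuntimeError("writable only in the configuration state")
        self._pending[name] = value

    def exit_configuration(self):
        self.vpc = False                    # clearing VPC ...
        self.active.update(self._pending)   # ... latches the new resource values
        self._pending.clear()
        self.evp = True                     # setting EVP resumes virtual processing


ctl = MVPControlModel(active={"vpe1_thread_contexts": 4})
ctl.enter_configuration()
ctl.write("vpe1_thread_contexts", 2)
before_latch = ctl.active["vpe1_thread_contexts"]   # old value still in effect
ctl.exit_configuration()
after_latch = ctl.active["vpe1_thread_contexts"]    # new value now in effect
```

Separating pending from active values is one way to express that a new allocation takes effect only when the configuration state is exited.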

The MVPCONF0 register 502 and the MVPCONF1 register 503 are read-only registers that are read by a VPE 102 with configuration permission to determine the number and extent of the configurable resources provided in a given virtual multiprocessor 101. Field TLBS indicates that address translation resources are shareable and that sharing can be configured by setting field STLB 511 of the MVPCONTROL register 501. Field PVPE 524 specifies the total number of VPEs 102 provided by the virtual multiprocessor 101. In the embodiment of FIG. 5, up to 16 VPEs 102 may be employed. Field PTC 525 indicates the total number of thread contexts 103 provided by the virtual multiprocessor 101. In the illustrated embodiment, up to 256 thread contexts 103 may be instantiated. Field C1M 531 indicates that allocatable coprocessors are capable of media extensions. Field C1F 532 indicates whether allocatable coprocessors are floating-point capable. Fields 533-535 indicate the total number of other ISA-specific resources available for allocation to the VPEs 102.
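
A configuring VPE would typically compare a planned allocation against the totals reported by these read-only registers. The sketch below shows one such check in Python. It is an illustration under assumptions: `Capacity` and `check_plan` are hypothetical names, the PVPE and PTC fields are modeled as plain counts rather than as encoded register fields, and VPEs are numbered from 1 to match the labels of FIG. 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    """Totals a configuring VPE would read from MVPCONF0 (modeled as counts)."""
    pvpe: int   # total VPEs provided, at most 16 in the illustrated embodiment
    ptc: int    # total thread contexts provided, at most 256

    def __post_init__(self):
        if not 1 <= self.pvpe <= 16:
            raise ValueError("pvpe must be between 1 and 16")
        if not 1 <= self.ptc <= 256:
            raise ValueError("ptc must be between 1 and 256")


def check_plan(capacity, plan):
    """Return a list of problems with `plan` (VPE number -> thread contexts)."""
    problems = []
    for vpe in sorted(plan):
        if not 1 <= vpe <= capacity.pvpe:
            problems.append(f"VPE {vpe} does not exist")
    requested = sum(plan.values())
    if requested > capacity.ptc:
        problems.append(
            f"{requested} thread contexts requested, {capacity.ptc} provided"
        )
    return problems


capacity = Capacity(pvpe=2, ptc=8)
ok = check_plan(capacity, {1: 4, 2: 4})
too_many = check_plan(capacity, {1: 6, 2: 4})
bad_vpe = check_plan(capacity, {3: 1})
```

An empty list means the plan fits within the reported totals; the other two calls each report one problem.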

Resources are allocated to a particular VPE 102 by writing the VPE number to field TARGVPE 334 of the VPECONTROL register 504. One embodiment contemplates writing field 334 via the MIPS MTTR and MFTR instructions described above.

The value of field VPA 552 in register VPECONF0 505 is set to activate or deactivate a specified VPE 102. Field MVP 553 is set to grant or revoke resource configuration permission. Fields MINTC 554 and MAXTC 555 are updated to allocate the number and instantiation of thread contexts 103 to a specified VPE 102. In the MIPS32/MIPS64 Multithreading ASE embodiment of the present invention, fields NCX 561, NCP2 562, and NCP1 563 are updated to allocate coprocessor resources to a particular VPE 102. As noted above, the tables of FIGS. 5E-5F show that the noted resource allocation fields 552-555, 561-563 are read-only for every VPE 102 that lacks resource configuration permission, as indicated by the state of the MVP bit 553 in the VPECONF0 register 505. For a VPE 102 that has been granted resource configuration permission, the configuration logic 310 enables the noted fields 552-555, 561-563 to be updated (i.e., written).

Register VPESCHEDULE 592 includes a scheduler hint field 529 that can be updated to configure bandwidth resources across the VPEs 102 in the virtual multiprocessor 101.

Although FIGS. 4 and 5 depict an exemplary embodiment of the present invention in which certain resources can be dynamically configured in a MIPS32/MIPS64 Multithreading ASE environment, the present inventor notes that this exemplary embodiment is provided in terms of a known instruction set architecture in order to teach aspects of the invention. The inventor further notes that other architectures may likewise be encompassed.

Referring to FIG. 6, a flowchart 600 is presented illustrating a method according to the present invention for dynamic configuration of virtual processor resources. The method begins at block 602, where a VPE according to the present invention seeks to dynamically configure the resources. Flow proceeds to block 604.

At block 604, the VPE context corresponding to the requesting VPE is read. Flow proceeds to decision block 606.

At decision block 606, the VPE context is evaluated to determine whether the requesting VPE is permitted to dynamically configure the resources in the virtual multiprocessor. If so, flow proceeds to block 608. If not, flow proceeds to block 607.

At block 607, because the requesting VPE does not have resource configuration permission, an exception is raised and flow proceeds to block 620.

At block 608, virtual processing in the virtual multiprocessor is disabled to allow the resources to be configured. Flow proceeds to block 610.

At block 610, a configuration state is established in the virtual multiprocessor. Flow proceeds to block 612.

At block 612, a VMP context within the virtual multiprocessor is accessed to determine which resources, and how many of each, are available for configuration. Flow proceeds to block 614.

At block 614, a target VPE is selected whose allocated resources are to be configured. Flow proceeds to block 616.

At block 616, the resources are configured for the selected VPE by updating its corresponding VPE context. Flow proceeds to block 618.

At block 618, the new configuration of resources for the selected VPE is latched by exiting the configuration state, and virtual processing in the virtual multiprocessor is re-enabled. Flow proceeds to block 620.

At block 620, the method completes.
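
The sequence of blocks 602-620 can be traced with a short simulation. The Python sketch below is a minimal model under stated assumptions, not the patented implementation: `VirtualMultiprocessor`, `configure`, and `ConfigurationException` are hypothetical names, contexts are plain dictionaries, and clamping a request to the total reported by the VMP context at block 612 is an illustrative choice that the flowchart does not specify.

```python
class ConfigurationException(Exception):
    """Stands in for the exception raised at block 607 (hypothetical name)."""


class VirtualMultiprocessor:
    def __init__(self, available, permissions):
        self.available = dict(available)       # VMP context: resource -> total
        self.permissions = dict(permissions)   # VPE number -> may configure
        self.allocations = {vpe: {} for vpe in permissions}   # per-VPE resources
        self.processing_enabled = True
        self.in_configuration_state = False


def configure(vmp, requester, target, request):
    """Walk blocks 602-620 for one request and return the blocks visited."""
    visited = [602, 604]
    permitted = vmp.permissions[requester]     # 604: read the requester's context
    visited.append(606)
    if not permitted:                          # 606: permission check
        visited += [607, 620]                  # 607: exception, then done
        raise ConfigurationException(visited)
    vmp.processing_enabled = False             # 608: disable virtual processing
    vmp.in_configuration_state = True          # 610: establish configuration state
    granted = {                                # 612: consult the VMP context
        name: min(amount, vmp.available.get(name, 0))   # clamping is an assumption
        for name, amount in request.items()
    }
    vmp.allocations[target].update(granted)    # 614, 616: update the target context
    vmp.in_configuration_state = False         # 618: latch by leaving the state
    vmp.processing_enabled = True              # 618: re-enable virtual processing
    visited += [608, 610, 612, 614, 616, 618, 620]
    return visited


vmp = VirtualMultiprocessor({"tlb_entries": 32}, {1: False, 2: True})
path = configure(vmp, requester=2, target=1, request={"tlb_entries": 8})

try:
    configure(vmp, requester=1, target=2, request={"tlb_entries": 8})
    denied_path = None
except ConfigurationException as exc:
    denied_path = exc.args[0]
```

The permitted request visits every block except 607, while the denied request goes from decision block 606 through block 607 directly to block 620.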

FIG. 7 is a flowchart 700 depicting a method according to the present invention for dynamic configuration of virtual processor resources in which configuration permission is revoked. All blocks 702-720 of the flowchart 700 of FIG. 7 are identical to the corresponding blocks 602-620 of the flowchart 600 of FIG. 6, with the hundreds digit replaced by 7, except for an additional block 717, in which the VPE context of the selected VPE is updated to revoke its permission to dynamically configure the resources. The requesting VPE of block 702 may be the same as the selected VPE of block 717, thereby enabling a VPE to revoke its own configuration permission. After the new configuration is latched in block 718, the requesting VPE can no longer configure the resources.
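
The effect of block 717 can be shown with a small example in which a VPE revokes its own permission. This Python sketch is illustrative only: `set_permission` and `ConfigurationException` are hypothetical names, and returning a new dictionary is simply a way to model the permission taking effect once the configuration is latched.

```python
class ConfigurationException(Exception):
    """Raised when a VPE without permission attempts configuration (hypothetical)."""


def set_permission(permissions, requester, selected, allowed):
    """Return the permissions in effect after `requester` updates `selected`."""
    if not permissions[requester]:
        raise ConfigurationException(f"VPE {requester} may not configure resources")
    latched = dict(permissions)
    latched[selected] = allowed   # with allowed=False this models block 717
    return latched


permissions = {1: False, 2: True}

# VPE 2 is both the requesting VPE and the selected VPE: it revokes itself.
permissions = set_permission(permissions, requester=2, selected=2, allowed=False)

# Once the change is latched, VPE 2 can no longer configure anything.
try:
    set_permission(permissions, requester=2, selected=1, allowed=True)
    locked_out = False
except ConfigurationException:
    locked_out = True
```

In this toy model no VPE holds permission after the self-revocation, so every later request is rejected.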

Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are encompassed by the invention. For example, in addition to implementations of the invention using hardware, the invention can be embodied in software (e.g., computer readable code, program code, instructions and/or data) disposed, for example, in a computer usable (e.g., readable) medium. Such software enables the function, fabrication, modeling, simulation, description and/or testing of the apparatus and methods described herein. For example, this can be accomplished through the use of general programming languages (e.g., C, C++, JAVA, etc.), GDSII databases, hardware description languages (HDL) including Verilog HDL, VHDL, and so on, or other available programs, databases, and/or circuit (i.e., schematic) capture tools. Such software can be disposed in any known computer usable (e.g., readable) medium, including semiconductor memory, magnetic disk, and optical disc (e.g., CD-ROM, DVD-ROM, etc.), and as a computer data signal embodied in a computer usable (e.g., readable) transmission medium (e.g., a carrier wave or any other medium, including digital, optical, or analog-based media). Such software can be transmitted over communication networks including the Internet and intranets. The invention can be embodied in software (e.g., in HDL as part of a semiconductor intellectual property core, such as a microprocessor core, or as a system-level design, such as a System on Chip or SOC) and transformed to hardware as part of the production of integrated circuits. The invention may also be embodied as a combination of hardware and software.

Finally, those skilled in the art will appreciate that they can readily use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures to carry out the same purposes of the present invention without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (69)

1. An apparatus for configuring resources for one or more virtual processing elements in a virtual multiprocessor, comprising:
a virtual multiprocessor context, for specifying the resources and for controlling a configuration state of the virtual multiprocessor;
one or more virtual processing element contexts, each exclusively corresponding to one of the one or more virtual processing elements, each said virtual processing element context comprising:
first logic, for specifying whether said one of the one or more virtual processing elements is permitted to configure the resources; and
second logic, for specifying a subset of the resources that is allocated to said one of the one or more virtual processing elements; and
configuration logic, coupled to said virtual multiprocessor context and said one or more virtual processing element contexts, for detecting whether said one of the one or more virtual processing elements is permitted to configure the resources, for updating said virtual multiprocessor context to direct the virtual multiprocessor into said configuration state, and for configuring the resources by updating a specified virtual processing element context.
2. The apparatus according to claim 1, wherein the one or more virtual processing elements execute concurrently within the virtual multiprocessor, and wherein the virtual multiprocessor appears as a symmetric multiprocessor to a symmetric multiprocessing operating system.
3. The apparatus according to claim 1, wherein each of the one or more virtual processing elements comprises one or more thread contexts, the thread contexts being configured to execute one or more threads concurrently.
4. The apparatus according to claim 3, wherein each of the one or more thread contexts shares configured resources, wherein said configured resources are allocated from the resources to a corresponding one of the one or more virtual processing elements.
5. The apparatus according to claim 1, wherein the resources comprise one or more attributes of the virtual multiprocessor, and wherein the configuration of the resources for a particular virtual processing element determines the manner in which said particular virtual processing element executes relative to all other virtual processing elements of the one or more virtual processing elements of the virtual multiprocessor.
5. The apparatus according to claim 1, wherein the resources comprise translation lookaside buffer attributes.
6. The apparatus according to claim 1, wherein the resources comprise coprocessing attributes.
7. The apparatus according to claim 1, wherein the resources comprise floating point processing attributes.
8. The apparatus according to claim 1, wherein the resources comprise media acceleration attributes.
9. The apparatus according to claim 1, wherein the resources comprise permission to configure the resources.
10. The apparatus according to claim 1, wherein the resources comprise thread contexts.
11. The apparatus according to claim 1, wherein the resources comprise bandwidth of the virtual multiprocessor.
12. The apparatus according to claim 1, wherein the resources comprise virtual processing element activation.
13. The apparatus according to claim 1, wherein each of the one or more virtual processing elements comprises an instantiation of the MIPS32/MIPS64 instruction set and privileged resource architecture.
14. The apparatus according to claim 1, wherein said specified virtual processing element context corresponds to said one of the one or more virtual processing elements.
15. The apparatus according to claim 14, wherein said one of the one or more virtual processing elements can revoke its own permission to configure the resources.
16. The apparatus according to claim 1, wherein said specified virtual processing element context corresponds to a different one of the one or more virtual processing elements.
17. The apparatus according to claim 16, wherein said one of the one or more virtual processing elements can revoke the permission of said different one of the one or more virtual processing elements to configure the resources.
18. The apparatus according to claim 1, wherein said virtual multiprocessor context comprises one or more registers, and wherein said configuration state is controlled by writing a value to a configuration state field therein.
19. The apparatus according to claim 1, wherein said first logic comprises a master virtual processor field within one or more virtual processor context registers, and wherein a particular value of said master virtual processor field specifies whether said one of said one or more virtual processing elements is permitted to configure the resources.
20. The apparatus according to claim 1, wherein said second logic comprises one or more fields within one or more virtual processor context registers, and wherein said one or more fields can be updated only by a given virtual processing element that is permitted to configure the resources.
21. The apparatus according to claim 20, wherein, if said given virtual processing element is not permitted to configure the resources, said configuration logic causes an exception.
22. The apparatus according to claim 1, wherein one or more program instructions are executed by said one of said one or more virtual processing elements to establish said configuration state and to configure the resources.
23. A resource allocation mechanism for allocating resources to virtual processing elements in a virtual multiprocessor, the resource allocation mechanism comprising:
a plurality of virtual multiprocessor registers, for specifying the resources and for controlling a configuration state of the virtual multiprocessor;
a plurality of virtual processing element registers for each of the virtual processing elements, for specifying whether a corresponding virtual processing element is permitted to assign the resources, and for specifying a subset of the resources that is assigned to said corresponding virtual processing element; and
configuration logic, coupled to said virtual multiprocessor registers and said virtual processing element registers, for detecting whether said corresponding virtual processing element is permitted to assign the resources, for updating said virtual multiprocessor registers to direct the virtual multiprocessor into said configuration state, and for assigning the resources by updating selected ones of said virtual processing element registers.
24. The mechanism according to claim 23, wherein the resources comprise translation lookaside buffer attributes.
25. The mechanism according to claim 23, wherein the resources comprise coprocessing attributes.
26. The mechanism according to claim 23, wherein the resources comprise floating point processing attributes.
27. The mechanism according to claim 23, wherein the resources comprise media acceleration attributes.
28. The mechanism according to claim 23, wherein the resources comprise permission to configure the resources.
29. The mechanism according to claim 23, wherein the resources comprise thread contexts.
30. The mechanism according to claim 23, wherein the resources comprise bandwidth of the virtual multiprocessor.
31. The mechanism according to claim 23, wherein the resources comprise virtual processing element activation.
32. The mechanism according to claim 23, wherein each of the virtual processing elements comprises an instantiation of the MIPS32/MIPS64 instruction set and privileged resource architecture.
33. The mechanism according to claim 23, wherein said corresponding virtual processing element can revoke its own permission to assign the resources.
34. The mechanism according to claim 23, wherein said corresponding virtual processing element can revoke the permission of a different one of the virtual processing elements to assign the resources.
35. A computer program product for use with a computing device, the computer program product comprising:
a computer usable medium, having computer readable program code embodied in said medium, configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor, said computer readable program code comprising:
first program code, configured to describe a virtual multiprocessor context, the virtual multiprocessor context specifying said resources and controlling a configuration state of said virtual multiprocessor;
second program code, configured to describe virtual processing element contexts, each virtual processing element context exclusively corresponding to one of said virtual processing elements, specifying whether said one of said virtual processing elements is permitted to configure said resources, and specifying a subset of said resources that is allocated to said one of said virtual processing elements; and
third program code, configured to describe configuration logic, said configuration logic being coupled to said virtual multiprocessor context and said virtual processing element contexts, for detecting whether said one of said virtual processing elements is permitted to configure said resources, for updating said virtual multiprocessor context to direct said virtual multiprocessor into said configuration state, and for configuring said resources by updating a specified virtual processing element context.
36. The computer program product according to claim 35, wherein said resources comprise one or more attributes of said virtual multiprocessor, and wherein the configuration of said resources for said specified virtual processing element determines the manner in which said specified virtual processing element executes relative to all other ones of said virtual processing elements of said virtual multiprocessor.
37. The computer program product according to claim 35, wherein said resources comprise translation lookaside buffer attributes.
38. The computer program product according to claim 35, wherein said resources comprise coprocessing attributes.
39. The computer program product according to claim 35, wherein said resources comprise floating point processing attributes.
40. The computer program product according to claim 35, wherein said resources comprise media acceleration attributes.
41. The computer program product according to claim 35, wherein said resources comprise permission to configure said resources.
42. The computer program product according to claim 35, wherein said resources comprise thread contexts.
43. The computer program product according to claim 35, wherein said resources comprise bandwidth of said virtual multiprocessor.
44. The computer program product according to claim 35, wherein said resources comprise virtual processing element activation.
45. The computer program product according to claim 35, wherein each of said virtual processing elements comprises an instantiation of the MIPS32/MIPS64 instruction set and privileged resource architecture.
46. A computer data signal embodied in a transmission medium, comprising:
computer readable program code, configured to describe an apparatus for configuring resources for virtual processing elements in a virtual multiprocessor, said computer readable program code comprising:
first program code, configured to describe a virtual multiprocessor context, said virtual multiprocessor context specifying said resources and controlling a configuration state of said virtual multiprocessor;
second program code, configured to describe virtual processing element contexts, each virtual processing element context exclusively corresponding to one of said virtual processing elements, specifying whether said one of said virtual processing elements is permitted to configure said resources, and specifying a subset of said resources that is allocated to said one of said virtual processing elements; and
third program code, configured to describe configuration logic, said configuration logic being coupled to said virtual multiprocessor context and said virtual processing element contexts, for detecting whether said one of said virtual processing elements is permitted to configure said resources, for updating said virtual multiprocessor context to direct said virtual multiprocessor into said configuration state, and for configuring said resources by updating a specified virtual processing element context.
47. The computer data signal according to claim 46, wherein said resources comprise one or more attributes of said virtual multiprocessor, and wherein the configuration of said resources for said specified virtual processing element determines the manner in which said specified virtual processing element executes relative to all other ones of said virtual processing elements of said virtual multiprocessor.
48. The computer data signal according to claim 46, wherein said resources comprise translation lookaside buffer attributes.
49. The computer data signal according to claim 46, wherein said resources comprise coprocessing attributes.
50. The computer data signal according to claim 46, wherein said resources comprise floating point processing attributes.
51. The computer data signal according to claim 46, wherein said resources comprise media acceleration attributes.
52. The computer data signal according to claim 46, wherein said resources comprise permission to configure said resources.
53. The computer data signal according to claim 46, wherein said resources comprise thread contexts.
54. The computer data signal according to claim 46, wherein said resources comprise bandwidth of said virtual multiprocessor.
55. The computer data signal according to claim 46, wherein said resources comprise virtual processing element activation.
56. The computer data signal according to claim 46, wherein each of said virtual processing elements comprises an instantiation of the MIPS32/MIPS64 instruction set and privileged resource architecture.
57. A method for configuring resources for virtual processing elements within a virtual multiprocessor, the method comprising:
Via a virtual multiprocessor context, first specifying the resources and controlling a configuration state of the virtual multiprocessor;
Via virtual processing element contexts, each virtual processing element context corresponding exclusively to one of the virtual processing elements, second specifying whether said one of the virtual processing elements is allowed to configure the resources, and third specifying a subset of the resources assigned to said one of the virtual processing elements; and
Via configuration logic coupled to the virtual multiprocessor context and the virtual processing element contexts, detecting whether said one of the virtual processing elements is allowed to configure the resources, first updating the virtual multiprocessor context to indicate that the virtual multiprocessor has entered said configuration state, and second updating a specified virtual processing element context to configure the resources.
58. The method according to claim 57, wherein said second updating comprises allocating one or more attributes of the virtual multiprocessor.
59. The method according to claim 58, wherein said allocating comprises: assigning a translation lookaside buffer attribute.
60. The method according to claim 58, wherein said allocating comprises: assigning a coprocessing attribute.
61. The method according to claim 58, wherein said allocating comprises: assigning a floating-point processing attribute.
62. The method according to claim 58, wherein said allocating comprises: assigning a media acceleration attribute.
63. The method according to claim 58, wherein said allocating comprises: assigning permission to configure the resources.
64. The method according to claim 58, wherein said allocating comprises: assigning thread contexts.
65. The method according to claim 58, wherein said allocating comprises: assigning bandwidth of the virtual multiprocessor.
66. The method according to claim 58, wherein said allocating comprises: activating a given virtual processing element.
67. The method according to claim 57, wherein each of the virtual processing elements comprises an instantiation of the MIPS32/MIPS64 instruction set and privileged resource architecture.
68. A virtual multiprocessing system, comprising:
A memory, configured to store program instructions associated with a plurality of program threads; and
A virtual multiprocessor, coupled to said memory, configured to execute said program instructions on one or more virtual processing elements within said virtual multiprocessor, wherein said virtual multiprocessor has resources that specify the configuration of said one or more virtual processing elements, and a virtual multiprocessor context that controls a configuration state of said virtual multiprocessor, and wherein said one or more virtual processing elements comprise:
A virtual processing element context, for specifying whether each of said one or more virtual processing elements is allowed to configure said resources, and for specifying a subset of said resources assigned to a specified one of said one or more virtual processing elements; and
Configuration logic, coupled to said virtual multiprocessor context and said virtual processing element context, for detecting whether each of said one or more virtual processing elements is allowed to configure said resources, for updating said virtual multiprocessor context to indicate that said virtual multiprocessor has entered said configuration state, and for configuring said resources by updating a specified virtual processing element context corresponding to said specified one of said one or more virtual processing elements.
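As an informal illustration only, and not as part of the claims, the three steps of the method claim (detect permission, mark the virtual multiprocessor as being in the configuration state, update the specified virtual processing element context) might be modelled in software roughly as follows. All class, field, and resource names here (`VMPContext`, `VPEContext`, `ConfigLogic`, `"tlb"`, `"fpu"`, `"tc0"`, `"tc1"`) are hypothetical, and leaving the configuration state once the update completes is an assumption of this sketch rather than something the claims recite.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class VMPContext:
    """Virtual multiprocessor context: shared resources plus the configuration state."""
    resources: Set[str]
    in_config_state: bool = False


@dataclass
class VPEContext:
    """Per-element context: permission to configure and the assigned resource subset."""
    may_configure: bool = False
    assigned: Set[str] = field(default_factory=set)


class ConfigLogic:
    """Hypothetical software model of the configuration logic coupled to both contexts."""

    def __init__(self, vmp: VMPContext, vpes: List[VPEContext]) -> None:
        self.vmp = vmp
        self.vpes = vpes

    def configure(self, requester: int, target: int, subset: Set[str]) -> None:
        # Step 1: detect whether the requesting element is allowed to configure.
        if not self.vpes[requester].may_configure:
            raise PermissionError(f"VPE {requester} may not configure resources")
        if not subset <= self.vmp.resources:
            raise ValueError("subset names resources the multiprocessor lacks")
        # Step 2: update the multiprocessor context to indicate the configuration state.
        self.vmp.in_config_state = True
        try:
            # Step 3: configure the resources by updating the specified element's context.
            self.vpes[target].assigned = set(subset)
        finally:
            # Assumption of this sketch: the configuration state is left afterwards.
            self.vmp.in_config_state = False


# Usage: element 0 holds configuration permission and assigns resources to element 1.
vmp = VMPContext(resources={"tlb", "fpu", "tc0", "tc1"})
vpes = [VPEContext(may_configure=True), VPEContext()]
ConfigLogic(vmp, vpes).configure(requester=0, target=1, subset={"fpu", "tc1"})
```

In this model a request from an element without permission is rejected before the multiprocessor context is touched, which mirrors the order of the detecting and updating steps in the method claim.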
CNB2004800248016A 2003-08-28 2004-08-27 Apparatus for dynamically configuring virtual processor resources Expired - Fee Related CN100538640C (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US49918003P 2003-08-28 2003-08-28
US60/499,180 2003-08-28
US60/502,359 2003-09-12
US60/502,358 2003-09-12
US10/684,348 2003-10-10
US10/684,350 2003-10-10

Publications (2)

Publication Number Publication Date
CN1842771A true CN1842771A (en) 2006-10-04
CN100538640C CN100538640C (en) 2009-09-09

Family

ID=37031160

Family Applications (4)

Application Number Title Priority Date Filing Date
CN 200480024800 Pending CN1842770A (en) 2003-08-28 2004-08-26 A holistic mechanism for suspending and releasing threads of computation during execution in a processor
CNB2004800247988A Expired - Fee Related CN100489784C (en) 2003-08-28 2004-08-27 Multithreading microprocessor and its novel threading establishment method and multithreading processing system
CN2004800248529A Expired - Fee Related CN1846194B (en) 2003-08-28 2004-08-27 Method and apparatus for executing parallel program threads
CNB2004800248016A Expired - Fee Related CN100538640C (en) 2003-08-28 2004-08-27 Apparatus for dynamically configuring virtual processor resources

Family Applications Before (3)

Application Number Title Priority Date Filing Date
CN 200480024800 Pending CN1842770A (en) 2003-08-28 2004-08-26 A holistic mechanism for suspending and releasing threads of computation during execution in a processor
CNB2004800247988A Expired - Fee Related CN100489784C (en) 2003-08-28 2004-08-27 Multithreading microprocessor and its novel threading establishment method and multithreading processing system
CN2004800248529A Expired - Fee Related CN1846194B (en) 2003-08-28 2004-08-27 Method and apparatus for executing parallel program threads

Country Status (1)

Country Link
CN (4) CN1842770A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110768807A (en) * 2018-07-25 2020-02-07 中兴通讯股份有限公司 Virtual resource method and device, virtual resource processing network element and storage medium

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9417914B2 (en) 2008-06-02 2016-08-16 Microsoft Technology Licensing, Llc Regaining control of a processing resource that executes an external execution context
WO2010095182A1 (en) * 2009-02-17 2010-08-26 パナソニック株式会社 Multithreaded processor and digital television system
GB2474521B (en) * 2009-10-19 2014-10-15 Ublox Ag Program flow control
US8561070B2 (en) 2010-12-02 2013-10-15 International Business Machines Corporation Creating a thread of execution in a computer processor without operating system intervention
CN102183922A (en) * 2011-03-21 2011-09-14 浙江机电职业技术学院 Method for realization of real-time pause of affiliated computer services (ACS) motion controller
EP2434402A4 (en) * 2011-05-20 2012-08-01 Huawei Tech Co Ltd Method and device for multithread to access multiple copies
CN104750607B (en) * 2011-06-17 2018-07-06 阿里巴巴集团控股有限公司 A kind of method and device of selective recovery test execution
US9507638B2 (en) * 2011-11-08 2016-11-29 Nvidia Corporation Compute work distribution reference counters
CN102750132B (en) * 2012-06-13 2015-02-11 深圳中微电科技有限公司 Thread control and call method for multithreading virtual assembly line processor, and processor
CN103973600B (en) * 2013-02-01 2018-10-09 德克萨斯仪器股份有限公司 Merge and deposit the method and device of field instruction for packet transaction rotation mask
JP6122749B2 (en) * 2013-09-30 2017-04-26 ルネサスエレクトロニクス株式会社 Computer system
CN108228321B (en) * 2014-12-16 2021-08-10 北京奇虎科技有限公司 Android system application closing method and device
US9747108B2 (en) * 2015-03-27 2017-08-29 Intel Corporation User-level fork and join processors, methods, systems, and instructions
US10346168B2 (en) 2015-06-26 2019-07-09 Microsoft Technology Licensing, Llc Decoupled processor instruction window and operand buffer
US9720693B2 (en) * 2015-06-26 2017-08-01 Microsoft Technology Licensing, Llc Bulk allocation of instruction blocks to a processor instruction window
US10169105B2 (en) * 2015-07-30 2019-01-01 Qualcomm Incorporated Method for simplified task-based runtime for efficient parallel computing
US9921838B2 (en) * 2015-10-02 2018-03-20 Mediatek Inc. System and method for managing static divergence in a SIMD computing architecture
GB2544994A (en) * 2015-12-02 2017-06-07 Swarm64 As Data processing
CN105700913B (en) * 2015-12-30 2018-10-12 广东工业大学 A kind of parallel operation method of lightweight bare die code
US10761849B2 (en) * 2016-09-22 2020-09-01 Intel Corporation Processors, methods, systems, and instruction conversion modules for instructions with compact instruction encodings due to use of context of a prior instruction
GB201717303D0 (en) 2017-10-20 2017-12-06 Graphcore Ltd Scheduling tasks in a multi-threaded processor
GB2569275B (en) * 2017-10-20 2020-06-03 Graphcore Ltd Time deterministic exchange
GB2569098B (en) * 2017-10-20 2020-01-08 Graphcore Ltd Combining states of multiple threads in a multi-threaded processor
CN109697084B (en) * 2017-10-22 2021-04-09 刘欣 Fast access memory architecture for time division multiplexed pipelined processor
CN108536613B (en) * 2018-03-08 2022-09-16 创新先进技术有限公司 Data cleaning method and device and server
CN110955503B (en) * 2018-09-27 2023-06-27 深圳市创客工场科技有限公司 Task scheduling method and device
GB2580327B (en) * 2018-12-31 2021-04-28 Graphcore Ltd Register files in a multi-threaded processor
CN111414196B (en) * 2020-04-03 2022-07-19 中国人民解放军国防科技大学 A method and device for implementing a zero-value register
US12020064B2 (en) * 2020-10-20 2024-06-25 Micron Technology, Inc. Rescheduling a failed memory request in a processor
CN112395095A (en) * 2020-11-09 2021-02-23 王志平 Process synchronization method based on CPOC
CN112579278B (en) * 2020-12-24 2023-01-20 海光信息技术股份有限公司 Central processing unit, method, device and storage medium for simultaneous multithreading
TWI775259B (en) * 2020-12-29 2022-08-21 新唐科技股份有限公司 Direct memory access apparatus and electronic device using the same
CN115129369B (en) * 2021-03-26 2025-03-28 上海阵量智能科技有限公司 Command distribution method, command distributor, chip and electronic device
CN113946445B (en) * 2021-10-15 2025-02-25 杭州国芯微电子股份有限公司 A multi-thread module and multi-thread control method based on ASIC
CN116701085B (en) * 2023-06-02 2024-03-19 中国科学院软件研究所 Form verification method and device for consistency of instruction set design of RISC-V processor Chisel
CN116954950B (en) * 2023-09-04 2024-03-12 北京凯芯微科技有限公司 Inter-core communication method and electronic equipment

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0473714A1 (en) * 1989-05-26 1992-03-11 Massachusetts Institute Of Technology Parallel multithreaded data processing system
CA2100540A1 (en) * 1992-10-19 1994-04-20 Jonel George System and method for performing resource reconfiguration in a computer system
US5812811A (en) * 1995-02-03 1998-09-22 International Business Machines Corporation Executing speculative parallel instructions threads with forking and inter-thread communication
US6647508B2 (en) * 1997-11-04 2003-11-11 Hewlett-Packard Development Company, L.P. Multiprocessor computer architecture with multiple operating system instances and software controlled resource allocation
US6330656B1 (en) * 1999-03-31 2001-12-11 International Business Machines Corporation PCI slot control apparatus with dynamic configuration for partitioned systems
US6668317B1 (en) * 1999-08-31 2003-12-23 Intel Corporation Microengine for parallel processor architecture
HK1046566A1 (en) * 1999-09-01 2003-01-17 Intel Corporation Branch instruction for processor
US6986137B1 (en) * 1999-09-28 2006-01-10 International Business Machines Corporation Method, system and program products for managing logical processors of a computing environment
US7610366B2 (en) * 2001-11-06 2009-10-27 Canon Kabushiki Kaisha Dynamic network device reconfiguration

Also Published As

Publication number Publication date
CN100538640C (en) 2009-09-09
CN1846194A (en) 2006-10-11
CN1842770A (en) 2006-10-04
CN100489784C (en) 2009-05-20
CN1846194B (en) 2010-12-15
CN1842769A (en) 2006-10-04

Similar Documents

Publication Publication Date Title
CN100538640C (en) Apparatus for dynamically configuring virtual processor resources
US7694304B2 (en) Mechanisms for dynamic configuration of virtual processor resources
CN1117319C (en) Method and apparatus for altering thread priorities in multithreaded processor
US7418585B2 (en) Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts
US8266620B2 (en) Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts
CN1112636C (en) Method and apparatus for selecting thread switch events in multithreaded processor
US7376954B2 (en) Mechanisms for assuring quality of service for programs executing on a multithreaded processor
US20060136915A1 (en) Method and apparatus for scheduling multiple threads for execution in a shared microprocessor pipeline
WO2005022385A1 (en) Mechanisms for dynamic configuration of virtual processor resources
US20070044106A2 (en) Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts
EP1660999A2 (en) Integrated mechanism for suspension and deallocation of computational threads of execution in a processor
CN1726468A (en) Cross partition sharing of state information
CN1726469A (en) Processor virtualization mechanism via an enhanced restoration of hard architected states
CN107918557A (en) Device and method for operating multi-core system and multi-core system
CN111752615A (en) Apparatus, method and system for ensuring quality of service of multithreaded processor cores
CN105027075A (en) Processing cores with shared front-end unit
Abeydeera et al. SAM: Optimizing multithreaded cores for speculative parallelism
Ausavarungnirun et al. Mosaic: Enabling application-transparent support for multiple page sizes in throughput processors
Park et al. A hardware operating system kernel for multi-processor systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: Imagination Technologies Ltd.

Address before: California, USA

Patentee before: Imagination Technology Co.,Ltd.

Address after: California, USA

Patentee after: Imagination Technology Co.,Ltd.

Address before: California, USA

Patentee before: Mips Technologies, Inc.

CP01 Change in the name or title of a patent holder
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090909

Termination date: 20200827