CN101908200A - Drawing processing system and method with power gate control function - Google Patents
- Publication number
- CN101908200A (application CN2009101392911A)
- Authority
- CN
- China
- Prior art keywords
- shaders
- shader
- power gating
- frame
- graphics processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Generation (AREA)
- Power Sources (AREA)
Description
Technical Field
The present invention relates to graphics processing, and in particular to a graphics processing system with a power gating function and to a power gating method, both of which dynamically predict the number of shaders that need to operate according to changes in frame rate.
Background
In general, graphics applications involve complex, highly detailed rendering, such as three-dimensional (3D) graphics. To meet the growing demand for such rendering, the graphics processing unit (GPU) in a personal computer or portable device performs a large amount of computation to display objects, and therefore consumes considerable power. For battery-powered portable devices such as mobile phones, power consumption is a particularly important issue, so the power consumed by the GPU in the phone needs to be reduced.
In electronic components, power consumption comes mainly from two sources: dynamic power consumption, determined by the supply voltage and the operating frequency, and static power consumption, caused by leakage. As semiconductor process technology has advanced, static power consumption due to leakage has become a major problem. For processes at 65 nm and below, for example, more than 40% of the power consumption is attributable to leakage.
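As background, the two contributions are often written with the standard first-order CMOS power model shown below. This is textbook material rather than part of the patent text, and the symbols $a$ (switching activity), $C$ (switched capacitance), $f$ (clock frequency), and $I_{leak}$ (leakage current) are introduced here only for illustration.

```latex
P_{total} = P_{dynamic} + P_{static}, \qquad
P_{dynamic} \approx a \, C \, V_{dd}^{2} \, f, \qquad
P_{static} \approx V_{dd} \, I_{leak}
```

In this model, lowering the activity, the voltage, or the frequency reduces only the first term, whereas cutting the supply to an idle block removes its leakage term as well.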
Clock gating and dynamic voltage and frequency scaling (DVFS) are commonly used power-saving techniques. Both effectively reduce dynamic power consumption, but they do little or nothing to reduce leakage. Another known technique, power gating, places a power gating element over the entire GPU and controls the supply to the whole GPU through that element, which lacks flexibility. Alternatively, a power gating element is placed in every internal component; when a component is idle, its power gating element cuts the component's supply, reducing both dynamic and static power consumption. However, this mechanism needs additional control circuitry to switch the supply of each component on or off, and that circuitry itself consumes power. In addition, power gating incurs a time overhead to restore the supply to each component, which makes known power gating mechanisms slow and inefficient.
Therefore, a GPU with an improved power gating function is needed, one that saves power according to the requirements of different graphics applications.
Summary of the Invention
An embodiment of the invention provides a graphics processing system with a power gating function, comprising a graphics processing unit and a driver. The graphics processing unit comprises a unified shader unit and one or more power gating elements. The unified shader unit comprises a plurality of shaders, which render a plurality of previous frames. The driver is coupled to the graphics processing unit. It calculates, for each previous frame, a first number of operating shaders used to render that frame and the corresponding frame rate, and from these values determines a second number of operating shaders for rendering the next frame after the previous frames. The one or more power gating elements are coupled to the shaders and enable the corresponding shaders according to the second number of operating shaders.
The invention dynamically predicts the number of shaders that need to operate from changes in frame rate, and thereby saves power according to the requirements of different graphics applications.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a power gating method of a graphics processing system according to the invention.
FIG. 2 is a block diagram of a graphics processing unit according to an embodiment of the invention.
FIG. 3 shows a graphics processing system with a power gating function according to an embodiment of the invention.
FIG. 4 is a flowchart of a power gating method according to an embodiment of the invention.
FIG. 5 is a block diagram of a graphics processing unit according to another embodiment of the invention.
FIG. 6 is a block diagram of a graphics processing unit according to another embodiment of the invention.
Reference numerals:
302 ~ graphics processing unit;
304 ~ driver;
308 ~ arbiter;
310 ~ command processor;
312 ~ application programming interface;
314, 316 ~ applications;
318 ~ memory-mapped input/output;
320 ~ unified shader unit;
320A, 320B, 320C, 320D ~ shaders;
328A, 328B, 328C, 328D ~ power gating elements;
Vdd ~ voltage source; and
Ck ~ clock.
Detailed Description
To make the above objects, features, and advantages of the invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a power gating method of a graphics processing system according to the invention.
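In the embodiment of FIG. 1, a power gating control circuit is added to a graphics processing unit 102 so that the graphics processing unit 102 has a power gating function. Before the graphics processing unit 102 renders a frame, a driver 104 determines in advance the frame rate (frames per second, FPS) required for that frame, which represents the frame's rendering workload. The driver 104 then switches the power of the relevant functional components of the graphics processing unit 102 on and off according to that workload, reducing the overall power consumption of the graphics processing unit 102 without affecting the user or the rendering performance (for example, the smoothness of the displayed frames).
Specifically, in one embodiment, when the graphics processing unit 102 finishes rendering a frame, the driver 104 calculates the frame rate of that frame and uses it to determine the frame rate required for the next frame (arrow 106). Then, before the graphics processing unit 102 renders the next frame, the driver 104 controls the graphics processing unit 102 according to the frame rate of the next frame, for example by switching the power of the relevant functional components on or off (arrow 108).
In other embodiments, the frame rates of a plurality of previous frames may be used to determine the frame rate required for the next frame.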
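FIG. 2 is a block diagram of a graphics processing unit 202 according to an embodiment of the invention.
Referring to FIG. 2, the graphics processing unit 202 has a unified shader unit 220, a multiprocessor capable of processing multiple instructions in a single clock cycle. The unified shader unit 220 comprises a plurality of shaders (shader processors or shader cores), such as 220A, 220B, and 220C. Each shader may use a vector-and-scalar architecture or a multiple-scalar very long instruction word (VLIW) architecture, and has its own register file and instruction cache. Each shader executes shader programs that perform vertex shading and pixel shading to render each frame. The graphics processing unit 202 further comprises fixed-function geometry stages 204, fixed-function fragment stages 206, an arbiter 208, and a command processor 210.
Specifically, the fixed-function geometry stages 204 comprise a clipper 212, a primitive assembly unit 214, and a streamer 216. In this pipeline stage, the streamer 216 receives the vertex data of a 3D object and passes it to the shaders. The shaders 220A, 220B, and 220C then execute shader programs to determine the vertex attributes of the 3D object, so that the object can be converted into a frame displayed on the screen. Next, the primitive assembly unit 214 performs geometry assembly, combining vertices into polygons such as triangles. The clipper 212 removes triangles that lie outside the visible region.
The fixed-function fragment stages 206 comprise a triangle setup unit 222, a fragment generation unit 224, a hierarchical depth processing unit 226, a depth/stencil (Z/stencil) test unit 228, an interpolation unit 230, and a rendering unit 232. In this pipeline stage, the triangle setup unit 222 performs face culling to remove triangles that cannot be displayed, and computes the edge equations of the triangles. The fragment generation unit 224 generates fragments from the triangles to determine the pixels to be displayed. The hierarchical depth processing unit 226 is optional; it discards fragments outside the triangle or the viewport, and can discard an entire block of fragments at once. The depth/stencil test unit 228 uses the depth buffer and the stencil buffer to identify and discard hidden fragments. The interpolation unit 230 interpolates the perspective-corrected triangle attributes to produce fragment attributes. The rendering unit 232 then performs the pixel rendering operation. In one embodiment, the depth/stencil test unit 228 may instead be placed after the rendering unit 232.
In operation, the command processor 210 receives graphics commands and monitors and sets the power states of the shaders. The arbiter 208 performs thread scheduling according to the graphics commands and distributes them to the shaders for 3D graphics computation. Because the unified shader unit 220 performs most of the graphics computation, it dominates the power consumption of the graphics processing unit 202. Since the rendering workload differs from frame to frame, power gating can be used to control the supply of each shader individually, for example by turning the power gating elements of shaders 220A, 220B, and 220C on or off. This effectively reduces the dynamic power consumption and the leakage of the graphics processing unit 202, significantly reducing its overall power consumption without affecting application performance.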
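FIG. 3 is a block diagram of a graphics processing system with a power gating function according to an embodiment of the invention.
Referring to FIG. 3, the graphics processing system comprises a graphics processing unit 302 and a driver 304. In the embodiment shown in FIG. 3, the graphics processing unit 302 comprises a unified shader unit 320 with four shaders 320A, 320B, 320C, and 320D for rendering a plurality of frames. Like the unified shader unit 220 of FIG. 2, the unified shader unit 320 is a multiprocessor that can process multiple instructions in a single cycle of the clock Ck. The graphics processing unit 302 also comprises four power gating elements 328A, 328B, 328C, and 328D, each coupled to one of the shaders. The power gating elements enable or disable their shaders according to the corresponding control signals 330A, 330B, 330C, and 330D. The driver 304 is coupled to the graphics processing unit 302; it receives and executes applications, such as a first application 314 and a second application 316, from an application programming interface (API) 312, and drives the graphics processing unit 302 to render accordingly. The power gating method of the graphics processing system is described in detail below with reference to FIGS. 3 and 4.
FIG. 4 is a flowchart of a power gating method 40 according to an embodiment of the invention.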
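As described above, before the graphics processing unit 302 renders a given frame $\mathrm{Frame}_{n+1}$, the driver 304 may use a history-based calculation. Starting from the frame rates $\mathrm{FPS}_n, \mathrm{FPS}_{n-1}, \ldots, \mathrm{FPS}_{n-m+1}$ of the previously rendered frames $\mathrm{Frame}_n, \mathrm{Frame}_{n-1}, \ldots, \mathrm{Frame}_{n-m+1}$ and the numbers of shaders $S_n, S_{n-1}, \ldots, S_{n-m+1}$ that operated for them, it predicts the number of shaders $S_{n+1}$ that must be enabled when the graphics processing unit 302 renders $\mathrm{Frame}_{n+1}$. Here $m$ is the number of previous frames used for the prediction. The power gating elements then switch the power of the corresponding shaders on or off, so that the graphics processing unit 302 operates more efficiently and the overall power consumption of the graphics processing system is reduced.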
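Further, the driver 304 drives the graphics processing unit 302 to perform graphics operations in response to requests from each application. The driver 304 can therefore use those requests to determine when the rendering of a frame starts and ends, and perform the power gating operations at those points. For example, the first application 314 may include an instruction SwapBuffer, which indicates the end of frame rendering, and the second application 316 may include an instruction ClearBuffer, which indicates the start of frame rendering. Performing the power gating operations while these instructions execute does not affect rendering performance.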
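One way a driver could derive per-frame statistics from the SwapBuffer call is sketched below in Python. This is a minimal illustration, not the patent's implementation: the class `FrameTracker`, its method name, and the choice to measure a frame's rate as the reciprocal of the time between two consecutive SwapBuffer calls are all assumptions.

```python
import time


class FrameTracker:
    """Hypothetical driver-side helper that records per-frame statistics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable clock, useful for testing
        self._last_swap = None       # time of the previous SwapBuffer call
        self.history = []            # (active_shaders, fps) per finished frame

    def on_swap_buffer(self, active_shaders):
        """Call when SwapBuffer is seen, i.e. the current frame is finished."""
        now = self._clock()
        if self._last_swap is not None:
            elapsed = now - self._last_swap
            if elapsed > 0:
                # Assumption: frame rate of this frame = 1 / frame time.
                self.history.append((active_shaders, 1.0 / elapsed))
        self._last_swap = now
```

The recorded `history` holds one (shader count, frame rate) pair per finished frame, which is the kind of data a history-based prediction needs.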
Referring to FIGS. 3 and 4, when the driver 304 receives the first application 314 from the API 312, it generates a corresponding command packet and sends it through memory-mapped I/O 318 to a command processor 310 of the graphics processing unit 302 (step S402).
In response to the execution of the instruction SwapBuffer, which indicates that the frame $\mathrm{Frame}_n$ preceding the given frame $\mathrm{Frame}_{n+1}$ has been rendered, the driver 304 calculates the numbers of shaders $S_n, S_{n-1}, \ldots, S_{n-m+1}$ that operated for the previous frames $\mathrm{Frame}_n, \mathrm{Frame}_{n-1}, \ldots, \mathrm{Frame}_{n-m+1}$ and the corresponding frame rates $\mathrm{FPS}_n, \mathrm{FPS}_{n-1}, \ldots, \mathrm{FPS}_{n-m+1}$ (step S404).
For example, when $m = 5$, the driver 304 calculates the numbers of shaders $S_n, S_{n-1}, \ldots, S_{n-4}$ that operated for the five frames $\mathrm{Frame}_n, \mathrm{Frame}_{n-1}, \ldots, \mathrm{Frame}_{n-4}$ rendered before the given frame $\mathrm{Frame}_{n+1}$, and the corresponding frame rates $\mathrm{FPS}_n, \mathrm{FPS}_{n-1}, \ldots, \mathrm{FPS}_{n-4}$.
The driver 304 then uses the shader counts $S_n, S_{n-1}, \ldots, S_{n-m+1}$ and frame rates $\mathrm{FPS}_n, \mathrm{FPS}_{n-1}, \ldots, \mathrm{FPS}_{n-m+1}$ of the previous frames to determine the number of shaders $S_{n+1}$ that must operate to render the given frame $\mathrm{Frame}_{n+1}$ (step S406).
More specifically, the driver 304 may determine the number of shaders $S_{n+1}$ required to render the given frame $\mathrm{Frame}_{n+1}$ from a formula (not reproduced in this text) whose terms are as follows: $m$ is the number of previous frames; $S_n, S_{n-1}, \ldots, S_{n-m+1}$ are the numbers of shaders that operated to render the previous frames $\mathrm{Frame}_n, \mathrm{Frame}_{n-1}, \ldots, \mathrm{Frame}_{n-m+1}$; $\mathrm{FPS}_n, \mathrm{FPS}_{n-1}, \ldots, \mathrm{FPS}_{n-m+1}$ are the frame rates of those frames; Target_FPS is a target frame rate adjusted according to display requirements; $\alpha$ is a control variable; and $n \geq m$.
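The formula itself does not survive in this text. Purely as an illustration of a history-based estimate built from the same quantities, the Python sketch below assumes that the frame rate scales linearly with the number of operating shaders. It is not the patented formula; the function name, the averaging, the rounding up, and the clamping to the number of physical shaders are all assumptions.

```python
import math


def predict_shader_count(history, target_fps, alpha=1.0, total_shaders=4):
    """Estimate S_{n+1} from the last m frames.

    history: list of (S_i, FPS_i) pairs for the previous frames.
    Assumption: a frame that reached FPS_i with S_i shaders would need
    S_i * target_fps / FPS_i shaders to reach target_fps.
    """
    estimates = [s * target_fps / fps for s, fps in history if fps > 0]
    if not estimates:
        return total_shaders  # no usable history: keep every shader powered
    mean = sum(estimates) / len(estimates)
    needed = math.ceil(alpha * mean)  # alpha > 1 adds a safety margin
    return max(1, min(total_shaders, needed))
```

For instance, `predict_shader_count([(4, 60.0), (4, 60.0)], target_fps=30.0)` returns 2 under this assumption: four shaders delivered twice the target rate, so half of them are estimated to suffice.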
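Afterwards, when the driver 304 receives the second application 316 from the API 312 (step S408), in response to the execution of the instruction ClearBuffer, which indicates that rendering of the given frame $\mathrm{Frame}_{n+1}$ is starting, the driver 304 generates a corresponding command packet from the required number of shaders $S_{n+1}$ and the current power state of each of the shaders 320A, 320B, 320C, and 320D. The packet configures which of the shaders 320A, 320B, 320C, and 320D are powered on and which are powered off, and is sent to the command processor 310 (step S410). In one embodiment, if the number of shaders $S_n$ used for the frame $\mathrm{Frame}_n$ preceding the given frame is greater than the number $S_{n+1}$ required for the given frame, the power supply of the idle (inactive) shaders is switched off accordingly. Otherwise, the power supply of the shaders that need to operate (active) is switched on accordingly.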
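The comparison between the current power states and the required count can be sketched as follows. This Python fragment is only an illustration: the function `plan_power_changes`, the dictionary layout of the result, and the rule for choosing which shaders to switch (in listing order) are assumptions, since the text does not say which particular shaders are selected.

```python
def plan_power_changes(powered, needed):
    """Decide which shaders to switch on or off for the next frame.

    powered: dict mapping shader id -> True if currently powered
             (insertion order is used as the selection order).
    needed:  required number of operating shaders, S_{n+1}.
    """
    on = [sid for sid, state in powered.items() if state]
    off = [sid for sid, state in powered.items() if not state]
    if needed < len(on):
        # Too many shaders powered: switch off the surplus ones.
        return {"power_on": [], "power_off": on[needed:]}
    # Enough or too few powered: switch on as many as are missing.
    return {"power_on": off[:needed - len(on)], "power_off": []}
```

The returned lists play the role of the command packet contents in this sketch; a real packet format is hardware specific and is not described in the text.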
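Next, in response to the command packet, the command processor 310 generates the control signals 330A, 330B, 330C, and 330D that turn each of the power gating elements 328A, 328B, 328C, and 328D on or off, thereby powering the shaders 320A, 320B, 320C, and 320D on or off, and notifies an arbiter 308 (step S412).
The arbiter 308 then distributes graphics commands to the enabled shaders, and the given frame $\mathrm{Frame}_{n+1}$ is rendered (step S414), for example by tile-based rendering.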
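As a rough picture of how work might be spread over only the enabled shaders in a tile-based scheme, the Python sketch below hands out screen tiles in round-robin order. The round-robin policy, the function name `assign_tiles`, and the tile representation are assumptions made for illustration; the text states only that the arbiter distributes commands to the enabled shaders.

```python
def assign_tiles(width, height, tile, active_shaders):
    """Assign screen tiles to the currently powered shaders, round-robin.

    Returns a dict: shader id -> list of (x, y) tile origins.
    """
    if not active_shaders:
        raise ValueError("at least one shader must be powered")
    assignment = {sid: [] for sid in active_shaders}
    index = 0
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            sid = active_shaders[index % len(active_shaders)]
            assignment[sid].append((x, y))
            index += 1
    return assignment
```

With fewer powered shaders, each one simply receives a larger share of the tiles, which is why the shader count trades frame rate against power.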
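In one embodiment, each power gating element comprises a transistor. As shown in FIG. 3, each power gating element comprises an NMOS transistor coupled between a voltage source Vdd and one of the shaders, and its gate receives the control signal issued by the command processor 310. Each transistor is turned on or off by its control signal, which determines whether the voltage source Vdd is supplied to the corresponding shader.
Furthermore, each of the shaders 320A, 320B, 320C, and 320D may have its own texture unit, or the shaders may share one or more texture units. Each shader can thus receive texture data from a texture unit through its texture access path 332, 334, 336, or 338. In this case, the allocation of power gating elements can be adjusted flexibly according to the configuration of the texture units, which can considerably improve the efficiency of power management. This power gating mechanism is described in detail below with reference to FIGS. 5 and 6.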
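FIG. 5 is a block diagram of a graphics processing unit 502 according to another embodiment of the invention.
Referring to FIG. 5, the graphics processing unit 502 comprises a unified shader unit 520, local shared memories 512 and 514, texture units 508 and 510, a global shared memory 516, and a thread processing unit 518.
In this embodiment, the unified shader unit 520 has a plurality of shaders, organized into two shader clusters 504 and 506, which use the local shared memories 512 and 514 respectively for rendering. The shader clusters 504 and 506 are coupled to the two texture units 508 and 510 respectively, and the global shared memory 516 is shared by the texture units 508 and 510. Each shader cluster comprises eight shaders. The thread processing unit 518 comprises two thread sequencers 522 and 524 for dispatching threads.
In this case, each of the shader clusters 504 and 506 may be given its own power gating element, so that turning a power gating element on or off enables or disables the corresponding shader cluster. The supply to the local shared memory and the texture unit belonging to each shader cluster can be controlled at the same time. This reduces the cost of the power gating control circuitry and also saves the power consumed by the components surrounding the shaders.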
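When gating is done per cluster rather than per shader, a predicted shader count has to be rounded up to whole clusters. The short Python sketch below shows that rounding for the FIG. 5 arrangement of two clusters with eight shaders each; the function name and the rounding rule are assumptions, as the text does not describe how a shader count is mapped to clusters.

```python
import math


def clusters_to_power(needed_shaders, shaders_per_cluster=8, total_clusters=2):
    """Number of whole clusters that must stay powered to provide
    at least `needed_shaders` shaders (capped at the clusters available)."""
    if needed_shaders <= 0:
        return 0
    return min(total_clusters, math.ceil(needed_shaders / shaders_per_cluster))
```

The coarser granularity means that some powered shaders may be idle (for example, nine required shaders keep all sixteen powered), which is the price of needing only one power gating element per cluster.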
FIG. 6 is a block diagram of a graphics processing unit 602 according to another embodiment of the invention.
Referring to FIG. 6, the graphics processing unit 602 comprises a unified shader unit 620, a geometry control unit 604, a shader control unit 606, and a texture unit 608.
In this embodiment, the unified shader unit 620 has a plurality of shaders, organized into two shader multiprocessors 610 and 612. The two shader multiprocessors 610 and 612 form one shader cluster and share the texture unit 608 for rendering. The geometry control unit 604 and the shader control unit 606 receive data and distribute rendering work. In FIG. 6, each shader multiprocessor comprises eight shaders SP, I and C caches, a multi-thread issue unit MT, two special function units SFU, and a shared memory MEM for graphics computation. In this architecture the shader multiprocessors 610 and 612 share the texture unit 608, so they can be treated as a single power management unit whose supply is controlled by one power gating element. When that power gating element is turned off, the whole shader cluster, that is, the shader multiprocessors 610 and 612, and the texture unit 608 are powered off together. This further reduces the number of power gating elements and the power consumed.
Thus, with the graphics processing system and power gating method of the invention, the number of shaders that need to operate can be controlled dynamically during rendering according to how the frame rate changes from frame to frame, reducing unnecessary power consumption.
The method of the invention, or particular forms or portions of it, may exist in the form of program code. The program code may be stored on physical media, such as floppy disks, optical discs, hard disks, or any other machine-readable (for example, computer-readable) storage medium, or may be a computer program product in any external form; when the program code is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. The program code may also be transmitted over a transmission medium, such as electrical wire or cable, optical fiber, or any other form of transmission; when the program code is received, loaded, and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processing unit, the program code combines with the processing unit to provide a unique apparatus that operates analogously to application-specific logic circuits.
Although the invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the claims.
Claims (18)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2009101392911A CN101908200B (en) | 2009-06-05 | 2009-06-05 | Drawing processing system and method with power gating function |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN101908200A true CN101908200A (en) | 2010-12-08 |
| CN101908200B CN101908200B (en) | 2012-08-08 |
Family
ID=43263653
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2009101392911A (Expired - Fee Related; granted as CN101908200B) | 2009-06-05 | 2009-06-05 | Drawing processing system and method with power gating function |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN101908200B (en) |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI348653B (en) * | 2006-06-08 | 2011-09-11 | Via Tech Inc | Decoding of context adaptive binary arithmetic codes in computational core of programmable graphics processing unit |
| CN101216932B (en) * | 2008-01-03 | 2010-08-18 | 威盛电子股份有限公司 | Graphics processing device, unit, and method for executing triangle configuration and attribute configuration |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2016090641A1 (en) * | 2014-12-12 | 2016-06-16 | 上海兆芯集成电路有限公司 | Graphics processing system and power gating method thereof |
| US10209758B2 (en) | 2014-12-12 | 2019-02-19 | Via Alliance Semiconductor Co., Ltd. | Graphics processing system and power gating method thereof |
Also Published As
| Publication number | Publication date |
|---|---|
| CN101908200B (en) | 2012-08-08 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120808 |