Synchronized GPU priority scheduling
As Ursulin describes the situation, the "current processing landscape
seems to be more and more composed of pipelines where computations are done
on multiple hardware devices". The kernel directly controls the
availability of CPU time for the work that is actually done on the CPU.
But, increasingly, computing work is offloaded to GPUs, AI accelerators, or
cryptocurrency-mining peripherals. Those processors, while capable, can also be
overloaded by the demands placed on them. If they run their workloads in a
way that disagrees with the kernel's idea of process priorities, the end
result may not be what the user would like to see.
As an example, Ursulin pointed out that the Chrome browser will lower the priority of tabs that are not currently in the foreground. If one of those background tabs is doing a lot of rendering on the GPU, though, it may slow down the foreground tab even though the background work is supposed to be running at low priority. It turns out that at least some GPUs, including some Intel i915 versions, can perform priority-based scheduling internally. But that requires informing the GPU of the relevant priorities, and there is currently no way to communicate those decisions, which are made in user space, to the GPU.
Ursulin's approach is to add the concept of "context nice" to the i915 driver. This value, which is tied to the priority of the process submitting work, is used with suitably capable GPUs to influence the scheduling of that work. This approach works, but only until the priority of the process on the CPU is changed; if the browser switches to a new tab and wants to increase its priority, continuing to run the associated work on the GPU side at a lower priority would not lead to greater user satisfaction. To avoid that problem, Ursulin's patch series adds a new notifier to the scheduler so that interested kernel subsystems can be informed whenever a process's priority is changed. The i915 driver then hooks into that notifier so that it can update its priority information to keep up with the CPU priority of any process that is running work on the GPU.
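
The notifier arrangement described above can be modeled in a few lines of user-space Python. This is only a toy sketch of the pattern, not the i915 code or any kernel interface; every name in it (`Scheduler`, `register_notifier`, `GpuDriver`, `context_nice`, and so on) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Process:
    pid: int
    nice: int = 0  # CPU nice value: -20 (most favored) to 19 (least favored)


class Scheduler:
    """Toy stand-in for the CPU scheduler (hypothetical, not a kernel API)."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[Process], None]] = []

    def register_notifier(self, callback: Callable[[Process], None]) -> None:
        # Interested subsystems ask to be told about priority changes.
        self._listeners.append(callback)

    def set_nice(self, proc: Process, nice: int) -> None:
        # Clamp to the usual nice range, then notify every listener.
        proc.nice = max(-20, min(19, nice))
        for callback in self._listeners:
            callback(proc)


@dataclass
class GpuContext:
    owner: Process
    context_nice: int  # priority hint handed to the GPU for this context


class GpuDriver:
    """Toy driver that mirrors the owner's nice value into each context."""

    def __init__(self, sched: Scheduler) -> None:
        self.contexts: Dict[int, GpuContext] = {}
        sched.register_notifier(self.on_priority_change)

    def create_context(self, proc: Process) -> GpuContext:
        # The context starts out with the submitting process's priority.
        ctx = GpuContext(owner=proc, context_nice=proc.nice)
        self.contexts[proc.pid] = ctx
        return ctx

    def on_priority_change(self, proc: Process) -> None:
        # Keep the GPU-side priority in step with the CPU-side priority.
        ctx = self.contexts.get(proc.pid)
        if ctx is not None:
            ctx.context_nice = proc.nice


sched = Scheduler()
driver = GpuDriver(sched)
tab = Process(pid=1234, nice=10)   # a background tab
ctx = driver.create_context(tab)
sched.set_nice(tab, 0)             # the tab moves to the foreground
print(ctx.context_nice)
```

In this model the context's priority follows the process immediately, including for work that is already associated with the context; the cost is that the scheduler calls into driver code on every priority change.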
The notifier has turned out to be the most controversial part of this patch set. Ursulin noted that there could be security concerns with calling into a device driver from deep within the scheduler whenever a process's priority has changed. John Wanghui suggested that a separate "I/O nice" value could be added to control priorities on the GPU; this would be different from the "ionice" that already exists for block I/O but would function in a similar way. Barry Song, instead, complained that the use of simple nice values is insufficient; it does not take into account the effect of control groups or accumulated run time on actual access to the CPU. That could lead to scheduling results on the GPU that would be inconsistent with what happens on the CPU.
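
Song's objection about control groups can be illustrated with some simple arithmetic. The sketch below assumes a deliberately simplified model: one CPU, one always-runnable task per control group, a flat hierarchy, and CPU time divided in proportion to each group's weight. The task names and weights are invented for the example.

```python
def cpu_shares(group_weights):
    """Split CPU time in proportion to each group's weight (simplified model)."""
    total = sum(group_weights.values())
    return {name: weight / total for name, weight in group_weights.items()}


# Two hypothetical tasks with the same nice value but different group weights.
tasks = {
    "foreground": {"nice": 0, "cgroup_weight": 900},
    "background": {"nice": 0, "cgroup_weight": 100},
}

shares = cpu_shares({name: t["cgroup_weight"] for name, t in tasks.items()})
gpu_view = {name: t["nice"] for name, t in tasks.items()}

for name in tasks:
    print(f"{name}: CPU share {shares[name]:.0%}, nice seen by GPU {gpu_view[name]}")
```

Under these assumptions the CPU gives the two tasks very different shares, while a GPU looking only at nice values would treat them as equals.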
Ursulin mostly agreed with Song's criticisms, but also made the claim that even just using the process nice value is better than no control over execution priority on the GPU at all. This initial implementation could be extended later to include support for control groups and such if that seemed warranted. Meanwhile, though, he has concluded that perhaps the scheduler notifier is not necessary after all. By using the current process priority whenever work is submitted to the GPU, similar results would be obtained; the main difference is that a priority change would not apply to work that had already been passed to the GPU. The next version of this patch set, it appears, will drop the notifier.
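
The trade-off of sampling the priority at submission time can be seen in a small model. This is a sketch under stated assumptions: the GPU is treated as a strict priority queue ordered by nice value, which real hardware need not be, and the `GpuQueue` class and job labels are hypothetical.

```python
import heapq
import itertools


class GpuQueue:
    """Toy GPU work queue: lowest nice value runs first, ties in FIFO order."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserving submit order

    def submit(self, nice_at_submit, label):
        # The priority is read once, here, and never updated afterward.
        heapq.heappush(self._heap, (nice_at_submit, next(self._seq), label))

    def run_all(self):
        order = []
        while self._heap:
            _, _, label = heapq.heappop(self._heap)
            order.append(label)
        return order


gpu = GpuQueue()
tab_nice = 10                          # the tab starts in the background
gpu.submit(tab_nice, "tab frame 1")
gpu.submit(5, "other job")
tab_nice = 0                           # the tab is brought to the foreground
gpu.submit(tab_nice, "tab frame 2")

order = gpu.run_all()
print(order)
```

The frame submitted after the priority change runs first, but the frame queued earlier keeps its old, lower priority and still waits behind the other job. That is the difference Ursulin described: a priority change does not reach work that has already been passed to the GPU.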
Ursulin has done some simple benchmark tests where a
graphical application is running alongside a "GPU hog" process. If the GPU
hog is given a low priority, the graphical application is able to produce
significantly higher frame rates than it can in the absence of priority
control. He concluded: "So it appears the feature can indeed improve the
user experience". It thus seems
likely that some version of this work will eventually find its way into the
mainline; what remains to be seen is how much it will have to change before
it gets there.