Synchronized GPU priority scheduling

By Jonathan Corbet
October 22, 2021

Since the early days, Unix-like systems have implemented the concept of process priorities, where higher-priority processes are given more CPU time to get their work done. Implementations have changed, and alternatives (such as deadline scheduling) are available for specialized situations, but the core priority (or, in an inverted sense, "niceness") concept remains essentially the same. What should happen, though, in a world where an increasing amount of computing work is done outside of the CPU? Tvrtko Ursulin has put together a patch set showing how the nice mechanism can be extended to GPUs as well.

As Ursulin describes the situation, the "current processing landscape seems to be more and more composed of pipelines where computations are done on multiple hardware devices". The kernel directly controls the availability of CPU time for the work that is actually done on the CPU. But, increasingly, computing work is offloaded to GPUs, AI accelerators, or cryptocurrency-mining peripherals. Those processors, while capable, can also be overloaded by the demands placed on them. If they run their workloads in a way that disagrees with the kernel's idea of process priorities, the end result may not be what the user would like to see.

As an example, Ursulin pointed out that the Chrome browser will lower the priority of tabs that are not currently in the foreground. If one of those background tabs is doing a lot of rendering in the GPU, though, it may slow down the foreground tab even though the background work is supposed to be running at low priority. It turns out that at least some of these GPUs, including some Intel i915 versions, can perform priority-based scheduling internally. But that requires informing the GPU of the relevant priorities, and there is currently no way to communicate those decisions, which are made in user space, to the GPU.
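For readers who have not used the nice mechanism directly, the following is a minimal sketch of how a user-space process might lower its own CPU priority, roughly in the spirit of what a browser could do for a background tab. The function name lower_own_priority() and the step of five are illustrative assumptions and are not taken from Chrome or from the patch set; the point is only that this value is visible to the CPU scheduler while nothing equivalent reaches the GPU.

```python
import os
from typing import Tuple


def lower_own_priority(step: int = 5) -> Tuple[int, int]:
    """Raise this process's nice value (lowering its CPU priority).

    Returns the nice value before and after the change. The helper name
    and the default step are hypothetical, chosen for illustration.
    """
    # A "who" of 0 means "the calling process".
    before = os.getpriority(os.PRIO_PROCESS, 0)

    # Linux nice values run from -20 (most favored) to 19 (least favored).
    # Raising the value needs no special privilege.
    target = min(before + step, 19)
    os.setpriority(os.PRIO_PROCESS, 0, target)

    after = os.getpriority(os.PRIO_PROCESS, 0)
    return before, after


before, after = lower_own_priority()
print(f"nice value: {before} -> {after}")
```

Note that an unprivileged process generally cannot lower its nice value again afterward, which is one reason to try this in a throwaway process.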

Ursulin's approach is to add the concept of "context nice" to the i915 driver. This value, which is tied to the priority of the process submitting work, is used with suitably capable GPUs to influence the scheduling of that work. This approach works, but only until the priority of the process on the CPU is changed; if the browser switches to a new tab and wants to increase its priority, continuing to run the associated work on the GPU side at a lower priority would not lead to greater user satisfaction. To avoid that problem, Ursulin's patch series adds a new notifier to the scheduler so that interested kernel subsystems can be informed whenever a process's priority is changed. The i915 driver then hooks into that notifier so that it can update its priority information to keep up with the CPU priority of any process that is running work on the GPU.
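The notifier arrangement described above can be modeled in a few lines. This is a user-space toy, assuming hypothetical Scheduler and GpuDriver classes; it is not the kernel interface from the patch set, and the real i915 code maps nice values onto its own internal priorities in ways not shown here.

```python
from typing import Callable, Dict, List


class Scheduler:
    """Toy stand-in for the CPU scheduler (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._nice: Dict[int, int] = {}
        self._notifiers: List[Callable[[int, int], None]] = []

    def register_notifier(self, callback: Callable[[int, int], None]) -> None:
        # Interested subsystems ask to be told about priority changes.
        self._notifiers.append(callback)

    def set_nice(self, pid: int, nice: int) -> None:
        self._nice[pid] = nice
        # Inform every registered subsystem of the new value.
        for callback in self._notifiers:
            callback(pid, nice)


class GpuDriver:
    """Toy stand-in for a driver that tracks a per-context nice value."""

    def __init__(self, scheduler: Scheduler) -> None:
        self.context_nice: Dict[int, int] = {}
        scheduler.register_notifier(self.on_nice_change)

    def create_context(self, pid: int, nice: int) -> None:
        # The context inherits the priority of the submitting process.
        self.context_nice[pid] = nice

    def on_nice_change(self, pid: int, nice: int) -> None:
        # Only processes that actually have GPU contexts matter here.
        if pid in self.context_nice:
            self.context_nice[pid] = nice


sched = Scheduler()
gpu = GpuDriver(sched)
gpu.create_context(pid=100, nice=10)   # background tab
sched.set_nice(100, 0)                 # tab moves to the foreground
print(gpu.context_nice[100])
```

In this model, the driver's copy of the priority never goes stale, at the cost of a callback on every priority change.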

The notifier has turned out to be the most controversial part of this patch set. Ursulin noted that there could be security concerns with calling into a device driver from deep within the scheduler whenever a process's priority has changed. John Wanghui suggested that a separate "I/O nice" value could be added to control priorities on the GPU; this would be different from the "ionice" that already exists for block I/O but would function in a similar way. Barry Song, instead, complained that the use of simple nice values is insufficient; it does not take into account the effect of control groups or accumulated run time on actual access to the CPU. That could lead to scheduling results on the GPU that would be inconsistent with what happens on the CPU.

Ursulin mostly agreed with Song's criticisms, but also made the claim that even just using the process nice value is better than no control over execution priority on the GPU at all. This initial implementation could be extended later to include support for control groups and such if that seemed warranted. Meanwhile, though, he has concluded that perhaps the scheduler notifier is not necessary after all. By using the current process priority whenever work is submitted to the GPU, similar results would be obtained; the main difference is that a priority change would not apply to work that had already been passed to the GPU. The next version of this patch set, it appears, will drop the notifier.
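The behavioral difference of the notifier-free approach can be illustrated with a small priority queue. This is a sketch under stated assumptions: the SubmitTimeGpuQueue class and the job names are hypothetical, and real GPU scheduling is considerably more involved. Each job is tagged with the submitter's nice value at submission time, so a later priority change does not affect work that is already queued.

```python
import heapq
import itertools
from typing import List, Tuple


class SubmitTimeGpuQueue:
    """Toy GPU queue that samples the nice value when work is submitted."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # A running counter keeps FIFO order among jobs of equal niceness.
        self._seq = itertools.count()

    def submit(self, job: str, current_nice: int) -> None:
        # The priority is fixed here and is not revisited later.
        heapq.heappush(self._heap, (current_nice, next(self._seq), job))

    def run_next(self) -> Tuple[str, int]:
        # Lowest nice value (highest priority) runs first.
        nice, _, job = heapq.heappop(self._heap)
        return job, nice


def demo() -> List[Tuple[str, int]]:
    queue = SubmitTimeGpuQueue()
    tab_nice = 10                             # tab starts in the background
    queue.submit("tab-frame-1", tab_nice)
    tab_nice = 0                              # tab is brought to the foreground
    queue.submit("tab-frame-2", tab_nice)
    queue.submit("other-app-frame", 5)        # unrelated process at nice 5
    return [queue.run_next() for _ in range(3)]


for job, nice in demo():
    print(job, nice)
```

Here the frame submitted while the tab was in the background still runs last, even though the tab's priority had been raised by the time it executed; only newly submitted work picks up the change.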

Ursulin has done some simple benchmark tests where a graphical application is running alongside a "GPU hog" process. If the GPU hog is given a low priority, the graphical application is able to produce significantly higher frame rates than it can in the absence of priority control. He concluded: "So it appears the feature can indeed improve the user experience". It thus seems likely that some version of this work will eventually find its way into the mainline; what remains to be seen is how much it will have to change before it gets there.

Index entries for this article
Kernel	Device drivers/Accelerators
Kernel	Scheduler

Synchronized GPU priority scheduling

Posted Oct 24, 2021 16:07 UTC (Sun) by jezuch (subscriber, #52988) [Link] (49 responses)

Personally, I would go as far as to -STOP the inactive tabs. Any tab I'm not looking at which is doing any kind of CPU or GPU work is a power virus for me.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 16:09 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (42 responses)

Most end users want to get Slack (or any other in-browser chat app) notifications when they receive messages, even if the Slack (or whatever) tab is not focused.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 22:04 UTC (Sun) by k8to (guest, #15413) [Link] (30 responses)

I suspect most users would be happier with a proper native client, but we can't have nice things.

Given being forced into using slack as it is by external factors though, yes I would prefer it to be able to run in the background, although in practice I give it a sandboxed browser session all to itself.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 23:43 UTC (Sun) by developer122 (guest, #152928) [Link] (12 responses)

Or maybe everyone just wants to do everything in a browser these days.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 1:06 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (10 responses)

Personally, I am very much tired of entirely-functional mobile sites telling me "You should use our app, it's so much better" over and over again, when the only additional functionality you get with the app is notification spam.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 10:12 UTC (Mon) by nix (subscriber, #2304) [Link] (9 responses)

Of course that just means the website is out of date, because these days websites can *also* give you notification spam! (At least recent Android and all major web browsers let you shut them up again, but that's rarely used...)

Synchronized GPU priority scheduling

Posted Oct 25, 2021 14:50 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (6 responses)

There is an option in Chrome's settings to auto-deny that permission (and there's a separate option to auto-deny audio, which is also handy). I have no idea if other browsers have a similar setting.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 13:59 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (5 responses)

The problem with chrome is that if you deny a permission to a website, it tells the website the permission is denied. Causing the website to forever nag you to grant the permission.

In a better world websites (that are in general malicious and hostile software) would be given an API that always returns success.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 16:02 UTC (Tue) by fenncruz (subscriber, #81417) [Link] (1 responses)

Though someone would probably come up with some JavaScript timing attack to work out if it's a real success or a faked success.

Then somehow work out how to use that to fingerprint you to send you even more ads.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 18:13 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

I think they already have imprecise timers provided by browsers to js, to mitigate exactly this kind of attack.

Synchronized GPU priority scheduling

Posted Oct 29, 2021 9:15 UTC (Fri) by taladar (subscriber, #68407) [Link] (1 responses)

That would require the browser to simulate a convincing accept permission scenario for each of those permissions though (e.g. white noise from a microphone,...)

Synchronized GPU priority scheduling

Posted Oct 29, 2021 12:40 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Given the myriad audio issues I've seen, it could just simulate pulling from the "wrong" microphone source that happens to be silent :) . Or act as if it is muted in the audio pipeline.

Synchronized GPU priority scheduling

Posted Oct 31, 2021 21:48 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> In a better world websites (that are in general malicious and hostile software) would be given an API that always returns success.

This is exactly what happens if you deny audio permission in Chrome (it just mutes the tab, and the site is none the wiser), because audio was not designed with a "permissions" system in the first place (there's no standardized API by which Chrome, or any user agent, *could* indicate that the entire site, as a whole, is muted). Unfortunately, I don't think any browser currently does this for notifications.

Synchronized GPU priority scheduling

Posted Oct 27, 2021 2:29 UTC (Wed) by foom (subscriber, #14868) [Link] (1 responses)

Website notification spam is opt in via permission dialog, while app notification spam is opt out via a difficult-to-discover notification configuration UI. That seems like it's probably a huge advantage in "user engagement" for apps right there.

Now, I'm sure there's a way sites could detect that notifications are disabled and put up blocking popups saying "Reddit is better with notifications enabled!" which prevent you from using the site until you enable them...

And, hey, maybe they'll do that in the future for desktop users who can't install the app. Can't pass up a chance to piss off more people, right?

Synchronized GPU priority scheduling

Posted Oct 27, 2021 11:21 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> while app notification spam is opt out via a difficult-to-discover notification configuration UI.

Note that you can now long click on any notification to get access to this configuration UI.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 4:34 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link]

As a practical matter, the browser has become something pretty close to a guest operating system. The host OS still needs to be in overall control, but things like how to deal with individual tabs in the browser are better dealt with at the guest OS level rather than the host OS level.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 0:56 UTC (Mon) by roc (subscriber, #30627) [Link] (4 responses)

One very good reason to run Slack/Zoom/etc in the browser instead of a native client (or even an Electron client) is that it's guaranteed to be far more secure. Browser sandbox security is state of the art ... random client apps, not so much.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 5:37 UTC (Mon) by marcH (subscriber, #57642) [Link] (3 responses)

Corollary: when I close some annoying tab, I trust the "guest OS" (= browser) that it is totally, completely, utterly gone. With a regular OS it's hit and miss.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 6:58 UTC (Mon) by weberm (guest, #131630) [Link] (2 responses)

And so it is with the browser as well. Closing a tab does not guarantee you closing the computing context of that tab's JS.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 15:46 UTC (Mon) by marcH (subscriber, #57642) [Link]

I suspect your reference-free comment alludes to PWAs and other "Web Apps". I'm aware these (dangerously?) blur the lines but AFAIK these require some kind of explicit authorization.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 22:15 UTC (Tue) by roc (subscriber, #30627) [Link]

Not sure what you're referring to. Service Workers maybe? They can respond to certain events with no same-origin tab activity, but the browser throttles them carefully so they don't just keep running JS indefinitely. See https://stackoverflow.com/questions/29741922/prevent-serv...

Synchronized GPU priority scheduling

Posted Oct 25, 2021 13:51 UTC (Mon) by dskoll (subscriber, #1630) [Link] (11 responses)

I'm not most users, but I hate Slack in both the native app and the browser, so I generally interact with it with an IRC gateway. There are a few, but the one I've chosen is matterircd. As a bonus, this also works nicely with my workplace's Mattermost installation.

I configure the notifications I want from my IRC client (typically, just mentions of my nick.)

Synchronized GPU priority scheduling

Posted Oct 25, 2021 14:52 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (10 responses)

There's always one. https://xkcd.com/1782/

Synchronized GPU priority scheduling

Posted Oct 25, 2021 15:16 UTC (Mon) by dskoll (subscriber, #1630) [Link]

Yes, exactly! And I am that One! :)

Synchronized GPU priority scheduling

Posted Oct 26, 2021 14:01 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (8 responses)

When he wrote that, I'm sure slack didn't have the option to let your boss read everything you write.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 14:37 UTC (Tue) by zdzichu (subscriber, #17118) [Link] (7 responses)

I'm sure it had it. It's a basic feature in Data Leak Prevention. I doubt any company would select Slack for its internal communication if there wasn't a permanent record of all communication, for audit purposes.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 18:11 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (6 responses)

It didn't have it. Because I remember that xkcd was already old when the outrage about the new "feature" came.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 19:33 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (5 responses)

Perhaps I'm a bit old-fashioned, but I have always assumed that my employer can see everything I do on company-owned devices or using company-owned comms. Have I misunderstood why people were outraged, or was it really just "my employer can read things that I write on the company's time with the company's equipment"?

Synchronized GPU priority scheduling

Posted Oct 26, 2021 19:58 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (4 responses)

Yeah most people have slack on their personal phone and receive notifications in their personal time…

Plus, organizing a union gets real hard in this way.

In some countries, like Italy, there are precise rules on what the boss can check. For example if I login to my personal email address at work, using a work device, it's in any case forbidden to the boss to intercept and read that.

Work emails can be read provided that the employees have been previously informed that this is the case. Otherwise it is not legal to do so.

So you see, with slack… they just made available all the history to the boss, who couldn't previously access it, so people weren't aware of the possibility in advance, so for the boss to read is illegal in Italy. However since they can do so without anyone being notified, this will likely go unpunished.

Now you get why the controversy?

Synchronized GPU priority scheduling

Posted Oct 27, 2021 18:32 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (3 responses)

> Yeah most people have slack on their personal phone and receive notifications in their personal time…

If you're doing work on personal time, that's a totally different problem (of the form "you're not getting paid").

> Plus, organizing a union gets real hard in this way.

In every reasonable jurisdiction I've ever heard of, it's illegal for them to prevent you from organizing a union, although enforcement of this is questionable in the US due to at-will employment (but it is nevertheless the law). You're probably better off just exchanging contact info and then using something like Signal for further communication.

> In some countries, like Italy, there are precise rules on what the boss can check. For example if I login to my personal email address at work, using a work device, it's in any case forbidden to the boss to intercept and read that.

To my understanding, this has simply never been the case in the US.

> Work emails can be read provided that the employees have been previously informed that this is the case. Otherwise it is not legal to do so.

In the US, work emails are the property of the company. I don't know what Italian law says about that, but it seems very strange to me that the company is not allowed to read its own emails.

> So you see, with slack… they just made available all the history to the boss, who couldn't previously access it, so people weren't aware of the possibility in advance, so for the boss to read is illegal in Italy. However since they can do so without anyone being notified, this will likely go unpunished.

If it's enabling them to break Italian law, then I can see how that would be problematic in Italy. Slack needs to comply with local law in every jurisdiction where it operates, and so they should be aware of whatever laws Italy has. If Italian law is as you describe, then obviously Slack should not be enabling employers to violate it.

However, upthread this was described as a problem *in general*, not a problem in Italy. So I'm still not sure why this is such a big deal outside of Italy.

Synchronized GPU priority scheduling

Posted Oct 27, 2021 22:13 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

> To my understanding, this has simply never been the case in the US.

That's the case in the US. There's a Supreme Court ruling that minor use of work-related stuff for personal communication is acceptable and protected. In that case it was about the work phone used for personal calls.

Synchronized GPU priority scheduling

Posted Oct 28, 2021 21:38 UTC (Thu) by sionescu (subscriber, #59410) [Link] (1 responses)

> I don't know what Italian law says about that, but it seems very strange to me that the company is not allowed to read its own emails.

Italian jurisprudence in this regard revolves around the concept of the human dignity of the worker, and the right to privacy. I'm not even sure that the concept of ownership of emails is contemplated as such, the way it is in the US.

In practice it seems [1] that currently it might be allowed for an employer to open a mailbox only to ascertain a crime of which the employer already has a reasonable suspicion, but never preventively or for monitoring; more or less the same bar as for a police officer to do an inspection of an automobile.

[1] https://www.laleggepertutti.it/205247_lemail-aziendale-pu...

Synchronized GPU priority scheduling

Posted Nov 7, 2021 7:54 UTC (Sun) by fgrosshans (guest, #35486) [Link]

The (legal) access of an employer to their employees' electronic communications is actually restricted across the EU, even if said communications are done on company-provided devices during work hours. As a French person, I must confess that the idea that the employer could legally read all these emails sounds to me like a strange American way of doing business.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 23:44 UTC (Sun) by developer122 (guest, #152928) [Link] (4 responses)

Another simple example would be listening to music/podcasts/whatever from youtube in the background.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 7:30 UTC (Mon) by geert (subscriber, #98403) [Link] (2 responses)

> Another simple example would be listening to music/podcasts/whatever from youtube in the background.

Which Youtube may want to disallow, just like with the Android app?

Synchronized GPU priority scheduling

Posted Oct 25, 2021 7:52 UTC (Mon) by zdzichu (subscriber, #17118) [Link] (1 responses)

It is not disallowed, it is a paid feature on Android. Do you expect the same will happen on web?

Synchronized GPU priority scheduling

Posted Oct 25, 2021 8:27 UTC (Mon) by geert (subscriber, #98403) [Link]

I hope not.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 11:02 UTC (Mon) by jezuch (subscriber, #52988) [Link]

I admit this is a valid use case, but I'm doing it only when desperate ;) The embedded media player in the browser is terribly inefficient for some reason. Whenever I need to play something, I do it via youtube-dl.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 11:00 UTC (Mon) by jezuch (subscriber, #52988) [Link] (1 responses)

I get browser notifications even for tabs that are closed, so this is in fact not tied to a tab process, apparently.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 12:15 UTC (Mon) by rbtree (guest, #129790) [Link]

It's done through service workers, probably. If you're using Firefox, take a look at about:serviceworkers

https://developer.mozilla.org/en-US/docs/Web/API/Service_...

https://developer.mozilla.org/en-US/docs/Web/API/Push_API

Synchronized GPU priority scheduling

Posted Oct 25, 2021 18:09 UTC (Mon) by bartoc (guest, #124262) [Link] (1 responses)

I'm not really sure why slack should need to run code on my GPU to show me a notification.....

Synchronized GPU priority scheduling

Posted Oct 25, 2021 20:02 UTC (Mon) by mjg59 (subscriber, #23239) [Link]

-STOPing a tab is going to stop it executing code on the CPU as well as the GPU

Synchronized GPU priority scheduling

Posted Oct 25, 2021 20:04 UTC (Mon) by Wol (subscriber, #4433) [Link] (1 responses)

> Most end users want to get Slack (or any other in-browser chat app) notifications when they receive messages,

Are you sure !?!?!?

Pretty much ANY messaging app, I disable push notifications. I doubt I'm alone. If you're trying to work, push notifications pretty much guarantee you will have a screwed up workflow...

(We've just transitioned from Hangouts to Slack at work. At least hangouts didn't announce new messages when its tab was hidden. And pretty much the first thing I did with Slack was to make sure that it couldn't, either.)

Cheers,
Wol

Synchronized GPU priority scheduling

Posted Oct 26, 2021 19:34 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

If you don't like Slack, then replace it with calendar webapps, music webapps, etc.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 16:30 UTC (Sun) by ttuttle (subscriber, #51118) [Link]

You can patch your browser to do that, but I'd like YouTube Music to keep working even when I'm not looking at it. :P

Synchronized GPU priority scheduling

Posted Oct 24, 2021 18:31 UTC (Sun) by ericonr (guest, #151527) [Link] (1 responses)

I know I wouldn't like to have a tab stop loading resources while I unfocus it to do something else. Having to control which tabs automatically sleep and which don't would be a UI nightmare, so I can understand the choice of not stopping them at all.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 22:06 UTC (Sun) by k8to (guest, #15413) [Link]

Sadly, this problem exists anyway in what passes for a lot of modern web development, where the page Javascript checks for signs of interaction or visibility before loading content. Opening a tab in the background ends up requiring a reload when you go to read it.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 20:26 UTC (Sun) by roc (subscriber, #30627) [Link] (2 responses)

To be clear, browsers do aggressively throttle background tabs already. Bringing them to a complete stop is problematic for reasons discussed by others here.

Literal SIGSTOP isn't the way to do it because a process could be running multiple pages from the same site, some of which are foreground and others of which are in the background. (Splitting those out into their own processes would produce unnecessary memory bloat.)

Fixing "throttling leaks" like this GPU issue makes sense, but I'm not sure there's room to be significantly more aggressive with throttling without breaking the user experience in many cases.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 11:07 UTC (Mon) by jezuch (subscriber, #52988) [Link] (1 responses)

Yeah, the way the browser is implemented currently this is probably not a good idea, but that's something I would like to see anyway. Currently, when I see high (i.e. non-marginal) CPU usage for no apparent reason, I know this is time to open chromium process hunting season :) This happens way too often for my taste.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 22:17 UTC (Tue) by roc (subscriber, #30627) [Link]

Firefox gives you a task manager for this purpose and I assume Chrome does too.

Synchronized GPU priority scheduling

Posted Oct 24, 2021 23:39 UTC (Sun) by developer122 (guest, #152928) [Link] (6 responses)

Fun fact: that internal scheduling on the i915 and similar is done by an embedded i486 system. Because what else but x86 would intel use for power and scheduling management?

Synchronized GPU priority scheduling

Posted Oct 25, 2021 1:07 UTC (Mon) by roc (subscriber, #30627) [Link] (5 responses)

Hmm, that's wild. Thanks for the tip.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 5:38 UTC (Mon) by marcH (subscriber, #57642) [Link] (4 responses)

It's on the Internet, so it must be true :-)

Synchronized GPU priority scheduling

Posted Oct 25, 2021 15:03 UTC (Mon) by khim (subscriber, #9252) [Link] (3 responses)

If you don't trust the random web sites then you can just go and play with firmware yourself.

But the question “what else but x86 would intel use for power and scheduling management” does, apparently, have an answer: initially Intel used ARC.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 15:39 UTC (Mon) by marcH (subscriber, #57642) [Link] (2 responses)

What I don't trust at all and that no one should either is reference-free comments; in this case not even a search keyword. I believe the immense majority of comments on LWN are made in good faith but good faith does not imply fact, no more here than on social media.

I looked at your references and found none mentioned any relationship between the ME and the GPU.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 16:07 UTC (Mon) by excors (subscriber, #95769) [Link]

A more relevant reference is https://igor-blue.github.io/2021/02/10/graphics-part1.html ("Security of the Intel Graphics Stack - Part 1 - Introduction") which describes:

> The GuC is a small embedded core that supports graphics scheduling, power management and firmware attestation. It is implemented in an i486DX4 CPU (also called P24C and Minute IA), although it seems that since broadwell it has been extended to the Pentium (i586) ISA. It runs a small microkernel call μOS.

(and includes some disassembled GuC firmware as evidence).

There are also microcontrollers for H.265 and for display (see e.g. https://01.org/linuxgraphics/downloads/firmware), and I'd guess those are probably the same architecture as the GuC (to avoid redundant development effort).

NVIDIA has previously said (https://riscv.org/wp-content/uploads/2017/05/Tue1345pm-NV...) they have ~10 microcontrollers per GPU, based on a custom architecture. When Intel has similar requirements, it seems quite sensible for them to use an architecture for which they already have expertise and good development tools and no licensing costs.

Synchronized GPU priority scheduling

Posted Oct 26, 2021 22:22 UTC (Tue) by roc (subscriber, #30627) [Link]

FWIW I actually did find a detailed reference before I responded :-).

The ME and the GuC are different. I think the ME core is on a different physical chip?

Synchronized GPU priority scheduling

Posted Oct 25, 2021 4:03 UTC (Mon) by alison (subscriber, #63752) [Link] (1 responses)

Does anyone know how many GPUs support the notion of priority? The patchset is not so exciting if it affects only i915.

Synchronized GPU priority scheduling

Posted Nov 3, 2021 13:47 UTC (Wed) by BenHutchings (subscriber, #37955) [Link]

Running git grep -l DRM_SCHED_PRIORITY drivers/gpu shows several of them.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 5:57 UTC (Mon) by marcH (subscriber, #57642) [Link] (1 responses)

> this would be different from the "ionice" that already exists for block I/O but would function in a similar way.

So "nice" has been completely independent from ionice the whole time? Interesting, that would explain why I always felt that "nice" is useless and gave up trying to use it a long time ago.

> it does not take into account the effect of control groups

Makes me wonder: is defeating process priorities as easy as defeating TCP fairness? Split the work across more processes (resp. connections) and done.

I think many devices (workstations, laptops, smartphones) became single user before any serious thought ever went into a proper definition of what is a "fairness unit". I mean who cares whether tabs prioritization works only 90% of the time, just close some tabs when it fails and done. Same for that long and disk intensive job, just Ctrl-Z it when you need. I don't remember seeing anyone trying to use "nice" interactively, I suspect many people don't even know it exists.

On the other hand, I wonder how it works in the cloud. A long time ago, I experienced spurts of unusable latency with some (too) cheap cloud VM and as expected with any latency issue support was useless. Much more recently I experienced seconds long, complete shell freezes a couple times per hour on a VMware system. I was very proud to have found a super simple way to observe it: an ioping for loop! But again, latency => support did not care.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 18:20 UTC (Mon) by bartoc (guest, #124262) [Link]

Yes it is (or was) just that easy, and that's one of the reasons why cgroups came to exist, and one of the biggest reasons that systemd is worth using.

Nowadays, at least on my (fedora 35, with gnome) system when I launch an application gnome (I think gnome is doing this) will create a systemd .scope unit via systemd's dbus API, and that means the app gets its own cgroup. I just ran systemd-cgls and observed that all my firefox tab processes are in their own cgroup, helping to prevent resource starvation.

Synchronized GPU priority scheduling

Posted Oct 25, 2021 18:14 UTC (Mon) by bartoc (guest, #124262) [Link]

It's nice to see this work finally starting to get done. At #previousjob we did a lot of GPU compute tasks on a multi-user system, and, while nvidia was pushing their "nvidia docker" support at the time (which was [and I think still is] just mapping the gpu device node into the container), the benefit was always extremely minimal, since the nvidia driver components needed to match between host and container (negating many of the "packaging" benefits of docker) and the scheduling on the GPU was totally unaware of the cgroups created by the docker containers, so the anti-resource-starvation properties of containers were also nullified.

I would think that with all the interest in "cloud gaming" plenty of people are interested in fixing up GPU scheduling (although that use-case requires a lot more work, you need good virtualization support too), as right now I think resource utilization in such services is probably extremely bad.