Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 16, 2018 22:48 UTC (Mon) by glenn (subscriber, #102223)
Parent article: The second half of the 4.17 merge window

I fear this change to CLOCK_MONOTONIC may induce floods of activity post-wake, as was the case with Google Chromecast not too long ago: https://www.theregister.co.uk/2018/01/18/chromecast_flood.... Timers set against CLOCK_MONOTONIC would be susceptible, no?

Also, are timers (i.e., timerfd()) against CLOCK_MONOTONIC_ACTIVE supported? If not, my code base may need a lot of rework...


Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 5:51 UTC (Tue) by epa (subscriber, #39769)

If I read the article correctly, it could be summarized as “CLOCK_MONOTONIC has been renamed to CLOCK_MONOTONIC_ACTIVE, but the old name has now become an alias for something else”. It all seems a bit strange, especially given the declared intention never to break user space.

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 9:05 UTC (Tue) by tglx (subscriber, #31301)

We are well aware that it might break user space and are prepared to revert it. In hindsight we should never have introduced CLOCK_BOOTTIME, but back in the day not all architectures had been converted to the generic timekeeping infrastructure.

We have discussed that back and forth and finally decided to give it a try. If you or anyone else observes wreckage please let us know immediately.

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 9:18 UTC (Tue) by epa (subscriber, #39769)

OK, makes sense. Maybe the unambiguous new CLOCK_MONOTONIC_ACTIVE should be added first (and backported to stable kernels) so applications that really want that can be prepared for the change. But this is just an ignorant suggestion.

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 19, 2018 12:47 UTC (Thu) by lynxeye (subscriber, #90890)

I haven't observed it yet, but I can definitely see a place where things will break: most DRM drivers specify ioctl timeouts as absolute timeouts in terms of CLOCK_MONOTONIC. So if GPU operations get suspended and are only submitted to the hardware after resume, userspace will see a lot of its waits time out while the GPU is still happily working through its queue of work.

This is unexpected and I bet most of the graphics userspace will fall over if it hits such a condition.

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 27, 2018 3:38 UTC (Fri) by njs (subscriber, #40338)

Traditionally nanosleep() and the timeouts in select(), epoll_wait(), etc., have all used the CLOCK_MONOTONIC clock, so that if you sleep for 10 seconds, and after 5 seconds the system is suspended for an hour, then after it wakes up again the process keeps sleeping for another 5 seconds.

Did you keep the relationship between sleeping syscalls and CLOCK_MONOTONIC – so that e.g. a nanosleep() before suspend will now wake up immediately on resume? Or did you keep the old sleeping syscall semantics, and break the relationship with CLOCK_MONOTONIC?

As far as I know, all correct event loops currently depend on the assumption that sleeping syscalls and CLOCK_MONOTONIC match each other. For example, if I set a timeout for T seconds from now, the event loop will:

- use (clock_gettime(CLOCK_MONOTONIC) + T) to calculate the absolute time of the timeout
- later, when it calls epoll_wait(), it'll choose the timeout by doing (deadline - clock_gettime(CLOCK_MONOTONIC))
- then it passes that timeout to epoll_wait()

Right now that's sufficient to ensure that epoll_wait() will return when clock_gettime(CLOCK_MONOTONIC) == deadline, or thereabouts... but if CLOCK_MONOTONIC starts counting suspend time, while epoll_wait() doesn't, then we'll start sleeping too long and missing our deadlines by an arbitrary amount.
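The three steps above can be sketched roughly as follows. This is only an illustration, not the code of any particular event loop: the names `remaining` and `wait_for_deadline` are made up, and `wait` is a stand-in for epoll_wait() so the logic can be followed (and tested) without a real epoll instance.

```python
import time


def remaining(deadline, clock=time.monotonic):
    """Seconds left until `deadline` on `clock`, never negative."""
    return max(0.0, deadline - clock())


def wait_for_deadline(timeout, wait=time.sleep, clock=time.monotonic):
    """Block until `timeout` seconds have elapsed on `clock`.

    `wait` is a hypothetical stand-in for epoll_wait(). Like the real
    call it may return early, so the timeout is recomputed each pass.
    """
    deadline = clock() + timeout            # step 1: absolute deadline
    while True:
        left = remaining(deadline, clock)   # step 2: deadline - now
        if left == 0.0:
            return deadline
        wait(left)                          # step 3: hand it to the sleeping call
```

The loop is only correct if `wait` and `clock` advance together. If `clock` counts suspend time and `wait` does not, `wait(left)` can return long after `deadline` has passed on `clock`.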

Or at least, that's what the event loop I maintain does, which is why I want to know :-).

(As an added bonus, if I *do* have to switch to CLOCK_MONOTONIC_ACTIVE, that's going to be a hassle. Currently the event loop is implemented in Python, and the Python standard library obviously doesn't yet have any bindings for CLOCK_MONOTONIC_ACTIVE. Given where we are in the release cycle, the earliest they could be added is 1.5-2 years from now. In the meantime I guess it becomes temporarily impossible to implement an event loop in Python on Linux; you have to write part of it in C, and that's a huge obstacle for distribution :-(.)

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 27, 2018 3:52 UTC (Fri) by njs (subscriber, #40338)

> In the meantime I guess it becomes temporarily impossible to implement an event loop in Python on Linux; you have to write part of it in C, and that's a huge obstacle for distribution :-(.

On further investigation, it looks like it's not quite as bad as I thought – CLOCK_MONOTONIC_ACTIVE can be queried from Python with:

time.clock_gettime(12)

(Untested, since I don't have a kernel with CLOCK_MONOTONIC_ACTIVE support).
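A slightly more defensive version of that call might look like the sketch below. It assumes the clock id really is 12, as quoted above, and that a kernel without the clock rejects the id with an error; the helper name `monotonic_active` and the fallback to `time.monotonic()` are illustrative choices, not an established API.

```python
import time

# Assumption: 12 is the clock id quoted above for CLOCK_MONOTONIC_ACTIVE.
# Python's time module has no named constant for it.
CLOCK_MONOTONIC_ACTIVE = 12


def monotonic_active():
    """Return (seconds, used_fallback).

    Tries the raw clock id first and falls back to time.monotonic()
    when the platform or kernel does not accept it.
    """
    try:
        return time.clock_gettime(CLOCK_MONOTONIC_ACTIVE), False
    except (AttributeError, OSError):
        # AttributeError: no clock_gettime() on this platform.
        # OSError: the kernel rejected the clock id.
        return time.monotonic(), True
```

The second element of the result tells the caller whether the value came from the fallback clock, so an event loop could log that once at startup.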

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 17:52 UTC (Tue) by k8to (guest, #15413)

I'm trying to figure this out.

Are we worried that the time jumping forward may expire many timers at once causing programs to do work? That seems correct. It's fairly easy for programs with many expired timers to amortize the cost of doing the work those timers represent, and they probably need to have that logic in place anyway if they hope to self-regulate.

If you're instead worried about many different programs having expiring timers and fighting over resources, that seems like a problem that requires a co-ordinating facility. Grand Central Dispatch from Apple would be one approach. Of course, in a way, the operating system's basic task switching functions are another.

The other option would be some software that thinks it needs to do some work for every interval window, so that if 1000 intervals have passed, it insists on doing 1000 times the work. That behavior is either required (if, for example, there's a requirement to look at each time interval's data sample), or is fundamentally broken. I'm not sure how this particular change really affects either of those two situations.

Am I missing something?

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 19:59 UTC (Tue) by glenn (subscriber, #102223)

> Are we worried that the time jumping forward may expire many timers at once causing programs to do work?

This is my concern. I've used CLOCK_MONOTONIC timers to trigger periodic tasks, such as transmit a heartbeat/health-status message, run a watchdog check, etc. Another use-case could be a timer that drives a game loop or animation. The logic surrounding these routines is simple because the (old) CLOCK_MONOTONIC is simple. The software built up around such timers might hide the underlying timer mechanisms (e.g., a timerfd file descriptor), so higher-level application-level software might be unable to reprogram the underlying timer (or cancel it).

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 20:07 UTC (Tue) by k8to (guest, #15413)

But for these scenarios, it's no big deal. Your timers will expire, and you'll send a heartbeat or watchdog check after being asleep for an hour. Maybe your games draw some frames a tiny bit earlier than they need to. It should all settle down rather quickly. For most cases you would want your timers to expire after being asleep an hour.

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 17, 2018 20:53 UTC (Tue) by glenn (subscriber, #102223)

> But for these scenarios, it's no big deal. Your timers will expire, and you'll send a heartbeat or watchdog check after being asleep for an hour. Maybe your games draw some frames a tiny bit earlier than they need to. It should all settle down rather quickly. For most cases you would want your timers to expire after being asleep an hour.

For one-shot timers, I believe that you are correct. My concern is with periodic timers.

Consider the use case of a timerfd with a 10 Hz periodic timer on CLOCK_MONOTONIC. Your application logic invokes a callback for every increment of the timerfd counter. Before you suspend, the timerfd count is 0, so you have no callbacks to execute. You wake from suspension after an hour. The timerfd counter has been fast-forwarded and now holds a backlog of 36,000 expirations (10 per second for 3,600 seconds). If your application logic is simple, you'll invoke your callback in a burst of 36k invocations as you burn the counter back down to zero.
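To make the arithmetic concrete, here is a small simulation of that scenario. It does not touch a real timerfd; `backlog` and `naive_drain` are hypothetical helpers that only model the expiration count and the "one callback per expiration" logic described above.

```python
def backlog(rate_hz, suspended_seconds):
    """Expirations a periodic timer accumulates while nobody reads it."""
    return int(rate_hz * suspended_seconds)


def naive_drain(expirations, callback):
    """Invoke `callback` once per expiration: the simple logic above."""
    for _ in range(expirations):
        callback()
    return expirations


if __name__ == "__main__":
    calls = []
    pending = backlog(rate_hz=10, suspended_seconds=3600)
    naive_drain(pending, lambda: calls.append(1))
    print(pending, len(calls))
```

With a 10 Hz timer and one hour of suspend, the naive drain runs the callback once for every missed tick, back to back.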

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 18, 2018 5:47 UTC (Wed) by k8to (guest, #15413)

Is this really a problem?

> read(2)
> If the timer has already expired one or more times since its
> settings were last modified using timerfd_settime(), or since
> the last successful read(2), then the buffer given to read(2)
> returns an unsigned 8-byte integer (uint64_t) containing the
> number of expirations that have occurred.

If you get a read() of 36,000 and you execute your logic 36,000 times your program is just busted. Runaway could occur without this quirk.
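One way to avoid that, sketched under the assumption that the application can tolerate skipped ticks: decode the 8-byte count that read(2) returns and run the logic once, passing the number of missed expirations along. The function names are made up for illustration, and `buf` stands for the bytes already read from the timerfd.

```python
import struct


def decode_expirations(buf):
    """Decode the 8-byte, host-endian uint64 that read(2) returns for a timerfd."""
    (count,) = struct.unpack("=Q", buf)
    return count


def on_timer_readable(buf, callback):
    """Run `callback` once per read, telling it how many ticks elapsed."""
    missed = decode_expirations(buf)
    if missed:
        callback(missed)
    return missed
```

A callback written this way sees a single call with `missed == 36000` after the hour-long suspend, and can decide for itself whether any catch-up work is needed.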

Possible side-effects of CLOCK_MONOTONIC change?

Posted Apr 18, 2018 17:33 UTC (Wed) by glenn (subscriber, #102223)

> If you get a read() of 36,000 and you execute your logic 36,000 times your program is just busted. Runaway could occur without this quirk.

That is a fair point. However, this kind of defensive programming was unnecessary under the old CLOCK_MONOTONIC contract. Moreover, if code needs to be updated to detect unexpected timer backlogs, the developer has to make a judgement call on how many backlogged timers are too many: It may not always be clear if a backlog is due to system suspension or if an application is simply unable to service its timers fast enough (either due to its own execution behaviors, or due to those of other processes inducing CPU starvation). Setting a timer against CLOCK_MONOTONIC_ACTIVE may be an easier countermeasure. In either case, userspace has to change.
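For illustration, the "how many is too many" judgement call could be expressed as a cap on catch-up work. This is only a sketch: `plan_catch_up` and its `max_catch_up` threshold are hypothetical, and the right value (5 here is arbitrary) depends entirely on the application.

```python
def plan_catch_up(expirations, max_catch_up=5):
    """Split a timer backlog into (callbacks to run, ticks to drop).

    `max_catch_up` is an arbitrary, application-chosen limit. The count
    alone cannot tell a suspend-induced backlog from one caused by CPU
    starvation, so both are treated the same way here.
    """
    if expirations < 0:
        raise ValueError("expiration count cannot be negative")
    run = min(expirations, max_catch_up)
    return run, expirations - run
```

Small backlogs are serviced in full, while an hour-long suspend at 10 Hz would run five callbacks and drop the other 35,995.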