Suppressing SIGBUS signals

By Jonathan Corbet
June 25, 2021

The mmap() system call creates a mapping for a range of virtual addresses; it has a long list of options controlling just how that mapping should work. Ming Lin is proposing the addition of yet another option, called MAP_NOSIGBUS, which changes the kernel's response when a process accesses an unmapped address. What this option does is relatively easy to understand; why it is useful takes a bit more explanation.

Normally, when a process performs an operation involving memory, it expects the desired data to be read from or written to the requested location. Sometimes, though, things can go wrong, resulting in the delivery of a fatal (by default) signal to the process. A "segmentation violation" (SIGSEGV) signal is generated in response to an attempt to access a valid memory address in a way that is contrary to its protection — writing to read-only memory, for example. Attempting to access an address that is invalid, instead, results in a "bus error" (SIGBUS). Bus errors can be provoked in a number of ways, including using an improperly aligned address or an address that is not mapped at all. If a process uses mmap() to create a mapping that extends beyond the end of the backing file, attempts to access the pages past the end of the file will result in SIGBUS signals.

If, however, a memory range has been mapped with the proposed MAP_NOSIGBUS flag, SIGBUS signals will no longer be generated in response to an invalid address that lies within the mapped area. Instead, the guilty process will get a new page filled with zeroes. If the mapped area is backed by a file on disk, the new page will not be added to that file. To a first approximation, the new option simply makes SIGBUS signals go away, with the process never even knowing that it had tried to access an invalid address.

OK...but why?

This behavior may seem like a strange thing to want. One would not normally expect a mapped area to contain invalid addresses within it, and one ordinarily wants to know if a program is generating and using invalid addresses. As it happens, mapped areas can contain invalid addresses in one normal use case: if that area is mapping a file, and it extends beyond the end of the file on disk. Attempts to access pages beyond the end of the file will generate a SIGBUS signal; this situation can be avoided by extending the file before attempting to access it through the mapping.
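The "extend the file first" approach can be sketched in C. This is only an illustration under stated assumptions: `map_then_extend()` is a hypothetical helper, the temporary file lives in `/tmp` (assumed writable), and the mapping is MAP_SHARED so that stores reach the file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a byte stored through the mapping, in a
 * page that lay beyond end-of-file when mmap() was called, can be read back
 * from the file once the file has been extended; returns 0 otherwise. */
int map_then_extend(void)
{
    char path[] = "/tmp/extend-demo-XXXXXX";   /* assumption: /tmp is writable */
    long page = sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    int ok = 0;

    assert(page > 0);
    if (fd < 0)
        return 0;
    unlink(path);                               /* the descriptor keeps the file alive */

    if (ftruncate(fd, page) == 0) {             /* file is one page long */
        /* The mapping covers two pages, so its second page is past EOF. */
        char *map = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* Touching map[page] at this point would raise SIGBUS.
             * Growing the file first makes the page valid. */
            if (ftruncate(fd, 2 * page) == 0) {
                char seen = 0;
                map[page] = 'x';
                ok = pread(fd, &seen, 1, (off_t)page) == 1 && seen == 'x';
            }
            munmap(map, 2 * page);
        }
    }
    close(fd);
    return ok;
}
```

On a Linux system the function should return 1, since the store lands in the newly valid part of the file rather than in a detached page.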

MAP_NOSIGBUS is explicitly incompatible with that way of working, though; since the zero-filled pages that it creates in response to invalid addresses are not connected to the backing file, it makes extending the file without redoing the mapping impossible. Instead, this option exists to address another problem: graphical clients that can, accidentally or intentionally, cause a compositor to crash.

Graphical applications often have to communicate large amounts of data to the compositor. An efficient way of doing this can be to map a file and pass a descriptor to the compositor; that file (which can live in a memory-only filesystem) becomes a shared-memory segment between the two processes. If, however, the client process then calls ftruncate() to shorten the file, the result is a mapping (in the compositor) that extends beyond the end of that file. If the compositor tries to access the shared-memory segment beyond the new end of the file, it will get a SIGBUS signal; in the absence of measures taken to the contrary, that will cause the compositor to crash, which is the sort of thing that user-experience developers usually make at least a modest effort to avoid. The SIGBUS signal can be caught and handled in the compositor, but that can be complex and hard to get right.
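The failure mode described here can be reproduced within a single process. The sketch below is not compositor code: `read_after_shrink()` is a hypothetical helper that plays both roles, and the file is created in `/tmp` as an assumption. It maps two pages read-only, shrinks the file to one page, then reads from the second page with a SIGBUS handler installed so the signal can be observed rather than killing the program.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(recover_point, 1);   /* abandon the faulting read */
}

/* Hypothetical helper: returns 1 if reading a page that was truncated away
 * raised SIGBUS, 0 if the read succeeded, -1 if the setup failed. */
int read_after_shrink(void)
{
    char path[] = "/tmp/shrink-demo-XXXXXX";    /* assumption: /tmp is writable */
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    volatile char *view;
    volatile int result = -1;
    int fd;

    assert(page > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, 2 * page) != 0) {
        close(fd);
        return -1;
    }

    /* "Compositor" side: a private, read-only view of the whole file. */
    view = mmap(NULL, 2 * page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    /* "Client" side: shrink the file underneath the existing mapping. */
    if (ftruncate(fd, page) == 0) {
        if (sigsetjmp(recover_point, 1) == 0) {
            char c = view[page];    /* first byte of the page that is now past EOF */
            (void)c;
            result = 0;
        } else {
            result = 1;             /* arrived here from the signal handler */
        }
    }

    sigaction(SIGBUS, &old, NULL);
    munmap((void *)view, 2 * page);
    close(fd);
    return result;
}
```

Jumping out of the handler is enough for a demonstration. A real compositor would also have to work out which client buffer faulted and unwind whatever it was doing with it, which is where the complexity mentioned above comes from.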

As Simon Ser, who works on Wayland compositors, noted back in April, there is another mechanism for passing data between the two processes: the memfd abstraction. A memfd can be "sealed", meaning that the creator cannot shrink it as described above (or, indeed, change it at all); the recipient, knowing that the segment will not change unexpectedly, can access it safely. But, as Ser points out, no compositor requires the use of sealed memfds because there are clients that are unwilling or unable to use them. So compositors must either jump through the SIGBUS-handling hoops or risk filling the disk with embarrassing core dumps.
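For comparison, here is a minimal sketch of sealing, assuming a Linux system whose C library provides memfd_create(). The helper name `sealed_memfd_refuses_shrink()` and the memfd name "sealed-demo" are made up for illustration. Once F_SEAL_SHRINK has been applied, an attempt to shorten the file is expected to fail with EPERM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* memfd_create() and the F_SEAL_* constants */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if shrinking a sealed memfd is refused with
 * EPERM, 0 if the shrink went through, -1 on any other failure. */
int sealed_memfd_refuses_shrink(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;
    int fd;

    assert(page > 0);
    fd = memfd_create("sealed-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;

    /* Size the buffer, then forbid shrinking and any further seal changes. */
    if (ftruncate(fd, 2 * page) == 0 &&
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_SEAL) == 0) {
        errno = 0;
        if (ftruncate(fd, page) == 0)
            result = 0;
        else
            result = (errno == EPERM) ? 1 : -1;
    }
    close(fd);
    return result;
}
```

A recipient that wants this guarantee can check the seals on a descriptor it has been handed with fcntl(fd, F_GET_SEALS) before mapping it.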

But if the compositor could map a segment in a way that wouldn't create SIGBUS signals on invalid addresses, this whole problem would go away. Ser suggested looking at the __MAP_NOFAULT flag supported by OpenBSD as a possible solution. At the beginning of June, Lin responded with an implementation of MAP_NOSIGBUS, which differs from __MAP_NOFAULT in a number of ways. The initial implementation only worked for the in-memory tmpfs filesystem, but Hugh Dickins objected, saying that it should apply to any mapping; the second (and current) revision reflects that criticism and works regardless of the backing store behind a mapping.

Limitations

One significant limitation of the current implementation is that it only works for MAP_PRIVATE mappings — that seems like it could be a fatal flaw for a mechanism that is meant for use with mappings shared between clients and a compositor. But, as Ser explained, private mappings will work in almost all cases; since the data transfer is one-way from the client to the compositor, the mapping can be read-only on the compositor side. The big exception is screen capture, which will still have to be handled specially as long as shared mappings are not supported. So the solution is not complete, but 90% is a big step in the right direction.

The second version of the patch set has seen relatively little discussion; it seems that the developers who care about it are relatively happy with its current condition (though Kirill Shutemov was heard to grumble a bit about "one-user features"). There are never any guarantees, but there does seem to be a reasonable chance that this change could be merged as early as the 5.14 release.

Index entries for this article
Kernel	System calls/mmap()

Suppressing SIGBUS signals

Posted Jun 25, 2021 17:37 UTC (Fri) by pm215 (subscriber, #98099) [Link] (3 responses)

Another kernel ABI change with no documentation? That's certainly a good way to keep it a one-user feature...

Suppressing SIGBUS signals

Posted Jun 25, 2021 22:11 UTC (Fri) by angelsl (subscriber, #144646) [Link] (2 responses)

It's not actually accepted yet.

It will be documented once it gets into mainline.

Suppressing SIGBUS signals

Posted Jun 26, 2021 12:24 UTC (Sat) by pm215 (subscriber, #98099) [Link] (1 responses)

I'm not a kernel dev, but when I do review KVM ABI changes (from my POV as a userspace consumer of the ABI) I always want to *start* with the documentation patch. That tells me what the intention is, which I can then use to check whether the implementation matches the intention. Documentation that appears after the fact doesn't permit that and I think it tends to encourage "thing that's easy" or "first thing I thought of" ABI design rather than "ABI that makes sense to consumers".

And indeed sometimes "documentation after the fact" becomes "documentation never" or "documentation years later" -- I've run into a few cases of that when implementing QEMU's syscall emulation layer. The point of leverage for ensuring contributions meet a minimum standard is before patches are accepted; once they go in that leverage disappears.

Suppressing SIGBUS signals

Posted Jun 26, 2021 19:21 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Agreed. It's the same reason commit messages are so important for diffs: it helps to introduce others to what the change does as a whole, how it is used, and to (ideally) say why it is useful. Documentation is even more important because it should be what people look for first when trying to figure out how to do something. Commit messages are just harder to dig up via manpages…

Suppressing SIGBUS signals

Posted Jun 25, 2021 20:13 UTC (Fri) by roc (subscriber, #30627) [Link] (27 responses)

This would be useful for Rust (and I guess other languages that take safety seriously). A private, read-only mapping of a file to get a &[u8] *should* be safe, but it technically isn't because of this exact issue --- something could truncate the backing file, causing some innocent user of the &[u8] to crash. MAP_NOSIGBUS would make it really safe. So this sounds great.

Suppressing SIGBUS signals

Posted Jun 25, 2021 21:11 UTC (Fri) by zlynx (guest, #2285) [Link] (17 responses)

I don't know that I would call pages of zeros with no other indication of an error "safe."

I believe the program can get a few other signals from memory map access too. IO errors on a disk read for example. I just looked it up, and it is still a SIGBUS but if you use sigaction and a SA_SIGINFO handler, the si_code field will give details about what exactly happened.
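A sketch of the SA_SIGINFO approach mentioned above, for the simple case of reading past end-of-file. The helper name `sigbus_code_past_eof()` and the `/tmp` location are assumptions. On Linux the si_code seen in this case should be BUS_ADRERR; the sketch makes no claim about what other causes, such as I/O errors, report.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf after_fault;
static volatile sig_atomic_t fault_code;

/* Three-argument handler, selected by SA_SIGINFO: info carries the details. */
static void record_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    fault_code = info->si_code;
    siglongjmp(after_fault, 1);
}

/* Hypothetical helper: returns the si_code delivered with the SIGBUS raised
 * by a read past end-of-file, 0 if no signal arrived, -1 on setup failure. */
int sigbus_code_past_eof(void)
{
    char path[] = "/tmp/sicode-demo-XXXXXX";    /* assumption: /tmp is writable */
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa, old;
    volatile char *view;
    volatile int result = -1;
    int fd;

    assert(page > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, page) != 0) {             /* file: one page */
        close(fd);
        return -1;
    }
    view = mmap(NULL, 2 * page, PROT_READ, MAP_PRIVATE, fd, 0);   /* map: two pages */
    if (view == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    if (sigsetjmp(after_fault, 1) == 0) {
        char c = view[page];                    /* page beyond end-of-file */
        (void)c;
        result = 0;
    } else {
        result = fault_code;
    }

    sigaction(SIGBUS, &old, NULL);
    munmap((void *)view, 2 * page);
    close(fd);
    return result;
}
```

The same siginfo_t also carries si_addr, the faulting address, which a handler could use to decide which mapping was involved.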

Suppressing SIGBUS signals

Posted Jun 26, 2021 0:57 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (11 responses)

The word "safe" has two different meanings in this context:

1. I know, at compile time, that dereferencing a given pointer will never cause the program to crash, corrupt the heap, or otherwise misbehave. The pointer is "safe."
2. I know, at compile time, that my program will handle all error cases within a given subroutine. The subroutine is "safe."

The problem is that mmap basically can't support (2) if the map is backed by a real file, unless you want to catch and handle SIGBUS. But nobody *wants* to handle SIGBUS, especially in a "let's try to recover and keep executing" way (as opposed to a "let's print a cryptic error message and call abort" way). (1), on the other hand, is solved quite effectively with this patch, and anyone who really does want (2) can use something other than mmap to read the file (which is no worse than the status quo).

Suppressing SIGBUS signals

Posted Jun 26, 2021 1:21 UTC (Sat) by sbaugh (guest, #103291) [Link] (4 responses)

I think Rust considers crashing the program to be "safe" (certainly I do); that's what it does on OOM, after all.

So I think this is misguided: I think SIGBUS signals on bad memory accesses are already safe, in Rust terms.

Suppressing SIGBUS signals

Posted Jun 26, 2021 2:04 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (3 responses)

Meh, that's a matter of perspective. Crashing is only the default behavior, and they are working to make it easier to avoid crashing on OOM. There is no technical reason* why OOMing needs to crash, in a language with manual memory management. You could instead have fallible allocation and require the application to handle an OOM condition explicitly.

* You cannot control what the OOM killer does (although the sysadmin can, to some extent). But the OOM killer could even kill a completely unrelated process, and is not part of Rust itself, so let's ignore it. Besides, since Rust is a systems language, it can be used to write kernelspace code that is exempt from the OOM killer altogether.

Suppressing SIGBUS signals

Posted Jun 27, 2021 5:57 UTC (Sun) by dancol (guest, #142293) [Link] (2 responses)

> why OOMing needs to crash, in a language with manual memory management. You could instead have fallible allocation and require the application to handle an OOM condition explicitly.

Not every system is an overcommit system. Windows isn't. Linux with vm.overcommit_memory=2 isn't either.

Anyway, Rust wouldn't be having this problem at all if it had just adopted exceptions for error handling.

Suppressing SIGBUS signals

Posted Jun 27, 2021 12:14 UTC (Sun) by mpr22 (subscriber, #60784) [Link]

> Anyway, Rust wouldn't be having this problem at all if it had just adopted exceptions for error handling.

I dare say it would be having others, though.

Suppressing SIGBUS signals

Posted Jun 27, 2021 21:26 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> Not every system is an overcommit system. Windows isn't. Linux with vm.overcommit_memory=2 isn't either.

Yes, that's why I included the footnote about the OOM killer being out of scope. It's not part of Rust in the first place.

Suppressing SIGBUS signals

Posted Jun 26, 2021 17:16 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (5 responses)

I don't know. `&[u8]` might be fine, but if I mmap'd something like `&[RawSpriteData]`, I think I'd *prefer* to crash because all-zeros isn't very useful to me.

> I know, at compile time, that dereferencing a given pointer will never cause the program to crash, corrupt the heap, or otherwise misbehave.

Rust (or any language for that matter) can only do so much. If, say, the hypervisor ends up fiddling with the VM's page tables, programs can certainly crash no matter how "safe" the language or runtime is. If the RAM gets a bit twiddled by a cosmic ray, all bets are similarly off in most cases. These kinds of things are outside of a language's control and are, IMO, purely in the realm of "just how paranoid do you need to be?" when coding any project in any language. JoeSchmo webapp? Users are probably trained to refresh on weird errors these days and such things are fine with systemd's restart logic. Sending a rover to Mars? Random RAM flips are very relevant, better have redundant hardware.

> I know, at compile time, that my program will handle all error cases within a given subroutine. The subroutine is "safe."

Not sure how you plan to handle the "power lost" error condition, but I'd be interested :) .

Suppressing SIGBUS signals

Posted Jun 27, 2021 4:12 UTC (Sun) by roc (subscriber, #30627) [Link] (4 responses)

> `&[u8]` might be fine, but if I mmap'd something like `&[RawSpriteData]`, I think I'd *prefer* to crash because all-zeros isn't very useful to me.

You can have a safe mmap function that returns &[u8]. Then you could use something like https://docs.rs/safe-transmute/0.11.2/safe_transmute to transmute to a different kind of reference if that's safe.

You're right that in extremes, safety gets a bit fuzzy. It's nice to be able to push the boundaries pretty far out though.

Suppressing SIGBUS signals

Posted Jun 27, 2021 17:17 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (3 responses)

I don't think it is semantically possible to avoid making a copy. The length of the mmap is equal to the length on disk, and because Linux went with the no-mandatory-file-locking model, that length can change at any time, regardless of whether you're using Rust, C, or a magnetic needle and a steady hand. If you want to end up with a "safe" object, you need it to not get suddenly truncated/zero-filled while you're halfway through processing it.

Making a copy also resolves issues of the form "What if someone decides to scribble all over the file without changing its length, while I'm halfway through processing it?" More generally, this falls into the "validate, then process" model of doing things - you can't validate something if it can change out from under you!

Suppressing SIGBUS signals

Posted Jun 28, 2021 2:50 UTC (Mon) by ilammy (subscriber, #145312) [Link] (2 responses)

I wonder if some sort of copy-on-write private file mappings could help with that by avoiding copying the entire range (since mapped files tend to be huge). Like, you map a file and get a snapshot of its contents. Your process writing to that memory copies a page just for you and never syncs that page with the actual file. If any other process touches the file in any way via non-cow mapping or normal file ops, then the original data is copied and other process cow-mappings are dissociated from the file.

Suppressing SIGBUS signals

Posted Jun 30, 2021 11:57 UTC (Wed) by hmh (subscriber, #3838) [Link] (1 responses)

That looks like it would have been a much more generally useful (and expensive, complicated, etc) feature to implement than suppressing sigbus or zero-extending...

Mmap snapshot (maybe with a read-only result if that would be much easier or cheaper to implement and still cover the use cases).

But until someone offers to do that work...

Suppressing SIGBUS signals

Posted Jul 1, 2021 2:16 UTC (Thu) by ilammy (subscriber, #145312) [Link]

After doing some research, this idea is implemented in some systems as a MAP_COPY flag for mmap() [1]. That's basically the semantics of MAP_PRIVATE mapping, but with a “snapshot” guarantee that the data won't change, making your copy genuinely private, but avoiding the actual copy if the data does not change.

While thinking of how this could be implemented, I realized that it could be quite expensive, complicated, and full of “spooky action at a distance”. If some process grabs or releases a MAP_COPY mapping of a file, then all existing processes must be made aware of it (e.g., by turning all mappings for everyone RO and catching page faults). Any change by the other process forces said process to expend some kernel time doing the copy of the page for the benefit of some other process, which is not particularly fair.

Turns out, adding MAP_COPY into Linux was discussed several times [2][3], but it's still considered a pretty stupid idea.

[1]: https://www.gnu.org/software/hurd/glibc/mmap.html
[2]: https://yarchive.net/comp/linux/map_copy.html
[3]: https://www.spinics.net/lists/linux-mm/msg119339.html

Suppressing SIGBUS signals

Posted Jun 26, 2021 9:41 UTC (Sat) by roc (subscriber, #30627) [Link] (4 responses)

Finding zeroes in some pages is perfectly reasonable behaviour. After all, the file could have just contained zeroes there to start with. Rust safety only requires that &[u8] reference a range of memory that doesn't change and doesn't cause a crash if you access it.

If MAP_NOSIGBUS still triggers SIGBUS on I/O errors then I think it is misnamed.

Suppressing SIGBUS signals

Posted Jun 26, 2021 18:02 UTC (Sat) by izbyshev (guest, #107996) [Link] (3 responses)

> Rust safety only requires that &[u8] reference a range of memory that doesn't change and doesn't cause a crash if you access it.

Consider the following sequence of events:
1. Program A maps the file with MAP_PRIVATE | MAP_NOSIGBUS.
2. Program A accesses page P from that file.
3. Program B truncates the file to zero.
4. Page P is evicted from RAM due to memory pressure.
5. Program A accesses page P again.

How is the resulting page fault handled? If a zero-filled page is mapped, then the contents of the range of memory do change silently.

Suppressing SIGBUS signals

Posted Jun 27, 2021 4:14 UTC (Sun) by roc (subscriber, #30627) [Link]

That's a good point.

Suppressing SIGBUS signals

Posted Jun 27, 2021 14:24 UTC (Sun) by ocrete (subscriber, #107180) [Link] (1 responses)

Isn't the content of a memory range changing silently always possible if another program is modifying the file?

Suppressing SIGBUS signals

Posted Jun 28, 2021 0:07 UTC (Mon) by izbyshev (guest, #107996) [Link]

Yes (the man page says that behavior of MAP_PRIVATE is unspecified if the underlying file changes, but this is what happens on Linux in practice). The ftruncate()-based scenario is only somewhat interesting because it might seem like it only changes the file metadata, but in fact the data in the truncated tail also changes (as observed by mmap() users).

To be clear, the same behavior within the last page of the file is already possible with current kernels: ftruncate() that doesn't remove the last page completely will look like filling its remainder with zeros (though it's also formally unspecified).

So, overall, I don't see how MAP_NOSIGBUS would help with unsafety of Rust's &[u8] referring to mmap()'ed range.

Suppressing SIGBUS signals

Posted Jun 26, 2021 1:30 UTC (Sat) by alison (subscriber, #63752) [Link] (2 responses)

> something could truncate the backing file

Why would an application sharing memory with the compositor call ftruncate()? The article explains why NOSIGBUS solves a problem, but not why the problem arises in the first place. What is the point of truncating the backing file? Is the system writing the backing file hitting a quota, or is the swap space getting full?

Suppressing SIGBUS signals

Posted Jun 26, 2021 2:00 UTC (Sat) by NYKevin (subscriber, #129325) [Link]

There are several answers to that:

- The operating system† should protect the user from incompetently-written software. Poorly written software should not cause any part of the operating system to crash.
- Depending on your security model, the application may be considered untrusted, in which case it should not be allowed to bring down the compositor (as that would be a denial-of-service attack).
- The application calling ftruncate might not even be the same application which created the file, if the latter leaves a directory entry lying around.

† Outside of the FOSS world, the compositor is universally considered to be part of "the operating system."

Suppressing SIGBUS signals

Posted Jun 26, 2021 7:29 UTC (Sat) by ncm (guest, #165) [Link]

Agree. It seems like there might be an actual problem here, but this seems like a very ham-fisted way to address it.

It seems like the compositor should be in full control of its file, and the client process should be physically unable to do anything with that file except write data into pages mapped from it. Then the client could mess up and crash itself, as is its privilege, but the compositor could not.

Suppressing SIGBUS signals

Posted Jun 27, 2021 5:51 UTC (Sun) by dancol (guest, #142293) [Link] (1 responses)

Or the language could just do the right thing and install a signal handler and report the access failure as a language-specific exception or panic --- but the entire Linux ecosystem has been inexplicably reluctant to change or improve anything whatsoever relating to signals, so it's very hard for a general-purpose library to use a signal handler in a way that doesn't stomp over someone else's desired use.

Suppressing SIGBUS signals

Posted Jun 27, 2021 19:35 UTC (Sun) by jayalane (guest, #133964) [Link]

You can’t change signal handling semantics much at all without breaking a ton of old stuff. Pretty sure Linus would not approve.

Suppressing SIGBUS signals

Posted Jun 30, 2021 15:52 UTC (Wed) by miquels (guest, #59247) [Link] (3 responses)

A process that panic()s or crashes on invalid memory accesses is inherently safe, isn't it?

The problem with Rust and mmap is that it just doesn't work if you write from one thread or process and read from another - that's instant UB (Undefined Behaviour) and that is _really_ unsafe.

I think the only safe way to read from or write to mmap'ed memory in Rust is to use atomics - mmap it as AtomicU<whatever> and only read/write using the load/store operations on the atomic values.

Suppressing SIGBUS signals

Posted Jun 30, 2021 17:59 UTC (Wed) by Wol (subscriber, #4433) [Link] (2 responses)

> A process that panic()s or crashes on invalid memory accesses is inherently safe, isn't it?

And if you're in the aircraft where that software is controlling the plane's fly-by-wire system ... ?

Cheers,
Wol

Suppressing SIGBUS signals

Posted Jun 30, 2021 18:03 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

I've said this elsewhere, but "safe" is a lot like security: it depends on your environment (threat model in security). I don't care much about cosmic rays flipping bits (in the "I should consider this case" sense) and count that as outside of the safety model of my code. NASA can't work with such reckless abandon and have *far* higher concerns about such things. But languages don't save you there: redundancy does.

Aircraft have similarly high standards and I imagine the solution there is along the lines of rip out (or somehow poison) the `panic` function and let the linker tell you that you have a case that isn't *explicitly* handled.

Suppressing SIGBUS signals

Posted Jul 7, 2021 6:48 UTC (Wed) by ssmith32 (subscriber, #72404) [Link]

Yes, I imagine redundancy would be a better solution. Better to have the fly-by-wire fail over to a backup than try to limp along in some weird state.

Of course, time and again, cost-cutting works against that, and you have cars that share data buses between control systems & the entertainment system :(

Suppressing SIGBUS signals

Posted Jun 26, 2021 9:06 UTC (Sat) by Bigos (subscriber, #96807) [Link] (4 responses)

I understand the premise that given "bad applications" the compositor authors want a mechanism to protect themselves from wrong client behavior that is set up solely by the compositor. It means less work on the whole ecosystem. But that just puts more custom logic into the kernel for no other reason than "let the kernel deal with it".

I thought Wayland was an extensible protocol that would allow one to fix such API mistakes and have clients and servers adopt the change gradually. However, nothing like that has been done for 7 years it seems. When can we expect a Wayland successor that fixes this, then?

Suppressing SIGBUS signals

Posted Jun 26, 2021 11:00 UTC (Sat) by Wol (subscriber, #4433) [Link] (2 responses)

> I understand the premise that given "bad applications" the compositor authors want a mechanism to protect themselves from wrong client behavior that is set up solely by the compositor.

Except it's nothing to do with "wrong client behaviour". You could be running a perfect client and then I log in on the same computer, truncate a file you're using, and bring YOUR desktop crashing down because of what I'VE done.

That's a vulnerability to a malicious actor, and this is (presumably) intended to fix that vulnerability (not bug).

Cheers,
Wol

Suppressing SIGBUS signals

Posted Jun 26, 2021 14:45 UTC (Sat) by Bigos (subscriber, #96807) [Link]

What I meant is to fix all legitimate applications (doing a change in mesa might solve the majority of it) and then eventually require sealed memfds instead of supporting the mechanism that is, as you mentioned, prone to abuse.

However, it seems the ship has already sailed and no one wants to change the protocol to accommodate this safer way of passing buffers around. Which is sad.

Suppressing SIGBUS signals

Posted Jul 5, 2021 8:54 UTC (Mon) by immibis (subscriber, #105511) [Link]

I'm sure there are many ways to bring things down if you can arbitrarily truncate files I'm using.

Suppressing SIGBUS signals

Posted Jun 26, 2021 14:08 UTC (Sat) by re:fi.64 (subscriber, #132628) [Link]

Afaik this isn't really a solvable issue for any protocol without just not supporting sending these at all, which would be a performance hit.

Suppressing SIGBUS signals

Posted Jun 26, 2021 23:10 UTC (Sat) by ju3Ceemi (subscriber, #102464) [Link] (5 responses)

Why are they using a memory-backed file as shared memory?
Why not use shared memory as shared memory ?

Suppressing SIGBUS signals

Posted Jun 27, 2021 8:09 UTC (Sun) by randomguy3 (subscriber, #71063) [Link] (4 responses)

Are you referring to MAP_SHARED + MAP_ANONYMOUS memory? As I understand it, you can only share such memory with your child processes.

Or perhaps you mean using memfd? That possibility was mentioned in the article, which also notes "no compositor requires the use of sealed memfds because there are clients that are unwilling or unable to use them".

I am interested in the reasons behind this "unable or unwilling", though.
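A small sketch of the MAP_SHARED + MAP_ANONYMOUS case mentioned above: the region has no file behind it, so a child created by fork() simply inherits it. `shared_anon_roundtrip()` is a hypothetical name used only for this illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the value the parent reads from a shared
 * anonymous page after a forked child stored 42 there, or -1 on failure. */
int shared_anon_roundtrip(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;
    int status;
    int *cell;
    pid_t pid;

    assert(page > 0);
    cell = mmap(NULL, page, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    *cell = 0;

    pid = fork();
    if (pid == 0) {
        *cell = 42;     /* child: the store goes to the shared page */
        _exit(0);
    }
    /* Parent: waitpid() orders the child's store before our load. */
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = *cell;

    munmap(cell, page);
    return result;
}
```

There is no descriptor here that could be handed to an unrelated process such as a compositor, which fits the point made above about such memory being shareable only with children.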

Suppressing SIGBUS signals

Posted Jun 27, 2021 8:13 UTC (Sun) by randomguy3 (subscriber, #71063) [Link] (2 responses)

I went and looked up the link:

> The reason is that there will always exist clients which are either old (and predate file sealing) or refuse to use Linux-only APIs (they don't use memfd and file sealing, instead they use e.g. shm_open). Requiring sealed memfds in compositors would break these clients.
> I don't believe the situation is about to change.
> Rather than requiring changes in all compositors *and* clients, can we maybe only require changes in compositors? For instance, OpenBSD has a __MAP_NOFAULT flag. When passed to mmap, it means that out-of-bound accesses will read as zeroes instead of triggering SIGBUS. Such a flag would be very helpful to unblock the annoying SIGBUS situation.

Suppressing SIGBUS signals

Posted Jun 28, 2021 1:25 UTC (Mon) by viro (subscriber, #7872) [Link] (1 responses)

Create and open a file on tmpfs, in directory not accessible to anyone else. mmap() it. Pass the descriptor over SCM_RIGHTS datagram. And unlink() the damn thing, to keep the clutter down. Or use O_TMPFILE, for that matter, since that's no more Linux-specific than new mmap flags would be.

Sure, somebody malicious and controlling a process with your credentials will be able to screw you over, but then they could attach gdb to your client and have their merry way with it.
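The recipe above can be sketched as follows. Everything here is illustrative: the helper names are hypothetical, both ends of the socket live in one process to keep the example short, and `/tmp` stands in for what would more likely be a private directory on tmpfs in a real client.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_control {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

static void init_msg(struct msghdr *msg, struct iovec *iov, union fd_control *ctl)
{
    memset(msg, 0, sizeof *msg);
    memset(ctl, 0, sizeof *ctl);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctl->buf;
    msg->msg_controllen = sizeof ctl->buf;
}

/* Send one descriptor as SCM_RIGHTS ancillary data plus a dummy byte. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    init_msg(&msg, &iov, &ctl);
    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof fd);
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns it, or -1 if none arrived. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    init_msg(&msg, &iov, &ctl);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof fd);
    return fd;
}

/* Hypothetical helper: returns the byte the receiving side reads through its
 * own mapping of the passed descriptor ('W' on success), or -1 on failure. */
int pass_unlinked_buffer(void)
{
    char path[] = "/tmp/buffer-demo-XXXXXX";    /* assumption: stand-in for a private dir */
    long page = sysconf(_SC_PAGESIZE);
    int sv[2], fd, got, result = -1;

    assert(page > 0);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;
    fd = mkstemp(path);                         /* created with mode 0600 */
    if (fd >= 0) {
        if (ftruncate(fd, page) == 0) {
            char *w = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
            if (w != MAP_FAILED) {
                w[0] = 'W';                     /* "client" fills the buffer */
                if (send_fd(sv[0], fd) == 0 && (got = recv_fd(sv[1])) >= 0) {
                    char *r = mmap(NULL, page, PROT_READ, MAP_SHARED, got, 0);
                    if (r != MAP_FAILED) {
                        result = r[0];          /* "compositor" reads it */
                        munmap(r, page);
                    }
                    close(got);
                }
                munmap(w, page);
            }
        }
        unlink(path);   /* the name is not needed once the descriptor is passed */
        close(fd);
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Note that the sender still holds a descriptor it can pass to ftruncate(), so this recipe limits who else can reach the file but does not stop the client itself from shrinking it.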

Suppressing SIGBUS signals

Posted Jul 2, 2021 19:28 UTC (Fri) by daniels (subscriber, #16193) [Link]

> Create and open a file on tmpfs, in directory not accessible to anyone else. mmap() it. Pass the descriptor over SCM_RIGHTS datagram. And unlink() the damn thing, to keep the clutter down.

That’s literally what we do, yeah.

Suppressing SIGBUS signals

Posted Jun 27, 2021 13:52 UTC (Sun) by anton (subscriber, #25547) [Link]

I would have thought about System V shared memory stuff (shmget and friends), and I was wondering about that myself when I read the article. Not that I have experience with it, but it seems that it was made for this. I now see that there is also POSIX shared memory (see man shm_overview), but it supports ftruncate(), so that probably does not help.

Suppressing SIGBUS signals

Posted Jun 27, 2021 9:42 UTC (Sun) by eru (subscriber, #2753) [Link]

> responded with an implementation of MAP_NOSIGBUS, which differs from __MAP_NOFAULT in a number of ways

I wonder why? Would it not be useful if *ix -style systems implemented similar extensions similarly? That would help porting and also they would probably eventually get standardized.

Suppressing SIGBUS signals

Posted Jun 27, 2021 13:49 UTC (Sun) by dullfire (guest, #111432) [Link] (8 responses)

Having now thought about it for a while, this seems like a pretty bad approach to me.

It seems to be targeted at exactly one use case (as others have mentioned).
Furthermore, it would actually only make the intended use case worse. Now clients can behave badly, and the compositor won't even be able to notice. No logs, no disconnecting the misbehaving client, just users with "ugly windows", and they may not even know what program it was. Sounds like a nightmare to debug.

Furthermore, for non-RGB buffer layouts, I'm not even sure what that would end up looking like on the screen.

Again, having thought about it for a while, it seems a sane thing to do would be: handle the SIGBUS by noting the address where it's failing, and then mapping in a private page of all zeros (or ones, or whatever) over the now-invalid mapping. Then the compositor can log something like "client XXXX has given a malformed buffer" or whatever. It can even take corrective action, like dropping the client connection (and thus implicitly destroying the bad display content), or maybe even popping up an error message for the user (depending on settings, use case, etc.).

I don't think papering over the problem (and then closing your eyes to it ever existing) is going to make better user experiences, or more stable software. However that seems to be the approach this patch series wants to take.
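The handler-based approach suggested above might look roughly like this in C. It is a userspace sketch, not the MAP_NOSIGBUS patch: the names are hypothetical, the file lives in `/tmp` as an assumption, and the handler patches any SIGBUS address it is given, whereas a compositor would first have to check that the address belongs to a client buffer. mmap() is also not on the POSIX list of async-signal-safe functions, so calling it from a handler relies on it being a plain system call on Linux.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;                          /* set before the handler is installed */
static volatile sig_atomic_t patched_pages;     /* what a compositor could log or act on */

/* Map a private zero page over the faulting page, then return so that the
 * faulting instruction is retried and now succeeds. */
static void patch_with_zero_page(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t base = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    void *p;

    (void)sig;
    (void)ctx;
    p = mmap((void *)base, (size_t)page_size, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        _exit(1);                               /* nothing sensible left to do */
    patched_pages++;
}

/* Hypothetical helper: returns the byte read from a page past end-of-file
 * if exactly one page had to be patched, -1 otherwise. */
int read_past_eof_patched(void)
{
    char path[] = "/tmp/patch-demo-XXXXXX";     /* assumption: /tmp is writable */
    struct sigaction sa, old;
    volatile char *view;
    int fd, value;

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, page_size) != 0) {        /* file: one page */
        close(fd);
        return -1;
    }
    view = mmap(NULL, 2 * page_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = patch_with_zero_page;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    patched_pages = 0;
    value = view[page_size];                    /* faults once, then reads the zero page */

    sigaction(SIGBUS, &old, NULL);
    munmap((void *)view, 2 * page_size);
    close(fd);
    return patched_pages == 1 ? value : -1;
}
```

Unlike a kernel-side flag, this leaves the process with a record of the fault (here just a counter), which is the hook for the logging or disconnecting described above.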

Suppressing SIGBUS signals

Posted Jun 27, 2021 14:46 UTC (Sun) by matthias (subscriber, #94967) [Link] (7 responses)

> Further more it would actually only make the intended us case situation worse. Now clients can behave badly, and the compositor won't even be able to notice. No longs, no disconnected the miss behaving client, just users with "ugly windows",

This was always the case. If the client changes the contents of the buffer to garbage, the compositor will not notice and the window will display garbage. Nothing new here. The change is that the client truncating the buffer now will look exactly the same as the client clearing the buffer at an inappropriate time.

> and they may not even know what program it was. Sounds like a nightmare to debug.

Probably the program whose window displays garbage.

Suppressing SIGBUS signals

Posted Jun 28, 2021 8:35 UTC (Mon) by dullfire (guest, #111432) [Link] (6 responses)

> This was always the case. If the client changes the contents of the buffer to garbage, the compositor will not notice and the window will display garbage. Nothing new here. The change is that the client truncating the buffer now will look exactly the same as the client clearing the buffer at an inappropriate time.

You appear to be conflating two very different issues here. It's true the compositor has never really been able to tell if the client has been displaying garbage. However, the source of SIGBUS is essentially a protocol error. The client has promised the compositor that there was a buffer here, and instead there was a hand grenade. SIGBUS always allowed the compositor to detect and handle such protocol violations. The proposed changes remove that ability (for the case of compositors that use it, so I guess it being "opt-in" is better than being "opt-out").

> Probably the program whose window displays garbage.

Except the compositor doesn't know there's even a problem, so it can't say "hey, it's connection XXX", or "it registered with string YYY". Not all programs have server-side decoration (and IIRC the Wayland standard is client-side decoration). So if the user wasn't looking at that window, or doesn't remember which one it is (or never knew, in the case of dialogs and a few other things), then it becomes difficult to figure out. If the window is on a "task list"-like WM UI element, that might help, but not all windows are placed there.

It's probably not impossible, but it surely wouldn't be easy.

To sum it up (again): The approach here appears to be "make it work 'safely' by sweeping all the problems under the rug". I don't think that's a good long term solution.

Suppressing SIGBUS signals

Posted Jun 28, 2021 13:18 UTC (Mon) by kleptog (subscriber, #1183) [Link] (4 responses)

Isn't the real problem here the use of signal? The compositor is just doing a memory lookup, at assembly level there is no such thing as a "failed read". The kernel can generate a signal but it can't fail the original instruction, the MOV has to return something. The compositor would have to do a siglongjmp() because otherwise it's going to generate a SIGBUS for every single instruction accessing the missing pages. And siglongjmp() is pretty tricky at the best of times.

I can understand that developers would prefer it if the kernel just returned zeros and the compositor rechecked the size of the file after the copy completes. There are just fewer moving parts that way.
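The "recheck the size afterward" step could look something like the sketch below. The function names are hypothetical, and the proposed MAP_NOSIGBUS flag is deliberately not used here since it is not part of any released kernel; the code only shows the fstat() check that would follow the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the file behind fd still covers expected_len bytes,
 * 0 if it is now shorter, and -1 if fstat() fails. */
int buffer_still_complete(int fd, size_t expected_len)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (st.st_size >= 0 && (size_t)st.st_size >= expected_len) ? 1 : 0;
}

/* Demo helper (hypothetical) standing in for the client side: create an
 * unlinked temporary file of len bytes and return its descriptor. */
int make_buffer_file(size_t len)
{
    char path[] = "/tmp/buffer-demo-XXXXXX";
    int fd;

    assert(len > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The compositor would call buffer_still_complete() after copying out of the mapping and discard the frame if the result is not 1. A check like this only sees the size at the moment of the fstat() call, so a client that truncates the file and extends it again during the copy would not be caught.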

Someone else here noted that userfaultfd() does something similar, so maybe there's an answer there. The compositor thread would be blocked and the handler could remap the pages to avoid the SIGBUS retriggering.

Suppressing SIGBUS signals

Posted Jun 28, 2021 14:52 UTC (Mon) by dullfire (guest, #111432) [Link] (2 responses)

> Isn't the real problem here the use of signal? The compositor is just doing a memory lookup, at assembly level there is no such thing as a "failed read".

As I said in my original post: I'm pretty sure the compositor's SIGBUS handler could mmap over the faulting region, and then set a flag that the given client is misbehaving.

When the compositor returns from the signal handler, it would finish its current function/call stack (using the newly mmapped region... possibly zero filled, possibly full of kitten pictures), eventually get high enough to notice that the connection had been marked as bad, then either discard the processing it had done, or perhaps show one frame with a garbage window.

That way you resume from the error AND get notification of the buggy/malicious client. And you can have meaningful error messages.

Suppressing SIGBUS signals

Posted Jun 28, 2021 17:29 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (1 responses)

> As I said in my original post: I'm pretty sure the compositor's SIGBUS handler could mmap over the faulting region, and then set a flag that the given client is misbehaving.

signal-safety(7) does not list mmap as async-signal-safe, at least on my system. But that just means that POSIX doesn't require it to be safe. I suppose it's possible that Linux does implement an async-signal-safe mmap as an extension?

(Unfortunately, we can't just use the usual trick of "set a flag and return, then do the real work from the main event loop" because we need to fix the memory problem *before* we return from the SIGBUS handler, or else we'd need to pause the offending thread, call mmap from a different thread, etc.)

Suppressing SIGBUS signals

Posted Jun 28, 2021 18:54 UTC (Mon) by dullfire (guest, #111432) [Link]

I would not imagine "mmap" could ever be fully async-signal-safe (after all, you are screwing with the page table). However, for the very narrow case of replacing a no-longer-valid mapping, it should be fine.

Seeing as mmap is typically a thin wrapper around a syscall (AFAIK there is no currently known way for this to not be a kernel task) most of this must naturally be done in kernel, which negates most of the issues.

However, since you brought it up, I'm assuming you are implying that being portable/standards-conformant is important. So for that we would need a change to the standard. But that's probably more of a 'paper' change (while it's possible that some POSIX.1-2008-compliant implementations implement mmap(2) in a way that would be unsafe for this proposed usage, I kind of doubt it).

A Linux-only change that just papers over the issue seems like a poor solution.

Suppressing SIGBUS signals

Posted Jun 28, 2021 14:53 UTC (Mon) by excors (subscriber, #95769) [Link]

> The compositor is just doing a memory lookup, at assembly level there is no such thing as a "failed read". The kernel can generate a signal but it can't fail the original instruction, the MOV has to return something.

I don't think that's really true. At the assembly level, the outcome of the MOV instruction is that it either loads a value into the register *or* triggers an exception. E.g. if you look in the ARMv8-A Architecture Reference Manual, it explicitly defines the LDR instruction in terms of the "AArch64.MemSingle" operation which can call the "AArch64.TakeException" operation, which sets up the exception state then calls "EndOfInstruction" to stop any further processing of the LDR instruction, so it won't write anything to the destination register.

Once the exception is triggered, the kernel is free to do whatever it wants - update the page tables then jump back to the MOV instruction to retry it, manually update the register state then jump back to the instruction after the MOV, call a signal handler, etc. Anyone writing user-space assembly code has to be aware of that, e.g. there are often ABI rules about stack pointers that are specifically there to allow the kernel to interrupt your thread at any point and run a signal handler on its current stack. So that's not something you can safely ignore when working at assembly level.

Suppressing SIGBUS signals

Posted Jun 29, 2021 10:34 UTC (Tue) by matthias (subscriber, #94967) [Link]

> You appear to be conflating two very different issues here. It's true the compositor has never really been able to tell if the client has been displaying garbage. However, the source of SIGBUS is essentially a protocol error. The client has promised the compositor that there was a buffer here, and instead there was a hand grenade. SIGBUS always allowed the compositor to detect and handle such protocol violations.

The client has promised that there is a buffer that does not change while the compositor is using it. Any change would be a protocol error. Only in the very special case that the change is truncate, this protocol error could be detected by the compositor. With the proposed change of semantics, this special case just looks like other cases where the client changes the buffer at the wrong time.

> [Finding out which program displays garbage] is probably not impossible, but it surely wouldn't be easy.

For debugging purposes there should be a possibility to ask the compositor to which program some window belongs. After all, you also want to debug cases where a program just opens a window with garbage in it. Probably a much more common case than a program calling truncate on the buffer.

Suppressing SIGBUS signals

Posted Jun 27, 2021 20:11 UTC (Sun) by dancol (guest, #142293) [Link]

It occurs to me that one way to address the original problem with SIGBUS is to wire up userfaultfd to the mapping somehow. Then the compositor could arrange for the zero substitution --- or any other policy --- on its own without hard-coding zero-fill policy into the kernel.