Optional mandatory locking

By Jonathan Corbet
December 9, 2015

Files in a filesystem are globally visible data structures, so it is entirely possible that more than one process will try to modify the same file at the same time. In the absence of a way to coordinate the actions of those processes, the result will almost certainly be messy at best. One form of coordination is mandatory locking, which has been supported by Linux since nearly the beginning. A recent discussion, however, may be the beginning of the end for this venerable, if unloved, feature.

Back in the early days, Unix lacked a mechanism for serializing access to shared files. Eventually, POSIX standardized an advisory locking mechanism invoked via the fcntl() system call; with advisory locks, a process can lock a region of a file for reading or writing, the latter granting exclusive access to that region. POSIX advisory locking mostly works, modulo certain shortcomings, and it was recently improved in Linux with the addition of file-private POSIX locks. But there is one obvious problem with advisory locks: they're advisory. All it takes is one process that isn't playing by the rules and the whole scheme falls down. As has been noted over the years, there is a lot of poorly written software out there; while advisory locks can work well within a single, well-written application (think of a database manager accessing its files, for example), they cannot be relied on in situations where arbitrary programs may try to access a file.
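To make the "they're advisory" problem concrete, here is a minimal Python sketch (an illustration, not something from the discussion). It assumes a Linux or other POSIX system; the helper names lock_region(), child_ignoring_lock(), and demo() are invented for this example, while fcntl.lockf() is Python's standard wrapper around fcntl() record locks. The parent write-locks the first ten bytes of a file; a child process is refused the lock but can still overwrite the data simply by not asking.

```python
import fcntl
import os
import tempfile


def lock_region(fd, start, length, exclusive=True):
    """Try to take an advisory lock on bytes [start, start+length).

    Non-blocking: returns True on success, False if another process
    already holds a conflicting lock.
    """
    kind = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(fd, kind | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        return True
    except OSError:
        return False


def child_ignoring_lock(path):
    """Fork a child that asks for the lock, then writes regardless.

    Returns (child_got_lock, child_write_succeeded).
    """
    pid = os.fork()
    if pid == 0:
        status = 0
        try:
            fd = os.open(path, os.O_RDWR)
            if lock_region(fd, 0, 10):
                status |= 1          # bit 0: lock was granted
            if os.pwrite(fd, b"X", 0) == 1:
                status |= 2          # bit 1: write went through anyway
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wait_status)
    return bool(code & 1), bool(code & 2)


def demo():
    """Hold a write lock in the parent while a child ignores it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        parent_locked = lock_region(fd, 0, 10)
        child_locked, child_wrote = child_ignoring_lock(path)
        return parent_locked, child_locked, child_wrote, os.pread(fd, 1, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling demo() shows the parent holding the lock, the child being denied it, and the child's write landing in the "locked" region all the same.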

The answer from POSIX was mandatory locks, which can be thought of as advisory locks with more kludges added on top. If a file is made subject to mandatory locks (see below), then the locks that would have otherwise been advisory are enforced by the kernel. If a process write-locks a region of a file, any other process trying to read or write that region will be blocked until the lock is released. With mandatory locks, there is no longer a need to hope that every process accessing a file will observe the advisory locks.

Enforcing mandatory locks on every file access would be expensive, so the mandatory locks mechanism is restricted in scope. First of all, the filesystem must be mounted with the "-o mand" option to enable mandatory locks. Then any file subject to mandatory locking must be marked by setting the set-group-ID protection bit (but not the group execute bit). When those conditions are met, mandatory locks will be enforced on the file in question — to a point.
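The mode-bit requirement can be sketched in a few lines of Python. This only illustrates the "set-group-ID on, group execute off" rule described above; the function names are invented for the example, and marking a file has no effect unless the filesystem is also mounted with "-o mand" on a kernel that still supports the feature.

```python
import os
import stat


def mandatory_locking_mode(mode):
    """Return `mode` with set-group-ID set and group-execute cleared."""
    return (mode | stat.S_ISGID) & ~stat.S_IXGRP


def is_marked_for_mandatory_locking(mode):
    """True if the permission bits carry the mandatory-locking marker."""
    return bool(mode & stat.S_ISGID) and not (mode & stat.S_IXGRP)


def mark_file(path):
    """Apply the marker to an existing file (like chmod g+s,g-x).

    Note: the kernel may silently drop the set-group-ID bit if the
    caller is not privileged and not a member of the file's group.
    """
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mandatory_locking_mode(current))
```

A set-group-ID bit on a file that is also group-executable keeps its usual meaning, which is why the combination without group execute was available to be reused as a marker.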

Linux has supported mandatory locks for a long time; the document describing them was written in 1996. It compares the kernel's implementation to the equivalent in other bleeding-edge operating systems like SunOS 4.1, Solaris 2, and HP-UX 9. Linux comes off relatively well in that comparison, but that is setting a low bar; everybody's implementation of mandatory locks was evidently bug-ridden, inconsistent, and unreliable. The kernel's document itself starts off with a section (added in 2007) titled "why you should avoid mandatory locking." Among other things, in Linux, the lock restrictions are enforced only at the beginning of an operation, so operations can race with locks that are established (by another process) halfway through.

In other words, Linux claims to support mandatory locks, but that support is incomplete and racy; developers who are serious about getting user-space locking right understand this and use something else.

While working on user namespaces, Eric Biederman recently ran into another issue: a process within a namespace can apply a mandatory lock to a file, then pass a descriptor for that file to a daemon outside of the namespace. That daemon will then freeze as soon as it tries to access the file descriptor. This is the sort of denial-of-service attack that namespaces are supposed to prevent, so this behavior is a bit of a problem. It can be fixed easily enough by limiting mandatory-lock enforcement to other processes in the same namespace, but Eric also has a more far-reaching solution in mind.

As has been noted, mandatory locks are inelegant, buggy, and subject to races on multiple operating systems. Furthermore, they have been that way for decades, and nobody has made the effort to fix them. Rather than try to fix the problems at this late date, Eric suggested:

From what little I can glean we want to discourage people from using mandatory locking and to let it wither and die. A Kconfig option that allows mandatory locking to be disabled at compile time seems like the first step in making that happen. Perhaps in a decade or so when all linux distributions are setting the option we can remove the code.

The thinking here is that mandatory locking probably has almost no users at all. That might argue for its immediate removal, but that "almost" is the sticking point: breaking even a single ancient application goes against the kernel's "no regressions" rule. So the next best thing is to slowly make the feature harder to get at and to see if anything breaks.

The resulting discussion among filesystem developers was remarkably one-sided; there is little love for the mandatory-locking feature in the development community. There were some worries that Samba might rely on mandatory locking, but Jeremy Allison put those concerns to rest. So Jeff Layton quickly queued the patch and said that, in the absence of objections, he would push it during the 4.5 merge window. No such objections have been heard, so it appears that, in 4.5, the mandatory locks feature will be optional and slated for eventual (if distant) removal.



Optional mandatory locking

Posted Dec 10, 2015 4:45 UTC (Thu) by lambda (subscriber, #40735) [Link]

Just to note, mandatory locking in Samba to SMB clients causes various problems as well. At the very least, OS X applications mostly don't expect mandatory locking but get it when mounting an SMB volume, which causes all kinds of fun: Photoshop, for example, can't save a file that is selected in the Finder, since the Finder is trying to preview it and thus Photoshop can't obtain a mandatory exclusive lock.

Mandatory locking seems like it would solve problems, but in practice applications hold files open for reading for inessential operations like previewing, indexing, and virus scanning. Having those operations cause important writes to fail because they couldn't get a lock causes more problems than it solves.

Optional mandatory locking

Posted Dec 10, 2015 10:24 UTC (Thu) by cuboci (subscriber, #9641) [Link] (15 responses)

As it happens, two days ago I read up on file locking on Linux. And the conclusion I came away with was that there is absolutely no reliable way to make sure you're the only one accessing a file. The use case was something like this:

Someone uploads a file to a server. Upon completion some service does something with that file.

On the server side you have to have a way to know when the transfer is complete. In an ideal world the user would either upload the file with a temporary name and then rename it on the server or upload a second 'flag' file indicating completion. Unsurprisingly, (some) users are dumb and incapable of doing either in practice. So you have to fall back on some other mechanism.

I thought I could use locks that I would only be able to acquire once all other processes finished accessing the file. As it turns out, there's no such mechanism on Linux. Advisory locks are of no use here; mandatory locks are buggy and hard to use anyway. The only way is to wait for a certain amount of time (say, five minutes or so) and check if the file has changed in that time. But that can break down due to bad network connections, too.

So, I'm stuck with an impractical way of doing what I want that is also prone to errors just because Linux lacks proper file locking mechanisms. Sad.

Optional mandatory locking

Posted Dec 10, 2015 13:13 UTC (Thu) by philipstorry (subscriber, #45926) [Link] (8 responses)

I'm sure I'll be accused of over-engineering, but...

Use a database?

A database should have all the correct locking you'll need. Granted, you shouldn't put the file being uploaded into the database - but you could use it for the operation status flag.

It's overkill. But after having looked at file locking mechanisms I began to understand why (some) developers use databases so often, and sometimes for what are apparently trivial things.

Optional mandatory locking

Posted Dec 10, 2015 13:27 UTC (Thu) by cuboci (subscriber, #9641) [Link] (7 responses)

This is not about files I generate myself. The files I'm talking about are uploaded by customers. I have no control over that other than to notice a new file is there.

Optional mandatory locking

Posted Dec 10, 2015 15:59 UTC (Thu) by alankila (guest, #47141) [Link] (4 responses)

I'd say that your sftp/whatever server implementation should be able to generate an event when a client has finished uploading a file. This would generally be the case if you used a library that implements the protocol rather than, e.g., a separate Unix process that just dumps stuff to the filesystem.

Optional mandatory locking

Posted Dec 10, 2015 19:55 UTC (Thu) by cuboci (subscriber, #9641) [Link] (3 responses)

This is standard OpenSSH SFTP. What event is it able to generate once the upload is complete?

Optional mandatory locking

Posted Dec 10, 2015 20:19 UTC (Thu) by iabervon (subscriber, #722) [Link]

sftp-server logs transactions it performs on behalf of the client. I'm not sure if successful completion is what's at the INFO level or if that would be at a DEBUG level, but this would be a better trigger than any sort of locking, since sftp transfers can fail in the middle, and a locking-based method would either think it was done (and act on partial data) or think it was still going (and wait forever).

Optional mandatory locking

Posted Dec 10, 2015 22:42 UTC (Thu) by rotty (guest, #14630) [Link]

You could also use inotify, for example via incron, to generate an event when a file that was open for writing is closed. There might be gotchas, but in principle, it should work (I've used it for auto-converting files uploaded via SMB).
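For what it's worth, the close-after-write event can be watched for without incron. The sketch below is only an illustration under some assumptions: it is Linux-only, it calls the C library's inotify_init() and inotify_add_watch() through ctypes because Python's standard library has no inotify wrapper, and the function names and the "upload.bin" file name are made up for the example.

```python
import ctypes
import os
import struct
import tempfile

IN_CLOSE_WRITE = 0x00000008            # value from <sys/inotify.h>
_EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, len

# CDLL(None) exposes the symbols of the already-loaded C library.
_libc = ctypes.CDLL(None, use_errno=True)


def watch_directory(path):
    """Return an inotify fd that reports IN_CLOSE_WRITE events in `path`."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd = _libc.inotify_add_watch(fd, os.fsencode(path), IN_CLOSE_WRITE)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def read_closed_files(fd):
    """Block until events arrive; return names of files closed after writing."""
    data = os.read(fd, 4096)
    names = []
    offset = 0
    while offset < len(data):
        _wd, mask, _cookie, name_len = _EVENT_HEADER.unpack_from(data, offset)
        offset += _EVENT_HEADER.size
        raw = data[offset:offset + name_len]   # NUL-padded file name
        offset += name_len
        if mask & IN_CLOSE_WRITE:
            names.append(os.fsdecode(raw.split(b"\0", 1)[0]))
    return names


def demo():
    """Write and close a file in a watched directory, then read the event."""
    with tempfile.TemporaryDirectory() as directory:
        fd = watch_directory(directory)
        try:
            with open(os.path.join(directory, "upload.bin"), "wb") as f:
                f.write(b"payload")
            return read_closed_files(fd)
        finally:
            os.close(fd)
```

The event only says that some writer closed the file. It does not say the transfer succeeded, and a writer that opens and closes the file several times will produce several events.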

Optional mandatory locking

Posted Dec 12, 2015 11:24 UTC (Sat) by alankila (guest, #47141) [Link]

Probably none, because you are using a system that just dumps stuff to a Unix filesystem, so you are stuck with something like inotify/dnotify or whatever it is called today. Ideally, you'd assemble your own SFTP daemon out of reusable components, rather than using processes solving parts of the problem and then being stuck trying to discover mechanisms by which they can interoperate.

Optional mandatory locking

Posted Dec 13, 2015 2:55 UTC (Sun) by giraffedata (guest, #1954) [Link] (1 responses)

It's worth noting that even if there were some file locking function that could let you block until the file isn't open for write, relying on that is still a hack in your situation, since there's no reason the program that generates the file, over which you have no control, couldn't open and close the file multiple times in the process.

Optional mandatory locking

Posted Dec 14, 2015 11:55 UTC (Mon) by cuboci (subscriber, #9641) [Link]

I'm aware of that. Problem is, right now I have no choice but to use standard components. Talk about being stuck between a rock and a hard place.

lsof

Posted Dec 10, 2015 20:58 UTC (Thu) by abatters (✭ supporter ✭, #6932) [Link] (2 responses)

Consider scanning /proc/<PID>/fd/* or using lsof to see if the server process still has the file open.
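A rough Python sketch of the /proc scan, assuming a Linux system with /proc mounted; the function names are invented for illustration. It compares device and inode numbers rather than path strings, so it still matches if the file was opened under a different name. Without root it can only inspect processes owned by the same user.

```python
import os
import tempfile


def pids_with_file_open(path):
    """Return the set of PIDs that have `path` open, per /proc/<PID>/fd/*."""
    try:
        target = os.stat(path)
    except FileNotFoundError:
        return set()
    holders = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue                      # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue                      # process exited or access denied
        for fd in fds:
            try:
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue                  # descriptor closed while scanning
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                holders.add(int(entry))
                break
    return holders


def demo():
    """Check our own PID while a file is open, and again after closing it."""
    fd, path = tempfile.mkstemp()
    try:
        while_open = os.getpid() in pids_with_file_open(path)
        os.close(fd)
        after_close = os.getpid() in pids_with_file_open(path)
        return while_open, after_close
    finally:
        os.unlink(path)
```

Like the other approaches in this thread, the scan is inherently racy: a process can open the file again just after the check.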

lsof

Posted Dec 15, 2015 16:19 UTC (Tue) by k8to (guest, #15413) [Link]

Or, more simply, if you can trust that the filesystem will be local, make use of mtime. Yeah, it's not perfect but you can wait for the file to be 15 seconds stale and it will work as well as the other hacks.
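The mtime check is short enough to sketch in Python. The names is_quiet() and demo() are invented for the example, the 15-second threshold is just the figure mentioned above, and the sketch assumes the file's timestamps and the local clock agree, which is why a local filesystem matters.

```python
import os
import tempfile
import time


def is_quiet(path, quiet_seconds=15.0, now=None):
    """True if `path` was last modified at least `quiet_seconds` ago."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= quiet_seconds


def demo():
    """Check a file with a fixed mtime against two pretend 'now' values."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.utime(path, (1000.0, 1000.0))          # atime, mtime
        return (is_quiet(path, 15.0, now=1010.0),  # only 10 s old
                is_quiet(path, 15.0, now=1015.0))  # exactly 15 s old
    finally:
        os.unlink(path)
```

A stalled upload looks exactly like a finished one to this test, so it is a heuristic rather than a guarantee.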

lsof

Posted Dec 15, 2015 16:53 UTC (Tue) by k8to (guest, #15413) [Link]

Independently, it may be simpler to use fuser for a single file inquiry than lsof. Personally I generally struggle with the lsof flags, but I may be the odd duck.

Optional mandatory locking

Posted Dec 21, 2015 12:02 UTC (Mon) by oldtomas (guest, #72579) [Link] (2 responses)

One of my favourite options for when the transport is SSH is the "command" feature of authorized_keys.

This way you can hook yourself into the action (and even do different processing depending on your customer's credentials).

This is the way gitolite and friends work. For an example on how to do it with rsync, see [1].

Or set up a gitolite, add a few users, go into ~gitolite/.ssh/authorized_keys and follow the breadcrumbs from there.

Missing piece: convince ssh's sftp module to be called from your wrapper script. But I'd expect it to be sufficiently unixy and well-behaved as to just accept some command line parameters and then take the bulk of communication over stdio.

[1] <http://www.sakana.fr/blog/2008/05/07/securing-automated-r...>

Optional mandatory locking

Posted Dec 21, 2015 23:12 UTC (Mon) by nix (subscriber, #2304) [Link] (1 responses)

Your mention of sftp reminds me that insufficient attention is paid to ssh subsystems, probably because they're relatively undocumented. They're not tied into ssh at all -- they're *really* easy to write. A subsystem is just a process whose stdin/stdout/stderr get transparently connected to an SSH stream: all the client end has to do is run ssh -s subsystem_name user@host.

It can be more appealing than authorized_keys commands in some situations (particularly when you want to be able to do this for more than one user on the server without frotzing with all their authorized_keys files).

Optional mandatory locking

Posted Dec 22, 2015 12:55 UTC (Tue) by oldtomas (guest, #72579) [Link]

> A subsystem is just a process whose stdin/stdout/stderr get transparently connected to an SSH stream

So my hunch was right, thanks for clarifying that (gotta love the Unix Way :-)

> It can be more appealing than authorized_keys commands in some situations ([...] without frotzing with all their authorized_keys files)

The authorized_keys part serves a different and highly complementary purpose: if you want different clients to do different things depending on their identity (authentication + authorization). The possibility of "hooking in" is just a side-effect.

If you just want to hook in, perhaps substituting the sftp module by an "enhanced" one (which appropriately triggers things on transfer success/failure) would be most adequate, yes.

Optional mandatory locking

Posted Dec 10, 2015 12:44 UTC (Thu) by jnareb (subscriber, #46500) [Link] (2 responses)

File locking in MS Windows was for me one of the bigger sources of frustration with this operating system...

Optional mandatory locking

Posted Dec 10, 2015 13:49 UTC (Thu) by philipstorry (subscriber, #45926) [Link] (1 responses)

Windows NT file locking is - AIUI - entirely down to the "Windows personality" running on top of the kernel. The kernel in Windows certainly used to support POSIX-style file operations, so didn't have to enforce locking. The original design of Windows NT allowed for these "personalities", which were originally called subsystems. There was a POSIX one, an OS/2 one, and Windows itself. Both the other subsystems are dead these days, as Windows has focused more and more on just being Windows.

My understanding was that Windows file locking was a decision taken for backwards compatibility more than anything else. They decided to enforce the old DOS-style file locking (that you had if you loaded SHARE.EXE) when they went multi-user, as it was the cleanest and clearest way to do so and have existing applications understand what was going on if a file was locked.

As Windows NT was multi-user, they then went further and baked that file locking in to the OS components.

Of course, in practice none of this matters, except when you're trying to explain to someone why Windows inevitably requires a restart after a software installation/update...

Optional mandatory locking

Posted Dec 15, 2015 16:56 UTC (Tue) by k8to (guest, #15413) [Link]

Or trying to write an application which shares data with other applications.

(Not a common case, but it comes up.)

Or more likely you're trying to port some Unix software that happens to do rename swizzling on open files.

But agreed the reboot-on-update is the most common.

Optional mandatory locking

Posted Dec 10, 2015 13:12 UTC (Thu) by jlayton (subscriber, #31672) [Link]

We disable all sorts of stuff in the kernel these days in the name of "tinification" so I didn't see that allowing people to compile this out was really any different. FWIW, while I was doing some overhaul of the file locking code a while back, I did see how we could make mandatory locking race free (or at least, less racy), but I found it really hard to care enough to do the work for it.

It has always seemed like a bit of a hack anyway (you need a mount option _and_ a special mode-bit combo which is not at all intuitive), and the use-cases for it are pretty thin on the ground.

Optional mandatory locking

Posted Dec 15, 2015 7:52 UTC (Tue) by neilbrown (subscriber, #359) [Link] (3 responses)

> All it takes is one process that isn't playing by the rules and the whole scheme falls down.

This always struck me as a rather lame argument. If you have one process that isn't playing by the rules, then when it gets the mandatory write lock it can completely corrupt the file.

If you trust the other process, then advisory locking is plenty. If you don't then you give it a different UID and don't give it write permission at all. If you want an in-between relationship — “Trust, but verify” — you use IPC to a management daemon.

Optional mandatory locking

Posted Dec 16, 2015 14:45 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

One problem, of course, is that moving to that intermediate trust-but-verify scheme should really require something that is in some sense intermediate between that used by advisory locking and no-write-permission, so you could shift between them easily. Instead, you have simple code (advisory locking), a completely different *architecture* (a management daemon), and a local administrative tool (no write permission). Is there any wonder that people are annoyed by this dog's breakfast?

My dim memories of the days in the 90s when I tried to use mandatory locking are that it was similar to NFS in those days -- as in, any block was uninterruptible and unkillable. This essentially means that any bug (or attack from a malevolent untrusted program, but this was the 90s, we weren't thinking in those terms so much) elevates a possible file corruption all the way up to oh-crap-I-have-to-reboot-and-even-that-might-not-work territory. Is there any wonder nobody went near mandatory locking after one look at that?

Optional mandatory locking

Posted Dec 16, 2015 23:28 UTC (Wed) by neilbrown (subscriber, #359) [Link] (1 responses)

> One problem, of course, is that moving to that intermediate trust-but-verify scheme should really require something that is in some sense intermediate between that used by advisory locking and no-write-permission, so you could shift between them easily.

Nope. There is an enormous difference between co-operating processes and adversarial processes. Advisory locking is for friends that work together on a common goal and don't want to tread on each other's toes. IPC is for strangers with a contractual arrangement. They really are different scenarios and pretending you can drift smoothly from one to the other is a mistake.

I can agree that it would be nice if IPC were as easy as writing to a file, but I don't agree that you should be able to achieve IPC with minor modifications to code which is written to just write to a file.

Optional mandatory locking

Posted Dec 20, 2015 1:08 UTC (Sun) by nix (subscriber, #2304) [Link]

Hm, good point -- only, of course, the whole point of 'everything is an fd' and That Hideous Name was that you *should* be able to make major changes like that with as little churn as possible. I would be ever so happy if the VFS was general enough that it *was* the only IPC mechanism we needed. But that's not what we've got :(