Overlayfs features

By Jake Edge
March 29, 2017

LSFMM 2017

The overlayfs filesystem is being used more and more these days, especially in conjunction with containers. Amir Goldstein and Miklos Szeredi led a discussion about recent and upcoming features for the filesystem at LSFMM 2017.

Goldstein said that he went back to the 4.7 kernel to look at what has been added since then for overlayfs. There has been a fair amount of work in adding support for unprivileged containers. 4.8 saw the addition of SELinux support, while 4.9 added POSIX access-control lists (ACLs) and fixed file locks. 4.10 added support for cloning a file instead of copying it up on filesystems that support cloning (e.g. XFS).
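The clone-instead-of-copy idea can be illustrated from user space. What follows is only a rough sketch, not the overlayfs kernel code: it tries the Linux FICLONE ioctl() to share extents between two files and falls back to copying the bytes when the filesystem cannot clone. The helper name clone_or_copy() is made up for this example.

```python
import fcntl
import os
import shutil
import tempfile

# FICLONE request number from <linux/fs.h>: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Copy src to dst, sharing extents if possible.

    Hypothetical helper for illustration. Returns "clone" when the
    filesystem performed a reflink and "copy" when the data had to be
    copied byte by byte.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "clone"
        except OSError:
            # Cloning is unsupported here (or the files are on different
            # filesystems), so fall back to an ordinary data copy.
            shutil.copyfileobj(src, dst)
            return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lower = os.path.join(tmp, "lower.txt")
        upper = os.path.join(tmp, "upper.txt")
        with open(lower, "wb") as f:
            f.write(b"contents of the lower-layer file\n")
        print(clone_or_copy(lower, upper))
```

On a filesystem with reflink support the clone takes roughly constant time regardless of file size, which is why it is attractive for copy up of large files.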

There is ongoing work on using overlayfs to provide snapshots of directory trees on XFS. It is not clear when that will be merged, but 4.11 should see the addition of parallel copy up, which should speed up copy-up operations on filesystems that do not support cloning.

Another feature that is coming, perhaps in the 4.12 time frame, is to handle the case where an application gets inconsistent data because a copy up has occurred. Szeredi explained that if an application opens a file in the lower layer that gets copied up due to a write from some other program, the application will get only old data because it will still have that lower-layer file open. There are plans to change the read() and mmap() paths to check if a file has been copied up and change the kernel's view of the file to point at the new file.
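The stale-data problem can be approximated without overlayfs at all. The sketch below is an analogy only: it assumes that replacing a file by renaming a new copy over it is a fair stand-in for a copy up followed by a write, since in both cases the name ends up pointing at a different file than the one the reader opened. The function name stale_read_demo() is invented for the example.

```python
import os
import tempfile


def stale_read_demo():
    """Return (data seen through the old descriptor, data seen by a new open)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")
        with open(path, "w") as f:
            f.write("old")

        # Stands in for the application that opened the lower-layer file.
        reader = open(path)

        # Stand-in for copy up plus a write by another program: a new file
        # is created and then renamed over the original name.
        replacement = path + ".new"
        with open(replacement, "w") as f:
            f.write("new")
        os.replace(replacement, path)

        with reader:
            seen_by_old_fd = reader.read()
        with open(path) as f:
            seen_by_new_open = f.read()
        return seen_by_old_fd, seen_by_new_open


if __name__ == "__main__":
    print(stale_read_demo())
```

The descriptor opened before the replacement keeps returning the old contents, which is the behavior the proposed read() and mmap() changes are meant to eliminate for overlayfs.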

But Al Viro was concerned that it would change a fundamental behavior that applications expect. If a world-readable file is opened, then has its permission changed to exclude the reader (which causes a copy up), the application would not expect errors at that point, but this solution would change that. Szeredi suggested that the open of the upper file could be done without permission checks, which Viro thought might work for some local filesystems, but not for upper layers on remote filesystems.
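The expectation Viro described is easy to see on an ordinary filesystem. This is a minimal sketch, assuming a regular local filesystem rather than an overlayfs mount: permissions are checked when the file is opened, so removing read permission afterward does not affect a descriptor that is already open. The function name read_after_chmod() is made up for the example.

```python
import os
import tempfile


def read_after_chmod():
    """Open a readable file, revoke all permissions, then read via the open fd."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "world_readable")
        with open(path, "w") as f:
            f.write("data")
        os.chmod(path, 0o644)

        with open(path) as reader:
            # On overlayfs, a metadata change like this triggers a copy up.
            os.chmod(path, 0o000)
            try:
                # Access was checked at open() time, so this still succeeds.
                return reader.read()
            finally:
                os.chmod(path, 0o600)


if __name__ == "__main__":
    print(read_after_chmod())
```

If overlayfs were to reopen the copied-up file with normal permission checks, the read in the middle of this sequence could fail, which is the change in behavior that worried Viro.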

But Bruce Fields wondered if the behavior could even be changed the way Szeredi described. There could be applications that rely on the current behavior, or else no one is really using overlayfs. Viro said that he didn't believe any applications use the behavior. But, he noted, he has broken things in the past where the breakage did not surface, and bugs were not filed, until years later when users actually started testing their applications with the broken kernels.

Szeredi pointed out that these changes will make overlayfs more POSIX compliant and that there are other changes to that end that are coming. Fields was still concerned that the semantics are going to change in subtle ways over the next few years while people are actually using the filesystem. If people use it enough, there will be bugs filed from changing the behavior. But Jeff Layton said that even if it were noticed in some applications, it would be hard to argue against bringing overlayfs into POSIX compliance.

Goldstein said that there have also been a lot of improvements in the overlayfs test suite. There is support for running those tests from xfstests, so he asked the assembled filesystem developers to run them on top of their filesystems. He also mentioned overlayfs snapshots, which kind of turns overlayfs on its head, making the upper layer into a snapshot, while the lower layer is allowed to change. Any modifications to the lower-layer objects cause a copy-up operation to preserve the contents prior to the change, while any file-creation operation causes a whiteout in the snapshot. So when the lower layer is viewed through the snapshot, it appears just as the filesystem did at snapshot time.
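The snapshot semantics can be modeled with a pair of dictionaries. This is a toy in-memory model rather than anything resembling the real implementation; the class name Snapshot, its methods, and the WHITEOUT marker are all invented here. It covers only the two cases described above: modifying an existing file and creating a new one.

```python
# Marker meaning "this name did not exist at snapshot time".
WHITEOUT = object()


class Snapshot:
    """Toy model: 'live' is the changing lower layer, 'upper' is the snapshot."""

    def __init__(self, live):
        self.live = live   # lower layer; keeps changing after the snapshot
        self.upper = {}    # snapshot layer: preserved contents and whiteouts

    def write(self, name, data):
        """Create or modify a file in the live (lower) layer."""
        if name not in self.upper:
            if name in self.live:
                # First modification: copy the old contents up to the snapshot.
                self.upper[name] = self.live[name]
            else:
                # New file: hide it from the snapshot view with a whiteout.
                self.upper[name] = WHITEOUT
        self.live[name] = data

    def view(self):
        """Return the tree as it looked at snapshot time."""
        merged = {}
        for name in set(self.live) | set(self.upper):
            if name in self.upper:
                if self.upper[name] is not WHITEOUT:
                    merged[name] = self.upper[name]
            else:
                merged[name] = self.live[name]
        return merged


if __name__ == "__main__":
    live = {"a.txt": "version 1"}
    snap = Snapshot(live)
    snap.write("a.txt", "version 2")   # triggers a copy up of "version 1"
    snap.write("b.txt", "brand new")   # triggers a whiteout
    print(snap.view())
    print(live)
```

Only the first change to each name costs anything; later writes to the same file go straight to the live layer because the snapshot already holds what it needs.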



Overlayfs features

Posted Mar 30, 2017 12:05 UTC (Thu) by Gollum (guest, #25237)

I'd love to know if overlayfs can be used with an upper tmpfs to allow minimal writes to the underlying filesystem, except when desired/triggered.

Use case being the Raspberry Pi's, or routers with flash file systems, or NAS's booting from SD cards, etc, to minimise writes to the card, but still allow the operator/controller to sync the upper FS to the lower at some point in time.

As an example, the obvious one is to have /var/log/ be on a tmpfs. Of course, you need to size the tmpfs to be big enough for your logfiles, and you also lose the log files if the device reboots for any reason. An alternative might be to mount the tmpfs as the upper layer of an overlayfs, on top of the /var/log/ directory. This allows you to access older logs as necessary. Then you could use a bind mount to make the underlying /var/log available as /underlay/var/log, and periodically rsync the files from /var/log/ (the overlay, with new data in tmpfs) to /underlay/var/log/.

The missing part is the warning that overlayfs doesn't deal well with changes to the underlying FS, as well as overlayfs being informed/recognising that files that were copied up to tmpfs have been copied down. Unless putting the "copy down" part into overlayfs itself makes sense?
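For reference, the mount sequence for the setup described above might look roughly like the following. It is an untested sketch that only builds and prints the commands rather than running them; the tmpfs location /run/log-overlay, the 64m size, and the function name are assumptions made for illustration. The upper and work directories both live on the tmpfs because overlayfs requires them to be on the same filesystem. The periodic rsync step is left out, since it changes the lower layer underneath a mounted overlay, which is the very problem noted above.

```python
import shlex


def overlay_log_commands(log_dir="/var/log",
                         underlay="/underlay/var/log",
                         tmpfs_dir="/run/log-overlay",
                         size="64m"):
    """Build (but do not run) the mount commands for a tmpfs-backed overlay.

    All default paths and the size are illustrative assumptions.
    """
    upper = f"{tmpfs_dir}/upper"
    work = f"{tmpfs_dir}/work"
    options = f"lowerdir={log_dir},upperdir={upper},workdir={work}"
    return [
        ["mkdir", "-p", underlay, tmpfs_dir],
        # Keep the on-flash directory reachable once it is covered by the overlay.
        ["mount", "--bind", log_dir, underlay],
        ["mount", "-t", "tmpfs", "-o", f"size={size}", "tmpfs", tmpfs_dir],
        # upperdir and workdir must be on the same filesystem (the tmpfs).
        ["mkdir", "-p", upper, work],
        ["mount", "-t", "overlay", "overlay", "-o", options, log_dir],
    ]


if __name__ == "__main__":
    for argv in overlay_log_commands():
        print(shlex.join(argv))
```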

Mimic overlayfs low-write frequency with large cache?

Posted Mar 31, 2017 6:19 UTC (Fri) by alison (subscriber, #63752)

> I'd love to know if overlayfs can be used with an upper tmpfs to allow minimal writes to the underlying filesystem, except when desired/triggered.

Filesystems, and often the storage drivers themselves, have lots of tunable parameters. Couldn't the same behavior as you describe with overlayfs be mimicked just with a large pending-write cache and perhaps control of features like BKOPS in recent eMMCs?

Mimic overlayfs low-write frequency with large cache?

Posted Mar 31, 2017 7:36 UTC (Fri) by Gollum (guest, #25237)

I guess it could be!

My specific use case is an HP Gen 8 Microserver with an on-board SD Card reader, which I am using as the boot volume for a NAS. So eMMC options are not applicable. I'll have to have a look to see what other options are available.

Thanks for the comment!

Mimic overlayfs low-write frequency with large cache?

Posted Mar 31, 2017 9:20 UTC (Fri) by ajb (subscriber, #9694)

If your logs are generated by journald, you can also just increase SyncIntervalSec (although that doesn't affect CRIT, ALERT or EMERG levels)

Mimic overlayfs low-write frequency with large cache?

Posted Apr 1, 2017 3:00 UTC (Sat) by alison (subscriber, #63752)

> I'll have to have a look to see what other options are available.

Inspect /proc/sys/vm. The sysctl command provides an easy interface to manipulate these variables. More information is found in the kernel source tree in Documentation/sysctl/vm.txt.
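As a starting point, a few of the writeback-related entries can be read directly. This is a small sketch; the choice of the four dirty_* files is mine and is not a complete list of what matters, and the function name is made up. Entries that cannot be read are reported as None.

```python
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")

# A hand-picked subset of writeback tunables, chosen for illustration.
WRITEBACK_KNOBS = (
    "dirty_background_ratio",
    "dirty_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_knobs(names=WRITEBACK_KNOBS, base=VM_DIR):
    """Return {name: value string, or None if the entry cannot be read}."""
    values = {}
    for name in names:
        try:
            values[name] = (base / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_knobs().items():
        print(f"vm.{name} = {value}")
```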