A revoke() update and more

By Jake Edge
April 2, 2014

2014 LSFMM Summit

Al Viro gave an update on the long-awaited revoke() system call to the 2014 Linux Storage, Filesystem, and Memory Management (LSFMM) Summit. revoke() is meant to close() any existing file descriptors open for a given pathname so that a process can know that it has exclusive use of the file or device in question. Viro also discussed some work he has been doing to unify the multiple variants of the read() and write() system calls.

Viro started out by saying that revoke() was the less interesting part of his session. It is getting "more or less close to done", he said. We looked at an earlier version of this work a year ago. Files can be declared revokable at open() time. For those files, a counter tracks how many threads are inside the file_operations functions at any given time. Once revoke() is called, it waits for all currently active threads to leave the file_operations functions and ensures that no new ones are allowed to enter.
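The counting scheme described above can be modeled in user space. What follows is only a minimal sketch using C11 atomics, not the kernel patch itself; struct revokable and the revokable_*() helpers are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's actual data structure. */
struct revokable {
    atomic_int  active;   /* threads currently inside an operation */
    atomic_bool revoked;  /* set once revocation has been requested */
};

/* Try to enter an operation; returns false if the object was revoked. */
static bool revokable_enter(struct revokable *r)
{
    /* Count ourselves first so a concurrent revoke will wait for us. */
    atomic_fetch_add(&r->active, 1);
    if (atomic_load(&r->revoked)) {
        atomic_fetch_sub(&r->active, 1);
        return false;
    }
    return true;
}

/* Leave an operation previously entered with revokable_enter(). */
static void revokable_exit(struct revokable *r)
{
    int prev = atomic_fetch_sub(&r->active, 1);
    assert(prev > 0);
    (void)prev;
}

/* Refuse new entries, then wait for in-flight operations to drain. */
static void revokable_revoke(struct revokable *r)
{
    atomic_store(&r->revoked, true);
    while (atomic_load(&r->active) > 0)
        sched_yield();
}
```

Each operation would be bracketed by revokable_enter() and revokable_exit(). A real implementation would presumably sleep rather than spin in the revoke path, and would keep the extra work off the fast path for files that are not revokable.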

There are places in procfs and sysfs where something similar is open-coded, Viro said, that could be removed once the revoke() changes go in. One of the keys is to ensure that the common path does not slow down for revoke() since most files will not be revokable. There are several areas that still need work, including poll(), which "provides some complications", and mmap(), which has always been problematic for revoke().

In a bit of an aside, Viro noted that there is a lot of code that is "just plain broke". For example, if a file in debugfs is opened and the underlying code removes the file from the debugfs directory, any read or write operation using the open file descriptor will oops the kernel. Dynamic debugfs is completely broken, Viro said. He hopes that the revoke() code will be in reasonable shape in a couple of cycles—"it's getting there". Dynamic debugfs will be one of the first users, he said.

Viro then moved on to the unification of plain read() and write() with the readv()/writev() variants as well as splice_read() and splice_write(). The regular and vector variants (readv()/writev()) have mostly been combined, he said. It is "not pretty", but it is tolerable. The splice variants got "really messy".

Ideally, the code for all of the variants should look the same all the way down, until you get to the final disposition. But each of the variants has its own view of the data; the splice variants get/put their data into pages, which doesn't fit well with the iovec used by the other two variants (in most implementations, plain read() and write() are translated to an iovec of length one). Creating a new data structure that can hold both user and kernel iovec members, along with struct page for the splice variants, may be the way to go, Viro said.
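The "iovec of length one" translation is easy to show from user space with the standard readv() and writev() calls. This is only an illustration of the idea, not the kernel's internal code path; read_via_iovec(), write_via_iovec(), and iovec_roundtrip_demo() are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* A plain read expressed as a one-element vectored read. */
static ssize_t read_via_iovec(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    return readv(fd, &iov, 1);
}

/* A plain write expressed as a one-element vectored write. */
static ssize_t write_via_iovec(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    /* iov_base is not const-qualified, so the cast is required. */
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    return writev(fd, &iov, 1);
}

/* Round-trip five bytes through a pipe; returns 0 on success. */
static int iovec_roundtrip_demo(void)
{
    int fds[2];
    char out[8] = { 0 };

    if (pipe(fds) != 0)
        return -1;
    ssize_t w = write_via_iovec(fds[1], "hello", 5);
    ssize_t r = read_via_iovec(fds[0], out, sizeof(out));
    close(fds[0]);
    close(fds[1]);
    return (w == 5 && r == 5 && memcmp(out, "hello", 5) == 0) ? 0 : -1;
}
```

Because the scalar and vector calls share one representation here, only a single code path needs to exist below the wrapper, which is the property the unification work is after.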

Something that "fell out" of his work in this area is the addition of iov_iter. The iov_shorten() operation tries to recalculate the number of network segments that fall into a given iovec area, but the result is that the iovec gets modified when there are short reads or writes. Worse still, how the iovec gets modified is protocol-dependent, which makes it hard for users. In fact, someone from the CIFS team said that it makes a copy of any iovec before passing it in because it doesn't know what it will get back.

Having it be protocol-dependent is "just wrong", Viro said. He has been getting rid of iov_shorten() calls, as well as other places that shorten iovec arrays. That might allow sendpage() to be removed entirely; protocols that want to be smart can set up an iov_iter, he said.
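The alternative to shortening an iovec in place is to keep the progress in a separate cursor and leave the caller's array untouched. The sketch below shows that idea in user space; struct iov_cursor and its helpers are hypothetical and deliberately simpler than the kernel's struct iov_iter, whose layout is not described in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical userspace cursor; the kernel's struct iov_iter differs. */
struct iov_cursor {
    const struct iovec *iov;  /* caller's array, never written to */
    size_t nr_segs;           /* segments left, including the current one */
    size_t offset;            /* bytes already consumed in the current segment */
    size_t remaining;         /* total bytes left to transfer */
};

static void iov_cursor_init(struct iov_cursor *c,
                            const struct iovec *iov, size_t nr_segs)
{
    c->iov = iov;
    c->nr_segs = nr_segs;
    c->offset = 0;
    c->remaining = 0;
    for (size_t i = 0; i < nr_segs; i++)
        c->remaining += iov[i].iov_len;
}

/* Record that 'bytes' were transferred, e.g. after a short read or write. */
static void iov_cursor_advance(struct iov_cursor *c, size_t bytes)
{
    assert(bytes <= c->remaining);
    c->remaining -= bytes;
    while (bytes > 0) {
        size_t left = c->iov->iov_len - c->offset;
        if (bytes < left) {
            c->offset += bytes;
            break;
        }
        /* Current segment fully consumed; step to the next one. */
        bytes -= left;
        c->iov++;
        c->nr_segs--;
        c->offset = 0;
    }
}
```

After a short transfer, the caller advances the cursor and retries from the recorded position. The iovec array comes back exactly as it was passed in, so a caller in the position described for CIFS would have no need to copy it defensively.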

[ Thanks to the Linux Foundation for travel support to attend LSFMM. ]

Index entries for this article
Kernel	revoke()
Conference	Storage, Filesystem, and Memory-Management Summit/2014


Extremely unrelated suggestion

Posted Apr 3, 2014 19:01 UTC (Thu) by diederich (subscriber, #26007) [Link] (5 responses)

I often see photos of individual people in lwn articles. If it's not too much work, please consider putting the name of these people with the pictures. It could be a hover (probably not great for mobile devices), or a caption, or anything like that.

It's often possible, as in the case of this article, to accurately guess the name of the person, based on the title and/or content. But much of the time it is not.

Thank you for your consideration, and thanks for all of these awesome articles!

Extremely unrelated suggestion

Posted Apr 3, 2014 19:38 UTC (Thu) by khim (subscriber, #9252) [Link] (4 responses)

There is one additional thing one can do (besides hover): an actual, honest-to-goodness left click. In this particular case this will show a large picture with the accompanying text “Al Viro at 2014 LSFMM Summit”, which means that detective skills are not really needed.

Extremely unrelated suggestion

Posted Apr 4, 2014 16:47 UTC (Fri) by diederich (subscriber, #26007) [Link] (1 responses)

I am flabbergasted. Please accept my apology.

Extremely unrelated suggestion

Posted Apr 4, 2014 23:07 UTC (Fri) by giraffedata (guest, #1954) [Link]

I never care enough about the picture to go to the trouble of clicking through, but I would read a caption if it were there. I might hover.

Extremely unrelated suggestion

Posted Apr 18, 2014 7:20 UTC (Fri) by hpro (subscriber, #74751) [Link] (1 responses)

While that does the trick, it is not particularly user friendly. I would prefer to not have to navigate away from where I am in the text to learn the name of someone in a picture. As soon as I navigate away the natural flow of reading gets interrupted.

Also, as I generally browse LWN through a mobile device, I would also like to not have to eat into my data allowance by loading a large image when I really just want some information that could easily have fit in a small caption.

Extremely unrelated suggestion

Posted Apr 18, 2014 17:47 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

I usually long click, inspect the URL in the popup dialog and then just ignore the popup. Having a caption on the image would be appreciated though.