TALPA strides forward

Posted Aug 28, 2008 23:41 UTC (Thu) by njs (subscriber, #40338)
In reply to: TALPA strides forward by nix
Parent article: TALPA strides forward

No, you just need a cleverer algorithm -- like someone mentioned above, you should look for changed timestamps rather than simply "future" timestamps (because clocks get set back all the time, but it's extraordinarily unlikely that a second edit will come along at exactly the moment when the old timestamp is repeated). Then to fix the quickly-repeated-edits problem, if the timestamp is within 2*resolution of the current time (for some conservative definition of resolution), don't write that timestamp down in your cache. Easy and safe, and causes hardly any speed degradation.

(High-quality VCSes already do this; I first learned the trick from bzr, dunno if any other popular ones have picked it up.)
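
A minimal sketch of the caching rule described above, in Python. This is not bzr's code; the names `MtimeCache`, `RESOLUTION_NS`, `needs_check` and `record` are made up for illustration, and the 2-second resolution is just an assumed conservative value.

```python
import os
import time

# Assumed conservative timestamp resolution, in nanoseconds (hypothetical
# value; pick whatever is safe for the filesystems you care about).
RESOLUTION_NS = 2_000_000_000


class MtimeCache:
    """Remembers the mtime seen at the last check of each path."""

    def __init__(self, resolution_ns=RESOLUTION_NS, clock=time.time_ns):
        self.resolution_ns = resolution_ns
        self.clock = clock
        self.seen = {}  # path -> mtime (ns) recorded at the last check

    def needs_check(self, path):
        """True if the file must be re-read: mtime differs or was never cached."""
        mtime = os.stat(path).st_mtime_ns
        # Compare for inequality, not "newer than": a clock that was set
        # back still yields a timestamp different from the cached one.
        return self.seen.get(path) != mtime

    def record(self, path, mtime_ns):
        """Cache mtime_ns unless it is too close to the current time to trust."""
        if abs(self.clock() - mtime_ns) < 2 * self.resolution_ns:
            # A second edit could still land on this same timestamp, so
            # drop any entry and force a re-read next time.
            self.seen.pop(path, None)
            return False
        self.seen[path] = mtime_ns
        return True
```

The caller would stat the file, read it, and then pass the mtime it saw before reading to `record`. A file that was just written is never cached, so it is simply checked again on the next pass.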



TALPA strides forward

Posted Aug 29, 2008 0:08 UTC (Fri) by dlang (guest, #313) [Link] (5 responses)

remember that the notification goes out while the file is still open.

so a program writes to a file, the scanner gets notified, scans the file, notes the mtime, the program writes to the file again.

on a fast machine it's very possible that this can all take place in a short enough time that the mtime does not change

TALPA strides forward

Posted Aug 29, 2008 0:35 UTC (Fri) by njs (subscriber, #40338) [Link] (4 responses)

>so a program writes to a file, the scanner gets notified, scans the file, notes the mtime, the program writes to the file again.

and the scanner gets notified again, and scans the file again, yes.

All the things you say are true, but I'm afraid I don't understand why you are saying them here (i.e., I'm missing your point somewhere)?

TALPA strides forward

Posted Aug 29, 2008 0:47 UTC (Fri) by dlang (guest, #313) [Link] (3 responses)

if the scanner is only notified when mtime changes, then if the mtime doesn't change no notification will be sent out.

I posted a proposal for a slightly different approach: instead of using mtime and a single 'clean' bit, I suggested stealing a chunk of xattr namespace and having the kernel clear this namespace when the file is dirtied.

this would let a scanner set a placeholder in the namespace to indicate that it was looking at the file. when it was done, it could check whether the placeholder was still there: if so, the file didn't change while it was being scanned and it's safe to mark it as scanned; if not, you know the file changed and the scan you just did is worthless.

by using a chunk of namespace you can also support multiple scanners (without them needing to know anything about each other)
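
A rough in-memory model of the placeholder protocol above, in Python. No such kernel interface exists in this form; `SimFile`, `scan`, and the `scanning.*` / `clean.*` key names are all invented for illustration, and the `marks` dict merely stands in for the reserved xattr namespace.

```python
class SimFile:
    """In-memory stand-in for a file plus its reserved xattr namespace."""

    def __init__(self, data=b""):
        self.data = data
        self.marks = {}  # models the reserved chunk of xattr namespace

    def write(self, data):
        self.data += data
        # Models the proposed kernel behaviour: dirtying the file clears
        # every attribute in the reserved namespace.
        self.marks.clear()


def scan(f, scanner_id, check, during_scan=None):
    """Scan f; mark it clean only if no write raced with the scan.

    Returns the verdict from check(), or None if the scan was invalidated.
    """
    key = "scanning." + scanner_id
    f.marks[key] = b"1"            # placeholder: scan in progress
    verdict = check(f.data)
    if during_scan is not None:
        during_scan()              # hook used to simulate a concurrent writer
    if key not in f.marks:
        return None                # file changed mid-scan; result is worthless
    # A real implementation would need this check-and-mark step to be atomic.
    del f.marks[key]
    f.marks["clean." + scanner_id] = b"1"
    return verdict
```

Because each scanner only touches keys carrying its own id, two scanners can mark the same file independently, and a single write invalidates both marks at once.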

TALPA strides forward

Posted Aug 29, 2008 7:46 UTC (Fri) by njs (subscriber, #40338) [Link] (2 responses)

Oh, I see. Sure. I was reading quickly and just assumed that anyone talking about "notify when the mtime changes" actually meant, "hook into the kernel's poke-that-file's-mtime routine so it sends a notification", whether the resulting mtime was modified or not.

(In practice I'm pretty sure that the mtime *would* always be updated, though, because in linux, in-memory inodes always get nanosecond-accurate timestamps. The extra resolution gets stripped away by the filesystem driver when the metadata gets pushed out to disk, but the actual data structures used in the core kernel don't care about that.)

TALPA strides forward

Posted Aug 29, 2008 16:45 UTC (Fri) by bfields (subscriber, #19510) [Link] (1 responses)

>In practice I'm pretty sure that the mtime *would* always be updated, though, because in linux, in-memory inodes always get nanosecond-accurate timestamps.

That's not true. On a recent kernel, try running a simple test program that does, e.g., write, stat, usleep(x), write, stat. You'll see that on ext2/ext3 "x" has to be at least a million (a second) before you see a difference in the two stats, and that on something like xfs, it has to be at least a thousand to ten thousand (a few milliseconds; the time resolution used is actually jiffies).

(On older kernels I think the ext2/3 behavior might look like xfs's; that was fixed because of problems with unexpected changes in timestamps (due to lost nanoseconds field) when an inode got flushed out of cache and then read back.)
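
The write, stat, usleep(x), write, stat test described above could look something like this in Python. This is only a sketch: `mtime_changed_after` and its `directory` parameter are names chosen here, and `time.sleep` stands in for `usleep`.

```python
import os
import tempfile
import time


def mtime_changed_after(delay_us, directory=None):
    """Do write, stat, sleep(delay_us), write, stat; report whether mtime moved.

    directory selects the filesystem under test (default: the system
    temporary directory).
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"a")
        first = os.fstat(fd).st_mtime_ns
        time.sleep(delay_us / 1_000_000)  # delay given in microseconds
        os.write(fd, b"b")
        second = os.fstat(fd).st_mtime_ns
        return second != first
    finally:
        os.close(fd)
        os.remove(path)
```

Calling it with increasing delays (say 0, 1000, 10000, 1000000) against a directory on the filesystem of interest shows roughly where the two stats start to differ. The result depends on the kernel and filesystem, so no particular output is implied here.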

TALPA strides forward

Posted Aug 30, 2008 1:47 UTC (Sat) by njs (subscriber, #40338) [Link]

I was aware of the issues with confusing timestamp changes, but didn't realize it had been changed. Thanks.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds