
Readahead: the documentation I wanted to read

Posted Apr 11, 2022 8:19 UTC (Mon) by donald.buczek (subscriber, #112892)
In reply to: Readahead: the documentation I wanted to read by willy
Parent article: Readahead: the documentation I wanted to read

> I am a bit confused that it evicts useful data.

Sorry, I was unclear with the term "valuable". I'm not talking about hot pages, which are actively accessed by the system; those can probably avoid eviction by returning to the active list fast enough. The (possibly) useful data lost, which I was talking about, are other inactive pages and data from other caches (namely the dcache). The original user complaint was "`ls` takes ages in the morning", so the user's data was only replaced while he was taking a break. That by itself is not wrong; it is the basic strategy of LRU. How should the system know that the user is going to return the next morning? On the other hand, the system *could* notice that a big file, which is never going to fit into the cache, is being read sequentially from the beginning. Keeping the already-processed head of that file when memory is needed is even more likely to be useless, because it will be evicted anyway if the observed pattern continues.
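
For illustration only: an application that knows it is doing such a streaming read can drop the already-processed head itself with posix_fadvise(POSIX_FADV_DONTNEED). A minimal sketch (the chunk size and structure are my own, not anything proposed in the thread):

    /* Stream a file that will never fit in RAM, telling the kernel in
     * batches that the already-read head is not worth caching. This is
     * an application-side workaround, not the in-kernel heuristic
     * discussed above. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define CHUNK (64 << 20)   /* drop cached pages in 64 MiB batches */

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s FILE\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        char buf[1 << 16];
        off_t done = 0, dropped = 0;
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0) {
            done += n;                      /* ... process buf here ... */
            if (done - dropped >= CHUNK) {  /* evict the head we already read */
                posix_fadvise(fd, dropped, done - dropped, POSIX_FADV_DONTNEED);
                dropped = done;
            }
        }
        close(fd);
        return n < 0;
    }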

> Do you see a difference if the files are accessed locally versus over NFS

No, the same is true for access from the local system. NFS is just a complication in the regards I mentioned (sometimes out-of-order access, no fadvise, no cgroups). In the thread referenced below, I've posted a reproducer script for local file access.
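
(The actual script is in the linked thread; purely as a hypothetical stand-in, the pattern it has to produce is just a cold sequential read of a file larger than RAM, after which previously cached data has been pushed out:)

    /* Hypothetical stand-in, NOT the reproducer posted in the thread:
     * sequentially read a file larger than total RAM, then observe that
     * formerly cached data (dentries, inactive pages) has been evicted,
     * e.g. by timing `ls -lR` on a previously warm directory tree. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s BIGFILE\n", argv[0]);
            return 1;
        }
        int fd = open(argv[1], O_RDONLY);   /* pick a file larger than RAM */
        if (fd < 0) {
            perror("open");
            return 1;
        }
        char buf[1 << 16];
        while (read(fd, buf, sizeof buf) > 0)
            ;   /* readahead keeps filling the page cache as we go */
        close(fd);
        return 0;
    }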

> would you mind taking this to linux-mm, and/or linux-fsdevel

A colleague of mine did so in August 2021 [1].

Best
Donald

[1]: https://lore.kernel.org/all/878157e2-b065-aaee-f26b-5c87e...



Readahead: the documentation I wanted to read

Posted Apr 11, 2022 13:39 UTC (Mon) by willy (subscriber, #9762)

Ah, I see that in my inbox now ... I read the third email in the chain (the first two went only to the xfs list?), but didn't read the fourth and fifth. The usual too-much-email problem.

Anyway, I think recognising this special case probably isn't the right solution. Backup is always tricky, and your proposal would fix one-large-file but do nothing for many-small-files.

I suspect the right way to go is to recognise that the page cache is large and has many easily-reclaimable pages, and shrink only the page cache. That is, the problem is that backup is exerting general memory pressure when we'd really like it to exert pressure only on the page cache. Or rather, we'd like the page cache to initially exert pressure only on the page cache, the dcache to initially exert pressure only on the dcache, and so on. If a cache can't reclaim enough memory easily, then it should pressure other caches to shrink.
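
As a toy model of that policy (the types and names below are invented for illustration; nothing like this interface exists in the kernel):

    #include <stddef.h>

    /* Each cache knows how to reclaim from itself; pressure that
     * originates in one cache is satisfied there first, and only the
     * shortfall is pushed onto the other caches. */
    struct cache {
        const char *name;
        /* returns the number of pages actually freed */
        size_t (*reclaim)(struct cache *self, size_t nr_pages);
    };

    static size_t shrink_for(struct cache *origin,
                             struct cache **others, size_t n_others,
                             size_t needed)
    {
        size_t freed = origin->reclaim(origin, needed);  /* self-pressure first */
        for (size_t i = 0; freed < needed && i < n_others; i++)
            freed += others[i]->reclaim(others[i], needed - freed);
        return freed;
    }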

Readahead: the documentation I wanted to read

Posted Apr 12, 2022 19:08 UTC (Tue) by donald.buczek (subscriber, #112892)

> Backup is always tricky, and your proposal would fix one-large-file but do nothing for many-small-files.

To be exact, it's not backup. Our maintenance jobs run locally and are tamed via cgroups. It's (other) users: our scientific users often process rather big files.

> Or rather, we'd like the page cache to initially exert pressure only on the page cache. The dcache should initially exert pressure only on the dcache. Etc. If a cache can't reclaim enough memory easily, then it should pressure other caches to shrink.

This would probably help a lot in the problem area I described, and in some others as well. It's good to know that this is on your mind. The negative-dentry discussion mentioned in the later LWN article seems to get into the same area.

