How to manipulate page cache in Linux?
B

3

5

I want to know which files are cached in the page cache, and I want to free the cache space of a specific file programmatically. I can write a kernel module or even modify the kernel code if needed. Can anyone give me some clues?

Biographer answered 8/2, 2011 at 13:27 Comment(0)
K
5

Firstly, the kernel does not maintain a master list of all files in the page cache, because it has no need for such information. Instead, given an inode you can look up the associated page cache pages, and vice-versa.

For each page cache struct page, page_mapping() will return the struct address_space that it belongs to. The host member of struct address_space identifies the owning struct inode, and from there you can get the inode number and device.

Ken answered 9/2, 2011 at 4:41 Comment(3)
Thanks for your advice! Then, is there any easy way to get the filename of a struct inode? If several filenames point to the same inode, any one will be OK. - Biographer
@Stephenjy: Nope, and that's for the reason you've identified: an inode might have zero, one or many filenames pointing at it. The only way is to scan all the filenames in the filesystem looking for a match. You can easily go the other way though: look up a filename and determine how many page cache pages it owns. - Ken
Maybe I can intercept all open() syscalls, keep a record of all opened files, and then scan through those files to see whether each one is cached. That way I can avoid scanning the entire filesystem. Will it do? - Biographer
B
2

mincore() returns a vector that indicates whether pages of the calling process's virtual memory are resident in core (RAM), and so will not cause a disk access (page fault) if referenced. The kernel returns residency information about the pages starting at the address addr, and continuing for length bytes.

To test whether a file currently mapped into your process is in cache, call mincore with its mapped address.
To test whether an arbitrary file is in cache, open and map it, then follow the above.
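
The open-and-map approach above can be sketched roughly as follows. This is only a minimal illustration, assuming Linux with glibc; the helper name `cached_pages` and its return convention are made up for this example and do not come from any library.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: makes mincore() visible */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): returns how many pages of
 * `path` mincore() reports as resident, stores the file's page count in
 * *total_pages, and returns -1 on error. */
long cached_pages(const char *path, long *total_pages)
{
    assert(path != NULL && total_pages != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    long page_size = sysconf(_SC_PAGESIZE);
    *total_pages = (st.st_size + page_size - 1) / page_size;
    if (*total_pages == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    /* Mapping alone does not read the file; pages are faulted in on access. */
    void *addr = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                        /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(*total_pages);
    long resident = -1;
    if (vec != NULL && mincore(addr, st.st_size, vec) == 0) {
        resident = 0;
        for (long i = 0; i < *total_pages; i++)
            resident += vec[i] & 1;   /* low bit set => page is resident */
    }

    free(vec);
    munmap(addr, st.st_size);
    return resident;
}
```

A caller would pass a path and compare the return value with `*total_pages` to see what fraction of the file is cached. The result is only a snapshot: pages can be evicted or read in immediately after the call.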

There is a proposed fincore() system call which would not require mapping the file first, but (at this point in time) it's not yet generally available.

(And then madvise() with MADV_DONTNEED or posix_fadvise() with POSIX_FADV_DONTNEED can drop parts of a mapping/file from cache.)

Beitris answered 9/2, 2011 at 4:51 Comment(2)
Thank you for fincore(), this helps me solve another problem of mine. By the way, fincore() is not a system call but a tool provided by linux-ftools. It uses mmap() and mincore() to determine which pages of the file are in the page cache, and uses posix_fadvise() with POSIX_FADV_DONTNEED to drop parts of a mapping/file from cache. - Biographer
@stephenjy: As stated, there's a proposed fincore syscall, even though its results could be obtained in userland by using multiple existing syscalls. Looks like linux-ftools does just that. - Beitris
P
1

You can free the contents of a file from the page cache under Linux by using

posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

As of Linux 2.6 this will immediately get rid of the parts of the page cache which are caching the given file or part of the file; the call blocks until the operation is complete, but that behaviour is not guaranteed by POSIX.

Note that it won't have any effect if the pages have been modified (are dirty); in that case you want to call fdatasync() or similar first.
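
Putting the two steps together, a flush-then-drop sequence might look like the sketch below. It assumes Linux with glibc; `drop_file_cache` is a hypothetical helper name used only for illustration, and passing 0 for both offset and length asks for the whole file.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: makes posix_fadvise() and fdatasync() visible */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): flushes dirty pages of
 * `path` and then asks the kernel to drop its cached pages.
 * Returns 0 on success or an errno-style error number on failure. */
int drop_file_cache(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    int err;
    if (fdatasync(fd) < 0) {
        /* Dirty pages could not be written back, so they cannot be dropped. */
        err = errno;
    } else {
        /* offset 0, len 0 => from the start to the end of the file.
         * posix_fadvise() returns the error number directly and does not
         * set errno. */
        err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    }

    close(fd);
    return err;
}
```

The call is advisory: a return value of 0 means the request was accepted, not that every page was necessarily evicted, so checking afterwards with mincore() is the only way to confirm.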

EDIT: Sorry, I didn't fully read your question. I don't know how to tell which files are currently in the page cache. Sorry.

Polychasium answered 8/2, 2011 at 15:33 Comment(1)
Thanks a lot all the same, posix_fadvise() solves another problem of mine :) - Biographer
