Linux Cached Memory
We were running BMC firmware (if you don't know what a BMC is, go find out!) on top of our custom Linux kernel, and one fine day we realized that we had developed so many applications that we forgot to check how much memory they were using, and we were running low on available RAM.
The first thing we saw was that the cached memory in /proc/meminfo was too high.
We eventually found out what cached memory is and how we can limit or free it at will. So, let's take a stroll down the memory lane…
What is Cached
In Linux, two kinds of things are part of the file page cache. The kernel caches:
- The files that your processes access
- Files on RAM-based filesystems, such as tmpfs/ramfs
RAM File System Caching
The kernel tmpfs documentation notes that tmpfs puts everything into the kernel's internal caches: the filesystem grows and shrinks to accommodate the files it contains, and its pages live entirely in the page cache (and on swap, if you have any).
To be more clear: any file that you place in your tmpfs will stay in the cache until it is deleted. So the downside of tmpfs is that if you oversize it, the OOM killer will eventually start killing your running processes to reclaim RAM, because the kernel cannot free the tmpfs files from the cache unless you have swap (and in embedded Linux you usually don't, which means those files are stuck in RAM forever).
Any time a process opens a file, the kernel automatically caches it, thereby making subsequent I/O calls to the file faster, as they are served from memory instead of going directly to the disk.
Eviction from the cache
The kernel automatically reclaims the page cache when other processes or the kernel itself need memory. However, if a file is in use, the kernel cannot free its cached pages. And if the whole system runs short of memory, the Out-Of-Memory (OOM) killer is invoked.
After a file is closed, it may still be part of the cache in case a process opens it again. But it now sits in the reclaimable part of the cache, which means that if more memory is needed, this closed file will be evicted from memory.
A file in a RAM filesystem is always part of the cache. Unneeded pages can be swapped out to swap; if you don't have swap, the file remains in the cache until you delete it.
Best practices for reducing the cache
RAM-based Filesystem
When you mount a tmpfs, you can restrict its maximum size. Note that you cannot limit a ramfs this way: as the tmpfs documentation points out, ramfs has no size limit, so you can keep writing to it until you run the machine out of memory.
So the bottom line is: tmpfs is a better option than ramfs, as you can add swap and also limit the mount size.
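For example, a size-limited tmpfs mount might look like this (the mount point and the 64 MB cap are placeholders, not recommendations):

```
# /etc/fstab entry: cap this tmpfs at 64 MB
tmpfs  /var/volatile  tmpfs  size=64m,mode=1777  0  0

# or mount it manually:
#   mount -t tmpfs -o size=64m tmpfs /var/volatile
```

Once the mount reaches its size limit, writes fail with ENOSPC instead of silently eating the rest of your RAM.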
Operating on a file
If you are operating on a file of substantial size, then after you are done accessing it you can use posix_fadvise to evict that file from the cache immediately. This is a much safer option, as you only evict your own file instead of the other files that are being cached.
Global Drop Cache
If you want to drop the caches of the whole system, run `sync` first (drop_caches only frees clean, reclaimable pages) and then write:

- `echo 1 > /proc/sys/vm/drop_caches` - frees the page cache
- `echo 2 > /proc/sys/vm/drop_caches` - frees dentries and inodes
- `echo 3 > /proc/sys/vm/drop_caches` - frees the page cache, dentries and inodes
You can also tune the values below, which help the kernel free up the cache more often.
vm.swappiness controls how aggressively the kernel will swap memory pages out; higher values make it swap more readily.
vm.vfs_cache_pressure controls how aggressively the kernel reclaims the memory used for caching directory and inode objects. Note: this doesn't affect the file caching pressure.
vm.min_free_kbytes forces the VM to keep the configured minimum number of kilobytes free for atomic allocations within the kernel. Note: this is not free memory that your userspace processes can use. Choose a value appropriate to your kernel configuration.
vm.dirty_background_ratio is the percentage of memory that can be filled with dirty pages before pdflush begins writing them back. If you have a lot of memory, the default percentage may be too high, so you can lower it.
vm.dirty_expire_centisecs is the age, in hundredths of a second, after which data in the page cache is considered expired and will be written out at the next opportunity. Reducing it cleans up the page cache sooner, but can trigger I/O congestion.
vm.dirty_ratio is the percentage of memory that can be filled with dirty pages before the writing processes themselves are forced to write them out. Reducing it kicks in pdflush sooner when a process is writing out huge files, which momentarily blocks that process's I/O.
vm.dirty_writeback_centisecs is the interval, in hundredths of a second, at which pdflush wakes up to write data to disk. You can reduce this value to limit how much unwritten data a crash can lose, at the cost of more I/O congestion.
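Putting the tunables above together, a hypothetical /etc/sysctl.conf for a RAM-constrained system might look like this (the values are purely illustrative, not recommendations; measure on your own hardware):

```
# /etc/sysctl.conf -- illustrative values only
vm.swappiness = 10                  # swap less aggressively
vm.vfs_cache_pressure = 200         # reclaim dentry/inode caches harder
vm.min_free_kbytes = 8192           # reserve 8 MB for atomic allocations
vm.dirty_background_ratio = 5       # start background writeback earlier
vm.dirty_expire_centisecs = 1500    # consider dirty data expired after 15 s
vm.dirty_ratio = 20                 # force writers to flush at 20% dirty
vm.dirty_writeback_centisecs = 250  # wake writeback every 2.5 s
```

Apply the settings with `sysctl -p`, or test a single value first with e.g. `sysctl -w vm.swappiness=10`.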
fincore, part of linux-ftools, is a nice utility: give it a file name as input and it prints stats about which of that file's pages are currently in the cache. Hint: running it on files that are part of tmpfs will show that these files are always in cache.
If you want to manipulate the files in the cache, you can use vmtouch. With vmtouch you can:
- Evict files from cache
- Lock a file into the cache so that it cannot be evicted
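A quick sketch of vmtouch usage (the paths are placeholders; check `vmtouch -h` on your build for the exact flags):

```
# Show how much of the file is currently resident in the page cache
vmtouch /var/log/bigfile.log

# Evict the file's pages from the cache
vmtouch -e /var/log/bigfile.log

# Touch and lock the file into memory as a daemon, so it stays
# resident for as long as the vmtouch process is running
vmtouch -dl /etc/critical-config
```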
Damn, be extra careful with what you keep in your cache memory. It determines whether your system works the way you want it to or not, especially in the embedded Linux world.
I will write a few more blog entries on how we eventually managed to handle all the memory issues in our firmware.
If you have more tools or thoughts on handling cache better, let me know in the comments.