I've got an embedded ARM Linux box with a limited amount of RAM (512MB) and no swap space, on which I need to create and then manipulate a fairly large file (~200MB). Loading the entire file into RAM, modifying the contents in-RAM, and then writing it back out again would sometimes invoke the OOM-killer, which I want to avoid.
My idea to get around this was to use mmap() to map this file into my process's virtual address space; that way, reads and writes to the mapped memory-area would go out to the local flash-filesystem instead, and the OOM-killer would be avoided since, if memory got low, Linux could just flush some of the mmap()'d memory pages back to disk to free up some RAM. (That might make my program slow, but slow is okay for this use-case.)
However, even with the mmap() call, I'm still occasionally seeing processes get killed by the OOM-killer while performing the above operation.
My question is: was I too optimistic about how Linux would behave in the presence of both a large mmap() and limited RAM? (i.e. does mmap()-ing a 200MB file and then reading/writing through the mmap()'d memory still require 200MB of available RAM to work reliably?) Or is mmap() clever enough to page out mmap()'d pages when memory is low, and I'm just doing something wrong in how I use it?
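One thing I've considered but not yet tried is forcing writeback periodically, in case the real problem is that my writes dirty pages faster than the kernel cleans them. An untested sketch of what I mean (FlushMappedRange is just a placeholder name I made up):

#include <sys/mman.h>
#include <unistd.h>
#include <cstdint>
#include <cstddef>

// Untested sketch: after writing to [offset, offset+numBytes) of a MAP_SHARED
// mapping starting at (base), synchronously write those pages back to the file
// so they become clean (and therefore cheap to evict) instead of piling up as
// dirty pages in RAM.
static bool FlushMappedRange(uint8_t * base, size_t offset, size_t numBytes)
{
   const size_t pageSize   = (size_t) sysconf(_SC_PAGESIZE);
   const size_t alignedOff = offset & ~(pageSize-1);  // msync() wants a page-aligned start address
   return (msync(base+alignedOff, (offset-alignedOff)+numBytes, MS_SYNC) == 0);
}

The idea being that if I called this every megabyte or so while filling the buffer, only a small window of dirty pages would exist at any one time, at the cost of extra I/O.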
FWIW my code to do the mapping is here:
void FixedSizeDataBuffer :: TryMapToFile(const std::string & filePath, bool createIfNotPresent, bool autoDelete)
{
   const int fd = open(filePath.c_str(), (createIfNotPresent?(O_CREAT|O_EXCL|O_RDWR):O_RDONLY)|O_CLOEXEC, S_IRUSR|(createIfNotPresent?S_IWUSR:0));
   if (fd >= 0)
   {
      if ((autoDelete == false)||(unlink(filePath.c_str()) == 0)) // so the file will automatically go away when we're done with it, even if we crash
      {
         const int fallocRet = createIfNotPresent ? posix_fallocate(fd, 0, _numBytes) : 0;
         if (fallocRet == 0)
         {
            void * mappedArea = mmap(NULL, _numBytes, PROT_READ|(createIfNotPresent?PROT_WRITE:0), MAP_SHARED, fd, 0);
            if (mappedArea != MAP_FAILED)  // mmap() returns MAP_FAILED on error, not NULL
            {
               printf("FixedSizeDataBuffer %p: Using backing-store file [%s] for %zu bytes of data\n", this, filePath.c_str(), _numBytes);
               _buffer         = (uint8_t *) mappedArea;
               _isMappedToFile = true;
            }
            else printf("FixedSizeDataBuffer %p: Unable to mmap backing-store file [%s] to %zu bytes (%s)\n", this, filePath.c_str(), _numBytes, strerror(errno));
         }
         else printf("FixedSizeDataBuffer %p: Unable to pad backing-store file [%s] out to %zu bytes (%s)\n", this, filePath.c_str(), _numBytes, strerror(fallocRet));
      }
      else printf("FixedSizeDataBuffer %p: Unable to unlink backing-store file [%s] (%s)\n", this, filePath.c_str(), strerror(errno));
      close(fd);  // no need to hold this anymore AFAIK, the memory-mapping itself will keep the backing store around
   }
   else printf("FixedSizeDataBuffer %p: Unable to create backing-store file [%s] (%s)\n", this, filePath.c_str(), strerror(errno));
}
I can rewrite this code to just use plain old file I/O if I have to, but it would be nice if mmap() could do the job (or, if not, I'd at least like to understand why not).
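For reference, the plain-file-I/O rewrite I'm trying to avoid would look roughly like the untested sketch below; ProcessChunk() is a stand-in for whatever per-chunk modification I need to make, and the short-read/short-write handling is deliberately simplistic:

#include <unistd.h>
#include <algorithm>
#include <cstdint>
#include <cstddef>
#include <vector>

void ProcessChunk(uint8_t * bytes, size_t numBytes);  // hypothetical per-chunk modification

// Untested sketch of the chunked-I/O fallback:  only (chunkSize) bytes are
// ever held in RAM at once, no matter how large the file is.
static bool ModifyFileInChunks(int fd, size_t fileSize, size_t chunkSize)
{
   std::vector<uint8_t> chunk(chunkSize);
   for (size_t offset = 0; offset < fileSize; offset += chunkSize)
   {
      const size_t numToHandle = std::min(chunkSize, fileSize-offset);
      if (pread(fd, chunk.data(), numToHandle, (off_t) offset) != (ssize_t) numToHandle) return false;

      ProcessChunk(chunk.data(), numToHandle);

      if (pwrite(fd, chunk.data(), numToHandle, (off_t) offset) != (ssize_t) numToHandle) return false;
   }
   return true;
}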
Call madvise(MADV_DONTNEED) on mapped file ranges that you don't need any more, and keep only a "window" into the file. Otherwise mmap() will keep the data firmly in RAM. – Ossicle
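If I understand Ossicle's suggestion correctly, the idea is to treat the mapping as a sliding window: once a region has been dealt with (and flushed, e.g. as in the msync() sketch above), tell the kernel it can drop those pages right away rather than waiting for memory pressure. My rough, untested interpretation (ReleaseMappedRange is my own name for it):

#include <sys/mman.h>
#include <unistd.h>
#include <cstdint>
#include <cstddef>

// Rough interpretation of the MADV_DONTNEED suggestion:  flush any dirty pages
// in [offset, offset+numBytes) back to the file, then tell the kernel we won't
// be touching them again soon, so it can reclaim them immediately.  A later
// access to the range would simply re-fault the data in from the file.
static bool ReleaseMappedRange(uint8_t * base, size_t offset, size_t numBytes)
{
   const size_t pageSize   = (size_t) sysconf(_SC_PAGESIZE);
   const size_t alignedOff = offset & ~(pageSize-1);  // both calls want a page-aligned start address
   const size_t len        = (offset-alignedOff)+numBytes;

   if (msync(base+alignedOff, len, MS_SYNC) != 0) return false;  // write dirty pages back first, to be safe
   return (madvise(base+alignedOff, len, MADV_DONTNEED) == 0);   // then drop them from RAM
}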