Using boost::iostreams::mapped_file
I am very new to memory mapping and am trying to understand memory-mapped files so that I can use them in my project (Linux based). My requirement is to write to and then read from memory-mapped files. I wrote a sample program that only writes, and it works fine, but I have a few very basic questions because I do not yet understand the fundamentals of memory mapping.

#include <string>
#include <unordered_map>
#include <boost/iostreams/device/mapped_file.hpp>
using namespace std;
typedef unordered_map<int, string> work;
int main()
{
        boost::iostreams::mapped_file_params  params;
        params.path = "map.dat";
        params.new_file_size = 100;   // create a new 100-byte file
        params.mode = (std::ios_base::out | std::ios_base::in);   // map it read/write
        boost::iostreams::mapped_file  mf;
        mf.open(params);
        work w1;
        w1[0] = "abcd";
        w1[1] = "bcde";
        w1[2] = "cdef";

        // Caution: this treats the raw mapped bytes as an already constructed
        // unordered_map, which is undefined behavior. The map's nodes and
        // strings also live on the heap, so they are not stored in the file.
        work* w = static_cast<work*>((void*)mf.data());
        *w = w1;
        mf.close();
        return 0;
}

I have a few questions here:

  1. When I call mf.open(params), I see that a file of size 100 is created on disk. When I then write to it, i.e. *w = w1, the contents of the file on disk change. Does this mean that I am not using RAM at all and am writing straight to the disk?

  2. mf.size() always gives me the size that I passed in when creating the file. How can I find out the size of the data that I actually wrote into the memory-mapped file?

  3. If I set params.new_file_size to 10 GB, a file of that size is created on disk, but it does not take up any disk space (confirmed with the df command). Why is that?
    -rwx------. 1 root root 10000000000 Apr 29 14:26 map.dat

  4. I read that closing the file frees the mapping. Does this mean that after close I lose all the data that I wrote? That does not seem to be true, since I have working code that closes the file, opens it again, and reads it correctly.

  5. How do I delete the memory-mapped files after use? With rm -rf on the command line, or with Linux APIs?

Misprint answered 30/4, 2014 at 8:50 Comment(0)
  • When I call mf.open(params), I see that a file of size 100 is created on disk. When I then write to it, i.e. *w = w1, the contents of the file on disk change. Does this mean that I am not using RAM at all and am writing straight to the disk?

You're using memory-mapped files. This means both: you are writing to 'virtual memory pages' that have been mapped into your process space, but that actually refer to disk blocks. Your writes land in those pages (in RAM) first, and the OS writes them back to the file. The growth you see on disk indicates that the pages get committed on write.
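To make the "both RAM and disk" point concrete, here is a minimal sketch using the raw POSIX calls (mmap, msync) that a file mapping on Linux is built on, rather than the Boost wrapper. The function name write_through_mapping and its parameters are made up for illustration. The memcpy only touches pages in memory; msync asks the kernel to write them back right away instead of whenever it gets around to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: writes `text` at the start of a file-backed shared
// mapping of `file_size` bytes and flushes it. Returns true on success.
bool write_through_mapping(const std::string& path, const std::string& text,
                           std::size_t file_size) {
    if (text.size() > file_size) return false;
    int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    if (::ftruncate(fd, static_cast<off_t>(file_size)) != 0) {
        ::close(fd);
        return false;
    }
    void* addr = ::mmap(nullptr, file_size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return false;

    // This store goes to a page in memory that is backed by the file.
    std::memcpy(addr, text.data(), text.size());

    // Ask the kernel to write the dirty pages back to the file now.
    bool ok = ::msync(addr, file_size, MS_SYNC) == 0;
    ::munmap(addr, file_size);
    return ok;
}
```

Without the msync call the data would still reach the file eventually; the call only controls when.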

  • mf.size() always gives me the size that I passed in when creating the file. How can I find out the size of the data that I actually wrote into the memory-mapped file?

You can't, at least not from the mapping itself: it does not track how much of it you have written. You can only find the number of blocks committed, with a tool like stat.
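If you need the byte count, one common workaround is to record it yourself inside the file. The sketch below assumes a made-up layout in which the first 8 bytes of the mapping hold the number of payload bytes written so far; MappedLog, used_bytes and append are illustrative names, not part of Boost. It relies on a freshly created file reading as zeros, so the counter starts at 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical view over a mapped region, e.g. {mf.data(), mf.size()}.
struct MappedLog {
    char* base;            // start of the mapped region
    std::size_t capacity;  // total mapped size in bytes
};

// Assumed layout: an 8-byte length header, followed by the payload.
constexpr std::size_t kHeader = sizeof(std::uint64_t);

// Number of payload bytes recorded in the header.
std::uint64_t used_bytes(const MappedLog& log) {
    std::uint64_t n = 0;
    if (log.capacity >= kHeader) std::memcpy(&n, log.base, kHeader);
    return n;
}

// Appends `s` after the existing payload and updates the header.
bool append(MappedLog& log, const std::string& s) {
    if (log.capacity < kHeader) return false;
    std::uint64_t used = used_bytes(log);
    std::size_t room = log.capacity - kHeader;
    if (used > room || s.size() > room - used) return false;  // would overflow
    std::memcpy(log.base + kHeader + used, s.data(), s.size());
    used += s.size();
    std::memcpy(log.base, &used, kHeader);
    return true;
}
```

With the question's code this would be used roughly as MappedLog log{mf.data(), mf.size()}; followed by append(log, "abcd"), after which used_bytes(log) reports what was actually written.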

  • If I set params.new_file_size to 10 GB, a file of that size is created on disk, but it does not take up any disk space (confirmed with the df command). Why is that? -rwx------. 1 root root 10000000000 Apr 29 14:26 map.dat

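It's sparsely allocated: the file's length is set without allocating blocks for it, e.g. using ftruncate on Linux, or similar on other platforms. Blocks are only allocated once pages are actually written.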
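You can see the difference between the apparent size and the allocated size yourself. This is a small sketch, assuming a Linux filesystem that supports sparse files and the Linux convention that st_blocks counts 512-byte units; FileUsage and make_sparse_and_measure are names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

struct FileUsage {
    long long apparent_bytes;   // st_size: the length that ls -l shows
    long long allocated_bytes;  // st_blocks * 512: space actually allocated
};

// Creates (or truncates) `path`, sets its length without writing any data,
// and reports both sizes. Returns false if any system call fails.
bool make_sparse_and_measure(const std::string& path, long long length,
                             FileUsage& out) {
    int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return false;
    bool ok = ::ftruncate(fd, static_cast<off_t>(length)) == 0;
    struct stat st {};
    ok = ok && ::fstat(fd, &st) == 0;
    ::close(fd);
    if (!ok) return false;
    out.apparent_bytes = st.st_size;
    // Assumption: st_blocks is in 512-byte units, as on Linux.
    out.allocated_bytes = static_cast<long long>(st.st_blocks) * 512;
    return true;
}
```

On such a filesystem the allocated size stays far below the apparent size until data is written, which is why df does not move.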

  • I read that closing the file frees the mapping. Does this mean that after close I lose all the data that I wrote? That does not seem to be true, since I have working code that closes the file, opens it again, and reads it correctly.

No. It means that the mapping is freed. That is, the /virtual memory/ area in your process space is now 'free' to use for other things. The data itself stays in the file.
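The following sketch walks through exactly that sequence with raw POSIX calls instead of the Boost wrapper: map, write, unmap, then map again read-only and read the bytes back. The helpers map_file and write_close_reopen are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: maps `length` bytes of `path` as a shared mapping.
// If O_CREAT is requested the file is first resized to `length`.
// Returns nullptr on failure.
static char* map_file(const std::string& path, int open_flags, int prot,
                      std::size_t length) {
    int fd = ::open(path.c_str(), open_flags, 0600);
    if (fd < 0) return nullptr;
    if ((open_flags & O_CREAT) &&
        ::ftruncate(fd, static_cast<off_t>(length)) != 0) {
        ::close(fd);
        return nullptr;
    }
    void* addr = ::mmap(nullptr, length, prot, MAP_SHARED, fd, 0);
    ::close(fd);
    return addr == MAP_FAILED ? nullptr : static_cast<char*>(addr);
}

// Writes `text` through one mapping, releases it, then reads the same bytes
// back through a second, read-only mapping. Returns "" on failure.
std::string write_close_reopen(const std::string& path,
                               const std::string& text) {
    if (text.empty()) return {};
    char* w = map_file(path, O_RDWR | O_CREAT | O_TRUNC,
                       PROT_READ | PROT_WRITE, text.size());
    if (!w) return {};
    std::memcpy(w, text.data(), text.size());
    ::munmap(w, text.size());  // "close": only the address range is released

    char* r = map_file(path, O_RDONLY, PROT_READ, text.size());
    if (!r) return {};
    std::string back(r, text.size());
    ::munmap(r, text.size());
    return back;
}
```

The second mapping sees what the first one wrote because both refer to the same file pages; releasing the first mapping only gives the address range back.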

  • How do I delete the memory-mapped files after use? With rm -rf on the command line, or with Linux APIs?

Yes.
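From inside the program, the standard library's std::remove is enough; a mapped file is an ordinary file as far as deletion goes. A minimal sketch (remove_mapped_file is just an illustrative wrapper name), assuming you call it after the mapping has been closed:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Deletes the file at `path`. Returns true if the file was removed.
bool remove_mapped_file(const std::string& path) {
    return std::remove(path.c_str()) == 0;
}
```

For the question's program that would be remove_mapped_file("map.dat") once mf.close() has run.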

Olin answered 30/4, 2014 at 9:49 Comment(7)
Thanks for your answers. So, combining the 1st and 4th answers, I gather that if I do not call close, I am actually using both virtual memory and disk space. My exact use case is as follows: with the implementation that I have now (without memory mapping), I collect 10 GB of data (from some data source) in memory and give that to an Avro file writer to write Avro files. I do this whenever I have collected 10 GB of data from a data source. - Misprint
Now, by using memory mapping, I was planning to memory map this 10 GB of data to a file and then allow the Avro writer to write it. In the example I pasted, it's like passing *w to the writer. But as per your answer, if I do so, I would still be using memory, and that is not good. So I should write the data to a memory-mapped file, close it, and ask the writer to open that file, read it, and use it to write Avro files. Is my understanding correct? - Misprint
You're using virtual memory. You know. virtual, as in: not physical. It's ok. It's using "address space". But address space is cheap. On 64bit you will not run out of address space easily. I've memory mapped files of 200GiB just fine, recently. - Olin
(caveat: the OS has the duty of reading pages into memory on demand. When there's not enough physical memory left, then the least-recently used page will be dropped, so if you loop around the whole memory area, you will end up reading the same pages back from disk many times, unless it all stays in memory) - Olin
Thanks Sehe. One more doubt: after closing the memory-mapped file, when I read it back, can I read it in parts? For example, in the code I pasted, if I only want to read back w[0] or w[1] and not the entire map, can I do that? If yes, how? - Misprint
@NehaRawat you mean, "1 more question" :) And you cannot do it any other way! Although mapping the memory maps the whole region at once, it is all virtual (notice the mantra? get it into your system!). Only when you access w[0], or w[8836752] does the relevant page get loaded into memory. - Olin
Sehe, since the comment box only allows me to write a few characters, I have posted a new question with some new observations. It would be great if you could help me with that. Here is the link: #23468727 Thanks - Misprint
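As an illustration of reading back only w[0] or w[1]: if the file holds plain fixed-size slots instead of an unordered_map object, an index translates directly into an offset, and only the page holding that slot has to be loaded. The sketch below is one possible layout, not the code from the question; kRecordSize, put_record and get_record are made-up names, and base and mapped_size stand for whatever the mapping gives you (for example mf.data() and mf.size()).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical layout: the mapped region is an array of 16-byte slots,
// each holding one NUL-terminated string.
constexpr std::size_t kRecordSize = 16;

// Stores `value` in slot `index`. Fails if the value does not fit in a slot
// (including its terminating NUL) or the slot lies outside the mapping.
bool put_record(char* base, std::size_t mapped_size, std::size_t index,
                const std::string& value) {
    if (value.size() >= kRecordSize) return false;
    if (index >= mapped_size / kRecordSize) return false;
    char* slot = base + index * kRecordSize;
    std::memset(slot, 0, kRecordSize);
    std::memcpy(slot, value.data(), value.size());
    return true;
}

// Reads slot `index` only; no other part of the mapping is touched.
// Returns "" for an empty or out-of-range slot.
std::string get_record(const char* base, std::size_t mapped_size,
                       std::size_t index) {
    if (index >= mapped_size / kRecordSize) return {};
    const char* slot = base + index * kRecordSize;
    const void* nul = std::memchr(slot, '\0', kRecordSize);
    std::size_t len =
        nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - slot)
            : kRecordSize;
    return std::string(slot, len);
}
```

A layout like this also sidesteps the problem of storing a container that keeps its data on the heap, since everything that needs to persist sits inside the mapped bytes.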
