Is overwriting a small file atomic on ext4?

Assume we have a file of FILE_SIZE bytes, and:

  • FILE_SIZE <= min(page_size, physical_block_size);
  • file size never changes (i.e. truncate() or append write() are never performed);
  • file is modified only by completely overwriting its contents using:

    pwrite(fd, buf, FILE_SIZE, 0);
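
The size precondition above can be checked at run time. The following is only a minimal sketch: the function name `fits_page_and_block` is hypothetical, and `st_blksize` from `stat()` is used as a stand-in for the block size. It reports the file system's preferred I/O block size, not the device's physical block size, which would have to be queried from the block device itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns true if file_size fits in one memory page
 * and in one block as reported by stat(); returns false on any error.
 * Assumption: st_blksize approximates the block size of interest. */
bool fits_page_and_block(const char *path, size_t file_size)
{
    long page = sysconf(_SC_PAGESIZE);
    struct stat st;

    if (page <= 0 || stat(path, &st) != 0 || st.st_blksize <= 0)
        return false;

    /* limit = min(page size, reported block size) */
    size_t limit = (size_t)page;
    if ((size_t)st.st_blksize < limit)
        limit = (size_t)st.st_blksize;

    return file_size <= limit;
}
```
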
    

Is it guaranteed on ext4 that:

  1. Such writes are atomic with respect to concurrent reads?
  2. Such writes are transactional with respect to a system crash?

    (i.e., after a crash the file's contents are entirely from some previous write, and we'll never see a partial write or an empty file)

Is the second true:

  • with data=ordered?
  • with data=journal or alternatively with journaling enabled for a single file?

    (using ioctl(fd, EXT4_IOC_SETFLAGS, EXT4_JOURNAL_DATA_FL))

  • when physical_block_size < FILE_SIZE <= page_size?
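
For reference, the per-file journaling call above is written as shorthand. As far as I know the flags ioctl takes a pointer to a flags word, so the usual pattern is read, modify, write. The sketch below uses the generic `FS_IOC_GETFLAGS` / `FS_IOC_SETFLAGS` / `FS_JOURNAL_DATA_FL` names from `<linux/fs.h>` rather than the `EXT4_*` spellings; the function name `enable_data_journaling` is hypothetical, and changing this flag may require elevated privileges.

```c
#include <linux/fs.h>
#include <sys/ioctl.h>

/* Hypothetical helper: turn on data journaling for one open file.
 * Returns 0 on success, -1 on failure (errno is set by ioctl). */
int enable_data_journaling(int fd)
{
    int flags = 0;

    /* Read the current inode flags so the other bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;

    flags |= FS_JOURNAL_DATA_FL;

    /* Write the updated flags back; the kernel may refuse this. */
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
        return -1;

    return 0;
}
```

On a file system that does not support these flags, the first ioctl fails and the function returns -1 without changing anything.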


I've found a related question that links to a discussion from 2011. However:

  • I didn't find an explicit answer for my question 2.
  • I wonder, if the above is true, is it documented somewhere?
Cecilius answered 29/9, 2015 at 18:52 Comment(1)
Ext4 isn't atomic with data=ordered or data=journal. Only with data=writeback do writes occur in place. – Ashil

From my experiment it was not atomic.

Basically, my experiment had two processes, one writer and one reader. The writer writes to a file in a loop, and the reader reads from the same file in a loop.

Writer Process:

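/* Two 18-byte patterns. Each literal exactly fills its row, so no NUL is stored.
   (18 characters are needed for the torn results reported below to be visible.) */
char buf[][18] = {
    "xxxxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyyyy"
};
int i = 0;
while (1) {
    /* Overwrite the first 18 bytes of the file with the current pattern. */
    pwrite(fd, buf[i], 18, 0);
    i = (i + 1) % 2;
}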

Reader Process:

char readbuf[18];
while (1) {
    pread(fd, readbuf, 18, 0);
    /* check whether readbuf equals buf[0] or buf[1]; anything else is a torn read */
}

After running both processes for a while, I saw readbuf containing either xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx, i.e. a mix of the two patterns.

So this shows that the writes are not atomic with respect to concurrent reads. In my case, 16-byte writes were always atomic.
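
The experiment can be reproduced with something like the sketch below. It is not the exact program I ran: the function name `count_torn_reads`, the use of `fork()`, and the error handling are choices made for this illustration. It runs the writer in a child process and counts reads that match neither 18-byte pattern.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LEN 18

static const char pat[2][LEN + 1] = {
    "xxxxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyyyy"
};

/* Returns how many of `reads` full reads matched neither pattern,
 * or -1 if the file could not be set up. The file is removed afterwards. */
long count_torn_reads(const char *path, long reads)
{
    assert(reads >= 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Seed the file so the reader never sees a short file. */
    if (pwrite(fd, pat[0], LEN, 0) != LEN) {
        close(fd);
        return -1;
    }

    pid_t writer = fork();
    if (writer < 0) {
        close(fd);
        return -1;
    }
    if (writer == 0) {
        /* Child: alternate the two patterns until the parent kills us. */
        for (int i = 0;; i ^= 1)
            if (pwrite(fd, pat[i], LEN, 0) < 0)
                _exit(1);
    }

    long torn = 0;
    char readbuf[LEN];
    for (long n = 0; n < reads; n++) {
        if (pread(fd, readbuf, LEN, 0) != LEN)
            continue; /* skip failed or short reads */
        if (memcmp(readbuf, pat[0], LEN) != 0 &&
            memcmp(readbuf, pat[1], LEN) != 0)
            torn++;
    }

    kill(writer, SIGKILL);
    waitpid(writer, NULL, 0);
    close(fd);
    unlink(path);
    return torn;
}
```

A non-zero return value means at least one read observed a mix of both patterns. A zero result proves nothing, since a torn read may simply not have occurred during the run.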

The answer I got was: POSIX doesn't mandate atomicity for writes/reads except for pipes. The 16-byte atomicity that I saw was kernel-specific and may change in the future.

Details of the answer in the actual post: write(2)/read(2) atomicity between processes in linux

Kathrynkathryne answered 1/3, 2016 at 21:36 Comment(0)

I am familiar with the theory of filesystems in general, not with the implementation of ext4. Take this as an educated guess.

Yes, I believe single-sector reads and writes will be atomic, because:

  • The link you provided states: "Currently concurrent reads/writes are atomic only wrt individual pages, however are not on the system call."
  • Writes of a single disk sector (512 bytes) are atomic, according to Stephen Tweedie. In a private email conversation, he acknowledged that this guarantee is only as good as the hardware.
  • Ext filesystems overwrite data in place. There is no copy-on-write and no new allocation.
  • There is some effort to implement inline data, where the data of very small files fits in the inode itself. If you only need to store a few bytes, that may have an impact.

I'm not sure about a full page, but in full journaling mode it would make little sense to send less than a page to the journal before committing.

Irresistible answered 28/10, 2015 at 4:40 Comment(2)
Thanks for the answer. – Cecilius
FYI, most disks use 4 KiB sectors now. – Jaw
