write(2)/read(2) atomicity between processes in linux
I have a case where two processes act on the same file: one as a writer and one as a reader. The file is a one-line text file; the writer rewrites the line in a loop and the reader reads it back. The pseudocode looks like this:

Writer Process

char buf[2][18] = {
    "xxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyy"
};
int i = 0;
while (1) {
    pwrite(fd, buf[i], 18, 0);   /* rewrite the record at offset 0 */
    i = (i + 1) % 2;
}

Reader Process

char readbuf[18];
while (1) {
    pread(fd, readbuf, 18, 0);
    /* check whether readbuf matches buf[0] or buf[1] */
}

After running both processes for a while, I can see that readbuf sometimes contains xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx.

My understanding was that writes would be atomic for sizes up to 512 bytes, but from my experiment it looks like the atomicity holds only for 16 bytes.

The man pages don't say anything about atomicity for regular files; they only mention the 512-byte atomicity guarantee for pipes.

I have tried this with both tmpfs and ext4, and the results are the same. With O_SYNC, ext4 writes become atomic, which I understand because writes don't return until the data hits the disk, but O_SYNC doesn't help for tmpfs (/dev/shm).

Zaragoza answered 23/2, 2016 at 19:12 Comment(2)
Maybe the file you are using is not seekable; check for errors from pread/pwrite.Ludovick
The files are regular files on tmpfs or ext4, and the code does check for pwrite/pread errors. Actually the code opens, writes, and closes inside the loop, so the pwrite offset doesn't matter: open sets the offset to 0 anyway.Zaragoza
POSIX doesn't give any minimum guarantee of atomic operations for read and write except for writes on a pipe (where a write of up to PIPE_BUF (≥ 512) bytes is guaranteed to be atomic, but reads have no atomicity guarantee). The operation of read and write is described in terms of byte values; apart from pipes, a write operation offers no extra guarantees compared to a loop around single-byte write operations.

I'm not aware of any extra guarantee that Linux would give, neither 16 nor 512. In practice I'd expect it to depend on the kernel version, on the filesystem, and possibly on other factors such as the underlying block device, the number of CPUs, the CPU architecture, etc.

The O_SYNC, O_RSYNC and O_DSYNC guarantees (synchronized I/O data integrity completion, given for read and write in the optional SIO feature of POSIX) are not what you need. They guarantee that writes are committed to persistent storage before the read or write system call returns, but they make no claim about a write that starts while a read operation is in progress.

In your scenario, reading and writing files doesn't look like the right toolset.

  • If you need to transfer only small amounts of data, use pipes. Don't worry too much about copying: copying data in memory is very fast on the scale of most processing, or of a context switch. Plus Linux is pretty good at optimizing copies.
  • If you need to transfer large amounts of data, you should probably be using some form of memory mapping: either a shared memory segment if disk backing isn't required, or mmap if it is. This doesn't magically solve the atomicity problem, but is likely to improve the performance of a proper synchronization mechanism. To perform synchronization, there are two basic approaches:
    • The producer writes data to shared memory, then sends a notification to the consumer indicating exactly what data is available. The consumer only processes data upon request. The notification may use the same channel (e.g. mmap + msync) or a different channel (e.g. pipe).
    • The producer writes data to shared memory, then flushes the write (e.g. msync). Then the producer writes a well-known value to one machine word (a sig_atomic_t will typically work, even though its atomicity is formally guaranteed only for signals — or, in practice, a uintptr_t). The consumer reads that one machine word and only processes the corresponding data if this word has an acceptable value.
Infarction answered 24/2, 2016 at 0:54 Comment(3)
Thanks for the detailed description. This clarifies my query. Shared memory might not be a good option for me, as the readers and writers are completely asynchronous. I switched to a transactional write where the writer writes to a temp file and then rename(2)s it over the original, which is atomic. The reader then gets a consistent view of the data, which might not always be the latest, but that is acceptable.Zaragoza
One can also place a process shared mutex, condition variable and a message counter into the file mapping, so that the writer can notify (potentially multiple) readers in other threads and processes that the file has been updated and the readers can wait on it.Swat
On Linux, PIPE_BUF is 4096 bytes.Stolid
The PIPE_BUF atomicity requirement applies to pipes and FIFOs. POSIX gives regular files a different atomicity requirement, but the Linux kernel does not conform. The regular-file atomicity requirement appears in 2.9.7 Thread Interactions with Regular File Operations. Whenever a write() implementation returns some positive value N, that entire N-byte write shall be atomic. (A conforming write() implementation could choose to always return a value less than or equal to one, accepting just one byte at a time, in which case the atomicity has no practical benefit.)

While some have publicly argued that regular-file atomicity applies only to threads sharing a process, there's no precedent for POSIX writing "two threads" when it means "two threads of the same process". Moreover, the part about "shall also apply whenever a file descriptor is successfully closed, however caused (for example [...] process termination)" would be superfluous in a requirement isolated to threads of one process.

Octodecimo answered 27/1, 2022 at 8:6 Comment(0)
