Linux ext3 readdir and concurrent updates

We are receiving about 10,000 messages per hour. We store them as individual files in hourly directories on an ext3 filesystem. The file name includes a sequence number. We use rsync to mirror these files every 20 seconds to another location (via a SAN, but that doesn't matter).

Sometimes an rsync run picks up files n-3, n-2, n-1, n+1, and then the next rsync run continues with n, n+2, n+3, n+4 and so on.

Is it possible that, when one process creates files in a certain sequence within a directory, another process using readdir() sees the files appear in a different sequence?

Kind regards, Sebastian

Gruelling answered 28/5, 2010 at 11:13 Comment(1)
rsync sorts the file list lexicographically. What are the file names? Also, the sorting algorithm may differ from version to version. - Chemotaxis

I suppose your question can be restated as:

If process A creates file d/x and then creates file d/y, is it possible for process B to perform a concurrent readdir() on directory d and see an entry d/y, but not see an entry d/x?

The answer is Yes. The ordering guarantees for readdir are very weak indeed.

If you want to enforce an ordering, you will need to explicitly fsync() a file descriptor for the directory d itself after creating each file.
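As a rough illustration of that suggestion, here is a minimal Python sketch of a writer that fsyncs the directory after each file is created. The helper name `create_and_sync`, the `msg-NNNNNN` file names, and the temporary directory are assumptions made up for this example, not anything from the question. It only shows the mechanics of calling fsync on a directory descriptor; whether that yields the ordering seen by a concurrent reader rests on the claim made in this answer.

```python
import os
import tempfile


def create_and_sync(directory, name, payload):
    """Create directory/name, then fsync the file and the directory itself."""
    path = os.path.join(directory, name)
    # O_EXCL makes creation fail if the name already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        os.write(fd, payload)  # small payload; a real writer should loop on short writes
        os.fsync(fd)           # flush the file's own data
    finally:
        os.close(fd)
    # Open the directory itself and fsync it, as suggested above.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def demo_writer():
    """Create three hypothetical message files and return the sorted listing."""
    with tempfile.TemporaryDirectory() as d:
        for seq in range(3):
            create_and_sync(d, "msg-%06d" % seq, b"payload")
        return sorted(os.listdir(d))
```

Calling `demo_writer()` creates the files one by one, syncing the directory before the next file is created. Note that an fsync per file adds I/O cost, which may matter at thousands of files per hour.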

Ingrid answered 30/5, 2010 at 11:24 Comment(3)
Thanks for the answer. That's weak indeed, and I'm still inclined to call it a bug. Do you have any pointers to Linux documentation or source code? By the way, we have now disabled the ext3 dir_index option. Over the last month the issue occurred a few times nearly every day, but since the change it has not happened for three days in a row. - Gruelling
I'm still wondering whether and how fsync could prevent this from happening. - Gruelling
@Wangnick: The relevant documentation is the POSIX spec for readdir(), which simply says: "If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified." Note that it gives no ordering guarantees. If process A creates d/x, then calls fsync(d), then creates d/y, then calls fsync(d), and so on, then you should get the externally visible ordering you desire. - Ingrid
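Since the spec leaves this unspecified, a reader can also be written so that it does not depend on files appearing in creation order. The following is only a sketch under assumptions: `scan_new`, the `seen` set, and the `msg-NNNNNN` names are hypothetical, and the demo merely simulates the "n+1 shows up before n" case by creating the files in that order. It uses `os.scandir`, which Python documents as yielding entries in arbitrary order.

```python
import os
import tempfile


def scan_new(directory, seen):
    """Return names present now but not yet in `seen`, and record them."""
    with os.scandir(directory) as it:
        current = {entry.name for entry in it}
    fresh = sorted(current - seen)  # sort only for stable output
    seen.update(fresh)
    return fresh


def demo_reader():
    """Simulate a later sequence number becoming visible before an earlier one."""
    seen = set()
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "msg-000002"), "w").close()
        first = scan_new(d, seen)
        open(os.path.join(d, "msg-000001"), "w").close()
        second = scan_new(d, seen)
    return [first, second]
```

Because the reader tracks the set of names it has already handled rather than a highest sequence number, a file that becomes visible late is still picked up on the next scan.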
