Python: Multiple users append to the same file at the same time
I'm working on a Python script that will be accessed via the web, so there will be multiple users trying to append to the same file at the same time. My worry is that this might cause a race condition: if multiple users write to the same file at the same time, the file might get corrupted.

For example:

#!/usr/bin/env python

g = open("/somepath/somefile.txt", "a")
new_entry = "foobar"
g.write(new_entry)
g.close()

Will I have to use a lockfile for this, as this operation looks risky?

Aquitaine answered 7/8, 2012 at 20:23 Comment(3)
Maybe you can just use syslog?Wing
If you are on Linux or other Unix mkfifo may be an interesting option. mkfifo creates a FIFO special file. Anyone can write to the file at random, then one single process reads out of the FIFO. That way you don't need to use file locking.Veroniqueverras
If you open with O_APPEND, the target filesystem is POSIX-compliant, and your writes are all short enough to be accomplished in a single syscall, there will be no corruption in the first place.Buiron

You can use file locking:

import fcntl
new_entry = "foobar"
with open("/somepath/somefile.txt", "a") as g:
    fcntl.flock(g, fcntl.LOCK_EX)
    g.write(new_entry)
    fcntl.flock(g, fcntl.LOCK_UN)

Note that on some systems, locking is not needed if you're only writing small buffers, because appends on these systems are atomic.
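For robustness, the same idea can be wrapped so the lock is always released even if the write fails, and the stream is flushed before unlocking so buffered data reaches the file while the lock is still held. A sketch, assuming a POSIX system; the path and function name are just examples:

```python
import fcntl

def append_line(path, line):
    """Append one line under an exclusive flock (POSIX-only sketch)."""
    with open(path, "a") as g:
        fcntl.flock(g, fcntl.LOCK_EX)   # blocks until any other holder releases
        try:
            g.write(line + "\n")
            g.flush()                   # drain Python's buffer before unlocking
        finally:
            fcntl.flock(g, fcntl.LOCK_UN)

append_line("/tmp/append_demo.txt", "foobar")   # example path
```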

Legibility answered 7/8, 2012 at 20:27 Comment(13)
Nice answer, but why would you need to do a g.seek(0,2) here to go to EOF? Won't append just add to the end of the file?Aquitaine
Oh, you're right. At least on Linux, it's not required (I imagined an OS that implements the a mode by initially seeking to EOF). I was also playing with the idea of opening the file with another mode but a, but that's apparently not possible in Python.Legibility
What would happen if a user tried to append to the file but the file was locked by flock? Error?Aquitaine
@RayY No, the process (or more precisely, the current thread) just blocks until the lock is released. For more information, refer to man 2 flockLegibility
@Legibility So if it's blocked, do you need adjust your code so it keeps retrying to append until the lock is released?Armorial
@Armorial No, blocked means that flock will not return until it has acquired a lock.Legibility
fcntl is for Unix only, does not exist on Windows. Any suggestions for Windows?Poliomyelitis
@AnatolyAlekseev See this answer for the Windows equivalent.Legibility
Can this happen: processes A and B both open the file in append mode. A gets the lock, writes, then unlocks. B now gets the lock and writes over what A wrote, since when open was called the end of the file was where A has since written. I am trying to write to a file with 175 processes, and some lines are messed up, even though they print correctly in the individual log files belonging to each process, so I was wondering if this might be the issue. Do I need to call seek or something to move to the end of the file for B? (I'm on Debian 10, Python 3.7.5)Benito
@Benito No, append really appends all the time. Are you sure you are locking everywhere you write? You can check with strace. Another potential culprit could be line buffering. Assuming you write out lines at once, you might want to turn that off.Legibility
Thank you, I tried with the -u flag to disable buffering, but that didn't help. Could you please point me to a reference on how to use strace on this?Benito
@Benito -u only affects stdin, stdout, stderr. You want buffering=0 in the open call, although I believe it also should work by default. You can pick one (or all) of your 175 processes and prefix the command line to it with strace -ff -o log to see what they are doing. There is no specific reference for using strace for your problem; strace is a generic tool.Legibility
Thank you very much for your help! I could not use buffering=0 since I was not opening the file in binary mode, but using flush on the file descriptor before releasing the lock resolved the issue.Benito
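As the comments note, a plain LOCK_EX simply blocks until the lock is free. If blocking is not acceptable (e.g. a web request that should fail fast), flock can be asked not to wait. A sketch, assuming POSIX; the path is an example:

```python
import fcntl

# With LOCK_NB, flock raises BlockingIOError if another process holds the
# lock, instead of blocking until it is released.
with open("/tmp/locknb_demo.txt", "a") as g:   # example path
    try:
        fcntl.flock(g, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        print("file is busy, try again later")
    else:
        try:
            g.write("foobar\n")
            g.flush()           # flush before releasing, per the thread above
        finally:
            fcntl.flock(g, fcntl.LOCK_UN)
```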

If you are doing this operation on Linux and each individual write is smaller than 4 KB, the write is atomic in practice and you should be good.

More to read here: Is file append atomic in UNIX?
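The linked discussion boils down to this: when a file is opened with O_APPEND, the kernel's seek-to-EOF and write happen as one atomic step, so short appends from separate processes don't interleave. A minimal sketch using the low-level os interface to avoid Python-level buffering (the path is an example; assumes a local POSIX filesystem):

```python
import os

line = b"foobar\n"
# O_APPEND: the seek to EOF and the write are one atomic kernel operation,
# so concurrent short appends land as contiguous chunks.
fd = os.open("/tmp/atomic_append_demo.txt",
             os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
try:
    os.write(fd, line)   # a single write() syscall, no Python-level buffer
finally:
    os.close(fd)
```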

Custody answered 9/11, 2017 at 23:11 Comment(0)

You didn't state what platform you use, but here is a cross-platform module you can use: File locking in Python

Trampoline answered 7/8, 2012 at 20:26 Comment(2)
This link is dead.Midgard
No it's not, it's just resting.Trampoline

Depending on your platform and where the file lives, this may not be doable safely (e.g. on NFS, where file locking can be unreliable). Perhaps you can write to different files and merge the results afterwards?

Gustaf answered 7/8, 2012 at 21:00 Comment(0)
