Locking a file in Python

15

215

I need to lock a file for writing in Python. It will be accessed from multiple Python processes at once. I have found some solutions online, but most fail for my purposes, as they are often Unix-only or Windows-only.

Extract answered 28/1, 2009 at 23:20 Comment(0)
172

Alright, so I ended up going with the code I wrote here, on my website link is dead, view on archive.org (also available on GitHub). I can use it in the following fashion:

from filelock import FileLock

with FileLock("myfile.txt.lock"):
    # work with the file as it is now locked
    print("Lock acquired.")
Extract answered 31/1, 2009 at 8:30 Comment(16)
I implemented something similar, and just found that it broke in some configurations since NFSv2 does not support atomic exclusive file creation.Swam
As noted by a comment at the blog post, this solution isn't "perfect", in that it's possible for the program to terminate in such a way that the lock is left in place and you have to manually delete the lock before the file becomes accessible again. However, that aside, this is still a good solution.Chesterfieldian
is it the same as pythonhosted.org/lockfile/lockfile.html ? it seems to be compatible, but I am not sure... If it's the same, the from filelock import FileLock doesn't work, use from lockfile import FileLock instead...Precaution
@Precaution It looks like the link you submitted might have a different author (Skip Montanaro) than the author of this answer (Evan Fosmark). I don't know if it could still be the same code passed off to a different maintainer.Fabricate
Yet another improved version of Evan's FileLock can be found here: github.com/ilastik/lazyflow/blob/master/lazyflow/utility/…Define
Using 2 threads and heavy load (just continuous iteration: saving consecutive numbers to file), both the original FileLock and @superbatfish improved version cause, after some time, under Windows: fd = os.open( self.lockfile, os.O_CREAT | os.O_EXCL | os.O_RDWR ): OSError: [Errno 13] Permission denied: 'counter_test.txt.lock'. Minimal test: cdn.anonfiles.com/1403402648742.pySent
I dislike this solution because filelock leaves stale lock files behind which are not cleaned up at the next attempt.Bronchi
OpenStack did publish their own (well, Skip Montanaro's) implementation - pylockfile - Very similar to the ones mentioned in previous comments, but still worth taking a look.Bacciferous
@Bacciferous Openstacks pylockfile is now deprecated. It is advised to use fasteners or oslo.concurrency instead.Unerring
Another similar implementation I guess: github.com/benediktschmitt/py-filelockPsychotechnics
This implementation is incorrect, or at least it is outdated now. From the documentation: Don't use a FileLock to lock the file you want to write to, instead create a separate .lock file Source: pypi.org/project/filelockWarren
@Unerring The fasteners library has no documentation whatsoever and therefore this is a bad recommendation.Keijo
This clutters the file system with millions of lock files. Am I doing something wrong?Friedlander
The comments make it very confusing whether this is still the recommended package to use or whether we should be using something newer. An update to the answer would be really nice.Frederico
How can we install it via pip3 install filelock ?Shin
@Evan Fosmark is it possible to use FileLock as a reader writer lock ? i want to be able to read a file from multiple processes in parallel but only one process can write to the file at any given timeIneluctable
55

The other solutions rely on a lot of external code bases. If you would prefer to do it yourself, here is some code for a cross-platform solution that uses the respective file-locking tools on POSIX and Windows systems.

try:
    # Posix based file locking (Linux, Ubuntu, MacOS, etc.)
    #   Only allows locking on writable files, might cause
    #   strange results for reading.
    import fcntl, os
    def lock_file(f):
        if f.writable(): fcntl.lockf(f, fcntl.LOCK_EX)
    def unlock_file(f):
        if f.writable(): fcntl.lockf(f, fcntl.LOCK_UN)
except ModuleNotFoundError:
    # Windows file locking
    import msvcrt, os
    def file_size(f):
        return os.path.getsize(os.path.realpath(f.name))
    def lock_file(f):
        msvcrt.locking(f.fileno(), msvcrt.LK_RLCK, file_size(f))
    def unlock_file(f):
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, file_size(f))


# Class for ensuring that all file operations are atomic, treat
# initialization like a standard call to 'open' that happens to be atomic.
# This file opener *must* be used in a "with" block.
class AtomicOpen:
    # Open the file with arguments provided by user. Then acquire
    # a lock on that file object (WARNING: Advisory locking).
    def __init__(self, path, *args, **kwargs):
        # Open the file and acquire a lock on the file before operating
        self.file = open(path,*args, **kwargs)
        # Lock the opened file
        lock_file(self.file)

    # Return the opened file object (knowing a lock has been obtained).
    def __enter__(self, *args, **kwargs): return self.file

    # Unlock the file and close the file object.
    def __exit__(self, exc_type=None, exc_value=None, traceback=None):
        # Flush to make sure all buffered contents are written to file.
        self.file.flush()
        os.fsync(self.file.fileno())
        # Release the lock on the file.
        unlock_file(self.file)
        self.file.close()
        # Returning False propagates any exception raised inside the
        # "with" block to the user; returning True would suppress it.
        return exc_type is None

Now, AtomicOpen can be used in a with block where one would normally use an open statement.

WARNINGS:

  • If running on Windows and Python crashes before exit is called, I'm not sure what the lock behavior would be.
  • The locking provided here is advisory, not absolute. All potentially competing processes must use the "AtomicOpen" class.
  • As of Nov 9th, 2020, this code only locks writable files on POSIX systems. At some point after the posting and before this date, calling fcntl.lockf with LOCK_EX on a file opened read-only began raising OSError.
Levorotation answered 25/9, 2017 at 14:12 Comment(10)
unlock_file file on linux should not call fcntl again with the LOCK_UN flag?Drown
The unlock automatically happens when the file object is closed. However, it was bad programming practice of me not to include it. I've updated the code and added the fcntl unlock operation!Levorotation
In __exit__ you close outside of the lock after unlock_file. I believe the runtime could flush (i.e., write) data during close. I believe one must flush and fsync under the lock to make sure no additional data is written outside the lock during close.Quanta
Thanks for the correction! I verified that there is the possibility for a race condition without the flush and fsync. I've added the two lines you suggested before calling unlock. I re-tested and the race condition appears to be resolved.Levorotation
Even if both processes use this, isn't it still possible for things to go wrong if, say, process 2 happens to call self.file = open('somefile.txt', 'w') between process 1's calls to self.file = open( 'somefile.txt', 'r') and lock_file(self.file) ?Ear
The only thing that will go "wrong" is that by the time process 1 locks the file its contents will be truncated (contents erased). You can test this yourself by adding another file "open" with a "w" to the code above before the lock. This is unavoidable though, because you must open the file before locking it. To clarify, the "atomic" is in the sense that only legitimate file contents will be found in a file. This means that you will never get a file with contents from multiple competing processes mixed together.Levorotation
According to what I have found via Google, msvcrt.LK_RLCK is a read lock, so same as fcntl.LOCK_SH, while msvcrt.LK_LOCK is the excusive one which is similar to fcntl.LOCK_EX. So you probably must use msvcrt.LK_LOCK under Windows. However the documentation sucks as usual and I have no Windows to verify, sorry.Instrumentalist
So this is a harder problem than expected to address on Windows. The file is locked by a size, but if contents are written then the size changes (and the current code fails on unlock, because the size is wrong). It also appears like the locks are ignored entirely on Windows. I'll have to investigate further, but it will take time. I'll post a solution when I figure it out.Levorotation
On Linux it seems, that opening a file for reading throws OSError. As fcntl documentation says: "On at least some systems, LOCK_EX can only be used if the file descriptor refers to a file opened for writing."Contortion
@DeadlyPointer thanks for the catch. I've edited the locking code for systems with fcntl to only lock writeable files. This appears to be working. However, I haven't tested what happens when files that are being read are modified by another process. I'm not sure how an OS will handle that issue, if anyone wants to contribute their insight or test cases they would be welcome!Levorotation
44

There is a cross-platform file locking module here: Portalocker

Although as Kevin says, writing to a file from multiple processes at once is something you want to avoid if at all possible.

If you can shoehorn your problem into a database, you could use SQLite. It supports concurrent access and handles its own locking.
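To illustrate the SQLite route, here is a minimal sketch using the stdlib sqlite3 module (the database path and table name are invented for the example). Two connections stand in for two processes; SQLite serializes the writes itself, with `timeout` controlling how long a connection waits if the database is locked:

```python
import os
import sqlite3
import tempfile

# Hypothetical shared database file; any path visible to all processes works.
db = os.path.join(tempfile.mkdtemp(), "shared.db")

conn1 = sqlite3.connect(db, timeout=10)
conn1.execute("CREATE TABLE IF NOT EXISTS log (msg TEXT)")
conn1.commit()

conn2 = sqlite3.connect(db, timeout=10)
conn1.execute("INSERT INTO log VALUES ('from writer 1')")
conn1.commit()
conn2.execute("INSERT INTO log VALUES ('from writer 2')")
conn2.commit()

count = conn2.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(count)  # 2: both writes landed, with no manual lock files involved
conn1.close()
conn2.close()
```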

Barimah answered 29/1, 2009 at 1:1 Comment(8)
+1 -- SQLite is almost always the way to go in these kinds of situations.Clamber
Portalocker requires Python Extensions for Windows, on that.Pika
@naxa there is a variant of it which relies only on msvcrt and ctypes, see roundup.hg.sourceforge.net/hgweb/roundup/roundup/file/tip/…Stallworth
@Pika Portalocker has just been updated so it doesn't require any extensions on Windows anymore :)Pains
SQLite supports concurrent access?Trioxide
I ended up here because I need a lockfile to repair my SQLite database, which was corrupted by concurrency. I think it's time to burn my computer and become a monkEpidote
Portalocker does not seem work well with nfs mounted files.Triphylite
How can we merge SQLite into Python ?Shin
26

I have been looking at several solutions to do that and my choice has been oslo.concurrency

It's powerful and relatively well documented. It's based on fasteners.

Other solutions:

Sedimentation answered 6/12, 2015 at 23:9 Comment(6)
re: Portalocker, you can now install pywin32 through pip via the pypiwin32 package.Disposure
there is also filelock (Last released: May 18, 2019 at the time of the comment)Mew
@Mew isn't filelock package same as the accepted answer, where: pip3 install filelockShin
@alper: same module names, different packages (look at the github)Mew
@Mew Which one is recommend to use? The filelock you linked hase more Star and forks than the accepted answer's packageShin
@Shin pip installs the module I've linked. So it can be considered the default choice unless you have some spry requirements that are better served by some other module.Mew
18

I prefer lockfile — Platform-independent file locking

Instrumentalism answered 27/7, 2010 at 13:4 Comment(8)
This library seems well written, but there's no mechanism for detecting stale lock files. It tracks the PID that created the lock, so should be possible to tell if that process is still running.Deci
@sherbang: what about remove_existing_pidfile?Digit
@JanusTroelsen the pidlockfile module doesn't acquire locks atomically.Deci
@Deci Are you sure? It opens the lock file with mode O_CREAT|O_EXCL.Block
@rgove You're correct, and I misspoke. Yes, it obtains locks atomically. What I should have said was that it doesn't allow for an atomic way to deal with stale locks. Although, I can't recall now if there is a way to handle that atomically.Deci
Is it possible to get away with the stale locks?Bronchi
According to the documentation that packages is deprecated.Autocratic
Please note that this library has been superceded and is part of github.com/harlowja/fastenersUndamped
16

Locking is platform and device specific, but generally, you have a few options:

  1. Use flock(), or an equivalent (if your OS supports it). This is advisory locking: unless every process checks for the lock, it's ignored.
  2. Use a lock-copy-move-unlock methodology: you copy the file, write the new data, then move it into place (move, not copy; a move is an atomic operation on Linux, but check your OS), and you check for the existence of the lock file.
  3. Use a directory as a "lock". This is necessary if you're writing to NFS, since NFS doesn't support flock().
  4. There's also the possibility of using shared memory between the processes, but I've never tried that; it's very OS-specific.

For all these methods, you'll have to use a spin-lock (retry-after-failure) technique for acquiring and testing the lock. This does leave a small window for mis-synchronization, but it's generally small enough to not be a major issue.
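As a sketch of that spin-lock pattern (the file names and timeouts below are invented for the example): atomic exclusive creation of a lock file via os.O_CREAT | os.O_EXCL is the usual test-and-set step, though as noted elsewhere on this page it is not reliable on NFSv2:

```python
import os
import time

def acquire(lock_path, timeout=10.0, poll=0.05):
    """Spin until the lock file is created atomically, or time out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists;
            # the create-or-fail step is the atomic test-and-set.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError("could not acquire " + lock_path)
            time.sleep(poll)  # retry after failure (the spin)

def release(lock_path):
    os.remove(lock_path)

# Usage sketch: the lock file guards access to the real file.
acquire("data.txt.lock")
try:
    with open("data.txt", "a") as f:
        f.write("exclusive write\n")
finally:
    release("data.txt.lock")
```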

If you're looking for a solution that is cross platform, then you're better off logging to another system via some other mechanism (the next best thing is the NFS technique above).

Note that sqlite is subject to the same constraints over NFS that normal files are, so you can't write to an sqlite database on a network share and get synchronization for free.
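The move step of method 2 can be sketched with os.replace, which atomically swaps the target into place and overwrites an existing target on all platforms (per the comments on this answer, renames have been atomic on Win32 since Python 3.3). The function and file names below are invented for the example:

```python
import json
import os
import tempfile

def atomic_write_json(path, data):
    """Write to a temp file in the target's directory, then atomically
    replace the target, so readers see old contents or new, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ensure the bytes are on disk before the swap
        os.replace(tmp, path)     # the atomic move
    except BaseException:
        os.remove(tmp)
        raise

atomic_write_json("settings.json", {"retries": 3})
print(open("settings.json").read())  # {"retries": 3}
```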

Thayne answered 29/1, 2009 at 8:46 Comment(2)
Note: Move/Rename is not atomic in Win32. Reference: #167914Deci
New note: os.rename is now atomic in Win32 since Python 3.3: bugs.python.org/issue8828Doublefaced
10

Here's an example of how to use the filelock library, which is similar to Evan Fosmark's implementation:

from filelock import FileLock

lockfile = r"c:\scr.txt"
lock = FileLock(lockfile + ".lock")
with lock:
    file = open(lockfile, "w")
    file.write("123")
    file.close()

Any code within the with lock: block is thread-safe, meaning that it will be finished before another thread has access to the file.

Warren answered 17/1, 2020 at 21:54 Comment(3)
This isn't so much adding on to Evan's answer as it is completely orthogonal to it, though you may not have realised that yourself! Somewhat confusingly, the filelock module on PyPI that you link to in your answer, which like Evan's module exposes a FileLock class, is totally unrelated to Evan's work. You can see on GitHub that Evan's code at github.com/dmfrey/FileLock/blob/master/filelock/filelock.py has no shared code or ancestry with the code at github.com/tox-dev/py-filelock/tree/main/src/filelock, which is what you're using here.Salpingectomy
Oh wow, I didn't even notice. Thanks for pointing that out, I updated my post accordingly :)Warren
This is thread safe but not process safe.Parnassus
7

Coordinating access to a single file at the OS level is fraught with all kinds of issues that you probably don't want to solve.

Your best bet is to have a separate process that coordinates read/write access to that file.

Faugh answered 29/1, 2009 at 0:24 Comment(4)
"separate process that coordinates read/write access to that file" - in other words, implement a database server :-)Unclad
This is actually the best answer. To just say "use a database server" is overly simplified, as a db is not always going to be the right tool for the job. What if it needs to be a plain text file? A good solution might be to spawn a child process and then access it via a named pipe, unix socket, or shared memory.Klapp
-1 because this is just FUD without explanation. Locking a file for writing seems like a pretty straightforward concept to me that OSes offer up with functions like flock for it. An approach of "roll your own mutexes and a daemon process to manage them" seems like a rather extreme and complicated approach to take to solve... a problem you haven't actually told us about, but just scarily suggested exists.Salpingectomy
-1 for the reasons given by @Mark Amery, as well as for offering an unsubstantiated opinion about which issues the OP wants to solveStatolith
3

Locking a file is usually a platform-specific operation, so you may need to allow for the possibility of running on different operating systems. For example:

import os

def my_lock(f):
    if os.name == "posix":
        # Unix or OS X specific locking here
        ...
    elif os.name == "nt":
        # Windows specific locking here
        ...
    else:
        print("Unknown operating system, lock unavailable")
Millennial answered 28/1, 2009 at 23:45 Comment(1)
You may already know this, but the platform module is also available to obtain information on the running platform. platform.system(). docs.python.org/library/platform.html.Blastoderm
2

I have been working in a situation like this, where I run multiple copies of the same program from within the same directory/folder and log errors. My approach was to write a "lock file" to the disk before opening the log file. The program checks for the presence of the "lock file" before proceeding, and waits for its turn if the "lock file" exists.

Here is the code:

from datetime import datetime
from os import remove, stat
from os.path import exists
from time import time

def errlogger(error):

    while True:
        if not exists('errloglock'):
            lock = open('errloglock', 'w')
            if exists('errorlog'): log = open('errorlog', 'a')
            else: log = open('errorlog', 'w')
            log.write(str(datetime.utcnow())[0:-7] + ' ' + error + '\n')
            log.close()
            remove('errloglock')
            return
        else:
            check = stat('errloglock')
            if time() - check.st_ctime > 0.01: remove('errloglock')
            print('waiting my turn')

EDIT--- After thinking over some of the comments about stale locks above, I edited the code to add a check for staleness of the "lock file." Timing several thousand iterations of this function on my system gave an average of 0.002066... seconds from just before:

lock = open('errloglock', 'w')

to just after:

remove('errloglock')

so I figured I will start with 5 times that amount to indicate staleness and monitor the situation for problems.

Also, as I was working with the timing, I realized that I had a bit of code that was not really necessary:

lock.close()

which I had immediately following the open statement, so I have removed it in this edit.

Plosion answered 7/8, 2014 at 1:1 Comment(2)
This won't work always, since some other program can get between if not exists('errloglock') and lock = open('errloglock', 'w').Denominate
@RokJaklič You are right if someone writes a program to access the same log file without checking for the lock file. You will notice that I qualified the application to "multiple copies of the same program."Plosion
2

This worked for me: do not keep everything in one large file; split the data across several small ones. You write a temporary file, delete file A, and then rename the temporary file to A.

import json
import os
import time

def Server():
    i = 0
    while i == 0:
        try:
            with open(File_Temp, "w") as file:
                json.dump(DATA, file, indent=2)
            if os.path.exists(File_A):
                os.remove(File_A)
            os.rename(File_Temp, File_A)
            i = 1
        except OSError as e:
            print("file locked: ", str(e))
            time.sleep(1)


def Clients():
    i = 0
    while i == 0:
        try:
            if os.path.exists(File_A):
                with open(File_A, "r") as file:
                    DATA_Temp = file.read()
            DATA = json.loads(DATA_Temp)
            i = 1
        except OSError as e:
            print(str(e))
            time.sleep(1)
Unwarranted answered 15/2, 2021 at 10:46 Comment(0)
1

The scenario is this: the user requests a file to do something with it. If the user sends the same request again, the user is informed that the second request will not be performed until the first one finishes. That's why I use a lock mechanism to handle this issue.

Here is my working code:

from lockfile import LockFile

def try_lock(lock_file_path):
    lock = LockFile(lock_file_path)
    if not lock.is_locked():
        lock.acquire()
        status = lock.path + ' is locked.'
    else:
        status = lock.path + " is already locked."
    print(status)
    return status
Quoit answered 28/1, 2009 at 23:20 Comment(0)
0

I found a simple and working(!) implementation in grizzled-python.

Simply using os.open(..., O_EXCL) + os.close() didn't work on Windows.

Weighted answered 19/8, 2013 at 15:22 Comment(1)
O_EXCL option is not related to lockDorwin
0

You may find pylocker very useful. It can be used to lock a file or for locking mechanisms in general and can be accessed from multiple Python processes at once.

If you simply want to lock a file here's how it works:

import uuid
from pylocker import Locker

# create a unique lock pass. This can be any string.
lpass = str(uuid.uuid1())

# create locker instance.
FL = Locker(filePath='myfile.txt', lockPass=lpass, mode='w')

# acquire the lock
with FL as r:
    # get the result
    acquired, code, fd = r

    # check if acquired.
    if fd is not None:
        print(fd)
        fd.write("I have successfully acquired the lock!")

# no need to release anything or to close the file descriptor,
# the with statement takes care of that. let's print fd and verify that.
print(fd)
Mandamandaean answered 26/9, 2016 at 16:41 Comment(0)
-1

If you just need Mac/POSIX, this should work without external packages.

import sys
import stat
import os


filePath = "<PATH TO FILE>"
if sys.platform == 'darwin':
    flags = os.stat(filePath).st_flags
    if not flags & stat.UF_IMMUTABLE:
        os.chflags(filePath, flags | stat.UF_IMMUTABLE)

and if you want to unlock the file, just change it to:

    if flags & stat.UF_IMMUTABLE:
        os.chflags(filePath, flags & ~stat.UF_IMMUTABLE)
Getupandgo answered 7/11, 2022 at 18:3 Comment(2)
This makes the file immutable, i.e. makes the file unwritable. The question (which mentions multiple processes and writing) is clearly asking about concurrent access with cooperative exclusive locking.Backgammon
Hey @neverpanic, yes I see this. I was probably focused on my solution and not the question.Getupandgo
