Atomic `ln -sf` in Python (symlink overwriting existing file)
I want to create a symlink, overwriting an existing file or symlink if needed.

I've discovered that os.path.exists only returns True for non-broken symlinks, so I'm guessing that any test must also include os.path.lexists.
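
To illustrate the difference, here is a minimal sketch (assuming a POSIX system where unprivileged symlink creation is allowed; `link_state` is a hypothetical helper name, not part of the standard library):

```python
import os
import tempfile


def link_state(path):
    """Return (exists, lexists, islink) for path."""
    return os.path.exists(path), os.path.lexists(path), os.path.islink(path)


def demo():
    """Create a broken symlink in a scratch directory and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        broken = os.path.join(tmp, "broken_link")
        # The target is never created, so the link is dangling.
        os.symlink(os.path.join(tmp, "no_such_target"), broken)
        return link_state(broken)


if __name__ == "__main__":
    # exists() follows the link and finds nothing; lexists() sees the link itself.
    print(demo())
```

So a check based on os.path.exists alone would miss a dangling symlink that still occupies the name.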

What is the most atomic way to implement ln -sf in Python? (I.e., preventing a file from being created by another process between deletion and symlink creation.)


Differentiation: This question doesn't specify the atomic requirement

Rundlet answered 18/4, 2019 at 6:40 Comment(3)
If you prepare ln -s file tmplink, then mv tmplink link is atomic. - Jointress
@Jointress thanks for the suggestion. I still see a security hole, but I hope I got it as good as possible in my answer. - Rundlet
FWIW, ln -sf itself is not actually atomic. GNU Coreutils internally implements the solution that @Jointress proposed (and that the currently-accepted answer implements); FreeBSD and Busybox simply delete the destination file before linking. So "par" is actually pretty easy to clear for this. - Aldwin
This code tries to minimise the possibilities for race conditions:

import os
import tempfile

def symlink_force(target, link_name):
    '''
    Create a symbolic link link_name pointing to target.
    Overwrites link_name if it exists.
    '''

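    # Create the temporary link in the same directory as link_name:
    # os.replace() may fail if files are on different filesystems
    link_dir = os.path.dirname(link_name)

    # os.symlink() raises FileExistsError if the path already exists, so
    # retry with a fresh name if another process takes it after mktemp()
    while True:
        temp_link_name = tempfile.mktemp(dir=link_dir)
        try:
            os.symlink(target, temp_link_name)
            break
        except FileExistsError:
            pass
    # Rename the temporary link over link_name in one step
    # (on success, the rename is atomic on POSIX)
    try:
        os.replace(temp_link_name, link_name)
    except OSError:  # e.g. permission denied
        # Do not leave the temporary link behind
        os.remove(temp_link_name)
        raise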

Note:

  1. If the function is interrupted (e.g. the computer crashes), a stray, randomly named link to the target might be left behind.

  2. An unlikely race condition still remains: the symlink created at the randomly-named temp_link_name could be modified by another process before replacing link_name.

I raised a Python issue to highlight the problems caused by os.symlink() requiring that the link path not already exist.

Credit to Robert Seimer's input.

Rundlet answered 18/4, 2019 at 8:2 Comment(5)
I raised an issue for the security hole mentioned above. - Rundlet
You could handle the (unlikely) race by putting the mktemp+symlink calls inside a loop, retrying until you win the race. Obviously you would need to check that symlink failed because of EEXIST and not for some other reason that's never going to let it succeed. And for efficiency you'd probably want to hoist the embedded dirname call above the loop. - Possie
@Possie Thanks, your suggestion helps. But how to prevent the mktemp-named file being changed after creation and before replacing the symlink? - Rundlet
You can't prevent that. Any badly-behaved process that has write access to this directory can destroy the whole arrangement. It's just as possible that something could replace or delete the new symlink after the rename, as it is that something could replace or delete it before the rename. But if you're worried about concurrent mktemps accidentally colliding on the same name then you could mkdtemp a directory, create the new symlink in that directory, then rename the symlink over the original one (and then rmdir the temp directory). A rename within a filesystem is atomic. - Possie
I am somewhat astounded that this didn't come OOTB. - Aldwin
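
The mkdtemp variant that Possie describes in the comments could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `symlink_force_tmpdir` and the fixed inner name "link" are made up here, and the temporary directory is created next to link_name so that the rename stays within one filesystem.

```python
import os
import tempfile


def symlink_force_tmpdir(target, link_name):
    """Create or replace link_name with a symlink pointing to target.

    Hypothetical variant of symlink_force: the temporary link lives in a
    private directory made by tempfile.mkdtemp(), so concurrent callers
    cannot collide on the temporary name.
    """
    # Stay in the same directory as the final link so os.replace() does not
    # cross filesystems; use "." when link_name has no directory part.
    link_dir = os.path.dirname(link_name) or "."
    # mkdtemp() creates the directory itself (accessible only to the
    # creating user), so no retry loop is needed.
    temp_dir = tempfile.mkdtemp(dir=link_dir)
    try:
        temp_link = os.path.join(temp_dir, "link")
        os.symlink(target, temp_link)
        try:
            # On success the rename is atomic on POSIX.
            os.replace(temp_link, link_name)
        except OSError:  # e.g. link_name is a directory
            os.remove(temp_link)
            raise
    finally:
        # The directory is empty in every path that reaches this point.
        os.rmdir(temp_dir)
```

Calling `symlink_force_tmpdir("releases/v2", "current")` twice with different targets should leave a single `current` link pointing at the last target. As the comment above notes, this still cannot protect against a process with write access to the directory that replaces the link after the rename.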
