How to guarantee atomic move or exception of a file in Java?

In one of my projects I have concurrent write access to a single file within one JRE. I want to handle this by first writing to a temporary file and then moving that temp file onto the target using an atomic move. I don't care about the order of the writes; all I need to guarantee is that at any given time the single file is usable. I'm already aware of Files.move and such; my problem is that I looked at at least one implementation of that method and it raised doubts about whether implementations really guarantee atomic moves. Please look at the following code:

Files.move on GrepCode for OpenJDK

    FileSystemProvider provider = provider(source);
    if (provider(target) == provider) {
        // same provider
        provider.move(source, target, options);
    } else {
        // different providers
        CopyMoveHelper.moveToForeignTarget(source, target, options);
    }

The problem is that the ATOMIC_MOVE option is not considered here at all; whether source and target use the same provider is the only thing that matters in the first place. That's neither what I want nor how I understand the documentation:

If the move cannot be performed as an atomic file system operation then AtomicMoveNotSupportedException is thrown. This can arise, for example, when the target location is on a different FileStore and would require that the file be copied, or target location is associated with a different provider to this object.

The above code clearly violates that documentation, because it falls back to a copy-and-delete strategy without recognizing ATOMIC_MOVE at all. An exception would be perfectly OK in my case, because then a hoster of our service could change his setup to use only one file system that supports atomic moves, which is what we expect in the system requirements anyway. What I don't want to deal with is things failing silently just because an implementation uses a copy-and-delete strategy, which may result in data corruption in the target file. So, from my understanding, it is simply not safe to rely on Files.move for atomic operations, because it doesn't always fail when they are not supported; implementations may fall back to a copy-and-delete strategy instead.

Is such behaviour a bug in the implementation that needs to be filed, or does the documentation allow it and I'm understanding it wrong? Does it make any difference that I now already know such possibly broken implementations are out there? I would need to synchronize the write access on my own in that case...
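
For reference, here is the pattern boiled down to a minimal sketch of what I'm trying to do (class, method and path names are only placeholders):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class AtomicPublish {

    // Write the new content to a temp file in the SAME directory as the target and
    // move it onto the target atomically; keeping both on one FileStore is what
    // makes an atomic rename possible on most file systems.
    static void publish(Path target, byte[] content) throws IOException {
        Path tmp = Files.createTempFile(target.getParent(), target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content);
            Files.move(tmp, target,
                    StandardCopyOption.ATOMIC_MOVE,
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {       // includes AtomicMoveNotSupportedException
            Files.deleteIfExists(tmp);  // fail loudly, but don't leave the temp file behind
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get("data", "state.json");  // placeholder path
        Files.createDirectories(target.getParent());
        publish(target, "{\"ok\":true}".getBytes(StandardCharsets.UTF_8));
    }
}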

Bentonbentonite answered 11/8, 2014 at 7:54 Comment(0)

You are looking at the wrong place. When the file system providers are not the same, the operation is delegated to moveToForeignTarget, as you have seen in the code snippet you've posted. The method moveToForeignTarget, however, uses the method convertMoveToCopyOptions (note the speaking name…) to get the necessary copy options for the translated operation. And convertMoveToCopyOptions will throw an AtomicMoveNotSupportedException if it encounters the ATOMIC_MOVE option, as there is no way to convert that move option into a valid copy option.

So there’s no reason to worry, and in general it’s recommended to avoid hasty conclusions drawn from fewer than ten lines of code (especially without having tried a single test)…
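
If in doubt, such an experiment is quick to set up. Here is a small sketch of my own (using a zip file system simply because its provider differs from the default one), which is expected to end with AtomicMoveNotSupportedException:

import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collections;

public class CrossProviderMoveDemo {
    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempDirectory("demo").resolve("demo.zip");

        // A zip file system has its own FileSystemProvider, different from the default one.
        try (FileSystem zipFs = FileSystems.newFileSystem(
                URI.create("jar:" + zip.toUri()), Collections.singletonMap("create", "true"))) {
            Path source = zipFs.getPath("/data.txt");
            Files.write(source, "hello".getBytes(StandardCharsets.UTF_8));

            Path target = Files.createTempFile("demo", ".txt");
            try {
                // Different providers: Files.move delegates to the copy helper,
                // which refuses to translate ATOMIC_MOVE into a copy option.
                Files.move(source, target,
                        StandardCopyOption.ATOMIC_MOVE,
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (AtomicMoveNotSupportedException e) {
                System.out.println("As documented: " + e);
            }
        }
    }
}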

Hearts answered 11/8, 2014 at 10:10 Comment(2)
Tests of course guarantee nothing in general, and I surely did not jump to hasty conclusions, but I must admit that I didn't spot the call to "convertMoveToCopyOptions" even though I looked at the implementation more than once. So thanks for that hint! – Nicker
@Thorsten Schöning: tests are not the last word, but a single test using two different file systems would disprove the assumption that no exception is ever thrown and would reveal the location where it's being thrown. That's why bug reports are usually accompanied by example code which reproduces the bug. Never mind, such things happen and it's better to ask than to trust blindly… – Hearts

The standard Java library does not provide a way to perform an atomic move in all cases.

Files.move() does not guarantee an atomic move. You can pass ATOMIC_MOVE as an option, but if the move cannot be performed as an atomic operation, an AtomicMoveNotSupportedException is thrown (this is the case when the target location is on a different FileStore and the file would have to be copied).

You have to implement it yourself if you really need that. One solution is to catch AtomicMoveNotSupportedException and then fall back: try to move the file without the ATOMIC_MOVE option, but catch exceptions and remove the target if an error occurred during the copy.
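
A minimal sketch of that fallback, assuming a helper of your own (the name moveAtomicallyOrFallback is made up):

import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class Moves {

    // Try an atomic move first; if the file system cannot do it atomically,
    // fall back to a plain move and clean up the target on failure.
    static void moveAtomicallyOrFallback(Path source, Path target) throws IOException {
        try {
            Files.move(source, target,
                    StandardCopyOption.ATOMIC_MOVE,
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (AtomicMoveNotSupportedException e) {
            try {
                // Non-atomic fallback: may internally copy and delete.
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException copyFailure) {
                // Don't leave a half-written target behind.
                Files.deleteIfExists(target);
                throw copyFailure;
            }
        }
    }
}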

Ageold answered 11/8, 2014 at 8:1 Comment(4)
Sorry, but you are missing the point: the OpenJDK code I posted clearly states that AtomicMoveNotSupportedException is not thrown in every case, and that's the problem. I could perfectly live with such an exception, but I can't if it is silently converted into a fallback strategy. I can't know what implementations are doing; I only have the official API docs, and the posted implementation violates them. At least that's my understanding... – Nicker
I wouldn't use an implementation which violates the official documentation... In this particular case you can perform the same check which leads to the file copy: check the file system providers, and if they are different, act as you would in the AtomicMoveNotSupportedException case. – Ageold
OK, I hadn't thought of that easiest option, so a thin wrapper around Files.move should do the trick for now (see the sketch after these comments). But is OpenJDK's implementation a bug worth filing with regard to the docs? – Nicker
This looks too small (I suspect they would not attend to it), but if it is in the Javadoc it should work that way, so IMO it is worth filing. – Ageold
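
A sketch of such a thin wrapper, based on the provider check suggested above (class and method names are made up):

import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class StrictMove {

    // Fail fast when source and target belong to different providers, mirroring the
    // check at the top of Files.move; for same-provider moves the provider itself is
    // expected to throw AtomicMoveNotSupportedException if the rename cannot be atomic.
    static void moveAtomicallyOrThrow(Path source, Path target) throws IOException {
        if (!source.getFileSystem().provider().equals(target.getFileSystem().provider())) {
            throw new AtomicMoveNotSupportedException(
                    source.toString(), target.toString(),
                    "source and target use different file system providers");
        }
        Files.move(source, target,
                StandardCopyOption.ATOMIC_MOVE,
                StandardCopyOption.REPLACE_EXISTING);
    }
}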

I came across a similar problem to solve:

  • One process frequently updates the file via 'save to temp file -> move temp file to final file', using Files.move(tmp, out, ATOMIC_MOVE, REPLACE_EXISTING);
  • One or more other processes read that file completely, all at once, and close it immediately. The file is rather small: less than 50k.

And it just does not work reliably, at least on Windows. Under heavy load the reader occasionally gets a NoSuchFileException, which means Files.move is not that ATOMIC even on the same file system :(

My environment: Windows 10 + Java 11.0.12

Here is the code to play with:

import org.junit.Test;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.ByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

import static java.nio.charset.StandardCharsets.UTF_8;
import static java.nio.file.StandardCopyOption.ATOMIC_MOVE;
import static java.nio.file.StandardCopyOption.REPLACE_EXISTING;
import static java.util.Locale.US;

public class SomeTest {

    static int nWrite = 0;
    static int nRead = 0;
    static int cErrors = 0;
    static volatile boolean writeFinished;   // written by the writer thread, read by main and reader
    static boolean useFileChannels = true;
    static String filePath = "c:/temp/test.out";

    @Test
    public void testParallelFileAccess() throws Exception {
        new Writer().start();
        new Reader().start();

        while( !writeFinished ) {
            Thread.sleep(10);
        }

        System.out.println("cErrors: " + cErrors);
    }

    static class Writer extends Thread {

        public Writer() {
            setDaemon(true);
        }

        @Override
        public void run() {
            File outFile = new File("c:/temp/test.out");
            File outFileTmp = new File(filePath + "tmp");
            byte[] bytes = "test".getBytes(UTF_8);

            for( nWrite = 1; nWrite <= 100000; nWrite++ ) {
                if( (nWrite % 1000) == 0 )
                    System.out.println("nWrite: " + nWrite + ", cReads: " + nRead);

                try( FileOutputStream fos = new FileOutputStream(outFileTmp) ) {
                    fos.write(bytes);
                }
                catch( Exception e ) {
                    logException("write", e);
                }

                int maxAttempts = 10;
                for( int i = 0; i <= maxAttempts; i++ ) {
                    try {
                        Files.move(outFileTmp.toPath(), outFile.toPath(), ATOMIC_MOVE, REPLACE_EXISTING);
                        break;
                    }
                    catch( IOException e ) {
                        try {
                            Thread.sleep(1);
                        }
                        catch( InterruptedException ex ) {
                            break;
                        }
                        if( i == maxAttempts )
                            logException("move", e);
                    }
                }
            }

            System.out.println("Write finished ...");
            writeFinished = true;
        }
    }

    static class Reader extends Thread {

        public Reader() {
            setDaemon(true);
        }

        @Override
        public void run() {
            File inFile = new File(filePath);
            Path inPath = inFile.toPath();
            byte[] bytes = new byte[100];
            ByteBuffer buffer = ByteBuffer.allocateDirect(100);

            try { Thread.sleep(100); } catch( InterruptedException e ) { }

            for( nRead = 0; !writeFinished; nRead++ ) {
                if( useFileChannels ) {
                    try ( ByteChannel channel = Files.newByteChannel(inPath, Set.of()) ) {
                        buffer.clear(); // reset the buffer so every iteration actually reads
                        channel.read(buffer);
                    }
                    catch( Exception e ) {
                        logException("read", e);
                    }
                }
                else {
                    try( InputStream fis = Files.newInputStream(inFile.toPath()) ) {
                        fis.read(bytes);
                    }
                    catch( Exception e ) {
                        logException("read", e);
                    }
                }
            }
        }
    }

    private static void logException(String action, Exception e) {
        cErrors++;
        System.err.printf(US, "%s: %s - wr=%s, rd=%s, %s%n", cErrors, action, nWrite, nRead, e);
    }
}
Lenes answered 2/11, 2021 at 21:53 Comment(7)
There might be a misunderstanding here: using ProcMon I can see two types of problems, ACCESS DENIED and NAME NOT FOUND. The former occurs because, while Java opens files to allow concurrent reads, writes and deletions, Windows/NTFS itself reserves the file name after deletion until all open handles are closed. That makes sense to me in your setup with concurrent reads, and you addressed it with additional rename attempts. When NAME NOT FOUND happens, though, I can see that SetRenameInformationFile is used properly, just like in all the other cases before. – Nicker
What your code actually tests is whether test.out is ALWAYS available. But is that really the same as an atomic file system operation replacing a possibly existing file? No, because it's fine for the file not to exist, e.g. at the start. And even if it exists, the atomic part is that test.out must ONLY become visible once the replace has happened entirely, with the then available content. And that seems to be the case! Instead of polling for an existing file like in your case, think of file system events, e.g. about newly created files, sent to listeners. – Nicker
With ATOMIC_MOVE and REPLACE_EXISTING, from my experience there's really only ONE event fired in the end, when the actual move has happened successfully. test.out won't be visible with the new content before that; it's either test.outtmp or test.out with the same content as before, nothing in between, from my understanding. However, a formerly available test.out NOT being present at some point during that move operation doesn't seem to break that contract from my point of view. But the latter is what your code trips over, because it polls for a file that is expected to always be available. – Nicker
My understanding of ATOMIC comes from the DB world: an atomic rename is something like [start transaction -> remove the dst file if it exists -> change the name of the src file -> commit transaction]. While the transaction is not committed, readers see the previous state of the data (either no file on the first iteration or the last known version of the file otherwise); once the transaction is committed, readers should see only the final state, and in that state there is no "NoSuchFile" for the 2nd+ iteration. – Lenes
In my scenario I do not care about each change of the file, therefore there is no need for listeners. I just care that if the file exists, I can reliably read its latest content. – Lenes
I get your point; I'm just saying that ATOMIC_MOVE is not designed according to your understanding at the level of NTFS and most other file systems. Your example already shows that some additional transaction concept spanning multiple, in themselves atomic, operations would be needed. While NTFS provides that in theory, e.g. with MoveFileTransactedW, that API is deprecated and doesn't seem to be used by Java under the hood. – Nicker
Regarding your use case: you will always get the latest content of the file, but you simply cannot rely on the file always existing. Those are two different problems from my point of view. In theory the file could be missing for any other reason, like some user having deleted it, or some AV software having deleted it because of a false positive, etc. So I suggest redesigning your approach a bit; a file system watcher handling deletes and creations might help with that (see the sketch below). I use one for the latter a lot and it works pretty well and atomically, in the sense of all or nothing of the newly available content. – Nicker
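
A rough sketch of such a watcher-based reader, using the standard WatchService API and the paths from the test above (error handling omitted; whether a rename shows up as a single ENTRY_CREATE event can depend on the file system):

import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class WatchingReader {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Path.of("c:/temp");
        Path fileName = Path.of("test.out");

        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            // The rename of test.outtmp onto test.out is reported as a create event
            // for test.out; deletions are reported as well.
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE);

            while (true) {
                WatchKey key = watcher.take(); // blocks until something happens in the directory
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (StandardWatchEventKinds.ENTRY_CREATE.equals(event.kind())
                            && fileName.equals(event.context())) {
                        // The file only became visible after the atomic move completed,
                        // so the content read here is the complete new content.
                        System.out.println(Files.readString(dir.resolve(fileName)));
                    }
                }
                if (!key.reset()) {
                    break; // directory no longer accessible
                }
            }
        }
    }
}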
