Is an atomic file rename (with overwrite) possible on Windows?
M

8

87

On POSIX systems rename(2) provides for an atomic rename operation, including overwriting of the destination file if it exists and if permissions allow.

Is there any way to get the same semantics on Windows? I know about MoveFileTransacted() on Vista and Server 2008, but I need this to support Win2k and up.

The key word here is atomic... the solution must not be able to fail in any way that leaves the operation in an inconsistent state.

I've seen a lot of people say this is impossible on win32, but I ask you, is it really?

Please provide reliable citations if possible.

Margravine answered 3/10, 2008 at 15:25 Comment(1)
@Adam Davis - If you control the reader program as well as the writer, you can solve it like this. The reader does io.Directory("FileDone_*.dat") and picks the file with the highest number in place of the *. The writer creates a file named "FileWriting.dat" and renames it to "FileDone_002.dat", then 003, 004, and so on. This sidesteps the non-atomic delete/rename, because a single rename with no overwrite is atomic, and updates remain possible even while the old file is held open. Readers that don't re-open the file on every operation can watch for a new file on a timer. Readers can clean up old files.Romilda
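The scheme in this comment can be sketched in C++17 roughly as follows. This is a minimal sketch, not the commenter's code: the helper names `done_index`, `latest_done_file` and `publish`, and the zero-padded three-digit numbering, are assumptions made for illustration. It also assumes a single writer.

```cpp
#include <cassert>
#include <cstdio>
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Parse N from "FileDone_N.dat"; returns nullopt for any other name.
std::optional<unsigned long> done_index(const std::string& name) {
    const std::string prefix = "FileDone_", suffix = ".dat";
    if (name.size() <= prefix.size() + suffix.size()) return std::nullopt;
    if (name.compare(0, prefix.size(), prefix) != 0) return std::nullopt;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0)
        return std::nullopt;
    const std::string digits =
        name.substr(prefix.size(), name.size() - prefix.size() - suffix.size());
    if (digits.size() > 9) return std::nullopt;  // keep std::stoul in range
    for (char c : digits)
        if (c < '0' || c > '9') return std::nullopt;
    return std::stoul(digits);
}

// Reader side: find the highest-numbered finished file in `dir`, if any.
std::optional<fs::path> latest_done_file(const fs::path& dir) {
    std::optional<fs::path> best;
    unsigned long best_index = 0;
    for (const auto& entry : fs::directory_iterator(dir)) {
        const auto idx = done_index(entry.path().filename().string());
        if (idx && (!best || *idx > best_index)) {
            best = entry.path();
            best_index = *idx;
        }
    }
    return best;
}

// Writer side: rename the finished scratch file to the next unused number.
// The target name does not exist yet, so no overwrite is involved.
fs::path publish(const fs::path& dir) {
    unsigned long next = 1;
    if (const auto latest = latest_done_file(dir))
        next = *done_index(latest->filename().string()) + 1;
    char name[32];
    std::snprintf(name, sizeof name, "FileDone_%03lu.dat", next);
    const fs::path target = dir / name;
    assert(!fs::exists(target));  // single-writer assumption
    fs::rename(dir / "FileWriting.dat", target);
    return target;
}
```

A reader would call `latest_done_file` on each poll and may delete lower-numbered files once it has switched to the newest one.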
N
20

Win32 does not guarantee atomic file metadata operations. I'd provide a citation, but there is none - the fact that there's no written or documented guarantee means as much.

You're going to have to write your own routines to support this. It's unfortunate, but you can't expect Win32 to provide this level of service - it simply wasn't designed for it.

Nicollenicolson answered 3/10, 2008 at 15:55 Comment(6)
I find this hard to believe. This would mean that a power outage could easily corrupt the file system even if we're dealing with a reliable system such as NTFS.Macpherson
@mafutrct Keep in mind that the question isn't about corrupting the file system - it's about making sure that the rename either completes successfully or doesn't occur at all. The file system would not be left corrupted, but the file name may end up in neither the original nor the final state. NTFS is a journaling file system, so it won't (easily) become corrupted, but depending on the complexity of the rename or the order of operations, it's possible that the name won't be left in the original or the desired final state.Nicollenicolson
That makes sense, but it's also really scary. Ending up with a filename that is neither original nor final is pretty much a recipe for disaster. Especially since (iirc) the POSIX standard already requires atomic file metadata ops.Macpherson
@mafutrct I suspect it isn't an issue with a simple file rename, but as the op suggests there are more complex rename operations, such as renaming a file to the name of a file that already exists. If you have LOGFILE and LOGBACKUP and periodically you want to move the logfile to the backup and start a new logfile, you might rename logfile to logbackup. The OS has to delete logbackup, then rename logfile - it's possible that the deletion happens, but not the rename, and then you lose both logfiles, and it's not a trivial problem to resolve in software.Nicollenicolson
@AdamDavis it's still a shame. Atomic overwrites are a crucial feature. On a filesystem, it is the only way to know that you have either the old version or the new of a named blob.Drawknife
@Macpherson I don't find this hard to believe :) This behavior was probably ported straight from DOS for compatibility reasons, and nothing there was designed for concurrency. It also doesn't surprise me that MS has probably never seriously considered altering its C library's behavior later on, again for compatibility reasons. If they changed it now, they would risk breaking many existing programs that assume the rename is not atomic, i.e. programs "misusing" the error that rename currently returns on an attempted overwrite in Win32.Secretive
S
41

See ReplaceFile() in Win32 (https://www.microsoft.com/en-us/research/wp-content/uploads/2006/04/tr-2006-45.pdf)

Stpierre answered 3/3, 2010 at 1:56 Comment(14)
If you read msdn.microsoft.com/en-us/library/aa365512(VS.85).aspx you'll see that ReplaceFile is a complicated merge operation, with no indication that it's atomic.Schlessel
The relevant passage from that MS research paper: "Under UNIX, rename() is guaranteed to atomically overwrite the old version of the file. Under Windows, the ReplaceFile() call is used to atomically replace one file with another."Amphioxus
Re "The relevant passage from that MS research paper: Under ....": this suggests that the operation is atomic, but it is misleading and only creates confusion. What is required is an atomic delete of the old file and rename of the new file in one indivisible step. These may be two metadata operations that are each atomic in themselves, but the compound operation is NOT atomic.Extrauterine
I found this comment in some other places on MSDN. It seems that ReplaceFile is atomic now, although this is not mentioned on its own documentation page.Microdot
This is just an implementation decision, and it is easy to implement either atomically or not. What MS has chosen to do is still not 100% clear until someone with access to the source code can tell us.Microdot
msdn.microsoft.com/en-us/library/windows/desktop/… says ReplaceFile can be used atomically: "Many applications which deal with "document-like" data tend to load the entire document into memory, operate on it, and then write it back out to save the changes. The needed atomicity here is that the changes either are completely applied or not applied at all, as an inconsistent state would render the file corrupt. A common approach is to write the document to a new file, then replace the original file with the new one. One method to do this is with the ReplaceFile API."Gwen
Note in particular the various return codes documented for ReplaceFile, which all correspond to different degrees of partial (i.e. non-atomic) completion of the operation.Skerrick
Microsoft intern here. I had this problem, so I asked a guy who worked on NTFS. The part where data is moved is atomic, so while it can be interrupted while the file attributes are being modified, the part where data itself is moved is atomic.Kenn
@zneak: Will ReplaceFile avoid "file in use" errors? For example, on Unix, you can use rename to overwrite an executable even while it's running. Can Windows do this?Hexagon
@JaySullivan, AFAIK, ReplaceFile honors open file sharing restrictions and won't let you do this.Kenn
It's worth noting that, unlike rename(), ReplaceFile() fails if the new (destination) file does not exist -- thus you have another atomicity problem -- checking for existence and then replacing in a single atomic step.... sigh...Lockhart
@Lockhart that's not an atomic problem - if you make the call and it fails without changing anything, then atomicity is maintained - and you can check the return code to find out why it failed to avoid checking in the first place. Still a question about calls available for W2k isn't exactly current.....Jehial
@Jehial A file can be created or deleted between the check and the replace. You need atomic check_and_replace_ifExists and check_and_create_ifNotExists. I prefer the proper rename behavior.Lockhart
@Lockhart it’s definitely a race condition, but at least you can just do a loop around the check for file existence and move/replace call (as long as you can distinguish transient and permanent errors so as to bail the loop appropriately). Yes, POSIX rename() is a ton easier to use.Garlan
H
15

In Windows Vista and Windows Server 2008 an atomic move function has been added - MoveFileTransacted()

Unfortunately this doesn't help with older versions of Windows.

Interesting article here on MSDN.

Hillhouse answered 19/3, 2009 at 0:49 Comment(6)
Hidden in the comments: this will not work on network shares.Brodie
@sorin: The question asks for an equivalent to a POSIX call that isn't atomic on network shares either.Monochasium
However, this solution (and its limitations to certain Windows versions) was already mentioned in the question, so it's not useful to write it as an answer.Monochasium
Actually, the POSIX call is atomic on NFS.Bittner
@JonWatte it is, but NFS is a beast with its own peculiar failure modes because it's trying to achieve POSIX semantics over the network where several clients can read and write. So I'd avoid relying on it over NFS if I could.Kktp
It seems it's about to be deprecated now.Kktp
B
14

Starting with Windows 10 1607, NTFS does support an atomic superseding rename operation. To do this call NtSetInformationFile(..., FileRenameInformationEx, ...) and specify the FILE_RENAME_POSIX_SEMANTICS flag.

Or equivalently in Win32 call SetFileInformationByHandle(..., FileRenameInfoEx, ...) and specify the FILE_RENAME_FLAG_POSIX_SEMANTICS flag.
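A minimal sketch of the Win32 variant described above might look like this. The function name `posix_style_rename` is hypothetical, and combining `FILE_RENAME_FLAG_REPLACE_IF_EXISTS` with the POSIX-semantics flag is an assumption about what an overwriting rename needs. It presumably requires a Windows SDK that targets Windows 10 1607 or later for the `Flags` member to be visible. The non-Windows branch exists only so that the sketch compiles elsewhere.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>
#ifdef _WIN32
#include <windows.h>
#include <cstring>
#include <string>
#include <vector>
#endif

namespace fs = std::filesystem;

// Rename `from` over `to`, replacing `to` if it exists. Returns true on success.
bool posix_style_rename(const fs::path& from, const fs::path& to) {
    assert(!from.empty() && !to.empty());
#ifdef _WIN32
    // DELETE access is required to rename a file through its handle.
    HANDLE h = CreateFileW(from.c_str(), DELETE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;

    const std::wstring target = fs::absolute(to).native();
    const DWORD name_bytes = static_cast<DWORD>(target.size() * sizeof(wchar_t));
    // FILE_RENAME_INFO ends in a one-element WCHAR array, so allocate
    // extra room for the full target name behind the fixed part.
    std::vector<unsigned char> buffer(sizeof(FILE_RENAME_INFO) + name_bytes);
    auto* info = reinterpret_cast<FILE_RENAME_INFO*>(buffer.data());
    info->Flags = FILE_RENAME_FLAG_REPLACE_IF_EXISTS | FILE_RENAME_FLAG_POSIX_SEMANTICS;
    info->RootDirectory = nullptr;
    info->FileNameLength = name_bytes;  // length in bytes, not characters
    std::memcpy(info->FileName, target.c_str(), name_bytes);

    const BOOL ok = SetFileInformationByHandle(h, FileRenameInfoEx, info,
                                               static_cast<DWORD>(buffer.size()));
    CloseHandle(h);
    return ok != FALSE;
#else
    // Fallback for non-Windows builds: rename(2) via the standard library.
    std::error_code ec;
    fs::rename(from, to, ec);
    return !ec;
#endif
}
```

On failure a Windows caller would normally inspect `GetLastError()`; the sketch collapses that into a boolean for brevity.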

Begun answered 8/8, 2018 at 1:53 Comment(2)
Is there a reason that while DeleteFile now uses POSIX delete and ReplaceFile now uses POSIX rename (but still in two steps), MoveFileEx with MOVEFILE_REPLACE_EXISTING still performs a legacy rename?Latreshia
I think for SetFileInformationByHandle you mean the FILE_RENAME_INFO.ReplaceIfExists flag, not the FILE_RENAME_FLAG_POSIX_SEMANTICS flag, right?Allege
J
10

You still have the rename() call on Windows, though I imagine the guarantees you want cannot be made without knowing the filesystem you're using - no guarantees if you're using FAT for instance.

However, you can use MoveFileEx and use the MOVEFILE_REPLACE_EXISTING and MOVEFILE_WRITE_THROUGH options. The latter has this description in MSDN:

Setting this value guarantees that a move performed as a copy and delete operation is flushed to disk before the function returns. The flush occurs at the end of the copy operation.

I know that's not necessarily the same as a rename operation, but I think it might be the best guarantee you'll get - if it does that for a file move, it should for a simpler rename.
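As a rough sketch of that call: the wrapper name `move_replace_write_through` is hypothetical, and the non-Windows branch is only there so the example compiles elsewhere. Leaving out `MOVEFILE_COPY_ALLOWED` is a deliberate choice in this sketch, so that a cross-volume move fails rather than turning into a copy followed by a delete. Nothing here makes the overwrite atomic by itself; it only requests the flags named above.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>
#ifdef _WIN32
#include <windows.h>
#endif

namespace fs = std::filesystem;

// Move `from` onto `to`, replacing `to` if it exists. Returns true on success.
bool move_replace_write_through(const fs::path& from, const fs::path& to) {
    assert(!from.empty() && !to.empty());
#ifdef _WIN32
    // MOVEFILE_COPY_ALLOWED is intentionally not passed.
    return MoveFileExW(from.c_str(), to.c_str(),
                       MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH) != FALSE;
#else
    // Fallback for non-Windows builds: rename(2) via the standard library.
    std::error_code ec;
    fs::rename(from, to, ec);
    return !ec;
#endif
}
```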

Jehial answered 3/10, 2008 at 15:39 Comment(3)
To the best of my knowledge, if the destination existed and an I/O error occurs during the data copy step, this "original" destination is lost, thus MoveFileEx is not atomic per your requirements. That's why MoveFileTransacted was added later.Coss
MoveFileEx should be good. It has a flag called MOVEFILE_COPY_ALLOWED which says: "If the file is to be moved to a different volume, the function simulates the move by using the CopyFile and DeleteFile functions." So just don't pass this flag and you should have something that is equivalent to the POSIX rename, yes?Migrate
rename() fails on Windows if the new file already exists. Atomicity aside, the Windows version is not even semantically compatible with the Unix version.Lockhart
V
6

The MSDN documentation avoids clearly stating which APIs are atomic and which are not, but Niall Douglas states in his CppCon 2015 talk that the only atomic function is

SetFileInformationByHandle

with FILE_RENAME_INFO.ReplaceIfExists set to true. It's available starting with Windows Vista / Server 2008.

Niall is the author of the highly complex LLFIO library and an expert in file system race conditions, so if you're writing an algorithm where atomicity is crucial, I believe it is better to be safe than sorry and use the suggested function, even though nothing in ReplaceFile's description states that it is not atomic.

Vet answered 5/8, 2019 at 12:13 Comment(10)
Superseding rename is actually the only type of rename that is not guaranteed to be atomic on NTFS. The reason for it potentially being non-atomic is NTFS has to delete all of the target's allocation, and deleting allocation gets logged. If the superseded target is extremely large then all of the deleted allocation will not be able to fit inside a single NTFS transaction, so NTFS splits it into multiple transactions. If the machine crashes, you could end up in a state where both source and target are still there but the target has been partially truncated (along transaction boundaries).Begun
@CraigBarkhouse: Very interesting, can you please clarify what the term "superseding rename" refers to? Which API functions?Vet
Superseding rename is simply the ReplaceIfExists that you already mentioned if using FileRenameInformation, or FILE_RENAME_REPLACE_IF_EXISTS if using FileRenameInformationEx, or MOVEFILE_REPLACE_EXISTING if using MoveFileEx, etc. They're all the same file system operation underneath. When the target did in fact exist, it is said to have been superseded. You could use the terms overwritten or replaced if you prefer.Begun
@CraigBarkhouse, thanks a lot for the info and the explanation, then I should delete my incorrect answer. One thing I'm still not clear about: is ReplaceFile atomic? It's also superseding rename, isn't it?Vet
ReplaceFile is most definitely NOT atomic. But not because of using superseding rename. Rather it's because it uses a series of file system operations. What ReplaceFile does is 1) renames ReplacedFile to BackupFile; 2) renames ReplacementFile to ReplacedFile; and 3) optionally deletes BackupFile. In the event of a crash you can end up in pre-1 state (ReplacedFile and ReplacementFile), pre-2 state (BackupFile and ReplacementFile), pre-3 state (BackupFile and ReplacedFile), mid-3 state (BackupFile partially truncated and ReplacedFile), or post-3 state (ReplacedFile only).Begun
@CraigBarkhouse: so then, is MoveFile / MoveFileEx a better approach where atomicity is required? Or is there an Nt* or Zw* function that guarantees atomicity?Vet
The first thing you have to know is that file system operations are what can be atomic, not APIs per se. Whether a file system operation is atomic depends on which file system you are talking about, and which operation. Mostly I've been assuming you are talking about NTFS as the file system. On FAT, nothing at all is atomic, therefore no higher level file related API is atomic on FAT. On NTFS an API can be considered atomic if it limits itself to a single file system operation (why ReplaceFile isn't atomic), and that file system operation is atomic (why MoveFileEx isn't atomic).Begun
To take MoveFileEx as an example, it is complicated because depending on how it is called, it might end up doing 1) a simple rename; or 2) a superseding rename (the MOVEFILE_REPLACE_EXISTING thing); or 3) a copy and delete. The first case actually is atomic on NTFS. The second case is atomic 99.99999% of the time, the only exception being when the superseded target is huge as I described earlier. The third case is definitely never atomic because "copy" is a long series of operations. So you have to understand the specific scenario before you can even try to answer if it is atomic.Begun
Linux is not fundamentally different. For example, virtually no file system operation can be considered atomic on an ext2 file system, because (like FAT) that file system doesn't support transactions. Therefore virtually no Linux file related API per se is atomic.Begun
@CraigBarkhouse, thank you very much for the explanation!Vet
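To make the intermediate states from the comments above concrete, here is a small emulation of the three-step sequence using std::filesystem. It is an illustration only, not what ReplaceFile literally executes: the function name `replace_in_steps` and the `stop_after` parameter (which simulates a crash after that many steps) are made up for this sketch.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Emulates: 1) replaced -> backup, 2) replacement -> replaced, 3) delete backup.
// `stop_after` simulates a crash after that many steps (3 = run to completion).
void replace_in_steps(const fs::path& replaced, const fs::path& replacement,
                      const fs::path& backup, int stop_after = 3) {
    assert(stop_after >= 0 && stop_after <= 3);
    if (stop_after < 1) return;         // both original names still present
    fs::rename(replaced, backup);       // step 1
    if (stop_after < 2) return;         // nothing exists under the target name
    fs::rename(replacement, replaced);  // step 2
    if (stop_after < 3) return;         // new content in place, backup left over
    fs::remove(backup);                 // step 3
}
```

Stopping after step 1 leaves only the backup and the replacement on disk, which is the window in which a reader would find no file under the expected name.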
A
4

A fair number of answers but not the one I was expecting... I had the understanding (perhaps incorrectly) that MoveFile could be atomic provided that the proper stars aligned, flags were used, and file system was the same on the source as target. Otherwise, the operation would fall back to a [Copy->Delete]File.

Given that, I also had the understanding that MoveFile, when it is atomic, was just setting the file information, which could also be done here: SetFileInformationByHandle.

Someone gave a talk called "Racing the Filesystem" which goes into some more depth about this. (about 2/3rds down they talk about atomic rename)

Angiosperm answered 1/7, 2014 at 1:3 Comment(0)
R
2

There is std::rename and, starting with C++17, std::filesystem::rename. With std::rename, what happens if the destination exists is implementation-defined:

If new_filename exists, the behavior is implementation-defined.

POSIX rename, however, is required to replace existing files atomically:

This rename() function is equivalent for regular files to that defined by the ISO C standard. Its inclusion here expands that definition to include actions on directories and specifies behavior when the new parameter names a file that already exists. That specification requires that the action of the function be atomic.

Thankfully, std::filesystem::rename requires that it behaves just like POSIX:

Moves or renames the filesystem object identified by old_p to new_p as if by the POSIX rename

However, when I tried to debug, it appears that std::filesystem::rename as implemented by VS2019 (as of March 2020) simply calls MoveFileEx, which isn't atomic in some cases. So, possibly, when all bugs in its implementation are fixed, we'll see portable atomic std::filesystem::rename.
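For reference, the usual portable pattern built on std::filesystem::rename is to write a temporary file next to the target and rename it over the target. The sketch below assumes a hypothetical helper `save_via_rename` and a ".tmp" suffix convention. Whether the final step is atomic depends on the platform and implementation, as discussed above.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Write `contents` to a temporary file beside `target`, then rename it over `target`.
bool save_via_rename(const fs::path& target, const std::string& contents) {
    assert(target.has_filename());
    fs::path tmp = target;
    tmp += ".tmp";  // same directory, hence same volume as the target

    bool written = false;
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        out.write(contents.data(), static_cast<std::streamsize>(contents.size()));
        out.flush();  // hands data to the OS; does not force it to disk
        written = static_cast<bool>(out);
    }  // stream is closed here, before the rename

    std::error_code ec;
    if (written) fs::rename(tmp, target, ec);
    if (!written || ec) {
        std::error_code ignored;
        fs::remove(tmp, ignored);  // best-effort cleanup of the temporary
        return false;
    }
    return true;
}
```

Readers that open `target` then see either the previous contents or the new ones, provided the underlying rename really is atomic on the file system in use.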

Rainout answered 1/4, 2020 at 4:22 Comment(1)
The MSVC STL guys claimed that std::filesystem::rename is already atomic by calling MoveFileEx. See github.com/microsoft/STL/pull/2062#issuecomment-1139106197Walkout
