I watched the talk: "PostgreSQL vs. fsync. How is it possible that PostgreSQL used fsync incorrectly for 20 years, and what we'll do about it." via https://fosdem.org/2019/schedule/event/postgresql_fsync/ and also read https://lwn.net/Articles/752063/ as background.
The really short and simplified summary: on Linux, if you call fsync() and it fails, you cannot call fsync() again to fix it. The second call will succeed, yet the data on disk may be corrupt, because the failed buffer cache pages are marked clean after the first failed call. There is a lot of detail as to why this happens (it supports the case where a USB drive is pulled out: you don't want to keep retrying and holding on to dirty buffer cache pages whose writes can never succeed).
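To make the Linux side concrete, here is a minimal sketch in C of what I understand the safe pattern to be: treat a failed fsync() as final for that descriptor and report the failure, rather than calling fsync() again. The helper name write_durably is hypothetical, and recovery (rewriting the data from the application's own copy, or aborting) is left to the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to path, then flush them
 * to stable storage. Returns 0 on success, -1 on any failure.
 * If fsync() fails it is NOT retried: the kernel may already have marked
 * the affected pages clean, so a second fsync() could report success
 * without the data ever reaching the disk. */
int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* write() may be partial or interrupted, so loop until done. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            perror("write");
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    if (fsync(fd) != 0) {
        /* Do not call fsync() again here; surface the error instead. */
        perror("fsync");
        close(fd);
        return -1;
    }

    if (close(fd) != 0) {
        perror("close");
        return -1;
    }
    return 0;
}
```

A caller would check the result, for example `if (write_durably("data.bin", buf, len) != 0) { /* rewrite from the in-memory copy or abort */ }`, instead of looping on fsync().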
How does FlushFileBuffers() behave in this situation? I am particularly interested in files being accessed over CIFS where failures are more likely.
Also, given the OS can attempt to write dirty buffer cache pages to stable storage at any time in the background, how can user-land programs pick up these failures via the Win32 API?
FlushFileBuffers is just a user-mode wrapper around NtFlushBuffersFile, and it looks to me like that function just assembles a flush IRP (IRP_MJ_FLUSH_BUFFERS) and sends it via IoCallDriver. Of course, one should eye calls to FlushFileBuffers with a great deal of suspicion. There are better ways of implementing transactional I/O in Windows, like creating the file with the FILE_FLAG_NO_BUFFERING flag. – Patency