How to keep ReadDirectoryChangesW from missing file changes
There are many posts on the internet about the ReadDirectoryChangesW API function missing file changes when there is a lot of file activity. Most blame the speed at which the ReadDirectoryChangesW loop is called, but that assumption is incorrect. The best explanation I have seen is in the following post, in the comment dated Monday, April 14, 2008 2:15:27 PM:

http://social.msdn.microsoft.com/forums/en-US/netfxbcl/thread/4465cafb-f4ed-434f-89d8-c85ced6ffaa8/

The summary is that ReadDirectoryChangesW reports file changes as they leave the file-write-behind queue, not as they are added. If too many changes are added before being committed, you lose notifications for some of them. You can see this with your own implementation: just write a program that quickly generates 1,000+ files in a directory, count how many file-change notifications you receive, and you will find that there are times when you do not receive all of them.
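A sketch of the reproduction described above (illustrative only, not from the original post): rapidly create 1,000+ small files in a watched directory, then compare the file count against the number of change notifications your watcher received.

```python
# Illustrative burst-writer to reproduce dropped notifications: create
# many files as fast as possible in a directory being watched.
import os
import tempfile


def create_burst(directory, count=1000):
    """Write `count` small files into `directory` as fast as possible."""
    for i in range(count):
        path = os.path.join(directory, f"burst_{i:04d}.txt")
        with open(path, "w") as f:
            f.write("x")
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        n = create_burst(d)
        # Compare n against the number of change notifications your
        # watcher reported for this directory.
        print(n)
```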

The question is: has anyone found a reliable way to use ReadDirectoryChangesW without having to flush the volume each time? Flushing is not allowed unless the user is an Administrator, and it can also take some time to complete.

Amuse answered 11/9, 2008 at 18:16 Comment(0)
If the API is unreliable, a workaround may be your only option. That likely involves keeping track of last-modified times and filenames. This doesn't mean you need to poll for changes; rather, you can use the FileSystemWatcher as a trigger for checking.

So if you keep track of the last 50-100 times the ReadDirectoryChangesW/FSW event fired, and you see that events are arriving rapidly, you can detect this and trigger a special-case path that gathers all the files that have been changed within a few seconds (and set a flag to temporarily suppress the bogus FSW events the sweep would otherwise generate).

Since some people in the comments are confused about this solution: I am proposing that you monitor how fast events are arriving from ReadDirectoryChangesW, and when they arrive too fast, fall back to a workaround (usually a manual sweep of the directory).
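The rate-detection idea can be sketched like this (class and parameter names are illustrative assumptions, not any real API): keep the timestamps of the most recent events in a bounded buffer, and when the buffer fills within a short time window, signal that a manual sweep is needed.

```python
# Sketch of event-rate detection: if max_events notifications arrive
# within window_seconds, notifications may be getting dropped and a
# manual directory sweep should be triggered.
import time
from collections import deque


class EventRateMonitor:
    def __init__(self, max_events=100, window_seconds=1.0):
        self.times = deque(maxlen=max_events)
        self.window = window_seconds
        self.max_events = max_events

    def record(self, now=None):
        """Record one change notification; return True if a sweep is needed."""
        now = time.monotonic() if now is None else now
        self.times.append(now)
        # Too fast: the buffer is full and spans less than the window.
        return (len(self.times) == self.max_events
                and now - self.times[0] < self.window)
```

Each watcher callback would call `record()`, and a `True` result would trigger the directory sweep (with the suppression flag set while it runs).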

Brace answered 11/9, 2008 at 18:34 Comment(6)
This would work about 99% of the time. But what happens if the file that is skipped is in a different directory than the one with all the activity? You would scan the busy directory for changes but miss the single file change in the other one.Amuse
The FileSystemWatcher class is a .NET way to wrap ReadDirectoryChangesW, so no, that does not help.Vocation
My answer was more about detecting the # of events within a time period to trigger the workaround for ReadDirectoryChangesW.Brace
And some of those events will also be dropped and never arrive, in the same way they would be when using ReadDirectoryChangesW() directly. Dressing it up with FSW does no good.Vocation
Ok, replace all occurrences of FileSystemWatcher with ReadDirectoryChangesW and the answer is still the SAME. You still need to detect when files are arriving too rapidly and do your own sweep of the directory.Brace
No, it's not, because the events include "when files are arriving", which you will never see. In other words, the answer to the original question is NO. And there is absolutely NO WORKAROUND except to use a different mechanism like change-journaling or writing a file-system minifilter. How could using ReadDirectoryChangesW() again to compensate for its own design flaw somehow "work around" it? Baffling.Vocation
We've never seen ReadDirectoryChangesW be 100% reliable. But the best way to handle it is to separate the "reporting" from the "handling".

My implementation has one thread whose only job is to re-queue all events, and a second thread that processes the intermediate queue. Basically, you want to impede the reporting of events as little as possible.

Under high CPU load, you can otherwise end up impeding the reporting of watcher events.
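The two-thread separation described above can be sketched as a simple producer/consumer pipeline (an assumed structure, not the answerer's actual code): the watcher thread only enqueues raw events, and a worker thread does the slow handling.

```python
# Sketch of separating "reporting" from "handling": the watcher thread
# calls enqueue() and nothing else; a worker thread drains the queue.
import queue
import threading


def start_pipeline(process):
    """Return (enqueue, stop) callables wrapping a worker thread."""
    events = queue.Queue()

    def worker():
        while True:
            item = events.get()
            if item is None:        # sentinel: shut down
                break
            process(item)           # slow handling, off the watcher thread

    t = threading.Thread(target=worker, daemon=True)
    t.start()

    def enqueue(event):
        events.put(event)           # cheap: the watcher does only this

    def stop():
        events.put(None)
        t.join()

    return enqueue, stop
```

Because the queue is FIFO and there is a single worker, events are still processed in arrival order; only the latency of handling is moved off the reporting path.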

Anthelion answered 1/10, 2013 at 15:13 Comment(1)
Since I don't have enough reputation to comment above, I'll say here that the top solution is very good. It gets you "almost there" for 99% of cases. Add one more thing to it: a periodic full rescan of the folders you are watching, to look for changes.Anthelion
I ran into the same problem, but I did not find a solution that guarantees receiving all events. In several tests I learned that ReadDirectoryChangesW should be called again as quickly as possible after GetQueuedCompletionStatus returns. I suspect that if the filesystem is much faster than the application's own processing, the application can still lose some events.

Anyway, I separated the parsing logic from the monitoring logic and moved the parsing onto its own thread.

Armillia answered 24/5, 2012 at 8:17 Comment(1)
I successfully solved the issue using something similar: queue the events, or write them to a file for later processing. Writing them to disk was the key for me. That might seem slow, but disks are cached and the overhead is less than a database. My program mirrors about 1 TB of file changes per day, hundreds of thousands of files per day.Portiere

© 2022 - 2024 — McMap. All rights reserved.