How should I poll a large number of files for changes?

I'd like to poll the file system for any changed, added or removed files or sub-directories. All changes should be detected quickly but without putting pressure on the machine. The OS is Windows >= Vista, the observed part is a local directory.

Typically, I would resort to a FileSystemWatcher (FSW), but this led to problems with other programs that tried to watch the same spot (most prominently, Windows Explorer). Also, I have heard that FSW is not really reliable, even for local folders and with a large buffer.

The main issue I have is that the number of files and directories may be very large (I would guess seven digits). Simply running a check for all files every second noticeably affected my machine.

My next idea was to check different parts of the whole tree per second to reduce the overall impact, and possibly add a kind of heuristic, like checking files that get changed frequently in quicker succession.

I'm wondering if there are patterns for this kind of problem, or if anyone has experiences with this situation.

Vote answered 26/8, 2011 at 11:30 Comment(2)
Are all subfolders under a single root? What kind of problems did you have with Windows Explorer? Here is a pattern to ensure you don't miss any messages. #4967595 - Errata
@adrianm: Yes, same root. Explorer did not update its view when a monitored folder was changed, I suppose because FSW stole its events. - Vote

We have implemented a similar feature, using C#. The FileSystemWatcher was inefficient with large directory trees.

Our alternative was to use FSNodes, a struct we created ourselves, built on the following Windows API calls:

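    // These declarations go inside a class. The file needs:
    // using System; using System.IO; using System.Runtime.InteropServices;

    private const int MAX_PATH = 260;      // size of the cFileName buffer
    private const int MAX_ALTERNATE = 14;  // size of the 8.3 alternate name buffer

    // 64-bit timestamp split into two 32-bit halves.
    [StructLayout(LayoutKind.Sequential)]
    private struct FILETIME
    {
        public uint dwLowDateTime;
        public uint dwHighDateTime;
    }

    // Metadata returned for each entry found by FindFirstFile/FindNextFile.
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public int dwReserved0;
        public int dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_PATH)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_ALTERNATE)]
        public string cAlternate;
    }

    // Releases a search handle returned by FindFirstFile.
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    // Starts a search; returns INVALID_HANDLE_VALUE (-1) on failure.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(
        string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    // Advances the search; returns false when there are no more entries.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(
        IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);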
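
As a rough illustration of how declarations like the ones above are typically used, here is a minimal sketch that lists one directory level. It is not our actual FSNodes code: the class name `NativeDirectoryScanner` and the helpers `ToUtc` and `Enumerate` are hypothetical names made up for this example, and the declarations are repeated so the snippet compiles on its own. It does not recurse, and it assumes paths shorter than MAX_PATH.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper class, for illustration only (Windows-only at run time).
public static class NativeDirectoryScanner
{
    private const int MAX_PATH = 260;
    private const int MAX_ALTERNATE = 14;
    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    [StructLayout(LayoutKind.Sequential)]
    public struct FILETIME
    {
        public uint dwLowDateTime;
        public uint dwHighDateTime;
    }

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public int dwReserved0;
        public int dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_PATH)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_ALTERNATE)]
        public string cAlternate;
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool FindClose(IntPtr hFindFile);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(
        string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool FindNextFile(
        IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    // Joins the two 32-bit halves of a FILETIME into a UTC DateTime.
    public static DateTime ToUtc(FILETIME ft)
    {
        long fileTime = ((long)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
        return DateTime.FromFileTimeUtc(fileTime);
    }

    // Yields every entry directly inside 'directory' (no recursion).
    public static IEnumerable<(string Path, bool IsDirectory, DateTime LastWriteUtc, long Size)>
        Enumerate(string directory)
    {
        string root = directory.TrimEnd('\\');
        IntPtr handle = FindFirstFile(root + @"\*", out WIN32_FIND_DATA data);
        if (handle == INVALID_HANDLE_VALUE)
            yield break; // directory missing or not accessible

        try
        {
            do
            {
                // Skip the "current" and "parent" pseudo-entries.
                if (data.cFileName == "." || data.cFileName == "..")
                    continue;

                bool isDirectory = (data.dwFileAttributes & FileAttributes.Directory) != 0;
                long size = ((long)data.nFileSizeHigh << 32) | data.nFileSizeLow;
                yield return (root + @"\" + data.cFileName, isDirectory,
                              ToUtc(data.ftLastWriteTime), size);
            }
            while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle); // always release the search handle
        }
    }
}
```

The point of going through the API directly is that each entry comes back with its attributes, size and timestamps in one call, so a scan does not need a separate metadata query per file. To walk a whole tree, one would call `Enumerate` again for every entry where `IsDirectory` is true.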

What we do is static processing. We save a metadata tree on disk and compare the stored directory tree against the freshly loaded one, looking for modified files (based on the timestamp, which is faster, or on the file hash). We can also handle deleted, added and moved files, and even files that were both moved and modified (also based on the file hash).
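
The comparison step can be sketched roughly as follows. This is a simplified illustration and not our production code: `SnapshotDiff`, `Capture` and `Compare` are hypothetical names, the snapshot is a flat path-to-timestamp dictionary instead of a tree, and it only looks at files and their last-write times. Hash comparison and move detection are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, for illustration only.
public static class SnapshotDiff
{
    public sealed class Result
    {
        public List<string> Added { get; } = new List<string>();
        public List<string> Removed { get; } = new List<string>();
        public List<string> Modified { get; } = new List<string>();
    }

    // Walks the tree under 'root' and records each file's last-write time.
    // Note: this throws if a subdirectory cannot be read.
    public static Dictionary<string, DateTime> Capture(string root)
    {
        var snapshot = new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            snapshot[path] = File.GetLastWriteTimeUtc(path);
        return snapshot;
    }

    // Compares the stored snapshot with the current one.
    public static Result Compare(
        IReadOnlyDictionary<string, DateTime> previous,
        IReadOnlyDictionary<string, DateTime> current)
    {
        var result = new Result();

        foreach (KeyValuePair<string, DateTime> entry in current)
        {
            if (!previous.TryGetValue(entry.Key, out DateTime oldStamp))
                result.Added.Add(entry.Key);      // path was not there before
            else if (oldStamp != entry.Value)
                result.Modified.Add(entry.Key);   // timestamp changed
        }

        foreach (KeyValuePair<string, DateTime> entry in previous)
        {
            if (!current.ContainsKey(entry.Key))
                result.Removed.Add(entry.Key);    // path disappeared
        }

        return result;
    }
}
```

A poller would keep the last snapshot, call `Capture` again on each pass, run `Compare`, and then replace the stored snapshot with the new one. A move shows up here as one removed path plus one added path; pairing those up requires an extra key such as the file hash.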

This implementation, combined with a daemon that ran it every POLL_TIME, worked for us. Hope it helps.

Ceres answered 26/8, 2011 at 11:57 Comment(2)
Could you explain a bit about how you are using the Win32 API calls? - Carcanet
FindFirstFile searches a directory for a file or subdirectory with a name that matches a specific name (or partial name if wildcards are used). FindNextFile continues a file search from a previous call to the FindFirstFile or FindFirstFileEx function. FindClose closes a file search handle opened by the FindFirstFile (and other) functions. I recommend a Google search to understand the API better. - Karakalpak

My best guess would be to use the USN journal, provided it is a local machine, you have administrator privileges, and the partitions are NTFS. The USN journal is extremely fast and reliable. It is a long topic, and this link explains everything: http://www.microsoft.com/msj/0999/journal/journal.aspx

Pyorrhea answered 3/9, 2011 at 19:10 Comment(0)

For *nix environments you can use inotify (https://github.com/rvoicilas/inotify-tools/wiki/), which worked great in my limited research on it. There might be a version out there that works on Windows, which I have less experience with. Some quick googling led me to a Java clone called jnotify (http://jnotify.sourceforge.net/), which is advertised to work on Windows, so it might be worth trying.

Wellesley answered 26/8, 2011 at 11:47 Comment(0)
