Know the number of files / directories before doing a FSCTL_ENUM_USN_DATA

Before doing a USN journal / NTFS MFT files-enumeration with

while (DeviceIoControl(hDrive, FSCTL_ENUM_USN_DATA, &med, sizeof(med), pData, sizeof(pData), &cb, NULL))
{
    // do stuff here
    med.StartFileReferenceNumber = *((DWORDLONG*) pData);    // pData contains FRN for next FSCTL_ENUM_USN_DATA
}

I'd like to know the number of files/directories (to "reserve" a std::vector: v.reserve(...) and also other reasons).

I thought about calling FSCTL_QUERY_USN_JOURNAL first, which returns a USN_JOURNAL_DATA_V0 containing information about the volume's journal.

Unfortunately FirstUsn, NextUsn and MaxUsn don't give this information. Even if I have 100k files on the volume, NextUsn can be 10 million, for example, so it doesn't even give the right order of magnitude.

How can I get the number of files / directories before doing a FSCTL_ENUM_USN_DATA?

Iolenta answered 20/7, 2017 at 19:34 Comment(15)
Is opendir/readdir/closedir what you are looking for?Impeachment
Which API exactly, @JesperJuhl? I'm looking for a number that can be obtained for a volume in under 1 second. Just a number that would be written somewhere in the NTFS metadata. (I'm not looking to obtain this number via a filesystem enumeration, because that is what I'm going to do next in the FSCTL_ENUM_USN_DATA step.)Iolenta
It's possible to send FSCTL_GET_NTFS_FILE_RECORD in a binary search, adjusting FileReferenceNumber each time; after several iterations this gives the highest valid FileReferenceNumber. The same value is returned by FSCTL_ENUM_USN_DATA when the end of the enumeration is reached. But this value is somewhat larger than the actual number of files. For example, in my test on one volume I got f800 (hex) as this number, while the file count from FSCTL_ENUM_USN_DATA was d81b, and the count from enumerating via FSCTL_GET_NTFS_FILE_RECORD was d830.Polyandry
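The binary search described in this comment could look roughly like the sketch below. The comment gives no code, so everything here is an assumption: `highest_valid` is a hypothetical helper, and `probe` stands in for "send FSCTL_GET_NTFS_FILE_RECORD with this FileReferenceNumber and report whether the call succeeded". It also assumes success is monotonic (every number up to some limit succeeds, every number above it fails) and that 0 succeeds.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper: returns the largest value in [0, upper] for which
// probe() returns true. Assumes probe(0) is true and that probe is
// monotonic (true up to some limit, false for everything above it).
// On a real volume, probe would wrap the DeviceIoControl call.
std::uint64_t highest_valid(std::uint64_t upper,
                            const std::function<bool(std::uint64_t)>& probe)
{
    std::uint64_t lo = 0;
    std::uint64_t hi = upper;
    while (lo < hi) {
        // Upper midpoint, written so it cannot overflow and is always > lo,
        // which guarantees the range shrinks on every iteration.
        const std::uint64_t mid = hi - (hi - lo) / 2;
        if (probe(mid))
            lo = mid;      // mid is valid: the answer is mid or above
        else
            hi = mid - 1;  // mid is invalid: the answer is below mid
    }
    return lo;
}
```

As the comment notes, the result is only an upper bound on the file count, not the count itself.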
@JesperJuhl: No Windows developer ever needs a file handling API that doesn't support Unicode.Fung
@Polyandry According to my tests, the highest FileReferenceNumber can be 5 or 10 times bigger than the actual number of files in the volume (the ratio is probably somewhere between 1 and 10), so unfortunately it won't be really useful for reserving memory for a std::vector, etc.Iolenta
Maybe this depends on the volume.Polyandry
@Polyandry Yes... It seems there's no "total filecount" easily obtainable from WinAPI for a volume... Strange! I really thought this info would be somewhere in USN journal / NTFS MFT, without having to enumerate all the USN records.Iolenta
@Fung the question was not clear, so I was guessing...Impeachment
I don't think this information is available. As a general rule, Windows doesn't track information it doesn't need. Anyway, if you use a moderately large buffer (at least a megabyte) then you won't have to loop around the call to DeviceIoControl too many times; if you allocate/extend your vector once per loop iteration, the overhead shouldn't be too bad.Pallbearer
@HarryJohnston in the main loop, I'll fill a std::vector<wstring> or even a vector<pair<wstring, DWORDLONG>> to store filename + parentID. That's why I need to reserve, in order to reallocate this vector as little as possible. What do you call a moderately large buffer (of what?) in this context? Thanks!Iolenta
I mean the output buffer, pData in your code. The size of that buffer determines how many entries you can receive in a single call to DeviceIoControl. Before copying the data from pData into your vector, count the number of entries in the buffer; that way you only need to resize the vector once per call.Pallbearer
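A minimal sketch of the counting step suggested here, written as portable C++ so it can be tried without a volume handle. It assumes the FSCTL_ENUM_USN_DATA output layout: an 8-byte file reference number for the next call, followed by records that each begin with a 32-bit RecordLength. `count_records` and `reserve_for_buffer` are hypothetical names, not part of any API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Counts the records in one output buffer. Assumed layout: an 8-byte
// "next start" file reference number, then variable-length records whose
// first 4 bytes hold the length of the whole record.
std::size_t count_records(const unsigned char* buf, std::size_t bytes_returned)
{
    std::size_t count = 0;
    std::size_t offset = sizeof(std::uint64_t);  // skip the leading FRN
    while (offset + sizeof(std::uint32_t) <= bytes_returned) {
        std::uint32_t record_length = 0;
        std::memcpy(&record_length, buf + offset, sizeof(record_length));
        if (record_length == 0 || record_length > bytes_returned - offset)
            break;  // malformed or truncated record: stop instead of looping forever
        ++count;
        offset += record_length;
    }
    return count;
}

// Hypothetical helper: makes room for every record in this buffer, so the
// vector reallocates at most once per DeviceIoControl call.
template <typename T>
void reserve_for_buffer(std::vector<T>& v, const unsigned char* buf,
                        std::size_t bytes_returned)
{
    v.reserve(v.size() + count_records(buf, bytes_returned));
}
```

In the question's loop, this would be called with pData and cb before the records are copied. Each reserve may still copy the existing elements, so the benefit depends on the buffer being large enough that the loop runs only a handful of times.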
Alternatively, use a std::deque to prevent copying the entire controlled sequence when the container needs to grow. Assuming that operation has been identified as a bottleneck through profiling.Fung
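To illustrate the std::deque suggestion: appending at the end of a deque never relocates the elements already stored, so no reserve call is needed to avoid whole-container copies. The sketch below uses placeholder data, and `Entry` and `first_element_stays_put` are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

// Filename plus parent ID, as described in the question's comments.
using Entry = std::pair<std::wstring, std::uint64_t>;

// Appends n placeholder entries and reports whether the first element is
// still at the same address afterwards. For std::deque, insertion at the
// end invalidates iterators but not references or pointers to elements.
bool first_element_stays_put(std::size_t n)
{
    std::deque<Entry> entries;
    entries.emplace_back(L"first", 0);
    const Entry* before = &entries.front();
    for (std::size_t i = 1; i < n; ++i)
        entries.emplace_back(L"file", i);
    return before == &entries.front();
}
```

The trade-off is that a deque's storage is not contiguous, so code that needs a single flat array would still want a vector.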
@JesperJuhl: Guessing or not, opendir/readdir/closedir is not an answer to any Windows development question.Fung
@Fung you can't deny those do still work on Windows (although I agree they are not the best option on that platform).Impeachment
@JesperJuhl: How would you open a directory whose name contains Unicode code units that cannot be represented in an ANSI character set?Fung
