Get file size without using System.IO.FileInfo?

Is it possible to get the size of a file in C# without using System.IO.FileInfo at all?

I know that you can get other things like Name and Extension by using Path.GetFileName(yourFilePath) and Path.GetExtension(yourFilePath) respectively, but apparently not file size? Is there another way I can get file size without using System.IO.FileInfo?

The only reason for this is that, if I'm correct, FileInfo grabs more info than I really need, so gathering all those FileInfo objects takes longer than necessary when the only thing I need is the size of the file. Is there a faster way?

Grano answered 18/1, 2013 at 21:29 Comment(12)
Premature optimization is the root of all evil. Use FileInfo, profile the code, and determine if it is fast enough for your needs. If you have verified that it is both a substantial percentage of the runtime of your application, and that your application is unacceptably slow, then consider other options.Snowbird
I would imagine it's the file size taking the bulk of the time, with the other items coming along for the ride basically for free.Dreamadreamer
Premature optimization is the root of all evil. Is this really causing an issue for you?Vida
@Dreamadreamer And that's assuming the information isn't lazily loaded to begin with.Snowbird
@Snowbird Yep. Profile profile profile.Dreamadreamer
I have a small application that gathers the size info and saves it into an array... but I often have half a million files, give or take and that takes a while to go through all of those files (I'm using FileInfo). I was just wondering if there was a faster way...Grano
@Grano So how long does it take to run? How long does it need to run in for you to meet your requirements?Snowbird
A well-known problem with FileInfo is that it only obtains the data that you ask for. But pretty convenient right now and the reason that trying to optimize it is pointless.Maggie
@Snowbird Requirements can't tell you what is possible. I know what you're getting at, but the OP is looking to determine BAU: what should they expect? If the OP knows that FileInfo generally carries 15% overhead without optimization X, I believe that is what they are after.Lepine
@AaronMcIver If you know that not doing optimization X is 15% slower, but your application spends .001% of its time doing that task, then there is no compelling reason to use that optimization. However, that is the reason I have just posted comments, and not an answer saying that he should just use FileInfo, because it is not an answer to the question, just the likely course of action the OP should take anyway.Snowbird
System.IO.FileInfo uses Win32's FindFirstFile API call to extract a WIN32_FIND_DATA structure. You could use GetFileSizeEx, but it requires a HANDLE, which you must obtain by opening the file first. I would assume the former performs better. If you really need insane performance, then try the Win32 calls to FindFirstFile (and FindClose) yourself.Investigation
@Servy, I have a meeting shortly but I will run some numbers and get back with specific results. Thank you!Grano
G
10

I performed a benchmark using these two methods:

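    public static uint GetFileSizeA(string filename)
    {
        WIN32_FIND_DATA findData;
        IntPtr findHandle = FindFirstFile(filename, out findData);
        // Release the search handle (IntPtr(-1) is INVALID_HANDLE_VALUE, i.e. the lookup failed).
        if (findHandle != new IntPtr(-1))
            FindClose(findHandle);
        // Low 32 bits only: sizes of 4 GiB or more also need nFileSizeHigh.
        return findData.nFileSizeLow;
    }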

    public static uint GetFileSizeB(string filename)
    {
        IntPtr handle = CreateFile(
            filename,
            FileAccess.Read,
            FileShare.Read,
            IntPtr.Zero,
            FileMode.Open,
            FileAttributes.ReadOnly,
            IntPtr.Zero);
        long fileSize;
        GetFileSizeEx(handle, out fileSize);
        CloseHandle(handle);
        return (uint) fileSize;
    }

Running against a bit over 2300 files, GetFileSizeA took 62-63ms to run. GetFileSizeB took over 18 seconds.

Unless someone sees something I'm doing wrong, I think the answer is clear as to which method is faster.

Is there a way I can refrain from actually opening the file?

Update

Changing FileAttributes.ReadOnly to FileAttributes.Normal reduced the timing so that the two methods were identical in performance.

Furthermore, if you skip the CloseHandle() call, the GetFileSizeEx method becomes about 20-30% faster, though I don't know that I'd recommend that.
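
For readers who want to try the FindFirstFile approach themselves, here is a minimal sketch of the P/Invoke declarations it needs, assuming Windows and the Unicode entry point. The class name `NativeFileSize` and the method name `GetFileSize` are hypothetical and are not the code that was benchmarked above. This variant combines `nFileSizeHigh` and `nFileSizeLow` so that sizes of 4 GiB or more are not truncated.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

// Hypothetical helper class; the names are illustrative only.
public static class NativeFileSize
{
    private static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    // Managed layout of the Win32 WIN32_FIND_DATAW structure.
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", EntryPoint = "FindFirstFileW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FindClose(IntPtr hFindFile);

    // Windows only: on other platforms the call fails because kernel32.dll is absent.
    public static long GetFileSize(string path)
    {
        WIN32_FIND_DATA findData;
        IntPtr handle = FindFirstFile(path, out findData);
        if (handle == InvalidHandleValue)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        FindClose(handle); // the search handle is not needed once the data is read

        // Combine the two 32-bit halves into the full 64-bit size.
        return ((long)findData.nFileSizeHigh << 32) | findData.nFileSizeLow;
    }
}
```

Throwing on an invalid handle is a design choice for the sketch; a batch job over hundreds of thousands of files might prefer to return -1 and carry on.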

Guerrilla answered 18/1, 2013 at 22:41 Comment(6)
It can be further improved by using FindFirstFileEx and limit the search if possible.Parley
@Parley How do you mean limit it? It's searching for a specific filename. In the benchmarking, I got a list of all the files in a directory and then called GetFileSizeA() or GetFileSizeB() with the full path and filename.Guerrilla
I assume that by "it takes longer to gather all those" the OP meant that he's grabbing many files. He didn't say he necessarily goes for only one file in one directory. So I'm pointing out that I agree he should use FindFirstFile (with FindNextFile) in that case, or possibly FindFirstFileEx, as it provides more ways to specify the search (only folders, large fetch, etc.).Parley
@Pete: I'm getting some reference errors when trying to test the methods you suggested. Do I have to call any particular libraries for these? Thanks.Grano
@Grano Oops. Fixed the code. Change handle to an IntPtr instead. There's extra cost in using SafeHandle because it needs to be released.Guerrilla
@Guerrilla : Could you please tell me where I can find the complete functions for 'FindFirstFile()', 'CreateFile()', 'GetFileSizeEx()', 'CloseHandle()'. I want to use your code in C#.Gwenore
B
6

From a short test I did, I found that using a FileStream is just 1 millisecond slower on average than using Pete's GetFileSizeB (it took me about 21 milliseconds over a network share...). Personally, I prefer staying within the BCL limits whenever I can.

The code is simple:

using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    return file.Length;
}
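
To check how this compares with FileInfo on your own files, a small timing harness helps. The following is a minimal sketch, not the test used above; the class and method names (`SizeTiming`, `ViaFileInfo`, `ViaFileStream`, `Measure`) are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical timing helper; the names are illustrative only.
public static class SizeTiming
{
    public static long ViaFileInfo(string path)
    {
        return new FileInfo(path).Length;
    }

    public static long ViaFileStream(string path)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            return file.Length;
        }
    }

    // Calls getSize for every path and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Func<string, long> getSize, string[] paths)
    {
        Stopwatch watch = Stopwatch.StartNew();
        foreach (string path in paths)
        {
            getSize(path);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Call it as `SizeTiming.Measure(SizeTiming.ViaFileInfo, paths)` and `SizeTiming.Measure(SizeTiming.ViaFileStream, paths)`. Whichever method runs first may pay for a cold file-system cache, so it is worth running each more than once before comparing.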
Brenza answered 19/12, 2013 at 9:19 Comment(1)
In my scenario this code is slower than using FileInfo?Sectional
S
3

As per this comment:

I have a small application that gathers the size info and saves it into an array... but I often have half a million files, give or take and that takes a while to go through all of those files (I'm using FileInfo). I was just wondering if there was a faster way...

Since you're finding the length of so many files you're much more likely to benefit from parallelization than from trying to get the file size through another method. The FileInfo class should be good enough, and any improvements are likely to be small.

Parallelizing the file size requests, on the other hand, has the potential for significant improvements in speed. (Note that the degree of improvement will be largely based on your disk drive, not your processor, so results can vary greatly.)
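
As a rough illustration of that idea, here is a minimal sketch using `Parallel.For` from the Task Parallel Library. It assumes the paths have already been collected into an array; `ParallelSizes` and `GetSizes` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper; the names are illustrative only.
public static class ParallelSizes
{
    // Returns the sizes in the same order as paths.
    public static long[] GetSizes(string[] paths)
    {
        var sizes = new long[paths.Length];

        // Each iteration writes only its own slot, so no locking is needed.
        Parallel.For(0, paths.Length, i =>
        {
            sizes[i] = new FileInfo(paths[i]).Length;
        });

        return sizes;
    }
}
```

If a file is missing, the exception thrown by `FileInfo.Length` surfaces from `Parallel.For` wrapped in an `AggregateException`. Whether this beats a plain loop depends on the storage, so measure before adopting it.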

Snowbird answered 18/1, 2013 at 22:16 Comment(1)
Actually, if he's gathering lots of files from individual directories, he may benefit from using FindFirstFile() and FindNextFile() to iterate through the files in a directory, though I have no numbers to back that up.Guerrilla
S
3

Not a direct answer...because I am not sure there is a faster way using the .NET framework.

Here's the code I am using:

  List<long> list = new List<long>();
  DirectoryInfo di = new DirectoryInfo("C:\\Program Files");
  FileInfo[] fiArray = di.GetFiles("*", SearchOption.AllDirectories);
  foreach (FileInfo f in fiArray)
    list.Add(f.Length);

Running that, it took 2709ms to run on my "Program Files" directory, which was around 22720 files. That's no slouch by any means. Furthermore, when I put *.txt as a filter for the first parameter of the GetFiles method, it cut the time down drastically to 461ms.

A lot of this will depend on how fast your hard drive is, but I really don't think that FileInfo is killing performance.

NOTE: I think this is only valid for .NET 4+
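
A variation worth trying with very large trees is `DirectoryInfo.EnumerateFiles` (added in .NET Framework 4), which yields entries as they are found instead of building the whole `FileInfo[]` array first. This is a minimal sketch and not what was timed above; `StreamingSizes` and `GetSizes` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; the names are illustrative only.
public static class StreamingSizes
{
    public static List<long> GetSizes(string root, string pattern)
    {
        var sizes = new List<long>();
        var directory = new DirectoryInfo(root);

        // EnumerateFiles streams results lazily; GetFiles would materialize them all up front.
        foreach (FileInfo file in directory.EnumerateFiles(pattern, SearchOption.AllDirectories))
        {
            sizes.Add(file.Length);
        }

        return sizes;
    }
}
```

With both `GetFiles` and `EnumerateFiles`, a subdirectory the process cannot read can abort the walk with an `UnauthorizedAccessException`, which is plausible under "Program Files", so real code may need to recurse manually and catch it per directory.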

Scourings answered 13/2, 2013 at 20:21 Comment(5)
How is this relevant? You're using the very methods the OP doesn't want to use.Trueman
@Trueman It's relevant because it showcases that using FileInfo probably isn't going to drastically decrease performance.Scourings
That's when dealing with local files. When dealing with network files using FileInfo is slow. There is a codeproject FastFileInfo that addresses this.Palm
@Palm Thank you so much! That FastFileInfo was helping me, as I was getting 23.5k FileInfos through network, and it took about 20mins. Now it only takes 1:09 min O.o! This is amazing!Cutter
@KeenoraFluffball Good stuff. There is a recursion exception with the one on codeproject. I rewrote it and put it on sourceforge.net, and the sourceforge version should also be a little faster.Palm
G
1

A quick'n'dirty solution if you want to do this on the .NET Core or Mono runtimes on non-Windows hosts:

Include the Mono.Posix.NETStandard NuGet package, then something like this...

using Mono.Unix.Native;

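private long GetFileSize(string filePath)
{
    Stat stat;
    // Syscall.stat returns 0 on success and -1 on failure (for example, a missing file).
    if (Syscall.stat(filePath, out stat) != 0)
    {
        throw new System.IO.IOException("stat failed for " + filePath);
    }
    return stat.st_size;
}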

I've tested this running .NET Core on Linux and macOS - not sure if it works on Windows - it might, given that these are POSIX syscalls under the hood (and the package is maintained by Microsoft). If not, combine with the other P/Invoke-based answer to cover all platforms.

When compared to FileInfo.Length, this gives me much more reliable results when getting the size of a file that is actively being written to by another process/thread.

Gallnut answered 20/1, 2020 at 2:36 Comment(0)
N
0

You can try this:

[DllImport("kernel32.dll")]
static extern bool GetFileSizeEx(IntPtr hFile, out long lpFileSize);

But that's not much of an improvement...

Here's the example code taken from pinvoke.net:

IntPtr handle = CreateFile(
    PathString, 
    GENERIC_READ, 
    FILE_SHARE_READ, 
    0, 
    OPEN_EXISTING, 
    FILE_ATTRIBUTE_READONLY, 
    0); //PInvoked too

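if (handle == new IntPtr(-1)) // INVALID_HANDLE_VALUE; compare the pointer itself instead of converting to Int32
{
    return; 
}

long fileSize;
bool result = GetFileSizeEx(handle, out fileSize);
CloseHandle(handle); // PInvoked too; release the handle whether or not the call succeeded
if (!result) 
{
    return;
}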
Noelnoelani answered 18/1, 2013 at 21:33 Comment(8)
@Venson: definitely better than yours, I'm sorry to say ;)Noelnoelani
Of course, but that's what I said! That's just an idea; please READ before voting!Fanatic
@Fanatic No, that's not what you said. You said they were the same. They're not. Yours is terrible, this is probably more annoying, but likely not (at least much) worse.Snowbird
@Grano The two best methods are GetFileSizeEx() (as above) and FindFirstFile (which is what FileInfo uses). I don't know that there's any particular performance advantage to one over the other. But if performance is really critical, you should time the two methods and see which is actually faster. Using FileInfo may be as fast.Guerrilla
This is the same as FileStream.Length, just the less readable version.Lamond
Thank you!! I will run some tests and time these options.Grano
@TimSchmelter The concern of the OP is that FileInfo will also grab additional info; so this may find the file info just as quickly, but by not also querying for the last modified date (as an example) it might be quicker. Now, I doubt that's true (due to lazy loading), but it's at least something to address.Snowbird
@Tim Schmelter - Not entirely true. FileStream.Length is not static and thus you have to instantiate and the instantiation of a FileStream does have some cost with it.Guerrilla
