I'm trying to get the size of a large file (12 GB+), and I don't want to open the file to do so, as I assume this would eat a lot of resources. Is there a good API for this? I'm in a Windows environment.
You should call GetFileSizeEx, which is easier to use than the older GetFileSize. You will need to open the file by calling CreateFile, but that's a cheap operation. Your assumption that opening a file is expensive, even a 12GB file, is false.
You could use the following function to get the job done:
__int64 FileSize(const wchar_t* name)
{
    HANDLE hFile = CreateFile(name, GENERIC_READ,
        FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return -1; // error condition, could call GetLastError to find out more

    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size))
    {
        CloseHandle(hFile);
        return -1; // error condition, could call GetLastError to find out more
    }

    CloseHandle(hFile);
    return size.QuadPart;
}
There are other API calls that will return you the file size without forcing you to create a file handle, notably GetFileAttributesEx. However, it's perfectly plausible that this function will just open the file behind the scenes.
__int64 FileSize(const wchar_t* name)
{
    WIN32_FILE_ATTRIBUTE_DATA fad;
    if (!GetFileAttributesEx(name, GetFileExInfoStandard, &fad))
        return -1; // error condition, could call GetLastError to find out more

    // Combine the two 32-bit halves into one 64-bit size.
    LARGE_INTEGER size;
    size.HighPart = fad.nFileSizeHigh;
    size.LowPart = fad.nFileSizeLow;
    return size.QuadPart;
}
If you are compiling with Visual Studio and want to avoid calling Win32 APIs then you can use _wstat64.
Here is a _wstat64-based version of the function:
__int64 FileSize(const wchar_t* name)
{
    __stat64 buf;
    if (_wstat64(name, &buf) != 0)
        return -1; // error, could use errno to find out more
    return buf.st_size;
}
If performance ever becomes an issue for you, then you should time the various options on all the platforms that you target in order to reach a decision. Don't assume that the APIs that don't require you to call CreateFile will be faster. They might be, but you won't know until you have timed them.
std::wstring arguments by const reference... you're doing memory copies on each call :S – Udo
const wchar_t* as all you really want is to call .c_str() anyway; let the user decide where and if they want a memcpy. – Udo
GetCompressedFileSize have to open the file too, even though that takes a file name and not a file handle? – Commissure
I've also lived with the fear of the price paid for opening a file and closing it just to get its size, so I decided to ask the performance counter and see how expensive the operations really are.
This is the number of cycles it took to execute one file size query on the same file with each of the three methods. I tested on two files: 150 MB and 1.5 GB. The results fluctuated by +/- 10%, so they don't seem to be affected by the actual file size. (Obviously this depends on the CPU, but it gives you a good vantage point.)
- 190 cycles - CreateFile, GetFileSizeEx, CloseHandle
- 40 cycles - GetFileAttributesEx
- 150 cycles - FindFirstFile, FindClose
The gist with the code used is available here.
As we can see from this highly scientific :) test, the slowest is actually the file opener. The second slowest is the file finder, while the winner is the attributes reader. Now, in terms of reliability, CreateFile should be preferred over the other two. But I still don't like the concept of opening a file just to read its size... Unless I'm doing size-critical stuff, I'll go for the attributes.
PS: When I have time, I'll try reading the sizes of files that are open and being written to. But not right now...
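Since the code behind the numbers above is not reproduced here, the following is only a rough sketch of how such a comparison could be set up. It is not the harness from the gist: `AverageNanoseconds` is a hypothetical helper, and it measures wall-clock time with `std::chrono::steady_clock` rather than CPU cycles.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not the original benchmark): calls `query`
// `iterations` times and returns the average wall-clock time per call
// in nanoseconds. Returns 0.0 when no iterations are requested.
template <typename Query>
double AverageNanoseconds(Query query, std::size_t iterations)
{
    if (iterations == 0)
        return 0.0;

    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        query();
    const auto stop = clock::now();

    // Convert the elapsed interval to fractional nanoseconds.
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Each of the three approaches can then be wrapped in a lambda, for example `AverageNanoseconds([&] { FileSize(path); }, 10000)` with one of the `FileSize` functions shown earlier. Repeated queries on the same file will mostly hit the operating system's cache, so this measures the warm case only.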
Another option using the FindFirstFile function
#include "stdafx.h"
#include <windows.h>
#include <tchar.h>
#include <stdio.h>

int _tmain(int argc, _TCHAR* argv[])
{
    WIN32_FIND_DATA FindFileData;
    HANDLE hFind;
    // TEXT() matches LPCTSTR in both ANSI and Unicode builds.
    LPCTSTR lpFileName = TEXT("C:\\Foo\\Bar.ext");

    hFind = FindFirstFile(lpFileName, &FindFileData);
    if (hFind == INVALID_HANDLE_VALUE)
    {
        // GetLastError returns a DWORD, so print it with %lu.
        _tprintf(TEXT("File not found (%lu)\n"), GetLastError());
        return -1;
    }
    else
    {
        // Combine the two 32-bit halves into one 64-bit size.
        ULONGLONG FileSize = FindFileData.nFileSizeHigh;
        FileSize <<= sizeof(FindFileData.nFileSizeHigh) * 8;
        FileSize |= FindFileData.nFileSizeLow;
        // %llu, not %u: FileSize is a 64-bit value.
        _tprintf(TEXT("file size is %llu\n"), FileSize);
        FindClose(hFind);
    }
    return 0;
}
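The high/low combination used above can be checked in isolation. The sketch below is only an illustration: `CombineSize` is a hypothetical helper, and the standard fixed-width types stand in for the Win32 `DWORD` fields `nFileSizeHigh` and `nFileSizeLow`.

```cpp
#include <cstdint>

// Hypothetical helper: joins the two 32-bit halves of a file size into a
// single 64-bit value. `high` and `low` stand in for the DWORD fields
// nFileSizeHigh and nFileSizeLow.
std::uint64_t CombineSize(std::uint32_t high, std::uint32_t low)
{
    // Widen before shifting; shifting a 32-bit value by 32 bits is undefined.
    return (static_cast<std::uint64_t>(high) << 32) | low;
}
```

For a file of exactly 12 GiB the high part is 3 and the low part is 0, so `CombineSize(3, 0)` yields 12884901888. A value of that size does not fit in 32 bits, which is why it has to be printed with a 64-bit format specifier.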
ULARGE_INTEGER instead of twiddling the ULONGLONG bits manually, e.g.: ULARGE_INTEGER ul; ul.LowPart = FindFileData.nFileSizeLow; ul.HighPart = FindFileData.nFileSizeHigh; ULONGLONG FileSize = ul.QuadPart;. Also, %u expects a 32-bit unsigned int on Windows; you need to use %llu (or %I64u) instead for a 64-bit integer. – Winded
As of C++17, there is std::filesystem::file_size as part of the standard library. (Then the implementor gets to decide how to do it efficiently!)
What about GetFileSize function?
GetFileSize() requires the file to be opened first, then it uses that handle to determine where the file is located in the filesystem so it can grab the size. If you use FindFirstFile() instead, it queries the filesystem without needing to open the file. – Winded
CreateFile() can be rather slow if you're opening the file on slow media like network drives, but the slowness would be due to storage access latencies and not because of the fact that the file is huge. – Cram