How can I figure out the size of a file, in bytes?
#include <stdio.h>
unsigned int fsize(char* file){
    //what goes here?
}
On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page). (Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream.)
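As a minimal sketch of the fstat route mentioned above, the following function takes an already-open stdio stream, gets its descriptor with fileno(), and reads st_size. The name stream_size is hypothetical, and the code assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fileno() under strict -std= modes */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size in bytes of the file behind an open stream,
   or -1 on error. Assumes a POSIX system (fileno and fstat). */
off_t stream_size(FILE *stream) {
    struct stat st;
    int fd;

    if (stream == NULL)
        return -1;
    fd = fileno(stream);              /* descriptor behind the stdio stream */
    if (fd == -1 || fstat(fd, &st) != 0)
        return -1;
    return st.st_size;
}
```

Note that fstat reports what has reached the file, so data still sitting in the stdio buffer is not counted until the stream is flushed.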
Based on NilObject's code:
#include <sys/stat.h>
#include <sys/types.h>
off_t fsize(const char *filename) {
    struct stat st;
    if (stat(filename, &st) == 0)
        return st.st_size;
    return -1;
}
Changes:
- Made the filename argument a const char.
- Corrected the struct stat definition, which was missing the variable name.
- Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.
If you want fsize() to print a message on error, you can use this:
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
off_t fsize(const char *filename) {
    struct stat st;
    if (stat(filename, &st) == 0)
        return st.st_size;
    fprintf(stderr, "Cannot determine size of %s: %s\n",
            filename, strerror(errno));
    return -1;
}
On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
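The same effect can be had from inside the source file. The sketch below assumes a glibc-style system where defining _FILE_OFFSET_BITS before the first system header selects a 64-bit off_t; fsize_lfs and format_size are hypothetical names used only for illustration.

```c
#define _FILE_OFFSET_BITS 64   /* must precede every system header */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Same stat-based approach as above; with the macro set, off_t is
   64 bits wide even on 32-bit systems with large file support. */
off_t fsize_lfs(const char *filename) {
    struct stat st;
    if (stat(filename, &st) == 0)
        return st.st_size;
    return -1;
}

/* Hypothetical helper: writes the size as decimal text into buf.
   off_t has no printf conversion of its own, so it is cast to long long. */
int format_size(off_t size, char *buf, size_t buflen) {
    return snprintf(buf, buflen, "%lld", (long long)size);
}
```

Casting to long long before printing avoids a mismatch between the format string and the actual width of off_t.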
fseek + ftell as proposed by Derek. – Karoline
fseek + ftell as proposed by Derek. No. The C Standard specifically states that fseek() to SEEK_END on a binary file is undefined behavior. 7.19.9.2 The fseek function ... A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END, and as noted below, which is from footnote 234 on p. 267 of the linked C Standard, and which specifically labels fseek to SEEK_END in a binary stream as undefined behavior. – Chat
Don't use int. Files over 2 gigabytes in size are common as dirt these days.
Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt.
IIRC POSIX defines off_t as a signed integer type that is 64 bits wide on systems with large file support, which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having 8 exabyte files hanging around.
If you're on Windows, you should use GetFileSizeEx - it also uses a signed 64 bit integer, so it will start hitting problems with 8 exabyte files too. :-)
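To make the advice above concrete, here is a hedged sketch of a wrapper that calls GetFileSizeEx on Windows and stat elsewhere, returning a signed 64-bit value either way. The name file_size64 is hypothetical, and the Windows branch only handles narrow (ANSI) paths.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/types.h>
#include <sys/stat.h>
#endif

/* Hypothetical wrapper: file size in bytes as a signed 64-bit value,
   or -1 on error. */
long long file_size64(const char *path) {
#ifdef _WIN32
    LARGE_INTEGER size;
    HANDLE h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    if (!GetFileSizeEx(h, &size)) {   /* fills a signed 64-bit QuadPart */
        CloseHandle(h);
        return -1;
    }
    CloseHandle(h);
    return size.QuadPart;
#else
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;     /* off_t widened to long long */
#endif
}
```

Returning long long keeps the caller independent of whether the platform's native type is off_t or LARGE_INTEGER.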
Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.
unsigned long fsize(char* file)
{
    FILE * f = fopen(file, "r");
    fseek(f, 0, SEEK_END);
    unsigned long len = (unsigned long)ftell(f);
    fclose(f);
    return len;
}
Fixed your brace for you, too. ;)
Update: This isn't really the best solution. It's limited to 4GB files on Windows and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.
long int from ftell(). (unsigned long) casting does not improve the range as already limited by the function. ftell() returns -1 on error and that gets obfuscated with the cast. Suggest fsize() return the same type as ftell(). – Ashok
int, that would fail to handle large files even on a 64-bit system where long was a 64-bit type. (e.g. most non-Windows 64-bit systems use an LP64 ABI). But really you should use ftello which returns an off_t, which is 64-bit on every system with large file support. – Nullity
Don't do this (why?):
Quoting the C99 standard doc that I found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state."
Change the definition to int so that error messages can be transmitted, and then use fseek() and ftell() to determine the file size.
int fsize(char* file) {
    int size;
    FILE* fh;
    fh = fopen(file, "rb"); //binary mode
    if(fh != NULL){
        if( fseek(fh, 0, SEEK_END) ){
            fclose(fh);
            return -1;
        }
        size = ftell(fh);
        fclose(fh);
        return size;
    }
    return -1; //error
}
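For files that may exceed the range of int or long, a hedged variant of the seek-based function above can use the POSIX functions fseeko() and ftello(), which work with off_t. The name fsize_seek is hypothetical. Like the original, this relies on the implementation supporting SEEK_END on a binary stream, which ISO C does not guarantee.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fseeko() and ftello() */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical variant: file size via seeking, returned as off_t.
   Returns -1 if the file cannot be opened or the seek fails. */
off_t fsize_seek(const char *file) {
    off_t size = -1;
    FILE *fh = fopen(file, "rb");     /* binary mode */

    if (fh == NULL)
        return -1;
    if (fseeko(fh, 0, SEEK_END) == 0)
        size = ftello(fh);            /* ftello itself returns -1 on failure */
    fclose(fh);
    return size;
}
```

On 32-bit systems this still needs a 64-bit off_t (for example via -D_FILE_OFFSET_BITS=64) to handle files over 2 GB.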
fseeko and ftello (or fseek and ftell if you're stuck without the former and happy with limits on the file sizes you can work with) are the correct way to determine the length of a file. stat-based solutions do not work on many "files" (such as block devices) and are not portable to non-POSIX-ish systems. – Ovum
int here. ftell returns a signed long, which is a 64-bit type on many (but not all) 64-bit systems. It's still only 32-bit on most 32-bit systems, so you need ftello with off_t to be able to handle large files portably. Despite ISO C choosing not to define the behaviour, most implementations do, so this does work in practice on most systems. – Nullity
The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.
Call stat(3), then read the .st_size property of the result.
Note: With a 32-bit off_t, the reported size is limited (to 2 GB, because off_t is signed). Unless you are on a FAT32 filesystem, where files cannot exceed 4 GB anyway, use the 64-bit version!
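#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat info;
    // bail out if no path was given or stat() fails
    if (argc < 2 || stat(argv[1], &info) != 0)
        return 1;
    // the 'st_' prefix marks members of struct stat
    // off_t has no printf conversion of its own, so cast to long long
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
}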
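#define _LARGEFILE64_SOURCE // on glibc, exposes struct stat64 and stat64()
#include <stdio.h>
#include <sys/stat.h>
int main(int argc, char** argv)
{
    struct stat64 info;
    // bail out if no path was given or stat64() fails
    if (argc < 2 || stat64(argv[1], &info) != 0)
        return 1;
    // the 'st_' prefix marks members of struct stat64
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
}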
ANSI C doesn't directly provide a way to determine the length of a file.
We'll have to use our heads. For now, we'll use the seek approach!
#include <stdio.h>
int main(int argc, char** argv)
{
    FILE* fp = fopen(argv[1]);
    long f_size; // ftell returns a long
    fseek(fp, 0, SEEK_END);
    f_size = ftell(fp);
    rewind(fp); // to back to start again
    printf("%s: size=%ld\n", argv[1], f_size);
}
If the file is stdin or a pipe, neither the POSIX nor the ANSI C approach will work.
It will return 0 if the file is a pipe or stdin.
Opinion: You should use the POSIX standard instead, because it has 64-bit support.
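One way to avoid silently reporting a meaningless size for pipes or terminals is to check the file type before trusting st_size. The following is a minimal sketch assuming a POSIX system; regular_stream_size and its -2 return code are hypothetical choices for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes fileno() under strict -std= modes */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size of the regular file behind a stream.
   Returns -1 if the stream cannot be examined and -2 if it is not a
   regular file (pipe, terminal, socket, device, ...). */
off_t regular_stream_size(FILE *stream) {
    struct stat st;

    if (stream == NULL || fstat(fileno(stream), &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -2;                /* st_size is not meaningful here */
    return st.st_size;
}
```

A caller that gets -2 can fall back to reading the stream to the end and counting bytes.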
struct _stat64 and __stat64() for Windows. – Pelagi
fopen takes two arguments – Nahtanha
ftell is only guaranteed to give you the number of bytes from the beginning of the file when the file is open in binary mode. However, in text mode, the value returned by ftell is unspecified and is only meaningful to fseek. – Blubber
If you're fine with using POSIX (stat() is not part of the standard C library):
#include <sys/stat.h>
off_t fsize(char *file) {
    struct stat filestat;
    if (stat(file, &filestat) == 0) {
        return filestat.st_size;
    }
    return 0;
}
And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file, while counting the number of bytes read.
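A minimal sketch of that read-and-count approach, using only stdio.h, might look like this. The name count_bytes is hypothetical, and the stream is assumed to have been opened in binary mode ("rb") and to be positioned at the start.

```c
#include <stdio.h>

/* Counts the bytes from the current position to end-of-file by
   reading them all. Returns -1 on a read error or a NULL stream. */
long long count_bytes(FILE *fp) {
    unsigned char buf[4096];
    long long total = 0;
    size_t n;

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)n;
    if (ferror(fp))               /* the loop ended on an error, not on EOF */
        return -1;
    return total;
}
```

The cost is proportional to the file size, since every byte is read once.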
However, this is highly inefficient. If you want a more efficient solution, then you will have to either
- rely on the platform-specific behavior of fseek and ftell, or
- use a platform-specific function, such as stat on Linux or GetFileSize on Microsoft Windows.
In contrast to what other answers have suggested, the following code is not guaranteed to work:
fseek( fp, 0, SEEK_END );
long size = ftell( fp );
Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:
- The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.
- The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.
That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.
However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n in text mode (but not in binary mode), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.
On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms, no translation takes place, so that there is no difference between text mode and binary mode.
long is only 4 bytes on Windows, meaning ftell() will fail on Windows for files larger than 2 GB. – Chat
I used this set of code to find the file length.
#include <stdio.h>
#include <sys/stat.h>
//opens the file as a stdio stream (error checking omitted)
FILE * i_file;
i_file = fopen(source, "r");
//gets the underlying file descriptor (an int) for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);
//stores file size (st_size has type off_t)
off_t file_length = buffer.st_size;
fclose(i_file);
C++ (Win32, as used from MFC): this reads the size from the file's attribute data. I'm not sure whether this performs better than seeking, but since the size comes from metadata I think it is faster, because it doesn't need to read the entire file.
#include <windows.h>
ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
    WIN32_FILE_ATTRIBUTE_DATA fileInfo;
    ULONGLONG FileSize = 0ULL;
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
    // the W variant is called explicitly because wFile is a wide string
    if (GetFileAttributesExW(wFile, GetFileExInfoStandard, &fileInfo))
    {
        // combine the two 32-bit halves into one 64-bit size
        ULARGE_INTEGER ul;
        ul.HighPart = fileInfo.nFileSizeHigh;
        ul.LowPart = fileInfo.nFileSizeLow;
        FileSize = ul.QuadPart;
    }
    return FileSize; // 0 if the attributes could not be read
}
Here's a simple and clean function that returns the file size.
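#include <stdio.h>
long get_file_size(char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1; /* could not open the file */
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp); /* stays -1 if the seek fails */
    fclose(fp);
    return size;
}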
ftell might not be a byte offset, for text files (you open the file in text mode) – Nahtanha
ftello which returns an off_t, which can be a 64-bit type even when long isn't. I assume ftello still has the same problem of in theory being undefined behaviour seeking to the end of a binary stream as you described in an answer, but ISO C doesn't provide anything better AFAIK, so for a lot of programs the least-bad thing is to rely on implementations to define this behaviour. – Nullity
_ftelli64() (What?!? Microsoft uses a non-portable function? In a way resulting in vendor lock-in?!!? Say it ain't so!) But if you're relying on implementation-defined behavior, you might as well use an implementation's method to get the file size. Both fileno() and stat() are supported on Windows, albeit in vendor-lock-in mode as _fileno() and _fstat().
#ifdef _WIN32
#define fstat _fstat
#define fileno _fileno
#endif
is actually the most portable solution. – Chat
I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:
size_t fsize(FILE *File) {
    size_t FSZ;
    fseek(File, 0, SEEK_END); // SEEK_END instead of the magic number 2
    FSZ = ftell(File);
    rewind(File); // back to the start of the file
    return FSZ;
}
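A hedged variant of the stream-based function above can report errors and put the stream back where it was, instead of always rewinding to the start. The name fsize_keep_pos is hypothetical. Like the original, it relies on the implementation supporting SEEK_END, and it is limited to sizes that fit in a long.

```c
#include <stdio.h>

/* Hypothetical variant: size of an open stream in bytes, restoring the
   original file position afterwards. Returns -1 on any error. */
long fsize_keep_pos(FILE *fp) {
    long pos, size;

    if (fp == NULL)
        return -1;
    pos = ftell(fp);                      /* remember the current position */
    if (pos == -1L)
        return -1;
    if (fseek(fp, 0, SEEK_END) != 0)
        return -1;
    size = ftell(fp);                     /* -1 if ftell fails */
    if (fseek(fp, pos, SEEK_SET) != 0)    /* go back to where we started */
        return -1;
    return size;
}
```

Returning long rather than size_t keeps the -1 error value from turning into a huge positive number.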
char* file, why not FILE* file? -1 – Conventual
strlen! – Hegel
fsize and read. Be careful. – Hegel