How do you determine the size of a file in C?
U

12

170

How can I figure out the size of a file, in bytes?

#include <stdio.h>

unsigned int fsize(char* file){
  //what goes here?
}
Unbelievable answered 11/8, 2008 at 21:16 Comment(4)
You're going to need to use a library function to retrieve the details of a file. As C is completely platform independent, you're going to need to let us know what platform / operating system you're developing for!Streamlined
Why char* file, why not FILE* file? -1Conventual
@Conventual so that ... just strlen!Hegel
Note that: the file can grow between fsize and read. Be careful.Hegel
A
177

On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page).
(Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream).

Based on NilObject's code:

#include <sys/stat.h>
#include <sys/types.h>

off_t fsize(const char *filename) {
    struct stat st; 

    if (stat(filename, &st) == 0)
        return st.st_size;

    return -1; 
}

Changes:

  • Made the filename argument a const char *.
  • Corrected the struct stat definition, which was missing the variable name.
  • Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.

If you want fsize() to print a message on error, you can use this:

#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

off_t fsize(const char *filename) {
    struct stat st;

    if (stat(filename, &st) == 0)
        return st.st_size;

    fprintf(stderr, "Cannot determine size of %s: %s\n",
            filename, strerror(errno));

    return -1;
}

On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
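
For the fstat route mentioned at the top of this answer, here is a minimal sketch that assumes a POSIX system. The name fsize_stream is made up for illustration; fileno and fstat are the standard POSIX calls.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size of an already-open stdio stream, or -1 on error. */
off_t fsize_stream(FILE *fp) {
    struct stat st;
    int fd = fileno(fp);            /* POSIX: descriptor behind the stream */

    if (fd == -1 || fstat(fd, &st) != 0)
        return -1;

    return st.st_size;
}
```

If the stream was opened for writing, call fflush(fp) first so that buffered data is counted. The same -D_FILE_OFFSET_BITS=64 advice applies on 32-bit systems.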

Aggappora answered 12/8, 2008 at 0:55 Comment(5)
This is Linux/Unix specific--probably worth pointing that out since the question didn't specify an OS.Wingfooted
You could probably change the return type to ssize_t and cast the size from an off_t without any trouble. It would seem to make more sense to use a ssize_t :-) (Not to be confused with size_t which is unsigned and cannot be used to indicate error.)Aggappora
For more portable code, use fseek + ftell as proposed by Derek.Karoline
"For more portable code, use fseek + ftell as proposed by Derek." No. The C Standard specifically states that fseek() to SEEK_END on a binary file is undefined behavior. 7.19.9.2 The fseek function: "... A binary stream need not meaningfully support fseek calls with a whence value of SEEK_END", and footnote 234 on p. 267 of the linked C Standard specifically labels fseek to SEEK_END in a binary stream as undefined behavior.Chat
From gnu libc manual: ... [non-POSIX] systems make a distinction between files containing text and files containing binary data, and the input and output facilities of ISO C provide for this distinction. ... In the GNU C Library, and on all POSIX systems, there is no difference between text streams and binary streams. When you open a stream, you get the same kind of stream regardless of whether you ask for binary. This stream can handle any file content, and has none of the restrictions that text streams sometimes have.Oconnor
B
83

Don't use int. Files over 2 gigabytes in size are common as dirt these days

Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt

IIRC POSIX defines off_t as a signed integer type, 64 bits wide on most current systems, which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having exabyte-scale files hanging around.

If you're on windows, you should use GetFileSizeEx - it actually uses a signed 64 bit integer, so they'll start hitting problems with 8 exabyte files. Foolish Microsoft! :-)
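
Since the width of off_t varies between systems, one option is to check it at build time and to print sizes through intmax_t rather than guessing a format specifier. This is only a sketch, assuming a POSIX-style <sys/types.h> and a C11 compiler; format_size is a made-up name.

```c
#define _FILE_OFFSET_BITS 64   /* request a 64-bit off_t where it is optional */
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Fail the build if off_t cannot represent files of 2 GB and larger. */
_Static_assert(sizeof(off_t) * CHAR_BIT >= 64, "off_t is narrower than 64 bits");

/* Hypothetical helper: format a size without assuming what off_t really is.
   Returns the number of characters written, like snprintf. */
int format_size(char *buf, size_t buflen, off_t size) {
    return snprintf(buf, buflen, "%jd", (intmax_t)size);
}
```

The cast to intmax_t with %jd works whatever the underlying type of off_t turns out to be.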

Bogtrotter answered 11/8, 2008 at 22:9 Comment(3)
I've used compilers where off_t is 32 bits. Granted, this is on embedded systems where 4GB files are less common. Anyways, POSIX also defines off64_t and corresponding methods to add to the confusion.Eliott
I always love answers that assume Windows and do nothing else but criticize the question. Could you please add something that's POSIX-compliant?Reluctivity
@JL2210 the accepted answer from Ted Percival shows a posix compliant solution, so I see no sense in repeating the obvious. I (and 70 others) thought that adding the note about windows and not to use signed 32 bit integers to represent file sizes was a value-add on top of that. CheersBogtrotter
O
34

Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.

unsigned long fsize(char* file)
{
    FILE * f = fopen(file, "r");
    fseek(f, 0, SEEK_END);
    unsigned long len = (unsigned long)ftell(f);
    fclose(f);
    return len;
}

Fixed your brace for you, too. ;)

Update: This isn't really the best solution. It's limited to 4GB files on Windows and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.
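
If you stay with the seek/tell pattern on a POSIX system, a variant along these lines avoids the long limit by using fseeko and ftello, which work in off_t. Treat it as a sketch: fsize_seek is a made-up name, and it still relies on the implementation supporting SEEK_END on binary streams.

```c
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit POSIX systems */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical variant of fsize(): returns the size, or -1 on any failure. */
off_t fsize_seek(const char *path) {
    FILE *f = fopen(path, "rb");    /* binary mode: offsets are byte counts */
    off_t len = -1;

    if (f == NULL)
        return -1;

    if (fseeko(f, 0, SEEK_END) == 0)
        len = ftello(f);            /* -1 on error, passed through unchanged */

    fclose(f);
    return len;
}
```

Returning the signed off_t unchanged keeps the -1 error value visible to the caller instead of hiding it behind an unsigned cast.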

Obi answered 11/8, 2008 at 21:26 Comment(6)
Yes, you should. However, unless there's a really compelling reason not to write platform-specific code, you should probably just use a platform-specific call rather than the open/seek-end/tell/close pattern.Obi
Sorry about the late reply, but I am having a major issue here. It causes the app to hang when accessing restricted files (like password protected or system files). Is there a way to ask the user for a password when needed?Hawkbill
@Justin, you should probably open a new question specifically about the issue you're running into, and provide details about the platform you're on, how you're accessing the files, and what the behavior is.Obi
Both C99 and C11 return long int from ftell(). Casting to (unsigned long) does not improve the range, which is already limited by the function. ftell() returns -1 on error, and that gets obfuscated by the cast. Suggest that fsize() return the same type as ftell().Ashok
I agree. The cast was to match the original prototype in the question. I can't recall why I turned it into unsigned long instead of unsigned int, though.Obi
Obviously you wouldn't want to use int, that would fail to handle large files even on a 64-bit system where long was a 64-bit type. (e.g. most non-Windows 64-bit systems use an LP64 ABI). But really you should use ftello which returns an off_t, which is 64-bit on every system with large file support.Nullity
U
16

Don't do this (why?):

Quoting the C99 standard doc that I found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state."

Change the definition to int so that error messages can be transmitted, and then use fseek() and ftell() to determine the file size.

int fsize(char* file) {
  int size;
  FILE* fh;

  fh = fopen(file, "rb"); //binary mode
  if(fh != NULL){
    if( fseek(fh, 0, SEEK_END) ){
      fclose(fh);
      return -1;
    }

    size = ftell(fh);
    fclose(fh);
    return size;
  }

  return -1; //error
}
Unbelievable answered 11/8, 2008 at 21:16 Comment(3)
@mezhaka: That CERT report is simply wrong. fseeko and ftello (or fseek and ftell if you're stuck without the former and happy with limits on the file sizes you can work with) are the correct way to determine the length of a file. stat-based solutions do not work on many "files" (such as block devices) and are not portable to non-POSIX-ish systems.Ovum
This is the only way to get the file size on many non-posix compliant systems (such as my very minimalistic mbed)Precast
You absolutely do not want to use int here. ftell returns a signed long, which is a 64-bit type on many (but not all) 64-bit systems. It's still only 32-bit on most 32-bit systems, so you need ftello with off_t to be able to handle large files portably. Despite ISO C choosing not to define the behaviour, most implementations do, so this does work in practice on most systems.Nullity
L
12

POSIX

The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.

Synopsis

  • Get file statistics using stat(3).
  • Obtain the st_size property.

Examples

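Note: Where off_t is 32 bits wide, the first example is limited to 2 GB files. Unless the filesystem itself caps file sizes (FAT32 stops at 4 GB), use the 64-bit version that follows it!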

#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char** argv)
{
    struct stat info;

    if (argc < 2 || stat(argv[1], &info) != 0)
        return 1;

    // 'st_' is the prefix of every struct stat member
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}

#define _LARGEFILE64_SOURCE // exposes stat64 on glibc
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char** argv)
{
    struct stat64 info;

    if (argc < 2 || stat64(argv[1], &info) != 0)
        return 1;

    // st_size is a 64-bit type here
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}

ANSI C (standard)

ANSI C doesn't directly provide a way to determine the length of a file.
We'll have to improvise. For now, we'll use the seek approach!

Synopsis

  • Seek the file to the end using fseek(3).
  • Get the current position using ftell(3).

Example

#include <stdio.h>

int main(int argc, char** argv)
{
    FILE* fp;
    long f_size;

    if (argc < 2 || (fp = fopen(argv[1], "rb")) == NULL)
        return 1;

    fseek(fp, 0, SEEK_END);
    f_size = ftell(fp);
    rewind(fp); // go back to the start again

    printf("%s: size=%ld\n", argv[1], f_size);
    fclose(fp);
    return 0;
}

If the file is stdin or a pipe, neither the POSIX nor the ANSI C approach will work.
The size typically comes back as 0 if the file is a pipe or stdin.

Opinion: You should use the POSIX approach instead, because it has 64-bit support.

Lysis answered 9/1, 2019 at 16:48 Comment(3)
struct __stat64 and _stat64() for Windows.Pelagi
The last example is incorrect, fopen takes two argumentsNahtanha
In ISO C, the function ftell is only guaranteed to give you the number of bytes from the beginning of the file when the file is open in binary mode. However, in text mode, the value returned by ftell is unspecified and is only meaningful to fseek.Blubber
S
4

If you're fine with using the std c library:

#include <sys/stat.h>
off_t fsize(char *file) {
    struct stat filestat;
    if (stat(file, &filestat) == 0) {
        return filestat.st_size;
    }
    return 0;
}
Swindle answered 11/8, 2008 at 21:21 Comment(1)
That's not standard C. It's part of the POSIX standard, but not the C standard.Obi
P
4

And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
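
A rough sketch of what that could look like, with a stat() fallback so the same function builds elsewhere. The name file_size and the choice of long long as the return type are assumptions made for illustration; CreateFileA, GetFileSizeEx and CloseHandle are the regular Win32 calls.

```c
#ifdef _WIN32
#include <windows.h>

/* Size of the file at path in bytes, or -1 on failure (Win32 branch). */
long long file_size(const char *path) {
    LARGE_INTEGER size;
    BOOL ok;
    HANDLE h = CreateFileA(path, GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    ok = GetFileSizeEx(h, &size);   /* fills a signed 64-bit LARGE_INTEGER */
    CloseHandle(h);
    return ok ? size.QuadPart : -1;
}
#else
#include <sys/types.h>
#include <sys/stat.h>

/* Same contract on POSIX systems, via stat(). */
long long file_size(const char *path) {
    struct stat st;
    return stat(path, &st) == 0 ? (long long)st.st_size : -1;
}
#endif
```

Neither branch goes through CRT stream I/O, so text-mode translation never comes into it.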

Pigling answered 12/8, 2008 at 0:15 Comment(0)
B
4

In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file, while counting the number of bytes read.

However, this is highly inefficient. If you want a more efficient solution, then you will have to either

  • rely on platform-specific behavior of the functions fseek and ftell, or
  • resort to platform-specific functions, such as stat on Linux or GetFileSize on Microsoft Windows.

In contrast to what other answers have suggested, the following code is not guaranteed to work:

fseek( fp, 0, SEEK_END );
long size = ftell( fp );

Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:

The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.

The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.

That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.

However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n in text mode (but not in binary mode), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.

On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms no translation takes place, so there is no difference between text mode and binary mode.
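
For completeness, the read-and-count approach described at the start of this answer might be sketched as follows. It uses only <stdio.h>; the name count_bytes and the long long return type are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a file by reading all of it.
   Returns the count, or -1 if the file cannot be opened or a read fails. */
long long count_bytes(const char *path) {
    unsigned char buf[4096];
    long long total = 0;
    size_t n;
    FILE *fp = fopen(path, "rb");   /* binary mode: no newline translation */

    if (fp == NULL)
        return -1;

    /* fread returns 0 at end-of-file or on error; ferror tells them apart. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)n;

    if (ferror(fp))
        total = -1;

    fclose(fp);
    return total;
}
```

The cost grows with the size of the file, which is why the platform-specific calls are usually the better choice.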

Blubber answered 6/1, 2023 at 17:49 Comment(2)
Another Windows problem: long is only 4 bytes on Windows, meaning ftell() will fail on Windows for files larger than 2 GB.Chat
@AndrewHenle: Yes, that is an important point. Meanwhile, I have edited my answer. I believe that I have now addressed your point in my answer.Blubber
G
3

I used this set of code to find the file length.

//needs <stdio.h> and <sys/stat.h>; opens the file as a stdio stream
FILE * i_file;
i_file = fopen(source, "r");

//gets the int file descriptor behind the stream, for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);

//stores file size (st_size is an off_t, which may be wider than long)
off_t file_length = buffer.st_size;
fclose(i_file);
Gus answered 8/2, 2014 at 1:54 Comment(1)
This solution is using platform-specific functions. It will likely not work on non-POSIX platforms. If you provide a platform-specific answer to a platform-agnostic question, then I suggest that you clearly mark it as such.Blubber
F
0

C++ MFC, with the size extracted from the Windows file details. I'm not sure whether this performs better than seeking, but since the size comes from metadata I think it is faster, because it doesn't need to open or read the file.

ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
    WIN32_FILE_ATTRIBUTE_DATA fileInfo;
    ULONGLONG FileSize = 0ULL;
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
    if (GetFileAttributesExW(wFile, GetFileExInfoStandard, &fileInfo))
    {
        ULARGE_INTEGER ul;
        ul.HighPart = fileInfo.nFileSizeHigh;
        ul.LowPart = fileInfo.nFileSizeLow;
        FileSize = ul.QuadPart;
    }
    return FileSize;
}
Fettling answered 9/1, 2022 at 17:2 Comment(0)
J
-2

Here's a simple and clean function that returns the file size.

long get_file_size(char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1; /* could not open the file */
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
Jat answered 6/6, 2016 at 15:27 Comment(6)
No, I dislike functions that expect a path. Instead, please make it expect a file pointer.Conventual
ftell might not be a byte offset, for text files (you open the file in text mode)Nahtanha
And what happens if you're running on Windows and the file size is 14 GB?Chat
@AndrewHenle: In that case you'd need to use ftello which returns an off_t, which can be a 64-bit type even when long isn't. I assume ftello still has the same problem of in theory being undefined behaviour seeking to the end of a binary stream as you described in an answer, but ISO C doesn't provide anything better AFAIK, so for a lot of programs the least-bad thing is to rely on implementations to define this behaviour.Nullity
@PeterCordes Windows uses _ftelli64() (What?!? Microsoft uses a non-portable function? In a way resulting in vendor lock-in?!!? Say it ain't so!) But if you're relying on implementation-defined behavior, you might as well use an implementation's method to get the file size. Both fileno() and stat() are supported on Windows, albeit in vendor-lock-in mode as _fileno() and _fstat(). #ifdef _WIN32 #define fstat _fstat #define fileno _fileno #endif is actually the most portable solution.Chat
(cont) Of course it's not quite that easy to write portable code that works on Windows - see the 32/64-bit manuscript at learn.microsoft.com/en-us/cpp/c-runtime-library/reference/…Chat
C
-2

I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:

size_t fsize(FILE *File) {
    size_t FSZ;
    fseek(File, 0, SEEK_END);
    FSZ = ftell(File);
    rewind(File);
    return FSZ;
}
Conventual answered 11/7, 2019 at 22:22 Comment(0)
