Using C++ filestreams (fstream), how can you determine the size of a file?
R

6

101

I'm sure I've just missed this in the manual, but how do you determine the size of a file (in bytes) using C++'s istream class from the fstream header?

Reggi answered 9/3, 2010 at 14:0 Comment(2)
#2283049Matejka
@NarendraN - that doesn't use fstream, as this question explicitly asks forReggi
S
71

You can seek to the end, then compute the difference:

#include <fstream>

std::streampos fileSize( const char* filePath ){

    std::streampos fsize = 0;
    std::ifstream file( filePath, std::ios::binary );

    fsize = file.tellg();               // position right after opening
    file.seekg( 0, std::ios::end );     // move to the end of the file
    fsize = file.tellg() - fsize;       // end position minus start position
    file.close();

    return fsize;
}
Snowplow answered 9/3, 2010 at 14:4 Comment(18)
Out of interest, is the first call to tellg not guaranteed to return 0?Rumor
@Steve Honestly, I am not sure. I couldn't figure it out from the standards :(Snowplow
I had to remove the subtraction aspect - but just reading file.tellg() after the seekg() gives the same byte size as is reported by the shell (running on CentOS 4 with g++ 3.4.6)Reggi
I wonder why in the 21st century we still have to count the size of the stream manually? Why doesn't the committee just make a function for that which everybody could use? Isn't it just that simple? Are there any hidden caveats?Demonize
@rightaway717: They could. But why? C++ is not a language where "everything is done for you". It provides the building blocks then it's up to you to do whatever you want with them. It would be a complete waste of time to create a function to serve just this one specific use case. Besides, a simple "getter" could easily mislead people into not realising that a file seek is involved, which may have ramifications for them in all sorts of cases. It's better to have people do this explicitly.Delaney
@LightnessRacesinOrbit: while I do understand what you mean, I still do not agree. Many other functions are also that primitive to serve their specific use case, like getting the length of a string. Getting the size of things is a common enough operation to make a function for it. I believe you do it as well in your code - provide a function to know the size of your own classes, if size has a meaning for them. There are also plenty of functions that do a (potentially) long operation under the hood, like vector resizing for example. That doesn't mean that we have to make people do it explicitly.Demonize
@rightaway717: Getting the length of a string is a completely different operation, as I've pointed out already. Vector resizing is clearly a mutating operation so not a good example.Delaney
@LightnessRacesinOrbit: And hundreds of thousands of developers subtracting beginning from end is not a waste of time... If the pointer was actual byte position, with 0 guaranteed, the case would be different. Also, a common stat() call can return the value - the data is already there - without performing any seeks. Meanwhile, we have the file.tellg(ios::begin) value made accessible, which IS completely useless; a piece of under-the-hood data that by itself is useless, and its only purpose is to allow the developer to combine it with another under-the-hood value to extract something useful.Meatball
@LightnessRacesinOrbit, on most systems getting the file size by subtracting offsets returned by tell operations is a slower operation than just directly getting the size of the file, which is stored somewhere in the inode or similar structure in the filesystem itself. So, whilst the fallback algorithm could be the one that uses seeks and tells, an optimized implementation could work around that and provide a faster approach. How's that a bad thing?Travis
@FabioA.: Because the abstraction that is C++ knows about streams of data (some seekable), not "files" or "inodes". That is beyond its purview. You're quite right that you can make more performant programs by using your particular system's API, and that's great! Feel free to go ahead and do that. But I see no reason for C++ to hardcode any of that implementation detail for the general case.Delaney
@LightnessRacesinOrbit, I oughta remind you of std::streamsize. Clearly, c++ streams do have a concept of "size", and since they do also have a concept of "end of stream" (see seekg), in my opinion it does make sense to have a size() method that defaults to the implementation depicted above.Travis
@LightnessRacesinOrbit What new information should I gather from a comment that I already read and didn't change since?Travis
@FabioA.: You could understand it properly this time ;)Delaney
@LightnessRacesinOrbit You might try to explain yourself better, if you think you weren't able to properly convey what you meant.Travis
@FabioA.: I conveyed it okay :) I like how the recipient always forgets that the responsibility for communication is on both parties. Doesn't matter anyway so never mind, and have a nice evening. :)Delaney
Nevermind indeed, it seems the forgetful one is you, here.Travis
If the file you have open is a pipe or a socket or a signalfd, what should a "get_file_size" function return?Ibanez
@Demonize in C++ 17, they finally added it!. hereInfinitive
S
109

You can open the file using the ios::ate flag (together with the ios::binary flag), so that tellg() gives you the file size directly:

ifstream file( "example.txt", ios::binary | ios::ate);
return file.tellg();
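
For reference, the two lines above could be wrapped into a complete function along the following lines. This is only a sketch: the name file_size_ate and the convention of returning -1 on failure are assumptions made for illustration, and it assumes a regular file opened in binary mode.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (the name and the -1 error convention are assumptions).
// Returns the end-of-file position in bytes, or -1 if the file cannot be
// opened or its position cannot be queried.
long long file_size_ate(const std::string& path) {
    // ios::ate seeks to the end right after opening;
    // ios::binary disables newline translation.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file.is_open()) {
        return -1;
    }
    const std::streampos end = file.tellg();  // pos_type(-1) signals failure
    if (end == std::streampos(-1)) {
        return -1;
    }
    return static_cast<long long>(static_cast<std::streamoff>(end));
}
```

Checking is_open() first matters here, because calling tellg() on a stream that failed to open reports failure rather than a size.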
Sunup answered 15/11, 2012 at 9:0 Comment(5)
@Dominik Honnef: in VS 2013 Update5 64 bit this approach, with the ios::ate flag and without seekg(0, ios::end), might not work for large files. See #32058250 for more information.Fief
This doesn't seem like a good approach. tellg does not report the size of the file, nor the offset from the beginning in bytes.Landre
@Landre cplusplus.com kind of disagrees with that statement: it indeed uses tellg() to detect filesize.Travis
The answer below works in all situations and this one does not, since it returns 0 in a lot of cases when the pointer is at the beginning of the file.Arnulfoarny
This is a totally incorrect answer and gives 0 in many cases.Gauze
W
47

Don't use tellg to determine the exact size of the file. The length determined by tellg may be larger than the number of characters that can be read from the file.

From the Stack Overflow question "tellg() function give wrong size of file?": tellg does not report the size of the file, nor the offset from the beginning in bytes. It reports a token value which can later be used to seek to the same place, and nothing more. (It's not even guaranteed that you can convert the type to an integral type.) For Windows (and most non-Unix systems), in text mode, there is no direct and immediate mapping between what tellg returns and the number of bytes you must read to get to that position.

If it is important to know exactly how many bytes you can read, the only way of reliably doing so is by reading. You should be able to do this with something like:

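#include <fstream>
#include <limits>

std::ifstream file;
file.open(name, std::ios::in | std::ios::binary);   //  name: path of the file to measure
file.ignore( std::numeric_limits<std::streamsize>::max() );   //  Read and discard up to EOF.
std::streamsize length = file.gcount();   //  Number of characters ignore() extracted.
file.clear();   //  Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );   //  Rewind so the data can be read again.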
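
As a self-contained variant of the snippet above, the same ignore()/gcount() idea might be packaged as a function like this. The name readable_size and the -1 return value for a file that cannot be opened are assumptions for the sketch, not part of any library.

```cpp
#include <fstream>
#include <limits>
#include <string>

// Hypothetical helper: counts the bytes that can actually be read from `path`
// by discarding them all. Returns -1 if the file cannot be opened (an assumed
// convention for this sketch).
std::streamsize readable_size(const std::string& path) {
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return -1;
    }
    // With max() as the count, ignore() has no limit and stops only at EOF.
    file.ignore(std::numeric_limits<std::streamsize>::max());
    // gcount() reports how many characters the last unformatted read extracted.
    return file.gcount();
}
```

Note that this reads the whole file, so its cost grows with the file size, unlike a seek or a metadata query.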
Wildebeest answered 14/6, 2016 at 9:29 Comment(1)
My... I guess I'll continue to use stat().Burseraceous
L
11

Like this:

long begin, end;
ifstream myfile ("example.txt");
begin = myfile.tellg();
myfile.seekg (0, ios::end);
end = myfile.tellg();
myfile.close();
cout << "size: " << (end-begin) << " bytes." << endl;
Lavena answered 9/3, 2010 at 14:6 Comment(2)
You may want to use the more appropriate std::streampos instead of long as the latter may not support as large a range as the former - and streampos is more than just an integer.Anarch
Isn't your begin just 0?Hodgson
A
11

Since C++17, we have std::filesystem::file_size. Strictly speaking, this doesn't use istream or fstream, but it is by far the most concise and correct way to read a file's size in standard C++.

#include <filesystem>
...
auto size = std::filesystem::file_size("example.txt");
Adulterine answered 23/3, 2022 at 18:14 Comment(3)
After running benchmarks, I realize it's also running much faster than the istream variant, on Windows at least.Falsify
Does it return the actual file size or the file size determined by the OS? The OS will report a size bigger than the actual size, since most OSs use blocks for spaceBolshevism
@Bolshevism filesize returns the size determined as if by reading the st_size member of the structure obtained by POSIX. Copied from en.cppreference.com/w/cpp/filesystem/file_sizeBoner
D
-8

I'm a novice, but this is my self-taught way of doing it:

ifstream input_file("example.txt", ios::in | ios::binary);

streambuf* buf_ptr =  input_file.rdbuf(); //pointer to the stream buffer

input_file.get(); //extract one char from the stream, to activate the buffer
input_file.unget(); //put the character back to undo the get()

size_t file_size = buf_ptr->in_avail();
//a value of 0 will be returned if the stream was not activated by the get() call above.
Dichroscope answered 6/9, 2013 at 3:8 Comment(1)
all this does is determine if there is a first character. How does that help?Reggi
