The C Standard was formalized in 1990, when most hard drives were smaller than 2 GB. The prototype for fseek() was already in broad use with an offset of type long, and 32 bits seemed large enough for all purposes, especially since the corresponding system call already used the same API. The committee did add fgetpos() and fsetpos() for exotic file systems where a simple long offset did not carry all the information needed for seeking, but kept the fpos_t type opaque.
After a few years, when 64-bit offsets became necessary, many operating systems added 64-bit versions of the system calls, and POSIX introduced fseeko() and ftello() to provide a high-level interface for larger offsets. These extensions are no longer necessary on 64-bit versions of common operating systems (Linux, OS X), where long is already 64 bits wide, but Microsoft decided to keep its long, or more precisely LONG, type at 32 bits, solidifying this issue and others, such as size_t being larger than unsigned long. This very unfortunate decision has plagued C developers on Win64 platforms ever since and forces them to use non-portable APIs for large files.
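To make the portability problem concrete, here is a minimal sketch of a 64-bit seek/tell wrapper. The names portable_fseek64 and portable_ftell64 are hypothetical, not part of any standard or library; the sketch assumes fseeko()/ftello() on POSIX systems and the Microsoft CRT extensions _fseeki64()/_ftelli64() on Windows.

```c
/* Ask for a 64-bit off_t on 32-bit POSIX systems; must precede all includes. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: seek with a 64-bit offset regardless of sizeof(long). */
static int portable_fseek64(FILE *fp, int64_t offset, int whence)
{
    assert(fp != NULL);
#ifdef _WIN32
    return _fseeki64(fp, offset, whence);      /* Microsoft CRT extension */
#else
    return fseeko(fp, (off_t)offset, whence);  /* POSIX */
#endif
}

/* Hypothetical helper: report the current position as a 64-bit value. */
static int64_t portable_ftell64(FILE *fp)
{
    assert(fp != NULL);
#ifdef _WIN32
    return _ftelli64(fp);                      /* Microsoft CRT extension */
#else
    return (int64_t)ftello(fp);                /* POSIX */
#endif
}
```

With these helpers, a call such as portable_fseek64(fp, INT64_C(3221225472), SEEK_SET) positions the stream past the 2 GB limit that a 32-bit long cannot express.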
Changing the prototypes of fseek and ftell would break compatibility with existing software and create more problems, so it will not happen.
Some other historical shortcomings are even more surprising, such as the prototype for fgets:
char *fgets(char * restrict s, int n, FILE * restrict stream);
Why they used int instead of size_t is a mystery: back in 1990, int and size_t had the same size on most platforms, and it did not make sense to pass a negative value anyway. Again, this inconsistent API is here to stay.
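One way to live with the int parameter is a thin wrapper that accepts a size_t and clamps it. This is only a sketch: fgets_sz is a hypothetical name, and treating a zero-sized buffer as a failure is an assumption made here, not something the standard prescribes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: takes the buffer size as size_t, as sizeof yields it,
 * and clamps it to INT_MAX before handing it to fgets(). */
static char *fgets_sz(char *s, size_t n, FILE *stream)
{
    assert(s != NULL && stream != NULL);
    if (n == 0)
        return NULL;  /* assumption: no room even for the terminating '\0' */
    int limit = (n > (size_t)INT_MAX) ? INT_MAX : (int)n;
    return fgets(s, limit, stream);
}
```

A caller can then write fgets_sz(buf, sizeof buf, fp) without an implicit size_t-to-int conversion at the call site.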
[...] _fseeki64. Regarding lseek, Linux has lseek64, which uses the guaranteed 64-bit type off64_t. – Hubble
[...] fseek and ftell in terms of off_t, or something. – Boole
[...] seek call gave way to lseek, as Unix learned how to deal with 32-bit (!) file sizes. Fast forward to today, and we've got this litany of stat64 and _fseeki64 and lseek64 calls. ("lseek64" is a particularly ghastly misnomer; it should clearly be "seek64" or "llseek".) – Boole
[...] long long int (64-bit) and not long long long int (128-bit)?" – Infidelity
[...] llseek might be confusing as Linux already has _llseek, which splits a 64-bit offset into two 32-bit args. It might be ghastly, but given that we already have lseek, we probably want to keep lseek as part of the replacement name(s). When I'm looking at a code base and asking the question "Where are all the places seeking is done?", I'd like to be able to do a grep on lseek and get a match on either lseek or lseek64. On 64-bit systems lseek works by default. For 32-bit, we can do: #define _LARGEFILE*_SOURCE and lseek works. – Abney
[...] fseeko and ftello in POSIX.1-2001. – Biting
[...] long long int was added in C99, but fseek was already defined to use long int offsets before C99. – Biting
long long int supports 9.22 EB (exabytes). Should be enough for the next 50 years I guess. Example: 1 hour of 512K (sic!) video takes ~400 TB. Not sure though about the 512K video. – Chalcography