Proper way to get file size in C

I am working on a socket programming assignment in which I have to send a file between a SPARC machine and a Linux machine. Before sending the file as a char stream, I have to get the file size and tell the client. Here are some of the ways I tried to get the size, but I am not sure which one is the proper one.

For testing purposes, I created a file with the content " test" (a space followed by the string test).

Method 1 - Using fseeko() and ftello()

This is a method I found at https://www.securecoding.cert.org/confluence/display/c/FIO19-C.+Do+not+use+fseek()+and+ftell()+to+compute+the+size+of+a+regular+file. While fseek() has the problem that "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream", fseeko() is said to have tackled this problem, but it only works on POSIX systems (which is fine because the environments I am using are SPARC and Linux).

fd = open(file_path, O_RDONLY);
fp = fopen(file_path, "rb");
/* Ensure that the file is a regular file */
if ((fstat(fd, &st) != 0) || (!S_ISREG(st.st_mode))) {
  /* Handle error */
}
if (fseeko(fp, 0 , SEEK_END) != 0) {
  /* Handle error */
}
file_size = ftello(fp);
fseeko(fp, 0, SEEK_SET);
printf("file size %lld\n", (long long)file_size); /* ftello() returns off_t, so cast rather than use %zu */

This method works fine and gets the size correctly. However, it is limited to regular files only. I tried to google the term "regular file", but I still do not understand it thoroughly, and I do not know whether this function is reliable for my project.

Method 2 - Using strlen()

Since the maximum size of a file in my project is 4MB, I can just calloc a 4MB buffer. The file is then read into the buffer, and I tried to use strlen() to get the file size (or, more correctly, the length of the content). Since strlen() is portable, can I use this method instead? The code snippet looks like this:

fp = fopen(file_path, "rb");
fread(file_buffer, 1024*1024*4, 1, fp);
printf("strlen %zu\n", strlen(file_buffer));

This method works too and returns

strlen 8

However, I couldn't find any similar approach on the Internet. So I think I may have missed something, or this approach may have limitations that I haven't realized.

Purism answered 14/2, 2016 at 10:53 Comment(9)
Why would you use strlen() when fread() already tells you how much it read, and strlen() will stop at the first nul byte? - Childhood
Possible duplicate of Using fseek and ftell to determine the size of a file has a vulnerability? - Haemagglutinate
Also related: https://mcmap.net/q/682671/-understanding-undefined-behavior-for-a-binary-stream-using-fseek-file-0-seek_end-with-a-file - Haemagglutinate
If the file is not regular, getting the size depends on the file type. For instance, on a FIFO you have to read it until read returns 0 and sum the return values of read. On a directory you can use readdir, but the result has no meaning as a size. Etc. - Quatre
Sorry, I am new to C and I totally forgot that fread returns the size read. In my case, since I know the file must be smaller than 4MB, does it mean I can simply use the result of fread (which is reliable and portable)? - Purism
@Haemagglutinate I read similar questions on Stack Overflow too, but according to the site I linked in the question, using fseeko() and ftello() should avoid the vulnerability. Please correct me if I misunderstand it. Thank you. - Purism
The proper way to get the file size is platform dependent. Use the preprocessor to determine which is correct at build time and implement the correct method for each of your target platforms. Since you include the linux tag, you may be perfectly happy to just use stat. Using any form of strlen or fseek is wrong. - Warbeck
You're already calling fstat, so just check st.st_size. - Warbeck
Possible duplicate of How do you determine the size of a file in C? - Power
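The comment about FIFOs can be illustrated with a minimal POSIX sketch. The helper name count_bytes and the 4096-byte chunk size are assumptions made for illustration, not something from the thread. It only shows the idea of summing the return values of read() until it returns 0, and it consumes the data while counting.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: counts the bytes available on fd by calling read()
   until it returns 0 (end of file). The data is consumed in the process.
   Returns the total, or -1 on a read error. */
static long long count_bytes(int fd)
{
    char chunk[4096];               /* arbitrary chunk size */
    long long total = 0;

    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n == 0)
            return total;           /* end of file reached */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        total += n;
    }
}
```

For a regular file this is unnecessary work, since st_size already holds the answer; the loop is only relevant for pipes, sockets and similar descriptors where no size is stored.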

A regular file is one that is nothing special, such as a device, socket, or pipe; it is a "normal" file. From your task description, it seems you must retrieve the size of a normal file before sending it. So your way is right:

FILE* fp = fopen(...);
if(fp) {
  fseek(fp, 0 , SEEK_END);
  long fileSize = ftell(fp);
  fseek(fp, 0 , SEEK_SET);// needed for next read from beginning of file
  ...
  fclose(fp);
}

but you can do it without opening the file:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

struct stat buffer;
int         status;

status = stat("path to file", &buffer);
if(status == 0) {
  // size of file is in member buffer.st_size;
}
Denney answered 14/2, 2016 at 11:36 Comment(1)
1) ftell(fp) returns type long, not size_t. 2) If fopen was in text mode, "... the difference between two such return values is not necessarily a meaningful measure of the number of characters written or read." C11 §7.21.9.4 2. 3) File size may exceed LONG_MAX. - Neldanelia
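One way to sidestep the ftell() concerns raised in the comment, when a stream is already open, is to ask fstat() about the underlying descriptor. The following is a minimal POSIX sketch; the helper name regular_file_size is hypothetical, and it assumes fileno() and fstat() are available, as they are on Linux and Solaris.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size in bytes of the regular file behind
   fp, or -1 if fstat() fails or the stream is not a regular file. */
static off_t regular_file_size(FILE *fp)
{
    struct stat st;

    if (fstat(fileno(fp), &st) != 0)
        return -1;                  /* fstat failed, errno is set */
    if (!S_ISREG(st.st_mode))
        return -1;                  /* pipe, socket, device, directory... */
    return st.st_size;              /* off_t, not long or size_t */
}
```

Since off_t has no dedicated printf conversion, a caller would typically print the result with a cast, for example printf("%lld\n", (long long)size). On 32-bit builds, files over 2 GB may additionally need large-file support enabled at compile time.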

OP can do it the easy way as "max. size of a file in my project is 4MB".

Rather than using strlen(), use the return value from fread(). strlen() stops at the first null character, so it may report too small a value. @Sami Kuhmonen Also, we do not know whether the data read contains any null character, so it may not be a string. Append a null character (and allocate +1) if the code needs to use the data as a string. But in that case, I'd expect the file to be opened in text mode.

Note that many OSes do not actually commit allocated memory until it is written to.
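The difference between the two counts can be shown with a small sketch. The helper name read_back is hypothetical; it writes some bytes to a temporary file, reads them back, and returns what fread() reports. The buffer is terminated only so that calling strlen() on it is safe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes len bytes to a temporary file, reads them
   back into out (capacity cap, which must be at least 1), and returns the
   byte count reported by fread(). */
static size_t read_back(const char *data, size_t len, char *out, size_t cap)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;

    fwrite(data, 1, len, fp);
    rewind(fp);

    /* Element size 1, so the return value is a byte count. */
    size_t n = fread(out, 1, cap - 1, fp);
    out[n] = '\0';                  /* terminate so strlen() is well defined */

    fclose(fp);
    return n;
}
```

With the five bytes 'a', 'b', '\0', 'c', 'd' as input, the fread() count covers all five bytes, while strlen() on the same buffer stops at the embedded null after "ab".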
Why is malloc not "using up" the memory on my computer?

fp = fopen(file_path, "rb");
if (fp) {

  #define MAX_FILE_SIZE 4194304
  char *buf = malloc(MAX_FILE_SIZE);
  if (buf) {
    size_t numread = fread(buf, sizeof *buf, MAX_FILE_SIZE, fp);

    // shrink if desired
    char *tmp = realloc(buf, numread ? numread : 1); // avoid a zero-size realloc when the file is empty
    if (tmp) {
      buf = tmp;

      // Use buf with numread char

    }
    free(buf);
  }
  fclose(fp);
}

Note: Reading the entire file into memory may not be the best idea to begin with.
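As an alternative to holding the whole file in memory, the data can be moved in fixed-size chunks. This is only a sketch: the helper name copy_stream and the 4096-byte chunk size are assumptions, and it writes to another FILE stream rather than to a socket to stay self-contained.

```c
#include <stdio.h>

/* Hypothetical helper: copies everything from in to out in fixed-size
   chunks. Returns the total number of bytes copied, or -1 on error. */
static long long copy_stream(FILE *in, FILE *out)
{
    char chunk[4096];               /* arbitrary chunk size */
    long long total = 0;
    size_t n;

    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (fwrite(chunk, 1, n, out) != n)
            return -1;              /* short write */
        total += (long long)n;
    }
    return ferror(in) ? -1 : total; /* distinguish read error from EOF */
}
```

The returned total is again the sum of fread() return values, so memory use stays at one chunk regardless of the file size.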

Neldanelia answered 14/2, 2016 at 17:3 Comment(0)
