Recent ways to obtain file size in Java

I know this question has been widely discussed in different posts:

My problem is that I need to obtain the sizes of a large number of files (regular files on a hard disk), so I need the solution with the best performance. My intuition is that the size should come from the file system metadata rather than from reading the whole file contents. It is difficult to tell from the documentation which approach each method actually uses.

As stated in this page:

Files has the size() method to determine the size of the file. This is the most recent API and it is recommended for new Java applications.

But this is apparently not the best advice in terms of performance. I have measured several methods:

  1. file.length();

  2. Files.size(path);

  3. BasicFileAttributes attr = Files.readAttributes(path, BasicFileAttributes.class); attr.size();
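
For reference, the three calls can be put side by side in one small program. This is only a sketch: the class name `FileSizeDemo`, the helper method names and the temporary file are made up for illustration, and it shows that the calls agree on the result, not how fast they are.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class FileSizeDemo {

    // 1. Legacy java.io.File API; returns 0 if the file does not exist.
    static long viaFileLength(Path path) {
        return path.toFile().length();
    }

    // 2. NIO.2 convenience method; throws IOException if the file cannot be read.
    static long viaFilesSize(Path path) throws IOException {
        return Files.size(path);
    }

    // 3. Reads the basic attributes in one call and takes the size from them.
    static long viaAttributes(Path path) throws IOException {
        BasicFileAttributes attr = Files.readAttributes(path, BasicFileAttributes.class);
        return attr.size();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file of 1234 bytes, created only for this demo.
        Path tmp = Files.createTempFile("size-demo", ".bin");
        try {
            Files.write(tmp, new byte[1234]);
            System.out.println("file.length():   " + viaFileLength(tmp)); // 1234
            System.out.println("Files.size():    " + viaFilesSize(tmp));  // 1234
            System.out.println("attr.size():     " + viaAttributes(tmp)); // 1234
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that the first variant reports a missing file as size 0, while the other two throw an exception, which matters if missing files have to be told apart from empty ones.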

To my surprise, file.length() is the fastest, even though it requires creating a File object instead of using the newer Path. I do not know whether it reads the file system metadata or the file contents. So my question is:

What is the fastest, recommended way to obtain file sizes in recent Java versions (9/10/11)?


EDIT

I do not think these details add anything to the question. Basically the benchmark reads like this:

Length: 49852  with previous instanciation: 84676
Files: 3451537 with previous instanciation: 5722015
Length: 48019 with previous instanciation:: 79910
Length: 47653 with previous instanciation:: 86875
Files: 83576 with previous instanciation: 125730
BasicFileAttr: 333571 with previous instanciation:: 366928
.....

Length is quite consistent. Files is noticeably slow on the first call, but it must cache something, since later calls are faster (though still slower than Length). This is what other people observed in some of the links I reference above. BasicFileAttr was my hope, but it is still slow.
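
One way to separate the slow first call from the steady-state cost is to run some warm-up calls before starting the clock. The following is a rough sketch, not the code that produced the numbers above: the class `SizeTiming`, the method `averageNanos` and the iteration counts are assumptions chosen for illustration, and a proper harness such as JMH would give more trustworthy results.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.ToLongFunction;

public class SizeTiming {

    // Average nanoseconds per call, measured after `warmup` untimed calls.
    static double averageNanos(ToLongFunction<Path> sizeFn, Path path,
                               int warmup, int iterations) {
        long sink = 0; // accumulate results so the calls are not optimised away
        for (int i = 0; i < warmup; i++) {
            sink += sizeFn.applyAsLong(path);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += sizeFn.applyAsLong(path);
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unexpected negative size sum");
        }
        return (double) elapsed / iterations;
    }

    // Files.size wrapped so that it fits a functional interface without checked exceptions.
    static long filesSize(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("size-timing", ".bin");
        try {
            Files.write(tmp, new byte[64]);
            // Note: the first lambda includes the cost of creating the File object.
            double length = averageNanos(p -> p.toFile().length(), tmp, 10_000, 100_000);
            double files = averageNanos(SizeTiming::filesSize, tmp, 10_000, 100_000);
            System.out.printf("file.length(): %.0f ns/call%n", length);
            System.out.printf("Files.size():  %.0f ns/call%n", files);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Repeatedly querying the same file mostly measures the OS metadata cache, so the figures will not necessarily carry over to a scan of many distinct files.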

I am asking what is recommended in modern Java versions, and I considered 9/10/11 as "modern". It is neither a dependency nor a limitation, but I would expect Java 11 to provide better means of getting file sizes than Java 5. If the fastest way was introduced in Java 8, that is OK.

It is not a premature optimisation. At the moment I am optimising a CRC check with an initial size check, because the size check should be much faster and does not need, in theory, to read the file contents. So I can directly use the "old" Length method; all I am asking is what advances modern Java offers in this respect, since the new methods are apparently not as fast as the old ones.
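
To make the use case concrete, the idea of a size check in front of a CRC check can be sketched as follows. This is an illustration under assumptions, not my actual code: the class `SizeThenCrc` and its method names are hypothetical, and `java.util.zip.CRC32` stands in for whatever checksum is really used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

public class SizeThenCrc {

    // Reads the whole file and returns its CRC-32 value (the expensive step).
    static long crc32(Path path) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }

    // Cheap size comparison first; the contents are read only when the sizes match.
    static boolean sameSizeAndCrc(Path a, Path b) throws IOException {
        if (a.toFile().length() != b.toFile().length()) {
            return false; // different sizes: no need to read either file
        }
        return crc32(a) == crc32(b);
    }
}
```

Files of different sizes are rejected from metadata alone, so the full read happens only for the candidates that survive the size check.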

Terret answered 20/2, 2019 at 13:33 Comment(6)
This is a pretty broad question. It does not depend only on the Java version, because all the methods you mentioned make native calls. Why do you want all three Java versions? - Keewatin
1) I am surprised that those ways to get a file size are significantly different. Please share your benchmark! 2) The fastest way to get a file's size is liable to be OS dependent. 3) This smells of "premature optimization". - Lenwood
"It is difficult to know which specific method is used by reading the documentation." - You can read the source code instead. - Lenwood
2. and 3. appear to be the same thing if you look at the source code. They read all the basic file attributes. 1. appears to go directly to the platform-specific XxxxFileSystem class and from there to native code. Note: benchmarking is hard, so be careful about drawing hasty conclusions. - Vasomotor
Neither file.length() nor Files.size() reads the whole file. Both methods are fast enough (i.e. they take a few microseconds). Whether one is faster than the other depends on the particular operating system, the file system and even the hardware. E.g. in my experiments on Linux with XFS, Files.size() was slightly better than file.length(). - Metabolic
Your benchmark numbers do not read like they came from a benchmark. You seem to have implemented one algorithm and run it several times. However, that does not help to answer your own question. - Elfland
