Storing & accessing up to 10 million files in Linux

I'm writing an app that needs to store a lot of files, up to approximately 10 million.

They are presently named with a UUID and will be around 4 MB each, always the same size. Reading and writing these files will always be sequential.

Two main questions I'm seeking answers to:

1) Which filesystem would be best for this: XFS or ext4?
2) Would it be necessary to store the files beneath subdirectories in order to reduce the number of files within a single directory?

For question 2, I note that people have tried to find the XFS limit on the number of files you can store in a single directory and haven't hit it, even with millions of files, and they reported no performance problems. What about under ext4?

Googling around, I found people doing similar things; some suggested storing the inode number as a link to the file instead of the filename for performance (in a database index, which I'm also using). However, I don't see a usable API for opening a file by inode number. That seemed to be more of a suggestion for improving performance under ext3, which I am not intending to use anyway.

What are the ext4 and XFS limits? What performance benefits are there from one over the other and could you see a reason to use ext4 over XFS in my case?

Libbielibbna answered 16/2, 2011 at 16:46 Comment(2)
See e.g. lwn.net/Articles/400629Cecilececiley
did things change in 2018?Osullivan

You should definitely store the files in subdirectories.

EXT4 and XFS both use efficient lookup methods for file names, but if you ever need to run tools such as ls or find over the directories, you will be very glad to have the files in manageable chunks of 1,000-10,000 per directory.

The inode number thing is to improve the sequential access performance of the EXT filesystems. The metadata is stored in inodes and if you access these inodes out of order then the metadata accesses are randomized. By reading your files in inode order you make the metadata access sequential too.
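As a minimal sketch of the inode-ordering trick described above (Python; the function name `read_in_inode_order` is hypothetical, and it assumes a flat directory of files):

```python
import os

def read_in_inode_order(directory: str):
    """Yield (name, contents) for files in a directory, reading in inode order
    so that metadata accesses on ext filesystems stay roughly sequential."""
    entries = [e for e in os.scandir(directory) if e.is_file()]
    # DirEntry.inode() returns the d_ino value reported by readdir,
    # so sorting by it does not require an extra stat() per file.
    entries.sort(key=lambda e: e.inode())
    for entry in entries:
        with open(entry.path, "rb") as f:
            yield entry.name, f.read()
```

Whether the reordering pays off depends on how the filesystem allocated the inodes; it mainly helped on ext3/ext4 with cold caches.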

Clavicytherium answered 16/2, 2011 at 18:35 Comment(6)
With the inode number thing, how would I open a file by inode? I could then avoid an expensive stat operation, right?Libbielibbna
@Matt There is no way to open a file by inode (it would bypass part of the Unix access-control scheme). But readdir tells you the inode numbers, so you sort your list of file names by inode number and open them in that order. BTW, "stat is expensive" is an oversimplification; the more accurate statement is "stat(f); open(f)" is somewhat more expensive than "h=open(f); fstat(h)". (The expensive operation that you avoid doing twice in the latter case is pathname processing, not disk access. The differential used to be 2x but should be much less on modern systems.)Paternal
@Zack - Thanks for the very useful insight comparing stat/open vs open/fstatLibbielibbna
so XFS or EXT4 for 100 million?Osullivan
@Osullivan 100 million is impossible for EXT4, as I have found directly. Both tmpfs (via /dev/shm) and EXT4 begin to run so slowly at around 10 million +/- 500,000 files that applications start to time out because they think the disk is broken, but it isn't. It behaves exactly the same on spinning disk, SSD, and RAM-backed tmpfs at /dev/shm. That leads me to expect XFS on Linux will behave the same as these other three, i.e., failure around 10.5 million. Very high inode counts showed zero improvement.Didier
@GeoffreyAnderson we are at 122 M currently with XFSOsullivan
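The stat/open vs open/fstat point from the comments above can be sketched as follows (Python; `file_size_via_fstat` is a hypothetical helper name):

```python
import os

def file_size_via_fstat(path: str) -> int:
    """Resolve the pathname once: open() walks the path, then fstat()
    queries the already-resolved descriptor with no second path walk."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

The alternative, calling os.stat(path) and then open(path), processes the pathname twice; as noted above, the difference is in pathname processing, not disk access.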

Modern filesystems will let you store 10 million files all in the same directory if you like. But tools (ls and its friends) will not work well.

I'd recommend a single level of subdirectories with a fixed fan-out, perhaps 1,000 directories, and putting the files in there (10,000 files per directory is tolerable to the shell and to ls).
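A minimal sketch of that layout (Python; the root path, the helper name `shard_path`, and the MD5-based bucketing are all illustrative assumptions, since the UUID names hash well with almost any scheme):

```python
import hashlib
import os

NUM_SHARDS = 1000  # fixed fan-out: ~10,000 files per directory at 10M files

def shard_path(root: str, uuid_name: str) -> str:
    """Map a UUID file name to root/NNN/uuid_name, where NNN is a
    stable hash bucket in [0, NUM_SHARDS)."""
    digest = hashlib.md5(uuid_name.encode()).hexdigest()
    bucket = int(digest, 16) % NUM_SHARDS
    return os.path.join(root, f"{bucket:03d}", uuid_name)

# At write time, create the bucket directory lazily:
#   os.makedirs(os.path.dirname(path), exist_ok=True)
path = shard_path("/data/blobs", "0f8fad5b-d9cb-469f-a165-70867728950e")
```

Because the bucket is derived from the name alone, lookups need no extra index: any process can recompute the path from the UUID.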

I've seen systems that create many levels of directories; this is truly unnecessary, increases inode consumption, and makes traversal slower.

10M files should not really be a problem either, unless you need to do bulk operations on them.

I expect you will need to prune old files, but something like "tmpwatch" will probably work just fine with 10M files.

Husky answered 16/2, 2011 at 23:17 Comment(5)
Thanks, is mkdir a slow operation? Should I pre-make the directories at startup and from then on assume they exist?Libbielibbna
Once you get into the millions of files in the same directory, ext4 starts to struggle and gets index hash collisions.Tear
> "Modern filesystems will let you store 10 million files all in the same directory if you like. But tools (ls and its friends) will not work well." Actually it's worse than that. The system itself, not just ls and command-line capacity, begins to break down from extreme latency at 10.5 million, and this is true regardless of the kind of storage (tmpfs, SSD, spinning disk), and despite sufficiently high inode counts.Didier
@GeoffreyAnderson Interesting, what do you mean by extreme latency? I did some benchmark and actually found out that flat directory performs better: medium.com/@hartator/…Pigeonwing
@Pigeonwing - but isn't the optimum neither flat nor deep, but rather "just deep enough"? That is, if you get into the millions, add one level of subdirectories, not 0 or 2 levels.Reconnoitre
