How to decide whether to store binary data in Postgres or in files?
Asked Answered
E

1

9

I'm developing a Postgres-backed system, in which many binary files will be stored.

I have at least 2 choices:

  1. Store them in Postgres.
  2. Store them as files.

What criteria do I need to consider to make the best possible decision?

Edenedens answered 6/2, 2016 at 10:13 Comment(5)
You would store them in the database if storing them is required to be transactional. Or if access to them should be managed with the same privileges as the rest of the data. You would store them in the file system if external tools (image resizer, web server, ...) have to access those file of they are bigger than 1GB.Tridactyl
The files are songs. A song is uploaded once and can be listened to (but not downloaded) many times.Edenedens
Let's assume that once uploaded, the songs will stay in the database forever.Edenedens
With files, how do your SQL client access the contents through a SQL connection?Abbreviate
Also, see Files - in the database or not? on DBA.se with its 10+ answers.Abbreviate
E
9

I would consider the following:

  • Performance. Storing binary files in the file system generally performs better, both for reading and writing.
  • Security. Access to the files in the file system is controlled by the operating system, however if you store the files in the database, PostgreSQL access rules apply.
  • Backup consistence. If you store the data separately (file system and database) is very difficult to have a consistent backup. File system backup and database backup are likely to be out of sync.
  • Transactional properties. File system is not transactional, but obviously PostgreSQL is.
Ersatz answered 6/2, 2016 at 10:29 Comment(8)
Storing the files in the file system is not necessarily faster. See e.g. here: research.microsoft.com/apps/pubs/default.aspx?id=64525 (it's for SQL Server, but I wouldn't be surprised if something similar was true for other DBMS as well)Tridactyl
What about scalability? If you have lots of files to store, what is preferable?Debauch
@a_horse_with_no_name. I really doubt SQL Server performs better than file system regarding binary files.Gonfalon
It does (with certain file sizes). Read the document from the link I provided.Tridactyl
@Adi Levin. Feel free to edit the answer to extend itGonfalon
@JesúsLópez, I am hoping to get an answer from you. I'm not sure what's the correct answer on scalability.Debauch
@Adi Levin. I'm also not sure which one scales better. To have more capacity to store more files you just need to throw more disks on both file system and database.Gonfalon
That's what I thought. And scaling Postgres is more complicated than this, isn't it? I mean, at some point, vertical scaling won't work any more, and you have to do sharding...Debauch

© 2022 - 2024 — McMap. All rights reserved.