MySQL - Base64 vs BLOB
Asked Answered
S

3

30

For the sake of simplicity, suppose I'm developing a mobile app like Instagram. Users can download images from the server, and upload images of their own. Currently the server stores all images (in reality, just small thumbnails) in a MySQL database as BLOBs. It seems that the most common way to transfer images is by using Base64 encoding, which leaves me with two options:

  1. Server stores all images as BLOBs. To upload an image, client encodes it into Base64 string, then sends it to the server. Server decodes image BACK into binary format and stores it as BLOB in the database. When client requests an image, server re-encodes the image as Base64 string and sends it to the client, who then decodes it back to binary for display.
  2. Server stores all images as Base64 strings. To upload an image, client encodes it into Base64 string and sends it to the server. Server does no encoding or decoding, but simply stores the string in the database. When client requests an image, the Base64 string is returned to the client, who then decodes it for display.

Clearly, option #1 requires significantly more processing on the server, as images must be encoded/decoded with every single request. This makes me lean toward option #2, but some research has suggested that storing Base64 string in MySQL is much less efficient than storing the image directly as BLOB, and is generally discouraged.

I'm certainly not the first person to encounter this situation, so does anybody have suggestions on the best way to make this work?

Sw answered 26/3, 2015 at 16:47 Comment(3)
Option #3 is no images in the database in the first place. They have a system for this: The filesystem.Limann
I started out storing file paths, but some research on the most recent versions of MySQL has suggested that it's actually more efficient to store small files (under 1 or 2 mb) as BLOBs. My images are only a few kb, so it's much easier to maintain this way.Sw
Backing up a database full of BLOBs is an absolute nightmare. It's expensive to replicate these, expensive to back them up, extremely painful to restore them, and excruciating to split them up when they get too big. Unless you're dealing with inconsequential amounts of data, this will blow up on you eventually. Files on disk can be replicated easily using something as rudimentary as rsync. Same is not true for MySQL.Limann
D
14

JSON assumes utf8, hence is incompatible with images unless they are encoded in some way.

Base64 is almost exactly 8/6 times as bulky as binary (BLOB). One could argue that it is easily affordable. 3000 bytes becomes about 4000 bytes.

Everyone should be able to accept arbitrary 8-bit codes, but not everybody does. Base-64 may be the simplest and overall best compromise for not having to deal with 8-bit data.

Since these are "small", I would store them in a table, not a file. I would, however, store them in a separate table and JOIN by an appropriate id when you need them. This allows queries that don't need the image to run faster because they are not stepping over the BLOBs.

Technically, TEXT CHARACTER SET ascii COLLATE ascii_bin would do, but BLOB makes it clearer that there is not really any usable text in the column.

Diphase answered 26/3, 2015 at 22:6 Comment(3)
To clarify, are you suggesting that I store the images as Base64 in a BLOB, or the binary data itself and encode on every fetch? As for your suggestion with JOINing - that's precisely what I'm doing, along with a Sphinx index.Sw
Base64 throughout its life. Then you don't need to do escaping anywhere. So, yes, Base64 in a BLOB without encode except on initial INSERT. This is my opinion.Diphase
Interesting, maybe I'll give that a shot. It would definitely save a lot of CPU since fetches occur much more often than inserts.Sw
M
0

I don't see why the DB Server shouldn't always keep binary data in it's native form. Thus, use a BLOB. (But even if you did store the data in Base64 string, there is no need concern yourself about encoding/decoding performance because the IO's impact will be more significant.)

I don't get why the client should send the data in base64 though. Why not just "stream" it using a simple HTTP call?

Mesomorph answered 4/9, 2018 at 9:23 Comment(1)
Switching to a "binary stream" requires some shifting of gears (in the protocol, etc). HTTP accepts only a limited subset of 256 different byte combinations and has kludges to accommodate other combos. Html-entities is one such, but it is quite bulky. Backslashes is another, but doesn't cover everything. Url-encoding 8-bit codes into 3 bytes. Etc.Diphase
C
-1

Why would you base64-encode the images on the wire? I think you're starting from a wrong assumption.

Commutable answered 26/3, 2015 at 21:12 Comment(3)
Could you clarify? Right now I'm sending all images as Base64-encoded JSON objects, with some metadata as well (so the client knows what to do with the image upon receipt). Is there a way to send the images as binary objects (no encoding) WITH metadata? What about the client POSTing a new image? My server uses Node.js with Express for HTTP, as well as ws for some WebSocket features.Sw
if we dont send it over the wire, which i agree is bad. what suggestions do you have if client end wants to show an image that is stored in a database ? i guess a weblink is the only alternative then ? but our database already has thousands of base64 encoded images in a column. any help appreicated.Donyadoodad
This should be a comment, it's certainly not an answerHydrops

© 2022 - 2024 — McMap. All rights reserved.