PostgreSQL just introduced JSONB in version 9.4, and it's already trending on hacker news. How is it different from Hstore and JSON previously present in PostgreSQL?
What are its advantages and limitations and when should someone consider using it?
PostgreSQL just introduced JSONB in version 9.4, and it's already trending on hacker news. How is it different from Hstore and JSON previously present in PostgreSQL?
What are its advantages and limitations and when should someone consider using it?
First, hstore
is a contrib module, which only allows you to store key => value pairs, where keys and values can only be text
s (however values can be sql NULL
s too).
Both json
& jsonb
allows you to store a valid JSON value (defined in its spec).
F.ex. these are valid JSON representations: null
, true
, [1,false,"string",{"foo":"bar"}]
, {"foo":"bar","baz":[null]}
- hstore
is just a little subset compared to what JSON is capable (but if you only need this subset, it's fine).
The only difference between json
& jsonb
is their storage:
json
is stored in its plain text format, whilejsonb
is stored in some binary representationThere are 3 major consequences of this:
jsonb
usually takes more disk space to store than json
(sometimes not)jsonb
takes more time to build from its input representation than json
json
operations take significantly more time than jsonb
(& parsing also needs to be done each time you do some operation at a json
typed value)When jsonb
will be available with a stable release, there will be two major use cases, when you can easily select between them:
json
.jsonb
.jsonb
doesn't support this? UPDATE test SET data->'a' = 123 WHERE id = 1;
from CREATE TABLE test(id SERIAL PRIMARY KEY, data JSONB);
–
Celesta json
over jsonb
is if for legacy reasons your code consuming your json
is dependent on the ordering of the json
fields and they can't be reordered. –
Holozoic text
vs. json
: the latter comes with JSON validation, so upon invalid JSON, it will only fail upon inserting, instead of every time your application reads it (because it gets an invalid representation). Also, you can safely cast the latter to jsonb
within the database. –
Galiot jsonb
is usually 5-10% smaller than json
. Different encoding from PostgreSQL. –
Dirty Peeyush:
The short answer is:
For a longer answer, you'll need to wait for me to do a full "HowTo" writeup closer to the 9.4 release.
A simple explanation of the difference between json and jsonb (original image by PostgresProfessional):
SELECT '{"c":0, "a":2,"a":1}'::json, '{"c":0, "a":2,"a":1}'::jsonb;
json | jsonb
------------------------+---------------------
{"c":0, "a":2,"a":1} | {"a": 1, "c": 0}
(1 row)
More in speech video and slide show presentation by jsonb developers. Also they introduced JsQuery, a pg.extension that provides the powerful jsonb query language.
hstore
is more of a "wide column" storage type, it is a flat (non-nested) dictionary of key-value pairs, always stored in a reasonably efficient binary format (a hash table, hence the name).json
stores JSON documents as text, performing validation when the documents are stored, and parsing them on output if needed (i.e. accessing individual fields); it should support the entire JSON spec. Since the entire JSON text is stored, its formatting is preserved.jsonb
takes shortcuts for performance reasons: JSON data is parsed on input and stored in binary format, key orderings in dictionaries are not maintained, and neither are duplicate keys. Accessing individual elements in the JSONB field is fast as it doesn't require parsing the JSON text all the time. On output, JSON data is reconstructed and initial formatting is lost.IMO, there is no significant reason for not using jsonb
once it is available, if you are working with machine-readable data.
JSONB is a "better" version of JSON.
Let's look at an example:
SELECT '{"c":0, "a":2,"a":1}'::json, '{"c":0, "a":2,"a":1}'::jsonb;
json | jsonb
------------------------+---------------------
{"c":0, "a":2,"a":1} | {"a": 1, "c": 0}
(1 row)
In general, one should prefer JSONB, unless there are specialized needs, such as legacy assumptions about ordering of object keys.
Regarding the differences between json
and jsonb
datatypes, it worth mentioning the official explanation:
PostgreSQL offers two types for storing JSON data:
json
andjsonb
. To implement efficient query mechanisms for these data types, PostgreSQL also provides the jsonpath data type described in Section 8.14.6.The
json
andjsonb
data types accept almost identical sets of values as input. The major practical difference is one of efficiency. Thejson
data type stores an exact copy of the input text, which processing functions must reparse on each execution; whilejsonb
data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed.jsonb
also supports indexing, which can be a significant advantage.Because the
json
type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast,jsonb
does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.In general, most applications should prefer to store JSON data as
jsonb
, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys.PostgreSQL allows only one character set encoding per database. It is therefore not possible for the JSON types to conform rigidly to the JSON specification unless the database encoding is UTF8. Attempts to directly include characters that cannot be represented in the database encoding will fail; conversely, characters that can be represented in the database encoding but not in UTF8 will be allowed.
Source: https://www.postgresql.org/docs/current/datatype-json.html
I was at the PostgresOpen today, and benchmarks are way faster than MongoDB. I believe it was around 500% faster for selects. Pretty much everything was faster, at least by at 200% when contrasted with MongoDB. Then one exception right now is an update which requires completely rewriting the entire JSON column - something MongoDB handles better.
The gin indexing on JSONB sounds amazing.
Also PostgreSQL will persist types of JSONB internally and basically match this with types such as numeric, text, boolean, etc.
Joins will also be possible using JSONB.
Add PLv8 for stored procedures and this will basically be a dream come true for Node.js developers.
Being it's stored as binary, JSONB will also strip all white space, change the ordering of properties and remove duplicate properties using the last occurrence of the property.
Besides the index when querying against a JSONB column contrasted to a JSON column PostgreSQL doesn't have to actually run the functionality to convert the text to JSON on every row which will likely save a vast amount of time alone.
Another important difference, that wasn't mentioned in any answer above, is that there is no equality operator for json
type, but there is one for jsonb
.
This means that you can't use DISTINCT
keyword when selecting this json
-type and/or other fields from a table (you can use DISTINCT ON
instead, but it's not always possible because of cases like this).
As far as I can tell,
hstore as it currently exists (in PostgreSQL 9.3) does not allow for nesting other objects and arrays as the values of its key/value pairs. However, a future hstore patch will allow for nesting. This patch will not be in the 9.4 release and may not be included any time soon.
json as it currently exists does allow for nesting, but it is text-based and does not allow for indexing, thus it is "slow"
jsonb that will be released with 9.4 will have the current nesting capabilities of json, as well as the GIN/GIST indexing of hstore, so it will be fast
People working on PostgreSQL 9.4 seem to be saying that the new, fast jsonb type will appeal to people who would have chosen to use a NoSQL data store like MongoDB, but it can now combine a relational database with query-able unstructured data under one roof
Why HStore2/jsonb is the most important patch of 9.4
Benchmarks of PostgreSQL 9.4 jsonb seem to be on par with or in some cases faster than MongoDB.
http://texture.io/alphabetum/postgresql-incl-hstore-vs-mongodb
© 2022 - 2024 — McMap. All rights reserved.