How to perform update operations on columns of type JSONB
Looking through the documentation for the Postgres 9.4 datatype JSONB, it is not immediately obvious to me how to do updates on JSONB columns.

Documentation for JSONB types and functions:

http://www.postgresql.org/docs/9.4/static/functions-json.html
http://www.postgresql.org/docs/9.4/static/datatype-json.html

As an example, I have this basic table structure:

CREATE TABLE test(id serial, data jsonb);

Inserting is easy, as in:

INSERT INTO test(data) values ('{"name": "my-name", "tags": ["tag1", "tag2"]}');

Now, how would I update the 'data' column? This is invalid syntax:

UPDATE test SET data->'name' = 'my-other-name' WHERE id = 1;

Is this documented somewhere obvious that I missed?

Krimmer answered 2/11, 2014 at 19:37 Comment(0)
49

Ideally, you don't use JSON documents for structured, regular data that you want to manipulate inside a relational database. Use a normalized relational design instead.

JSON is primarily intended to store whole documents that do not need to be manipulated inside the RDBMS. Related:

Updating a row in Postgres always writes a new version of the whole row. That's the basic principle of Postgres' MVCC model. From a performance perspective, it hardly matters whether you change a single piece of data inside a JSON object or all of it: a new version of the row has to be written.

Thus the advice in the manual:

JSON data is subject to the same concurrency-control considerations as any other data type when stored in a table. Although storing large documents is practicable, keep in mind that any update acquires a row-level lock on the whole row. Consider limiting JSON documents to a manageable size in order to decrease lock contention among updating transactions. Ideally, JSON documents should each represent an atomic datum that business rules dictate cannot reasonably be further subdivided into smaller datums that could be modified independently.

The gist of it: to modify anything inside a JSON object, you have to assign a modified object to the column. Postgres supplies limited means to build and manipulate JSON data in addition to its storage capabilities. The arsenal of tools has grown substantially with every new release since version 9.2. But the principle remains: you always have to assign a complete modified object to the column, and Postgres always writes a new row version for any update.
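The pattern in practice is therefore a full read-modify-write of the column value, e.g. against the table from the question:

UPDATE test
SET    data = '{"name": "my-other-name", "tags": ["tag1", "tag2"]}'
WHERE  id = 1;

Even though only "name" changed, a complete object is assigned and a new version of the whole row is written.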

Some techniques for working with the tools of Postgres 9.3 or later:

This answer has attracted about as many downvotes as all my other answers on SO together. People don't seem to like the idea that a normalized design is superior for regular data. This excellent blog post by Craig Ringer explains in more detail:

Another blog post by Laurenz Albe, another official Postgres contributor like Craig and myself:

Preclude answered 2/11, 2014 at 21:28 Comment(11)
This answer only concerns the type JSON and ignores JSONB.Asare
@fiatjaf: This answer is fully applicable to the data types json and jsonb alike. Both store JSON data; jsonb does it in a normalized binary form that has some advantages (and a few disadvantages). https://mcmap.net/q/109154/-how-do-i-query-using-fields-inside-the-new-postgresql-json-datatype Neither data type is good for being manipulated a lot inside the database. No document type is. Well, it's fine for small, hardly structured JSON documents. But big, nested documents would be a folly that way.Preclude
"Instructions how to work with the tools of Postgres 9.3" really ought to be first in your answer, as it answers the question asked. Sometimes it makes sense to update JSON for maintenance / schema changes etc., and the reasons not to update JSON don't really apply.Thymelaeaceous
I don't think it is correct that PostgreSQL always rewrites the whole row when an update is performed. This is only the case for rows that fit within a single physical page (normally 8kb). Rows that are larger are split internally and as far as I understand only the page(s) affected by the update will be rewritten. postgresql.org/docs/9.6/static/storage.htmlMerow
Not really sure if it's related to the downvotes (I didn't downvote), but the issue might be more because it's a "you shouldn't be doing that" answer rather than a "people don't seem to like the idea: a normalized..." problem. I think you're right, storing a lot in JSON/B probably indicates a larger problem and is far from the ideal, but a lot of us have to work with pre-existing systems and need to learn how to work with what we have until we're able to improve those systems.Lepine
Answer the question first before adding your own comment/opinion/discussion.Foch
This answer is largely irrelevant with current generation of postgresql (versions 11, 12). There is much better support of Jsonb operations and there are many reasons to switch to that as being mainstream type for many applications.Priest
@taleodor: JSON support has been improved with every version and is pretty excellent by now. Has been for some time. And very useful for certain applications. But my answer is still fully applicable - especially for "update operations" this question asks about - as it addresses a principal limitation of document types. For regular data, proper columns in a more or less normalized db schema are typically much more efficient. That is not going to change. The Postgres project advises accordingly, like I quoted above - unaltered up to Postgres 13 devel manual.Preclude
In addition to @MichaelWasser comment - the concept of normalization is irrelevant to the OP's question. Normalization is entirely out of scope and brings its own pros, cons and justifications.Chetnik
People should know first why doing things that way. Love the philosophy.Relive
Be very careful when reading this answer. For each blog post listed in this answer, there are equal and opposite views on this.Alamode
539

If you're able to upgrade to PostgreSQL 9.5, the jsonb_set function is available, as others have mentioned.

In each of the following SQL statements, I've omitted the where clause for brevity; obviously, you'd want to add that back.

Update name:

UPDATE test SET data = jsonb_set(data, '{name}', '"my-other-name"');

Replace the tags (as opposed to adding or removing tags):

UPDATE test SET data = jsonb_set(data, '{tags}', '["tag3", "tag4"]');

Replace the second tag (arrays are 0-indexed):

UPDATE test SET data = jsonb_set(data, '{tags,1}', '"tag5"');

Append a tag (early 9.5 releases raised an error if the index argument was 1000 or above; this no longer appears to be the case in Postgres 9.5.3, where a much larger index can be used):

UPDATE test SET data = jsonb_set(data, '{tags,999999999}', '"tag6"', true);

Remove the last tag:

UPDATE test SET data = data #- '{tags,-1}';

Complex update (delete the last tag, insert a new tag, and change the name):

UPDATE test SET data = jsonb_set(
    jsonb_set(data #- '{tags,-1}', '{tags,999999999}', '"tag3"', true), 
    '{name}', '"my-other-name"');

It's important to note that in each of these examples, you're not actually updating a single field of the JSON data. Instead, you're creating a temporary, modified version of the data, and assigning that modified version back to the column. In practice, the result should be the same, but keeping this in mind should make complex updates, like the last example, more understandable.

In the complex example, there are three transformations and three temporary versions: First, the last tag is removed. Then, that version is transformed by adding a new tag. Next, the second version is transformed by changing the name field. The value in the data column is replaced with the final version.
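Before running such a chain as an UPDATE, the same expression can be tried in a plain SELECT to preview the result (a sketch against the same table; the WHERE clause is again omitted):

SELECT jsonb_set(
    jsonb_set(data #- '{tags,-1}', '{tags,999999999}', '"tag3"', true),
    '{name}', '"my-other-name"')
FROM test;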

Zingaro answered 11/2, 2016 at 20:45 Comment(10)
you get bonus points for showing how to update a column in a table as the OP requestedPleven
I know this was not in the original request, but I'd love to see an example of how to chain these, e.g. remove two keys and set two keys in one UPDATE.Pleven
@chadrik: I added a more complex example. It doesn't do exactly what you requested, but it should give you an idea. Note that the input to the outer jsonb_set call is the output from the inner call, and that the input to that inner call is the result of data #- '{tags,-1}'. I.e., the original data with the last tag removed.Zingaro
I don't know the tag index. How do I delete a tag by value?Farnesol
@PranaySoni: For that purpose, I'd probably use a stored procedure or, if the overhead isn't a concern, bring that data back, manipulate it in the application's language, then write it back. This sounds heavy, but keep in mind that in all the examples I gave, you're still not updating a single field in the JSON(B): you're overwriting the whole column either way. So a stored proc is really no different.Zingaro
@Zingaro is "'{tags,999999999}'" an array append hack? How does it work?Kraemer
@Alex: Yes, a bit of a hack. If I said {tags,0}, that would mean "the first element of array tags", allowing me to give a new value to that element. By using a large number instead of 0, instead of replacing an existing element in the array, it adds a new element to the array. However, if the array actually had more than 999,999,999 elements in it, this would replace the last element instead of adding a new one.Zingaro
What if the field contains null? It doesn't seem to work. E.g., the info jsonb field is null: "UPDATE organizer SET info = jsonb_set(info, '{country}', '"FRA"') where info->>'country'::text IS NULL;" I get UPDATE 105 records but no changes in the db.Outandout
What if the jsonb contains a sub json? Something like "nested_tag":{"subtag1":true, "subtag2":"value"}. i'm trying to modify/add a subtag. I'm trying Set data = jsonb_set(data, '{nested_tag}->{subtag3}', true). But I get ERROR: function jsonb_set(jsonb, unknown, boolean) does not exist. Any idea?Quinn
Does anyone think updating single keys in JSONb column is not concurrency safe, not from data base perspective but from Application data integrity view?Poole
24

This is coming in 9.5 in the form of jsonb_set by Andrew Dunstan, based on an existing extension, jsonbx, that does work with 9.4.
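
For reference, the signature as documented (current manuals call the optional last parameter create_if_missing; it defaults to true):

jsonb_set(target jsonb, path text[], new_value jsonb [, create_if_missing boolean])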

Svensen answered 8/7, 2015 at 21:10 Comment(1)
Another issue in this line is the use of jsonb_build_object(): x->key does not return a key-object pair; to populate it you need jsonb_set(target, path, jsonb_build_object('key', x->key)).Billie
24

For those that run into this issue and want a very quick fix (and are stuck on 9.4.5 or earlier), here is a potential solution:

Creation of test table

CREATE TABLE test(id serial, data jsonb);
INSERT INTO test(data) values ('{"name": "my-name", "tags": ["tag1", "tag2"]}');

Update statement to change jsonb value

UPDATE test 
SET data = replace(data::TEXT,': "my-name"',': "my-other-name"')::jsonb 
WHERE id = 1;

Ultimately, the accepted answer is correct in that you cannot modify an individual piece of a jsonb object (in 9.4.5 or earlier); however, you can cast the jsonb column to a string (::TEXT) and then manipulate the string and cast back to the jsonb form (::jsonb).

There are two important caveats

  1. this will replace all values equaling "my-name" in the json (in the case you have multiple objects with the same value)
  2. this is not as efficient as jsonb_set would be if you are using 9.5
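
Caveat 1 can be mitigated by including the key in the search string, which narrows the match (still a textual, not structural, replacement):

UPDATE test
SET data = replace(data::TEXT, '"name": "my-name"', '"name": "my-other-name"')::jsonb
WHERE id = 1;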
Hanus answered 1/4, 2016 at 15:6 Comment(4)
Good lord, I have been searching for how to do an update to jsonb for like two hours so I could replace all \u0000 null characters, example showed the complete picture. Thanks for this!Perse
looks good! btw the second argument to replace in your example includes the colon and the third does not. Looks like your call should be replace(data::TEXT, '"name":', '"my-other-name":')::jsonbBandit
Thank you @davidicus! Sorry for the very delayed update, but I appreciate you sharing for others!Hanus
If you go this route just be very careful to sanitize your user input so that they can't pollute your data.Redeploy
23

Update the 'name' attribute:

UPDATE test SET data=data||'{"name":"my-other-name"}' WHERE id = 1;

and if you wanted to remove for example the 'name' and 'tags' attributes:

UPDATE test SET data=data-'{"name","tags"}'::text[] WHERE id = 1;
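
A single key can likewise be removed with the text variant of the - operator (again assuming Postgres 9.5 or later, where || and - are available for jsonb):

UPDATE test SET data = data - 'name' WHERE id = 1;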
Merow answered 13/11, 2018 at 11:54 Comment(1)
In addition, to update a JSON attribute with a dynamic value provided that escaping is not required, this can be used: update test set data = data || format('{"name": "%s"}', other_column)::jsonbDisassemble
14

This question was asked in the context of Postgres 9.4; however, new viewers coming to this question should be aware that in Postgres 9.5, sub-document Create/Update/Delete operations on JSONB fields are natively supported by the database, without the need for extension functions.

See: JSONB modifying operators and functions

Obsequent answered 6/8, 2015 at 1:42 Comment(0)
7

I wrote a small function for myself that works recursively in Postgres 9.4. I had the same problem (it's good that they solved some of this headache in Postgres 9.5). Anyway, here is the function (I hope it works well for you):

CREATE OR REPLACE FUNCTION jsonb_update(val1 JSONB, val2 JSONB)
RETURNS JSONB AS $$
DECLARE
    result JSONB;
    v RECORD;
BEGIN
    -- Nothing to merge in: return the original value unchanged.
    IF jsonb_typeof(val2) = 'null'
    THEN
        RETURN val1;
    END IF;

    result = val1;

    -- Walk the top-level keys of val2 and merge each into the result.
    FOR v IN SELECT key, value FROM jsonb_each(val2) LOOP

        IF jsonb_typeof(val2->v.key) = 'object'
            THEN
                -- Nested object: recurse so existing sibling keys in val1 survive.
                result = result || jsonb_build_object(v.key, jsonb_update(val1->v.key, val2->v.key));
            ELSE
                -- Scalar or array: overwrite (or add) the key.
                result = result || jsonb_build_object(v.key, v.value);
        END IF;
    END LOOP;

    RETURN result;
END;
$$ LANGUAGE plpgsql;

Here is sample use:

select jsonb_update('{"a":{"b":{"c":{"d":5,"dd":6},"cc":1}},"aaa":5}'::jsonb, '{"a":{"b":{"c":{"d":15}}},"aa":9}'::jsonb);
                            jsonb_update                             
---------------------------------------------------------------------
 {"a": {"b": {"c": {"d": 15, "dd": 6}, "cc": 1}}, "aa": 9, "aaa": 5}
(1 row)

As you can see, it recurses all the way down and updates/adds values where needed.
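
Note how this differs from the built-in || operator, which merges only at the top level:

select '{"a":{"b":1,"c":2}}'::jsonb || '{"a":{"b":9}}'::jsonb;
-- {"a": {"b": 9}}: the whole "a" object is replaced

select jsonb_update('{"a":{"b":1,"c":2}}'::jsonb, '{"a":{"b":9}}'::jsonb);
-- {"a": {"b": 9, "c": 2}}: sibling keys survive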

Eximious answered 13/10, 2016 at 12:5 Comment(3)
This doesn't work in 9.4, because jsonb_build_object was introduced in 9.5Destruction
@Destruction You are right, I just checked and I'm running PostgreSQL 9.5 now - this is why it works. Thanks for pointing that out - my solution will not work in 9.4.Eximious
@J.Raczkiewicz Function works great! How do I enhance your function to add an insert if value doesn't exist? This is needed in case of null column value (e.g. empty column that doesn't yet have a {}) Similar to the create if missing boolean in jsonb_set function. jsonb_set ( target jsonb, path text[], new_value jsonb [, create_if_missing boolean ] ) postgresql.org/docs/13/functions-json.html.Intracranial
6

The top answer to this question is a bit dated.

Modern Postgres has a much more elegant syntax:

UPDATE test SET data['name'] = 'my-other-name' where id = 1;

Current docs: https://www.postgresql.org/docs/current/datatype-json.html#JSON-KEYS-ELEMENTS
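
This subscripting syntax (jsonb subscripting, added in Postgres 14) also reaches into nested structures; note that the assigned value must itself be valid jsonb, so strings keep their double quotes:

UPDATE test SET data['tags'][1] = '"tag5"' WHERE id = 1;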

Miskolc answered 1/8, 2023 at 19:8 Comment(0)
4

Maybe: UPDATE test SET data = '"my-other-name"'::json WHERE id = 1;

It worked in my case, where data is a json type.

Ellon answered 21/10, 2015 at 8:27 Comment(1)
Worked for me too, on postgresql 9.4.5. The whole record is rewritten so one can't update a single field atm.Adoptive
3

Matheus de Oliveira created handy functions for JSON CRUD operations in PostgreSQL. They can be imported using the \i directive. Note the jsonb fork of the functions if jsonb is your data type.

9.3 json https://gist.github.com/matheusoliveira/9488951

9.4 jsonb https://gist.github.com/inindev/2219dff96851928c2282

Killion answered 6/6, 2015 at 10:55 Comment(0)
3

Updating the whole column worked for me:

UPDATE test SET data='{"name": "my-other-name", "tags": ["tag1", "tag2"]}' where id=1;
Flyboat answered 29/7, 2022 at 8:26 Comment(0)
