PostgreSQL: How to SUM all attributes in a JSONB field?
Asked Answered
B

5

9

I am working with Postgres 9.4. I have a JSONB field:

     Column      │         Type         │                             Modifiers
─────────────────┼──────────────────────┼────────────────────────────────────────────────────────────────────
 id              │ integer              │ not null default
 practice_id     │ character varying(6) │ not null
 date            │ date                 │ not null
 pct_id          │ character varying(3) │
 astro_pu_items  │ double precision     │ not null
 astro_pu_cost   │ double precision     │ not null
 star_pu         │ jsonb                │

I can query the raw values of the JSONB field just fine:

SELECT star_pu FROM mytable limit 1;
star_pu │ {"statins_cost": 16790.692924903742, "hypnotics_adq": 18523.58385328709, "laxatives_cost": 8456.98405165182, "analgesics_cost": 48271.21822239242, "oral_nsaids_cost": 9911.336052088493, "antidepressants_adq": 186715.7, "antidepressants_cost": 26885.54622478343, "bronchodilators_cost": 26646.54899847902, "cox-2_inhibitors_cost": 2063.4652015406728, "antiplatelet_drugs_cost": 4844.798321177439, "drugs_for_dementia_cost": 3390.569564110721, "antiepileptic_drugs_cost": 44990.94756286502, "oral_antibacterials_cost": 21047.048353859234, "oral_antibacterials_item": 5096.6501798218205, "ulcer_healing_drugs_cost": 15999.05326260261, "lipid-regulating_drugs_cost": 24711.589440943662, "proton_pump_inhibitors_cost": 14545.398978447573, "inhaled_corticosteroids_cost": 50759.91062192373, "calcium-channel_blockers_cost": 11571.457036131978, "omega-3_fatty_acid_compounds_adq": 2026.0, "benzodiazepine_caps_and_tabs_cost": 1800.2581325567717, "bisphosphonates_and_other_drugs_cost": 2996.912924744617, "drugs_acting_on_benzodiazepine_receptors_cost": 2993.142806352308, "drugs_affecting_the_renin_angiotensin_system_cost": 20255.500615282508, "drugs_used_in_parkinsonism_and_related_disorders_cost": 9812.457888596877}

Now I want SUM the JSONB values across the entire table, but I don't know how to do this. Ideally I would get back a dictionary, where the keys are as above and the values are summed values.

I can do the following to SUM one JSONB field explicitly:

    SELECT date, SUM(total_list_size) as total_list_size, 
    SUM((star_pu->>'oral_antibacterials_item')::float) AS star_pu_oral_antibac_items
    FROM mytable GROUP BY date ORDER BY date

But how do I calculate the sums for all the attributes in the JSONB field - and preferably return the entire field as a dictionary? Ideally I'd get back something like:

star_pu │ {"statins_cost": very-large-number, "hypnotics_adq": very-large-number, ...

I guess I can get each field manually by SUMming each key explicitly, but the whole reason I have the JSONB field is that there are lots of keys and they may change.

It is safe to assume that the JSONB field only contains keys and values, i.e. has depth 1.

Beltran answered 1/2, 2016 at 11:18 Comment(0)
S
12

The query should do the job:

select date, json_object_agg(key, val)
from (
    select date, key, sum(value::numeric) val
    from mytable t, jsonb_each_text(star_pu)
    group by date, key
    ) s
group by date;

The resulting json values will be sorted alphabetically by keys (a side effect of json_object_agg ()). I do not know whether this matters.

Sylviasylviculture answered 1/2, 2016 at 11:57 Comment(1)
I've just posted a related question: #35131370Beltran
P
6

I've written a Postgres extension that does exactly that. Once you have it installed you could do:

SELECT jsonb_deep_sum(star_pu) FROM mytable;

Benchmarks are in 4s for 2 million rows, @klin's answer takes 11s

Plainspoken answered 10/6, 2017 at 9:1 Comment(0)
B
1

There might be a better way, but at least this works:

WITH
  keys AS (SELECT DISTINCT jsonb_object_keys(star_pu) AS key FROM mytable),
  sums AS (SELECT key, sum((star_pu->>key)::float) AS total FROM keys, mytable GROUP BY key)
  SELECT json_object(array_agg(key), array_agg(total::text))::jsonb FROM sums

Basically it explodes the jsonbs into rows, gets the names from them, sums them up, aggregates into arrays and creates a jsonb structure. Unfortunately there isn't a jsonb_object() function so we have to make it into json and then cast to jsonb.

Birch answered 1/2, 2016 at 11:37 Comment(1)
jsonb_object was added in 9.5Tran
A
1

If you need to sum the values in a JSONB column in PostgreSQL, you can use a custom aggregate function. I've written a function called that does just that.

create function custom_jsonb_add(a jsonb, b jsonb) returns jsonb as $$
    with expoded as (select *
                from jsonb_each(a) as t(key, value)
                union all
                select *
                from jsonb_each(b) as t(key, value)),
         folded as (select key, sum(value::numeric) as value
                from expoded
                group by key)
    select jsonb_object_agg(key, value)
    from folded
$$ language SQL immutable strict;

create aggregate custom_jsonb_add_agg(jsonb) (
    sfunc = custom_jsonb_add,
    stype = jsonb,
    initcond = '{}'
);
Aeriell answered 2/3, 2023 at 8:45 Comment(0)
J
0

Here is implementation of a stored procedure to sum JSONb values:

   CREATE OR REPLACE FUNCTION sum_jsonb_values(data JSONB)
              RETURNS NUMERIC AS $$
            DECLARE
              key_value RECORD;
              total_sum NUMERIC := 0;
            BEGIN
              IF data IS NULL THEN
                RETURN NULL;
              END IF;

           FOR key_value IN
             SELECT * FROM jsonb_each(data)
           LOOP
             total_sum := total_sum + (key_value.value::NUMERIC);
           END LOOP;
           RETURN total_sum;
    END;
    $$ LANGUAGE plpgsql
    IMMUTABLE ;
Jaquelynjaquenetta answered 13/6, 2023 at 11:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.