MySQL SUM json values grouped by json keys
Asked Answered
T

4

5

Is it possible to calculate sum of json values grouped by json keys?

Mysql version is 5.7.17 on Google cloud sql.

Example_1: A short example of my point:

col1 | col2
-----|-----------------------
aaa  | {"key1": 1, "key2": 3}
-----|-----------------------
bbb  | {"key1": 0, "key2": 2}
-----|-----------------------
aaa  | {"key1": 50, "key2": 0}

SQL query should produce:

col1 | col2
-----|-----------------------
aaa  | {"key1": 51, "key2": 3}
-----|-----------------------
bbb  | {"key1": 0, "key2": 2}

OR

will it be possible with any of the below schema?

Example_2:

col1 | col2
-----|-----------------------
aaa  | {{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}}
-----|-----------------------
bbb  | {{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key2", "key_value" : 2}}
-----|-----------------------
aaa  | {{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}}

Example_3:

col1 | col2
-----|-----------------------
aaa  | [{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}]
-----|-----------------------
bbb  | [{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key2", "key_value" : 2}]
-----|-----------------------
aaa  | [{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}]

Example_4:

col1 | col2
-----|-----------------------
aaa  | {"key1": {"key_name" : "key1", "key_value" : 1}, "key2": {"key_name" : "key2", "key_value" : 3}}
-----|-----------------------
bbb  | {"key1": {"key_name" : "key1", "key_value" : 0}, "key2": {"key_name" : "key2", "key_value" : 2}}
-----|-----------------------
aaa  | {"key1": {"key_name" : "key1", "key_value" : 50}, "key2": {"key_name" : "key2", "key_value" : 0}}
Towbin answered 14/1, 2017 at 12:58 Comment(4)
Have you got any solutions for this ?Mutilate
I would start by abandoning JSON; put the key:value pairs in an EAV schema. Then the SQL is trivial.Kilogram
Team, Please dont limit yourself by not exploring the new concepts. I like json support in mysql which has tremendous values and want to explore all the possibilities. So please dont give traditional group by option as answers.Mutilate
@wchiquito could you please help here ..?Mutilate
C
1

I believe that something like this could work.

SELECT SUM(col2->>"$.key1"), SUM(col2->>"$.key2") FROM your_table GROUP BY col1
Chobot answered 26/10, 2017 at 12:38 Comment(1)
But I should know the keys that exists to prepare the query :(Mutilate
G
1

TL;DR: yes, it can be done without knowing the key names in advance, and none of the alternate data formats has any advantage over the original.

This can be done without knowing the key names in advance but it is painful... basically you have to look at every value in the table to determine the set of distinct keys in the table before you can sum them. Because of this requirement, and the fact that the alternate data formats can all have multiple keys per entry, there is no advantage to using any of them.

Since you have to look for all the distinct keys, it is as easy to do the sums while you are looking for them. This function and procedure together will do that. The function, json_merge_sum, takes two JSON values and merges them, summing the values where a key appears in both values e.g.

SELECT json_sum_merge('{"key1": 1, "key2": 3}', '{"key3": 1, "key2": 2}')

Output:

{"key1": 1, "key2": 5, "key3": 1}

The function code:

DELIMITER //
DROP FUNCTION IF EXISTS json_merge_sum //
CREATE FUNCTION json_sum_merge(IN j1 JSON, IN total JSON) RETURNS JSON
BEGIN
  DECLARE knum INT DEFAULT 0;
  DECLARE jkeys JSON DEFAULT JSON_KEYS(j1);
  DECLARE kpath VARCHAR(20);
  DECLARE v INT;
  DECLARE l INT DEFAULT JSON_LENGTH(jkeys);
  kloop: LOOP
    IF knum >= l THEN
      LEAVE kloop;
    END IF;
    SET kpath = CONCAT('$.', JSON_EXTRACT(jkeys, CONCAT('$[', knum, ']')));
    SET v = JSON_EXTRACT(j1, kpath);
    IF JSON_CONTAINS_PATH(total, 'one', kpath) THEN
      SET total = JSON_REPLACE(total, kpath, JSON_EXTRACT(total, kpath) + v);
    ELSE
      SET total = JSON_SET(total, kpath, v);
    END IF;
    SET knum = knum + 1;
  END LOOP kloop;
  RETURN total;
END

The procedure, count_keys, performs the equivalent of the GROUP BY clause. It finds all the distinct values of col1 in the table and then calls json_sum_merge for each row which has that value of col1. Note the row select query performs a SELECT ... INTO a dummy variable so no output is generated, and uses a MIN() to ensure there is only one result (so that it can be assigned to a variable).

The procedure:

DELIMITER //
DROP PROCEDURE IF EXISTS count_keys //
CREATE PROCEDURE count_keys()
BEGIN
  DECLARE finished INT DEFAULT 0;
  DECLARE col1val VARCHAR(20);
  DECLARE col1_cursor CURSOR FOR SELECT DISTINCT col1 FROM table2;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished=1;
  OPEN col1_cursor;
  col1_loop: LOOP
    FETCH col1_cursor INTO col1val;
    IF finished=1 THEN
      LEAVE col1_loop;
    END IF;
    SET @total = '{}';
    SET @query = CONCAT("SELECT MIN(@total:=json_sum_merge(col2, @total)) INTO @json FROM table2 WHERE col1='", col1val, "'");
    PREPARE stmt FROM @query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    SELECT col1val AS col1, @total AS col2;
  END LOOP col1_loop;
END

For a slightly larger example:

col1    col2    
aaa     {"key1": 1, "key2": 3}
bbb     {"key1": 4, "key2": 2}
aaa     {"key1": 50, "key3": 0}
ccc     {"key2": 5, "key3": 1, "key4": 3}
bbb     {"key1": 5, "key2": 1, "key5": 3}

CALL count_keys() produces:

col1    col2    
aaa     {"key1": 51, "key2": 3, "key3": 0}
bbb     {"key1": 9, "key2": 3, "key5": 3}
ccc     {"key2": 5, "key3": 1, "key4": 3}

Note I've called the table table2 in the procedure, you will need to edit that (in both queries) to suit.

Goldarned answered 20/5, 2018 at 8:12 Comment(2)
Are you sure, there is no advantage of the Example_1 compare to other Examples schema's ..?Mutilate
Sorry I didn't make myself clear, Example 1 is actually the easiest to use. Example 2 would be the second easiest, and Examples 3 and 4 are both quite difficult to work with.Goldarned
J
0

SQL for the "short example":

SELECT col1,
       JSON_OBJECT('key1', SUM(value1), 'key2', SUM(value2)) AS col2
FROM
    (SELECT col1,
            JSON_EXTRACT(col2, '$.key1') AS value1,
            JSON_EXTRACT(col2, '$.key2') AS value2
     FROM tbl) subq
GROUP BY col1;
Jobless answered 16/5, 2018 at 14:39 Comment(1)
Thanks! But i need a dynamic query, which uses common name. But in your example you have mentioned the "key1 and key2" which is static info, that has to be added/remembered whenever we create query. I think it is not possible with Example_1 hence mentioned other examples in the questions with common name key_name and key_value.Mutilate
T
0

Solution for Example_3:

DROP TABLE IF EXISTS jsondata;
CREATE TABLE jsondata (json JSON, col varchar(11));

INSERT INTO jsondata VALUES
('[{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}]', 'aaa'),
('[{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key3", "key_value" : 2}]', 'bbb'),
('[{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}]', 'aaa');

DROP FUNCTION IF EXISTS json_sum_by_col;
CREATE FUNCTION json_sum_by_col(col varchar(100)) RETURNS JSON
BEGIN
    DECLARE i INT DEFAULT 0;
    DECLARE done INT DEFAULT FALSE;
    DECLARE select_values JSON;
    DECLARE temp_result JSON;
    DECLARE json_result JSON DEFAULT '[]';
    DECLARE temp_key varchar(11);
    DECLARE temp_value int;

    DECLARE curs CURSOR FOR SELECT json FROM jsondata WHERE jsondata.col = col;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    OPEN curs;
        read_loop: LOOP
            SET i = 0;
            FETCH curs INTO select_values;

            IF done THEN
              LEAVE read_loop;
            END IF;

            WHILE i < JSON_LENGTH(select_values) DO
                -- extract key and value for i element
                SET temp_key = JSON_EXTRACT(JSON_EXTRACT(select_values, CONCAT('$[',i,']')), '$.key_name');
                SET temp_value = JSON_EXTRACT(JSON_EXTRACT(select_values, CONCAT('$[',i,']')), '$.key_value');

                -- search json_result for key
                SET @search = JSON_SEARCH(json_result, 'one', JSON_UNQUOTE(temp_key));
                IF @search IS NOT NULL THEN
                    -- if exists add to existing value
                    SET @value_path = JSON_UNQUOTE(REPLACE(@search, 'name', 'value'));
                    SET temp_value = temp_value + JSON_EXTRACT(json_result, @value_path);
                    SET json_result = JSON_REPLACE(json_result, @value_path, temp_value);
                ELSE
                    -- else attach it to json_result
                    SET temp_result = JSON_OBJECT("key_name", JSON_UNQUOTE(temp_key), "key_value", temp_value);
                    SET json_result = JSON_INSERT(json_result, CONCAT('$[',JSON_LENGTH(json_result),']'), temp_result);
                END IF;

                SELECT i + 1 INTO i;
            END WHILE;
        END LOOP;
    CLOSE curs;

    RETURN json_result;
END;

SELECT col, json_sum_by_col(col) FROM jsondata GROUP BY col;

You can run it here

Thenceforth answered 21/5, 2018 at 9:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.