how to extract properly when sqlite json has value as an array
Asked Answered
A

1

9

I have a sqlite database and in one of the fields I have stored complete json object . I have to make some json select requests . If you see my json the ALL key has value which is an array . We need to extract some data like all comments where "pod" field is fb . How to extract properly when sqlite json has value as an array ?

select json_extract(data,'$."json"') from datatable ; gives me entire thing . Then I do select json_extract(data,'$."json"[0]') but i dont want to do it manually . i want to iterate .

kindly suggest some source where i can study and work on it . MY JSON

{
    "ALL": [{
            "comments": "your site is awesome",
            "pod": "passcode",
            "originalDirectory": "case1"
        },
        {
            "comments": "your channel is good",
            "data": ["youTube"],
            "pod": "library"
        },
        {
            "comments": "you like everything",
            "data": ["facebook"],
            "pod": "fb"
        },
        {
            "data": ["twitter"],
            "pod": "tw",
            "ALL": [{
                "data": [{
                    "codeLevel": "3"
                }],
                "pod": "mo",
                "pod2": "p"
            }]
        }
    ]
}



create table datatable ( path string , data json1 );
insert into datatable values("1" , json('<abovejson in a single line>'));
Atonement answered 23/10, 2019 at 9:30 Comment(5)
put this json_extract(data,'$."json"[0]') in a loopGastroscope
can you please guide how . Any link to documentation will really helpAtonement
Column types of string and json1? You should read sqlite.org/datatype3.html very carefully.Repetitive
And what you're looking for is json_each().Repetitive
@Repetitive . can you please share an example . I am unable to understand documentation for this .Atonement
P
25

Simple List

Where your JSON represents a "simple" list of comments, you want something like:

select key, value 
  from datatable, json_each( datatable.data, '$.ALL' )
 where json_extract( value, '$.pod' ) = 'fb' ;

which, using your sample data, returns:

2|{"comments":"you like everything","data":["facebook"],"pod":"fb"}

The use of json_each() returns a row for every element of the input JSON (datatable.data), starting at the path $.ALL (where $ is the top-level, and ALL is the name of your array: the path can be omitted if the top-level of the JSON object is required). In your case, this returns one row for each comment entry.

The fields of this row are documented at 4.13. The json_each() and json_tree() table-valued functions in the SQLite documentation: the two we're interested in are key (very roughly, the "row number") and value (the JSON for the current element). The latter will contain elements called comment and pod, etc..

Because we are only interested in elements where pod is equal to fb, we add a where clause, using json_extract() to get at pod (where $.pod is relative to value returned by the json_each function).

Nested List

If your JSON contains nested elements (something I didn't notice at first), then you need to use the json_tree() function instead of json_each(). Whereas the latter will only iterate over the immediate children of the node specified, json_tree() will descend recursively through all children from the node specified.

To give us some data to work with, I have augmented your test data with an extra element:

create table datatable ( path string , data json1 );
insert into datatable values("1" , json('
    {
        "ALL": [{
                "comments": "your site is awesome",
                "pod": "passcode",
                "originalDirectory": "case1"
            },
            {
                "comments": "your channel is good",
                "data": ["youTube"],
                "pod": "library"
            },
            {
                "comments": "you like everything",
                "data": ["facebook"],
                "pod": "fb"
            },
            {
                "data": ["twitter"],
                "pod": "tw",
                "ALL": [{
                    "data": [{
                        "codeLevel": "3"
                    }],
                    "pod": "mo",
                    "pod2": "p"
                },
                {
                    "comments": "inserted by TripeHound",
                    "data": ["facebook"],
                    "pod": "fb"
                }]
            }
        ]
    }
'));

If we were to simply switch to using json_each(), then we see that a simple query (with no where clause) will return all elements of the source JSON:

select key, value 
  from datatable, json_tree( datatable.data, '$.ALL' ) limit 10 ;

ALL|[{"comments":"your site is awesome","pod":"passcode","originalDirectory":"case1"},{"comments":"your channel is good","data":["youTube"],"pod":"library"},{"comments":"you like everything","data":["facebook"],"pod":"fb"},{"data":["twitter"],"pod":"tw","ALL":[{"data":[{"codeLevel":"3"}],"pod":"mo","pod2":"p"},{"comments":"inserted by TripeHound","data":["facebook"],"pod":"fb"}]}]
0|{"comments":"your site is awesome","pod":"passcode","originalDirectory":"case1"}
comments|your site is awesome
pod|passcode
originalDirectory|case1
1|{"comments":"your channel is good","data":["youTube"],"pod":"library"}
comments|your channel is good
data|["youTube"]
0|youTube
pod|library

Because JSON objects are mixed in with simple values, we can no longer simply add where json_extract( value, '$.pod' ) = 'fb' because this produces errors when value does not represent an object. The simplest way around this is to look at the type values returned by json_each()/json_tree(): these will be the string object if the row represents a JSON object (see above documentation for other values).

Adding this to the where clause (and relying on "short-circuit evaluation" to prevent json_extract() being called on non-object rows), we get:

select key, value
  from datatable, json_tree( datatable.data, '$.ALL' )
 where type = 'object'
   and json_extract( value, '$.pod' ) = 'fb' ;

which returns:

2|{"comments":"you like everything","data":["facebook"],"pod":"fb"}
1|{"comments":"inserted by TripeHound","data":["facebook"],"pod":"fb"}

If desired, we could use json_extract() to break apart the returned objects:

.mode column
.headers on
.width 30 15 5
select json_extract( value, '$.comments' ) as Comments,
       json_extract( value, '$.data' ) as Data,
       json_extract( value, '$.pod' ) as POD
  from datatable, json_tree( datatable.data, '$.ALL' )
 where type = 'object'
   and json_extract( value, '$.pod' ) = 'fb' ;

Comments                        Data             POD
------------------------------  ---------------  -----
you like everything             ["facebook"]     fb
inserted by TripeHound          ["facebook"]     fb

Note: If your structure contained other objects, of different formats, it may not be sufficient to simply select for type = 'object': you may have to devise a more subtle filtering process.

Prosthetics answered 23/10, 2019 at 10:58 Comment(3)
Thanks sir :) This is ok for first level . How will we go inside the jsons which are at second level ie if you see "data": ["twitter"], it has another "ALL" . this query will become complicated . Any suggestionAtonement
Possibly json_tree will help... it is similar to json_each but (IIRC: I'm away from my computer) it descends down nested layers.Prosthetics
@Atonement I've added a way of using json_tree() to (hopefully) do what you want...Prosthetics

© 2022 - 2024 — McMap. All rights reserved.