Google Sheets - Parse JSON string contained in one cell and extract specific values to another cell
Asked Answered
M

4

6

I have a sheet where for each row in column Z there is a JSON string recovered from Twitter via TAGS.

enter image description here

The JSON strings in column Z all have a similar structure:

{
  "hashtags": [
    {
      "text": "Negev_Summit",
      "indices": [
        172,
        185
      ]
    }
  ],
  "symbols": [],
  "user_mentions": [
    {
      "screen_name": "JY_LeDrian",
      "name": "Jean-Yves Le Drian",
      "id": 1055021191,
      "id_str": "1055021191",
      "indices": [
        69,
        80
      ]
    }
  ],
  "urls": [],
  "media": [
    {
      "id": 1513588335893258200,
      "id_str": "1513588335893258240",
      "indices": [
        271,
        294
      ],
      "media_url": "http://pbs.twimg.com/media/FQFYknkXoAAxgYd.jpg",
      "media_url_https": "https://pbs.twimg.com/media/FQFYknkXoAAxgYd.jpg",
      "url": "https://twitter.com/yairlapid/status/1513588345468825605",
      "display_url": "pic.twitter.com/dA4cBepIh2",
      "expanded_url": "https://twitter.com/yairlapid/status/1513588345468825605/photo/1",
      "type": "photo",
      "sizes": {
        "medium": {
          "w": 1024,
          "h": 576,
          "resize": "fit"
        },
        "thumb": {
          "w": 150,
          "h": 150,
          "resize": "crop"
        },
        "large": {
          "w": 1024,
          "h": 576,
          "resize": "fit"
        },
        "small": {
          "w": 680,
          "h": 383,
          "resize": "fit"
        }
      }
    }
  ]
}

I need to extract specific values for each JSON string in column Z and put them in columns AA, AB and AC (hashtags, user mentions, and URL's).

I've managed to achieve this with a really dirty multiple REGEXREPLACE formula but it doesn't seem logical that there is no way to fo this more efficiently:

=IFERROR("@"&JOIN(" @",SPLIT(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXEXTRACT(INDIRECT("Y"&ROW()),".*user_mentions\"":\[(.*)\],\""urls.*"),"(,\""indices\"":\[\d+,\d+\])",""),"(,\""id_str\"":\""\d+\"")",""),"(,\""id\"":\d+)",""),"(\{\""screen_name\"":\"")",""),"\"",\""name\"":\""(.){1,50}\""\}",""),",")),"")

Ideally i'm looking for a script which would parse the JSON string and extract 1 or more values from each section of the JSON. For example:

For hashtags (column AA):

=PARSEJSON(Z1, "hashtags")

Result:

#hashtag1 #hashtag2

For user_mentions (column AB):

=PARSEJSON(Z1, "user_mentions/screen_name")

Result:

@username1 @username2

Would appreciate any help sending me in the right direction.

Metatarsal answered 16/2, 2022 at 22:36 Comment(2)
how do you organize your data in the spreadsheet? what happens if you only filter the values in screen_name in a separate cell? how are you using the fetch method to get the data?Laconism
I'm using TAGS (tags.hawksey.info) to get the data. The JSON strings from the example above are stored in column Y. Ideally, i'd like to display the "screen_names" in column AA.Metatarsal
A
2

I managed to do it with a different script I found here.

This is the script:

function getData(json, path) {
  const obj = JSON.parse(json);
  const keys = path.split('.');
  let current = obj;
  for( key of keys ){
    current = current[key];
  }
  return current;
}

You would then enter in the cell with =getData(Z1, "hashtags")

Axon answered 6/4, 2022 at 14:48 Comment(1)
Tried the code in your answer. I get an empty cell :-( I have updated the JSON in my question to the one I tested your code on. Normally I should have got Negev_Summit as a hashtag result but got nothing ?Metatarsal
M
2

I couldn't get any of the solutions to work. Ultimately this is the script that worked for me.

function getData(jsonText, key) {
  try {
    var data = JSON.parse(jsonText);
    return data[key] || "";
  } catch (error) {
    return "Error parsing JSON or key not found.";
  }
}

Calling out the element you need in the cell using: =getData(A2, "title")

  • Replace A2 with the cell containing your JSON data
  • Replace title with the element to parse from your data
Mcabee answered 16/8, 2023 at 15:26 Comment(1)
I like this solution as well. I found that my json values were single quoted, i.e: {'cpu_user':1.234}. The single-quotes caused the JSON.parse() call to fail. The JSON.parse() call seems to prefer keys that are double-quoted, like: {"cpu_user":1.234}.Stephine
L
0

If your main purpose is to only get the values in screen_name I'd modify my script and I'd use =IMPORTJSON(url, "user_mentions/screen_name")

/**
* Imports JSON data to your spreadsheet Ex: IMPORTJSON("http://myapisite.com","city/population")
* @param url URL of your JSON data as string
* @param xpath simplified xpath as string
* @customfunction
*/
function IMPORTJSON(url,xpath){
  
  try{
    var res = UrlFetchApp.fetch(url);
    var content = res.getContentText();
    var json = JSON.parse(content);
    
    var patharray = xpath.split("/");
    
    for(var i=0;i<patharray.length;i++){
      json = json[patharray[i]];
    }
    
    
    if(typeof(json) === "undefined"){
      return "Node Not Available";
    } else if(typeof(json) === "object"){
      var tempArr = [];
      
      for(var obj in json){
        tempArr.push([obj,json[obj]]);
      }
      return tempArr;
    } else if(typeof(json) !== "object") {
      return json;
    }
  }
  catch(err){
      return "Error getting data";  
  }
  
}
Laconism answered 17/2, 2022 at 1:28 Comment(4)
Hey David, indeed when I researched my problem I came across your script, however, it expects a URL for the JSON while in my project, the JSON segments are already fetched and stored in cells, hundreds of them all along column Y.Metatarsal
I tried modulating your script instead of looking for the JSON in a URL, to get it from the active cell, however i keep on getting "Error getting data".Metatarsal
@EladRatson do you have a sample sheet I can take a look at?Laconism
Somehow your comment slipped me. Apologize. Here is a link: docs.google.com/spreadsheets/d/… . Note that the raw JSON is in column AAMetatarsal
C
0

@Yiddy s answer did not work for me. So i did some modifications to it and came up with this.

function getData(range, path, sheet_name) {
    var sprsheet = SpreadsheetApp.getActiveSpreadsheet();
    var sheet = sprsheet.getSheetByName(sheet_name);

    var string = sheet.getRange(range).getValue();

    var json = JSON.parse(string);
    const keys = path.split('.');

    var current = json;
    for (key of keys) {
        current = current[key];
    }
    return current;
}
Codeine answered 2/10, 2022 at 19:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.