Parsing text file of one line JSON objects using RJSONIO

What I want: I would like to parse a text file of the form

{"business_id": "rncjoVoEFUJGCUoC1JgnUA", "full_address": "8466 W Peoria Ave\nSte 6\nPeoria, AZ 85345", "open": true, "categories": ["Accountants", "Professional Services", "Tax Services", "Financial Services"], "city": "Peoria", "review_count": 3, "name": "Peoria Income Tax Service", "neighborhoods": [], "longitude": -112.241596, "state": "AZ", "stars": 5.0, "latitude": 33.581867000000003, "type": "business"}
{"business_id": "0FNFSzCFP_rGUoJx8W7tJg", "full_address": "2149 W Wood Dr\nPhoenix, AZ 85029", "open": true, "categories": ["Sporting Goods", "Bikes", "Shopping"], "city": "Phoenix", "review_count": 5, "name": "Bike Doctor", "neighborhoods": [], "longitude": -112.10593299999999, "state": "AZ", "stars": 5.0, "latitude": 33.604053999999998, "type": "business"}

where every line is an individual JSON object. I would like the parsed result to be of a type that rpart can take as an argument.

I can get this working if I loop through every line, but according to this SO answer it is more R-like to use the apply function than to loop through each line individually.

For each row in an R dataframe

Problem: When I run my code I'm getting this error

Error in apply(yelp_df, 1, fromJSON) : dim(X) must have a positive length

My code

#!/usr/bin/Rscript

require(graphics)
require(RJSONIO)


con <- file("yelp_phoenix_academic_dataset/yelp_academic_dataset_business.json", "r")
yelp_df <- readLines(con) # rather than guessing the optimal buffer size for the system, I'll just read everything into memory

apply(yelp_df, 1, fromJSON)
Hallo answered 23/5, 2013 at 3:38 Comment(0)

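readLines returns a character vector, which has no dim attribute. apply expects an array (or matrix), which is why it fails with "dim(X) must have a positive length". Use lapply or something similar.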

out <- lapply(readLines("test.txt"), fromJSON)

> head(out[[1]])
$business_id
[1] "rncjoVoEFUJGCUoC1JgnUA"

$full_address
[1] "8466 W Peoria Ave\nSte 6\nPeoria, AZ 85345"

$open
[1] TRUE

$categories
[1] "Accountants"           "Professional Services" "Tax Services"         
[4] "Financial Services"   

$city
[1] "Peoria"

$review_count
[1] 3
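
The difference between the two calls can be reproduced without the data file. The following is a minimal sketch, assuming RJSONIO is installed; the two records and the name json_lines are made up for illustration and stand in for what readLines would return.

```r
library(RJSONIO)

# Two made-up one-line JSON records standing in for the lines of the file.
json_lines <- c(
  '{"city": "Peoria", "review_count": 3, "open": true}',
  '{"city": "Phoenix", "review_count": 5, "open": true}'
)

# A character vector has no dimensions, which is what apply() complains about.
is.null(dim(json_lines))

# lapply() walks the vector element by element and returns a list,
# one parsed record per input line. asText = TRUE tells fromJSON that
# each element is JSON text rather than a file name.
parsed <- lapply(json_lines, fromJSON, asText = TRUE)
length(parsed)
parsed[[2]][["city"]]
```

Passing asText = TRUE is optional here, since fromJSON tries to detect JSON content by itself, but it makes the intent explicit.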
Touch answered 23/5, 2013 at 7:3 Comment(4)
How does one tell what a function returns in R? I'm used to programming language documentation stating the return type explicitly, but I haven't found that in R yet. - Hallo
?readLines will tell you. The Value section states that the return is "A character vector of length the number of lines read." - Touch
How can we get all the $city values? Is there a straightforward way without looping? - Igraine
How does one aggregate such a result? I was going to create an empty data frame and then move the calculation results into it, but I have a feeling there is an easier way to do it. - Selle
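
One possible way to address the last two comments, given as a sketch rather than a definitive recipe: a single field can be pulled out of every record with sapply, and the scalar fields can be bound into a data frame, which is the kind of object rpart accepts as its data argument. The records, the fields vector, and the names rows and yelp_df are illustrative assumptions; nested fields such as categories are left out because they do not fit in a single cell.

```r
library(RJSONIO)

# Made-up records with the same shape as the lines in the question.
json_lines <- c(
  '{"business_id": "a1", "city": "Peoria", "review_count": 3, "stars": 5.0, "categories": ["Tax Services"]}',
  '{"business_id": "b2", "city": "Phoenix", "review_count": 5, "stars": 4.5, "categories": ["Bikes", "Shopping"]}'
)
out <- lapply(json_lines, fromJSON, asText = TRUE)

# Pull one field out of every record; `[[` is the extraction function itself.
cities <- sapply(out, `[[`, "city")

# Keep only the scalar fields, turn each record into a one-row data frame,
# and stack the rows.
fields <- c("business_id", "city", "review_count", "stars")
rows <- lapply(out, function(rec) {
  as.data.frame(rec[fields], stringsAsFactors = FALSE)
})
yelp_df <- do.call(rbind, rows)
```

Calling rbind once through do.call avoids growing a data frame row by row. This assumes every record contains all of the listed fields; records with missing fields would need to be handled before binding.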
