RJSONIO vs rjson - better tuning
UPDATE:

The tl;dr is that RJSONIO is no longer the faster of the two options; rjson is now much faster.

See the comments for additional confirmation of these results.


I was under the impression that RJSONIO was supposed to be faster than rjson.
However, I am getting the opposite results.

My question is:

  • Is there any tuning that can or should be performed to improve the results from RJSONIO? (i.e., am I overlooking something?)

Below are the comparisons using real data (where U is the contents of a JSON web page), followed by mocked-up JSON.

## REAL DATA
library(microbenchmark)
> microbenchmark(RJSONIO::fromJSON(U), rjson::fromJSON(U))

Unit: milliseconds
                  expr       min        lq    median        uq      max
1   rjson::fromJSON(U)  29.46913  30.16218  31.74999  34.11012 158.6932
2 RJSONIO::fromJSON(U) 175.11514 181.67742 186.52871 195.90646 414.6160

> microbenchmark(RJSONIO::fromJSON(U, simplify=FALSE), rjson::fromJSON(U))
Unit: milliseconds
                                    expr       min       lq    median        uq        max
1                     rjson::fromJSON(U)  27.92341  28.7430  29.60091  30.63291   143.9478
2 RJSONIO::fromJSON(U, simplify = FALSE) 173.30136 179.5815 183.94315 190.17245   328.8996

Example with Mock Data

(Similar results)

# MOCK DATA
U <- toJSON(list(1:10, LETTERS, letters, rnorm(20)))

microbenchmark(RJSONIO::fromJSON(U), rjson::fromJSON(U))
# Unit: microseconds
#                   expr     min       lq   median       uq      max
# 1   rjson::fromJSON(U)  94.788 100.8650 105.6035 111.0740 3457.479
# 2 RJSONIO::fromJSON(U) 520.131 527.7775 533.2715 555.2415  942.136

Example 2 with iris dataset

Iris.JSON <- toJSON(iris)

microbenchmark(RJSONIO::fromJSON(Iris.JSON), rjson::fromJSON(Iris.JSON))
# Unit: microseconds
#                           expr      min       lq   median       uq       max
# 1   rjson::fromJSON(Iris.JSON)  229.669  235.571  238.511  241.423   260.164
# 2 RJSONIO::fromJSON(Iris.JSON) 1209.607 1224.793 1232.165 1238.953 12039.772

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.8.8 stringr_0.6.1    RJSONIO_1.0-1    rjson_0.2.11

loaded via a namespace (and not attached):
[1] plyr_1.7.1
Desideratum answered 9/3, 2013 at 7:38 Comment(10)
I ran your benchmark and can confirm the result (I used simplify = FALSE to get identical results). What do you expect as an answer? – Aloke
Can we have a fully reproducible example? In my setup, RJSONIO is much faster than rjson. – Serving
@dicko A full working example was included. It may have been missed because it was mixed in with the benchmarks. I separated it to make it more visible and also added session info. – Desideratum
@agstudy, I would have expected the results to be flipped, i.e., for RJSONIO to have been much faster. [This is based on what I have heard about RJSONIO, so I'm trying to confirm whether it is in fact slower or whether I am simply doing something incorrectly.] – Desideratum
@RicardoSaporta I tried it and it seems that you are right. I'm somewhat surprised, because with the iris data I get the opposite. Look at this: https://mcmap.net/q/738962/-how-to-read-big-json – Serving
I think RJSONIO used to be faster, but now rjson seems to beat it, even with the iris or bigger datasets. Maybe it's connected to some compiler settings, although rjson has also used the C implementation since 0.2.7, so this performance update should have happened about a year ago, not now. – Lutanist
@daroczig, I'm not sure about its history, but currently rjson is beating RJSONIO on every dataset I test it on. – Desideratum
@RicardoSaporta: right, we agree on this. I just wrote about the history because I benchmarked the two packages pretty seriously a year ago in February, and RJSONIO seemed to perform a lot better. After that I stopped following news about the rjson package, which is a shame, as in March (2012) it started to use the C implementation of the JSON parser. IMHO it became much faster at that time compared to RJSONIO, which already used the C lib. – Lutanist
@Lutanist Nice and clear explanation; I think yours should have been the answer. – Agate
For anyone who finds this after 2015, I would strongly recommend jsonlite. – Lenwood
> library('BBmisc')
> suppressAll(lib(c('RJSONIO','rjson','jsonlite','microbenchmark')))
> U <- toJSON(list(1:10, LETTERS, letters, rnorm(20)))
> microbenchmark(
+     rjson::toJSON(U),
+     RJSONIO::toJSON(U),
+     jsonlite::toJSON(U, dataframe = "column"),
+     times = 10
+ )
Unit: microseconds
                                      expr     min      lq      mean   median      uq       max neval cld
                          rjson::toJSON(U)  65.174  68.767 2002.7007  88.2675 103.151 19179.224    10   a
                        RJSONIO::toJSON(U) 299.186 304.832  482.8038 329.7210 493.683  1351.727    10   a
 jsonlite::toJSON(U, dataframe = "column") 485.985 501.381  555.4192 548.5935 587.083   708.708    10   a

Testing system.time()

> microbenchmark(
+     system.time(rjson::toJSON(U)),
+     system.time(RJSONIO::toJSON(U)),
+     system.time(jsonlite::toJSON(U, dataframe = "column")),
+     times = 10)
Unit: milliseconds
                                                   expr      min       lq     mean   median       uq      max neval cld
                          system.time(rjson::toJSON(U)) 112.0660 115.8677 119.8426 119.8372 121.6908 132.2111    10  ab
                        system.time(RJSONIO::toJSON(U)) 115.4223 118.0262 129.2758 120.5690 148.5175 151.6874    10   b
 system.time(jsonlite::toJSON(U, dataframe = "column")) 113.2674 114.9096 118.0905 117.8401 120.9626 123.6784    10  a
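
Note that the question is about parsing (fromJSON), whereas the benchmarks above time toJSON on a string that is already JSON. A minimal sketch of the parsing comparison across all three packages might look like the following. It assumes rjson, RJSONIO, jsonlite and microbenchmark are installed; the choice of rjson::toJSON to build the input, the simplify/simplifyVector settings, and times = 100 are illustrative assumptions, not something taken from the original posts. Timings will vary with package versions and hardware.

```r
library(microbenchmark)

# Build the mock JSON string with one explicitly named package, so the
# result does not depend on which toJSON() happens to mask the others.
U <- rjson::toJSON(list(1:10, LETTERS, letters, rnorm(20)))

# Time only the parsing step; each expression is namespaced explicitly.
res <- microbenchmark(
  rjson    = rjson::fromJSON(U),
  RJSONIO  = RJSONIO::fromJSON(U, simplify = FALSE),
  jsonlite = jsonlite::fromJSON(U, simplifyVector = FALSE),
  times = 100
)

print(res)
```

The three parsers do not necessarily return the same R structure for the same input, so it is worth inspecting each result with str() before reading too much into the timings.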

Below are comparisons of a few packages. Hope these links help:

1) New package: jsonlite. A smart(er) JSON encoder/decoder.

2) Improved memory usage and RJSONIO compatibility in jsonlite 0.9.15

3) A biased comparsion of JSON packages in R

Gaseous answered 9/3, 2013 at 7:38 Comment(0)

https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html

Please try jsonlite. In my experience it is the fastest for JSON data, especially nested data.

Also see:

https://rstudio-pubs-static.s3.amazonaws.com/31702_9c22e3d1a0c44968a4a1f9656f1800ab.html

Wondering answered 28/3, 2017 at 6:49 Comment(0)
