Get json data via url and use in python (simplejson)

I imagine this must have a simple answer, but I am struggling: I want to take a url (which outputs json) and get the data in a usable dictionary in python. I am stuck on the last step.

>>> import urllib2
>>> import simplejson
>>> req = urllib2.Request("http://vimeo.com/api/v2/video/38356.json", None, {'user-agent':'syncstream/vimeo'})
>>> opener = urllib2.build_opener()
>>> f = opener.open(req)
>>> f.read()             # this works
'[{"id":"38356","title":"Forgetfulness - Billy Collins Animated Poetry","description":"US Poet Laureate Billy Collins reads his poem ","url":"http:\\/\\/vimeo.com\\/38356","upload_date":"2006-01-24 15:21:03","thumbnail_small":"http:\\/\\/80.media.vimeo.com\\/d1\\/5\\/47\\/74\\/thumbnail-4774968.jpg","thumbnail_medium":"http:\\/\\/80.media.vimeo.com\\/d1\\/5\\/46\\/85\\/thumbnail-4685118.jpg","thumbnail_large":"http:\\/\\/images.vimeo.com\\/87\\/39\\/873998\\/873998_640x480.jpg","user_name":"smjwt","user_url":"http:\\/\\/vimeo.com\\/smjwt","user_portrait_small":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.30.jpg","user_portrait_medium":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.75.jpg","user_portrait_large":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.100.jpg","user_portrait_huge":"http:\\/\\/bitcast.vimeo.com\\/vimeo\\/portraits\\/defaults\\/d.300.jpg","stats_number_of_likes":"281","stats_number_of_plays":"9173","stats_number_of_comments":23,"duration":"112","width":"320","height":"240","tags":"poetry, poet, online poetry, famous poet, video poetry, modern poetry, famous poem, poetry sites, poetry websites, audio poetry, american poet, animation clips, american poetry, free poetry sites, animation art, free poetry, animated clips, poem, poet laureate"}]'
>>> simplejson.load(f)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/django/utils/simplejson/__init__.py", line 298, in load
    parse_constant=parse_constant, **kw)
  File "/usr/lib/python2.5/site-packages/django/utils/simplejson/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.5/site-packages/django/utils/simplejson/decoder.py", line 326, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.5/site-packages/django/utils/simplejson/decoder.py", line 344, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Any ideas where I am going wrong?

Fetish answered 28/10, 2009 at 23:9 Comment(2)
Simple things first: the f.read() in the snippet is just for explanation purposes, right? I ask because, if it is part of the intended code, it has the effect of "emptying" f, hence the ValueError from simplejson.Pugilist
django.utils.simplejson is deprecated; use json instead.Unblown
P
42

Try

f = opener.open(req)
simplejson.load(f)

without running f.read() first. When you run f.read(), the file handle's contents are consumed, so there is nothing left when you call simplejson.load(f).
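
A minimal sketch of why the second read comes back empty. It assumes Python 3 and the standard json module rather than urllib2 and simplejson, and it uses io.BytesIO with a made-up payload as a stand-in for the response returned by opener.open(req):

```python
import io
import json

# Stand-in for the HTTP response body; the payload is invented for illustration.
payload = b'[{"id": "38356", "duration": "112"}]'

f = io.BytesIO(payload)
first = f.read()    # consumes the whole stream
second = f.read()   # nothing left, so this is empty

# Parsing the exhausted stream fails, as in the question.
try:
    json.load(f)
    exhausted_ok = True
except ValueError:
    exhausted_ok = False

# A fresh, unread stream parses fine.
data = json.load(io.BytesIO(payload))
```

The same applies to a real response object: it can be read only once, so either parse it directly or keep the result of the single read().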

Porosity answered 28/10, 2009 at 23:13 Comment(3)
Thank you - I was just taking it step by step to make sure I could see the data ... obviously, don't know enough about read()! Thanks.Fetish
In python 3, simplejson is replaced with json, and it needs a conversion from bytes. data = json.loads(str(opener.open(req).read(),"utf-8")) will work for a UTF-8 encoded response.Jonson
nit above—"utf-8" conversion goes with loads, not str (at least in python 2.7.3) ~ data = json.loads(str(opener.open(req).read()),"utf-8")Edile
C
10

The first line reads the entire file. The second line then tries to read more from the file, but there's nothing left:

>>> f.read()             # this works
blah blah blah
>>> simplejson.load(f)

Either just omit the f.read() line, or save the value from read, and use it in loads:

raw = f.read()                 # read once and keep the text
data = simplejson.loads(raw)   # parse the saved string
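
The "read once, then loads" approach could look roughly like this in Python 3. This is a sketch under assumptions: io.BytesIO and the sample payload stand in for the real response, the body is assumed to be UTF-8, and the variable names are only illustrative:

```python
import io
import json

# Stand-in for the HTTP response; in Python 3 a response's read() returns bytes.
f = io.BytesIO(b'[{"title": "Forgetfulness", "stats_number_of_comments": 23}]')

raw = f.read()                             # read exactly once and keep the result
print(raw)                                 # inspect it as often as you like
videos = json.loads(raw.decode("utf-8"))   # assumes a UTF-8 encoded body

first = videos[0]                          # the payload is a list holding one dict
title = first["title"]
```

Naming the variable raw rather than json also avoids shadowing the json module if it is imported later.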
Chainsmoke answered 28/10, 2009 at 23:17 Comment(0)
S
-7

There's an even easier way: you don't need simplejson at all. Python can parse JSON into a dict/list using the built-in eval function, as long as you set true/false/null to the right values.

# fetch the url
url = "https://api.twitter.com/1/users/lookup.json?user_id=6253282,18949452"
json = urllib2.urlopen(url).read()

# convert to a native python object
(true,false,null) = (True,False,None)
profiles = eval(json)
Scipio answered 15/1, 2012 at 3:15 Comment(4)
This requires that you absolutely trust the source of the json thoughApproximal
@Approximal Can you explain why that is?Godthaab
Please see: docs.python.org/library/functions.html#eval. The value of the JSON string is interpreted as Python code, which means you are allowing that code to run as part of your program, and that could be a major security issue.Approximal
Instead of eval, look at the json library. json.loads() and json.dumps() convert between a Python object and a JSON string without the security issues of eval(). Otherwise this is a decent answer, and thanks for the urllib2 code, that's what I was looking for.Kun
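
To illustrate the point made in the comments above, here is a small sketch using the standard json module. The sample strings are made up; the second one is valid Python but not valid JSON:

```python
import json

# Made-up payload containing all three JSON literals.
text = '{"verified": true, "protected": false, "location": null}'

# json.loads maps true/false/null to True/False/None without executing anything.
profile = json.loads(text)

# A string that is Python code rather than JSON is rejected instead of run.
suspicious = '__import__("os").getcwd()'
try:
    json.loads(suspicious)
    rejected = False
except ValueError:
    rejected = True
```

With eval, the second string would have been executed as code; json.loads only ever builds data.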
