Python: opposite function of urllib.urlencode

How can I convert the string produced by urllib.urlencode back into a dict? urllib.urldecode does not exist.

Amoebic answered 22/8, 2010 at 18:59 Comment(0)
131

As the docs for urlencode say,

The urlparse module provides the functions parse_qs() and parse_qsl() which are used to parse query strings into Python data structures.

(In older Python releases, they were in the cgi module.) So, for example, in Python 2:

>>> import urllib
>>> import urlparse
>>> d = {'a':'b', 'c':'d'}
>>> s = urllib.urlencode(d)
>>> s
'a=b&c=d'
>>> d1 = urlparse.parse_qs(s)
>>> d1
{'a': ['b'], 'c': ['d']}

The obvious difference between the original dictionary d and the "round-tripped" one d1 is that d1 has lists as values (single-item lists, in this case). That's because query strings carry no uniqueness guarantee, and it may be important to your app to know every value that was given for each key (that is, the lists won't always be single-item ones;-).
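A minimal sketch of the multi-value case, written for Python 3 (where these functions live in urllib.parse) rather than the Python 2 session above. The query string and the variable name round_tripped are illustrative assumptions, not part of the original answer:

```python
from urllib.parse import parse_qs, urlencode

# A query string in which the key 'a' appears twice (illustrative input).
s = 'a=b&a=c&c=d'

# parse_qs collects every value given for a key into one list.
d1 = parse_qs(s)
print(d1)

# doseq=True makes urlencode emit one key=value pair per list item,
# so the lists produced by parse_qs can be encoded again.
round_tripped = urlencode(d1, doseq=True)
print(round_tripped)
```

Without doseq=True, urlencode would encode the string form of each list rather than its items.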

As an alternative:

>>> sq = urlparse.parse_qsl(s)
>>> sq  
[('a', 'b'), ('c', 'd')]
>>> dict(sq)
{'a': 'b', 'c': 'd'}

you can get a sequence of pairs (urlencode accepts such an argument, too -- in this case it preserves order, while in the dict case there's no order to preserve;-). If you know there are no duplicate "keys", or don't care if there are, then (as I've shown) you can call dict to get a dictionary with non-list values. In general, however, you do need to consider what you want to do if duplicates are present (Python doesn't decide that on your behalf;-).
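One way to make the duplicate-key decision explicit is a small wrapper around parse_qsl. This is only a sketch for Python 3; the helper name query_to_dict and its on_duplicate parameter are hypothetical, not part of the standard library:

```python
from urllib.parse import parse_qsl


def query_to_dict(query, on_duplicate='error'):
    """Decode a query string into a flat dict (hypothetical helper).

    on_duplicate chooses what happens when a key repeats:
    'error' raises ValueError, 'first' keeps the first value,
    'last' keeps the last value.
    """
    if on_duplicate not in ('error', 'first', 'last'):
        raise ValueError('unknown on_duplicate policy: %r' % (on_duplicate,))
    result = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key in result:
            if on_duplicate == 'error':
                raise ValueError('duplicate key in query string: %r' % (key,))
            if on_duplicate == 'first':
                continue  # keep the value seen earlier
        result[key] = value
    return result


print(query_to_dict('a=b&c=d'))
print(query_to_dict('a=b&a=z', on_duplicate='last'))
```

Defaulting to 'error' is a design choice for the sketch: it forces the caller to notice duplicates instead of silently dropping values.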

Heaves answered 22/8, 2010 at 19:2 Comment(1)
Upvote for Python 2; however, in Python 3 it is all in the urllib module. See @phobie's answer. - Perfumer
24

Python 3 version based on Alex's answer:

>>> import urllib.parse
>>> d = {'a':'x', 'b':'', 'c':'z'}
>>> s = urllib.parse.urlencode(d)
>>> s
'a=x&b=&c=z'
>>> d1 = urllib.parse.parse_qs(s, keep_blank_values=True)
>>> d1
{'a': ['x'], 'b': [''], 'c': ['z']}

The alternative:

>>> sq = urllib.parse.parse_qsl(s, keep_blank_values=True)
>>> sq
[('a', 'x'), ('b', ''), ('c', 'z')]
>>> dict(sq)
{'a': 'x', 'b': '', 'c': 'z'}

parse_qsl is reversible:

>>> urllib.parse.urlencode(sq)
'a=x&b=&c=z'
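The round trip depends on keep_blank_values=True. A short sketch comparing the two settings on the same string; the variable names are illustrative:

```python
from urllib.parse import parse_qsl, urlencode

s = 'a=x&b=&c=z'

# Default parsing drops keys whose value is empty, so 'b' is lost.
default_pairs = parse_qsl(s)
lossy = urlencode(default_pairs)

# Keeping blank values preserves 'b', so encoding restores the input.
kept_pairs = parse_qsl(s, keep_blank_values=True)
lossless = urlencode(kept_pairs)

print(lossy)
print(lossless)
```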

Keep possible duplicates in mind when parsing user input:

>>> s = 'a=x&b=&a=z'
>>> d1 = urllib.parse.parse_qs(s, keep_blank_values=True)
>>> d1
{'a': ['x', 'z'], 'b': ['']}
>>> sq = urllib.parse.parse_qsl(s, keep_blank_values=True)
>>> sq
[('a', 'x'), ('b', ''), ('a', 'z')]
>>> dict(sq)
{'a': 'z', 'b': ''}
  1. The lists in the parse_qs result may have more than one item
  2. Calling dict on the parse_qsl result may hide values
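If hidden values matter, one possible check is to look for keys with more than one value before flattening. A minimal sketch; the name repeated is an illustrative assumption:

```python
from urllib.parse import parse_qs

s = 'a=x&b=&a=z'

parsed = parse_qs(s, keep_blank_values=True)

# Keys that appear more than once; dict(parse_qsl(s)) would keep
# only the last value for each of these.
repeated = {key: values for key, values in parsed.items() if len(values) > 1}
print(repeated)
```

An empty result means that calling dict on the parse_qsl output loses nothing.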
Glop answered 17/4, 2012 at 0:2 Comment(1)
It is only reversible when the query was parsed using keep_blank_values=True. - Betthezel
16

urllib.unquote_plus() (urllib.parse.unquote_plus() in Python 3) does what you want. It replaces %xx escapes with their single-character equivalents and replaces plus signs with spaces.

Example:

urllib.unquote_plus('/%7Ecandidates/?name=john+connolly')

yields

'/~candidates/?name=john connolly'.
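For reference, a Python 3 sketch of the same call, where the function lives in urllib.parse. Since unquote_plus returns a string rather than a dict, the sketch also shows one way to get a dict from the same URL by splitting off the query and passing it to parse_qs; the variable names are illustrative:

```python
from urllib.parse import parse_qs, unquote_plus, urlsplit

url = '/%7Ecandidates/?name=john+connolly'

# Decode %xx escapes and plus signs across the whole string.
decoded = unquote_plus(url)
print(decoded)

# To get a dict instead, isolate the query part and parse it.
query = urlsplit(url).query
params = parse_qs(query)
print(params)
```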
Ericaericaceous answered 26/2, 2014 at 15:36 Comment(2)
He said he wanted a dict. So your answer is wrong. - Persistent
Yay, this is what I was looking for. - Enschede
