If I do
url = "http://example.com?p=" + urllib.quote(query)
- It doesn't encode / to %2F (breaks OAuth normalization)
- It doesn't handle Unicode (it throws an exception)
Is there a better library?
From the Python 3 documentation:
urllib.parse.quote(string, safe='/', encoding=None, errors=None)
Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-~' are never quoted. By default, this function is intended for quoting the path section of a URL. The optional safe parameter specifies additional ASCII characters that should not be quoted; its default value is '/'.
That means passing '' for safe will solve your first issue:
>>> import urllib.parse
>>> urllib.parse.quote('/test')
'/test'
>>> urllib.parse.quote('/test', safe='')
'%2Ftest'
(The function quote was moved from urllib to urllib.parse in Python 3.)
By the way, have a look at urlencode.
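Applied to the URL from the question, the first fix might look like the sketch below. It assumes Python 3; `build_url` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import quote

def build_url(base, query):
    # safe='' makes quote() escape "/" too; str input is encoded as UTF-8 by default
    return base + "?p=" + quote(query, safe="")

print(build_url("http://example.com", "a/b Müller"))
# http://example.com?p=a%2Fb%20M%C3%BCller
```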
About the second issue, there was a bug report about it and it was fixed in Python 3.
For Python 2, you can work around it by encoding as UTF-8 like this:
>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
Which is what urllib.quote is dealing with. – Kindhearted
urllib.parse.quote docs – Greenwich
six.moves.urllib.parse.quote(u"Müller".encode('utf8')) for Python 2 and 3. – Ephraim
urllib.parse.quote('http://example.com/some path/').replace('%3A', ':') – Electronics
urllib.parse.quote(url, safe=':/'). Even better, encode some path, then join strings. This is Python, not PHP. – Tank
In Python 3, urllib.quote has been moved to urllib.parse.quote, and it does handle Unicode by default.
>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'
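One of the comments above suggests encoding only the variable piece of the path and then joining strings, rather than quoting a whole URL and patching the result. A minimal sketch of that idea follows; `join_path` is a hypothetical helper name, not a library function.

```python
from urllib.parse import quote

def join_path(base, *segments):
    # Quote each segment on its own; safe='' means a "/" inside a segment
    # is escaped and cannot introduce an extra path level.
    encoded = [quote(segment, safe="") for segment in segments]
    return "/".join([base.rstrip("/")] + encoded)

print(join_path("http://example.com", "some path", "a/b"))
# http://example.com/some%20path/a%2Fb
```

Because the scheme and host never pass through quote, there is no %3A to undo afterwards.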
quote is rather vague as a global. It might be nicer to use something like urlencode: from urllib.parse import quote as urlencode. – Jadajadd
There is already a urlencode in urllib.parse that does something completely different, so you'd be better off picking another name or risk seriously confusing future readers of your code. – Pervious
quote is "rather vague". Rather than rename the variable/object to something else, you can leave the name fully qualified as urllib.parse.quote. Leaving it fully qualified does two things: it takes a little extra time typing, and it saves time reading and maintaining the code. – Abhorrent
I think the requests module is much better. It's based on urllib3.
You can try this:
>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
My answer is similar to Paolo's answer.
requests.utils.quote is a link to Python's quote. See the requests sources. – Profit
requests.utils.quote is a thin compatibility wrapper around urllib.quote for Python 2 and urllib.parse.quote for Python 3. – Kindhearted
If you're using Django, you can use urlquote:
>>> from django.utils.http import urlquote
>>> urlquote(u"Müller")
u'M%C3%BCller'
Note that changes to Python mean that this is now a legacy wrapper. From the Django 2.1 source code for django.utils.http:
A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)
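Since the wrapper only forwards to the standard library, the same result should be available on Python 3 without Django. As far as I recall, urlquote was deprecated in Django 3.0 and removed in 4.0, so check the release notes for the version you are on. A minimal equivalent:

```python
from urllib.parse import quote

# str input is encoded as UTF-8 by default on Python 3
encoded = quote("Müller")
print(encoded)  # M%C3%BCller
```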
It is better to use urlencode here. There isn't much difference for a single parameter, but, IMHO, it makes the code clearer. (A function named quote_plus looks confusing, especially to those coming from other languages.)
In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'
In [22]: val=34
In [23]: from urllib.parse import urlencode
In [24]: encoded = urlencode(dict(p=query,val=val))
In [25]: print(f"http://example.com?{encoded}")
http://example.com?p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
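Note that urlencode uses quote_plus by default, so spaces become +. If the target (for example an OAuth 1.0 signature base string) expects %20 instead, the quote_via parameter can swap in quote. This sketch assumes Python 3.5 or later, where quote_via is available:

```python
from urllib.parse import quote, urlencode

params = {"p": "lskdfj/sdfkjdf/ksdfj skfj", "val": 34}

# Default quote_via=quote_plus: spaces become "+"
plus_encoded = urlencode(params)

# quote_via=quote: spaces become "%20"; "/" is still escaped because
# urlencode passes safe='' through to the quoting function
percent_encoded = urlencode(params, quote_via=quote)

print(plus_encoded)     # p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34
print(percent_encoded)  # p=lskdfj%2Fsdfkjdf%2Fksdfj%20skfj&val=34
```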
An alternative method using furl:
import furl
url = "https://httpbin.org/get?hello,world"
print(url)
url = furl.furl(url).url
print(url)
Output:
https://httpbin.org/get?hello,world
https://httpbin.org/get?hello%2Cworld
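For comparison, a similar result for this particular URL can be had with the standard library alone. This is only a rough sketch, not a reimplementation of furl: `encode_query` is a hypothetical helper, and it would double-encode a query string that already contains percent escapes.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def encode_query(url):
    parts = urlsplit(url)
    # Keep "=" and "&" so key=value pairs survive; everything else that is
    # not a letter, digit, or one of "_.-~" is percent-encoded.
    encoded_query = quote(parts.query, safe="=&")
    return urlunsplit(parts._replace(query=encoded_query))

print(encode_query("https://httpbin.org/get?hello,world"))
# https://httpbin.org/get?hello%2Cworld
```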