How can I escape certain characters while using python's urllib.urlencode()?
K

2

6

I have a dictionary that I want to urlencode as query parameters. The server that I am hitting expects the query to look like this: http://www.example.com?A=B,C

But when I try to use urllib.urlencode to build the URL, I find that the comma gets turned into %2C:

>>> import urllib
>>> urllib.urlencode({"A":"B,C"})
'A=B%2CC'

Is there any way I can escape the comma so that urlencode treats it like a normal character?

If not, how can I work around this problem?

Kuroshio answered 23/6, 2019 at 8:16 Comment(0)
E
1

You can do this by building the query string yourself and appending it to the URL before hitting the endpoint.

I used requests to make the request.

For example:

GET Request

import requests

url = "https://www.example.com/?"
query = "A=B,C"

url_final = url + query

# Use a separate name for the response instead of overwriting url
response = requests.get(url_final)

print(response.url)
# https://www.example.com/?A=B,C

The comma (along with some other characters) is defined in RFC 3986 as a reserved character. This means the comma can have a defined meaning in various parts of a URL, and where it is not being used with that meaning it needs to be percent-encoded.

That said, the query component doesn't give the comma any special syntax, so in query parameters we probably shouldn't be encoding it. It's not entirely Requests' fault, though: the parameters are encoded using urllib.urlencode(), and that is what percent-encodes the comma.

This isn't easy to fix, because some web services expect , and some expect %2C, and neither is wrong. You might just have to handle this encoding yourself.
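
As a rough illustration of "handling the encoding yourself", here is a minimal sketch, assuming Python 3 (where quote lives in urllib.parse). The helper name build_query and the extra "q" parameter are hypothetical, made up for this example; only the comma is left unescaped.

```python
from urllib.parse import quote


def build_query(params, safe=","):
    """Join key=value pairs, leaving the characters in `safe` unescaped.

    build_query is a hypothetical helper, not part of urllib or requests.
    """
    return "&".join(
        f"{quote(str(key), safe=safe)}={quote(str(value), safe=safe)}"
        for key, value in params.items()
    )


# The comma stays literal; the space and ampersand are still percent-encoded.
query = build_query({"A": "B,C", "q": "x y&z"})
url_final = "https://www.example.com/?" + query
print(url_final)
# https://www.example.com/?A=B,C&q=x%20y%26z
```

The resulting url_final can then be passed to requests.get() as a plain string, as in the example above. Note that quote encodes a space as %20 rather than +, which may or may not be what a given server expects.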

Elute answered 23/6, 2019 at 9:0 Comment(0)
A
5

You can keep certain characters from being escaped by listing them explicitly in the safe argument:

urllib.quote(str, safe='~()*!.\'')

More : https://docs.python.org/3.0/library/urllib.parse.html#urllib.parse.quote

Anglophile answered 23/6, 2019 at 8:20 Comment(3)
In my case, what should the value of str be? urllib.quote(urllib.urlencode({"A":"B,C"}), safe=',') does not give the correct answer. – Kuroshio
Also, urllib.urlencode({"A": urllib.quote("B,C", safe=',')}) does not give the correct answer. So I am not sure what your answer means or how to use it. Please clarify. – Kuroshio
My bad, you cannot mark characters as safe while encoding a dictionary in Py2, although you can use an ugly hack: str.replace("%2C", ","). In Py3, however, you can use urllib.parse.urlencode({"hello":"w,b"}, safe=","). – Anglophile
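
To make the last comment concrete, here is a short sketch assuming Python 3, applied to the dictionary from the question. It shows both the safe argument of urllib.parse.urlencode and the str.replace workaround; the variable names are only for illustration.

```python
from urllib.parse import urlencode

params = {"A": "B,C"}

# Default behaviour: the comma is percent-encoded.
default_query = urlencode(params)
print(default_query)   # A=B%2CC

# Python 3: characters listed in `safe` are left as-is.
comma_query = urlencode(params, safe=",")
print(comma_query)     # A=B,C

# The "ugly hack": encode first, then turn %2C back into a comma.
hacked_query = urlencode(params).replace("%2C", ",")
print(hacked_query)    # A=B,C
```

The replace approach rewrites every %2C in the string, including any that appear in keys, so it is only appropriate when that is acceptable for the target server.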

© 2022 - 2024 — McMap. All rights reserved.