To encode the URI, I used urllib.quote("schönefeld"), but when the string contains non-ASCII characters, it throws:
KeyError: u'\xe9'
Code: return ''.join(map(quoter, s))
My input strings are köln, brønshøj, schönefeld, etc.
When I tried just printing the statements on Windows (Python 2.7, PyScripter IDE), it worked, but on Linux it raises the exception (I guess the platform shouldn't matter).
This is what I am trying:
from urllib import quote
from commands import getstatusoutput
queryParams = "schönefeld"
# percent-encode the query before appending it to the base URL
cmdString = "http://baseurl" + quote(queryParams)
print getstatusoutput(cmdString)
Exploring the cause of the issue: in urllib.quote(), the exception is actually thrown at return ''.join(map(quoter, s)).
The code in urllib is:
def quote(s, safe='/'):
    if not s:
        if s is None:
            raise TypeError('None object cannot be quoted')
        return s
    cachekey = (safe, always_safe)
    try:
        (quoter, safe) = _safe_quoters[cachekey]
    except KeyError:
        safe_map = _safe_map.copy()
        safe_map.update([(c, c) for c in safe])
        quoter = safe_map.__getitem__
        safe = always_safe + safe
        _safe_quoters[cachekey] = (quoter, safe)
    if not s.rstrip(safe):
        return s
    return ''.join(map(quoter, s))
The reason for the exception is in ''.join(map(quoter, s)): for every element in s, the quoter function is called, and finally the resulting list is joined with '' and returned.
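To make the lookup-and-join step concrete, here is a simplified model of a table-driven quoter. It is a sketch, not urllib's actual code: TABLE, SAFE_BYTES and table_quote are hypothetical names, the safe parameter is ignored, and the table deliberately holds exactly one entry per byte value.

```python
import string

# Bytes that are left unescaped (letters, digits and "_.-").
SAFE_BYTES = bytearray((string.ascii_letters + string.digits + "_.-").encode("ascii"))

# One entry per possible byte value (0..255) and nothing else.
TABLE = {
    i: chr(i) if i in SAFE_BYTES else "%{:02X}".format(i)
    for i in range(256)
}

def table_quote(data):
    """Quote a byte string by looking up every byte in TABLE and joining the results."""
    # bytearray() yields integers 0..255, so every lookup is guaranteed to hit.
    return "".join(TABLE[b] for b in bytearray(data))

print(table_quote(b"k\xc3\xb6ln"))  # k%C3%B6ln
```

Because the table only knows byte values, anything that is not a byte string has no guaranteed entry, which is the kind of failed lookup that surfaces as a KeyError.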
For the non-ASCII character è, the corresponding entry in the _safe_map variable is %E8. But when I call quote('è'), it looks up the key \xe8, which does not exist, so the exception is thrown.
So I modified the function to run s = [el.upper().replace("\\X","%") for el in s] before calling ''.join(map(quoter, s)), inside a try-except block. Now it works fine.
But I am worried about whether what I have done is the correct approach, or whether it will cause other issues. Also, I have 200+ Linux instances, which makes it very tough to deploy this fix on all of them.
urllib.quote('sch\xe9nefeld'). You only get the error for urllib.quote(u'sch\xe9nefeld') (note the u'' unicode literal). – Cornfield
cmdString = "http://baseurl" + quote("schönefeld") this should be like cmdString = u"http://baseurl" + quote(u"schönefeld")? – Mcglone
quote() unicode values. For byte strings (already encoded) this doesn't happen. – Cornfield
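Following the comments above, one way to avoid patching urllib on every machine may be to encode unicode values to bytes before passing them to quote(). This is a minimal sketch, assuming UTF-8 is the encoding the receiving server expects. quote_utf8 is a hypothetical helper name, and the import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

def quote_utf8(value, safe='/'):
    """Percent-encode value, encoding text to UTF-8 bytes first (assumed encoding)."""
    if not isinstance(value, bytes):
        # unicode in Python 2, str in Python 3
        value = value.encode('utf-8')
    return quote(value, safe)

print(quote_utf8(u"sch\xf6nefeld"))  # sch%C3%B6nefeld
```

Byte strings that are already encoded pass through unchanged, so call sites that worked before keep producing the same output.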