I have a unicode string in python code:
name = u'Mayte_Martín'
I would like to use it with a SPARQL query, which meant that I should encode the string using 'utf-8' and use urllib.quote_plus or requests.quote on it. However, both these quote functions behave strangely as can be seen when used with and without the 'safe' arguments.
from urllib import quote_plus
Without 'safe' argument:
quote_plus(name.encode('utf-8'))
Output: 'Mayte_Mart%C3%ADn'
With 'safe' argument:
quote_plus(name.encode('utf-8'), safe=':/')
Output:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-164-556248391ee1> in <module>()
----> 1 quote_plus(v, safe=':/')
/usr/lib/python2.7/urllib.pyc in quote_plus(s, safe)
1273 s = quote(s, safe + ' ')
1274 return s.replace(' ', '+')
-> 1275 return quote(s, safe)
1276
1277 def urlencode(query, doseq=0):
/usr/lib/python2.7/urllib.pyc in quote(s, safe)
1264 safe = always_safe + safe
1265 _safe_quoters[cachekey] = (quoter, safe)
-> 1266 if not s.rstrip(safe):
1267 return s
1268 return ''.join(map(quoter, s))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)
The problem seems to be with rstrip function. I tried to make some changes and call as...
quote_plus(name.encode('utf-8'), safe=u':/'.encode('utf-8'))
But that did not solve the issue. What could be the issue here?