simplejson.loads() raises Invalid \escape: 'x'

I am learning how to use simplejson to decode a JSON file, but I ran into an "Invalid \escape" error. Here is the code:

import simplejson as json

def main():
    json.loads(r'{"test":"\x27"}')

if __name__ == '__main__':
    main()

And here is the error message:

Traceback (most recent call last):
  File "hello_world.py", line 7, in <module>
    main()
  File "hello_world.py", line 4, in main
    json.loads(r'{"test":"\x27"}')
  File "C:\Users\zhangkai\python\simplejson\__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 335, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 351, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "C:\Users\zhangkai\python\simplejson\scanner.py", line 36, in _scan_once
    return parse_object((string, idx + 1), encoding, strict, _scan_once, object_hook)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 185, in JSONObject
    value, end = scan_once(s, end)
  File "C:\Users\zhangkai\python\simplejson\scanner.py", line 34, in _scan_once
    return parse_string(string, idx + 1, encoding, strict)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 114, in py_scanstring
    raise ValueError(errmsg(msg, s, end))
ValueError: Invalid \escape: 'x': line 1 column 10 (char 10)

I thought a JSON parser was supposed to recognize this escape. What is wrong, and what should I do?

Boxer answered 28/11, 2010 at 8:49 Comment(1)
Related: missing double escape in a Windows file path: "python - json reading error json.decoder.JSONDecodeError: Invalid \escape" (Stack Overflow); octal escape: "python - Fixing invalid JSON escape" (Stack Overflow). - Quadruple

JSON has no hex escape (\xNN), unlike some languages (including JavaScript) and notations; details are on the JSON website (json.org). It does have a Unicode escape, \uNNNN, where NNNN is four hex digits, but no \x hex escape.
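To see the difference, here is a minimal sketch. It uses the standard library json module instead of simplejson so it runs without extra installs; that swap is my assumption, on the basis that both reject the \x escape. The variable names are just for illustration.

```python
import json

valid = r'{"test":"\u0027"}'    # JSON Unicode escape for an apostrophe
invalid = r'{"test":"\x27"}'    # hex escape: fine in JavaScript, not in JSON

decoded = json.loads(valid)
print(decoded)

try:
    json.loads(invalid)
    rejected = False
except ValueError as exc:       # json.JSONDecodeError subclasses ValueError
    rejected = True
    print("rejected:", exc)
```

The raw string prefix (r'...') matters here: without it, Python itself would turn \x27 into an apostrophe before the JSON parser ever saw it.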

Fight answered 28/11, 2010 at 8:51 Comment(9)
Thanks. So if the JSON file has \x notation, I should convert it myself first? - Boxer
@user308587: If the file has \x notation, it's not in JSON format. If you want to accept invalid JSON anyway, yes, you'd have to pre-process it yourself. Assuming you want to treat the \x the way JavaScript does, convert \xNN to \u00NN (e.g., \x27 becomes \u0027). FWIW, how \x and \u are handled by JavaScript -- not JSON -- is covered by Section 7.8.4 of the ECMAScript spec. But my read is that it really is just a matter of changing the x to a u and adding the leading zeroes. - Fight
@T.J.Crowder Can you please elaborate on "just a matter of changing the x to a u and adding the leading zeroes"? How do I do that with a character that is part of a big string? - Pelletier
@Volatil3: Say you have raw JSON in a string, for instance: str = '{"foo": "bar\\x23 testing 1 2 3 \\x23"}' You can convert those to \u notation with a simple replace: str2 = str.replace(/\\x/g, "\\u00") Then str2 will successfully parse, and you'll have an object with a property, foo, with the value "bar# testing 1 2 3 #" (because \x23 / \u0023 is #). - Fight
@Volatil3: Not load, JSON.parse. Assuming the text you're referring to is JSON. - Fight
@T.J.Crowder there is no json.parse in Python 2.x - Pelletier
@Volatil3: Ah, I didn't remember this question was originally about Python... My replace above may be suspect as well (I don't do Python, that was a JavaScript example); you'll have to massage it into the equivalent Python. - Fight
It is nonsense that JSON has no x notation; JavaScript eval accepts it, so it is valid JavaScript format. - Volkan
@Vitaliy: eval accepting it doesn't make it JSON. eval also accepts the string (function foo() { alert("Not JSON"); })() (jsfiddle.net/whghsf6o), but that's not JSON either. :-) While some actual JSON parsers (as opposed to eval) do accept \xNN notation (V8's, for instance), it is not valid JSON. Details in the JSON website linked above as well as the RFC and the Standard (pdf). \xNN in a string is valid JavaScript, but not valid JSON. - Fight
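For anyone who wants the Python equivalent of the JavaScript replace discussed in these comments, here is one possible sketch, not a definitive fix. The helper name hex_to_unicode_escapes is hypothetical, and it uses the standard library json module in place of simplejson. Instead of a plain replace, it matches every backslash escape, so that an escaped backslash followed by a literal "x27" is left alone.

```python
import json
import re

# Match any backslash escape so that an escaped backslash ("\\") consumes its
# partner and is never mistaken for the start of a \x escape.
_ESCAPE = re.compile(r'\\(?:x([0-9A-Fa-f]{2})|.)', re.DOTALL)

def hex_to_unicode_escapes(text):
    # Hypothetical helper: rewrite hex escapes as four-digit Unicode escapes
    # and leave every other escape sequence untouched.
    def fix(match):
        hex_digits = match.group(1)
        if hex_digits is None:
            return match.group(0)      # not a \x escape, keep as-is
        return '\\u00' + hex_digits    # \x23 becomes \u0023
    return _ESCAPE.sub(fix, text)

raw = r'{"foo": "bar\x23 testing 1 2 3 \x23"}'
fixed = hex_to_unicode_escapes(raw)
result = json.loads(fixed)
print(result)
```

Note that this rewrites the whole text, not only the parts inside string literals. That should be harmless for otherwise valid input, since a backslash outside a string is not valid JSON anyway.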

This is expected behavior from a parser, as that JSON is invalid; within a string, a backslash may be followed only by ", \, /, b, f, n, r, t or u (which must then be followed by 4 hex digits). An x is not allowed. See the spec at http://json.org/
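A quick way to check that list is to feed each escape to a parser. This is only a sketch using the standard library json module (my substitution for simplejson); the helper name parses is made up for the example.

```python
import json

# Escapes permitted after a backslash inside a JSON string.
allowed = ['\\"', '\\\\', '\\/', '\\b', '\\f', '\\n', '\\r', '\\t', '\\u0041']

def parses(escape):
    # Hypothetical helper: True if a JSON string holding only this escape decodes.
    try:
        json.loads('"' + escape + '"')
        return True
    except ValueError:
        return False

# Print each escape alongside whether the parser accepted it.
for escape in allowed + ['\\x41']:
    print(escape, parses(escape))
```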

Transverse answered 28/11, 2010 at 8:52 Comment(0)

Try python-cjson:

import cjson

# Round-trip a dict through python-cjson's encoder and decoder.
s = cjson.encode({'abc': 123, 'def': 'xyz'})
print('json: %s - %s' % (type(s), s))
s = cjson.decode(s)
print('%s - %s' % (type(s), s))
Darmit answered 14/6, 2012 at 0:8 Comment(0)
