NSJSONSerialization serialization of a string containing forward slashes / and HTML is escaped incorrectly
M

3

12

I am trying to put some simple HTML into a string value in a JSON object, and I can't get NSJSONSerialization to stop escaping the string.

Example... I have a string which contains some basic HTML text:

NSString *str = @"<html><body><p>Samples / Text</p></body></html>";

The desired outcome is JSON with HTML as the value:

{
    "Title":"My Title",
    "Instructions":"<html><body><p>Samples / Text</p></body></html>"
}

I'm using the standard technique to convert an NSDictionary to a NSString containing JSON:

NSMutableDictionary *dict = [NSMutableDictionary dictionary];
[dict setObject:str forKey:@"Instructions"];
[dict setObject:@"My Title" forKey:@"Title"];

NSError *err;
NSData *data = [NSJSONSerialization dataWithJSONObject:dict options:NSJSONWritingPrettyPrinted error:&err];
NSString *resultingString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
NSLog(@"%@", resultingString);

The JSON produced by this method is valid, however the HTML has all forward slashes escaped:

{
    "Title":"My Title",
    "Instructions":"<html><body><p>Samples \/ Text<\/p><\/body><\/html>"
}

This creates invalid HTML in the instructions JSON string.

I'd like to stick with NSJSONSerialization since we're using it everywhere else in our framework, and I've been burned before when switching to non-Apple libraries that later lost support. I've tried many different string encodings and all of them escape the forward slashes.

Apparently \/ is a valid representation of the / character in JSON (and JavaScript), which is why the forward slash is escaped (even the StackOverflow text editor escaped it). See "escaping json string with a forward slash?" and also "JSON: why are forward slashes escaped?". I just don't want it to do that, and there doesn't seem to be a way to stop iOS from escaping forward slashes in string values when serializing.

Military answered 3/4, 2013 at 17:55 Comment(4)
Another good reason to not use NSJSONSerialization, I suppose. One could always scan the data and replace adjacent "\/" characters with "/", but kinda messy. - Bugbear
That's what I'm doing for now, which feels like a hack. Unfortunately every third party iOS framework I've used so far has been de-supported after the original authors got bored or busy. - Military
We're still using SBJSON. And worst case you can write your own -- it's really only about 2K lines of code, so long as you don't get too fancy. - Bugbear
The serialization is perfectly valid. Any decent deserializer will get the original string back without any problem. - Apeman
S
5

I believe NSJSONSerialization is behaving as designed with regard to encoding HTML.

If you look at some questions (1, 2) on encoding HTML in JSON you'll see the answers always mention escaping the forward slashes.

JSON doesn't require forward slashes to be escaped, but HTML doesn't allow a JavaScript string inside a <SCRIPT> element to contain </, as it can be confused with the end of the <SCRIPT> tag.

See the answers here, here and, most directly, the w3.org HTML4 Appendix, which states in B.3.2 Specifying non-HTML data:

ILLEGAL EXAMPLE: 
The following script data incorrectly contains a "</" sequence (as part of "</EM>") before the SCRIPT end tag:

<SCRIPT type="text/javascript">
  document.write ("<EM>This won't work</EM>")
</SCRIPT>

Although this behaviour may cause issues for you, NSJSONSerialization is just playing by the age-old rules of encoding HTML data for use in <SCRIPT> tags.
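
To illustrate that the escaping is harmless once the JSON is parsed, here is a minimal sketch. The helper name RoundTripThroughJSON is made up for this example and is not part of Foundation. It serializes the HTML string and reads it back; the decoded value should equal the original, because \/ is just another spelling of / in JSON:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): wraps `html` in a one-key
// dictionary, serializes it to JSON, parses the JSON again, and returns
// the decoded value so it can be compared with the input.
static NSString *RoundTripThroughJSON(NSString *html) {
    NSDictionary *dict = @{ @"Instructions": html };

    // Serializing escapes every "/" as "\/" by default.
    NSData *data = [NSJSONSerialization dataWithJSONObject:dict
                                                   options:0
                                                     error:NULL];
    if (data == nil) {
        return nil;
    }

    // Parsing turns each "\/" back into a plain "/".
    NSDictionary *decoded = [NSJSONSerialization JSONObjectWithData:data
                                                            options:0
                                                              error:NULL];
    return decoded[@"Instructions"];
}
```

Calling RoundTripThroughJSON(@"<p>Samples / Text</p>") should hand back the same string, so the escaped form only matters if something reads the raw JSON text without parsing it.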

Sgraffito answered 8/12, 2013 at 0:14 Comment(0)
T
3

iOS 13 and later only: If you're not worried about producing invalid HTML sequences (as described in this answer), you can disable forward-slash escaping by passing the option NSJSONWritingWithoutEscapingSlashes to the serializer.

Example:

// batchUpdates is whatever JSON-compatible object you want to serialize
NSData *jsonData = [NSJSONSerialization dataWithJSONObject:batchUpdates
                                                   options:NSJSONWritingWithoutEscapingSlashes
                                                     error:nil];
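
If you also have to support older OS versions, one possible shape is to guard the option with an availability check and fall back to the string replacement mentioned in the comments on the question. This is only a sketch: JSONStringWithoutEscapedSlashes is a hypothetical helper name, and the fallback assumes the text came straight from NSJSONSerialization, which escapes every / so that each \/ pair really is an escaped slash.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns the JSON text for
// `object` with plain "/" characters instead of "\/".
static NSString *JSONStringWithoutEscapedSlashes(id object, NSError **error) {
    NSJSONWritingOptions options = NSJSONWritingPrettyPrinted;
    BOOL needsCleanup = YES;

    // The option only exists on iOS 13 / macOS 10.15 and later.
    if (@available(iOS 13.0, macOS 10.15, tvOS 13.0, watchOS 6.0, *)) {
        options |= NSJSONWritingWithoutEscapingSlashes;
        needsCleanup = NO;
    }

    NSData *data = [NSJSONSerialization dataWithJSONObject:object
                                                   options:options
                                                     error:error];
    if (data == nil) {
        return nil;
    }

    NSString *json = [[NSString alloc] initWithData:data
                                           encoding:NSUTF8StringEncoding];

    // Fallback for older systems. Assumption: the input is unmodified
    // NSJSONSerialization output, so every "\/" is an escaped slash.
    if (needsCleanup) {
        json = [json stringByReplacingOccurrencesOfString:@"\\/"
                                               withString:@"/"];
    }
    return json;
}
```

Either path yields JSON that still parses to the same values; the only difference is how the slashes are spelled in the raw text.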
Tangency answered 6/2, 2020 at 19:41 Comment(0)
R
0

Here's my subclass of AFJSONRequestSerializer that removes the \ before / symbols in the resulting JSON; handy if you use AFNetworking:

class SanitizedAFJSONRequestSerializer: AFJSONRequestSerializer
{
    override func requestBySerializingRequest(request: NSURLRequest!, withParameters parameters: AnyObject!, error: NSErrorPointer) -> NSURLRequest!
    {
        // Let AFNetworking build the request and its JSON body first.
        var request = super.requestBySerializingRequest(request, withParameters: parameters, error: error)

        if let jsonData = request.HTTPBody
        {
            // Decode the body back to text so the escaped slashes can be rewritten.
            if let jsonString = NSString(data: jsonData, encoding: NSUTF8StringEncoding) as? String
            {
                // Replace every "\/" in the JSON text with a plain "/".
                let sanitizedString = jsonString.stringByReplacingOccurrencesOfString("\\/", withString: "/", options: NSStringCompareOptions.CaseInsensitiveSearch, range:nil) as NSString

                println("sanitized json string: \(sanitizedString)")

                // NSURLRequest is immutable, so swap the body in on a mutable copy.
                var mutableRequest = request.mutableCopy() as! NSMutableURLRequest
                mutableRequest.HTTPBody = sanitizedString.dataUsingEncoding(NSUTF8StringEncoding)
                request = mutableRequest
            }
        }

        return request
    }
}
Rosamondrosamund answered 12/6, 2015 at 12:3 Comment(0)
