JSON - is there any XML CDATA equivalent?
M

4

37

I'm looking for a way to make a JSON parser take a value as is (as if it were CDATA) rather than trying to parse or escape it. We use both .NET and Java (client and server), so the answer should be about the JSON structure itself. Is there any way to achieve this?

Thanks.

Marotta answered 18/2, 2013 at 12:3 Comment(0)
H
22

There is no XML CDATA equivalent in JSON. But you can encode your message in a string literal using something like base64. See this question for more details.
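As a rough illustration of the base64 approach, here is a minimal Python sketch. The field names `encoding` and `data` and the helper names `wrap` and `unwrap` are made up for this example; nothing in JSON prescribes them. The same pattern should carry over to .NET and Java, since only the standard base64 alphabet ends up inside the JSON string.

```python
import base64
import json


def wrap(raw: str) -> str:
    """Return a JSON document carrying `raw` as base64 text (hypothetical layout)."""
    payload = base64.b64encode(raw.encode("utf-8")).decode("ascii")
    return json.dumps({"encoding": "base64", "data": payload})


def unwrap(document: str) -> str:
    """Recover the original text from a document produced by wrap()."""
    obj = json.loads(document)
    return base64.b64decode(obj["data"]).decode("utf-8")


if __name__ == "__main__":
    original = 'line one\n\t"quoted" and \\backslashed <tag> text'
    doc = wrap(original)
    print(doc)                      # the payload contains no characters that need JSON escaping
    print(unwrap(doc) == original)  # round trip check
```

The trade-off is that the payload is no longer readable in the JSON file and grows by roughly a third.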

Homo answered 18/2, 2013 at 12:16 Comment(2)
Putting binary data into JSON appears best/easiest accomplished with Base64 encoding. Ref: stackoverflow.com/questions/1443158.Haunt
Even though it could be the best solution, it is still a terrible idea. JSON is supposed to be human-readable and machine-readable (just like XML), and encoding breaks this pattern. Also, the string can get very big.Tallulah
K
0

This is a development of Raman's suggestion above.

I love the JSON format, but there are two things I want to be able to do with it and cannot:

  1. Paste some arbitrary text into a value using a text editor
  2. Transparently convert between XML and JSON if the XML contains CDATA sections.

This thread is germane to both these issues.

I am proposing to overcome this in the following manner, which doesn't break the formal definition of JSON, and I wonder if I'm storing up any problems if I do this?

  1. Define a JSON-compatible string format as follows:

    "<![CDATA[ (some text, escaped according to JSON rules) ]]>"

  2. Write an Unescape routine in my favorite programming language, which unescapes anything between <![CDATA[ and ]]>. This will be called before offering any JSON file to my text editor.

  3. Write the complementary routine to call after editing the file, which re-escapes anything between <![CDATA[ and ]]> according to JSON rules.

Then in order to paste any arbitrary data into the file, all I need to do is signal the start and end of the arbitrary data within a JSON string by typing <![CDATA[ before and ]]> after it.

This is a routine to call before and after text editing, in Python 3:

import os

escape_list = {
    8 : 'b',
    9 : 't',
    10: 'n',
    12: 'f',
    13: 'r',
    34: '"',
}   #List of ASCII character codes to escape, with their escaped equivalents

escape_char = "\\"  #this must be dealt with separately
unlikely_string = "ZzFfGgQqWw"

shebang = "#!/json/unesc\n"
start_cdata = "<![CDATA["
end_cdata = "]]>"

def escapejson(json_path):

    if (os.path.isfile(json_path)): #If it doesn't exist, we can't update it
        with open(json_path) as json_in:
            data_in = json_in.read()   #use read() because we're going to treat it as text
        #Set direction of escaping
        if (data_in[:len(shebang)] == shebang):   #data is unescaped, so re-escape
            data_in = data_in[len(shebang):] 
            unescape = False
            data_out = ""
        else:
            data_out = shebang
            unescape = True 

        while (data_in != ""):  #while there is still some input to deal with
            x = data_in.find(start_cdata)
            x1 = data_in.find(end_cdata)
            if (x > -1):    #something needs escaping
                if (x1 <0):
                    print ("Unterminated CDATA section!")
                    exit()
                elif (x1 < x):  #end before next start
                    print ("Extra CDATA terminator!")
                    exit()
                data_out += data_in[:x]
                data_in = data_in[x:]
                y = data_in.find(end_cdata) + len(end_cdata)
                to_fix = data_in[:y]    #this is what we're going to (un)escape
                if (to_fix[len(start_cdata):].find(start_cdata) >= 0):
                    print ("Nested CDATA sections not supported!")
                    exit()
                data_in = data_in[y:]   #chop data to fix from front of source
                if (unescape):
                    to_fix = to_fix.replace(escape_char + escape_char,unlikely_string)
                    for each_ascii in escape_list:
                        to_fix = to_fix.replace(escape_char + escape_list[each_ascii],chr(each_ascii))
                    to_fix = to_fix.replace(unlikely_string,escape_char)
                else:
                    to_fix = to_fix.replace(escape_char,escape_char + escape_char)
                    for each_ascii in escape_list:
                        to_fix = to_fix.replace(chr(each_ascii),escape_char + escape_list[each_ascii],)
                data_out += to_fix
            else:
                if (x1 > -1):   #find() returns -1 when absent, so 0 is also a match
                    print ("Termination without start!")
                    exit()
                data_out += data_in
                data_in = ""

        #Save all to file of same name in same location
        try:
            with open(json_path, 'w') as outfile:
                outfile.write(data_out)
        except IOError as e:
            print("Writing "+ json_path + " failed "+ str(e))
    else:
        print("JSON file not found")

Operating on the following legal JSON data

{
    "test": "<![CDATA[\n We can put all sorts of wicked things like\n \\slashes and\n \ttabs and \n \"double-quotes\"in here!]]>"
}

...will produce the following:

#!/json/unesc
{
    "test": "<![CDATA[
 We can put all sorts of wicked things like
 \slashes and
    tabs and 
 "double-quotes"in here!]]>"
}

In this form, you can paste in any text between the markers. Calling the routine again will change it back to the original legal JSON.
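For comparison, here is a shorter sketch of the same round trip that leans on Python's `json` module for the escaping rules instead of a hand-written table. It is not the routine above: it works on a string rather than a file, skips the shebang marker, and the names `unescape_regions` and `escape_regions` are invented for this example. Like the routine above, it assumes the edited text never contains `]]>` itself.

```python
import json
import re

# One marked region; the marker strings follow the convention described above.
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def unescape_regions(json_text: str) -> str:
    """Turn JSON escapes inside each marked region into raw characters."""
    def fix(match):
        # Decode the region as the body of a JSON string literal.
        raw = json.loads('"' + match.group(1) + '"')
        return "<![CDATA[" + raw + "]]>"
    return CDATA_RE.sub(fix, json_text)


def escape_regions(edited_text: str) -> str:
    """Re-escape the raw text inside each marked region per JSON rules."""
    def fix(match):
        # json.dumps adds surrounding quotes; strip them to keep only the body.
        escaped = json.dumps(match.group(1), ensure_ascii=False)[1:-1]
        return "<![CDATA[" + escaped + "]]>"
    return CDATA_RE.sub(fix, edited_text)


if __name__ == "__main__":
    legal = r'{"test": "<![CDATA[\n \\slashes and\n \ttabs and \"quotes\"]]>"}'
    editable = unescape_regions(legal)
    print(editable)                             # not valid JSON while in this form
    print(escape_regions(editable) == legal)    # round trip check
```

Going through `json.loads` and `json.dumps` also covers `\uXXXX` escapes and the remaining control characters, which a fixed lookup table would have to list one by one.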

I think this can also be made to work when converting to/from XML with CDATA regions. (I'm going to try that next!)
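In the XML-to-JSON direction, a standard XML parser and a JSON serializer already do most of the work. The sketch below uses Python's `xml.etree.ElementTree` and `json`; the sample document and the helper name `children_to_json` are made up for illustration, and it only handles a flat element with text-only children.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document with a CDATA section.
XML_DOC = '<note><body><![CDATA[if (a < b && c > "d") { run(); }]]></body></note>'


def children_to_json(xml_text: str) -> str:
    """Map each child element's text to a JSON string value."""
    root = ET.fromstring(xml_text)
    # The parser hands back CDATA content as ordinary text;
    # json.dumps then applies whatever escaping JSON requires.
    return json.dumps({child.tag: child.text for child in root})


if __name__ == "__main__":
    print(children_to_json(XML_DOC))
```

Note that the parser does not report where the CDATA section began or ended, so this direction loses that information. Going back from JSON to XML with the CDATA section restored would need an explicit marker such as the one proposed above.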

Kirbee answered 25/8, 2020 at 9:0 Comment(6)
Unfortunately, Stack Overflow's editor removed my CDATA strings, so it doesn't make sense. Here's the same again, with a few spaces added:Kirbee
stackoverflow.com/editing-help Did I miss something?Vacla
Not at all, Yunnosch! You and I were both doing the same thing! :-) Thank you for your help.Kirbee
I spotted a slight error in my code above: if(unescape): .... else: should be: ` to_fix = to_fix.replace(escape_char + escape_char,unlikely_string) #doublequotes to unlikely string for each_ascii in escape_list: to_fix = to_fix.replace(escape_char + escape_list[each_ascii],chr(each_ascii)) to_fix = to_fix.replace(unlikely_string,escape_char)` ... where unlikely_string is some long string value never likely to occur in the dataKirbee
Chris, you should really embrace the idea of editing your posts instead of commenting on them.Vacla
You're right, of course. I can't do it right now, but I'll revisit it later and edit properly.Kirbee
P
0

You can create a YAML file and convert to JSON. For example:

test.yaml

storage:
  files:
  - filesystem: root
    path: /etc/sysconfig/network/ifcfg-eth0
    mode: 644
    overwrite: true
    contents:
      source: |
        data:,
        IPV6INIT=yes
        IPV6_AUTOCONF=yes

... then run yaml2json_pretty (shown later), like this:

#!/bin/bash

cat test.yaml | yaml2json_pretty > test.json

... which produces:

test.json

{
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/etc/sysconfig/network/ifcfg-eth0",
        "mode": 644,
        "overwrite": true,
        "contents": {
          "source": "data:,\nIPV6INIT=yes\nIPV6_AUTOCONF=yes\n"
        }
      }
    ]
  }
}

This is the source code of yaml2json_pretty:

#!/usr/bin/env python3

import sys, yaml, json
# safe_load only builds plain data types, which is all a YAML-to-JSON conversion needs
print(json.dumps(yaml.safe_load(sys.stdin.read()), sort_keys=False, indent=2))

More tricks similar to this yaml2json_pretty at: http://github.com/frgomes/bash-scripts

Politesse answered 10/2, 2021 at 2:0 Comment(0)
H
-9

http://www.json.org/ describes the JSON format in detail. According to it, JSON doesn't support a "something like CDATA" value type.

To achieve a CDATA-like structure, you can apply custom logic to handle string-based values (and do it the same way in both the .NET and Java implementations). E.g.

{ 
  "type" : "CDATA",
  "value" : "Value that I will handle with my custom logic on java and .net side"
}
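A minimal sketch of what that custom logic might look like, written in Python for brevity (the .NET and Java versions would follow the same shape). The helper names `make_cdata` and `read_cdata` are hypothetical. The value is still an ordinary JSON string, so the serializer escapes it on the way out and the parser unescapes it on the way in; raw text cannot be pasted into the file by hand.

```python
import json


def make_cdata(raw: str) -> str:
    """Serialize raw text in the wrapper shape shown above."""
    return json.dumps({"type": "CDATA", "value": raw})


def read_cdata(document: str) -> str:
    """Return the raw text from a wrapper object, rejecting other shapes."""
    obj = json.loads(document)
    if not isinstance(obj, dict) or obj.get("type") != "CDATA":
        raise ValueError("not a CDATA wrapper object")
    return obj["value"]


if __name__ == "__main__":
    raw = 'value with "double quotes", a \\ backslash and a : colon'
    doc = make_cdata(raw)
    print(doc)                     # quotes and backslashes appear escaped here
    print(read_cdata(doc) == raw)  # round trip check
```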
Herschel answered 18/2, 2013 at 12:14 Comment(2)
This will fail, of course, since the 'value' cannot contain literal data without escaping certain characters.Tallulah
The point is to be able to put raw strings into the value as they are, e.g. a value containing double quotes like " and : and more.Farny
