Convert XML to JSON format

10

19

I need to convert a docx file (which is in Office Open XML format) to JSON. I would appreciate some guidelines on how to do it. Thanks in advance.

Palomino answered 25/2, 2011 at 4:44 Comment(2)
Consider using simplify-docx. - Hekking
The Underscore-java library has a static method U.xmlToJson(xml). - Geraint
13

You may take a look at the Json-lib Java library, which provides XML-to-JSON conversion.

import net.sf.json.JSON;
import net.sf.json.xml.XMLSerializer;

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();
JSON json = xmlSerializer.read(xml);

If you need the root tag too, simply add an outer dummy tag:

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();  
JSON json = xmlSerializer.read("<x>" + xml + "</x>");  
Interchange answered 25/2, 2011 at 6:4 Comment(2)
It would be good, but the result is {"test":"1.2","test2":"123"} and not {"hello":{"test":"1.2","test2":"123"}}; that is, it prints only the leaves (the <hello> tag is lost). The same happens if we add an intermediate node (a child of <hello> and parent of <test>): it is ignored. Is it a matter of configuration? - Complacency
In this case we have an XML value like <...>[2] jsfnek [5]<....>, but in the JSON we get only the first value, [2]. Can you please help me understand why the complete value is not read? - Buoy
10

There is no direct mapping between XML and JSON; XML carries with it type information (each element has a name) as well as namespacing. Therefore, unless each JSON object has type information embedded, the conversion is going to be lossy.

But that doesn't necessarily matter. What does matter is that the consumer of the JSON knows the data contract. For example, given this XML:

<books>
  <book author="Jimbo Jones" title="Bar Baz">
    <summary>Foo</summary>
  </book>
  <book title="Don't Care" author="Fake Person">
    <summary>Dummy Data</summary>
  </book>
</books>

You could convert it to this:

{
    "books": [
        { "author": "Jimbo Jones", "title": "Bar Baz", "summary": "Foo" },
        { "author": "Fake Person", "title": "Don't Care", "summary": "Dummy Data" }
    ]
}

And the consumer wouldn't need to know that each object in the books collection was a book object.
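As a rough illustration of converting against a known data contract, here is a minimal sketch that uses only the JDK's DOM parser. The class name BooksToJson and the quote helper are made up for this example, and the JSON is built by hand only to keep the sketch dependency-free; in a real project a JSON library would be the better choice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for this example only; not part of any library.
final class BooksToJson {

    // Converts a <books> document into { "books": [...] } following a fixed contract:
    // the author and title attributes and the <summary> child become sibling keys.
    static String convert(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            List<String> items = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                NodeList summaries = book.getElementsByTagName("summary");
                String summary = summaries.getLength() > 0
                        ? summaries.item(0).getTextContent().trim()
                        : "";
                items.add("{ \"author\": " + quote(book.getAttribute("author"))
                        + ", \"title\": " + quote(book.getAttribute("title"))
                        + ", \"summary\": " + quote(summary) + " }");
            }
            return "{ \"books\": [" + String.join(", ", items) + "] }";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

Note that the element name "book" does not appear anywhere in the output: the consumer is expected to know from the contract what the entries of "books" are.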

Edit:

If you have an XML Schema for the XML and are using .NET, you can generate classes from the schema using xsd.exe. Then, you could parse the source XML into objects of these classes, then use a DataContractJsonSerializer to serialize the classes as JSON.

If you don't have a schema, it will be hard to avoid defining your JSON format manually.

Cinquain answered 25/2, 2011 at 4:54 Comment(4)
How do I make this conversion? The input XML is somewhat complex, so the conversion has to be easy and perform well. Which language is preferred for this? - Palomino
This is exactly what XML-ValidatorBuddy can do for you. - Pigmentation
Jacob, I'm using your code sample and have successfully transformed an XML file to a JSON object. However, my XML file has a root element (<DATA>) and a grouping element (<AllCustomers>) under which all the <Customer> elements are nested. The <Customer> element is what I'm after, and it may contain any variation of attributes, child and grandchild elements, etc. Is there a way for me to return an ArrayList<JSONObject> with one entry per <Customer> element? In other words, can I specify something similar to XPath and still use SAX? P.S.: I can't use DOM due to size limits. - Fleet
Did you mean to post this on a different answer? I don't have any code sample. - Cinquain
6

The XML class in the org.json namespace provides you with this functionality.

You have to call the static toJSONObject method, which the documentation describes as follows:

Converts a well-formed (but not necessarily valid) XML string into a JSONObject. Some information may be lost in this transformation because JSON is a data format and XML is a document format. XML uses elements, attributes, and content text, while JSON uses unordered collections of name/value pairs and arrays of values. JSON does not like to distinguish between elements and attributes. Sequences of similar elements are represented as JSONArrays. Content text may be placed in a "content" member. Comments, prologs, DTDs, and <[ [ ]]> are ignored.
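To make those conventions concrete, here is a small JDK-only sketch that applies the same ideas: attributes and child elements become members of one object, repeated element names are collected into an array, and leftover text goes into a "content" member. This is not org.json's implementation; the class name SimpleXmlToJson and its helpers are invented for illustration, and unlike a real library it keeps every value as a string and ignores namespaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of org.json or any other library.
final class SimpleXmlToJson {

    static String convert(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Map<String, Object> top = new LinkedHashMap<>();
            top.put(root.getTagName(), toValue(root));   // keep the root element name
            return write(top);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // A text-only element becomes a string; anything else becomes a map.
    static Object toValue(Element el) {
        Map<String, Object> obj = new LinkedHashMap<>();
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            add(obj, attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
        StringBuilder text = new StringBuilder();
        NodeList children = el.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) add(obj, child.getNodeName(), toValue((Element) child));
            else if (child instanceof Text) text.append(child.getNodeValue());
        }
        String content = text.toString().trim();
        if (obj.isEmpty()) return content;
        if (!content.isEmpty()) add(obj, "content", content);
        return obj;
    }

    // A name seen for the second time turns its value into a list (a JSON array).
    @SuppressWarnings("unchecked")
    static void add(Map<String, Object> obj, String key, Object value) {
        Object existing = obj.get(key);
        if (existing == null) {
            obj.put(key, value);
        } else if (existing instanceof List) {
            ((List<Object>) existing).add(value);
        } else {
            List<Object> list = new ArrayList<>();
            list.add(existing);
            list.add(value);
            obj.put(key, list);
        }
    }

    // Serializes the map/list/string tree built above.
    @SuppressWarnings("unchecked")
    static String write(Object value) {
        List<String> parts = new ArrayList<>();
        if (value instanceof Map) {
            for (Map.Entry<String, Object> e : ((Map<String, Object>) value).entrySet()) {
                parts.add(quote(e.getKey()) + ":" + write(e.getValue()));
            }
            return "{" + String.join(",", parts) + "}";
        }
        if (value instanceof List) {
            for (Object item : (List<Object>) value) parts.add(write(item));
            return "[" + String.join(",", parts) + "]";
        }
        return quote(value.toString());
    }

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

With this sketch, SimpleXmlToJson.convert("<hello><test2>123</test2></hello>") returns {"hello":{"test":"1.2","test2":"123"}}. The information loss described in the quoted documentation is visible here too: once converted, an attribute and a child element with the same name can no longer be told apart.
<test>
String a = SimpleXmlToJson.convert("<hello><test>1.2</test><test2>123</test2></hello>");
if (!a.equals("{\"hello\":{\"test\":\"1.2\",\"test2\":\"123\"}}")) throw new AssertionError(a);
String b = SimpleXmlToJson.convert("<a id=\"1\"><b>x</b><b>y</b>tail</a>");
if (!b.equals("{\"a\":{\"id\":\"1\",\"b\":[\"x\",\"y\"],\"content\":\"tail\"}}")) throw new AssertionError(b);
String c = SimpleXmlToJson.convert("<r><i>1</i><i>2</i><i>3</i></r>");
if (!c.equals("{\"r\":{\"i\":[\"1\",\"2\",\"3\"]}}")) throw new AssertionError(c);
</test>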

Eleaseeleatic answered 3/5, 2011 at 12:46 Comment(2)
This is definitely the simplest and cleanest approach. Thanks. - Autobahn
I was trying to use this, but it creates node/text-based XML. How do I create attribute-based XML? E.g., if my JSON data is {Order:{OrderLine:{ItemID:"1234"}},OrderNo:"4567"}, the required output is <Order OrderNo="4567"><OrderLine ItemID="1234"/></Order>, but the output I get is node-based. I am looking for attribute-based XML conversion. Please let me know if you have any suggestions. - Moluccas
6

If you are dissatisfied with the various implementations, try rolling your own. Here is some code I wrote this afternoon to get you started. It works with net.sf.json and Apache Commons Lang:

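import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import net.sf.json.JSONObject;

// Parses the stream with a namespace-aware SAX parser and returns the JSON built by the handler.
public static JSONObject readToJSON(InputStream stream) throws Exception {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true);
    SAXParser parser = factory.newSAXParser();
    SAXJsonParser handler = new SAXJsonParser();
    parser.parse(stream, handler);
    return handler.getJson();
}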

And the SAXJsonParser implementation:

package xml2json;

import net.sf.json.*;
import org.apache.commons.lang.StringUtils;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import java.util.ArrayList;
import java.util.List;

public class SAXJsonParser extends DefaultHandler {

    static final String TEXTKEY = "_text";

    JSONObject result;
    List<JSONObject> stack;

    public SAXJsonParser(){}
    public JSONObject getJson(){return result;}
    public String attributeName(String name){return "@"+name;}

    public void startDocument () throws SAXException {
        stack = new ArrayList<JSONObject>();
        stack.add(0,new JSONObject());
    }
    public void endDocument () throws SAXException {result = stack.remove(0);}
    public void startElement (String uri, String localName,String qName, Attributes attributes) throws SAXException {
        JSONObject work = new JSONObject();
        for (int ix=0;ix<attributes.getLength();ix++)
            work.put( attributeName( attributes.getLocalName(ix) ), attributes.getValue(ix) );
        stack.add(0,work);
    }
    public void endElement (String uri, String localName, String qName) throws SAXException {
        JSONObject pop = stack.remove(0);       // examine stack
        Object stashable = pop;
        if (pop.containsKey(TEXTKEY)) {
            String value = pop.getString(TEXTKEY).trim();
            if (pop.keySet().size()==1) stashable = value; // single value
            else if (StringUtils.isBlank(value)) pop.remove(TEXTKEY);
        }
        JSONObject parent = stack.get(0);
        if (!parent.containsKey(localName)) {   // add new object
            parent.put( localName, stashable );
        }
        else {                                  // aggregate into arrays
            Object work = parent.get(localName);
            if (work instanceof JSONArray) {
                ((JSONArray)work).add(stashable);
            }
            else {
                parent.put(localName,new JSONArray());
                parent.getJSONArray(localName).add(work);
                parent.getJSONArray(localName).add(stashable);
            }
        }
    }
    public void characters (char ch[], int start, int length) throws SAXException {
        JSONObject work = stack.get(0);            // aggregate characters
        String value = (work.containsKey(TEXTKEY) ? work.getString(TEXTKEY) : "" );
        work.put(TEXTKEY, value+new String(ch,start,length) );
    }
    public void warning (SAXParseException e) throws SAXException {
        System.out.println("warning  e=" + e.getMessage());
    }
    public void error (SAXParseException e) throws SAXException {
        System.err.println("error  e=" + e.getMessage());
    }
    public void fatalError (SAXParseException e) throws SAXException {
        System.err.println("fatalError  e=" + e.getMessage());
        throw e;
    }
}
Hooded answered 28/8, 2011 at 3:8 Comment(0)
4

If you need to be able to manipulate your XML before it gets converted to JSON, or want fine-grained control of your representation, go with XStream. It's really easy to convert between: xml-to-object, json-to-object, object-to-xml, and object-to-json. Here's an example from XStream's docs:

XML
<person>
  <firstname>Joe</firstname>
  <lastname>Walnes</lastname>
  <phone>
    <code>123</code>
    <number>1234-456</number>
  </phone>
  <fax>
    <code>123</code>
    <number>9999-999</number>
  </fax>
</person>
POJO (DTO)
public class Person {
    private String firstname;
    private String lastname;
    private PhoneNumber phone;
    private PhoneNumber fax;
    // ... constructors and methods
}
Convert from XML to POJO:
String xml = "<person>...</person>";
XStream xstream = new XStream();
Person person = (Person)xstream.fromXML(xml);
And then from POJO to JSON:
XStream xstream = new XStream(new JettisonMappedXmlDriver());
String json = xstream.toXML(person);

Note: although the method is named toXML(), XStream will produce JSON because the Jettison driver is used.

Underplot answered 25/2, 2011 at 6:22 Comment(0)
4

Converting complete docx files into JSON does not look like a good idea, because docx is a document-centric XML format and JSON is a data-centric format. XML in general is designed to be both document-centric and data-centric. Though it is technically possible to convert document-centric XML into JSON, handling the generated data might be overly complex. Try to focus on the data you actually need and convert only that part.
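As one possible way to "convert only that part", the sketch below pulls just the paragraph text out of a docx and emits it as a JSON array, using only JDK classes. It relies on a docx being a ZIP archive whose main part is word/document.xml, with paragraphs in w:p elements and text in w:t elements. The class name DocxParagraphs, its method names, and the "paragraphs" key are invented for this example; the namespace URI is the one used by transitional WordprocessingML, and strict-conformance files use a different one. Paragraphs nested inside other paragraphs (for example in text boxes) would have their text counted twice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for this example only; not part of docx4j or any other library.
final class DocxParagraphs {

    // Namespace of transitional WordprocessingML (assumed; strict files differ).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Opens the .docx as a ZIP archive and converts its main document part.
    static String fromDocx(String path) {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalArgumentException("No word/document.xml in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return paragraphsToJson(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns {"paragraphs":[...]} with the concatenated text of each w:p element.
    static String paragraphsToJson(InputStream documentXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // needed for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder().parse(documentXml);
            NodeList paragraphs = doc.getElementsByTagNameNS(W, "p");
            List<String> items = new ArrayList<>();
            for (int i = 0; i < paragraphs.getLength(); i++) {
                Element p = (Element) paragraphs.item(i);
                NodeList texts = p.getElementsByTagNameNS(W, "t");
                StringBuilder sb = new StringBuilder();
                for (int j = 0; j < texts.getLength(); j++) {
                    sb.append(texts.item(j).getTextContent());   // a paragraph may span several runs
                }
                items.add(quote(sb.toString()));
            }
            return "{\"paragraphs\":[" + String.join(",", items) + "]}";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document XML", e);
        }
    }

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') sb.append('\\').append(c);
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

Formatting, styles, tables, and images are all dropped on purpose here; which parts to keep depends entirely on what the consumer of the JSON needs.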

Selfeducated answered 25/2, 2011 at 7:49 Comment(0)
1

If you have a valid DTD file for the XML snippet, you can easily convert XML to JSON and JSON to XML using the open-source EclipseLink JAR. A detailed sample Java project can be found here: http://www.cubicrace.com/2015/06/How-to-convert-XML-to-JSON-format.html

Receiver answered 10/3, 2016 at 5:8 Comment(1)
The link has been fixed. - Receiver
0

I came across a tutorial; I hope it helps you. http://www.techrecite.com/xml-to-json-data-parser-converter

Slack answered 23/10, 2013 at 3:3 Comment(2)
The example you link to is in PHP; the question is tagged Java. - Tosh
And the site has now been deleted; this answer should go the same way. - Patronage
0

Docx4j

I've used docx4j before, and it's worth taking a look at.

unXml

You could also check out my open-source unXml library, which is available on Maven Central.

It is lightweight and has a simple syntax for picking out XPaths from your XML and returning them as JSON attributes in a Jackson ObjectNode.

Accad answered 25/9, 2015 at 7:45 Comment(0)
0

Use

xmlSerializer.setForceTopLevelObject(true)

to include the root element in the resulting JSON.

Your code would look like this:

String xml = "<hello><test>1.2</test><test2>123</test2></hello>";
XMLSerializer xmlSerializer = new XMLSerializer();
xmlSerializer.setForceTopLevelObject(true);
JSON json = xmlSerializer.read(xml);
Claret answered 15/8, 2019 at 20:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.