Is there a SAX parser that reads JSON and fires events so it looks like XML?

This would be great, as it would allow my XML-handling code to read JSON without any change except for swapping in a different SAX parser.
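
For illustration, here is a minimal sketch of the kind of reuse being asked about, assuming an existing org.xml.sax handler and a purely hypothetical JsonXmlReader that implements XMLReader over JSON input:

    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    // Existing XML-oriented handler that should stay unchanged.
    public class PrintingHandler extends DefaultHandler {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            System.out.println("start: " + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            System.out.println("text: " + new String(ch, start, length));
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            System.out.println("end: " + qName);
        }
    }

    // The wish, roughly:
    //   XMLReader reader = new JsonXmlReader();   // hypothetical JSON-backed XMLReader
    //   reader.setContentHandler(new PrintingHandler());
    //   reader.parse(new InputSource(new FileReader("data.json")));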

Kwa answered 7/12, 2010 at 3:52 Comment(6)
Why would you want that? The point of JSON is to not parse it like XML. – Pegg
Seems like a reasonable request to me. (@Pegg - if JSON's only point were not to be parsed like XML, that would be a sad statement about JSON.) However, XML and JSON are different enough in structure that I really doubt you could get 100% code compatibility, i.e. use XML-oriented SAX-processing code unchanged when consuming JSON. But you might get close enough in simple cases. – Neck
Both XML and JSON have properties and children. They are very similar; only their notation is different. Both hold zero or more children, and so on. Properties of a JSON object could be seen as XML attributes, etc. – Kwa
I must agree with Falmarri, JSON != XML, and one should only resort to emulation as a last effort if nothing else works. However, maybe the original asker wanted something LIKE the SAX API, not the SAX API itself; the SAX API makes no sense here since it's XML-specific, but a push-style approach is generic. This all depends on whether the question is specifically about SAX (Simple API for XML, very XML-specific) or about a streaming/incremental parsing approach, which is more general. – Chemo
There are plenty of use cases. If you're working with Perl, it's easy to turn a JSON string into a deep structure you can manipulate easily, but not so with Java. The closest Java comes to having a simple hierarchical structure that can be manipulated and accessed like a Perl hash is a W3C DOM object (I don't consider Map a suitable alternative). So the ability to parse JSON using a SAX handler gives you an easy mechanism for building a DOM tree out of a JSON structure in Java. Aside from that, being able to treat JSON like XML gives you access to a ton of other XML-related tools, like XSLT. – Ashcraft
@Pegg When your JSON file exceeds the 2 GB virtual address space of your 32-bit process. – Veradia

If you meant an event-based parser, then there are a couple of projects out there that do this:

  1. http://code.google.com/p/json-simple/

    Stoppable SAX-like interface for streaming input of JSON text

    This project has moved to https://github.com/fangyidong/json-simple

  2. http://jackson.codehaus.org/Tutorial

    The Jackson Streaming API is similar to the StAX API (see the sketch after this list)

    This project has moved to https://github.com/FasterXML/jackson-core
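
For example, here is a minimal pull-parsing sketch with the Jackson streaming API, assuming the current Jackson 2.x coordinates (com.fasterxml.jackson.core) and a local data.json file:

    import java.io.File;
    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;

    public class JacksonStreamingDemo {
        public static void main(String[] args) throws Exception {
            JsonFactory factory = new JsonFactory();
            try (JsonParser parser = factory.createParser(new File("data.json"))) {
                JsonToken token;
                // Pull tokens one by one, much like StAX events for XML.
                while ((token = parser.nextToken()) != null) {
                    switch (token) {
                        case START_OBJECT: System.out.println("start object"); break;
                        case END_OBJECT:   System.out.println("end object"); break;
                        case FIELD_NAME:   System.out.println("field: " + parser.getCurrentName()); break;
                        case VALUE_STRING: System.out.println("string: " + parser.getText()); break;
                        default:           System.out.println("token: " + token); break;
                    }
                }
            }
        }
    }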

Blanton answered 7/12, 2010 at 5:13 Comment(1)
Good point; maybe it's not so much for SAX as for SAX style (incremental processing instead of a tree model) – Chemo

I think it is a bad idea to try to treat JSON as if it were XML (which is what you are essentially asking); however, Jettison does just this. It exposes JSON content via the StAX API (javax.xml.stream). And if you truly want SAX, writing a wrapper from StAX to SAX is trivial as well (but not the other way around).
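
For instance, here is a rough sketch of walking JSON through Jettison's mapped StAX reader; the class names are from memory of Jettison's mapped convention, so treat this as an approximation rather than a verified example:

    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;
    import org.codehaus.jettison.json.JSONObject;
    import org.codehaus.jettison.mapped.Configuration;
    import org.codehaus.jettison.mapped.MappedNamespaceConvention;
    import org.codehaus.jettison.mapped.MappedXMLStreamReader;

    public class JettisonStaxDemo {
        public static void main(String[] args) throws Exception {
            String json = "{\"person\":{\"name\":\"Alice\",\"age\":30}}";
            MappedNamespaceConvention convention = new MappedNamespaceConvention(new Configuration());
            XMLStreamReader reader = new MappedXMLStreamReader(new JSONObject(json), convention);

            // Consume the JSON as if it were a stream of XML events.
            while (reader.hasNext()) {
                switch (reader.getEventType()) {
                    case XMLStreamConstants.START_ELEMENT:
                        System.out.println("<" + reader.getLocalName() + ">");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        System.out.println("text: " + reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        System.out.println("</" + reader.getLocalName() + ">");
                        break;
                }
                reader.next();
            }
        }
    }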

I also think you might get better answers if you explained a bit more about what you are trying to achieve, beyond the mechanisms you are hoping to use. For example, there are many data-binding tools for both XML and JSON, and using such tools can hide low-level details much better than using an abstraction meant for one format to process the other.
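
As an illustration of the data-binding route, here is a sketch with Jackson's ObjectMapper; the Person class and the sample JSON are hypothetical, just to show how the lower-level parsing details disappear:

    import com.fasterxml.jackson.databind.ObjectMapper;

    public class DataBindingDemo {
        // Hypothetical target type; the fields are illustrative only.
        public static class Person {
            public String name;
            public int age;
        }

        public static void main(String[] args) throws Exception {
            ObjectMapper mapper = new ObjectMapper();
            // Bind JSON straight to an object; an XML data-binding tool
            // (e.g. JAXB) could populate the same class from XML.
            Person p = mapper.readValue("{\"name\":\"Alice\",\"age\":30}", Person.class);
            System.out.println(p.name + " is " + p.age);
        }
    }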

Chemo answered 7/12, 2010 at 16:51 Comment(12)
It might be a good idea to process large JSON objects as a stream, in contrast to loading everything into memory. – Consumption
Sure, and that's what many JSON packages offer: Jackson and GSON both have streaming parsers/generators, and even allow combining streaming access with data binding for partial data (sub-trees). So SAX is just one streaming API, and one designed not for JSON but for XML. – Chemo
I realise that this answer was written years ago, but I am curious why it seemed to you "a bad idea to try to treat JSON as if it were XML"? – Ingar
@stakx Because their data models are fundamentally different: XML uses a hierarchical model over textual data (text markup), while JSON uses a frame-based data model. This adds lots of friction; for example, the JSON Array/Object distinction is absent from XML, where everything is an element/attribute with textual content, in a way only having Objects. Without external metadata (a schema, or programming-language objects) it is impossible to reliably translate between the models in the general case. – Chemo
@Chemo such a translation is specified in XSLT 3.0: 22 Processing JSON data. Having a streaming parser that can convert JSON to that XML vocabulary would create a bridge for XSLT 2.0 transformations over JSON, not requiring version 3.0. – Orest
@MartynasJusevičius while it is definitely possible to specify an information-preserving transformation, this necessarily means adding additional, otherwise useless decoration to one or both format representations -- aka "franken-JSON", as with BadgerFish. This is because of the differing information models. One cannot handle both XML and JSON in a natural way, convert between them, AND retain all information; one part has to give. – Chemo
@Chemo I'm not sure what you're talking about, because it took me a few hours to develop a JSON-to-XML converter that does not lose any information nor require any decoration: github.com/AtomGraph/JSON2XML It is also forward compatible with the XSLT 3.0 XML Representation of JSON that I mentioned above. – Orest
@MartynasJusevičius looking at your converter, it should be rather obvious: the mapping adds otherwise unnecessary type indicators (JSON does not need <string>, <number> or <array> markers, as it is typed), which neither natural JSON nor natural XML representations of the logical content should have. This is exactly what is dubbed franken-JSON (when going from XML to JSON); here it is the opposite. There are many such mappings, all as cumbersome and unnatural to use -- but if lossless conversion of any and all content is needed, one is necessary. So this is a good example of what I mean. – Chemo
Your opinion aside, I can now transform my JSON with XSLT 2.0 and be forward compatible with XSLT 3.0. Which was the point. – Orest
If you have a strict requirement to use XSLT, that may be a good thing. I would just recommend against anyone doing this for other reasons (i.e. if they are free to choose their tools): XML-specific tools (like XSLT) should be used for XML, JSON-specific tools for JSON, and both should only be handled by tools that can properly model each with appropriate format-compatible abstractions. – Chemo
What's wrong with generic data conversions? There is no XSLT-like language for JSON; that's why it needs to be converted to XML first. There is no way I'm writing the transformation in JavaScript or another imperative language when I can do it with a declarative XSLT stylesheet, specifically designed for data transformations. – Orest
Actually, now that you have fully explained your use case -- JSON to XML only so you can use XSLT, then XML back to JSON -- I do not object to it. In that case the additional metadata is only visible during processing, and is not something users would be exposed to. So... yes, if you like XSLT (and I agree there seems to be no good JSON-specific transform tool), why not? Thank you for taking the time: I made some wrong assumptions here. My concern was exposing JSON with embellishments, which (if not for XSLT) are extraneous. – Chemo

I have developed a streaming StAX-based converter: https://github.com/AtomGraph/JSON2XML

It reads arbitrary JSON data and produces the XML Representation of JSON specified in XSLT 3.0.

JSON2XML enables JSON transformation with XSLT even without an XSLT 3.0 processor. You can simply pre-process the data with JSON2XML before the transformation and pipe the result into an XSLT 2.0 stylesheet, for example (see the sketch below). That way your stylesheet stays forward compatible with XSLT 3.0, as the XML representation is exactly the same.
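
To sketch the second half of that pipeline: the snippet below feeds a tiny hand-written sample of the XSLT 3.0 XML representation of JSON (the fn:json-to-xml vocabulary) into an XSLT 2.0 transformation via javax.xml.transform. The stylesheet.xsl file is hypothetical, and an XSLT 2.0-capable processor such as Saxon is assumed to be on the classpath; in real use the XML would come from JSON2XML rather than a string literal:

    import java.io.StringReader;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class Xslt2OverJsonDemo {
        public static void main(String[] args) throws Exception {
            // Roughly what {"name":"Alice","tags":["a","b"]} looks like in the
            // XSLT 3.0 XML representation of JSON; normally produced by JSON2XML.
            String convertedJson =
                "<map xmlns=\"http://www.w3.org/2005/xpath-functions\">"
              + "<string key=\"name\">Alice</string>"
              + "<array key=\"tags\"><string>a</string><string>b</string></array>"
              + "</map>";

            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer(new StreamSource("stylesheet.xsl"));
            transformer.transform(new StreamSource(new StringReader(convertedJson)),
                                  new StreamResult(System.out));
        }
    }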

Feedback and pull requests are welcome.

Serif answered 1/4, 2019 at 22:43 Comment(0)

http://rapidjson.org/ FTW. "A fast JSON parser/generator for C++ with both SAX/DOM style API." Industrial-strength, used by Tencent.

Dorcas answered 24/6 at 21:26 Comment(0)
