How do I make XSL transformation indent the output XML?

5

16

I'm using xalan with the following xsl header:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    xmlns:xalan="http://xml.apache.org/xalan">
<xsl:output method="text" indent="yes" xalan:indent-amount="4"/>

And the output is not indented.

Anyone with ideas?

Epilogue answered 8/3, 2010 at 15:5 Comment(2)
I was using the xsl tool in notepad++. It failed to indent the output when I had a typo in my xsl. Verify that your xsl file has the correct syntax. - Flue
Note: this question and its answers are about method="xml" only; method="html" has different problems and behaviors. The most important one: com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet#transferOutputSettings simply ignores indent-amount for method="html" in the JDK (checked 8, 9 and 11). Java 11 still indents, because the default indent-number there is 4, but the amount is not configurable. - Rasheedarasher
24

For indentation you need to use a different namespace: http://xml.apache.org/xslt (see this issue)

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0"
xmlns:redirect="http://xml.apache.org/xalan/redirect"
extension-element-prefixes="redirect"
xmlns:xalan="http://xml.apache.org/xslt">
<xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
Thury answered 26/3, 2011 at 16:28 Comment(2)
The xalan namespace is kind-of documented at xalan.apache.org/xalan-j/apidocs/org/apache/xml/serializer/… - Rasheedarasher
http://xml.apache.org/xslt was deprecated even in the old version (see Declare the xalan namespace); use http://xml.apache.org/xalan instead. - Rasheedarasher
14

I was struggling with this for a while, but just got it working by accident.

The key was to add <xsl:strip-space elements="*"/>

so the stylesheet header looks like this:

<xsl:stylesheet 
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:java="http://xml.apache.org/xalan/java"
    xmlns:xalan="http://xml.apache.org/xslt">
<xsl:output method="xml" encoding="ASCII" indent="yes" xalan:indent-amount="4"/>
<xsl:strip-space elements="*"/>

Not sure why, but removing all whitespace-only text from the input probably helps xalan figure out the indentation.
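To see the effect of xsl:strip-space in isolation, here is a minimal sketch in Java using the standard JAXP API. The class name StripSpaceDemo, the sample input, and the copy-everything template are assumptions made up for illustration, not taken from the answer. It runs the same stylesheet with and without strip-space on input that already contains whitespace-only text nodes; the exact whitespace you get without strip-space may vary between processor versions, so the sketch only prints both results for comparison.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripSpaceDemo {
    // Hypothetical input that already contains whitespace-only text nodes.
    static final String INPUT = "<a>\n  <f>\n <g/>   <h/>\n  </f>\n</a>";

    // Builds a copy-everything stylesheet, optionally with xsl:strip-space.
    static String stylesheet(boolean stripSpace) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output method=\"xml\" indent=\"yes\" omit-xml-declaration=\"yes\"/>"
             + (stripSpace ? "<xsl:strip-space elements=\"*\"/>" : "")
             + "<xsl:template match=\"/\"><xsl:copy-of select=\".\"/></xsl:template>"
             + "</xsl:stylesheet>";
    }

    // Runs the transform and serializes the result to a String.
    static String transform(boolean stripSpace) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer(
                new StreamSource(new StringReader(stylesheet(stripSpace))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without strip-space:\n" + transform(false));
        System.out.println("with strip-space:\n" + transform(true));
    }
}
```

With strip-space, the whitespace-only text nodes are dropped before the copy, so the serializer only has to add its own line breaks and indentation.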

Factorage answered 18/4, 2013 at 16:11 Comment(2)
Without xsl:strip-space[@elements="*"], the xsl is trying to preserve the whitespace nodes from the input in the output. - Recall
Worked like a charm even without xalan :) - Morrison
5

Jirka-x1, thank you for the issue-link. I used the following (as proposed by Ed Knoll 13/Aug/04):

<xsl:stylesheet ... xmlns:xslt="http://xml.apache.org/xslt">
<xsl:output ... indent="yes" xslt:indent-amount="4" />

This works for me with xalan (java) 2.7.1.

Tace answered 30/5, 2011 at 18:15 Comment(0)
2

I guess you have to set the method to xml. If that does not work, try the following:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xalan="http://xml.apache.org/xalan">

<xsl:output method="xml" encoding="UTF-8" indent="yes" xalan:indent-amount="4"/>
Preeminence answered 8/3, 2010 at 15:14 Comment(1)
Is it possible that you are viewing the xml with an application that does not render the content properly? - Preeminence
1

While this is a pretty old question, there might be another angle on the answer that hasn't been touched on yet.

TL;DR it matters what flavor of Result the Transformer is feeding into. (If you're using xalan through Java code you didn't write/can't change, this might not be what you want to hear.)

For demonstrations in this answer, I'll be using PostgreSQL PL/Java, because it comes with a set of example functions including preparexmltransform and transformxml that use Java's xalan-based XSLT 1.0 stuff, and have some extra arguments for test purposes. There's an important behavior effect here that I wouldn't have seen without those extra arguments.

I'll start by preparing a transform named indent:

SELECT
 preparexmltransform(
  'indent',
  '<xsl:transform version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="/">
     <xsl:copy-of select="."/>
    </xsl:template>
   </xsl:transform>',
  how => 5);

It should be clear enough that the first argument there is a name for the transform and the second is the XSLT defining it. I'll get to that "how" argument in a bit.

So anyway, let's use that transform on some XML and see what happens:

SELECT
  transformxml(
   'indent',
   '<a b="c" d="e"><f><g/><h/></f></a>',
   howin => 5, howout => 4);

  transformxml
----------------
<a b="c" d="e">
    <f>
        <g/>
        <h/>
    </f>
</a>

Cool, that did what was wanted right away, and shows that the short transform above is enough. Notably, it doesn't need an xalan:indent-amount property (unless you like a different indent width), so it doesn't need an xalan namespace defined. There also doesn't have to be a strip-space element for it to work: if you try with spaces in the input document, the indent spaces are just added to them, which can look goofy, so you might choose to use strip-space, but the indenting happens either way.

I still haven't said what those extra arguments do (two of 'em now, "howin" and "howout"!), but that's coming, because look what happens changing nothing but "howout" from 4 to 5:

SELECT
  transformxml(
   'indent',
   '<a b="c" d="e"><f><g/><h/></f></a>',
   howin => 5, howout => 5);

            transformxml            
------------------------------------
 <a b="c" d="e"><f><g/><h/></f></a>

So the "howout" matters for whether the indenting happens. What are these hows?

Well, Java doesn't have just one API for working with XML. It has several, including DOM, StAX, and SAX, not to mention you might just want to handle the XML as a String, or a character stream via Reader/Writer, or an encoded byte stream via InputStream/OutputStream.

The JDBC spec says if you're writing Java code to work with XML in a database, the SQLXML API has to give you your choice of any of those ways to work with the data, whichever is convenient for your task. And the JAXP Transformations API says you have to be able to hand a Transformer pretty much any flavor of Source and any flavor of Result, and have it do the right thing.

So that's why those PL/Java example functions have "how" arguments: there needs to be a way to test all of the required ways the same XML content can be passed to the Transformer and all the ways the Transformer's result can come back. The "how"s are arranged (arbitrarily) like this:

 code |        form         |    howin     |   howout
------+---------------------+--------------+--------------
   1  | binary stream       | InputStream  | OutputStream
   2  | character stream    | Reader       | Writer
   3  | String              | String       | String
   4  | binary or character | StreamSource | StreamResult
   5  | SAX                 | SAXSource    | SAXResult
   6  | StAX                | StAXSource   | StAXResult
   7  | DOM                 | DOMSource    | DOMResult

So what does the same xalan indenting transform do, when it is called with different ways of producing its result?

SELECT
  i, transformxml(
   'indent',
   '<a b="c" d="e"><f><g/><h/></f></a>',
   howin => 5, howout => i)
  FROM
   generate_series(1,7) AS i;

 i |               transformxml
---+------------------------------------------
 1 | <a b="c" d="e">
   |     <f>
   |         <g/>
   |         <h/>
   |     </f>
   | </a>
   |
 2 | <a b="c" d="e">
   |     <f>
   |         <g/>
   |         <h/>
   |     </f>
   | </a>
   |
 3 | <a b="c" d="e">
   |     <f>
   |         <g/>
   |         <h/>
   |     </f>
   | </a>
   |
 4 | <a b="c" d="e">
   |     <f>
   |         <g/>
   |         <h/>
   |     </f>
   | </a>
   |
 5 | <a b="c" d="e"><f><g/><h/></f></a>
 6 | <a b="c" d="e"><f><g></g><h></h></f></a>
 7 | <a b="c" d="e"><f><g/><h/></f></a>

Well, there's the pattern. For all of the APIs where the Transformer actually has to directly produce a serialized stream of characters or bytes, it adds the indentation as requested.

When it is given a SAXResult, StAXResult, or DOMResult to write into, it doesn't add indentation, because those are all structural XML APIs; it's as if xalan treats indenting as strictly a serialization issue, and it technically isn't serializing when it is producing SAX, StAX, or DOM.

(The table above also shows that the StAX API doesn't always render an empty element as self-closed when the other APIs do. Side issue, but interesting.)

So, if you find yourself trying to get an xalan transform to do indenting and it isn't, double check which form of Result you are asking the Transformer to produce.
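For anyone who wants to check this outside PL/Java, here is a rough sketch in plain Java that feeds the same indenting stylesheet into a StreamResult and into a DOMResult. The class and method names (ResultFlavorDemo, toStream, textNodesInDom) are made up for illustration. It relies only on the standard JAXP API, and assumes the transformer behaves as in the results above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Node;

class ResultFlavorDemo {
    // The same short indenting transform as above.
    static final String XSLT =
        "<xsl:transform version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" indent=\"yes\"/>"
      + "<xsl:template match=\"/\"><xsl:copy-of select=\".\"/></xsl:template>"
      + "</xsl:transform>";

    static final String INPUT = "<a b=\"c\" d=\"e\"><f><g/><h/></f></a>";

    static Transformer newIndentTransformer() throws TransformerException {
        return TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
    }

    // StreamResult: the transformer serializes, so indentation is applied.
    static String toStream() {
        try {
            StringWriter out = new StringWriter();
            newIndentTransformer().transform(
                new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // DOMResult: no serialization happens; count text nodes in the built tree.
    static int textNodesInDom() {
        try {
            DOMResult result = new DOMResult();
            newIndentTransformer().transform(
                new StreamSource(new StringReader(INPUT)), result);
            return countTextNodes(result.getNode());
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recursively counts text nodes below (and including) the given node.
    static int countTextNodes(Node n) {
        int count = (n.getNodeType() == Node.TEXT_NODE) ? 1 : 0;
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += countTextNodes(c);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(toStream());
        System.out.println("text nodes in DOM result: " + textNodesInDom());
    }
}
```

If the DOM result contains no text nodes at all, no indentation whitespace was added to it. Any indentation then has to come from whatever serializes that DOM later.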

Edit: One final point: if you are coding this directly in Java, there really isn't any need at all to write those seven-ish lines of XSLT just to get what's nothing more than an identity-transform with the indent output property set.

If you call the no-argument TransformerFactory.newTransformer(), it straight-up gives you a plain-vanilla identity transform. Then all you need to do is set its output properties, and you're in business:

var tf = javax.xml.transform.TransformerFactory.newInstance();
var t = tf.newTransformer();
t.setOutputProperty("indent", "yes");
t.setOutputProperty("{http://xml.apache.org/xalan}indent-amount", "1"); // if you don't like the default 4
t.transform(source, result);

Doesn't get much simpler than that. Again, it's critical that result be a StreamResult, so that the transformer will do serialization.
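Filled out into something you can compile and run, the snippet might look like the sketch below. The class name PrettyPrinter, the helper prettyPrint, and the choice of String input and output are assumptions for illustration. The indent-amount property name is the one used above, and whether a given transformer honors it is implementation-specific.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PrettyPrinter {
    // Re-serializes an XML string with indentation, using the identity transform.
    static String prettyPrint(String xml) {
        try {
            // No-argument newTransformer() yields an identity transform.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Implementation-specific property; drop it to keep the default width.
            t.setOutputProperty("{http://xml.apache.org/xalan}indent-amount", "2");

            // A StreamResult makes the transformer serialize, and so indent.
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prettyPrint("<a b=\"c\" d=\"e\"><f><g/><h/></f></a>"));
    }
}
```

Wrapping the checked TransformerException keeps the helper easy to call from other code. Whether that is appropriate depends on how you want malformed input reported.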

Scagliola answered 10/3, 2020 at 0:49 Comment(0)
