use xsl to output plain text
I needed to use XSL to generate simple plain text output from XML. Since I didn't find any good, concise example online, I decided to post my solution here. Any links referring to a better example would of course be appreciated:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >
    <xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
    <xsl:template match="/">
        <xsl:for-each select="script/command" xml:space="preserve">at -f <xsl:value-of select="username"/> <xsl:value-of select="startTime/@hours"/>:<xsl:value-of select="startTime/@minutes"/> <xsl:value-of select="startDate"/><xsl:text>
</xsl:text></xsl:for-each> 
    </xsl:template>
</xsl:stylesheet>

A few important things that helped me out here:

  1. the use of xsl:output with method="text", so that the output is plain text without the standard XML declaration at the beginning of the output document
  2. the use of the xml:space="preserve" attribute to preserve any whitespace I wrote within the xsl:for-each tag. This also required me to write all code within the for-each tag, including the tag itself, on a single line (with the exception of the line break).
  3. the use of <xsl:text> to insert a line break - again I had to omit standard xml indenting here.

The resulting and desired output for this xslt was:

at -f alluser 23:58 17.4.2010
at -f ggroup67 7:58 28.4.2010
at -f ggroup70 15:58 18.4.2010
at -f alluser 23:58 18.4.2010
at -f ggroup61 7:58 22.9.2010
at -f ggroup60 23:58 21.9.2010
at -f alluser 3:58 22.9.2010
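
For comparison, a minimal sketch of the same transformation that wraps every literal in xsl:text instead of relying on xml:space="preserve". It assumes the input layout used by the stylesheet above (script/command with username, startTime/@hours, startTime/@minutes and startDate). Whitespace-only text between instructions is stripped from a stylesheet, so the code can be indented normally:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- With method="text" no XML declaration and no markup is written -->
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:for-each select="script/command">
            <!-- Only the content of xsl:text and xsl:value-of reaches the output;
                 the indentation around these instructions is discarded. -->
            <xsl:text>at -f </xsl:text>
            <xsl:value-of select="username"/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="startTime/@hours"/>
            <xsl:text>:</xsl:text>
            <xsl:value-of select="startTime/@minutes"/>
            <xsl:text> </xsl:text>
            <xsl:value-of select="startDate"/>
            <!-- &#xA; is a line feed, written as a character reference -->
            <xsl:text>&#xA;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

This is longer than the single-line version, but reformatting the stylesheet should not change the output.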

As I said, any suggestions of how to do this more elegantly would be appreciated.


FOLLOW-UP 2011-05-08:

Here's the type of xml I am treating:

<script xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="script.xsd">
    <command>
        <username>alluser</username>
        <startTime minutes="58" hours="23"/>
        <startDate>17.4.2010</startDate>
    </command>
</script>
Mesognathous answered 6/5, 2011 at 8:17 Comment(5)
You could save on the number of <xsl:value-of> elements by using concat('at -f ', username, ' ', startTime/@hours, ' ', ...). Besides, you could wrap your source code – if you do that inside the tags, it won't affect the output. – Scenic
Good question, +1. See my answer for a complete, very short and really generic solution. – Infundibuliform
@Christopher Creutzig: Thanks for the great suggestion on concat(). What are you referring to with "wrap your source code"? – Mesognathous
See Mads' answer: there's no need to put everything onto one big line. (Although I would not break the line before the comma. It just looks weird and does not add anything, not even being able to comment something out more easily.) – Scenic
We don't do code reviews on Stack Overflow. I would suggest you reframe your question so it's an actual question (e.g. how do I strip the text out of this XML document), then post your draft effort as an answer. – Admonitory
  • You can define a template to match on script/command and eliminate the xsl:for-each
  • concat() can be used to shorten the expression and save you from explicitly inserting so many <xsl:text> and <xsl:value-of> elements.
  • Using the character reference &#xA; (line feed) for the line break, rather than relying on preserving the literal line break inside your <xsl:text> element, is a bit safer, since code formatting won't mess up your line breaks. Also, for me, it reads as an explicit line break, which makes the intent easier to understand.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:fo="http://www.w3.org/1999/XSL/Format" >
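    <xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
    <!-- Drop whitespace-only text nodes from the source; otherwise the built-in
         templates copy the indentation between the elements into the output -->
    <xsl:strip-space elements="*"/>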

    <xsl:template match="script/command">
        <xsl:value-of select="concat('at -f '
                    ,username
                    ,' '
                    ,startTime/@hours
                    ,':'
                    ,startTime/@minutes
                    ,' '
                    ,startDate
                    ,'&#xA;')"/>
    </xsl:template>

</xsl:stylesheet>
Angers answered 6/5, 2011 at 11:18 Comment(2)
Thanks Mads, excellent suggestions. This is exactly what I was looking for. I had forgotten about the useful features of XPath 2... How is it that &#xA; gives me a new line on Windows, when Windows usually requires not only a line feed, but also a carriage return? – Mesognathous
@Mesognathous Dickinson Do note, this is an XSLT/XPath 1.0 solution, no XPath 2.0 features used. &#xA; (Line Feed) is often enough. You can add &#xD; (Carriage Return) if you need CRLF. – Angers
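
If CRLF line endings are needed, the terminator in the concat() call can be extended as described in the last comment. A sketch, assuming the same input layout as in the question. Whether a bare line feed is enough depends on the program that reads the file. The xsl:strip-space declaration is added on the assumption that the source document is indented and that the indentation should not be copied to the output:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <!-- Ignore whitespace-only text nodes (indentation) in the source -->
    <xsl:strip-space elements="*"/>

    <xsl:template match="script/command">
        <!-- &#xD;&#xA; is carriage return followed by line feed (CRLF) -->
        <xsl:value-of select="concat('at -f ', username, ' ',
                                     startTime/@hours, ':', startTime/@minutes, ' ',
                                     startDate, '&#xD;&#xA;')"/>
    </xsl:template>
</xsl:stylesheet>
```

The line breaks inside the select attribute sit between the arguments, outside the string literals, so they act as ordinary XPath whitespace and do not end up in the result.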

Just for fun: this can be done in a very general and compact way:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="*">
        <xsl:apply-templates select="node()|@*"/>
        <xsl:text> </xsl:text>
    </xsl:template>

    <xsl:template match="username">
       at -f <xsl:apply-templates select="*|@*"/>
    </xsl:template>
</xsl:stylesheet>

when applied on this XML document:

<script>
 <command>
  <username>John</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>

  <username>Kate</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>

  <username>Peter</username>
  <startTime hours="09:" minutes="33"/>
  <startDate>05/05/2011</startDate>
 </command>
</script>

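the following result is produced (the user names themselves are not output, because the username template applies templates only to child elements and attributes, not to the text node):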

   at -f 09:33 05/05/2011 
   at -f 09:33 05/05/2011 
   at -f 09:33 05/05/2011  

Note: This general approach works best if all the data to be output is contained in text nodes -- not in attributes.
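
To illustrate the note, here is a hedged sketch of the generic idea for input that keeps everything in text nodes. The input layout in the comment is hypothetical (it is not the asker's format): each command holds only leaf elements, and the time is already stored as one string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input, all data in text nodes:
     <script>
       <command>
         <username>alluser</username>
         <startTime>23:58</startTime>
         <startDate>17.4.2010</startDate>
       </command>
     </script>
-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:strip-space elements="*"/>

    <!-- Any element without child elements: its text followed by one space -->
    <xsl:template match="*[not(*)]">
        <xsl:value-of select="."/>
        <xsl:text> </xsl:text>
    </xsl:template>

    <!-- One output line per command, whatever leaf elements it contains -->
    <xsl:template match="command">
        <xsl:text>at -f </xsl:text>
        <xsl:apply-templates/>
        <xsl:text>&#xA;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

With this layout the stylesheet does not need to know the element names inside command. The price is a trailing space before each line break, which may or may not matter to the consumer of the file.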

Infundibuliform answered 6/5, 2011 at 13:24 Comment(5)
The @* values are missing (and were supposed to be delimited by ':'). Also, not sure whether the leading spaces before 'at -f' in the output would be a problem. – Angers
@Mads Hansen: Thanks for noting this. Fixed now. – Infundibuliform
Almost, but I don't think the source XML has ':' in the value of @hours. The sample XSL posted is explicitly putting ':' in, not selecting from the attribute value. – Angers
@Mads Hansen: Sure. While I said "for fun", my answer points out a generic method of designing the XML so that the same general and trivial XSLT transformation can be used to generate the output, not needing to know any additional details. As I said in my answer, I wouldn't use attributes and would store the data only in text nodes. – Infundibuliform
Tip for LibXML2 users (PHP, Python, browsers, etc.): if you do not use <xsl:text>, it does not strip spaces (!). – Rothstein
