XSLT 3.0 partial streaming (Saxon)

I have a big XML file (6 GB) with this kind of tree:

<Report>
   <Document>
      <documentType>E</documentType>
      <person>
         <firstname>John</firstname>
         <lastname>Smith</lastname>
      </person>
   </Document>
   <Document>
      [...]
   </Document>
   <Document>
      [...]
   </Document>
   [... there are a lot of Documents]
</Report>

So I used the new XSLT 3.0 streaming feature with Saxon 9.6 EE. I don't want the streaming constraints to apply once I am inside a Document, which is why I tried to use copy-of(). I think what I want to do is very close to the "burst mode" described here: http://saxonica.com/documentation/html/sourcedocs/streaming/burst-mode-streaming.html

Here is my XSLT style sheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
<xsl:mode streamable="yes" />

<xsl:template match="/">
    GLOBAL HEADER
        <xsl:for-each select="/Report/Document/copy-of()" >
           DOC HEADER
           documentType: <xsl:value-of select="documentType"/>
           person/firstname: <xsl:value-of select="person/firstname"/>

           <xsl:call-template name="fnc1"/>

           DOC FOOTER
        </xsl:for-each>
    GLOBAL FOOTER
</xsl:template>

<xsl:template name="fnc1">
    documentType again: <xsl:value-of select="documentType"/>
</xsl:template>

</xsl:stylesheet>

In a sense it works: thanks to copy-of(), I can use several xsl:value-of instructions directly inside the for-each (as in this question). Without it, I get this error: * There are at least two consuming operands: {xsl:value-of} on line 8, and {xsl:value-of} on line 9

But I still run into streaming constraints, because <xsl:call-template name="fnc1"/> produces this error:

Error at xsl:template on line 4 column 25 of stylesheet.xsl:
  XTSE3430: Template rule is declared streamable but it does not satisfy the streamability rules.
  * xsl:call-template is not streamable in this Saxon release
Stylesheet compilation failed: 1 error reported

So my question is: how can I do partial streaming (Documents are loaded one at a time, but each one fully) so that I can use call-template (and apply-templates) within a Document?

Thank you for your help!

Asked by Esqueda, 8/10, 2014 at 14:14. Comments (2):
Have you considered not declaring <xsl:mode streamable="yes" /> and instead using <xsl:template name="main"><xsl:stream href="foo.xml"><xsl:apply-templates select="Report/Document/copy-of()"/></xsl:stream></xsl:template>? You should then be able to use any named and/or matching templates to process the Document element nodes and their descendants, e.g. <xsl:template match="Document">DOC HEADER document type: <xsl:value-of select="documentType"/>...<xsl:call-template name="fnc1"/>DOC FOOTER</xsl:template>. - Starla
I tried something with xsl:stream, but I got the same error. Now, with your trick of using apply-templates over a copy-of(), I get this error: Fatal error during transformation: java.lang.RuntimeException: Internal error evaluating template at line 4 - Esqueda

I think call-template should be streamable when the context item is grounded (i.e. not a streamed node), so I'll treat this as a bug. Meanwhile, a workaround might be to declare fnc1 as

<xsl:template name="fnc1" mode="fnc1" match="Document"/>

and call it as

<xsl:apply-templates select="." mode="fnc1"/>
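
Put together with the stylesheet from the question, the workaround might look roughly like the sketch below. It is not tested against any particular Saxon release; the only changes from the original are the match and mode attributes on fnc1 and the apply-templates call, and the use of xsl:text for the literal output is just my own tidying.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <!-- The unnamed mode is streamed; the "fnc1" mode is not declared,
       so it defaults to non-streamable, which is fine for grounded nodes. -->
  <xsl:mode streamable="yes"/>

  <xsl:template match="/">
    <xsl:text>GLOBAL HEADER&#10;</xsl:text>
    <!-- copy-of() hands each Document over as an in-memory (grounded) copy -->
    <xsl:for-each select="/Report/Document/copy-of()">
      <xsl:text>DOC HEADER&#10;documentType: </xsl:text>
      <xsl:value-of select="documentType"/>
      <xsl:text>&#10;person/firstname: </xsl:text>
      <xsl:value-of select="person/firstname"/>
      <xsl:text>&#10;</xsl:text>
      <!-- replaces <xsl:call-template name="fnc1"/> -->
      <xsl:apply-templates select="." mode="fnc1"/>
      <xsl:text>&#10;DOC FOOTER&#10;</xsl:text>
    </xsl:for-each>
    <xsl:text>GLOBAL FOOTER&#10;</xsl:text>
  </xsl:template>

  <!-- Same body as the original named template, now also a template rule -->
  <xsl:template name="fnc1" mode="fnc1" match="Document">
    <xsl:text>documentType again: </xsl:text>
    <xsl:value-of select="documentType"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the context item inside the for-each is a copy rather than a streamed node, the rule in mode fnc1 can navigate the Document freely.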

Alternatively, replace the template with a function and supply the context item as an explicit argument.
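
A minimal sketch of the function variant, under a few assumptions: the f prefix, its namespace URI http://example.com/functions and the parameter name doc are placeholders chosen for illustration, and the for-each body is trimmed to the relevant call.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="http://example.com/functions"
                exclude-result-prefixes="f"
                version="3.0">
  <xsl:mode streamable="yes"/>

  <!-- Hypothetical replacement for the named template fnc1:
       the Document is passed explicitly instead of via the context item. -->
  <xsl:function name="f:fnc1">
    <xsl:param name="doc" as="element(Document)"/>
    <xsl:text>documentType again: </xsl:text>
    <xsl:value-of select="$doc/documentType"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:for-each select="/Report/Document/copy-of()">
      <!-- "." is the grounded copy of the current Document -->
      <xsl:sequence select="f:fnc1(.)"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The function has no context item of its own, so everything it reads has to come through $doc; that makes the dependency on the grounded copy explicit.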

You can track the bug here:

https://saxonica.plan.io/issues/2171

Although we don't claim 100% conformance with the XSLT 3.0 specification yet, we'll treat any unnecessary departures in the 9.6 release as bugs unless fixing them would destabilize the product.

Answered by Calysta, 8/10, 2014 at 17:07. Comments (2):
Thanks a lot, it works well, with good performance too (input 6.4 GB, output 1.2 GB, processing time 4 minutes). Just one small syntax fix: the match attribute is required: <xsl:template name="fnc1" mode="fnc1" match="Document"/>. (I tried to edit your answer, but I didn't know that people unfamiliar with the topic can reject an edit, thanks to @tylerk @littlebobbytables @dmitry-fucintv ;) ) - Esqueda
Thanks for the product feedback; xsl:call-template is now streamable in such cases. - Calysta
