I. Exactly one element with the desired property:
Use this XPath 1.0 expression (assuming the elements of the provided XML fragment are children of the current node):
substring-before(*[not(starts-with(., 'info:eu-repo'))], '-')
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:copy-of select=
"substring-before(*[not(starts-with(., 'info:eu-repo'))], '-') "/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the following XML document (the provided fragment wrapped in a single top element, with the namespace declared):
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
</t>
the XPath expression is evaluated off the top element and the result of this evaluation is copied to the output:
2012
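The expression above selects with the wildcard *, so any non-matching sibling element would also qualify. A minimal sketch of a stricter variant follows, assuming the stylesheet binds the dc prefix to the same namespace URI as the document (some:dc in this example; a real document would use its own URI):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="some:dc">
  <!-- Assumption: "some:dc" must match the namespace URI used in the source document -->
  <xsl:output method="text"/>

  <xsl:template match="/*">
    <!-- Only dc:date children are considered; the first one that does not
         start with 'info:eu-repo' supplies the string for substring-before() -->
    <xsl:value-of select=
      "substring-before(dc:date[not(starts-with(., 'info:eu-repo'))], '-')"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the same XML document, this should also output 2012.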
II. More than one element with the desired property:
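In this case it isn't possible to produce the desired data with a single XPath 1.0 expression: XPath 1.0 has no sequence-of-strings type, and a string function such as substring-before(), when given a node-set, uses only the string value of its first node in document order.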
This XSLT transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="*[not(starts-with(., 'info:eu-repo'))]/text()">
<xsl:copy-of select="substring-before(., '-') "/>
==============
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
when applied on this XML document:
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2011-07-05</dc:date>
</t>
produces the wanted, correct result:
2012
==============
2011
==============
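The same result can be sketched with xsl:for-each and an explicit dc:date name test instead of template matching. This is only an illustrative alternative; it assumes the dc prefix is bound in the stylesheet to the document's namespace URI (some:dc here), and it separates the years with newlines rather than with the ============== lines:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="some:dc">
  <!-- Assumption: "some:dc" must match the namespace URI used in the source document -->
  <xsl:output method="text"/>

  <xsl:template match="/*">
    <!-- Visit every dc:date child whose string value does not start with 'info:eu-repo' -->
    <xsl:for-each select="dc:date[not(starts-with(., 'info:eu-repo'))]">
      <!-- Inside for-each the context node is a single element,
           so substring-before() is applied to each one in turn -->
      <xsl:value-of select="substring-before(., '-')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the XML document above, this should output 2012 and 2011, each on its own line.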
III. XPath 2.0 one-liner:
*[not(starts-with(., 'info:eu-repo'))]/substring-before(., '-')
When this XPath 2.0 expression is evaluated off the top element of the last XML document above, the wanted years are produced:
2012 2011
XSLT 2.0-based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:sequence select=
"*[not(starts-with(., 'info:eu-repo'))]/substring-before(., '-')"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the last XML document, the XPath expression is evaluated and the result of this evaluation is copied to the output:
2012 2011
IV. The most general and difficult case:
Now, let's have this XML document:
<t xmlns:dc="some:dc">
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2012-07-04</dc:date>
<dc:date>info:eu-repo/date/embargoEnd/2013-06-12</dc:date>
<dc:date>2011-07-05</dc:date>
<dc:date>*/date/embargoEnd/2014-06-12</dc:date>
</t>
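We still want to get the year part of all dc:date elements whose string value doesn't start with 'info:eu-repo'. However, none of the previous solutions works correctly with the last dc:date element above: its string value contains slashes, so substring-before(., '-') returns '*/date/embargoEnd/2014' rather than '2014'.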
Remarkably, the wanted data can still be produced by a single XPath 2.0 expression:
for $s in
*[not(starts-with(., 'info:eu-repo'))]/tokenize(.,'/')[last()]
return
substring-before($s, '-')
When this expression is evaluated off the top element of the above XML document, the wanted, correct result is produced:
2012 2011 2014
And this is the XSLT 2.0-based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<xsl:sequence select=
"for $s in
*[not(starts-with(., 'info:eu-repo'))]/tokenize(.,'/')[last()]
return
substring-before($s, '-')
"/>
</xsl:template>
</xsl:stylesheet>
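Where only an XSLT 1.0 processor is available, tokenize() cannot be used. One possible fallback, sketched here under assumptions, is a recursive named template that strips everything up to the last '/'. The template name lastSegment, the parameter pText and the variable vLast are hypothetical names chosen for this illustration, and the dc prefix is assumed to be bound to the document's namespace URI (some:dc here):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="some:dc">
  <!-- Assumption: "some:dc" must match the namespace URI used in the source document -->
  <xsl:output method="text"/>

  <xsl:template match="/*">
    <xsl:for-each select="dc:date[not(starts-with(., 'info:eu-repo'))]">
      <!-- vLast holds the part of the string value after the last '/' -->
      <xsl:variable name="vLast">
        <xsl:call-template name="lastSegment">
          <xsl:with-param name="pText" select="string(.)"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="substring-before($vLast, '-')"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: recurses on the text after each '/'
       until no '/' remains, then outputs what is left -->
  <xsl:template name="lastSegment">
    <xsl:param name="pText"/>
    <xsl:choose>
      <xsl:when test="contains($pText, '/')">
        <xsl:call-template name="lastSegment">
          <xsl:with-param name="pText" select="substring-after($pText, '/')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$pText"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the XML document of this section, this should output 2012, 2011 and 2014, each on its own line.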
<xsl:for-each select="dc:date[not(starts-with(., 'info:eu-repo'))]">
– Davila