How to use XPointer with XInclude to reference elements

I want to merge two XML files with the same structure into one. For example:

Test1.xml

<?xml version="1.0" encoding="UTF-8"?>

<ns:Root
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace"
    xsi:schemaLocation="urn:Test.Namespace Test1.xsd"
    >
    <ns:element1 id="001">
       <ns:element2 id="001.1" order="1">
           <ns:element3 id="001.1.1" />
       </ns:element2>
       <ns:element2 id="001.2" order="2">
           <ns:element3 id="001.1.2" />
       </ns:element2>
    </ns:element1>
</ns:Root>

and Test2.xml

<?xml version="1.0" encoding="UTF-8"?>

<ns:Root
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace"
    xsi:schemaLocation="urn:Test.Namespace Test1.xsd"
    >
    <ns:element1 id="999">
        <ns:element2 id="999.1" order="1">
            <ns:element3 id="999.1.1" />
        </ns:element2>
    </ns:element1>
</ns:Root>

To create

TestOutput.xml

<?xml version="1.0" encoding="UTF-8"?>

<ns:Root
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace"
    xsi:schemaLocation="urn:Test.Namespace Test1.xsd"
    >
    <ns:element1 id="001">
       <ns:element2 id="001.1" order="1">
           <ns:element3 id="001.1.1" />
       </ns:element2>
       <ns:element2 id="001.2" order="2">
           <ns:element3 id="001.1.2" />
       </ns:element2>
    </ns:element1>
    <ns:element1 id="999">
        <ns:element2 id="999.1" order="1">
            <ns:element3 id="999.1.1" />
        </ns:element2>
    </ns:element1>
</ns:Root>

i.e. one XML file containing all the elements from both inputs.

I found a useful question on StackOverflow and came up with this:

Merge.xml

<?xml version="1.0"?>

<ns:Root xmlns:xi="http://www.w3.org/2003/XInclude"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace">

    <xi:include href="Test1.xml" parse="xml" xpointer="element(//ns:Root/ns:element1)" />  

    <xi:include href="Test2.xml" parse="xml" xpointer="element(//ns:Root/ns:element1)" />

</ns:Root>

I run it like this (I need to use xmllint, for reasons too involved to go into):

xmllint -xinclude Merge.xml

But this does not work; it complains about various things, which seem to relate to XPointer:

parser error : warning: ChildSeq not starting by /1
Merge.xml:7: element include: XInclude error : XPointer evaluation failed: #element(//ns:Root/ns:element1)
Merge.xml:7: element include: XInclude error : could not load Test1.xml, and no fallback was found
parser error : warning: ChildSeq not starting by /1
Merge.xml:9: element include: XInclude error : XPointer evaluation failed: #element(//ns:Root/ns:element1)
Merge.xml:9: element include: XInclude error : could not load Test2.xml, and no fallback was found
<?xml version="1.0"?>
<ns:Root xmlns:xi="http://www.w3.org/2003/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns="urn:TestNamespace">

    <xi:include href="Test1.xml" parse="xml" xpointer="element(//ns:Root/ns:element1)"/>

    <xi:include href="Test2.xml" parse="xml" xpointer="element(//ns:Root/ns:element1)"/>

</ns:Root>

If I omit the xpointer attributes in Merge.xml, I get sensible output, but of course it then includes more than just the elements I want.

Can someone offer some advice as to what I am doing wrong with xpointer please?

Thanks in anticipation.

Spathic answered 15/5, 2013 at 10:50 Comment(2)
If I remove the namespaces, the above works, so this just looks to be an issue with XPointer and how I am dealing with the namespaces. - Spathic
The element() scheme does not support qualified names (see w3.org/TR/xptr-element). A name specified with element() must be an NCName and refers to a single element identified with an xs:ID of that name. That's obviously not what you want. - Decompose

I have dabbled with this a bit more and found plenty of examples on the web suggesting that the general approach is right. This is now a working version, which uses the xmlns() and xpointer() schemes instead of element():

<?xml version="1.0"?>

<Root xmlns:xi="http://www.w3.org/2003/XInclude"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="http://testurl.com/now">

    <xi:include href="Test1.xml" xpointer="xmlns(ns=http://testurl.com/now)xpointer(/ns:Root/ns:element1)" parse="xml" />
    <xi:include href="Test2.xml" xpointer="xpointer(//Root/element1)" parse="xml" />

</Root>

This example uses a version of Test1.xml which has namespaces, and Test2.xml which does not.
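The xmlns() part in the first include binds the ns prefix to a namespace URI so that the XPath inside xpointer() can use it. The following is only a rough analogy in Python's standard library (ElementTree), not what xmllint does internally; the inline document is an assumed, trimmed stand-in for the namespaced Test1.xml.

```python
import xml.etree.ElementTree as ET

# Assumed, trimmed stand-in for the namespaced Test1.xml.
TEST1 = """<ns:Root xmlns:ns="http://testurl.com/now">
  <ns:element1 id="001"><ns:element2 id="001.1" order="1"/></ns:element1>
</ns:Root>"""

root = ET.fromstring(TEST1)

# With no prefix binding, "element1" means an element in no namespace,
# and this document does not contain one.
unqualified = root.findall("element1")

# The prefix map plays the role of the xmlns(ns=http://testurl.com/now)
# pointer part: it tells the path language what "ns:" stands for.
prefixes = {"ns": "http://testurl.com/now"}
qualified = root.findall("ns:element1", prefixes)

print(len(unqualified))                  # 0
print([e.get("id") for e in qualified])  # ['001']
```

The prefix used in the pointer does not have to match the prefix used in the included file; only the namespace URI has to match.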

The xmllint output now looks like this:

<?xml version="1.0"?>
<Root xmlns:xi="http://www.w3.org/2003/XInclude" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns="http://testurl.com/now">

    <ns:element1 xmlns:ns="http://testurl.com/now" id="001">
        <ns:element2 id="001.1" order="1">
            <ns:element3 id="001.1.1"/>
        </ns:element2>
        <ns:element2 id="001.2" order="2">
            <ns:element3 id="001.1.2"/>
        </ns:element2>
    </ns:element1><ns:element1 xmlns:ns="http://testurl.com/now" id="003">
        <ns:element2 id="007.0" order="1">
            <ns:element3 id="007.1.1"/>
        </ns:element2>
    </ns:element1><ns:element1 xmlns:ns="http://testurl.com/now" id="002">
        <ns:element2 id="002.1" order="3">
            <ns:element3 id="002.1.1"/>
        </ns:element2>
        <ns:element2 id="002.2" order="4">
            <ns:element3 id="002.1.2"/>
        </ns:element2>
    </ns:element1>
    <element1 id="999">
        <element2 id="999.1" order="1">
            <element3 id="999.1.1"/>
        </element2>
    </element1>

</Root>

This is of course acceptable, although it would be nice if the line breaks between the closing tag of one element1 and the opening tag of the next were still there.

Spathic answered 16/5, 2013 at 7:47 Comment(1)
The line breaks are not part of the elements, so they can't be captured by referencing the elements. Try adding --pretty 1 to the xmllint command line: xmllint --pretty 1 -xinclude Merge.xml. That still will not reproduce the original spacing, but it looks a bit nicer. - Decompose

This works with and without namespaces:

<?xml version="1.0"?>
<ns:Root xmlns:xi="http://www.w3.org/2003/XInclude"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace">

    <xi:include href="Test1.xml" xpointer="xpointer(*/*)" />  
    <xi:include href="Test2.xml" xpointer="xpointer(*/*)" />

</ns:Root>

Also, parse="xml" is the default, so you don't need to specify it.
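To see why \*/\* works regardless of namespaces: assuming the XPath is evaluated from the document root node, the first \* matches the document element and the second \* matches each of its child elements, whatever their names. The sketch below mimics that selection with Python's standard library; it is an illustration under that assumption, not a replacement for xmllint, and the inline documents are trimmed stand-ins for Test1.xml (namespaced) and Test2.xml (here without a namespace).

```python
import xml.etree.ElementTree as ET

NS = "urn:TestNamespace"

# Trimmed stand-ins for the two input files (assumed content).
TEST1 = """<ns:Root xmlns:ns="urn:TestNamespace">
  <ns:element1 id="001"><ns:element2 id="001.1" order="1"/></ns:element1>
</ns:Root>"""
TEST2 = """<Root>
  <element1 id="999"><element2 id="999.1" order="1"/></element1>
</Root>"""


def children_of_document_element(xml_text):
    """Return every child element of the document element.

    This is the set that xpointer(*/*) is assumed to select here:
    the names and namespaces of the elements are never inspected.
    """
    return list(ET.fromstring(xml_text))


# Build the merged document: a fresh root plus the selected children.
merged = ET.Element(f"{{{NS}}}Root")
for doc in (TEST1, TEST2):
    merged.extend(children_of_document_element(doc))

ids = [child.get("id") for child in merged]
print(ids)  # ['001', '999']
```

Because the wildcard ignores names, it also picks up any other children of the root, which is fine for the files in the question but worth keeping in mind for richer documents.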

Decompose answered 12/11, 2016 at 19:8 Comment(0)

For those using Xerces in Java: it only supports xpointer="element(...)" pointers. This is defined at https://www.w3.org/TR/2003/REC-xptr-element-20030325/

It has an example:

For example, the following pointer part identifies the element with an ID (as defined in XPointer Framework) of "intro"

but I failed to understand how the XPointer Framework determines an element's ID, which is described at https://www.w3.org/TR/2003/REC-xptr-framework-20030325/#shorthand

After scanning through https://www.ibiblio.org/xml/books/bible3/chapters/ch18.html
and reading https://xerces.apache.org/xerces2-j/faq-xinclude.html
I realized that it is possible to achieve what you asked for:

<?xml version="1.0"?>
<ns:Root xmlns:xi="http://www.w3.org/2003/XInclude"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ns="urn:TestNamespace">

    <xi:include href="Test1.xml" xpointer="element(/1/1)" />
    <xi:include href="Test2.xml" xpointer="element(/1/1)" />

</ns:Root>

The good thing about this is that, I guess, the element() scheme is supported in more places than the full xpointer() scheme.

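Note: a child sequence is a chain of 1-based positions. The first step, /1, always selects the document element, so /1/1 is the first child element of the root and /1/1/2 is that element's second child. In Test1.xml, /1/1/2 therefore selects the element2 with id 001.2. Each element() pointer identifies a single element, so a file with several element1 children needs one xi:include per child.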
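As I read the element() scheme specification, each step in a child sequence is a 1-based position among the child elements of the previous step, starting from the document root node, whose only child element is the document element. A minimal Python sketch of that resolution follows; the function name resolve_child_sequence is hypothetical, only pure child sequences are handled (no leading NCName), and the inline document is a trimmed copy of Test1.xml.

```python
import xml.etree.ElementTree as ET

TEST1 = """<ns:Root xmlns:ns="urn:TestNamespace">
  <ns:element1 id="001">
    <ns:element2 id="001.1" order="1"><ns:element3 id="001.1.1"/></ns:element2>
    <ns:element2 id="001.2" order="2"><ns:element3 id="001.1.2"/></ns:element2>
  </ns:element1>
</ns:Root>"""


def resolve_child_sequence(xml_text, sequence):
    """Resolve a child sequence such as "/1/1/2".

    Returns the matching element, or None if any step is out of range.
    """
    steps = [int(step) for step in sequence.strip("/").split("/")]
    # The first step counts child elements of the document root node.
    # A well-formed document has exactly one: the document element.
    if steps[0] != 1:
        return None
    node = ET.fromstring(xml_text)
    for position in steps[1:]:
        children = list(node)  # child elements only, in document order
        if not 1 <= position <= len(children):
            return None
        node = children[position - 1]
    return node


print(resolve_child_sequence(TEST1, "/1").tag)            # {urn:TestNamespace}Root
print(resolve_child_sequence(TEST1, "/1/1").get("id"))    # 001
print(resolve_child_sequence(TEST1, "/1/1/2").get("id"))  # 001.2
print(resolve_child_sequence(TEST1, "/1/2"))              # None
```

Since positions replace names, the pointer never needs a namespace prefix, which is why this scheme sidesteps the qualified-name problem from the question.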

Coppice answered 16/9, 2021 at 21:6 Comment(0)
