Finding node order in XML document in SQL Server
Asked Answered
M

6

19

How can I find the order of nodes in an XML document?

What I have is a document like this:

<value code="1">
    <value code="11">
        <value code="111"/>
    </value>
    <value code="12">
        <value code="121">
            <value code="1211"/>
            <value code="1212"/>
        </value>
    </value>
</value>

and I'm trying to get this thing into a table defined like

CREATE TABLE values(
    code int,
    parent_code int,
    ord int
)

Preserving the order of the values from the XML document (they can't be ordered by their code). I want to be able to say

SELECT code 
FROM values 
WHERE parent_code = 121 
ORDER BY ord

and the results should, deterministically, be

code
1211
1212

I have tried

SELECT 
    value.value('@code', 'varchar(20)') code, 
    value.value('../@code', 'varchar(20)') parent, 
    value.value('position()', 'int')
FROM @xml.nodes('/root//value') n(value)
ORDER BY code desc

But it doesn't accept the position() function ('position()' can only be used within a predicate or XPath selector).

I guess it's possible some way, but how?

Mylohyoid answered 15/7, 2009 at 21:3 Comment(3)
Do you have finite depth of nodes? If not, it's going to be a pain. And to confirm: you can not rely on codes?Legere
...and what output do you want from the xml?Legere
I updated the question to provide more information. And no, there is an infinite depth.Mylohyoid
H
42

You can emulate the position() function by counting the number of sibling nodes preceding each node:

SELECT
    code = value.value('@code', 'int'),
    parent_code = value.value('../@code', 'int'),
    ord = value.value('for $i in . return count(../*[. << $i]) + 1', 'int')
FROM @Xml.nodes('//value') AS T(value)

Here is the result set:

code   parent_code  ord
----   -----------  ---
1      NULL         1
11     1            1
111    11           1
12     1            2
121    12           1
1211   121          1
1212   121          2

How it works:

  • The for $i in . clause defines a variable named $i that contains the current node (.). This is basically a hack to work around XQuery's lack of an XSLT-like current() function.
  • The ../* expression selects all siblings (children of the parent) of the current node.
  • The [. << $i] predicate filters the list of siblings to those that precede (<<) the current node ($i).
  • We count() the number of preceding siblings and then add 1 to get the position. That way the first node (which has no preceding siblings) is assigned a position of 1.
Hoang answered 25/3, 2012 at 19:12 Comment(1)
I used this code on a rather large XML file and because the for $i in . return count(../*[. << $i]) + 1 part traverses all "sibling" nodes preceding every node this took forever (we let it run at work while going home, it had crashed the next day). So be forewarned that this code has an O(n^2) efficiency.Gilbert
G
7

SQL Server's row_number() actually accepts an xml-nodes column to order by. Combined with a recursive CTE you can do this:

declare @Xml xml = 
'<value code="1">
    <value code="11">
        <value code="111"/>
    </value>
    <value code="12">
        <value code="121">
            <value code="1211"/>
            <value code="1212"/>
        </value>
    </value>
</value>'

;with recur as (
    select
        ordr        = row_number() over(order by x.ml),
        parent_code = cast('' as varchar(255)),
        code        = x.ml.value('@code', 'varchar(255)'),
        children    = x.ml.query('./value')
    from @Xml.nodes('value') x(ml)
    union all
    select
        ordr        = row_number() over(order by x.ml),
        parent_code = recur.code,
        code        = x.ml.value('@code', 'varchar(255)'),
        children    = x.ml.query('./value')
    from recur
    cross apply recur.children.nodes('value') x(ml)
)
select *
from recur
where parent_code = '121'
order by ordr

As an aside, you can do this and it'll do what do you expect:

select x.ml.query('.')
from @Xml.nodes('value/value')x(ml)
order by row_number() over (order by x.ml)

Why, if this works, you can't just order by x.ml directly without row_number() over is beyond me.

Gilbert answered 12/5, 2015 at 8:5 Comment(0)
I
3

The answer by erikkallen is absolutely correct.

However, if the original document/schema may be modified, an alternative is to store the position/index in an attribute. I use a mix of both approaches, depending who the "originator" of the XML is and the type of queries that need to be performed upon it. At the end of the day I rue most use of XML except possibly "dumb storage" in SQL Server and am usually happy when I can dump it (XML) for normalized tables.

Happy dealing with the unmentioned limitations of "enterprise-grade" products -- the wonders never end.

Irritant answered 12/1, 2011 at 6:12 Comment(1)
+1 for your "rue most use of XML". It really sucks when you have to select nodes much less update them.Untouched
R
3

You can get the position of the xml returned by a x.nodes() function like so:

row_number() over (order by (select 0))

For example:

DECLARE @x XML
SET @x = '<a><b><c>abc1</c><c>def1</c></b><b><c>abc2</c><c>def2</c></b></a>'

SELECT
    b.query('.'),
    row_number() over (partition by 0 order by (select 0))
FROM
    @x.nodes('/a/b') x(b)
Regardless answered 14/2, 2014 at 10:38 Comment(1)
There is no guarantee that the order will match what is desiredBuehrer
M
2

According to this document and this connect entry it is not possible, but the Connect entry contains two workarounds.

I do it like this:

WITH n(i) AS (SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9),
     o(i) AS (SELECT n3.i * 100 + n2.i * 10 + n1.i FROM n n1, n n2, n n3)
SELECT v.value('@code', 'varchar(20)') AS code,
       v.value('../@code', 'varchar(20)') AS parent,
       o.i AS ord
  FROM o
 CROSS APPLY @xml.nodes('/root//value[sql:column("o.i")]') x(v)
 ORDER BY o.i
Mylohyoid answered 15/7, 2009 at 22:4 Comment(1)
Every-time I try to see if there is a good way to perform this I always feel like crying. It's the only way I have found (actually, I use a numbers table, but same ugly hack) -- it's an absolutely pathetic excuse for a server which "supports XML" and makes simple shredding and access much more complicated than it needs to be.Irritant
S
-1

I see answer by @Ben and... get new sollution

 row_number() over (order by (select null))

as

  SELECT value.value('@code', 'varchar(20)') code, 
  value.value('../@code', 'varchar(20)') parent, 
  row_number() over (order by (select null))
  FROM @xml.nodes('/root//value') n(value)
Seidler answered 7/6, 2016 at 14:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.