What I Have
I have a variable size XML document that needs to be parsed on MSSQL 2008 R2 that looks like this:
<data item_id_type="1" cfgid="{4F5BBD5E-72ED-4201-B741-F6C8CC89D8EB}" has_data_event="False">
<item name="1">
<field id="{EA032B25-19F1-4C1B-BDDE-3113542D13A5}" type="2">0.506543009706267</field>
<field id="{71014ACB-571B-4C72-9C9B-05458B11335F}" type="2">-0.79500402346138</field>
<field id="{740C36E9-1988-413E-A1D5-B3E5B4405B45}" type="2">0.0152649050024924</field>
</item>
<item name="2">
<field id="{EA032B25-19F1-4C1B-BDDE-3113542D13A5}" type="2">0.366096802804087</field>
<field id="{71014ACB-571B-4C72-9C9B-05458B11335F}" type="2">-0.386642801354842</field>
<field id="{740C36E9-1988-413E-A1D5-B3E5B4405B45}" type="2">0.031671174184115</field>
</item>
</data>
.
What I Want
I need to transform it into a regular table type dataset that looks like this:
item_name field_id field_type field_value
--------- ------------------------------------ ----------- ---------------
1 EA032B25-19F1-4C1B-BDDE-3113542D13A5 2 0.5065430097062
1 71014ACB-571B-4C72-9C9B-05458B11335F 2 -0.795004023461
1 740C36E9-1988-413E-A1D5-B3E5B4405B45 2 0.0152649050024
2 EA032B25-19F1-4C1B-BDDE-3113542D13A5 2 0.3660968028040
2 71014ACB-571B-4C72-9C9B-05458B11335F 2 -0.386642801354
2 740C36E9-1988-413E-A1D5-B3E5B4405B45 2 0.0316711741841
3 EA032B25-19F1-4C1B-BDDE-3113542D13A5 2 0.8839620369590
3 71014ACB-571B-4C72-9C9B-05458B11335F 2 -0.781459993268
3 740C36E9-1988-413E-A1D5-B3E5B4405B45 2 0.2284423515729
.
What Works
This cross apply
query creates the desired output:
create table #temp (x xml)
insert into #temp (x)
values ('
<data item_id_type="1" cfgid="{4F5BBD5E-72ED-4201-B741-F6C8CC89D8EB}" has_data_event="False">
<item name="1">
<field id="{EA032B25-19F1-4C1B-BDDE-3113542D13A5}" type="2">0.506543009706267</field>
<field id="{71014ACB-571B-4C72-9C9B-05458B11335F}" type="2">-0.79500402346138</field>
<field id="{740C36E9-1988-413E-A1D5-B3E5B4405B45}" type="2">0.0152649050024924</field>
</item>
<item name="2">
<field id="{EA032B25-19F1-4C1B-BDDE-3113542D13A5}" type="2">0.366096802804087</field>
<field id="{71014ACB-571B-4C72-9C9B-05458B11335F}" type="2">-0.386642801354842</field>
<field id="{740C36E9-1988-413E-A1D5-B3E5B4405B45}" type="2">0.031671174184115</field>
</item>
<item name="3">
<field id="{EA032B25-19F1-4C1B-BDDE-3113542D13A5}" type="2">0.883962036959074</field>
<field id="{71014ACB-571B-4C72-9C9B-05458B11335F}" type="2">-0.781459993268713</field>
<field id="{740C36E9-1988-413E-A1D5-B3E5B4405B45}" type="2">0.228442351572923</field>
</item>
</data>
')
select c.value('(../@name)','varchar(5)') as item_name
,c.value('(@id)','uniqueidentifier') as field_id
,c.value('(@type)','int') as field_type
,c.value('(.)','nvarchar(15)') as field_value
from #temp cross apply
#temp.x.nodes('/data/item/field') as y(c)
drop table #temp
.
Problem
When there are a few hundred (or fewer) <item>
elements in the XML, the query performs just fine. However, when there are 1,000 <item>
elements, it takes 24 seconds to finish returning the rows in SSMS. When there are 6,500 <item>
elements, it takes about 20 minutes to run the cross apply
query. We could have 10-20,000 <item>
elements.
.
Questions
What makes the cross apply
query perform so poorly on this simple XML document, and perform exponentially slower as the dataset grows?
Is there a more efficient way to transform the XML document into the tabular dataset (in SQL)?