Extracting data from XML using OpenXML in SQL Server
Asked Answered
A

3

5

I have a xml which I want it to be extracted using OpenXML within SQL Server

Here is the sample XML

<row>
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  <alien_num xsi:nil="true" />
  <last_name>A</last_name>
  <first_name>B</first_name>
  <middle_init xsi:nil="true" />
  <drivers_license_num xsi:nil="true" />
  <gpa_highschool xsi:nil="true" />
  <created_dt>2006-07-13T11:15:08.320</created_dt>
  <created_how>4</created_how>
  <modified_dt>2008-02-14T00:00:00</modified_dt>
  <modified_by>4</modified_by>
  <primary_street2 xsi:nil="true" />
  <primary_street3 xsi:nil="true" />
  <primary_country xsi:nil="true" />
  <email_address xsi:nil="true" />
  <address_start_dt xsi:nil="true" />
  <address_end_dt xsi:nil="true" />
  <entrance_iv_dt xsi:nil="true" />
  <entrance_iv_by xsi:nil="true" />
  <exit_iv_dt>2006-11-02T00:00:00</exit_iv_dt>
  <exit_iv_by>156</exit_iv_by>
  <foreign_address_indicator>N</foreign_address_indicator>
  <foreign_postal_code xsi:nil="true" />
  <pin>J27841</pin>
  <web_id>J08614   </web_id>
  <prior_name xsi:nil="true" />
  <orig_eps xsi:nil="true" />
  <web_role>STU1</web_role>
  <heal_limit_flag>N</heal_limit_flag>
  <email_address_2>[email protected]</email_address_2>
  <cellular_telephone>415</cellular_telephone>
  <alt_loan_debt xsi:nil="true" />
  <web_last_login xsi:nil="true" />
  <foreign_country_code xsi:nil="true" />
  <entrance_iv_dt_grad_plus xsi:nil="true" />
  <entrance_iv_by_grad_plus xsi:nil="true" />
  <failed_logins>0</failed_logins>
  <hispanic xsi:nil="true" />
  <race xsi:nil="true" />
  <primary_phone_number_intl xsi:nil="true" />
  <security_version>0</security_version>
  <failed_challenge_response>0</failed_challenge_response>
  <require_pin_reset xsi:nil="true" />
</row>

The query should extract into 3 fields for each row

  1. FieldName
  2. FieldValue
  3. IsNull

For example the first row should be

  • FieldName = student_token - The node name would be the field name
  • FieldValue = 7
  • IsNull = false - IsNull is based on the attribute xsi:nil="true"

How can I do this?

Appurtenant answered 6/3, 2012 at 14:13 Comment(3)
The tag name is the FieldName and ISNull is based on the Attribute xsi:nil="true"Appurtenant
The result should take the element name from each node and return that as the FieldName. For example the first row fieldname would be student_token and the next one would be student_ssn and the next would be alternate_id ....Appurtenant
You mean use the first element name and use that as the FieldName? There are multiple xsi:nil.Jesuitism
P
12

Sample data with namespace added.

declare @xml xml 
set @xml = 
'<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  <alien_num xsi:nil="true" />
  <last_name>A</last_name>
  <first_name>B</first_name>
  <middle_init xsi:nil="true" />
  <drivers_license_num xsi:nil="true" />
  <gpa_highschool xsi:nil="true" />
  <created_dt>2006-07-13T11:15:08.320</created_dt>
  <created_how>4</created_how>
  <modified_dt>2008-02-14T00:00:00</modified_dt>
  <modified_by>4</modified_by>
  <primary_street2 xsi:nil="true" />
  <primary_street3 xsi:nil="true" />
  <primary_country xsi:nil="true" />
  <email_address xsi:nil="true" />
  <address_start_dt xsi:nil="true" />
  <address_end_dt xsi:nil="true" />
  <entrance_iv_dt xsi:nil="true" />
  <entrance_iv_by xsi:nil="true" />
  <exit_iv_dt>2006-11-02T00:00:00</exit_iv_dt>
  <exit_iv_by>156</exit_iv_by>
  <foreign_address_indicator>N</foreign_address_indicator>
  <foreign_postal_code xsi:nil="true" />
  <pin>J27841</pin>
  <web_id>J08614   </web_id>
  <prior_name xsi:nil="true" />
  <orig_eps xsi:nil="true" />
  <web_role>STU1</web_role>
  <heal_limit_flag>N</heal_limit_flag>
  <email_address_2>[email protected]</email_address_2>
  <cellular_telephone>415</cellular_telephone>
  <alt_loan_debt xsi:nil="true" />
  <web_last_login xsi:nil="true" />
  <foreign_country_code xsi:nil="true" />
  <entrance_iv_dt_grad_plus xsi:nil="true" />
  <entrance_iv_by_grad_plus xsi:nil="true" />
  <failed_logins>0</failed_logins>
  <hispanic xsi:nil="true" />
  <race xsi:nil="true" />
  <primary_phone_number_intl xsi:nil="true" />
  <security_version>0</security_version>
  <failed_challenge_response>0</failed_challenge_response>
  <require_pin_reset xsi:nil="true" />
</row>'

Using openxml.

declare @idoc int
exec sp_xml_preparedocument @idoc out, @xml, '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>'

select FieldName, 
       FieldValue,
       isnull([IsNull], 0)
from openxml(@idoc, '/row/*',1) 
  with (
         FieldName  varchar(50) '@mp:localname',
         FieldValue varchar(50) '.',
         [IsNull]   bit         '@xsi:nil'
       )

exec sp_xml_removedocument @idoc

Using the XML data type:

;with xmlnamespaces('http://www.w3.org/2001/XMLSchema-instance' as ns)
select T.N.value('local-name(.)', 'varchar(50)') as FieldName,
       T.N.value('.', 'varchar(50)') as FieldValue,
       isnull(T.N.value('@ns:nil', 'bit'), 0) as [IsNull]
from @xml.nodes('/row/*') as T(N)
Papaw answered 6/3, 2012 at 14:37 Comment(2)
@marc_s - Is the URL the same in the XML as in the sp_xml_preparedocument statement?Papaw
oops - no! :-) I had set it myself to just whatever - of course that doesn't match up with your setting! :-) Sorry, my bad..... works like a charm now!Warga
W
4

Not sure if you have that XML as a SQL variable or inside a table - question is very unclear .....

If you have it as an SQL variable, then try something like this (note: you must declare the xsi prefix somehow - otherwise SQL Server's XML processor won't even look at your XML document):

DECLARE @input XML = '<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <student_token>7</student_token>
  <student_ssn>552</student_ssn>
  <alternate_id>20</alternate_id>
  <old_ssn xsi:nil="true" />
  .........
</row>'

;WITH XMLNAMESPACES('http://www.w3.org/2001/XMLSchema-instance' as xsi)
SELECT
FieldName = T.C.value('local-name(.)', 'varchar(50)'),
FieldValue = T.C.value('(.)[1]', 'varchar(500)'),
IsNIL = ISNULL(T.C.value('(@xsi:nil)[1]', 'bit'), 0)
FROM
@Input.nodes('/row/*') AS T(C)

This gives me an output something like:

FieldName      FieldValue  IsNIL
student_token    7          0
student_ssn      552        0
alternate_id     20         0
old_ssn                     1
.....

Of course, all output is going to be of type varchar(500) now in the FieldValue column....

Updated my answer, based on Mikael Eriksson's answer, to include the IsNIL handling, too. Thanks Mikael for the inspiration! You deserve the nod and the accept vote!

Warga answered 6/3, 2012 at 14:35 Comment(3)
The reason for using OpenXml was performance. Isn't this slower than openxml?Appurtenant
@Appurtenant - I think that would be an interesting question on SO. It's been on my todo list for a while to check what is the fastest. I have read that they should be almost the same but ... needs some testing.Papaw
@Rush: no, not necessarily. Both approaches have their pros and cons - but the OpenXML is definitely more resource intensive (with the call to sp_xml_preparedocument) and less intuitive (at least to me)Warga
G
0

You can convert your xml file to json and then use OPENJSON instead. Take a look at openjson : SELECT star on how to use OPENJSON without having to list column names.

To convert an xml file to json you can use sp_execute_external_script and the xmltodict python module. Take a look at Importing Python Libraries to SQL Server to import the python module and deal with YAPI (Yet Another Python Install) issues.

Giddy answered 12/5, 2020 at 15:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.