Using XPath to parse an XML document

Let's say I have the following XML (a quick example):

<rows>
   <row>
      <name>one</name>
   </row>
   <row>
      <name>two</name>
   </row>
</rows>

I am trying to parse this by using XmlDocument and XPath (ultimately so I can make a list of rows).

For example...

XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

foreach(XmlNode row in doc.SelectNodes("//row"))
{
   string rowName = row.SelectSingleNode("//name").InnerText;
}

Why, within my foreach loop, is rowName always "one"? I am expecting it to be "one" on the first iteration and "two" on the second.

It seems that //name gets the first instance in the document, rather than the first instance in the row as I would expect. After all, I am calling the method on the "row" node. If this is "just how it works" then can anybody please explain how I could change it to work to my needs?

Thank you

Lazare answered 28/11, 2011 at 16:36 Comment(1)
How about selecting it by //row/name? Does this work? - Informed
I
19
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

foreach(XmlNode row in doc.SelectNodes("//row"))
{
   var rowName = row.SelectSingleNode("name");
}

Is the code you posted actually correct? I get a compile error on row.SelectNode() as it isn't a member of XmlNode.

Anyway, my example above works, but assumes only a single <name> node within the <row> node so you may need to use SelectNodes() instead of SelectSingleNode() if that is not the case.

As others have shown, use .InnerText to get just the value.

Issue answered 28/11, 2011 at 16:42 Comment(0)
D
5

Use LINQ to XML. Add using System.Xml.Linq; to your code file, then use the following code to get your list:

XDocument xDoc = XDocument.Load(filepath);
IEnumerable<XElement> xNames;

xNames = xDoc.Descendants("name");

That will give you a list of the name elements. Then if you want to turn that into a List<string> just do this:

List<string> list = new List<string>();
foreach (XElement element in xNames)
{
    list.Add(element.Value);
}
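
As a rough sketch of the same idea applied to the XML from the question, the snippet below parses the document from a string and collects one name per row. The variable names (xmlText, names) are made up for illustration, and it is written as a C# 9+ top-level program so it can be run as-is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sample document from the question, inlined as a string (illustrative).
const string xmlText = "<rows><row><name>one</name></row><row><name>two</name></row></rows>";

XDocument xDoc = XDocument.Parse(xmlText);

// For each <row>, take its direct <name> child. The explicit string
// conversion yields null when the element is missing, so those rows are skipped.
List<string> names = xDoc.Descendants("row")
    .Select(row => (string)row.Element("name"))
    .Where(name => name != null)
    .ToList();

Console.WriteLine(string.Join(",", names)); // one,two
```

Element("name") only looks at direct children of each row, which mirrors the relative XPath "name" used in the other answers.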
Downpour answered 28/11, 2011 at 16:44 Comment(0)
H
4

Your second XPath starts with //. This is an abbreviation for /descendant-or-self::node()/, which, as you can see, starts with /. That means it searches from the root of the document, whatever the context node you evaluate it from.

You probably want one of:

var rowName = row.SelectSingleNode("name");

to find the name nodes that are immediate children of the row, or

var rowName = row.SelectSingleNode(".//name");

to find name nodes anywhere under the row. Note the leading . in this second XPath, which causes the expression to start from the context node.
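
To make the difference concrete, here is a minimal sketch that runs all three expressions against the XML from the question. The list names (absolute, child, descendant) are only illustrative, and the snippet is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Same sample document as in the question.
const string xml = "<rows><row><name>one</name></row><row><name>two</name></row></rows>";

var doc = new XmlDocument();
doc.LoadXml(xml);

var absolute = new List<string>();   // results of "//name"
var child = new List<string>();      // results of "name"
var descendant = new List<string>(); // results of ".//name"

foreach (XmlNode row in doc.SelectNodes("//row"))
{
    // "//name" is evaluated from the document root, so the current row is ignored.
    absolute.Add(row.SelectSingleNode("//name").InnerText);
    // "name" matches only direct children of the current row.
    child.Add(row.SelectSingleNode("name").InnerText);
    // ".//name" matches descendants at any depth below the current row.
    descendant.Add(row.SelectSingleNode(".//name").InnerText);
}

Console.WriteLine(string.Join(",", absolute));   // one,one
Console.WriteLine(string.Join(",", child));      // one,two
Console.WriteLine(string.Join(",", descendant)); // one,two
```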

Hodgkin answered 28/11, 2011 at 16:44 Comment(0)
L
3

Use a relative path e.g. string rowName = row.SelectSingleNode("name").InnerText;.

Lenhart answered 28/11, 2011 at 16:43 Comment(0)
D
2

The problem is in your second XPath query:

//name

This has a global scope, so no matter where you call it from, it selects from all name elements in the document.

It should work if you replace your expression with:

.//name
Dore answered 28/11, 2011 at 16:43 Comment(0)
W
2

I would use SelectSingleNode, and then the InnerText property.

var rowName = row.SelectSingleNode("name").InnerText;
Waldon answered 28/11, 2011 at 16:44 Comment(0)
A
0

Use the following:

doc.LoadXml(xml);

foreach(XmlNode row in doc.SelectNodes("/rows/row"))
{
    string rowName = row.SelectSingleNode("//name").InnerText.ToString();
}
Avigdor answered 28/11, 2011 at 16:55 Comment(2)
That will cause the same problem as I had to begin with, sorry. - Lazare
Let me rephrase lol, I got rid of the /rows and just did it as //row and it worked for me. - Avigdor
N
0

Let's take the XML below as an example of fetching data from a document using XPath.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE svg (View Source for full doctype...)>
<!-- Created with AIM. -->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1668.75 1074.75" xmlns:xlink="http://www.w3.org/1999/xlink" preserveAspectRatio="xMidYMid meet" zoomAndPan="magnify" version="1.0" contentScriptType="text/ecmascript" contentStyleType="text/css">
   can't read ": no such element in array
   <g id="#1" class="track" />
   <g id="#5" class="dedication">
      <metadata>
         <meta name="color">Red</meta>
      </metadata>
      <text fill="#181818">AQWSD</text>
   </g>
   <g id="#6" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 1</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">qandadiv</meta>
      </metadata>
      <paramdef fill="none" />
      <text fill="#181818">0.35</text>
   </g>
   <g id="#7" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 2</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">calloutlist</meta>
         <meta name="screen">common></meta>
      </metadata>
      <path fill="none" />
      <text fill="#181818">lineannotation</text>
      <text fill="#181818">WHO</text>
      <paramdef fill="#232323" />
   </g>
   <g id="#" class="wordasword">
      <metadata>
         <meta name="epigraph">Output 3</meta>
         <meta name="color">Red</meta>
         <meta name="refentry">calloutlist</meta>
         <meta name="screen">common></meta>
      </metadata>
      <path fill="none" />
      <text fill="#181818">lineannotation</text>
      <text fill="#181818">WHO</text>
      <paramdef fill="#232323" />
   </g>
</svg>

I have built and tested the code below, and it works correctly.

Below is the run-time value of the XML document above, held in xmlContent.

var xmlContent = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.0//EN\" \"http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd\">\r\n<!--Created with AIM.-->\r\n<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 1668.75 1074.75\">\r\ncan't read \": no such element in array<g id=\"#1\" class=\"track\"></g><g id=\"#5\" class=\"dedication\">\r\n<metadata>\r\n<meta name=\"color\">Red</meta>\r\n</metadata>\r\n<text fill=\"#181818\">AQWSD</text>\r\n</g>\r\n<g id=\"#6\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 1</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">qandadiv</meta>\r\n</metadata>\r\n<paramdef fill=\"none\" />\r\n<text fill=\"#181818\">0.35</text>\r\n</g>\r\n<g id=\"#7\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 2</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">calloutlist</meta>\r\n<meta name=\"screen\">common></meta>\r\n</metadata>\r\n<path fill=\"none\" />\r\n<text fill=\"#181818\">lineannotation</text>\r\n<text fill=\"#181818\">WHO</text>\r\n<paramdef fill=\"#232323\"/>\r\n</g>\r\n<g id=\"#\" class=\"wordasword\">\r\n<metadata>\r\n<meta name=\"epigraph\">Output 3</meta>\r\n<meta name=\"color\">Red</meta>\r\n<meta name=\"refentry\">calloutlist</meta>\r\n<meta name=\"screen\">common></meta>\r\n</metadata>\r\n<path fill=\"none\"/>\r\n<text fill=\"#181818\">lineannotation</text>\r\n<text fill=\"#181818\">WHO</text>\r\n<paramdef fill=\"#232323\"/>\r\n</g>\r\n</svg>";


XmlDocument xml = new XmlDocument();
xml.LoadXml(xmlContent);

// Select all g nodes of class wordasword whose metadata contains a meta element named "color" with the value "Red"
var gNodesOnClassOfColorRed = xml.SelectNodes("//*[local-name()='g'][@class='wordasword'][*[local-name()='metadata'][*[local-name()='meta'][@name='color'] = 'Red']]").Cast<XmlNode>();

foreach (XmlNode gNode in gNodesOnClassOfColorRed)
{
    var metadata = gNode.SelectSingleNode("*[local-name()='metadata']").Cast<XmlNode>();    //Fetch metadata of g tag

    //Fetch epigraph value from meta tag from metadata
    var epigraph = metadata.Cast<XmlNode>()
                    .Where(z => z.Attributes.Count != 0 && z.Attributes.GetNamedItem("name") != null && z.Attributes.GetNamedItem("name").Value.Trim().ToLower() == "epigraph")
                    .Select(p => p.InnerText).FirstOrDefault();

    Console.WriteLine(epigraph);
} 

The code above fetches the epigraph value from the metadata of each matching g element. Because Console.WriteLine is called once per element, the values are printed one per line:

Output 1
Output 2
Output 3

The code below fetches the list of text elements for every g element of class wordasword that contains at least one text element, using the same xml document as above:

var elementList = (XmlNodeList)xml.SelectNodes("//*[local-name()='g'][@class='wordasword'][*[local-name()='text']]");

foreach (XmlNode xmlNode in elementList)     //g
{
    XmlNodeList textList = (XmlNodeList)xmlNode.SelectNodes("*[local-name()='text']");
}  
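
The local-name() tests above are needed because the svg element declares a default namespace, so an unprefixed name such as g in XPath would not match. An alternative, sketched below under the assumption that the namespace URI is known in advance, is to register a prefix with XmlNamespaceManager and use it in the expression. The trimmed-down document, the prefix "s", and the variable names are hypothetical stand-ins, and the snippet is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Trimmed-down stand-in for the SVG document above (hypothetical sample data).
const string svg =
    "<svg xmlns=\"http://www.w3.org/2000/svg\">" +
    "<g class=\"wordasword\"><text>0.35</text></g>" +
    "<g class=\"track\"><text>ignored</text></g>" +
    "<g class=\"wordasword\"><text>lineannotation</text><text>WHO</text></g>" +
    "</svg>";

var svgDoc = new XmlDocument();
svgDoc.LoadXml(svg);

// Map an arbitrary prefix ("s" here) to the SVG namespace URI.
var ns = new XmlNamespaceManager(svgDoc.NameTable);
ns.AddNamespace("s", "http://www.w3.org/2000/svg");

var texts = new List<string>();
foreach (XmlNode g in svgDoc.SelectNodes("//s:g[@class='wordasword'][s:text]", ns))
{
    // Relative path: only text elements directly under the current g.
    foreach (XmlNode text in g.SelectNodes("s:text", ns))
    {
        texts.Add(text.InnerText);
    }
}

Console.WriteLine(string.Join(",", texts)); // 0.35,lineannotation,WHO
```

The prefixed form is usually easier to read than chained local-name() predicates, but it ties the query to one specific namespace, whereas local-name() matches the element name in any namespace.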
Nardi answered 17/1 at 18:51 Comment(0)
