How does one parse XML files? [closed]
B

12

491

Is there a simple method of parsing XML files in C#? If so, what?

Built answered 11/9, 2008 at 5:4 Comment(4)
you could use this implementation: https://mcmap.net/q/75438/-parsing-xml-data-in-c-and-show-into-listboxValerio
Ok, I reopened this. The duplicate was an XML Reader solution, whereas this is about parsing XML files. The possible duplicate can be seen in the question's edit history ps @GeorgeStockerDeakin
@JeremyThompson One of the reasons why this was a duplicate is the other question has a much better answer. The top answer being a simple "link only" answer is not useful.Defy
@GeorgeStocker the questions are different enough to co-exist and both have great answers, plus the accepted ones are using different technologies. That's why I voted we leave this open, I know this accepted one is link only but it is MSDN and was written at a time before that was unacceptable, hopefully a side effect of reopening is cheering Jon up a bit, read his profile. Anyway cheers.Deakin
R
255

I'd use LINQ to XML if you're in .NET 3.5 or higher.
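As a minimal sketch of what that can look like, the snippet below finds an element by an identifier attribute. The `item` element name, the `id` attribute, and the `FindItemById` helper are hypothetical names chosen for illustration; they are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlSketch
{
    // Returns the first <item> whose id attribute equals the given value, or null if none matches.
    // The (string) cast on an XAttribute yields null when the attribute is missing.
    public static XElement FindItemById(XDocument doc, string id)
    {
        return doc.Descendants("item")
                  .FirstOrDefault(e => (string)e.Attribute("id") == id);
    }

    public static void Main()
    {
        // Hypothetical inline document; XDocument.Load("yourXMLFile.xml") would read a file instead.
        XDocument doc = XDocument.Parse(
            "<items><item id=\"a1\">First</item><item id=\"b2\">Second</item></items>");

        XElement match = FindItemById(doc, "b2");
        Console.WriteLine(match?.Value); // prints "Second"
    }
}
```

`FirstOrDefault` is used rather than `First` so that a missing identifier gives `null` instead of an exception.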

Ryter answered 11/9, 2008 at 5:5 Comment(2)
I tried using this but was unable to figure out how to get something as simple as the value of the identifier of a certain element in my XML (or how to get an element by identifier, for that matter). In contrast, using XmlDocument I was able to do that with minimal effort.Centi
I'm with you Kira. I have yet to grok LINQ to XML, and considering the overhead of LINQ, I've never made the effort to figure it out. Doesn't seem intuitive to me.Groggery
K
336

It's very simple. I know these are standard methods, but you can create your own library to deal with that much better.

Here are some examples:

XmlDocument xmlDoc = new XmlDocument(); // Create an XML document object
xmlDoc.Load("yourXMLFile.xml"); // Load the XML document from the specified file

// Get elements
XmlNodeList girlAddress = xmlDoc.GetElementsByTagName("gAddress");
XmlNodeList girlAge = xmlDoc.GetElementsByTagName("gAge"); 
XmlNodeList girlCellPhoneNumber = xmlDoc.GetElementsByTagName("gPhone");

// Display the results
Console.WriteLine("Address: " + girlAddress[0].InnerText);
Console.WriteLine("Age: " + girlAge[0].InnerText);
Console.WriteLine("Phone Number: " + girlCellPhoneNumber[0].InnerText);

There are also other methods to work with. For example, here. I don't think there is one best method; choose whichever is most suitable for your situation.

Karlotte answered 11/9, 2008 at 5:17 Comment(6)
+1 for mentioning XmlDocument, which is much more convenient than serialisation interfaces in some cases. If you are after one specific element, you can access child elements with the indexer: xmlDoc["Root"], and these can be chained: xmlDoc["Root"]["Folder"]["Item"] to dig down the hierarchy (although it's sensible to validate that these elements actually exist)Bossuet
InnerText here gets the value of that node, concatenated with all values of child nodes - right? Seems like an odd thing to want.Thoroughbred
A programmer with a list of female friends? Shenanigans!Nambypamby
@E.vanPutten not in this day and age. This is not Revenge of the NerdsConsecution
@DonCheadle If you aren't expecting there to be any child nodes, then InnerText will just return the node value - which is what I (and probably everyone else reading this question) am parsing the XML to find in the first place.Shaia
I think this should be the accepted answer. XmlDocument is just that much more convenient to use for "basic" situations.Centi
T
50

Use a good XSD Schema to create a set of classes with xsd.exe and use an XmlSerializer to create an object tree out of your XML and vice versa. If you have few restrictions on your model, you could even try to create a direct mapping between your model classes and the XML with the Xml*Attributes.

There is an introductory article about XML Serialisation on MSDN.

Performance tip: Constructing an XmlSerializer is expensive. Keep a reference to your XmlSerializer instance if you intend to parse/write multiple XML files.
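A rough sketch of the attribute-based mapping with a reused serializer might look like the following. The `Book` class, its members, and the `BookXml` helper are hypothetical and only illustrate the idea; a real model would follow your own schema.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical model class; the attributes map members to XML names.
[XmlRoot("book")]
public class Book
{
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlElement("title")]
    public string Title { get; set; }
}

public static class BookXml
{
    // Constructed once and reused for every call, following the performance tip above.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Book));

    // Builds a Book from an XML string.
    public static Book Parse(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Book)Serializer.Deserialize(reader);
        }
    }

    // Writes a Book back out as an XML string.
    public static string Write(Book book)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, book);
            return writer.ToString();
        }
    }
}
```

For example, `BookXml.Parse("<book id=\"42\"><title>Example</title></book>")` should give a `Book` with `Id` set to `"42"` and `Title` set to `"Example"`.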

Tiffanietiffanle answered 11/9, 2008 at 6:16 Comment(2)
See codeproject.com/KB/cs/xsdtidy.aspx and blog.dotnetwiki.org/XsdTidyXSDMappingBeautifier.aspxTiffanietiffanle
Good example is the "Purchase Order Example" in the middle of this example from microsoft. msdn.microsoft.com/en-us/library/58a18dwa.aspx. You avoid having to create a schema -- your c# class is the schema, adorned with C# attributes.Unreadable
L
26

If you're processing a large amount of data (many megabytes) then you want to be using XmlReader to stream parse the XML.

Anything else (XPathNavigator, XElement, XmlDocument and even XmlSerializer if you keep the full generated object graph) will result in high memory usage and also a very slow load time.

Of course, if you need all the data in memory anyway, then you may not have much choice.
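To illustrate the streaming approach, here is a small sketch that counts elements without building a document tree. The `CountElements` helper and the `entry` element name are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingSketch
{
    // Counts elements with the given local name while reading forward through the input.
    // Only the current node is held by the reader, not the whole document.
    public static int CountElements(TextReader input, string elementName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical inline input; for a large file, pass a StreamReader over the file instead.
        string xml = "<log><entry/><entry/><other/></log>";
        Console.WriteLine(CountElements(new StringReader(xml), "entry")); // prints 2
    }
}
```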

Littell answered 11/9, 2008 at 7:48 Comment(0)
R
18

Use XmlTextReader, XmlReader, XmlNodeReader and the System.Xml.XPath namespace (XPathNavigator, XPathDocument, XPathExpression, XPathNodeIterator).

Usually XPath makes reading XML easier, which is what you might be looking for.
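As a minimal sketch of the XPath route, the helper below collects the text values of all nodes matching an expression. `SelectValues` and the sample document are hypothetical names and data used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class XPathSketch
{
    // Returns the string value of every node selected by the XPath expression.
    public static List<string> SelectValues(TextReader input, string xpath)
    {
        var doc = new XPathDocument(input);           // read-only document optimized for XPath queries
        XPathNavigator navigator = doc.CreateNavigator();
        XPathNodeIterator matches = navigator.Select(xpath);

        var values = new List<string>();
        while (matches.MoveNext())
        {
            values.Add(matches.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        string xml = "<library><book><title>A</title></book><book><title>B</title></book></library>";
        foreach (string title in SelectValues(new StringReader(xml), "/library/book/title"))
        {
            Console.WriteLine(title);
        }
    }
}
```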

Rennie answered 11/9, 2008 at 5:12 Comment(1)
FYI, you should not use new XmlTextReader() or new XmlTextWriter(). They have been deprecated since .NET 2.0. Use XmlReader.Create() or XmlWriter.Create() instead.Sweetie
T
12

I have just recently been required to work on an application which involved the parsing of an XML document and I agree with Jon Galloway that the LINQ to XML based approach is, in my opinion, the best. I did however have to dig a little to find usable examples, so without further ado, here are a few!

Any comments welcome as this code works but may not be perfect and I would like to learn more about parsing XML for this project!

// Requires: using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
public void ParseXML(string filePath)
{  
    // create document instance using XML file path
    XDocument doc = XDocument.Load(filePath);

    // get the default namespace declared in the XML (xmlns="...")
    XElement root = doc.Root;
    XNamespace ns = root.GetDefaultNamespace();

    // obtain a list of elements with specific tag
    IEnumerable<XElement> elements = from c in doc.Descendants(ns + "exampleTagName") select c;

    // obtain a single element with specific tag (first instance), useful if only expecting one instance of the tag in the target doc
    XElement element = (from c in doc.Descendants(ns + "exampleTagName") select c).First();

    // obtain an element from within an element, same as from doc
    XElement embeddedElement = (from c in element.Descendants(ns + "exampleEmbeddedTagName") select c).First();

    // obtain an attribute from an element
    XAttribute attribute = element.Attribute("exampleAttributeName");
}

With these functions I was able to parse any element and any attribute from an XML file no problem at all!

Thermomotor answered 23/2, 2018 at 15:38 Comment(0)
F
10

In addition, you can use an XPath selector in the following way (an easy way to select specific nodes):

XmlDocument doc = new XmlDocument();
doc.Load("test.xml");

var found = doc.DocumentElement.SelectNodes("//book[@title='Barry Poter']"); // select all Book elements in whole dom, with attribute title with value 'Barry Poter'

// Retrieve your data here or change XML here:
foreach (XmlNode book in found)
{
  book.InnerText="The story began as it was...";
}

Console.WriteLine("Display XML:");
doc.Save(Console.Out);

the documentation

Frisco answered 17/10, 2017 at 12:19 Comment(0)
G
8

If you're using .NET 2.0, try XmlReader and its subclasses XmlTextReader and XmlValidatingReader. They provide a fast, lightweight (memory usage, etc.), forward-only way to parse an XML file.

If you need XPath capabilities, try the XPathNavigator. If you need the entire document in memory try XmlDocument.

Gensmer answered 11/9, 2008 at 5:12 Comment(0)
S
6

I'm not sure whether "best practice for parsing XML" exists. There are numerous technologies suited for different situations. Which way to use depends on the concrete scenario.

You can go with LINQ to XML, XmlReader, XPathNavigator or even regular expressions. If you elaborate on your needs, I can try to give some suggestions.

Summon answered 11/9, 2008 at 5:11 Comment(1)
regex for xml. you monster.Enslave
R
3

You can parse the XML using the System.Xml.Linq library. Below is the sample code I used to parse an XML file:

public CatSubCatList GenerateCategoryListFromProductFeedXML()
{
    string path = System.Web.HttpContext.Current.Server.MapPath(_xmlFilePath);

    XDocument xDoc = XDocument.Load(path);

    XElement xElement = xDoc.Root; // use the loaded root directly instead of re-parsing the document text


    List<Category> lstCategory = xElement.Elements("Product").Select(d => new Category
    {
        Code = Convert.ToString(d.Element("CategoryCode").Value),
        CategoryPath = d.Element("CategoryPath").Value,
        Name = GetCateOrSubCategory(d.Element("CategoryPath").Value, 0), // Category
        SubCategoryName = GetCateOrSubCategory(d.Element("CategoryPath").Value, 1) // Sub Category
    }).GroupBy(x => new { x.Code, x.SubCategoryName }).Select(x => x.First()).ToList();

    CatSubCatList catSubCatList = GetFinalCategoryListFromXML(lstCategory);

    return catSubCatList;
}
Radiosonde answered 18/7, 2017 at 20:41 Comment(0)
C
1

You can use ExtendedXmlSerializer to serialize and deserialize.

Installation: You can install ExtendedXmlSerializer from NuGet or run the following command:

Install-Package ExtendedXmlSerializer

Serialization:

ExtendedXmlSerializer serializer = new ExtendedXmlSerializer();
var obj = new Message();
var xml = serializer.Serialize(obj);

Deserialization:

var obj2 = serializer.Deserialize<Message>(xml);

The standard XML serializer in .NET is very limited:

  • Does not support serialization of class with circular reference or class with interface property,
  • Does not support Dictionaries,
  • There is no mechanism for reading the old version of XML,
  • If you want to create a custom serializer, your class must implement IXmlSerializable. This means that your class will not be a POCO class,
  • Does not support IoC.

ExtendedXmlSerializer can do this and much more.

ExtendedXmlSerializer supports .NET 4.5 or higher and .NET Core. You can integrate it with WebApi and AspCore.

Canakin answered 16/2, 2017 at 14:26 Comment(0)
F
1

You can use XmlDocument, and for manipulating or retrieving data from attributes you can use the LINQ to XML classes.

Faus answered 16/11, 2017 at 12:27 Comment(0)
