Remove empty/blank elements in a collection of XML nodes

I have an XML document like this:

<magento_api>
    <data_item>
        <code>400</code>
        <message>Attribute weight is not applicable for product type Configurable Product</message>
    </data_item>
    <data_item>
        <code>400</code>
        <message>Resource data pre-validation error.</message>
    </data_item>
    <data_item>
        <code>1</code>
        <message></message>
    </data_item>
    <data_item>
        <code></code>
        <message>No code was given</message>
    </data_item>
</magento_api>

I'm trying to iterate each node and do the following:

  1. Throw out any elements that are empty/blank.
  2. Generate a new node containing only the elements that have values.
  3. Send the resulting doc to a different web service.

The part I'm struggling with is how to iterate through each node and check each element for null values.

I've been testing this code out at http://rextester.com/runcode but can't seem to figure it out:

Console.WriteLine("Querying tree loaded with XElement.Load");
Console.WriteLine("----");
XElement doc = XElement.Parse(@"<magento_api>
          <data_item>
            <code>400</code>
            <message>Attribute weight is not applicable for product type Configurable Product</message>
          </data_item>
          <data_item>
            <code>400</code>
            <message>Resource data pre-validation error.</message>
          </data_item>
          <data_item>
            <code>1</code>
            <message></message>
          </data_item>
          <data_item>
            <code></code>
            <message>No code was given</message>
          </data_item>
    </magento_api>");

int counter = 1;
IEnumerable<XNode> nodes =
    from nd in doc.Nodes()
    select nd;
foreach (XNode node in nodes)
{
    Console.WriteLine(counter + "-" + node);
    IEnumerable<XElement> elements =
    from el in node //this is where I've been trying various methods, but no dice.
    select el;
    foreach (XElement e in elements)
    {
           Console.WriteLine(counter + "-" + e.Name + "-" + e.Value + "\r\n");
    }
    counter++;
}

Based on the above XML input, I'm hoping to get the following output:

<magento_api>
    <data_item>
        <code>400</code>
        <message>Attribute weight is not applicable for product type Configurable Product</message>
    </data_item>
    <data_item>
        <code>400</code>
        <message>Resource data pre-validation error.</message>
    </data_item>
    <data_item>
        <code>1</code>
    </data_item>
    <data_item>
        <message>No code was given</message>
    </data_item>
</magento_api>

I'm not sure if I'm using the right methods to iterate over the nodes and elements.

Brezhnev answered 24/1, 2013 at 19:28 Comment(3)
What do you mean by "elements that are NULL"? Also note that you're using query expressions for no purpose here - for example, instead of writing from el in node select el you can just use node later... - Hygienic
@JonSkeet - I just mean elements that are blank/empty. Isn't that the same as NULL? - Brezhnev
Not really - there's no such concept as "NULL" in XML. It's also not clear what structure you're expecting to return. It would be useful if you could edit your question with the desired output for the given input file. - Hygienic

A one-liner can do the job; there is no need to iterate over the elements yourself:

doc.Descendants().Where(e => string.IsNullOrEmpty(e.Value)).Remove();

Tester

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    public class TestRemove
    {
        public static void Main() {
            Console.WriteLine("----OLD TREE STARTS---");
            XElement doc = XElement.Parse(@"<magento_api>
                                              <data_item>
                                                <code>400</code>
                                                <message>Attribute weight is not applicable for product type Configurable Product</message>
                                              </data_item>
                                              <data_item>
                                                <code>400</code>
                                                <message>Resource data pre-validation error.</message>
                                              </data_item>
                                              <data_item>
                                                <code>1</code>
                                                <message></message>
                                              </data_item>
                                              <data_item>
                                                <code></code>
                                                <message>No code was given</message>
                                              </data_item>
                                        </magento_api>");
            Console.Write(doc.ToString());
            Console.WriteLine("");
            Console.WriteLine("----OLD TREE ENDS---");
            Console.WriteLine("");
            doc.Descendants().Where(e => string.IsNullOrEmpty(e.Value)).Remove();
            Console.WriteLine("----NEW TREE STARTS---");
            Console.Write(doc.ToString());
            Console.WriteLine("");
            Console.WriteLine("----NEW TREE ENDS---");
            Console.ReadKey();
        }
    }
}

And it also could be tested here
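
If the XML declaration matters to the receiving service, note that ToString() leaves it out, while Save writes one. Below is a minimal sketch of that idea, assuming the input is loaded as an XDocument rather than an XElement. The class name DeclarationDemo and the helper Clean are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class DeclarationDemo
{
    // Hypothetical helper: strips text-less elements below the root and
    // returns the document as a string that includes an XML declaration.
    public static string Clean(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Start below the root so the root element itself is never removed.
        doc.Root.Descendants()
           .Where(e => string.IsNullOrEmpty(e.Value))
           .Remove();

        using (var writer = new StringWriter())
        {
            // Save emits a declaration; the encoding it reports comes from
            // the writer (UTF-16 for a StringWriter).
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Clean(
            "<?xml version=\"1.0\"?><magento_api><data_item>" +
            "<code>1</code><message></message></data_item></magento_api>"));
    }
}
```

To get a different encoding in the declaration, you would probably save to a stream or an XmlWriter configured with that encoding instead of a StringWriter.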

Kyleekylen answered 25/1, 2013 at 6:32 Comment(5)
Watch out for self-closing elements that have attributes; removing them is most likely not what the end user wants, e.g. <Reference Include="Microsoft.VisualBasic" /> - Bloodhound
+1 @aolszowka. And the magento_api uses attributes in the most bizarre places. I took this approach for a related issue. https://mcmap.net/q/821705/-xml-how-to-remove-all-nodes-which-have-no-attributes-nor-child-elements - Minaret
This approach removes the XML declaration header, though. - Brasilin
It seems the solution for that is to save it back into an XmlWriter using doc.Save(writer) instead of doc.ToString(). - Brasilin
Removing an empty element might leave the parent element empty, so you might need recursion to solve this. - Manifesto
doc.Descendants().Where(e => string.IsNullOrEmpty(e.Value)).Remove(); 

This one-liner decides purely on each element's Value, which is the concatenated text of all its descendants, so it pays no attention to attributes or to child elements. That may or may not be appropriate in your situation. A simple change gives you explicit control: start removing from the lowest level first, and remove an element only when it has no child elements, no text and no attributes. Something like:

// Reverse document order visits children before their parents.
foreach (XElement child in doc.Descendants().Reverse())
{
    if (!child.HasElements && string.IsNullOrEmpty(child.Value) && !child.HasAttributes)
    {
        child.Remove();
    }
}

Thanks Nyerguds for the attribute suggestion.
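
Here is a small self-contained sketch of the bottom-up approach, in case it helps to see it run. The names PruneDemo and Prune and the flag element with its enabled attribute are invented for the example. The third data_item holds only an attribute-bearing child, so its Value is empty and the plain one-liner would remove it, whereas this loop keeps it.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PruneDemo
{
    // Hypothetical helper: removes elements that have no child elements,
    // no attributes and no text, working from the deepest level upward.
    public static XElement Prune(XElement root)
    {
        // ToList() takes a snapshot, so removing nodes cannot disturb the loop.
        foreach (XElement el in root.Descendants().Reverse().ToList())
        {
            if (!el.HasElements && !el.HasAttributes && string.IsNullOrEmpty(el.Value))
            {
                el.Remove();
            }
        }
        return root;
    }

    public static void Main()
    {
        XElement doc = XElement.Parse(
            @"<magento_api>
                <data_item><code>1</code><message></message></data_item>
                <data_item><code></code><message></message></data_item>
                <data_item><flag enabled=""true"" /></data_item>
              </magento_api>");

        // The second data_item disappears once both of its children are gone;
        // the first keeps <code>, the third keeps <flag>.
        Console.WriteLine(Prune(doc));
    }
}
```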

Cathleencathlene answered 3/6, 2016 at 12:39 Comment(1)
You might want to add && !child.HasAttributes to that if check, though. - Brasilin

The same thing in VB, in case I need to find it again:

doc.Descendants().Where(Function(e) String.IsNullOrEmpty(e.Value)).Remove()
Phantasmagoria answered 13/7, 2015 at 23:47 Comment(0)
