Linq, XmlNodes, foreach's and exceptions
Asked Answered
P

2

9

Consider the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlDocument xmlDoc = new XmlDocument();

            xmlDoc.LoadXml(@"<Parts>
  <Part name=""DisappearsOk"" disabled=""true""></Part>
  <Part name=""KeepMe"" disabled=""false""></Part>
  <Part name=""KeepMe2"" ></Part>
  <Part name=""ShouldBeGone"" disabled=""true""></Part>  
</Parts>");

            XmlNode root = xmlDoc.DocumentElement;
            List<XmlNode> disabledNodes = new List<XmlNode>();

            try
            {

                foreach (XmlNode node in root.ChildNodes.Cast<XmlNode>()
                                             .Where(child => child.Attributes["disabled"] != null &&
                                                             Convert.ToBoolean(child.Attributes["disabled"].Value)))
                {
                    Console.WriteLine("Removing:");
                    Console.WriteLine(XDocument.Parse(node.OuterXml).ToString());
                    root.RemoveChild(node);
                }
            }
            catch (Exception Ex)
            {
                Console.WriteLine("Exception, as expected");
            }

            Console.WriteLine();
            Console.WriteLine(XDocument.Parse(root.OuterXml).ToString());

            Console.ReadKey();
        }
    }
}

When I run this code in visual studio express 2010 I don't get an exception, as expected. I'd expect one because I'm removing something from a list while iterating it.

What I do get is a list back with only the first child node removed:

enter image description here

Why do I not get an invalid operation exception?

Note that the equivilent code in IDEOne.com does give the expected exception: http://ideone.com/qoRBbb

Also note that if i remove all the LINQ (.Cast().Where()) I get the same result, only one node removed, no exception.

Is there some issue with my settings in VSExpress?


Note that I know deferred execution is involved, but I'd expect the where clause, when iterated upon to iterate upon the source enumeration (child note), which would give the exception I'm expecting.

My issue is that I don't get that exception in VSexpress, but do in IDEOne (I'd expect it in both/all cases, or at least if not, I'd expect the correct result).


From Wouter's answer it seems it's invalidating the iterator when the first child is removed, rather than giving an exception. Is there anything official that says this? Is this behaviour to be expected in other cases? I'd call invalidating the iterator silently rather than with an exception "Silent but deadly".

Panel answered 28/3, 2013 at 13:5 Comment(2)
I don't see the output of Console.WriteLine("Removing:"); in your result either. Are you sure the loop fires at all?Pelagian
Sorry, edited screenshot (added that check/demonstration after the initial test)Panel
M
2

Even the following code won't throw any exceptions:

foreach (XmlNode node in root.ChildNodes)
    root.RemoveChild(node);

And it will remove exactly one element. I am not 100% that my explanation is correct, but it is on the right track. When you iterate over a collection, you retrieve its enumerator. For XmlNode, which is a collection, this is a custom class called XmlChildEnumerator.

If you would look up the MoveNext implementation via Reflector, you would see that the enumerator remembers the node it is currently looking at. When you call MoveNext, you move onto the next sibling.

What happens in the code above is that you get the first node from the collection. Enumerator implicitly generated in the body of the foreach loop takes that first node as its current node. Then, in the body of the foreach loop you remove that node.

Now that node is detached from the list and the execution moves onto calling MoveNext again. However, since we have just removed the first node from the collection, it is detached from the collection and node has no sibling. Since there is no sibling for the node, iteration stops and foreach loop exits, thus removing only single element.

This doesn't throw exception since it doesn't check if the collection has been changed, it just wants to move on to the next node it can find. But since the removed (detached) node doesn't belong to a collection, the loop stops.

Hope this clears up the issue.

Mebane answered 28/3, 2013 at 13:55 Comment(1)
Excellent explanation. Clearly the Mono people must have the enumerator coded slightly differently, to cause the exception (either deliberately or not).Panel
P
2

Because you are iterating over ChildNodes, removing the first child invalidates the iterator. Because of this, iteration will stop after the first removal.

If you split your filtering and iteration, your code will remove all items:

var col = root.ChildNodes.Cast<XmlNode>()
                             .Where(child => child.Attributes["disabled"] != null &&
                                             Convert.ToBoolean(child.Attributes["disabled"].Value)).ToList();

foreach (XmlNode node in col)
{
    Console.WriteLine("Removing:");
    Console.WriteLine(XDocument.Parse(node.OuterXml).ToString());
    root.RemoveChild(node);
}
Patriotism answered 28/3, 2013 at 13:12 Comment(5)
Surely the where clause will still cause the underlying collection to be iterated though?Panel
But Where is supposed to be lazy-evaluated. Why would it create a separate collection and not stream data from root.ChildNodes?Pelagian
Edited my post. Removing an element from ChildNodes apparently doesn't throw an exception but just invalidates the iterator.Patriotism
I understand how to make it work (.ToList()) my issue is the fact that I'm not getting an exception in my case, it's just stopping executing silently.Panel
@WouterdeKort:This looks like the issue (invalidating the iterator), is there any documentation that says this? That must mean IDEOne/mono, although giving the expected result doesn't quite mimic the MS .net implementation correctly.Panel
M
2

Even the following code won't throw any exceptions:

foreach (XmlNode node in root.ChildNodes)
    root.RemoveChild(node);

And it will remove exactly one element. I am not 100% that my explanation is correct, but it is on the right track. When you iterate over a collection, you retrieve its enumerator. For XmlNode, which is a collection, this is a custom class called XmlChildEnumerator.

If you would look up the MoveNext implementation via Reflector, you would see that the enumerator remembers the node it is currently looking at. When you call MoveNext, you move onto the next sibling.

What happens in the code above is that you get the first node from the collection. Enumerator implicitly generated in the body of the foreach loop takes that first node as its current node. Then, in the body of the foreach loop you remove that node.

Now that node is detached from the list and the execution moves onto calling MoveNext again. However, since we have just removed the first node from the collection, it is detached from the collection and node has no sibling. Since there is no sibling for the node, iteration stops and foreach loop exits, thus removing only single element.

This doesn't throw exception since it doesn't check if the collection has been changed, it just wants to move on to the next node it can find. But since the removed (detached) node doesn't belong to a collection, the loop stops.

Hope this clears up the issue.

Mebane answered 28/3, 2013 at 13:55 Comment(1)
Excellent explanation. Clearly the Mono people must have the enumerator coded slightly differently, to cause the exception (either deliberately or not).Panel

© 2022 - 2024 — McMap. All rights reserved.