LINQ: RemoveAll and get elements removed
Asked Answered
F

4

13

Which is the easiest way to remove items that match some condition from a list and then, get those items.

I can think in a few ways, I don't know which is the best one:

var subList = list.Where(x => x.Condition);
list.RemoveAll(x => x.Condition);

or

var subList = list.Where(x => x.Condition);
list.RemoveAll(x => subList.Contains(x));

Is any of this one of the best ways? If it is, which one? If it's not, how should I do it?

Forsythia answered 16/4, 2012 at 17:5 Comment(0)
S
6

I would go with the first option for readability purposes, with the note that you should materialize the list first, or you'll lose the very items you're trying to select on the next line:

var sublist = list.Where(x => x.Condition).ToArray();
list.RemoveAll(x => x.Condition);

The second example is O(n^2) for no reason and the last is perfectly fine, but less readable.

Edit: now that I reread your last example, note that as it's written right now will take out every other item. You're missing the condition check and the remove line should actually be list.RemoveAt(i--); because the i+1th element becomes the ith element after the removal, and when you increment i you're skipping over it.

Sevier answered 16/4, 2012 at 17:9 Comment(9)
It's actually O(n^3), but I'm assuming the lack of materialization just slipped your mind ;)Sevier
Would the items (as I wrote it) be removed from subList with the second instruction? :OForsythia
Er you never remove from sublist, nor do you intend to if I read it correctly.Sevier
Your edit is right, I'll remove the third options because its really wrong and make it right would be very unreadableForsythia
No, no.. its not the idea to remove them from subList. I think I misunderstood the materialization. Whats the difference?Forsythia
When you run a linq query over a collection, you don't get back an array, you get back an object that when you iterate over it you run your actual selection. So take your first example. subList will be just an object, you remove the items from the main array, and then when you do foreach(var item in subList) you get back nothing because the condition will always return false.Sevier
So what you do instead is read the items out of your linq query and put them in an array immediately. You can do it manually or using .ToArray() or .ToList() (or a few others depending on how you want your data to look).Sevier
"you should materialize the list first, or you'll lose the very items you're trying to select" Maybe true if you're using, say, a lazy-loading recordset from an ORM, but in the general case, I don't think this is correct. Check this code. iterate true or false, you'll get 2,2.1 every time. No ToArray() needed.Anticipation
I'll be honest with you, this is a few good years old and looking at it I don't understand the need for the first line at all. sublist isn't used by RemoveAll at all.Sevier
M
8

I like to use a functional programming approach (only make new things, don't modify existing things). One advantage of ToLookup is that you can handle more than a two-way split of the items.

ILookup<bool, Customer> lookup = list.ToLookup(x => x.Condition);
List<Customer> sublist = lookup[true].ToList();
list = lookup[false].ToList();

Or if you need to modify the original instance...

list.Clear();
list.AddRange(lookup[false]);
Microgram answered 16/4, 2012 at 17:54 Comment(3)
I think it is much complex (and almost without knowledge I think its not really performance). Does this have any advantage?Forsythia
Condition is evaluated exactly once per item. List instance is not modified, which can be a big advantage if that list instance is shared among threads.Microgram
This is quite a genius idea.Overcrop
S
6

I would go with the first option for readability purposes, with the note that you should materialize the list first, or you'll lose the very items you're trying to select on the next line:

var sublist = list.Where(x => x.Condition).ToArray();
list.RemoveAll(x => x.Condition);

The second example is O(n^2) for no reason and the last is perfectly fine, but less readable.

Edit: now that I reread your last example, note that as it's written right now will take out every other item. You're missing the condition check and the remove line should actually be list.RemoveAt(i--); because the i+1th element becomes the ith element after the removal, and when you increment i you're skipping over it.

Sevier answered 16/4, 2012 at 17:9 Comment(9)
It's actually O(n^3), but I'm assuming the lack of materialization just slipped your mind ;)Sevier
Would the items (as I wrote it) be removed from subList with the second instruction? :OForsythia
Er you never remove from sublist, nor do you intend to if I read it correctly.Sevier
Your edit is right, I'll remove the third options because its really wrong and make it right would be very unreadableForsythia
No, no.. its not the idea to remove them from subList. I think I misunderstood the materialization. Whats the difference?Forsythia
When you run a linq query over a collection, you don't get back an array, you get back an object that when you iterate over it you run your actual selection. So take your first example. subList will be just an object, you remove the items from the main array, and then when you do foreach(var item in subList) you get back nothing because the condition will always return false.Sevier
So what you do instead is read the items out of your linq query and put them in an array immediately. You can do it manually or using .ToArray() or .ToList() (or a few others depending on how you want your data to look).Sevier
"you should materialize the list first, or you'll lose the very items you're trying to select" Maybe true if you're using, say, a lazy-loading recordset from an ORM, but in the general case, I don't think this is correct. Check this code. iterate true or false, you'll get 2,2.1 every time. No ToArray() needed.Anticipation
I'll be honest with you, this is a few good years old and looking at it I don't understand the need for the first line at all. sublist isn't used by RemoveAll at all.Sevier
E
6

The first option is good, but it makes two runs over collection. You can do it in one run by executing the additional logic inside predicate:

        var removedItems = new List<Example>();
        list.RemoveAll(x =>
        {
            if (x.Condition)
            {
                removedItems.Add(x);
                return true;
            }

            return false;
        });

You can also wrap it into extension for convinience:

public static class ListExtensions
{
    public static int RemoveAll<T>(this List<T> list, Predicate<T> predicate, Action<T> action)
    {
        return list.RemoveAll(item =>
        {
            if (predicate(item))
            {
                action(item);
                return true;
            }

            return false;
        });
    }
}

And use like this:

        var removedItems = new List<Example>();
        list.RemoveAll(x => x.Condition, x => removedItems.Add(x));
Eryn answered 28/7, 2019 at 14:13 Comment(1)
This is honestly the best answer. It minimises runs over the collection and leverages built in functions. All the other answers end up calling the match function twice on every element in the list, generate multiple lists, or use heavier structures like lookups.Accumulative
O
1

I was looking for the same thing. I wanted to do some error logging on the items that I was removing. Since I was adding a whole bunch of validation rules, the remove-and-log call should be as concise as possible.

I've made the following extension methods:

public static class ListExtensions
{
    /// <summary>
    /// Modifies the list by removing all items that match the predicate. Outputs the removed items.
    /// </summary>
    public static void RemoveWhere<T>(this List<T> input, Predicate<T> predicate, out List<T> removedItems)
    {
        removedItems = input.Where(item => predicate(item)).ToList();
        input.RemoveAll(predicate);
    }

    /// <summary>
    /// Modifies the list by removing all items that match the predicate. Calls the given action for each removed item.
    /// </summary>
    public static void RemoveWhere<T>(this List<T> input, Predicate<T> predicate, Action<T> actionOnRemovedItem)
    {
        RemoveWhere(input, predicate, out var removedItems);
        foreach (var removedItem in removedItems) actionOnRemovedItem(removedItem);
    }
}

Example usage:

items.RemoveWhere(item => item.IsWrong, removedItem =>
    errorLog.AppendLine($"{removedItem} was wrong."));
Overcrop answered 24/4, 2018 at 15:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.