What are the benefits of a Deferred Execution in LINQ?

LINQ uses a deferred execution model, which means that the resulting sequence is not produced when the LINQ operators are called. Instead, these operators return an object that yields the elements of the sequence only when we enumerate it.

While I understand how deferred queries work, I'm having some trouble understanding the benefits of deferred execution:

1) I've read that having a deferred query execute only when you actually need the results can be of great benefit. What exactly is this benefit?

2) Another advantage of deferred queries is that if you define a query once, each time you enumerate it you get results that reflect the current data, so they differ if the data has changed.

a) But as the code below shows, we can achieve the same effect (each time we enumerate the source, we get different results if the data has changed) even without using deferred queries:

List<string> sList = new List<string> { "A", "B" };

foreach (string item in sList)
    Console.WriteLine(item); // Q1 prints A and B (one per line)

sList.Add("C");

foreach (string item in sList)
    Console.WriteLine(item); // Q2 prints A, B and C (one per line)

3) Are there any other benefits of deferred execution?

Uncinate answered 6/9, 2011 at 17:51 Comment(3)
You have a misunderstanding of how deferred execution works. The sequence is not enumerated each time, even though each stage returns IEnumerable<T>. - Kalif
@jlafay, why did you remove my edit? - Uncinate
Because comments and answers should not be part of the question. Read my edit to see the explanation. - Burgh

The main benefit is that this allows filtering operations, the core of LINQ, to be much more efficient. (This is effectively your item #1).

For example, take a LINQ query like this:

 var results = collection.Select(item => item.Foo).Where(foo => foo < 3).ToList();

With deferred execution, the above iterates your collection only once: as each item is requested during the iteration, it is mapped, then filtered, and the items that pass the filter are used to build the list.

If LINQ instead executed every operator immediately, each operation (Select / Where) would have to iterate through an entire sequence and build an intermediate result for the next operation. This would make chained operations very inefficient.

Personally, I'd say your item #2 above is more of a side effect than a benefit. It is sometimes useful, but it also causes confusion, so I would treat it as "something to understand" rather than tout it as a benefit of LINQ.
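
To make that side effect concrete, here is a minimal sketch. The list contents and the names (LiveQueryDemo, small, snapshot) are made up for illustration and are not from the question. It shows that a deferred query re-reads its source on every enumeration, while a ToList() result is a fixed copy:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LiveQueryDemo
{
    // Returns the number of matches seen by the deferred query before and
    // after the source changes, plus the size of an early ToList() snapshot.
    public static (int before, int after, int snapshot) Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Deferred: nothing is filtered yet; 'small' only describes the query.
        IEnumerable<int> small = numbers.Where(n => n < 3);

        // ToList() forces execution now and copies the current results.
        List<int> snapshot = small.ToList();

        int before = small.Count();   // runs the query: matches 1, 2

        numbers.Add(0);               // change the source after defining the query

        int after = small.Count();    // runs the query again: matches 1, 2, 0
        return (before, after, snapshot.Count);
    }

    public static void Main()
    {
        var (before, after, snapshot) = Run();
        Console.WriteLine($"{before} {after} {snapshot}"); // prints "2 3 2"
    }
}
```

If you want stable results, materialize the query once (ToList() or ToArray()) and work with that copy.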


In response to your edit:

"In your particular example, in both cases Select would iterate collection and return an IEnumerable I1 of type item.Foo. Where() would then enumerate I1 and return IEnumerable<> I2 of type item.Foo. I2 would then be converted to List."

This is not true - deferred execution prevents this from occurring.

In my example, the return type is IEnumerable<T>, which means that it's a collection that can be enumerated, but, due to deferred execution, it isn't actually enumerated.
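
A small sketch can make this visible. It assumes a made-up int array and a counter (selectorCalls), neither of which is part of the question; the counter records how often the Select lambda has actually run:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NotYetEnumeratedDemo
{
    // Returns how often the selector had run right after the query was built,
    // how often it had run after ToList(), and how many items ToList() produced.
    public static (int afterBuild, int afterToList, int resultCount) Run()
    {
        int selectorCalls = 0;
        int[] source = { 1, 2, 3, 4 };

        // Building the query only wires the operators together.
        IEnumerable<int> query = source
            .Select(n => { selectorCalls++; return n * 10; })
            .Where(n => n < 30);

        int afterBuild = selectorCalls;       // still 0: nothing was enumerated

        List<int> results = query.ToList();   // enumeration happens here
        int afterToList = selectorCalls;      // 4: one call per source element

        return (afterBuild, afterToList, results.Count);
    }

    public static void Main()
    {
        var (afterBuild, afterToList, resultCount) = Run();
        Console.WriteLine($"{afterBuild} {afterToList} {resultCount}"); // prints "0 4 2"
    }
}
```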

When you call ToList(), the entire collection is enumerated. Conceptually, the work ends up looking something like this (the real implementation differs, of course):

List<Foo> results = new List<Foo>();
foreach(var item in collection)
{
    // "Select" does a mapping
    var foo = item.Foo; 

    // "Where" filters
    if (!(foo < 3))
         continue;

    // "ToList" builds results
    results.Add(foo);
}

Deferred execution causes the sequence itself to only be enumerated (foreach) one time, when it's used (by ToList()). Without deferred execution, it would look more like (conceptually):

// Select
List<Foo> foos = new List<Foo>();
foreach(var item in collection)
{
    foos.Add(item.Foo);
}

// Where
List<Foo> foosFiltered = new List<Foo>();
foreach(var foo in foos)
{
    if (foo < 3)
        foosFiltered.Add(foo);
}    

// ToList
List<Foo> results = new List<Foo>();
foreach(var item in foosFiltered)
{
    results.Add(item);
}
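
As a comment below points out, the real operators are closer to the second sample with yield return in place of the intermediate lists. The following is only a sketch of that idea, not the actual LINQ source: MySelect, MyWhere and the log parameter are hypothetical stand-ins written for this illustration. The log records the order in which the two operators touch each item:

```csharp
using System;
using System.Collections.Generic;

public static class PipelineSketch
{
    // Hypothetical stand-in for Select: maps one item at a time, on demand.
    public static IEnumerable<TResult> MySelect<T, TResult>(
        IEnumerable<T> source, Func<T, TResult> selector, List<string> log)
    {
        foreach (var item in source)
        {
            log.Add($"select {item}");
            yield return selector(item);
        }
    }

    // Hypothetical stand-in for Where: passes on only the items that match.
    public static IEnumerable<T> MyWhere<T>(
        IEnumerable<T> source, Func<T, bool> predicate, List<string> log)
    {
        foreach (var item in source)
        {
            log.Add($"where {item}");
            if (predicate(item))
                yield return item;
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        int[] collection = { 1, 5, 2 };

        // No work happens here; the iterators have not started yet.
        var query = MyWhere(MySelect(collection, n => n, log), n => n < 3, log);

        // Plays the role of ToList(): pulls items through both operators.
        var results = new List<int>(query);
        log.Add("results " + string.Join(",", results));
        return log;
    }

    public static void Main()
    {
        // prints "select 1 | where 1 | select 5 | where 5 | select 2 | where 2 | results 1,2"
        Console.WriteLine(string.Join(" | ", Run()));
    }
}
```

The interleaved log shows that each item passes through both stages before the next item is read, so the source is walked once and no intermediate list is built.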
Kalif answered 6/9, 2011 at 17:55 Comment(12)
+1 but perhaps a different example than ToList since that will in fact iterate the entire sequence. - Catechin
@Davy8: I purposely wanted to have something that forced an evaluation - otherwise, it would never get evaluated in my sample code ;) - Kalif
I was referring to "If you were to make LINQ fully execute each time, each operation (Select / Where / ToList) would have to iterate through the entire sequence." rather than the code sample because that statement seems to imply that ToList like the others did not iterate the entire sequence. - Catechin
@user702769: I edited to show you the difference - does that help? - Kalif
It sure does. I don't know the correct terminology, so this may come off a bit wrong - anyhow, I assumed each type returned by operator contains its own code for enumerating the sequence. Thus type I1 returned by Select contains code to enumerate collection, while type I2 returned by Where contains a separate code/logic used to enumerate the collection yielded by I1. But in fact the "enumeration / yielding logic" of both Select and Where is actually combined into a "single" code segment - am I making any sense? - Uncinate
@user702769: Well, it's a bit different, but IEnumerable<T> just allows each item to be returned, one at a time. This means that ToList's enumeration of the sequence "pulls through" the values, and each of the operators occurs on the values one at a time. The actual enumeration/stepping through only happens once. This is what "deferred execution" actually means. - Kalif
@user702769: As I said, what I did above was just conceptual - it doesn't actually merge the code together (in LINQ to Objects - IQueryable<T> is different, and sort of does) - but pulls items through the operators one at a time, so "collection" is only enumerated fully one time. - Kalif
Awesome explanation for a beginner with no idea about deferred execution :) - Engleman
Can someone cite documentation for this: "Deferred execution causes the sequence itself to only be enumerated (foreach) one time"? - Len
Really like the 2 blocks of conceptual code showing the actual sequence of work, with and without DE. Without DE when chained LINQ is doing one method at a time serially, first SELECT(), then the whole result is applied with WHERE filter, not exactly as what we interpret by looking at the code. Picture this, we open all boxes to check for "foo", then pack and line up those having and ship to next station, re-open boxes to check if "foo"<3. Yuk, why not check "<3" when we see it has "foo" at the 1st station! On the other hand, if a LINQ is not chained, DE has no advantage. - Aimeeaimil
The first code sample where there is just one foreach loop is a bit deceiving. Behind the scenes, it uses MoveNext() and reading the Current properties to get from selecting Foo (select) to applying the predicate (where). Because MoveNext() and reading the Current property are exactly what happens in a foreach loop, the way Linq works is probably better described by your 2nd code sample, just add yield returns and remove the intermediate lists. levelup.gitconnected.com/… describes the process. - Mandle
@Len A bit late, but see this Microsoft article. - Glen

Another benefit of deferred execution is that it allows you to work with infinite series. For instance:

public static IEnumerable<ulong> FibonacciNumbers()
{
    yield return 0;
    yield return 1;

    ulong previous = 0, current = 1;
    while (true)
    {
        ulong next = checked(previous + current);
        yield return next;
        previous = current;
        current = next;

    }
}

(Source: http://chrisfulstow.com/fibonacci-numbers-iterator-with-csharp-yield-statements/)

You can then do the following:

var firstTenOddFibNumbers = FibonacciNumbers().Where(n=>n%2 == 1).Take(10);
foreach (var num in firstTenOddFibNumbers)
{
    Console.WriteLine(num);
}

Prints:

1
1
3
5
13
21
55
89
233
377

Without deferred execution, building the whole sequence up front would never finish normally: you would get an OverflowException once the values exceed the range of ulong, or, if the addition weren't checked, it would run forever because the values wrap around (and calling ToList on it would eventually cause an OutOfMemoryException).
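
The same iterator also supports paging, which comes up in the comments below. This is a sketch under my own assumptions: the Page helper and its parameters are hypothetical and only illustrate that the caller decides later how much of the sequence to consume:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FibonacciPaging
{
    // Same infinite iterator as above.
    public static IEnumerable<ulong> FibonacciNumbers()
    {
        yield return 0;
        yield return 1;

        ulong previous = 0, current = 1;
        while (true)
        {
            ulong next = checked(previous + current);
            yield return next;
            previous = current;
            current = next;
        }
    }

    // Hypothetical helper: returns one zero-based "page" of the sequence.
    // Only (pageIndex + 1) * pageSize values are ever computed.
    public static List<ulong> Page(int pageIndex, int pageSize) =>
        FibonacciNumbers().Skip(pageIndex * pageSize).Take(pageSize).ToList();

    public static void Main()
    {
        // Second page of ten, i.e. the 11th-20th values:
        // prints "55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181"
        Console.WriteLine(string.Join(", ", Page(1, 10)));
    }
}
```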

Catechin answered 6/9, 2011 at 18:3 Comment(4)
Got the difference. Very nice example. - Madalynmadam
Why don't you just compute all the Fibonacci numbers first and just return a list? - Eldwen
@MateenUlhaq Sorry for the late reply, but what do you mean by "all the Fibonacci numbers"? The list is infinite. If you mean if I knew I needed 10, why not just calculate that ahead of time, it is because sometimes you don't know how many you need until later on. Sometimes you don't need the first 10, maybe you need to do paging and you're asking for the 11th-20th values. Maybe you need to filter it to just get the prime values. The point is that you can decide on how you want to filter it later on in your code, without that function needing to know how it will be filtered. - Catechin
@Catechin Sorry, it was a really bad joke. ;) - Eldwen

An important benefit of deferred execution is that you receive up-to-date data. This can cost some performance (especially if you are dealing with very large data sets), because the query runs again each time it is enumerated, but a result computed eagerly might already be stale by the time you use it. Deferred execution makes sure you get the latest information from the database in scenarios where the database is updated rapidly.

Mischief answered 14/4, 2013 at 15:14 Comment(0)
