Are Linq's Skip and Take optimized for arrays? [4.0 edition]
Asked Answered
P

3

4

It's a common situation to copy a range from an array. C# supports this operation in various ways, for instance using Array.Copy, but also by Linq's Skip and Take combination.

Starting at .NET 4.0, do the Skip and Take operations still add considerable overhead, or do they recognize (either at compile time or at run time) their target is an array?

To clarify: I am referring to a) the time taken to skip the initial bytes and b) to the combination of Skip-Take-ToArray, which does not suffer from side effects. For instance, new byte[10000].Skip(9999).Take(1).ToArray() could be optimized this way and would (for large n) be noticeably faster.

Paderewski answered 1/11, 2014 at 0:45 Comment(3)
They can't work in O(1) for arrays. How do you expect a lazily evaluated sequence to ever be O(1)? Just imagine if I did a .Skip(1) on a 10,000,000 element array - are you expecting a 9,999,999 element memory copy? That would be very inefficient if I then immediately did a .Take(1).Haslet
@Enigmativity: There's the size of the array... and then there's the argument passed to Skip. Anyway, neither Skip nor Take copy the source collection.Asper
I have reworded the question to remove the misleading statement of O(1).Paderewski
H
3

Skip/Take still operate in O(n) due to how LINQ has to maintain knowledge of how elements even in arrays can be changed and how the objects have to stay aware of the changes.Therefore, while the optimizations to O(1) seems trivial, it is not currently realized.

See: https://edulinq.googlecode.com/hg/posts/40-Optimization.html

Jon Skeet had an excellent blog a few years back discussing these and more in "reimplementing LINQ" (http://codeblog.jonskeet.uk/2011/01/02/reimplementing-linq-to-objects-part-23-take-skip-takewhile-skipwhile/). It's a great read.

Headreach answered 1/11, 2014 at 1:14 Comment(0)
J
2

Since they are not optimized in 4.5 I can confidently say that they are not optimized in 4.0 either

http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#83ec6a20321060a1#references

BTW there is no way Take is O(1) even on arrays since it has an output of N (assuming it is iterated of course)

Ja answered 1/11, 2014 at 1:53 Comment(0)
N
2

The optimisation that would be nice is if on Skip(k) the first .MoveNext just sets the internal index to k, and I thought the JITter could do this, but it seems not to, for the versions I've tried.

The code to be optimised is:

while (count > 0 && e.MoveNext()) count--;

from SkipIterator and:

      public bool MoveNext() {
            if (_index < _endIndex) {
                _index++;
                return (_index < _endIndex);
            }
            return false;
        }

from SZGenericArrayEnumerator.

(The reference source is 4.5.1, but reflector shows very similar code in 4.0.)

Nierman answered 1/11, 2014 at 3:50 Comment(3)
That's the question! Does the JITter do it? ;)Paderewski
@Paderewski Short answer, no: My LinqPad Test. It is possible that some JIT optimisations are still off but this is with the LinqPad /o+ option on. (You'll need my .Time My Extension to replicate the test.)Nierman
Just an update: A number of answers to similar questions eventually link to the same Jon Skeet blog post as Jason W. or this one, but more directly suggest loops cannot (or will not) be removed by optimisation; and, since .NET 4.5, ArraySegment becomes an option when you know you're dealing with an array.Nierman

© 2022 - 2024 — McMap. All rights reserved.