Best practices for dealing with LINQ statements that result in empty sequences and the like?
L

5

23

...I'm a little confused, or unsure about how to deal with errors that arise from LINQ statements. I just love being able to pull one or more items from a collection, based on some criteria... with a single line of code. That's pretty awesome.

But where I'm torn is with the error handling, or the boundary condition checking. If I want to retrieve an item using First(), and no item satisfies my query, an exception gets thrown. That's a bit of a bummer because now I have to wrap every LINQ statement with a separate try/catch block. To me, the code starts to look a little messy with all of this, especially since I end up having to declare variables outside of the try/catch block so I can use their (null) values later (which were set to null in the catch block).

Does anyone here understand my predicament? If I have to wrap every LINQ statement in try/catch blocks, I will, because it's still a hell of a lot better than writing all sorts of for loops to accomplish the same thing. But there must be a better way, right? :) I'd like to hear what everyone else here does in this situation.

** UPDATE **

Thanks for the answers, guys, they have been very helpful. One other thing I also was going to bring up, along the "one-lineness" of LINQ, is that if I want to get some .some_value.some_value.some_other_value, if I adopt an approach where I have to check for a Nullable, I have to do it from the most basic LINQ query first, then I can query for the result's property that I'm looking for. I guess there's no way around this?

Loading answered 10/9, 2010 at 21:21 Comment(5)
Very undescriptive question title, but I see it's an exception from your other question titles :) How bout "How to prevent LINQ from throwing an exception when using First()" or something like that?Fantast
@Simen it all depends on my mood. I'll change it. :)Loading
I have updated my answer again to address your second question.Alphabetic
@Eric thanks, I have marked your answer as accepted. It was extremely helpful. Thanks for being so thorough.Loading
too many generic questions in one postDolli
A
60

Use First when you know that there is one or more items in the collection. Use Single when you know that there is exactly one item in the collection. If you don't know those things, then don't use those methods. Use methods that do something else, like FirstOrDefault(), SingleOrDefault() and so on.

You could, for example, say:

int? first = sequence.Any() ? (int?) sequence.First() : (int?) null;

which is far less gross than

int? first = null;
try { first = sequence.First(); } catch { }

But still not great because it iterates the first item of the sequence twice. In this case I would say if there aren't sequence operators that do what you want then write your own.

Continuing with our example, suppose you have a sequence of integers and want to get the first item, or, if there isn't one, return null. There isn't a built-in sequence operator that does that, but it's easy to write it:

public static int? FirstOrNull(this IEnumerable<int> sequence)
{
    foreach(int item in sequence)
        return item;
    return null;
}

or even better:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence) where T : struct
{
    foreach(T item in sequence)
        return item;
    return null;
}

or this:

struct Maybe<T>
{
    public T Item { get; private set; }
    public bool Valid { get; private set; }
    public Maybe(T item) : this() 
    { this.Item = item; this.Valid = true; }
}

public static Maybe<T> MyFirst<T>(this IEnumerable<T> sequence) 
{
    foreach(T item in sequence)
        return new Maybe<T>(item);
    return default(Maybe<T>);
}
...
var first = sequence.MyFirst();
if (first.Valid) Console.WriteLine(first.Item);

But whatever you do, do not handle those exceptions you mentioned. Those exceptions are not meant to be handled, they are meant to tell you that you have bugs in your code. You shouldn't be handling them, you should be fixing the bugs. Putting try-catches around them is hiding bugs, not fixing bugs.

UPDATE:

Dave asks how to make a FirstOrNull that takes a predicate. Easy enough. You could do it like this:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence, Func<T, bool> predicate) where T : struct
{
    foreach(T item in sequence)
        if (predicate(item)) return item;
    return null;
}

Or like this:

public static T? FirstOrNull<T>(this IEnumerable<T> sequence, Func<T, bool> predicate) where T : struct
{
    foreach(T item in sequence.Where(predicate))
        return item;
    return null;
}

Or, don't even bother:

var first = sequence.Where(x=>whatever).FirstOrNull();

No reason why the predicate has to go on FirstOrNull. We provide a First() that takes a predicate as a convenience so that you don't have to type the extra "Where".
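If you would rather not write a custom operator at all, one alternative (a swapped-in technique, not the FirstOrNull shown above) is to project the values to a nullable type before calling the built-in FirstOrDefault. This is only a sketch: the sample data is made up, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, not from the original question.
int[] numbers = { 3, 8, 15, 42 };

// Projecting each int to int? means "no match" comes back as null rather than 0.
int? firstBig = numbers.Where(x => x > 10).Select(x => (int?)x).FirstOrDefault();
int? firstHuge = numbers.Where(x => x > 100).Select(x => (int?)x).FirstOrDefault();

Console.WriteLine(firstBig.HasValue ? firstBig.Value.ToString() : "none");   // 15
Console.WriteLine(firstHuge.HasValue ? firstHuge.Value.ToString() : "none"); // none
```

The projection costs one extra Select, but it works with any value type and needs no extension method.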

UPDATE: Dave asks another follow-up question which I think might be "what if I want to say sequence.FirstOrNull().Frob().Blah().Whatever() but any one of those along the line could return null?"

We have considered adding a null-propagating member-access operator to C#, tentatively notated as ?. -- that is, you could say

x = a?.b?.c?.d;

and if a, b, or c produced null, then the result would be to assign null to x.

Obviously we did not actually implement it for C# 4.0. It is a possible work item for hypothetical future versions of the language... UPDATE: the "Elvis operator" (formally the null-conditional operator, ?.) has been added in C# 6.0, yay!
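For readers on C# 6.0 or later, here is a rough sketch of how the operator combines with FirstOrDefault. The names array is invented for illustration, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical data for illustration only.
string[] names = { "  Ada ", "Grace" };

// FirstOrDefault yields null when nothing matches; ?. then skips the rest of the chain.
int? missingLength = names.FirstOrDefault(n => n.StartsWith("Z"))?.Trim().Length;
int? foundLength = names.FirstOrDefault(n => n.Contains("Ada"))?.Trim().Length;

// ?? supplies a fallback when the chain produced null.
string shout = names.FirstOrDefault(n => n.StartsWith("Z"))?.ToUpper() ?? "(nobody)";

Console.WriteLine(missingLength.HasValue); // False
Console.WriteLine(foundLength);            // 3
Console.WriteLine(shout);                  // (nobody)
```

Note that once ?. short-circuits, the whole remaining member-access chain is skipped, which is why .Length after .Trim() needs no second ?. here.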

Note that C# does have a null coalescing operator:

(sequence.FirstOrNull() ?? GetDefault()).Frob().Blah().Whatever()

means "If FirstOrNull returns non-null, use it as the receiver of Frob; otherwise call GetDefault and use that as the receiver". An alternative approach would be, again, to write your own:

public static T FirstOrLazy<T>(this IEnumerable<T> sequence, Func<T> lazy) 
{
    foreach(T item in sequence)
        return item;
    return lazy();
}

sequence.FirstOrLazy(()=>GetDefault()).Frob().Blah().Whatever();

Now you get the first item if there is one, or the result of a call to GetDefault() if there is not.

Alphabetic answered 10/9, 2010 at 21:27 Comment(16)
@Eric ok, so I think the take home message for me is that I can't just use LINQ as my first step in getting data out of my shared memory structure. I have two threads that run and look into this shared object and pull out information that they care about. I was just letting them query willy-nilly, since it seemed like a reasonable (read cleaner) way to code it up. But now it sounds like I'll have to roll something a little custom, and that's okay with me. Thanks!Loading
I'm sorry, but are those supposed to be yield return?Grindstone
@Dave: Alternately, it means you can use LINQ to handle the normal case, but if you want to know why it threw an exception, then your catch block might need to break down the original query into parts that can be executed in sequence and tested in between.Grindstone
@Steven I don't think so, because Eric doesn't want to return another sequence, he just wants to return the first. But what I don't see here is how this extension method can take the equivalent of a Where clause... I usually use First with a Where clause, something like this: var temp = MyStuff.First( p => p.Name == MyName);Loading
@Steven: No... why would they be? The first item of a sequence is not a sequence.Alphabetic
@Steven thanks for the tip. I might look into that later. So far, I've been lucky and all of my queries seem to work, except when what I'm looking for isn't there. :) hmm... that sounds kinda wrong.Loading
Ah, got it. I misunderstood the purpose of the foreach. Thanks.Grindstone
Eric, I thought your generic FirstOrNull had a bug. Shouldn't it be for(T item in sequence) ??Strawn
@Eric: About your error-handling advice, would you disapprove of catching the error for the purpose of logging it and returning a failure? After all, it may well not be a programming error so much as bad data, and as such, it's not necessarily cause to shut down the app.Grindstone
@Steven: Actions have consequences. Your question is essentially "when something unexpected happens due to a bug caused by a violated assumption is it better to fail fast, fail slow, or keep running and hope for the best?" It depends on what the consequences of each of those actions is. Computer programs often have human life safety implications, financial implications, and so on. Computers run life support equipment, trading floors, and factory robots. Sometimes trying to keep going is very important, sometimes stopping immediately before you make it worse is very important.Alphabetic
@Strawn yes that was a typo on Eric's part, but the point he was trying to make was clear enough. @Eric you're very correct. I am controlling machines, and there are cases where errors are okay and you can keep going, but much of the time in doing so you will end up with totally invalid results.Loading
@Eric man, this is good stuff. I need to play with this some more. Very good suggestions.Loading
@Eric: Thanks for the clarification. I do agree that just catching it is never acceptable, but I don't lean as far towards fail-fast as you do.Grindstone
@Dave: Indeed, you don't want to spill a thousand gallons of acid or pudding or whatever if the controller software throws an exception while the robot arm motor is turned on. The right thing to do is probably to catch the unexpected exception and immediately go into the "fail to the safest possible mode" subroutine that stops the robot safely. Then log the exception to disk.Alphabetic
@Steven: for example, in the compiler if we get an unexpected exception we know it is unlikely to have human-life safety implications, but we also know that we are very unlikely to be able to generate correct code. So we handle the error by telling the user what went wrong, giving some diagnostics that they can report to the compiler team, and activating the Watson-phone-home system to allow the user to report the error automatically. Failing relatively slowly is the right thing to do for us.Alphabetic
@Eric: That's a good example. I can think of a number of occasions where my own intuitions about what constitutes an error condition, even a fatal one, have turned out not to match business needs, so there's a lot to be said for doing full analysis before committing to any particular approach. Oh, and about ?., that would make some people very happy, particularly those who spend half of their time doing SQL.Grindstone
K
21

Use FirstOrDefault and then check for null.
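A small sketch of what that looks like, and of why value types need more care. The arrays are made-up sample data, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical data for illustration only.
string[] words = { "apple", "banana" };
int[] withZero = { 0, 5 };
int[] withoutSmall = { 3, 5 };

// Reference type: null unambiguously means "nothing matched".
string match = words.FirstOrDefault(w => w.StartsWith("c"));
if (match == null) Console.WriteLine("no word starts with c");

// Value type: 0 can be a real element or default(int); the result alone cannot tell you which.
int a = withZero.FirstOrDefault(x => x < 1);      // 0, a real element
int b = withoutSmall.FirstOrDefault(x => x < 1);  // 0, because nothing matched
Console.WriteLine(a == b); // True
```

For value types, converting the elements to a nullable type before calling FirstOrDefault is one way to tell the two cases apart.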

Kellyekellyn answered 10/9, 2010 at 21:23 Comment(4)
It's (slightly) more difficult than that when your elements are value types.Nieves
+1: And for value types (like int), check for 0, not null. But that's just me nitpickin.Expand
Yeah except the actual value could be 0, and there is no way to distinguish this.Pascia
I guess you could say I have the opposite approach: I like being able to easily include a unique value assertion into LINQ statements with Single() - I'd much rather handle an exception than have an assumption broken silently. So it chaps my hide that some LINQ sources (notably EF) don't support Single().Harebell
S
4

Does anyone here understand my predicament?

Not really. If you replace First() with FirstOrDefault(), your try/catch blocks can be replaced with if(...) statements or strategically used && or || operators.

Sw answered 10/9, 2010 at 21:25 Comment(0)
B
0

The FirstOrDefault and SingleOrDefault operators solve your problem.

A similar problem I've encountered is when a collection contains a collection; a nested list. In that case I often use the null coalescing operator to allow a single line retrieval through the nested list. The most trivial case looks like this:

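var nestedList = new List<List<int>>();
// Project to int? so that an empty inner list yields null rather than 0.
int? first = (nestedList.FirstOrDefault() ?? new List<int>()).Select(x => (int?)x).FirstOrDefault();

So if the outer list is empty, a new empty list is substituted, and because the elements are projected to int?, the final FirstOrDefault returns null. (Without the projection, FirstOrDefault on a List<int> would return 0, not null.)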

Buffy answered 12/9, 2010 at 13:11 Comment(0)
W
0

In addition to Eric Lippert's FirstOrNull implementations, here is a version of SingleOrNull for value types.

    /*
     * This SingleOrNull implementation is heavily based on the standard
     * Single/SingleOrDefault methods, retrieved from the reference
     * source codebase on Thu May 7, 2015.
     * http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs
     *
     * In case it isn't clear, the first part is merely an opportunistic
     * optimization for sources that are actually lists, and which thus
     * expose a precomputed count.  Using a count is faster since we
     * only have to read 0-1 elements.  In contrast, the fallback must
     * read 1-2 elements.
     */
    public static TSource? SingleOrNull<TSource>(
        this IEnumerable<TSource> source)
        where TSource : struct
    {
        if (source == null) throw new ArgumentNullException("source");
        var list = source as IList<TSource>;
        if (list != null)
        {
            switch (list.Count)
            {
                case 0: return null;
                case 1: return list[0];
            }
        }
        else
        {
            using (var e = source.GetEnumerator())
            {
                if (!e.MoveNext()) return null;
                var result = e.Current;
                if (!e.MoveNext()) return result;
            }
        }
        return null;
    }

And here are a few tests thrown in for good measure.

Wisecrack answered 7/5, 2015 at 14:46 Comment(0)
