Why does this code give a "Possible null reference return" compiler warning?
Consider the following code:

using System;

#nullable enable

namespace Demo
{
    public sealed class TestClass
    {
        public string Test()
        {
            bool isNull = _test == null;

            if (isNull)
                return "";
            else
                return _test; // !!!
        }

        readonly string _test = "";
    }
}

When I build this, the line marked with !!! gives a compiler warning: "warning CS8603: Possible null reference return."

I find this a little confusing, given that _test is readonly and initialised to non-null.

If I change the code to the following, the warning goes away:

        public string Test()
        {
            // bool isNull = _test == null;

            if (_test == null)
                return "";
            else
                return _test;
        }

Can anyone explain this behaviour?

Philender answered 12/12, 2019 at 14:27 Comment(19)
The Debug.Assert is irrelevant because that is a runtime check, whereas the compiler warning is a compile time check. The compiler does not have access to runtime behaviour.Outfight
I guess it's the same as if you do if(thisBool == true) { var x = 1; } else { var x = 0; } return x; This also won't compile, because x is not declared at that point, even though it will exist 100% of the time. For your code, the compiler just can't realize that _test cannot be null. You could put it in an if(_test != null)Tense
The Debug.Assert is irrelevant because that is a runtime check - It is relevant because if you comment that line out, the warning goes away.Philender
In my opinion this is a bug in .NET 4.8.... a readonly field that was already initialized will never be null... So the warning is not working correctlyNilsanilsen
I've simplified the code in the question to avoid extraneous details. It's looking a bit like some kind of analysis bug.Philender
This is likely a bug in Roslyn. Paging @jaredpar for comment.Renfrew
If you replace Debug.Assert with bool b = _test != null (or any other non-trivial expression involving _test), .NET Core will give the same warning. Speculation: the reason it manifests with Debug.Assert in net48 but not Core is because in Core, Debug.Assert has a DoesNotReturnIf annotation that allows the compiler to incorporate the assert as a proper check (but note that you will still get the warning with bool b = _test != null; Debug.Assert(1 == 2), so it specifically matters that the condition involves the variable). There may be more than one bug at play here.Gelation
@JeroenMostert Yes, I discovered that too - I've updated the question to remove the misleading information.Philender
I wouldn't say it's misleading, because it's quite interesting that with Debug.Assert .NET Core does behave as expected -- that seems like desirable behavior to retain in any case, even if it should turn out the current behavior is wrong. (I say "even if" because the analysis is always a best effort, so not correctly checking nullability can't always be straight up called a bug even where it seems really obvious and simple for a particular case).Gelation
In .NET Framework 4.6.1 I don't see this warningMontfort
@JeroenMostert The analysis is supposed to err on the side of allowing things that might be null, so it really shouldn't be warning!Philender
Here's some more fun for you: with bool b = _test is null the warning remains, but with bool b = _test is { } it disappears. There's even less justification for this than with ==, as is doesn't suffer from any possible complication with overloads. And of course merely testing the value without acting on it shouldn't affect the nullability analysis at all.Gelation
@Polyfun: The compiler can potentially know (via attributes) that Debug.Assert will throw an exception if the test fails.Voight
I've been adding lots of different cases here, and there are some really interesting results. Will write up an answer later - work to do for now.Voight
"I've been adding lots of different cases here", It's going to be the next REIMPLEMENTING LINQ serie. Add few test ends up with 30+ blog post.Midwest
@xdtTransform: Ha! I don't think I have quite that much time, but it's a nice idea :)Voight
Regarding Debug.Assert: I do not know what the C# checker does. The Coverity flow checker has a policy of treating the condition asserted as an invariant; remember, the purpose of an assertion is to document invariants that are known by the developer but not the compiler; if the assertion is wrong, that will be caught by testing.Derrek
@EricLippert: Debug.Assert now has an annotation (src) of DoesNotReturnIf(false) for the condition parameter.Voight
By none other than the versatile Stephen Toub. He never stops impressing me.Renfrew
The nullable flow analysis tracks the null state of variables, but it does not track other state, such as the value of a bool variable (as isNull above), and it does not track the relationship between the state of separate variables (e.g. isNull and _test).

An actual static analysis engine would probably do those things, but would also be "heuristic" or "arbitrary" to some degree: you couldn't necessarily tell the rules it was following, and those rules might even change over time.

That's not something we can do directly in the C# compiler. The rules for nullable warnings are quite sophisticated (as Jon's analysis shows!), but they are rules, and can be reasoned about.

As we roll out the feature it feels like we mostly struck the right balance, but there are a few places that do come up as awkward, and we'll be revisiting those for C# 9.0.
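
For illustration (a sketch added here, not part of the original answer): since the analysis does track the null state of the value being tested, checking the field directly, or capturing it with a pattern, avoids the warning:

public string Test()
{
    // The empty property pattern { } matches any non-null value and
    // captures it in a new local whose null state the compiler tracks.
    if (_test is { } notNull)
        return notNull; // No warning

    return "";
}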

Durgy answered 13/12, 2019 at 19:25 Comment(10)
You know you want to put lattice theory in the specification; lattice theory is awesome and not at all confusing! Do it! :)Derrek
You know your question is legit when the program manager for C# responds!Jew
@EricLippert Can I upvote your comment multiple times? After finding out what lattice theory is, that is.Renfrew
@TanveerBadar: Lattice theory is about the analysis of sets of values that have a partial order; types are a good example; if a value of type X is assignable to a variable of type Y then that implies that Y is "big enough" to hold X, and that is enough to form a lattice, which then tells us that checking assignability in the compiler could be phrased in the specification in terms of lattice theory. This is relevant to static analysis because a great many topics of interest to an analyzer other than type assignability are also expressible in terms of lattices.Derrek
@TanveerBadar: You're welcome. I'm joshing Mads here because for the last several decades now there has been a tension between making the spec more precise but at the same time keeping it readable by line-of-business programmers who have not studied theoretical mathematics. Type checking, definite assignment checking, nullability checking, all that could be described more "mathematically" in the specification but Anders and Mads have quite reasonably opted to limit the amount of theory in the spec.Derrek
@TanveerBadar: lara.epfl.ch/w/_media/sav08:schwartzbach.pdf has some good introductory examples of how static analysis engines use lattice theory.Derrek
@EricLippert Awesome doesn't begin to describe you. That link is going into my must read list right away.Renfrew
@TanveerBadar: Thanks! Andy also has some opinions on this subject: twitter.com/andygocke/status/1206692183111163904Derrek
Given that this answer is "from the horse's mouth" I've marked this as the answer instead of Jon's (despite Jon's being obviously a good answer!)Philender
@MatthewWatson, this is not actually an answer; it only says it was "sophisticated" to implement. Jon explains why it happens, which is exactly what you asked. Also, the "horse's mouth" hasn't been active for the past two years; I guess they are busy fixing it in "C# 9.0", even though I'm still getting this warning with C# 10.Desex
I can make a reasonable guess as to what's going on here, but it's all a bit complicated :) It involves the null state and null tracking described in the draft spec. Fundamentally, at the point where we want to return, the compiler will warn if the state of the expression is "maybe null" instead of "not null".

This answer is in somewhat narrative form rather than just "here's the conclusions"... I hope it's more useful that way.

I'm going to simplify the example slightly by getting rid of the fields, and consider a method with one of these two signatures:

public static string M(string? text)
public static string M(string text)

In the implementations below I've given each method a different number so I can refer to specific examples unambiguously. It also allows all of the implementations to be present in the same program.

In each of the cases described below, we'll do various things but end up trying to return text - so it's the null state of text that's important.

Unconditional return

First, let's just try to return it directly:

public static string M1(string? text) => text; // Warning
public static string M2(string text) => text;  // No warning

So far, so simple. The nullable state of the parameter at the start of the method is "maybe null" if it's of type string? and "not null" if it's of type string.

Simple conditional return

Now let's check for null within the if statement condition itself. (I would use the conditional operator, which I believe will have the same effect, but I wanted to stay truer to the question.)

public static string M3(string? text)
{
    if (text is null)
    {
        return "";
    }
    else
    {
        return text; // No warning
    }
}

public static string M4(string text)
{
    if (text is null)
    {
        return "";
    }
    else
    {
        return text; // No warning
    }
}

Great, so it looks like within an if statement where the condition itself checks for nullity, the state of the variable within each branch of the if statement can be different: within the else block, the state is "not null" in both pieces of code. So in particular, in M3 the state changes from "maybe null" to "not null".

Conditional return with a local variable

Now let's try to hoist that condition to a local variable:

public static string M5(string? text)
{
    bool isNull = text is null;
    if (isNull)
    {
        return "";
    }
    else
    {
        return text; // Warning
    }
}

public static string M6(string text)
{
    bool isNull = text is null;
    if (isNull)
    {
        return "";
    }
    else
    {
        return text; // Warning
    }
}

Both M5 and M6 issue warnings. So not only do we not get the positive effect of the state change from "maybe null" to "not null" in M5 (as we did in M3)... we get the opposite effect in M6, where the state goes from "not null" to "maybe null". That really surprised me.

So it looks like we've learned that:

  • Logic around "how a local variable was computed" isn't used to propagate state information. More on that later (and see the sketch just after this list).
  • Introducing a null comparison can warn the compiler that something it previously thought wasn't null might be null after all.
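
As an aside (an added sketch, not part of the original answer), hoisting the value rather than the result of the comparison keeps the analysis working, because the compiler tracks the null state of the new local directly:

public static string M5b(string? text)
{
    string? copy = text;
    if (copy == null)
    {
        return "";
    }
    else
    {
        return copy; // No warning: copy's null state is tracked
    }
}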

Unconditional return after an ignored comparison

Let's look at the second of those bullet points, by introducing a comparison before an unconditional return (so we're completely ignoring the result of the comparison):

public static string M7(string? text)
{
    bool ignored = text is null;
    return text; // Warning
}

public static string M8(string text)
{
    bool ignored = text is null;
    return text; // Warning
}

Note how M8 feels like it should be equivalent to M2 - both have a not-null parameter which they return unconditionally - but the introduction of a comparison with null changes the state from "not null" to "maybe null". We can get further evidence of this by trying to dereference text before the condition:

public static string M9(string text)
{
    int length1 = text.Length;   // No warning
    bool ignored = text is null;
    int length2 = text.Length;   // Warning
    return text;                 // No warning
}

Note how the return statement doesn't have a warning now: the state after executing text.Length is "not null" (because if we execute that expression successfully, it couldn't be null). So the text parameter starts as "not null" due to its type, becomes "maybe null" due to the null comparison, then becomes "not null" again after the second text.Length.
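
(Another added sketch, consistent with the behaviour just described.) Throwing on null restores "not null" in the same way as a successful dereference, because the "maybe null" path can never reach the return statement:

public static string M9b(string? text)
{
    if (text == null)
    {
        throw new ArgumentNullException(nameof(text));
    }
    return text; // No warning: the null path always throws
}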

What comparisons affect state?

So that's a comparison of text is null... what effect do similar comparisons have? Here are four more methods, all starting with a non-nullable string parameter:

public static string M10(string text)
{
    bool ignored = text == null;
    return text; // Warning
}

public static string M11(string text)
{
    bool ignored = text is object;
    return text; // No warning
}

public static string M12(string text)
{
    bool ignored = text is { };
    return text; // No warning
}

public static string M13(string text)
{
    bool ignored = text != null;
    return text; // Warning
}

So even though x is object is now a recommended alternative to x != null, they don't have the same effect: only a comparison with null (with any of is, == or !=) changes the state from "not null" to "maybe null".

Why does hoisting the condition have an effect?

Going back to our first bullet point earlier, why don't M5 and M6 take account of the condition which led to the local variable? This doesn't surprise me as much as it appears to surprise others. Building that sort of logic into the compiler and specification is a lot of work, and for relatively little benefit. Here's another example with nothing to do with nullability where inlining something has an effect:

public static int X1()
{
    if (true)
    {
        return 1;
    }
}

public static int X2()
{
    bool alwaysTrue = true;
    if (alwaysTrue)
    {
        return 1;
    }
    // Error: not all code paths return a value
}

Even though we know that alwaysTrue will always be true, it doesn't satisfy the requirements in the specification that make the code after the if statement unreachable, which is what we need.
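
(A sketch added for illustration; a comment below makes the same point.) Declaring the local as const makes the condition a constant expression, so the compiler treats it exactly like if (true) and the error disappears:

public static int X2b()
{
    const bool alwaysTrue = true;
    if (alwaysTrue)
    {
        return 1;
    }
    // No error: the constant condition makes the code after
    // the if statement unreachable.
}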

Here's another example, around definite assignment:

public static void X3()
{
    string x;
    bool condition = DateTime.UtcNow.Year == 2020;
    if (condition)
    {
        x = "It's 2020.";
    }
    if (!condition)
    {
        x = "It's not 2020.";
    }
    // Error: x is not definitely assigned
    Console.WriteLine(x);
}

Even though we know that the code will enter exactly one of those if statement bodies, there's nothing in the spec to work that out. Static analysis tools may well be able to do so, but trying to put that into the language specification would be a bad idea, IMO - it's fine for static analysis tools to have all kinds of heuristics which can evolve over time, but not so much for a language specification.
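
(One more added sketch.) Restructuring the two checks into a single if/else puts the code back into a form the definite assignment rules do recognize:

public static void X3b()
{
    string x;
    bool condition = DateTime.UtcNow.Year == 2020;
    if (condition)
    {
        x = "It's 2020.";
    }
    else
    {
        x = "It's not 2020.";
    }
    Console.WriteLine(x); // No error: x is assigned on both paths
}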

Voight answered 13/12, 2019 at 8:37 Comment(14)
Great analysis Jon. The key thing that I learned when studying the Coverity checker is that code is evidence of the beliefs of its authors. When we see a null check that should inform us that the authors of the code believed the check was necessary. The checker is actually looking for evidence that the authors' beliefs were inconsistent because it is the places where we see inconsistent beliefs about, say, nullity, that bugs happen.Derrek
When we see for instance if (x != null) x.foo(); x.bar(); we have two pieces of evidence; the if statement is evidence of the proposition "the author believes that x could be null before the call to foo" and the following statement is evidence of "the author believes x is not null before the call to bar", and this contradiction leads to the conclusion that there is a bug. The bug is either the relatively benign bug of an unnecessary null check, or the potentially crashing bug. Which bug is the real bug is not clear, but it is clear that there is one.Derrek
The problem is that relatively unsophisticated checkers that do not track the meanings of locals, and do not prune "false paths" -- control flow paths that humans can tell you are impossible -- tend to produce false positives, precisely because they have not accurately modeled the beliefs of the authors. That's the tricky bit!Derrek
The inconsistency between "is object", "is {}" and "!= null" is an item we've been discussing internally the last few weeks. Going to bring it up at LDM in the very near future to decide if we need to consider these as pure null checks or not (which makes the behavior consistent).Knew
@EricLippert without daring to contradict you in this field, and out of genuine curiosity only. Isn't the nullable ref types feature made specifically to warn users where they didn't believe the check was necessary, but was, as opposed to discussing the beliefs of who wrote the code?Giantess
Thanks Jon for this great analysis of the compiler's behavior. It all makes sense to me, and obviously you got it right. But I still wonder whether it wouldn't be feasible for the compiler to avoid the warnings in M6, M8 and also in the question? After all, 'text' is known to be not null and it isn't modified... IMHO it looks like something we can expect to be improved in later versions, isn't it?Spermous
@Arnon: The compiler doesn't know it's not null in M8 - it might still be null, and the code author has indicated that they've partly considered the option that it might be null. M6 is genuinely safe, but I'm fine with the compiler not performing that level of analysis.Voight
@Jon: How come it doesn't know that it's not null if it's defined as a 'string' rather than a 'string?' in M8?Spermous
@ArnonAxelrod That says it's not meant to be null. It could still be null, because nullable reference types are only a compiler hint. (Examples: M8(null!); or calling it from C# 7 code, or ignoring warnings.) It's not like the type safety of the rest of the platform.Voight
It would be nicer if string text was truly unable to be null. Then the C# compiler would warn when the user wrote text != null, and never warn when the user returns something of (compile-time) type "string" as a return parameter of the same type "string". This is the situation we have with value types that are not wrapped in Nullable<>. For example, with an int i, the compiler will warn if I say i != null (a clear coding error).Twila
The example X2 can be "fixed" by writing const at the declaration of the local alwaysTrue.Twila
@JeppeStigNielsen: Absolutely - the point is that the compiler doesn't "see" that a variable that happens to never be changed is effectively constant.Voight
I understand that in this case the compiler devs couldn't be bothered to write code to handle these kinds of special cases. Even though it's not related to this question, I don't understand why public myclass method1(){ return null;} shows this warning? It is not a "possible", it is a method that definitely returns null. BTW, the document linked from MSDN is missing now, thanks to MS changing links to documents every few months.Desex
@AaA: I suspect the effort required to differentiate between "might return null" and "definitely returns null" is more than the value obtained by doing so.Voight
You have discovered evidence that the program-flow algorithm that produces this warning is relatively unsophisticated when it comes to tracking the meanings encoded in local variables.

I have no specific knowledge of the flow checker's implementation, but having worked on implementations of similar code in the past, I can make some educated guesses. The flow checker is likely deducing two things in the false positive case: (1) _test could be null, because if it could not, you would not have the comparison in the first place, and (2) isNull could be true or false -- because if it could not, you would not have it in an if. But the connection between the two, namely that return _test; only runs when _test is not null, is not being made.

This is a surprisingly tricky problem, and you should expect that it will take a while for the compiler to attain the sophistication of tools that have had multiple years of work by experts. The Coverity flow checker, for example, would have no problem at all in deducing that neither of your two variations had a null return, but the Coverity flow checker costs serious money for corporate customers.

Also, the Coverity checkers are designed to run on large codebases overnight; the C# compiler's analysis must run between keystrokes in the editor, which significantly changes the sorts of in-depth analyses you can reasonably perform.

Derrek answered 12/12, 2019 at 15:26 Comment(2)
"Unsophisticated" is right -- I consider it forgivable if it stumbles on things like conditionals, as we all know the halting problem is a bit of a tough nut in such matters, but the fact that there's a difference at all between bool b = x != null vs bool b = x is { } (with neither assignment actually used!) shows that even the recognized patterns for null checks are questionable. Not to disparage the undoubtedly hard work of the team to make this work mostly as it should for real, in-use code bases -- it looks like the analysis is capital-P pragmatic.Gelation
@JeroenMostert: Jared Par mentions in a comment on Jon Skeet's answer that Microsoft is discussing that issue internally.Burkhart
All the other answers are pretty much exactly correct.

In case anyone's curious, I tried to spell out the compiler's logic as explicitly as possible in https://github.com/dotnet/roslyn/issues/36927#issuecomment-508595947

The one piece that's not mentioned is how we decide whether a null check should be considered "pure", in the sense that if you do it, we should seriously consider whether null is a possibility. There are a lot of "incidental" null checks in C#, where you test for null as a part of doing something else, so we decided that we wanted to narrow down the set of checks to ones that we were sure people were doing deliberately. The heuristic we came up with was "contains the word null", so that's why x != null and x is object produce different results.
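
To recap that heuristic against Jon's M10-M13 examples above (a summary sketch added here, not part of the original answer):

public static string PureChecks(string text)
{
    bool a = text == null; // contains "null": treated as a deliberate null check
    bool b = text is null; // contains "null": likewise
    bool c = text != null; // contains "null": likewise
    return text;           // Warning: the checks made text "maybe null"
}

public static string ImpureChecks(string text)
{
    bool d = text is object; // no "null": not treated as a null check
    bool e = text is { };    // no "null": likewise
    return text;             // No warning: text stays "not null"
}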

Absorbance answered 17/12, 2019 at 6:45 Comment(0)
Starting from C# 8.0 you can use the "null forgiving operator" (the exclamation mark) at the end of the expression in order to remove the warning, like this:

...
else
  return _test!;
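
Note that the null forgiving operator only suppresses the compile-time warning; it has no effect at runtime, so if the value really can be null you can still get a NullReferenceException.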
Techno answered 15/6, 2023 at 22:39 Comment(0)
