Non-nullable reference types' default values VS non-nullable value types' default values
This isn't my first question about nullable reference types; I've been experimenting with the feature for a few months now. But the more I use it, the more confused I get and the less value I see in it.

Take this code for example

string? nullableString = default(string?);
string nonNullableString = default(string);

int? nullableInt = default(int?);
int nonNullableInt = default(int);

Executing that gives:

nullableString => null

nonNullableString => null

nullableInt => null

nonNullableInt => 0

The default value of a (non-nullable) integer has always been 0, but to me it doesn't make sense for a non-nullable string's default value to be null. Why this choice? It goes against the non-nullable principles we've always been used to. I think the default value of a non-nullable string should have been String.Empty.

I mean, somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We could have chosen 1 or 2, but no, the consensus is 0. So can't we just specify that the default value of a string is String.Empty when the nullable reference type feature is activated? Moreover, it seems Microsoft intends to activate it by default in .NET 5 projects in the near future, so this feature would become the normal behavior.

Now same example with an object:

Foo? nullableFoo = default(Foo?);
Foo nonNullableFoo = default(Foo);

This gives:

nullableFoo => null

nonNullableFoo => null

Again this doesn't make sense to me; in my opinion the default value of a Foo should be new Foo() (or it should be a compile error if no parameterless constructor is available). Why set an object that isn't supposed to be null to null by default?

Now extending this question even more

string nonNullableString = null;
int nonNullableInt = null;

The compiler gives a warning for the first line, which can be turned into an error with a simple setting in the .csproj file: <WarningsAsErrors>CS8600</WarningsAsErrors>. And it gives a compilation error for the second line, as expected.
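For reference, the whole setup can be sketched in an SDK-style project file. Nullable and WarningsAsErrors are standard MSBuild properties; the target framework shown is just illustrative:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net5.0</TargetFramework>
    <!-- Turn on the Nullable Reference Types feature project-wide. -->
    <Nullable>enable</Nullable>
    <!-- Promote the null-assignment warning to a build error. -->
    <WarningsAsErrors>CS8600</WarningsAsErrors>
  </PropertyGroup>
</Project>
```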

So the behavior of non-nullable value types and non-nullable reference types isn't the same, but this is acceptable since I can override it.

However when doing that:

string nonNullableString = null!;
int nonNullableInt = null!;

The compiler is completely fine with the first line, no warning at all. I discovered null! recently while experimenting with the nullable reference type feature, and I expected the compiler to be fine with the second line too, but it isn't. Now I'm just really confused as to why Microsoft decided to implement different behaviors.

Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' lives at all (as opposed to non-nullable value types, which can NOT be null and therefore never need to be null-checked).

So in the end it seems the only value added is in terms of signatures. Developers can now be explicit about whether a method's return value or a property can be null (for example in a C# representation of a database table where NULL is an allowed value in a column).

Besides that, I don't see how I can use this new feature effectively. Could you please give me other useful examples of how you use nullable reference types? I would really like to make good use of this feature to improve my developer's life, but I really don't see how...

Thank you

Maleficent answered 26/8, 2020 at 11:1 Comment(6)
You're ignoring the warnings generated by string nonNullableString = default(string);, that's why you don't see any benefit. You're explicitly storing a null into a non-nullable variable, so the compiler complains telling you something's wrong. That's the benefit of this feature, especially if you treat warnings as errors.Precincts
I'm not ignoring those warnings, in fact my configuration now is <WarningsAsErrors>CS8600;CS8625</WarningsAsErrors>. I'm just super confused on why this is even possible. We can never be sure a non-nullable reference type won't be null however for value types we are sure about it...Electrophorus
I answered the substantive part of your question below. As an aside, I would suggest you use a little less loaded language when asking here. Your choice of words ("it doesn't make sense", "it's completely against (...) principles we've always been used to") makes your question read more like a rant than an honest inquiry. People don't like being ranted at, so if an actual programmer from the Roslyn team read your post, they might have less of a "let's help this guy understand" attitude and more of a "defend myself from hostility" attitude, which doesn't help the discussion.Allay
@JérômeMEVEL lots of things that are "obviously" invalid/broken code are syntactically perfectly legal and can't be detected as broken... and lots of things that are perfectly safe and legal are computationally impossible to prove valid (see "the halting problem"). The compiler's job is to block definite problems, try to warn you about possible problems, but not actively get in your way when you might be right. In this case: it gave you a warning - what more did you want?Echt
@Allay thanks a lot for your long answer, and sorry about my wording; English isn't my mother tongue. My goal wasn't to just rant; I wanted to show why the behavior of this feature confuses me so much, in order to get different opinions and understand it betterElectrophorus
I think this is just a really confusingly named feature. It's called "Nullable Reference Types" but really it's just "Nullable Warnings Everywhere". Everyone I talk to is very confused by this feature and has the same expectation as you - that it actually means a type that isn't nullable and has a non-null default value.Illogical
You are very confused by how programming language design works.

Default values

The default value of an (non-nullable) integer has always been 0 but to me it doesn't make sense a non-nullable string's default value is null. Why this choice? This is completely against non-nullable principles we've always been used to. I think the default non-nullable string's default value should have been String.Empty.

Default values for variables are a basic feature of the language that is in C# since the very beginning. The specification defines the default values:

For a variable of a value_type, the default value is the same as the value computed by the value_type's default constructor ([see] Default constructors). For a variable of a reference_type, the default value is null.

This makes sense from a practical standpoint, as one of the basic usages of defaults is when declaring a new array of values of a given type. Thanks to this definition, the runtime can just zero all the bits in the allocated array - default constructors for value types are always all-zero values in all fields and null is represented as an all-zero reference. That's literally the next line in the spec:

Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.
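The all-bits-zero rule is easy to observe directly. A freshly allocated array is zeroed memory, so every element of it comes out as the all-zero value of its element type. A minimal sketch:

```csharp
using System;

// A freshly allocated array is memory the runtime has zeroed:
// reference-type elements come out as null, value-type elements
// as their all-zero value, Nullable<T> elements as "no value".
string[] strings = new string[3];
int[] ints = new int[3];
int?[] nullableInts = new int?[3];

Console.WriteLine(strings[0] is null);       // True: an all-zero reference is null
Console.WriteLine(ints[0]);                  // 0: an all-zero Int32
Console.WriteLine(nullableInts[0].HasValue); // False: an all-zero Nullable<int>
```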

Now, the Nullable Reference Types (NRT) feature was released last year with C# 8. The choice here was not "let's make default values null in spite of NRT" but rather "let's not waste time and resources trying to completely rework how the default keyword works just because we're introducing NRTs". NRTs are annotations for programmers; by design they have zero impact on the runtime.

I would argue that not being able to specify default values for reference types is a similar case to not being able to define a parameterless constructor on a value type - the runtime needs a fast all-zero default, and null is a reasonable default for reference types. Not all types have a reasonable default value - what is a reasonable default for a TcpClient?

If you want your own custom default, implement a static Default method or property and document it so that the developers can use that as a default for that type. No need to change the fundamentals of the language.
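A sketch of that pattern (the Foo type and its Name property are hypothetical, just for illustration):

```csharp
public sealed class Foo
{
    public string Name { get; }

    public Foo(string name) => Name = name;

    // A documented, explicit default instance for this type.
    // Callers use Foo.Default instead of relying on default(Foo),
    // which is still null for any reference type.
    public static Foo Default { get; } = new Foo(string.Empty);
}
```

A caller can then write, for example, Foo foo = maybeFoo ?? Foo.Default; instead of risking a null.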

I mean somewhere deep down in the implementation of C# it must be specified that 0 is the default value of an int. We also could have chosen 1 or 2 but no, the consensus is 0. So can't we just specify the default value of a string is String.Empty when the Nullable reference type feature is activated?

As I said, the deep-down reason is that zeroing a range of memory is blazingly fast and convenient. There is no runtime component responsible for checking what the default of a given type is and repeating that value across an array when you create a new one, since that would be horribly inefficient.

Your proposal would basically mean that the runtime would have to somehow inspect the nullability metadata of strings at runtime and treat an all-zero non-nullable string value as an empty string. This would be a very involved change digging deep into the runtime, just for this one special case of an empty string. It's much more cost-efficient to use a static analyzer to warn you when you're assigning null instead of a sensible default to a non-nullable string. Fortunately we have such an analyzer, namely the NRT feature, which consistently refuses to compile my classes that contain definitions like this:

string Foo { get; set; }

by issuing a warning and forcing me to change that to:

string Foo { get; set; } = "";

(I recommend turning on Treat Warnings As Errors by the way, but it's a matter of taste.)

Again this doesn't make sense to me, in my opinion the default value of a Foo should be new Foo() (or gives a compile error if no parameterless constructor is available). Why by default setting to null an object that isn't supposed to be null?

This would, among other things, render you unable to declare an array of a reference type without a default constructor. Most basic collections use an array as the underlying storage, including List<T>. It would also require allocating N default instances of a type whenever you make an array of size N, which is, again, horribly inefficient. And constructors can have side effects. I'm not going to ponder further how many things this would break; suffice it to say it's hardly an easy change to make. Considering how complicated NRT was to implement anyway (the NullableReferenceTypesTests.cs file in the Roslyn repo has ~130,000 lines of code alone), the cost-efficiency of introducing such a change is... not great.

The bang operator (!) and Nullable Value Types

The compiler is completely fine with the 1st line, no warning at all. I discovered null! recently when experiencing with the nullable reference type feature and I was expecting the compiler to be fine for the 2nd line too but this isn't the case. Now I'm just really confused as for why Microsoft decided to implement different behaviors.

The null value is valid only for reference types and nullable value types. Nullable types are, again, defined in the spec:

A nullable type can represent all values of its underlying type plus an additional null value. A nullable type is written T?, where T is the underlying type. This syntax is shorthand for System.Nullable<T>, and the two forms can be used interchangeably. (...) An instance of a nullable type T? has two public read-only properties:

  • A HasValue property of type bool
  • A Value property of type T

An instance for which HasValue is true is said to be non-null. A non-null instance contains a known value and Value returns that value.

The reason you can't assign null to an int is rather obvious - int is a value type that takes 32 bits and represents an integer. The null value is a special reference value that is machine-word sized and represents a location in memory. Assigning null to an int has no sensible semantics. Nullable<T> exists specifically for the purpose of allowing null assignments to value types to represent "no value" scenarios. But note that doing

int? x = null;

is purely syntactic sugar. The all-zero value of Nullable<T> is the "no value" scenario, since it means that HasValue is false. There is no magic null value being assigned anywhere; it's the same as saying = default - it just creates a new all-zero struct of the given type T and assigns it.
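That equivalence is easy to check: the null literal, default, and the parameterless Nullable<int> constructor all produce the same all-zero struct.

```csharp
using System;

// 'null', 'default' and the parameterless constructor all produce
// the same all-zero Nullable<int> struct; no reference is involved.
int? fromNull = null;
int? fromDefault = default;
int? fromCtor = new int?();

Console.WriteLine(fromNull.HasValue);    // False
Console.WriteLine(fromDefault.HasValue); // False
Console.WriteLine(fromCtor.HasValue);    // False

// The lifted == operator treats all three as the same "no value" instance.
Console.WriteLine(fromNull == fromDefault && fromDefault == fromCtor); // True
```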

So again, the answer is: no one deliberately designed this to work incompatibly with NRTs. Nullable value types are a much more fundamental feature of the language that has worked like this since its introduction in C# 2. And the way you propose it to work doesn't translate to a sensible implementation - would you want all value types to be nullable? Then all of them would have to carry the HasValue field, which takes an additional byte and possibly screws up padding (I think a language that represents int as a 40-bit type and not 32 would be considered heretical :) ).

The bang operator is used specifically to tell the compiler "I know that I'm dereferencing a nullable/assigning null to a non-nullable, but I'm smarter than you and I know for a fact this is not going to break anything". It disables static analysis warnings. But it does not magically expand the underlying type to accommodate a null value.
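A short sketch of that: the suppression is purely compile-time, and the null is still there when the variable is dereferenced at runtime.

```csharp
using System;

// '!' only silences the compile-time warning; the variable still
// holds a plain null reference at runtime.
string s = null!;

try
{
    Console.WriteLine(s.Length); // throws: the null is still there
}
catch (NullReferenceException)
{
    Console.WriteLine("NullReferenceException: '!' changed nothing at runtime");
}
```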

Summary

Considering it doesn't protect at all against having null in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' life at all (as opposed to non-nullable value types which could NOT be null and therefore don't need to be null-checked)

So at the end it seems the only value added is just in terms of signatures. Developers can now be explicit whether or not a method's return value could be null or not or if a property could be null or not (for example in a C# representation of a database table where NULL is an allowed value in a column).

From the official docs on NRTs:

This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types (...) These warnings are emitted at compile time. The compiler doesn't add any null checks or other runtime constructs in a nullable context. At runtime, a nullable reference and a non-nullable reference are equivalent.

So you're right that "the only value added is just in terms of signatures" and static analysis - which is the reason we have signatures in the first place. And is that not an improvement to developers' lives? Note that your line

string nonNullableString = default(string);

gives off a warning. If you did not ignore it (or even better, had Treat Warnings As Errors on) you'd get value - the compiler found a bug in your code for you.

Does it protect you from assigning null to a non-nullable reference type at runtime? No. Does it improve developers' lives? A thousand times yes. The power of the feature comes from the warnings and the nullable analysis done at compile time. If you ignore the warnings issued by NRT, you do so at your own peril. The fact that you can ignore the compiler's helpful hand does not make it useless. After all, you could just as well put your entire code in an unsafe context and program in C; that doesn't mean C# is useless because you can circumvent its safety guarantees.
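A brief sketch of the kind of analysis involved (the DescribeLength helper is hypothetical): the compiler tracks null state through the code and only warns on dereferences it cannot prove safe.

```csharp
#nullable enable
using System;

Console.WriteLine(DescribeLength(null));    // 0
Console.WriteLine(DescribeLength("hello")); // 5

static int DescribeLength(string? text)
{
    // Uncommenting the next line yields warning CS8602:
    // "Dereference of a possibly null reference."
    // return text.Length;

    if (text is null)
    {
        return 0;
    }

    // No warning here: flow analysis has proven 'text' is non-null
    // on this path.
    return text.Length;
}
```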

Allay answered 26/8, 2020 at 12:28 Comment(0)
Again this doesn't make sense to me, in my opinion the default value of a Foo should be new Foo() (or gives a compile error if no parameterless constructor is available)

That's an opinion, but: that isn't how it is implemented. default means null for reference-types, even if it is invalid according to nullability rules. The compiler spots this and warns you about it on the line Foo nonNullableFoo = default(Foo);:

Warning CS8600 Converting null literal or possible null value to non-nullable type.

As for string nonNullableString = null!; and

The compiler is completely fine with the 1st line, no warning at all.

You told it to ignore it; that's what the ! means. If you tell the compiler to not complain about something, it isn't valid to complain that it didn't complain.

So at the end it seems the only value added is just in terms of signatures.

No, it has lots more validity, but if you ignore the warnings it does raise (CS8600 above), and if you suppress the other things it does for you (!), then yes, it will be less useful. So... don't do that?

Echt answered 26/8, 2020 at 12:17 Comment(8)
Thanks Marc. I understand why default(int) is 0, but can you explain why default(string) is null and not an empty string?Savanna
@Savanna again: default means the same as null for reference-types; that's simply an axiomatic/definitional thing; string is a reference-type, therefore the default of string is null, not an empty string.Echt
Ah, I didn't know that string is a reference type. It makes sense to me now. Thank youSavanna
I'm not ignoring the warnings; in fact my default configuration is now <WarningsAsErrors>CS8600;CS8625</WarningsAsErrors>. I'm also aware of what null! means, so I NEVER use it. These were just examples to illustrate why I'm so confused about this feature. Could you please elaborate your answer to show why this feature has lots more validity? I don't see how I'm supposed to deal with non-nullable reference types that can actually be null. It just adds even more confusion to the code...Electrophorus
@JérômeMEVEL reality check: NullReferenceException is one of the most common exceptions people hit in .NET; the NRT feature makes it possible to spot a possible NullReferenceException at compile-time, rather than hitting it at runtime. That's the point and purpose of NRT. Now, there are some limitations - V0ldek (other answer) talks about nulls in arrays as a great example - but to a reasonable approximation, as long as you don't use ! habitually, and as long as you read and act on the warnings it gives you, you should find you virtually never see another NullReferenceExceptionEcht
@JérômeMEVEL and in reality: most of the time you expect references to be non-null, which means most of your code looks exactly the same and has no additional "confusion"Echt
@MarcGravell about your last comment: this is exactly why I find it confusing. If I have a non-null property I don't expect it to be null, yet it can be null, and I still need to check that the property is not null before using it. I tried to deal with all the warnings at the beginning but gave up because of EF Core, AutoMapper and our company's custom framework, which heavily use DTOs with {get; set;} properties to serialize and deserialize data. Now I'm just flooded with warnings that I don't know how to get rid of properly, because my DTOs must have {get; set;} propertiesElectrophorus
@JérômeMEVEL if it is null despite NRT, then somebody is probably using ! in a place where they shouldn't beEcht

© 2022 - 2024 — McMap. All rights reserved.