You are very confused by how programming language design works.
## Default values
> The default value of a (non-nullable) integer has always been 0, but to me it doesn't make sense that a non-nullable string's default value is `null`. Why this choice? It goes completely against the non-nullable principles we've always been used to. I think a non-nullable `string`'s default value should have been `String.Empty`.
Default values for variables are a basic feature of the language that has been in C# since the very beginning. The specification defines the default values:
> For a variable of a value_type, the default value is the same as the value computed by the value_type's default constructor ([see] Default constructors).
>
> For a variable of a reference_type, the default value is `null`.
This makes sense from a practical standpoint, as one of the basic uses of defaults is declaring a new array of values of a given type. Thanks to this definition, the runtime can just zero all the bits in the allocated array: the default value of a value type always has all of its fields zeroed, and `null` is represented as an all-zero reference. That's literally the next line in the spec:
> Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.
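A quick illustration of what "all-bits-zero" buys in practice (plain C#, nothing NRT-specific):

```csharp
using System;

// Allocating an array only requires zeroing memory; every element starts out
// as the all-bits-zero pattern of its element type.
int[] numbers = new int[3];      // { 0, 0, 0 }             -- all-zero bits are the int 0
bool[] flags = new bool[3];      // { false, false, false }
string[] names = new string[3];  // { null, null, null }    -- all-zero bits are the null reference

Console.WriteLine(default(int));            // 0
Console.WriteLine(default(string) is null); // True
```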
Now, the Nullable Reference Types (NRT) feature was released last year with C# 8. The choice here is not "let's implement default values to be `null` in spite of NRT", but rather "let's not waste time and resources trying to completely rework how the `default` keyword works just because we're introducing NRTs". NRTs are annotations for programmers; by design, they have zero impact on the runtime.
I would argue that not being able to specify default values for reference types is a case similar to not being able to define a parameterless constructor on a value type: the runtime needs a fast all-zero default, and `null` is a reasonable default for reference types. Not all types have a reasonable default value anyway - what is a reasonable default for a `TcpClient`?
If you want your own custom default, implement a static `Default` method or property and document it so that developers can use it as the default for that type. No need to change the fundamentals of the language.
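For example (a sketch of that convention - `RetryPolicy` and its `Default` property are made up for illustration, not an existing API):

```csharp
using System;

public sealed class RetryPolicy
{
    public int MaxAttempts { get; }
    public TimeSpan Delay { get; }

    public RetryPolicy(int maxAttempts, TimeSpan delay)
    {
        MaxAttempts = maxAttempts;
        Delay = delay;
    }

    // The documented "reasonable default" for this type; callers opt in explicitly
    // (var policy = RetryPolicy.Default;) instead of the language inventing one.
    public static RetryPolicy Default { get; } =
        new RetryPolicy(maxAttempts: 3, delay: TimeSpan.FromSeconds(1));
}
```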
> I mean, somewhere deep down in the implementation of C# it must be specified that `0` is the default value of an `int`. We could also have chosen `1` or `2`, but no, the consensus is `0`. So can't we just specify that the default value of a `string` is `String.Empty` when the nullable reference types feature is activated?
As I said, the "deep down" is that zeroing a range of memory is blazingly fast and convenient. There is no runtime component responsible for checking what the default of a given type is and repeating that value across an array when you create a new one, since that would be horribly inefficient.
Your proposal would basically mean that the runtime would have to somehow inspect the nullability metadata of `string`s at runtime and treat an all-zero non-nullable `string` value as an empty string. This would be a very involved change digging deep into the runtime, just for this one special case of an empty string. It's much more cost-efficient to use a static analyzer to warn you when you're assigning `null` instead of a sensible default to a non-nullable `string`. Fortunately we have such an analyzer, namely the NRT feature, which consistently refuses to compile my classes that contain definitions like this:
string Foo { get; set; }
by issuing a warning and forcing me to change that to:
string Foo { get; set; } = "";
(I recommend turning on Treat Warnings As Errors by the way, but it's a matter of taste.)
> Again, this doesn't make sense to me. In my opinion the default value of a `Foo` should be `new Foo()` (or a compile error if no parameterless constructor is available). Why set an object that isn't supposed to be null to `null` by default?
This would, among other things, render you unable to declare an array of a reference type without a default constructor. Most basic collections use an array as the underlying storage, including `List<T>`. And it would require you to allocate `N` default instances of a type whenever you make an array of size `N`, which is, again, horribly inefficient. Also, constructors can have side effects. I'm not going to further ponder how many things this would break; suffice it to say it's hardly an easy change to make. Considering how complicated NRT was to implement anyway (the `NullableReferenceTypesTests.cs` file in the Roslyn repo alone has ~130,000 lines of code), the cost-efficiency of introducing such a change is... not great.
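To make the array point concrete (a sketch; `TcpClient` is just a convenient example of a type whose construction does real work):

```csharp
using System.Net.Sockets;

// Today: allocating the array is a single cheap zeroing operation,
// and every slot is simply null until you decide what to put in it.
TcpClient[] clients = new TcpClient[10_000];

// Under a "default = new T()" rule, the same line would have to run 10,000
// TcpClient constructors up front (each doing real construction work you may
// never need), and it couldn't even compile for element types that have
// no parameterless constructor at all.
```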
## The bang operator (`!`) and Nullable Value Types
> The compiler is completely fine with the 1st line, no warning at all. I discovered `null!` recently when experimenting with the nullable reference types feature, and I was expecting the compiler to be fine with the 2nd line too, but this isn't the case. Now I'm just really confused as to why Microsoft decided to implement different behaviors.
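(The two lines referred to aren't quoted above; presumably they looked roughly like the following - a reconstruction, not the asker's exact code.)

```csharp
#nullable enable

// "1st line": compiles with no warning -- the ! only silences the nullability
// analysis, and null is a representable runtime value for any reference type.
string nonNullableString = null!;

// "2nd line": does not compile -- null is simply not a possible value of the
// 32-bit int type, and no amount of ! can change what fits in those bits.
int nonNullableInt = null!;
```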
The `null` value is valid only for reference types and nullable value types. Nullable types are, again, defined in the spec:
> A nullable type can represent all values of its underlying type plus an additional `null` value. A nullable type is written `T?`, where `T` is the underlying type. This syntax is shorthand for `System.Nullable<T>`, and the two forms can be used interchangeably. (...) An instance of a nullable type `T?` has two public read-only properties:
>
> - A `HasValue` property of type `bool`
> - A `Value` property of type `T`
>
> An instance for which `HasValue` is `true` is said to be non-null. A non-null instance contains a known value and `Value` returns that value.
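In code, those two properties behave like this:

```csharp
using System;

int? x = 5;                    // Nullable<int> wrapping a value
Console.WriteLine(x.HasValue); // True
Console.WriteLine(x.Value);    // 5

int? y = null;                 // Nullable<int> with no value
Console.WriteLine(y.HasValue); // False
// Reading y.Value here would throw InvalidOperationException: there is no value to return.
```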
The reason you can't assign `null` to an `int` is rather obvious: `int` is a value type that takes 32 bits and represents an integer. The `null` value is a special reference value that is machine-word sized and represents a location in memory. Assigning `null` to an `int` has no sensible semantics. `Nullable<T>` exists specifically for the purpose of allowing `null` assignments to value types to represent "no value" scenarios. But note that doing
int? x = null;
is purely syntactic sugar. The all-zero value of `Nullable<T>` is the "no value" scenario, since it means that `HasValue` is `false`. There is no magic `null` value being assigned anywhere; it's the same as saying `= default` -- it just creates a new all-zero struct of the given type `T` and assigns it.
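In other words, all of the following produce the exact same all-zero `Nullable<int>` struct:

```csharp
using System;

int? a = null;        // syntactic sugar...
int? b = default;     // ...for the all-zero default value...
int? c = new int?();  // ...which is exactly what the parameterless struct constructor yields.

Console.WriteLine(a.HasValue); // False
Console.WriteLine(b.HasValue); // False
Console.WriteLine(c.HasValue); // False
```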
So again, the answer is: no one deliberately tried to design this to work incompatibly with NRTs. Nullable value types are a much more fundamental feature of the language that has worked like this since its introduction in C# 2. And the way you propose it to work doesn't translate to a sensible implementation - would you want all value types to be nullable? Then all of them would have to carry the `HasValue` field, which takes an additional byte and possibly screws up padding (I think a language that represents `int`s as a 40-bit type instead of 32 would be considered heretical :) ).
The bang operator is used specifically to tell the compiler "I know that I'm dereferencing a nullable/assigning `null` to a non-nullable, but I'm smarter than you and I know for a fact this is not going to break anything". It disables static analysis warnings. But it does not magically expand the underlying type to accommodate a `null` value.
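A small sketch of what that means at runtime:

```csharp
#nullable enable
using System;

string s = null!;            // no warning: we told the static analysis to trust us
Console.WriteLine(s.Length); // ...but the value really is null, so this throws
                             // NullReferenceException at runtime; the ! changed nothing.
```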
## Summary
> Considering it doesn't protect at all against having `null` in a non-nullable reference type variable, it seems this new feature doesn't change anything and doesn't improve developers' lives at all (as opposed to non-nullable value types, which could NOT be `null` and therefore don't need to be null-checked).
>
> So in the end it seems the only value added is just in terms of signatures. Developers can now be explicit about whether or not a method's return value or a property could be `null` (for example in a C# representation of a database table where NULL is an allowed value in a column).
From the official docs on NRTs:
> This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types (...) These warnings are emitted at compile time. The compiler doesn't add any null checks or other runtime constructs in a nullable context. At runtime, a nullable reference and a non-nullable reference are equivalent.
So you're right in that "the only value added is just in terms of signatures" and static analysis - which is the reason we have signatures in the first place. And is that not an improvement on developers' lives? Note that your line
string nonNullableString = default(string);
gives a warning. If you did not ignore it (or, even better, had Treat Warnings As Errors on), you would get value from the feature: the compiler found a bug in your code for you.
Does it protect you from assigning `null` to a non-nullable reference type at runtime? No. Does it improve developers' lives? A thousand times yes. The power of the feature comes from the warnings and nullable analysis done at compile time. If you ignore the warnings issued by NRT, you do so at your own peril. The fact that you can ignore the compiler's helpful hand does not make it useless. After all, you could just as well put your entire codebase in an `unsafe` context and program in C; that doesn't mean C# is useless because you can circumvent its safety guarantees.
Comments:

- Your line is `string nonNullableString = default(string);` - that's why you don't see any benefit. You're explicitly storing a `null` into a non-nullable variable, so the compiler complains, telling you something's wrong. That's the benefit of this feature, especially if you treat warnings as errors (e.g. `<WarningsAsErrors>CS8600;CS8625</WarningsAsErrors>`). – Precincts
- I'm just super confused about why this is even possible. We can never be sure a non-nullable reference type won't be `null`, whereas for value types we are sure about it... – Electrophorus