Non-nullable reference types

I'm designing a language, and I'm wondering if it's reasonable to make reference types non-nullable by default, and use "?" for nullable value and reference types. Are there any problems with this? What would you do about this:

class Foo {
    Bar? b;
    Bar b2;
    Foo() {
        b.DoSomething(); //valid, but will cause exception
        b2.DoSomething(); //?
    }
}
Coppins answered 28/3, 2009 at 19:9 Comment(1)
possible duplicate of Best explanation for languages without null - Billow

My current language design philosophy is that nullability should be something a programmer is forced to ask for, not given by default on reference types (in this, I agree with Tony Hoare - Google for his recent QCon talk).

On this specific example, with the non-nullable b2, it wouldn't even pass static checks: conservative analysis cannot guarantee that b2 has been assigned (and is therefore not null) before the call, so the program is not semantically meaningful.

My ethos is simple enough. References are an indirection handle to some resource, which we can traverse to obtain access to that resource. Nullable references are either an indirection handle to a resource, or a notification that the resource is not available, and one is never sure up front which semantics are being used. This gives either a multitude of checks up front (Is it null? No? Yay!), or the inevitable NPE (or equivalent). Most programming resources are, these days, not massively resource constrained or bound to some finite underlying model - null references are, simplistically, one of...

  • Laziness: "I'll just bung a null in here", which frankly I don't have too much sympathy with
  • Confusion: "I don't know what to put in here yet". Typically also a legacy of older languages, where you had to declare your resource names before you knew what your resources were.
  • Errors: "It went wrong, here's a NULL". Better error reporting mechanisms are thus essential in a language
  • A hole: "I know I'll have something soon, give me a placeholder". This has more merit, and we can think of ways to combat this.

Of course, solving each of the cases that NULL currently caters for with a better linguistic choice is no small feat, and may add more confusion than it helps. We can always go to immutable resources, so NULL in its only useful states (error and hole) isn't of much real use. Imperative techniques are here to stay, though, and I'm frankly glad - this makes the search for better solutions in this space worthwhile.
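
To make the error and hole cases concrete: an explicit option type forces the caller to acknowledge absence instead of discovering it through an NPE. A rough sketch in C# (the type and names here are purely illustrative, not part of any existing proposal):

// Illustrative only: a minimal option type - absence is part of the type, never a null
public readonly struct Option<T>
{
    private readonly T _value;
    public bool HasValue { get; }

    private Option(T value) { _value = value; HasValue = true; }

    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> None => default;

    // The value can only be obtained by supplying a fallback for the empty case
    public T OrElse(T fallback) => HasValue ? _value : fallback;
}

public static class OptionDemo
{
    // The "hole" and "error" cases are explicit at the call site
    public static string Describe(Option<string> message) =>
        message.OrElse("No message available");
}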

Rathbone answered 28/3, 2009 at 19:32 Comment(0)

Having reference types be non-nullable by default is the only reasonable choice. We are plagued by languages and runtimes that have screwed this up; you should do the Right Thing.

Crews answered 28/3, 2009 at 19:36 Comment(1)
Please look into explaining your dogmatic position. - Lovell

This feature was in Spec#. They defaulted to nullable references and used ! to indicate non-nullables. This was because they wanted backward compatibility.

In my dream language (of which I'd probably be the only user!) I'd make the same choice as you, non-nullable by default.

I would also make it illegal to use the . operator on a nullable reference (or anything else that would dereference it). How would you use them? You'd have to convert them to non-nullables first. How would you do this? By testing them for null.

In Java and C#, the if statement can only accept a bool test expression. I'd extend it to accept the name of a nullable reference variable:

if (myObj)
{
    // in this scope, myObj is non-nullable, so can be used
}

This special syntax would be unsurprising to C/C++ programmers. I'd prefer a special syntax like this to make it clear that we are doing a check that modifies the type of the name myObj within the truth-branch.
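
As it turned out, C# 8's nullable reference types approximate this narrowing through flow analysis: after an explicit null test, the compiler treats the variable as non-null inside the guarded branch. A minimal sketch, where MyObj and DoSomething are made-up names:

#nullable enable

class MyObj
{
    public void DoSomething() { }
}

class NarrowingDemo
{
    static void Use(MyObj? myObj)
    {
        if (myObj != null)
        {
            myObj.DoSomething(); // no warning: the compiler knows myObj is non-null here
        }
        // outside the branch, dereferencing myObj would produce a warning again
    }
}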

I'd add a further bit of sugar:

if (SomeMethodReturningANullable() into anotherObj)
{
    // anotherObj is non-nullable, so can be used
}

This just gives the name anotherObj to the result of the expression on the left of the into, so it can be used in the scope where it is valid.

I'd do the same kind of thing for the ?: operator.

string message = GetMessage() into m ? m : "No message available"; 

Note that string message is non-nullable, and so are the two possible results of the test above, so the assignment is valid.

And then maybe a bit of sugar for the presumably common case of substituting a value for null:

string message = GetMessage() or "No message available";

Obviously or would only be validly applied to a nullable type on the left side, and a non-nullable on the right side.
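
For comparison, C# later grew rough equivalents of both pieces of sugar once nullable reference types arrived: a pattern match plays the role of into, and the null-coalescing operator plays the role of or. A sketch, assuming some GetMessage() method that may return null:

#nullable enable

class MessageDemo
{
    static string? GetMessage() => null; // stand-in for any method that may return null

    static void Show()
    {
        // "into" analogue: the pattern introduces m as a non-nullable name in the true branch
        string message1 = GetMessage() is string m ? m : "No message available";

        // "or" analogue: ?? substitutes a default when the left-hand side is null
        string message2 = GetMessage() ?? "No message available";
    }
}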

(I'd also have a built-in notion of ownership for instance fields; the compiler would generate the IDisposable.Dispose method automatically, and the ~Destructor syntax would be used to augment Dispose, exactly as in C++/CLI.)

Spec# had another syntactic extension related to non-nullables, due to the problem of ensuring that non-nullables had been initialized correctly during construction:

class SpecSharpExampleClass
{
    private string! _nonNullableExampleField;

    public SpecSharpExampleClass(string s)
        : _nonNullableExampleField(s) 
    {

    }
}

In other words, you have to initialize fields in the same way as you'd call other constructors with base or this - unless of course you initialize them directly next to the field declaration.
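
For comparison, C# 8's nullable reference types attack the same initialization problem with a compiler warning rather than special constructor syntax. A minimal sketch:

#nullable enable

class CSharpExampleClass
{
    private string _nonNullableExampleField; // warning CS8618 if no constructor assigns it

    public CSharpExampleClass(string s)
    {
        _nonNullableExampleField = s; // assigning in every constructor avoids the warning
    }
}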

Uncertainty answered 28/3, 2009 at 21:10 Comment(3)
in your first sentence, you said "they defaulted to nullable references" - did you mean non-nullable? - Anatase
@Joan Venge - No. Later in that same sentence, I explain why: Spec# attempted to compile existing C# code correctly, so a reference must be assumed to be nullable unless the code indicates otherwise. - Uncertainty
Thanks Daniel. I thought Spec# defaulted to the opposite because they thought that was better, but I understand what you mean now. - Anatase

Have a look at the Elvis operator proposal for Java 7. This does something similar, in that it encapsulates a null check and method dispatch in one operator, with a specified return value if the object is null. Hence:

String s = mayBeNull?.toString() ?: "null";

checks whether mayBeNull is null; if it is, s gets the string "null", otherwise it gets the result of toString(). Food for thought, perhaps.
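
The same idea later landed in C# as two separate operators: the null-conditional call ?. and the null-coalescing default ??. Roughly (assuming mayBeNull is some object reference):

string s = mayBeNull?.ToString() ?? "null"; // the call is skipped and "null" substituted when mayBeNull is null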

Midship answered 28/3, 2009 at 19:33 Comment(7)
Why is this called the Elvis operator? - Coppins
Good question, and one I didn't know the answer to. However see infoq.com/articles/groovy-1.5-new re. the operator ?:, which is related. A little tenuous, nonetheless... - Midship
I can only assume that it's some joke about what to do if the object has "left the building"? - Ova
In .NET code, I find that constantly checking for a null object just so I can call a method is a real PITA, and an operator like this would certainly help. It's very tempting to define extension methods that do something intelligent for null values, even though that's not how you are supposed to use them. :-( - Ova
@Zifre, @Christian Hayter, @Crews Agnew: I know this is way late... but look at it sideways, and think about Elvis' hair! I'll help by adding a mouth: ?:) - Wellwisher
Perhaps because Elvis sightings often raise the question of his continued existence. - Trumaine
CoffeeScript has this ? too, but it's a workaround for a language failure. If you need a nullable in Haskell you must create an "either", a type with two implementations: data Match = Fail | Succeed String - Trumaine

A couple of examples of similar features in other languages:

There's also Nullable<T> (from C#) but that is not such a good example because of the different treatment of reference vs. value types.

In your example you could add a conditional message send operator, e.g.

b?->DoSomething();

To send a message to b only if it is non-null.

Robbinrobbins answered 28/3, 2009 at 19:19 Comment(0)

Make nullability a configuration setting, enforceable in the author's source code. That way, people who like nullable objects by default can enjoy them in their source code, while those who would like all their objects to be non-nullable by default get exactly that. Additionally, provide keywords or some other facility to explicitly mark which declarations of objects and types can be nullable and which cannot, with something like nullable and not-nullable, to override the global defaults.

For instance

/// "translation unit 1"

#set nullable
{ /// Scope of default override, making all declarations within the scope nullable implicitly
     Bar bar; /// Can be null
     non-null Foo foo; /// Overridden, cannot be null
     nullable FooBar foobar; /// Overridden, can be null, even without the scope definition above
}

/// Same style for opposite

/// ...

/// Top-bottom, until reset by scoped-setting or simply reset to another value
#set nullable;

/// Nullable types implicitly

#clear nullable;

/// Can also use '#set nullable = false' or '#set not-nullable = true'. Ugly, but the human mind is a very original, mhm, thing.
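
For what it's worth, C# 8 later shipped a very similar mechanism: a project-wide default (<Nullable>enable</Nullable> in the project file) plus per-file or per-region directives. A minimal sketch of the real thing:

#nullable enable   // declarations below are non-nullable by default
class NullableDirectiveDemo
{
    string alwaysSet = "x";      // non-nullable: assigning null here would warn
    string? maybeNull = null;    // nullability has to be requested explicitly
}
#nullable disable  // back to the legacy, nullability-oblivious default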

Many people argue that giving everyone what they want is impossible, but if you are designing a new language, try new things. Tony Hoare introduced the concept of null in 1965 because he could not resist (his own words), and we have been paying for it ever since (also his own words - the man regrets it). The point is that smart, experienced people make mistakes that cost the rest of us, so don't take anyone's advice on this page as if it were the only truth, including mine. Evaluate it and think about it.

I've read many, many rants about how it's us poor, inexperienced programmers who don't really understand where to use null and where not, showing us patterns and anti-patterns meant to keep us from shooting ourselves in the foot. All the while, millions of still-inexperienced programmers produce more code in languages that allow null. I may be inexperienced, but I know which of my objects don't benefit from being nullable.

Bronco answered 27/9, 2012 at 11:13 Comment(1)
Why the downvote? Have you seen what large programs look like with nearly every single reference prefixed with something like @NonNull? Sometimes I am shocked by the state of computer science, but then, what else is new. We deserve the programs we have :-) - Bronco

Here we are, 13 years later, and C# did it.
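
Concretely, with nullable reference types enabled, the question's original example now behaves roughly like this (Bar is a stand-in class added so the snippet compiles):

#nullable enable

class Bar { public void DoSomething() { } }

class Foo
{
    Bar? b;        // may be null; the compiler tracks its null state
    Bar b2;        // non-nullable; must be assigned before the constructor exits

    Foo(Bar bar)
    {
        b2 = bar;          // without this line, the compiler warns that b2 may be left null
        b.DoSomething();   // warning: dereference of a possibly null reference
        b2.DoSomething();  // fine: b2 is known to be non-null here
    }
}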

And, yes, this is the biggest improvement in languages since Barbara and Stephen introduced abstract data types in 1974:

Programming With Abstract Data Types

Barbara Liskov
Massachusetts Institute of Technology
Project MAC
Cambridge, Massachusetts

Stephen Zilles
Cambridge Systems Group
IBM Systems Development Division
Cambridge, Massachusetts

Abstract

The motivation behind the work in very-high-level languages is to ease the programming task by providing the programmer with a language containing primitives or abstractions suitable to his problem area. The programmer is then able to spend his effort in the right place; he concentrates on solving his problem, and the resulting program will be more reliable as a result. Clearly, this is a worthwhile goal. Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which the abstractions embedded in the language are not sufficient. This paper presents an approach which allows the set of built-in abstractions to be augmented when the need for a new data abstraction is discovered. This approach to the handling of abstraction is an outgrowth of work on designing a language for structured programming. Relevant aspects of this language are described, and examples of the use and definitions of abstractions are given.

Rici answered 2/7, 2022 at 1:56 Comment(0)

I think null values are good: They are a clear indication that you did something wrong. If you fail to initialize a reference somewhere, you'll get an immediate notice.

The alternative would be that values are sometimes initialized to a default value. Logical errors are then a lot more difficult to detect, unless you put detection logic in those default values. This would be the same as just getting a null pointer exception.

Norsworthy answered 28/3, 2009 at 20:0 Comment(4)
It's sometimes necessary to initialize something without knowing the value it will eventually take. This is relatively rare, though. Usually, initialization takes place on the same line as the declaration. Also, the proposal doesn't eliminate null values, it just makes references non-nullable by default. - Anachronistic
Additionally, it's not true that you'll get an immediate notice. You'll get an exception as soon as you try to dereference the null value, but that exception may be caught, and you'll only find out at runtime, not compile time. - Anachronistic
Problem is, you can't force programmers to initialize a variable correctly: how many times have you seen DataSet ds = new DataSet(); ... ds = ReadDataSet( ... ); in code? If ds were non-nullable, no error would be detected if the call to ReadDataSet() were forgotten. - Norsworthy
Point is that bad/inexperienced programmers will make these mistakes. Or just make every variable nullable. Better programmers will make these mistakes less often. And if someone is stupid enough to catch Exception or NPE, well, that's his own fault. - Norsworthy
