
Stop Cheating the Type System

If you work in a statically typed language like C#, the compiler and its type system are your best friends. They will ensure that your program doesn't contain certain errors such as incorrect names or incompatible types. All of this happens at compilation time, so you don't have to take your chances and hope for the best at runtime.

Despite the compiler checking your identifiers and types, it's still possible to run into type errors due to invalid casts, incompatible access to covariant arrays, etc. Nevertheless, the type system is your first line of defense against obvious mistakes, which leads me to the key message of this post: Don't cheat the type system.

The Culprit: null

As you might have guessed, the problem is null. It's a loophole in the type system, if you will, because we have to keep in mind that every variable of a reference type can hold null as a value. Think about the following code:

public void DoSomething(string foo, int[] bar)
{
    // ...
}

What does the type string of the foo parameter tell us? Does it tell us that foo holds a reference to a string? Not exactly: It tells us that foo contains a reference to a string, or nothing. How about bar — does int[] tell us that we'll definitely receive an array of integers? It doesn't: It tells us bar holds a reference to an array of integers, or nothing. I could go on like this, but you should see the problem by now. For more reasons why null is a bad thing, read about the abject failure of weak typing.

Working Around null

To be on the safe side, we'd have to check every parameter of every (public) method for null, leading to highly defensive programming with null checks littered all over the code. Sometimes parameter null checks are unavoidable, and in those cases it makes sense to use a custom exception helper class for null checks to reduce the amount of boilerplate code.
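
Such a helper might look roughly like the following sketch; the Guard class and the surrounding Service class are made up for illustration, not taken from the post:

public static class Guard
{
    public static void AgainstNull(object argument, string parameterName)
    {
        if (argument == null)
        {
            throw new ArgumentNullException(parameterName);
        }
    }
}

public class Service
{
    public void DoSomething(string foo, int[] bar)
    {
        // One line per parameter instead of a full if/throw block each.
        Guard.AgainstNull(foo, "foo");
        Guard.AgainstNull(bar, "bar");

        // ...
    }
}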

We won't be able to work around null entirely; after all, the whole .NET Framework is built around it. We can, however, be disciplined about it in our own code by avoiding anti-patterns and using null knowingly and responsibly.

No null Collections

The first anti-pattern concerns collections (lists, arrays, sequences). When should a collection be null? Never. After all, what would a null collection even represent conceptually? The complete absence of a collection? That doesn't make sense. The default value for a list of things shouldn't be a non-existent list, but an empty list. A sequence containing no elements is an empty sequence, not null.

The problem with null collections — or anything that implements IEnumerable<T> — is that enumerating over them without a preceding null check results in an exception. It's perfectly reasonable for methods to return empty collections or arrays of size 0. It's not reasonable at all to return null.
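
To make the contrast concrete (the GetScores methods below are made up for illustration):

// Anti-pattern: the caller is forced to null-check the result.
public int[] GetScoresBad()
{
    return null;
}

// Better: an array of size 0 can be enumerated without any checks.
public int[] GetScoresGood()
{
    return new int[0];
}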

If your class stores a list in a field, instantiate the list in the constructor and make the field readonly. Now your collection isn't null anymore, and nobody (neither you nor callers of your code) will be able to change that once the constructor has run:

public class Foo
{
    private readonly List<Bar> _bars;

    public Foo()
    {
        _bars = new List<Bar>();
    }
}

No null Delegates

Similar to collections, delegates shouldn't be null, either. What does it tell us if the type of a parameter is Action<string>? It tells us that the parameter holds a reference to a delegate that can be called with a single string parameter, or nothing.

Try to avoid situations like these in your code base. This is, again, about providing reasonable default values. If there's no Action<string> to be performed, don't pass null to a method expecting an Action<string> parameter; pass an empty action instead:

Action<string> emptyAction = _ => { };

If I don't use the lambda expression's parameter within its body, I like to use _ as the parameter name, but that's just personal preference.
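
For example, a method that reports progress through a callback never has to null-check the delegate if callers always hand it something callable. The Importer class, Record type, and the variables in the calling snippet are hypothetical:

public class Importer
{
    public void Import(IEnumerable<Record> records, Action<string> reportProgress)
    {
        foreach (var record in records)
        {
            // reportProgress is always safe to invoke; by convention it is never null.
            reportProgress("Importing record " + record.Id);
            // ...
        }
    }
}

// A caller that isn't interested in progress messages passes an empty action:
importer.Import(records, _ => { });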

No Partly Initialized Objects

If your class needs certain properties to be set, enforce those required properties through the constructor. The compiler won't stop anyone from leaving a required property unassigned when the object is created via an object initializer.

Don't do this:

var errorProne = new FooObject
{
    RequiredProperty = "some value"
};

Do this instead:

var better = new FooObject("some value");
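The post doesn't show FooObject itself, but a constructor-enforced version might look like this sketch:

public class FooObject
{
    public string RequiredProperty { get; private set; }

    public FooObject(string requiredProperty)
    {
        // The object can't be constructed without its required value.
        if (requiredProperty == null)
        {
            throw new ArgumentNullException("requiredProperty");
        }

        RequiredProperty = requiredProperty;
    }
}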

Required properties that aren't enforced through the constructor are also an extensibility problem: adding another required property later won't break existing code, which leads to half-initialized objects in some places.

Responsibly Dealing with null

Avoid null values wherever possible. You can, for example, employ the Null Object Pattern. Depending on the problem at hand, it may also be possible to fall back to a reasonable default value using ??, the null-coalescing operator.
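
For example (customerName here stands in for any string that might be null):

// If customerName happens to be null, fall back to a sensible default.
string displayName = customerName ?? "Unknown customer";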

If you absolutely have to use null, make it explicit. Methods that can return null should be named accordingly, just like the LINQ methods FirstOrDefault() and SingleOrDefault(). Another option is the Try… method pattern, which returns a boolean indicating success and exposes the actual value through an out parameter.
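
The framework's own int.TryParse is the canonical example of this pattern (input stands for whatever string you received):

int value;
if (int.TryParse(input, out value))
{
    // value holds the parsed number.
}
else
{
    // Parsing failed; the method communicates this through its
    // return value, with no null or magic number involved.
}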

This list is by no means exhaustive; there are many other places where you wouldn't want to use null. In general, use as few null values as possible and don't trick the type system. You'll reduce null reference exceptions, shrink your bug surface, and live a happier life as a developer.

Marius Schulz

Computer science student, developer, and blogger. I love C# and JavaScript, enjoy regular expressions, and probably drink too much coffee.


11 Comments

Carsten König

Ben is right ... just use option/maybe and make it explicit (you can do this in C# too - FSharpX is not only for F#ers ;) )

Tom

Interesting article. Any thoughts on string with a null value vs. empty string?

Keith Petersen

How do you handle database values that are nullable? Do you give every column a default value? Sometimes you would still have to determine whether a column has a certain default value, right? In those cases, it seems like the default value might as well be null if you have to check for it anyway.

Mike C

All good advice, but sometimes a null can't be avoided. One example is optional parameters. If you're passing in a collection or delegate as an optional parameter, you can't specify an empty collection or do-nothing delegate, because a constant expression is required for the default value of an optional parameter.

For example, this will not compile (forgive the VB.NET, although the C# equivalent will also have the same issue):

Private Sub ActionWithDefaultOptionalDelegate(Optional doThis As Action = Sub() Exit Sub)
    doThis()
End Sub

...but this will:

Private Sub ActionWithNullableOptionalDelegate(Optional doThis As Action = Nothing)
    doThis = If(doThis, Sub() Exit Sub)
    doThis()
End Sub

Likewise, this will not compile:

Private Sub ActionWithDefaultOptionalCollection(Optional numList As IEnumerable(Of Int32) = {})
    Console.WriteLine(numList.Sum())
End Sub

...but this will:

Private Sub ActionWithNullableOptionalCollection(Optional numList As IEnumerable(Of Int32) = Nothing)
    numList = If(numList, {})
    Console.WriteLine(numList.Sum())
End Sub

Marius Schulz

Keith — Database null values that denote missing values are a different story. I'm not trying to get rid of null in C# or your favorite RDBMS, that's impossible.

What I want is to get back an empty collection (rather than null) for an OR-mapped 1:n (or m:n) relationship.

Marius Schulz

Mike — you're right, neither collection nor delegate types can be specified as compile-time constants. Two method overloads can easily call each other, though, and that's where you could fill in a non-null default value.
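
In C#, that overload trick might look roughly like this (a sketch; the Process method and its parameters are made up):

// The single-parameter overload fills in an empty action...
public void Process(IEnumerable<int> numbers)
{
    Process(numbers, _ => { });
}

// ...so the actual implementation never has to deal with a null delegate.
public void Process(IEnumerable<int> numbers, Action<string> log)
{
    foreach (var number in numbers)
    {
        log("Processing " + number);
        // ...
    }
}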

Chris Chiesa

In my 25 years' experience as a software engineer, I believe one thing is clear: no matter what decisions are made by language designers as to the inclusion, exclusion, or behavior of a feature -- in this case a type system, in particular whether or not to include 'null' --, eventually a programmer has a need that the designers' decisions make impossible to fulfill. When C was the big thing, it was downright necessary to trick the compiler "every time you turned around," or so I remember it. Certain languages in use today -- I read an article just today but my memory is terrible and I don't remember what language they were citing -- turn into unmanageable nightmares if one tries to declare variables as 'const'. And so on, and so forth. So I say quit worrying about it. Some users will be satisfied, and others dissatisfied, no matter what you do. So, you may as well include, or omit, 'null' or other features as the whim strikes you, and forget about it and move on.

Daniel Nixon

You might also consider adopting the Option<T> type from languages like Haskell, Scala and F# as a way to represent legitimately missing/optional values. See e.g. http://www.danielnixon.org/extending-rantpacks-option-type-or-null-reference-considered-harmful/

danie

Indeed Null is Evil! :)

