
Stop Cheating the Type System

If you work in a statically typed language like C#, the compiler and its type system are your best friends. They will ensure that your program doesn't contain certain errors such as incorrect names or incompatible types. All of this happens at compilation time, so you don't have to take your chances and hope for the best at runtime.

Despite the compiler checking your identifiers and types, it's still possible to run into type errors due to invalid casts, incompatible access to covariant arrays, etc. Nevertheless, the type system is your first line of defense against obvious mistakes, which leads me to the key message of this post: Don't cheat the type system.

The Culprit: null

As you might have guessed, the problem is null. It's a loophole in the type system, if you will, because we have to keep in mind that every variable of a reference type can hold null as a value. Think about the following code:

public void DoSomething(string foo, int[] bar)
{
    // ...
}

What does the type string of the foo parameter tell us? Does it tell us that foo holds a reference to a string? Not exactly: It tells us that foo contains a reference to a string, or nothing. How about bar — does int[] tell us that we'll definitely receive an array of integers? It doesn't: It tells us bar holds a reference to an array of integers, or nothing. I could go on like this, but you should see the problem by now. For more reasons why null is a bad thing, read about the abject failure of weak typing.

Working Around null

To be on the safe side, we'd have to check every parameter of every (public) method for null, leading to highly defensive programming with null checks littered all over the code. Sometimes parameter null checks are unavoidable, and in those cases it makes sense to use a custom exception helper class for null checks to reduce the amount of boilerplate code.
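
Such a helper might look roughly like the following sketch; the Guard class and the surrounding Service class are made up for illustration, not taken from the post:

public static class Guard
{
    public static void AgainstNull(object argument, string parameterName)
    {
        if (argument == null)
        {
            throw new ArgumentNullException(parameterName);
        }
    }
}

public class Service
{
    public void DoSomething(string foo, int[] bar)
    {
        // One line per parameter instead of a full if/throw block each.
        Guard.AgainstNull(foo, "foo");
        Guard.AgainstNull(bar, "bar");

        // ...
    }
}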

We won't be able to work around null entirely; after all, the whole .NET Framework is built around it. We can, however, be disciplined about it in our own code by avoiding anti-patterns and using null knowingly and responsibly.

No null Collections

The first anti-pattern concerns collections (lists, arrays, sequences). When should a collection be null? Never. After all, what would a null collection even represent conceptually? The complete absence of a collection? That doesn't make sense. The default value for a list of things shouldn't be a non-existent list, but an empty list. A sequence containing no elements is an empty sequence, not null.

The problem with null collections — or anything that implements IEnumerable<T> — is that enumerating over them without a preceding null check results in an exception. It's perfectly reasonable for methods to return empty collections or arrays of size 0. It's not reasonable at all to return null.
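
To make the contrast concrete (the GetScores methods below are made up for illustration):

// Anti-pattern: the caller is forced to null-check the result.
public int[] GetScoresBad()
{
    return null;
}

// Better: an array of size 0 can be enumerated without any checks.
public int[] GetScoresGood()
{
    return new int[0];
}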

If your class stores a list in a field, instantiate the list in the constructor and make the field readonly. Now your collection isn't null anymore, and nobody (neither you nor callers of your code) will be able to change that once the constructor has run:

public class Foo
{
    private readonly List<Bar> _bars;

    public Foo()
    {
        _bars = new List<Bar>();
    }
}

No null Delegates

Similar to collections, delegates shouldn't be null, either. What does it tell us if the type of a parameter is Action<string>? It tells us that the parameter holds a reference to a delegate that can be called with a single string parameter, or nothing.

Try to avoid situations like these in your code base. This is, again, about providing reasonable default values. If there's no Action<string> to be performed, don't pass null to a method expecting an Action<string> parameter; pass an empty action instead:

Action<string> emptyAction = _ => { };

If I don't use the lambda expression's parameter within its body, I like to use _ as the parameter name, but that's just personal preference.
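
For example, a method that reports progress through a callback never has to null-check the delegate if callers always hand it something callable. The Importer class, Record type, and the variables in the calling snippet are hypothetical:

public class Importer
{
    public void Import(IEnumerable<Record> records, Action<string> reportProgress)
    {
        foreach (var record in records)
        {
            // reportProgress is always safe to invoke; by convention it is never null.
            reportProgress("Importing record " + record.Id);
            // ...
        }
    }
}

// A caller that isn't interested in progress messages passes an empty action:
importer.Import(records, _ => { });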

No Partly Initialized Objects

If your class needs certain properties to be set, enforce those required properties through the constructor. The compiler won't stop anyone from leaving a required property unassigned when the object is created via an object initializer.

Don't do this:

var errorProne = new FooObject
{
    RequiredProperty = "some value"
};

Do this instead:

var better = new FooObject("some value");
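The post doesn't show FooObject itself, but a constructor-enforced version might look like this sketch:

public class FooObject
{
    public string RequiredProperty { get; private set; }

    public FooObject(string requiredProperty)
    {
        // The object can't be constructed without its required value.
        if (requiredProperty == null)
        {
            throw new ArgumentNullException("requiredProperty");
        }

        RequiredProperty = requiredProperty;
    }
}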

Required properties that aren't enforced through the constructor are also an extensibility problem: adding another required property later won't break existing code, which leads to half-initialized objects in some places.

Responsibly Dealing with null

Avoid null values wherever possible. You can, for example, employ the Null Object Pattern. Depending on the problem at hand, it may also be possible to fall back to a reasonable default value using ??, the null-coalescing operator.
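
For example (customerName here stands in for any string that might be null):

// If customerName happens to be null, fall back to a sensible default.
string displayName = customerName ?? "Unknown customer";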

If you absolutely have to use null, make it explicit. Methods that can return null should be named accordingly, just like the LINQ methods FirstOrDefault() and SingleOrDefault(). Another option is the Try… method pattern, which returns a boolean indicating success and exposes the actual value through an out parameter.
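
The framework's own int.TryParse is the canonical example of this pattern (input stands for whatever string you received):

int value;
if (int.TryParse(input, out value))
{
    // value holds the parsed number.
}
else
{
    // Parsing failed; the method communicates this through its
    // return value, with no null or magic number involved.
}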

This list is by no means exhaustive; there are many other places where you wouldn't want to use null. In general, use as few null values as possible and don't trick the type system. You'll reduce null reference exceptions, shrink your bug surface, and live a happier life as a developer.

Marius Schulz

Computer science student, developer, and blogger. I love C# and JavaScript, enjoy regular expressions, and probably drink too much coffee.


11 Comments

Carsten König

Ben is right ... just use option/maybe and make it explicit (you can do this in C# too - FSharpX is not only for F#ers ;) )

Tom

Interesting article. Any thoughts on string with a null value vs. empty string?

Keith Petersen

How do you handle database values that are nullable? Do you give every column a default value? Sometimes you would still have to determine whether a column has a certain default value, right? In those cases, it seems like the default value might as well be null if you have to check for it anyway.

Mike C

All good advice, but sometimes a null can't be avoided. One example is optional parameters. If you're passing in a collection or delegate as an optional parameter, you can't specify an empty collection or do-nothing delegate, because a constant expression is required for the default value of an optional parameter.

For example, this will not compile (forgive the VB.NET, although the C# equivalent will also have the same issue):

Private Sub ActionWithDefaultOptionalDelegate(Optional doThis As Action = Sub() Exit Sub)
    doThis()
End Sub

...but this will:

Private Sub ActionWithNullableOptionalDelegate(Optional doThis As Action = Nothing)
    doThis = If(doThis, Sub() Exit Sub)
    doThis()
End Sub

Likewise, this will not compile:

Private Sub ActionWithDefaultOptionalCollection(Optional numList As IEnumerable(Of Int32) = {})
    Console.WriteLine(numList.Sum())
End Sub

...but this will:

Private Sub ActionWithNullableOptionalCollection(Optional numList As IEnumerable(Of Int32) = Nothing)
    numList = If(numList, {})
    Console.WriteLine(numList.Sum())
End Sub

Marius Schulz

Keith — Database null values that denote missing values are a different story. I'm not trying to get rid of null in C# or your favorite RDBMS, that's impossible.

What I want is to get back an empty collection (rather than null) for an OR-mapped 1:n (or m:n) relationship.

Marius Schulz

Mike — you're right, neither collection nor delegate types can be specified as compile-time constants. Two method overloads can easily call each other, though, and that's where you could fill in a non-null default value.
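
In C#, that overload trick might look roughly like this (a sketch; the Process method and its parameters are made up):

// The single-parameter overload fills in an empty action...
public void Process(IEnumerable<int> numbers)
{
    Process(numbers, _ => { });
}

// ...so the actual implementation never has to deal with a null delegate.
public void Process(IEnumerable<int> numbers, Action<string> log)
{
    foreach (var number in numbers)
    {
        log("Processing " + number);
        // ...
    }
}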

Chris Chiesa

In my 25 years' experience as a software engineer, I believe one thing is clear: no matter what decisions are made by language designers as to the inclusion, exclusion, or behavior of a feature -- in this case a type system, in particular whether or not to include 'null' --, eventually a programmer has a need that the designers' decisions make impossible to fulfill. When C was the big thing, it was downright necessary to trick the compiler "every time you turned around," or so I remember it. Certain languages in use today -- I read an article just today but my memory is terrible and I don't remember what language they were citing -- turn into unmanageable nightmares if one tries to declare variables as 'const'. And so on, and so forth. So I say quit worrying about it. Some users will be satisfied, and others dissatisfied, no matter what you do. So, you may as well include, or omit, 'null' or other features as the whim strikes you, and forget about it and move on.

Daniel Nixon

You might also consider adopting the Option<T> type from languages like Haskell, Scala and F# as a way to represent legitimately missing/optional values. See e.g. http://www.danielnixon.org/extending-rantpacks-option-type-or-null-reference-considered-harmful/

danie

Indeed Null is Evil! :)

