Friday, October 22, 2010

Scala's Option will save you from the most common cause of NullPointerException

A while back, Cedric Beust claimed "Scala’s “Option” and Haskell’s “Maybe” types won’t save you from null". He's rightly convinced that Option is not a silver bullet against unexpectedly referencing null values at runtime. However, I believe there are "Good" NullPointerExceptions (NPEs) and "Bad" ones. Option goes very, very far in saving you from "Bad" NPEs by enlisting the compiler's help in tracking what is intended to be optional.

I believe there are two very different cases of NullPointerException:
"Bad" - The code incorrectly assumes the value is guaranteed to be set, but the value is null.
In this case "getPreferredTheme" is optional, but the programmer accesses it as if it will never be null in the current context. This is programmer error: the developer either did not know or did not care that the value can sometimes be null. This is the class of NPE I feel is the most common, and the one I believe the Option "pattern" can greatly reduce, if not eliminate.
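A minimal sketch of the "Bad" case. Only getPreferredTheme is named in this post; the Person and Theme classes, getName, and themeName are hypothetical names made up for illustration.

```java
// Hypothetical domain classes; only getPreferredTheme comes from the post.
class Theme {
    private final String name;
    Theme(String name) { this.name = name; }
    String getName() { return name; }
}

class Person {
    private final Theme preferredTheme; // optional: may be null

    Person(Theme preferredTheme) { this.preferredTheme = preferredTheme; }

    // Nothing in this signature says the result may be null.
    Theme getPreferredTheme() { return preferredTheme; }
}

class BadNpeDemo {
    // Treats an optional value as if it were always set.
    static String themeName(Person person) {
        return person.getPreferredTheme().getName(); // NPE when no theme was chosen
    }

    public static void main(String[] args) {
        System.out.println(themeName(new Person(new Theme("dark"))));
        try {
            themeName(new Person(null));
        } catch (NullPointerException e) {
            System.out.println("Bad NPE: the preferred theme was never set");
        }
    }
}
```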

"Good" - The code correctly assumes the value is guaranteed to be set, but the value is null.
In this case, getCompany is designed to be set on a manager object. In my opinion, the developer should NOT try to accommodate getCompany returning null if it is specified that it never will. In the spirit of failing fast, the user should boldly use the object as it's specified.
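A minimal sketch of the "Good" case, assuming hypothetical Manager and Company classes (only getCompany is named in this post). The caller deliberately skips the null check, so a broken contract fails right at the point of use.

```java
// Hypothetical classes; only getCompany comes from the post.
class Company {
    private final String name;
    Company(String name) { this.name = name; }
    String getName() { return name; }
}

class Manager {
    private final Company company; // specified as required for a valid manager

    Manager(Company company) { this.company = company; }

    Company getCompany() { return company; }
}

class GoodNpeDemo {
    // No defensive null check: if the specification is violated, fail fast here.
    static String employerName(Manager manager) {
        return manager.getCompany().getName();
    }

    public static void main(String[] args) {
        System.out.println(employerName(new Manager(new Company("Acme"))));
    }
}
```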

In both Good and Bad scenarios, how was it specified that getPreferredTheme was optional and getCompany was required? In Java, I've seen a few approaches:

1. Sheer mental horsepower and subjective judgement. "Preferred Theme sounds pretty wishy washy to me, I better check for null". If you're not sure, check the javadoc, source, database schema, and any number of artifacts to inform your guess.  Good luck!

2. "Sentinel values" - Sentinel values are when you return a non-null value of the correct type, but with a wacky value to signify, well, null! For example, a binary search that returns -1 if the element isn't found. This is not the same as returning a default value, which is awesome when appropriate. Not realizing you're working with a sentinel value is identical to not realizing you are working with null. When you accidentally use it as a real value, it's gonna be ugly.
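A small sketch of a sentinel going wrong. It uses List.indexOf, which returns -1 for a missing element, rather than a binary search; the theme names and the positionOf helper are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class SentinelDemo {
    // List.indexOf returns the sentinel -1 when the element is absent.
    static int positionOf(List<String> themes, String name) {
        return themes.indexOf(name);
    }

    public static void main(String[] args) {
        List<String> themes = Arrays.asList("light", "dark", "solarized");
        int index = positionOf(themes, "neon");

        // Forgetting the sentinel is the same mistake as forgetting a null check.
        try {
            System.out.println(themes.get(index));
        } catch (IndexOutOfBoundsException e) {
            System.out.println("used the sentinel " + index + " as a real index");
        }
    }
}
```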

3. Annotations - JSR 305 defines some annotations which allow you to mark up your fields, variables and parameters as @Nonnull, @Nullable, @CheckForNull, @ParametersAreNonnullByDefault, and @ParametersAreNullableByDefault. A tool like FindBugs can process these annotations and give you warnings when you break the specification. I tried this approach for several months, and definitely had some success with it. However, it has a fair share of false negatives and false positives, and can be noisy. I gave up on it when I encountered code like this (paraphrased):
@Nonnull Person manager = ...
manager.getCompany(); //findbugs warns here - it doesn't know the semantics of checkNonNull

4. Naming conventions - I've tried preceding method names that can return null with a "try", i.e. tryGetPreferredTheme, and naming fields and vars using "optional", i.e.
private Theme optionalTheme.
This visually sets off the null access, and worked better than I'd thought. However, it seems a little noisy, and is very easy to overlook.

5. The "Null Object" Pattern - implement your interface using a "null" implementation.  Exactly like returning null, but more confusing.

Scala/Haskell and many other languages offer a 6th option: the Option type. In Java, our "getPreferredTheme" would change to return Option<Theme> instead of Theme if it used Option.

Option is an interface with two implementations: "Some" to represent a real value and "None" to represent a null value.  Here's the code, adapted from this great article:
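A minimal sketch of such an interface, not the code from the article mentioned above. It assumes only the two methods this post talks about, isDefined() and get(), and has get() on None throw a NullPointerException; the describe helper is hypothetical.

```java
// Sketch of Option as an interface with two implementations.
interface Option<T> {
    boolean isDefined();
    T get();
}

// Some wraps a real value.
final class Some<T> implements Option<T> {
    private final T value;

    Some(T value) { this.value = value; }

    public boolean isDefined() { return true; }
    public T get() { return value; }
}

// None stands in for "no value".
final class None<T> implements Option<T> {
    public boolean isDefined() { return false; }

    public T get() {
        throw new NullPointerException("get() called on None");
    }
}

class OptionDemo {
    // Hypothetical helper showing both implementations behind one type.
    static String describe(Option<String> theme) {
        return theme.isDefined() ? "theme: " + theme.get() : "no theme";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Some<String>("dark")));
        System.out.println(describe(new None<String>()));
    }
}
```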

With the change above, code that uses the result of getPreferredTheme() directly as a Theme would no longer compile.


Instead, the two choices are:
1. You just know it won't be null in this one special case:


get() will get you the actual Theme value if set, or throw a NullPointerException if not. What, an NPE? I thought we were avoiding those? We can't. I'm striving only to get rid of "Bad" NPEs, the ones where the programmer accesses the null reference out of ignorance or neglect. We only used get() here because we have assured ourselves it should be an error condition if it wasn't set - a "Good" NPE. Think of it as a shortcut for "check for null and error out if it is null".

2. You're glad you were reminded it was potentially null, and can move on without it.

if (person.getPreferredTheme().isDefined()) {
    // ...process the data
}

Yep, just a standard null check with a fancy name.
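The two choices can be sketched side by side. This assumes a compact, hypothetical Option with only isDefined() and get(), plus made-up Person and Theme classes and helper names; it is one possible shape, not the code this post was originally written against.

```java
// Compact Option so the sketch compiles on its own.
abstract class Option<T> {
    abstract boolean isDefined();
    abstract T get();

    static <T> Option<T> some(final T value) {
        return new Option<T>() {
            boolean isDefined() { return true; }
            T get() { return value; }
        };
    }

    static <T> Option<T> none() {
        return new Option<T>() {
            boolean isDefined() { return false; }
            T get() { throw new NullPointerException("no value set"); }
        };
    }
}

class Theme {
    private final String name;
    Theme(String name) { this.name = name; }
    String getName() { return name; }
}

class Person {
    private final Option<Theme> preferredTheme;
    Person(Option<Theme> preferredTheme) { this.preferredTheme = preferredTheme; }

    // The return type now says the theme is optional.
    Option<Theme> getPreferredTheme() { return preferredTheme; }
}

class OptionChoicesDemo {
    // String name = person.getPreferredTheme().getName();
    // ^ would not compile: Option<Theme> has no getName() method.

    // Choice 1: we are sure it is set here, so fail fast if it is not.
    static String requiredThemeName(Person person) {
        return person.getPreferredTheme().get().getName();
    }

    // Choice 2: handle the missing case explicitly and move on.
    static String themeNameOrDefault(Person person, String fallback) {
        Option<Theme> theme = person.getPreferredTheme();
        if (theme.isDefined()) {
            return theme.get().getName();
        }
        return fallback;
    }

    public static void main(String[] args) {
        Person withTheme = new Person(Option.some(new Theme("dark")));
        Person withoutTheme = new Person(Option.<Theme>none());

        System.out.println(requiredThemeName(withTheme));
        System.out.println(themeNameOrDefault(withoutTheme, "default"));
    }
}
```

Either way, the caller has to write get() or isDefined() before touching the Theme, so the decision is visible in the code.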

The important thing to note here is that accessing optional data without considering that fact is disallowed by the compiler. This cannot be said of the five approaches described at the beginning of the post. You are moving what was previously developer busywork (which we're notoriously bad at) into the domain of the compiler, which excels at busywork. And it doesn't need some whiz-bang language feature like Nullable Types or annotations either - it takes advantage of very simple classes.

It's also important to note Option should be used only on types that are specified to be optional.  I think a lot of people think Option is intended to protect you from all NPEs by forcing that extra check each time. Not so - just use it on your truly optional variables, and use your normal variables with confidence.

Option is absolutely worth trying in your Java project - I intend to try soon. I fear it may require too much boilerplate in Java.  Scala (and Haskell and others) make Option more appealing due to much stronger type inference (less noise) and a host of other tools for processing them beyond get() and isDefined(). The other tools available in Scala require much more of a leap of faith/new way of thinking.  They're also the source of most of Cedric's skepticism, I believe.  I hope to cover them in the next post.

"If it's worth telling another developer, it's worth telling the compiler"
-Guy Steele


Gareth McCaughan said...

Your Java implementation of Option has None objects returning true from isDefined. You might want to change that :-).

Adam Rabung said...

@Gareth - I have a tradition of introducing critical bugs into example code. Think of it like an easter egg :)
Updated - Thanks for pointing it out.
