Tuesday, October 26, 2010

3 easy wins using Google Guava

Google Guava used to be known as Google Collections.  It is a large set of utility methods used within Google.  It aims to help you create concise and easy-to-read code.

1. Concise collection initialization - No example of Java being overly verbose is complete without showing a repetitive and noisy constructor like this:
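A minimal sketch of the kind of declaration I mean (the class and variable names are made up for illustration):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VerboseInit {
    static Map<String, List<String>> newIndex() {
        // The type arguments are spelled out twice: once on the left, once on the right.
        Map<String, List<String>> index = new HashMap<String, List<String>>();
        return index;
    }
}
```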

Java 7 intends to introduce a "diamond operator" which allows the code to be converted to this:
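With the diamond syntax (available from Java 7 on), a declaration of that kind might shrink to something like this; the names are again illustrative:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DiamondInit {
    static Map<String, List<String>> newIndex() {
        // The compiler infers <String, List<String>> from the declared type on the left.
        Map<String, List<String>> index = new HashMap<>();
        return index;
    }
}
```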

This kind of change probably won't prevent any bugs, but in my opinion it greatly reduces code "noise".  Noise does not matter in a single-line example, but makes a big difference when multiplied across an entire system.

Google Guava provides similar functionality, today, and it works all the way back to Java 5.  It takes advantage of one of the few places Java does infer types: method return values.

Here, newHashMap is a static method on com.google.common.collect.Maps, brought in with a static import.  It gets better: Guava can also be used to initialize collections with values:
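To show the inference trick without needing Guava on the classpath, here is a rough sketch using two stand-in methods of my own. They have the same shape as Guava's Maps.newHashMap() and Lists.newArrayList(E...), but they are not Guava's code:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InferredFactories {
    // Stand-in shaped like Guava's Maps.newHashMap():
    // K and V are inferred from the type of the variable being assigned.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    // Stand-in shaped like Guava's Lists.newArrayList(E...):
    // creates a mutable list that already contains the given values.
    @SafeVarargs
    static <E> List<E> newArrayList(E... elements) {
        List<E> list = new ArrayList<E>(elements.length);
        Collections.addAll(list, elements);
        return list;
    }

    public static void main(String[] args) {
        // No type arguments on the right-hand side.
        Map<String, List<String>> index = newHashMap();
        List<String> names = newArrayList("alice", "bob", "carol");
        index.put("team", names);
        System.out.println(index); // prints {team=[alice, bob, carol]}
    }
}
```

With Guava itself you would drop the two stand-ins and statically import the real methods instead.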

java.util.Arrays.asList does something similar, but it returns a fixed-size list backed by the array: you can replace elements, but you cannot add or remove them.  In combination with other techniques (many from Guava), type inference for collections and easy collection initialization can really clean up your code.
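A quick sketch of how the list returned by Arrays.asList behaves (the class and method names are just for illustration):

```java
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    static List<String> sample() {
        return Arrays.asList("a", "b", "c");
    }

    // Returns true if the list refuses to grow.
    static boolean addIsRejected(List<String> list) {
        try {
            list.add("d");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> letters = sample();
        letters.set(0, "z");                        // replacing an element is allowed
        System.out.println(letters);                // prints [z, b, c]
        System.out.println(addIsRejected(letters)); // prints true
    }
}
```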

2. Multimap - How often are the values in your map actually a collection?  Imagine taking a group of people and grouping these people by last name using a Map.

Current way:
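Roughly what the hand-rolled version tends to look like. Person here is a hypothetical minimal class, not taken from any real code base:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupByLastNameWithMap {
    // Hypothetical minimal Person type for illustration.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    static Map<String, List<Person>> groupByLastName(List<Person> people) {
        Map<String, List<Person>> byLastName = new HashMap<String, List<Person>>();
        for (Person person : people) {
            List<Person> sameName = byLastName.get(person.lastName);
            if (sameName == null) {
                // First person with this last name: create the bucket by hand.
                sameName = new ArrayList<Person>();
                byLastName.put(person.lastName, sameName);
            }
            sameName.add(person);
        }
        return byLastName;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ada", "Smith"),
                new Person("Bob", "Smith"),
                new Person("Cy", "Jones"));
        Map<String, List<Person>> grouped = groupByLastName(people);
        System.out.println(grouped.get("Smith").size()); // prints 2
        System.out.println(grouped.get("Brown"));        // prints null
    }
}
```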

Using Multimap:
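With Guava you would use a real Multimap such as ArrayListMultimap. So that the sketch below compiles without Guava, it uses a tiny stand-in of my own (ListMultimapSketch) that only mimics put and get; the equivalent Guava calls are shown in comments. Person is again a hypothetical class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupByLastNameWithMultimap {
    // Hypothetical minimal Person type for illustration.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Stand-in for the put/get part of a Multimap; NOT Guava's implementation.
    static final class ListMultimapSketch<K, V> {
        private final Map<K, List<V>> buckets = new HashMap<K, List<V>>();

        void put(K key, V value) {
            List<V> bucket = buckets.get(key);
            if (bucket == null) {
                bucket = new ArrayList<V>();
                buckets.put(key, bucket);
            }
            bucket.add(value);
        }

        // Like Guava's Multimap.get, a missing key yields an empty collection, not null.
        // (Guava returns a live view of the multimap; this sketch does not.)
        List<V> get(K key) {
            List<V> bucket = buckets.get(key);
            return bucket == null ? Collections.<V>emptyList() : bucket;
        }
    }

    static ListMultimapSketch<String, Person> groupByLastName(List<Person> people) {
        // With Guava: Multimap<String, Person> byLastName = ArrayListMultimap.create();
        ListMultimapSketch<String, Person> byLastName = new ListMultimapSketch<String, Person>();
        for (Person person : people) {
            // The bucket bookkeeping is gone from the calling code.
            byLastName.put(person.lastName, person);
        }
        return byLastName;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ada", "Smith"),
                new Person("Bob", "Smith"),
                new Person("Cy", "Jones"));
        ListMultimapSketch<String, Person> grouped = groupByLastName(people);
        System.out.println(grouped.get("Smith").size()); // prints 2
        System.out.println(grouped.get("Brown").size()); // prints 0
    }
}
```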

3. One-liners
Here's a small sampling of clever utility methods provided:

I've got to stop here, or I'll never finish. I have only covered a few of the classes I really love.  Honorable mention goes to:

  • BiMap - Easily create the inverse of your map (keys become values and values become keys)
  • Ordering - Lots of nice extensions to Comparator
  • Multiset - A set that keeps count of duplicates
  • Predicates and Functions - reusable objects for filtering and transforming your collections
  • AbstractIterator - Writing your own iterator is annoyingly tricky - this can really help
  • Forwarding Collections - Makes it easy to write your collection classes by delegating to existing collections classes (w/o boilerplate)

There's also a huge portion of the library I don't know yet, and really want to investigate:

  • MapMaker - Factory for producing maps for caches - makes a very tricky job easier.
  • Immutables - I'm a big fan of immutability, but haven't investigated Guava's extensive support for it yet

Friday, October 22, 2010

Scala's Option will save you from the most common cause of NullPointerException

A while back, Cedric Beust claimed "Scala’s “Option” and Haskell’s “Maybe” types won’t save you from null".  He's rightly convinced that Option is not a silver bullet for unexpectedly referencing null values at runtime.  However, I believe there are "Good" NullPointerExceptions (NPEs) and "Bad" ones.  Option goes very, very far in saving you from "Bad" NPEs by enlisting the compiler's help in tracking what is intended to be optional.

I believe there are two very different cases of NullPointerException:
"Bad" - The code incorrectly assumes the value is guaranteed to be set, but the value is null.
In this case "getPreferredTheme" is optional, but the programmer accesses it as if it will never be null in the current context. This is programmer error: the developer either didn't know or didn't care that the value can sometimes be null. This is the class of NPE I feel is the most common, and the one I believe the Option "pattern" can greatly, if not completely, reduce.

"Good" - The code correctly assumes the value is guaranteed to be set, but the value is null.
In this case, getCompany is designed to be set on a manager object. In my opinion, the developer should NOT try to accommodate getCompany returning null if it is specified that it never will. In the spirit of failing fast, the user should boldly use the object as it's specified.
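To make the two cases concrete, here is a minimal sketch. Person, Theme and Company are hypothetical stand-ins, and the helper method names are mine:

```java
class NpeCases {
    static final class Theme {
        private final String name;
        Theme(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Company {
        private final String name;
        Company(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Person {
        private final Company company;      // specified as always set for a manager
        private final Theme preferredTheme; // optional: may be null

        Person(Company company, Theme preferredTheme) {
            this.company = company;
            this.preferredTheme = preferredTheme;
        }

        Company getCompany() { return company; }
        Theme getPreferredTheme() { return preferredTheme; }
    }

    // "Bad": treats an optional value as if it were always present.
    // Throws NullPointerException whenever no theme has been chosen.
    static String themeName(Person person) {
        return person.getPreferredTheme().getName();
    }

    // "Good": relies on the specification. If getCompany() ever returns null,
    // the object was built incorrectly and failing fast is the right outcome.
    static String companyName(Person manager) {
        return manager.getCompany().getName();
    }
}
```

Nothing in the two signatures tells the caller which getter is the risky one, which is exactly the problem.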

In both Good and Bad scenarios, how was it specified that getPreferredTheme was optional and getCompany was required? In Java, I've seen a few approaches:

1. Sheer mental horsepower and subjective judgement. "Preferred Theme sounds pretty wishy washy to me, I better check for null". If you're not sure, check the javadoc, source, database schema, and any number of artifacts to inform your guess.  Good luck!

2. "Sentinel values" - Sentinel values are when you return a non-null value of the correct type, but with a wacky value to signify, well, null! For example, a binary search that returns -1 if the element isn't found. This is not the same as returning a default value, which is awesome when appropriate. Not realizing you're working with a sentinel value is identical to not realizing you are working with null. When you accidentally use it as a real value, it's gonna be ugly.
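A small sketch of a sentinel going wrong, using String.indexOf (which returns -1 when nothing is found) rather than a binary search; the method name is made up:

```java
class SentinelDemo {
    // Returns the text after the first '@', without checking indexOf's -1 sentinel.
    static String domainUnchecked(String email) {
        int at = email.indexOf('@');    // -1 when there is no '@'
        return email.substring(at + 1); // with at == -1 this is substring(0): the whole string
    }

    public static void main(String[] args) {
        System.out.println(domainUnchecked("bob@example.com")); // prints example.com
        System.out.println(domainUnchecked("no-at-sign"));      // prints no-at-sign
    }
}
```

There is no exception here at all: the sentinel is quietly treated as a real index and the caller gets a wrong answer.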

3. Annotations - JSR 305 defines some annotations which allow you to mark up your fields, variables and parameters as @Nonnull, @Nullable, @CheckForNull, @ParametersAreNonNullByDefault, and @ParametersAreNullableByDefault. A tool like FindBugs can process these annotations and give you warnings when you break the specification. I tried this approach for several months, and definitely had some success with it. However, it has a fair share of false negatives and false positives, and can be noisy. I gave up on it when I encountered code like this (paraphrased):
@Nonnull Person manager = ...
manager.getCompany(); //findbugs warns here - it doesn't know the semantics of checkNonNull

4. Naming conventions - I've tried prefixing the names of methods that can return null with "try", e.g. tryGetPreferredTheme, and naming fields and variables using "optional", e.g.
private Theme optionalTheme.
This visually sets off the null access, and worked better than I'd thought. However, it seems a little noisy, and is very easy to overlook.

5. The "Null Object" Pattern - implement your interface using a "null" implementation.  Exactly like returning null, but more confusing.
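A minimal sketch of what I mean, with hypothetical Theme, NoTheme and Person types:

```java
class NullObjectDemo {
    interface Theme {
        String getName();
    }

    static final class NamedTheme implements Theme {
        private final String name;
        NamedTheme(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // The "null" implementation: a real object that stands for "no theme".
    static final class NoTheme implements Theme {
        public String getName() { return ""; }
    }

    static final class Person {
        private final Theme preferredTheme;

        Person(Theme preferredTheme) {
            // A missing theme is swapped for the null object.
            this.preferredTheme = (preferredTheme == null) ? new NoTheme() : preferredTheme;
        }

        Theme getPreferredTheme() { return preferredTheme; }
    }

    public static void main(String[] args) {
        Person person = new Person(null);
        // No exception, but the caller also gets no hint that the theme was never set.
        System.out.println("[" + person.getPreferredTheme().getName() + "]"); // prints []
    }
}
```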

Scala/Haskell and many other languages offer a 6th option: the Option type. In Java, our "getPreferredTheme" would change to this if it used Option:

Option is an interface with two implementations: "Some" to represent a real value and "None" to represent a null value.  Here's the code, adapted from this great article:
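The snippets themselves are not reproduced here, so what follows is my own minimal sketch of such an Option, not the code from the article. Theme and Person are hypothetical, and get() on None throws NullPointerException to match the behaviour described in this post:

```java
class OptionSketch {
    interface Option<T> {
        T get();             // the value, or an exception if there is none
        boolean isDefined(); // true only when a value is present
    }

    // Represents a real value.
    static final class Some<T> implements Option<T> {
        private final T value;
        Some(T value) { this.value = value; }
        public T get() { return value; }
        public boolean isDefined() { return true; }
    }

    // Represents the absence of a value.
    static final class None<T> implements Option<T> {
        public T get() { throw new NullPointerException("get() called on None"); }
        public boolean isDefined() { return false; }
    }

    static final class Theme {
        private final String name;
        Theme(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Person {
        private final Option<Theme> preferredTheme;

        Person(Option<Theme> preferredTheme) { this.preferredTheme = preferredTheme; }

        // The return type now tells callers (and the compiler) that the theme is optional.
        Option<Theme> getPreferredTheme() { return preferredTheme; }
    }
}
```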

With the change above this would no longer compile:


Instead, the two choices are:
1. You just know it won't be null in this one special case:


get() will get you the actual Theme value if set, or throw a NullPointerException if not. What, an NPE? I thought we were avoiding those? We can't. I'm striving only to get rid of "bad" NPEs, the ones where the programmer accesses a null reference out of ignorance or neglect. We only used get() here because we have assured ourselves that it should be an error condition if the value isn't set - a "Good" NPE.  Think of it as a shortcut for "check for null and error out if it is null".

2. You're glad you were reminded it was potentially null, and can move on without it.

if (person.getPreferredTheme().isDefined()) {
    // ...process the data
}

Yep, just a standard null check with a fancy name.
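Putting the two choices side by side, as a rough sketch. The snippet carries its own minimal Option so that it compiles by itself; all type and method names are hypothetical:

```java
class OptionUsage {
    interface Option<T> {
        T get();
        boolean isDefined();
    }

    static final class Some<T> implements Option<T> {
        private final T value;
        Some(T value) { this.value = value; }
        public T get() { return value; }
        public boolean isDefined() { return true; }
    }

    static final class None<T> implements Option<T> {
        public T get() { throw new NullPointerException("get() called on None"); }
        public boolean isDefined() { return false; }
    }

    static final class Theme {
        private final String name;
        Theme(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Person {
        private final Option<Theme> preferredTheme;
        Person(Option<Theme> preferredTheme) { this.preferredTheme = preferredTheme; }
        Option<Theme> getPreferredTheme() { return preferredTheme; }
    }

    // Does not compile: Option<Theme> has no getName() method.
    // String name = person.getPreferredTheme().getName();

    // Choice 1: we are sure the theme is set here, so a missing one is an error.
    static String themeNameOrFail(Person person) {
        return person.getPreferredTheme().get().getName();
    }

    // Choice 2: the theme really is optional here, so check before using it.
    static String themeNameOrDefault(Person person, String fallback) {
        Option<Theme> theme = person.getPreferredTheme();
        if (theme.isDefined()) {
            return theme.get().getName();
        }
        return fallback;
    }
}
```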

The important thing to note here is that accessing optional data without considering that fact is disallowed by the compiler. This cannot be said of the five approaches described earlier in the post.  You are moving what was previously developer busywork (which we're notoriously bad at) into the domain of the compiler, which excels at busywork. And not some whiz-bang language feature like Nullable Types or annotations either - it takes advantage of very simple classes.

It's also important to note Option should be used only on types that are specified to be optional.  I think a lot of people think Option is intended to protect you from all NPEs by forcing that extra check each time. Not so - just use it on your truly optional variables, and use your normal variables with confidence.

Option is absolutely worth trying in your Java project - I intend to try soon. I fear it may require too much boilerplate in Java.  Scala (and Haskell and others) make Option more appealing due to much stronger type inference (less noise) and a host of other tools for processing them beyond get() and isDefined(). The other tools available in Scala require much more of a leap of faith/new way of thinking.  They're also the source of most of Cedric's skepticism, I believe.  I hope to cover them in the next post.

"If it's worth telling another developer, it's worth telling the compiler"
-Guy Steele