Saturday, November 20, 2010

"Intro to Scala for Java Programmers": slides, code, and links

Last week, I presented a talk titled "An Introduction to Scala for Java Programmers".  I had a lot of fun making it, and I learned a ton.  I thought it was interesting that the more I learned, the more I liked Scala.  This is certainly not always true of technologies that look good at first!

Here are the slides.  They rely pretty heavily on the narration that goes with them, but maybe you will see something interesting.

Here's the code I used to develop the slides.  It's in the form of an Eclipse workspace.  To use it, you'll need the Scala plugin installed on Eclipse.  Unfortunately, you'll need to use Eclipse 3.5 (Galileo) instead of the latest.  Once it's installed, make sure you activate the Scala perspective, then import the sample code using File...Import...Existing Project Into Workspace, and use "Select Archive File" to import it.  The code is a downright odd collection of snippets, but I suspect it would be interesting to play with if you were curious about Scala.

Finally, I wanted to give better credit for some of the ideas and samples in the slides.

  • 3 interesting uses of traits: lifted from this awesome article
  • Actors: inspired by this great intro on Actors 
  • As useful as named parameters and default values are, I didn't have a good example to illustrate them until I found this intro on Artima
  • I think the format of the slides was guided by this article on combining OO and functional principles.
  • Many times, I ended up referring back to Programming in Scala for clarity.  It's a great general-purpose programming book.
Update: Some people have asked for a downloadable pdf - here it is

Tuesday, October 26, 2010

3 easy wins using Google Guava

Google Guava used to be known as Google Collections.  It is a large set of utility methods used within Google.  It aims to help you create concise and easy-to-read code.

1. Concise collection initialization - No example of Java being overly verbose is complete without showing a repetitive and noisy constructor like this:

Java 7 intends to introduce a "diamond operator" which allows the code to be converted to this:

This kind of change probably won't prevent any bugs, but in my opinion it greatly reduces code "noise".  Noise does not matter in a single-line example, but makes a big difference when multiplied across an entire system.

Google Guava provides similar functionality, today, and it works all the way back to Java 5.  It takes advantage of one of the few places Java does infer types: method return values.

Here, newHashMap is a static method on Guava's Maps class, brought in with a static import.  It gets better: Guava can also be used to initialize collections with values:

java.util.Arrays.asList does something similar, but the list it returns is fixed-size: it is backed by the array, so you can't add or remove elements.  In combination with other techniques (many from Guava), type inference for collections and easy collection initialization can really clean up your code.
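The inference trick described above doesn't depend on anything Guava-specific.  Here's a minimal sketch that shows the mechanism without Guava on the classpath.  CollectionFactories is a hypothetical stand-in I made up for illustration; it is not Guava's source, and Guava's real methods are more thorough.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Guava-style factory methods (not Guava's code).
final class CollectionFactories {
    private CollectionFactories() {}

    // K and V are inferred from the type of the variable being assigned.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    // E is inferred from the arguments, so the list is created and filled in one call.
    @SafeVarargs
    static <E> List<E> newArrayList(E... elements) {
        List<E> list = new ArrayList<E>(elements.length);
        for (E element : elements) {
            list.add(element);
        }
        return list;
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        // Verbose: the type arguments are spelled out twice.
        Map<String, List<Integer>> verbose = new HashMap<String, List<Integer>>();

        // Inferred: the compiler works out K and V from the left-hand side.
        Map<String, List<Integer>> concise = CollectionFactories.newHashMap();

        // Unlike Arrays.asList, this list can still grow afterwards.
        List<Integer> primes = CollectionFactories.newArrayList(2, 3, 5);
        primes.add(7);
        concise.put("primes", primes);

        verbose.putAll(concise);
        System.out.println(verbose);
    }
}
```

With a static import of the factory methods, the call sites shrink to newHashMap() and newArrayList(2, 3, 5).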

2. Multimap - How often are the values in your map actually a collection?  Imagine taking a group of people and grouping these people by last name using a Map.

Current way:
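A sketch of the hand-rolled version, assuming a hypothetical two-field Person class (the names here are mine, made up for illustration).  The null check and the "create the list if it isn't there yet" dance are the part a Multimap takes off your hands.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupByLastName {
    // Hypothetical minimal Person, just enough for the example.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Groups people by last name using a plain Map whose values are Lists.
    static Map<String, List<Person>> byLastName(List<Person> people) {
        Map<String, List<Person>> result = new HashMap<String, List<Person>>();
        for (Person person : people) {
            List<Person> sameName = result.get(person.lastName);
            if (sameName == null) {
                // First person with this last name: create the bucket by hand.
                sameName = new ArrayList<Person>();
                result.put(person.lastName, sameName);
            }
            sameName.add(person);
        }
        return result;
    }
}
```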

Using Multimap:

3. One-liners
Here's a small sampling of clever utility methods provided:

I've got to stop here, or I'll never finish. I have only covered a few of the classes I really love.  Honorable mention goes to:

  • BiMap - Easily create the inverse of your map (keys become values and values become keys)
  • Ordering - Lots of nice extensions to Comparator
  • Multiset - A set that keeps count of duplicates
  • Predicates and Functions - reusable objects for filtering and transforming your collections
  • AbstractIterator - Writing your own iterator is annoyingly tricky - this can really help
  • Forwarding Collections - Makes it easy to write your collection classes by delegating to existing collections classes (w/o boilerplate)
There's also a huge portion of the library I don't know yet, and really want to investigate:

  • MapMaker - Factory for producing maps for caches - makes a very tricky job easier.
  • Immutables - I'm a big fan of immutability, but haven't investigated Guava's extensive support for it yet

Friday, October 22, 2010

Scala's Option will save you from the most common cause of NullPointerException

A while back, Cedric Beust claimed "Scala’s “Option” and Haskell’s “Maybe” types won’t save you from null".  He's rightly convinced that Option is not a silver bullet for unexpectedly referencing null values at runtime.  However, I believe there are "Good" NullPointerExceptions (NPEs) and "Bad" ones.  Option goes very, very far in saving you from "Bad" NPEs by enlisting the compiler's help in tracking what is intended to be optional.

I believe there are two very different cases of NullPointerException:
"Bad" - The code incorrectly assumes the value is guaranteed to be set, but the value is null.
In this case "getPreferredTheme" is optional, but the programmer accesses it as if it will never be null in the current context. This is programmer error: the developer either didn't know or didn't care that the value can sometimes be null. This is the class of NPE I feel is the most common, and the one I believe the Option "pattern" can greatly reduce, if not eliminate.

"Good" - The code correctly assumes the value is guaranteed to be set, but the value is null.
In this case, getCompany is designed to be set on a manager object. In my opinion, the developer should NOT try to accommodate getCompany returning null if it is specified that it never will. In the spirit of failing fast, the user should boldly use the object as it's specified.

In both Good and Bad scenarios, how was it specified that getPreferredTheme was optional and getCompany was required? In Java, I've seen a few approaches:

1. Sheer mental horsepower and subjective judgement. "Preferred Theme sounds pretty wishy washy to me, I better check for null". If you're not sure, check the javadoc, source, database schema, and any number of artifacts to inform your guess.  Good luck!

2. "Sentinel values" - Sentinel values are when you return a non-null value of the correct type, but with a wacky value to signify, well, null! For example, a binary search that returns -1 if the element isn't found. This is not the same as returning a default value, which is awesome when appropriate. Not realizing you're working with a sentinel value is identical to not realizing you are working with null. When you accidentally use it as a real value, it's gonna be ugly.

3. Annotations - JSR 305 defines some annotations which allow you to mark up your fields, variables and parameters as @Nonnull, @Nullable, @CheckForNull, @ParametersAreNonnullByDefault, and @ParametersAreNullableByDefault. A tool like FindBugs can process these annotations and give you warnings when you break the specification. I tried this approach for several months, and definitely had some success with it. However, it has a fair share of false negatives and false positives, and can be noisy. I gave up on it when I encountered code like this (paraphrased):
@Nonnull Person manager = ...
manager.getCompany(); //findbugs warns here - it doesn't know the semantics of checkNonNull

4. Naming conventions - I've tried prefixing the names of methods that can return null with "try", e.g. tryGetPreferredTheme, and naming fields and vars using "optional", e.g.
private Theme optionalTheme.
This visually sets off the null access, and worked better than I'd thought. However, it seems a little noisy, and is very easy to overlook.

5. The "Null Object" Pattern - implement your interface using a "null" implementation.  Exactly like returning null, but more confusing.
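Approach 2 is worth a concrete look, since the JDK itself uses sentinels.  A small sketch of how String.indexOf's -1 can leak into later code as if it were a real index.  SentinelDemo and its method names are made up for illustration.

```java
class SentinelDemo {
    // Intended to return the text after the first ':' in the line.
    // Bug: it never considers that indexOf returns -1 when ':' is absent.
    static String valueAfterColon(String line) {
        int colon = line.indexOf(':');    // -1 is the sentinel for "not found"
        return line.substring(colon + 1); // with -1 this quietly becomes substring(0)
    }

    // Same method, with the sentinel checked explicitly so the failure is loud.
    static String valueAfterColonChecked(String line) {
        int colon = line.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no ':' in: " + line);
        }
        return line.substring(colon + 1);
    }

    public static void main(String[] args) {
        System.out.println(valueAfterColon("theme:dark")); // prints "dark"
        System.out.println(valueAfterColon("theme dark")); // prints "theme dark", silently wrong
    }
}
```

Nothing blows up in the unchecked version; the caller just gets the whole line back and carries on with bad data, which is arguably worse than an NPE.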

Scala/Haskell and many other languages offer a 6th option: the Option type. In Java, our "getPreferredTheme" would change to this if it used Option:

Option is an interface with two implementations: "Some" to represent a real value and "None" to represent a null value.  Here's the code, adapted from this great article:
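A minimal sketch of what such an Option could look like in Java, assuming the two methods this post talks about, get() and isDefined().  The Theme and Person classes are hypothetical, and this is my own cut-down version rather than the code from the linked article.

```java
// Marks a value as "may be absent" in the type system.
interface Option<T> {
    boolean isDefined();

    // Returns the value, or throws NullPointerException if there is none.
    T get();
}

// Wraps a real, non-null value.
final class Some<T> implements Option<T> {
    private final T value;

    Some(T value) {
        if (value == null) {
            throw new NullPointerException("Some requires a non-null value");
        }
        this.value = value;
    }

    public boolean isDefined() { return true; }

    public T get() { return value; }
}

// Stands in for "no value".
final class None<T> implements Option<T> {
    public boolean isDefined() { return false; }

    public T get() {
        throw new NullPointerException("get() called on None");
    }
}

// Hypothetical domain classes, for illustration only.
final class Theme {
    final String name;

    Theme(String name) { this.name = name; }
}

final class Person {
    private final Theme preferredTheme; // null internally when the user never chose one

    Person(Theme preferredTheme) { this.preferredTheme = preferredTheme; }

    // The return type now tells every caller that the theme is optional.
    Option<Theme> getPreferredTheme() {
        if (preferredTheme == null) {
            return new None<Theme>();
        }
        return new Some<Theme>(preferredTheme);
    }
}
```

With this in place, a caller can no longer write person.getPreferredTheme().name directly, because Option<Theme> has no name field.  The caller has to go through get() or isDefined() first.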

With the change above this would no longer compile:


Instead, the two choices are:
1. You just know it won't be null in this one special case:


get() will get you the actual Theme value if set, or throw a NullPointerException if not. What, an NPE? I thought we were avoiding those? We can't. I'm striving only to get rid of "bad" NPEs, the ones where the programmer accesses the null reference out of ignorance or neglect. We only used get() here because we have assured ourselves it should be an error condition if it wasn't set - a "Good" NPE.  Think of it as a shortcut for "check for null and error out if it is null".

2. You're glad you were reminded it was potentially null, and can move on without it.

if (person.getPreferredTheme().isDefined()) {
    ...process the data
}

Yep, just a standard null check with a fancy name.

The important thing to note here is that accessing optional data without considering that fact is disallowed by the compiler. This cannot be said of the five approaches described at the beginning of the post.  You are moving what was previously developer busywork (which we're notoriously bad at) into the domain of the compiler, which excels at busywork. And it doesn't take some whiz-bang language feature like Nullable Types or annotations either - it takes advantage of very simple classes.

It's also important to note Option should be used only on types that are specified to be optional.  I think a lot of people think Option is intended to protect you from all NPEs by forcing that extra check each time. Not so - just use it on your truly optional variables, and use your normal variables with confidence.

Option is absolutely worth trying in your Java project - I intend to try soon. I fear it may require too much boilerplate in Java.  Scala (and Haskell and others) make Option more appealing due to much stronger type inference (less noise) and a host of other tools for processing them beyond get() and isDefined(). The other tools available in Scala require much more of a leap of faith/new way of thinking.  They're also the source of most of Cedric's skepticism, I believe.  I hope to cover them in the next post.

"If it's worth telling another developer, it's worth telling the compiler"
-Guy Steele

Friday, July 09, 2010

A new syntax for Java closures/lambdas

I'm surprised this isn't making more news.  Brian Goetz of Oracle has posted a dramatic revision of Java 7's Project Lambda.  The syntax is much less noisy than the original version, thanks to a lot more type inference for parameters and return types.  I'm also really happy with how accessible the document is - it's very readable, even for a "blue-collar" Java developer like me.

Thursday, May 27, 2010

Oracle pushes first version of Lambda/Closures implementation

Test cases may not be the best source of coding gems, as they're probably testing the weirder stuff.  But you can now see some test cases, and therefore the evolving syntax here.

Oracle has a tough task in adding closures to a language that has checked exceptions and minimal type inference, all without breaking compatibility with zillions of lines of existing code.  Go Snoracle!

Monday, May 17, 2010

Some JUnit tricks for easier and better test cases

Whenever I start a new project, the first thing I'll write is a stack trace filterer/clarifier.  The second would be several enhancements to JUnit.  For test driven development to be effective, you need good coverage and lots of tests.  For that to happen, tests need to be dead-simple to write, easy to maintain, and fast.  The following enhancements have really helped me towards these goals.

1. Easy Fixtures - Many bugs/regressions will involve a set of complicated data relationships.  Create an XML/JSON/whatever format to easily import large sets of data.  For example, let's say bug #535 comes in: "Exception is thrown when trying to delete users when they are in a group".  You can quickly go out, and create 535.xml:

Then, your test case could be:

Without an extremely easy way to create test fixtures and import them, 99% of us will write far fewer tests.

2. Automatic transaction handling and application bootstrapping.  Ideally, the example test method I gave above should be all it takes to run the test.  No explicit setUp or tearDown for bootstrapping your environment, setting up transactions, etc.  It is worth investing your time in some means (maybe an AbstractTest superclass?) to set these things up automatically for the 95% of test cases that require no special environmental setup.  Don't Repeat Yourself.

3. More powerful Set and List comparison - When I do assertEquals(expected, actual) where "expected" and "actual" are Collections with dozens of elements, the last thing I want to hear is merely that they are not equal, maybe with a toString of both.  So I always implement an assertSetsEqual and an assertListsEqual.  assertSetsEqual compares the two collections where order doesn't matter.  Here's a sample of assertSetsEqual:
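A minimal sketch of the idea, using only the JDK.  The SetAsserts class name and the message format are my own choices for illustration: report what's missing and what's unexpected, instead of dumping both collections.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

final class SetAsserts {
    private SetAsserts() {}

    // Compares two collections as sets (order and duplicates are ignored)
    // and reports every difference in a single failure message.
    static <T> void assertSetsEqual(Collection<? extends T> expected,
                                    Collection<? extends T> actual) {
        // In expected but not in actual.
        Set<T> missing = new LinkedHashSet<T>(expected);
        missing.removeAll(actual);

        // In actual but not in expected.
        Set<T> unexpected = new LinkedHashSet<T>(actual);
        unexpected.removeAll(expected);

        if (!missing.isEmpty() || !unexpected.isEmpty()) {
            throw new AssertionError(
                "Sets differ. Missing: " + missing + ", Unexpected: " + unexpected);
        }
    }
}
```

LinkedHashSet is used so the elements in the message come out in the same order they were passed in, which makes the failure easier to match up against the test data.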

The List comparison, where order matters, requires more work to produce a nice message, but it's worth it:

Expected: [a, b, c, d, e, f, g, h, i, j, k]
Actual:   [a, b, c, e, f, g, h, i, j, k, l]
Missing 'd' @ 3
Unexpected l @ 10
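One way to produce a message like the one above is a longest-common-subsequence diff.  This is a sketch under that assumption, not necessarily how my original helper did it, and ListDiff is a made-up name.  Elements that are part of the common subsequence are treated as matching; everything else is reported as missing (with its index in expected) or unexpected (with its index in actual).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class ListDiff {
    private ListDiff() {}

    // Returns one line per difference between expected and actual; empty if they match.
    static <T> List<String> diff(List<T> expected, List<T> actual) {
        int n = expected.size();
        int m = actual.size();

        // lcs[i][j] = length of the longest common subsequence of
        // expected[i..] and actual[j..], filled from the back.
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = Objects.equals(expected.get(i), actual.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both lists, following the table to decide what to skip.
        List<String> problems = new ArrayList<String>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (Objects.equals(expected.get(i), actual.get(j))) {
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                problems.add("Missing '" + expected.get(i) + "' @ " + i);
                i++;
            } else {
                problems.add("Unexpected '" + actual.get(j) + "' @ " + j);
                j++;
            }
        }
        for (; i < n; i++) {
            problems.add("Missing '" + expected.get(i) + "' @ " + i);
        }
        for (; j < m; j++) {
            problems.add("Unexpected '" + actual.get(j) + "' @ " + j);
        }
        return problems;
    }

    // Fails with the lists and every difference, not just the first one.
    static <T> void assertListsEqual(List<T> expected, List<T> actual) {
        List<String> problems = diff(expected, actual);
        if (!problems.isEmpty()) {
            throw new AssertionError("Expected: " + expected + "\nActual:   " + actual
                    + "\n" + String.join("\n", problems));
        }
    }
}
```

The table costs O(n*m) time and memory, which is fine for the dozens of elements a typical test compares but not something to point at huge lists.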

4. Type-checking assertEquals - Due to limitations of Java's type system, assertEquals takes two parameters of type Object.  There can be no compile-time checking that the two arguments are actually of types that can actually have equal objects.  For example:

I discovered org.junit.Assert as I wrote this; it's much better than junit.framework.Assert.  Why are there two?  Just printing the types goes a long way.  I'm surprised how often an assertEquals LOOKS right, but the comparison happily returns false because you are passing incompatible types in.
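Here's a rough sketch of a runtime check along those lines.  TypedAsserts and assertEqualsTyped are hypothetical names, and the "unrelated classes can't be equal" rule is a heuristic I'm assuming, not a guarantee of the language.

```java
final class TypedAsserts {
    private TypedAsserts() {}

    // Like assertEquals, but fails early with class names when the two
    // arguments belong to unrelated classes.
    static void assertEqualsTyped(Object expected, Object actual) {
        if (expected == null || actual == null) {
            if (expected != actual) {
                throw new AssertionError(
                    "expected: " + describe(expected) + " but was: " + describe(actual));
            }
            return;
        }

        Class<?> expectedClass = expected.getClass();
        Class<?> actualClass = actual.getClass();
        // Heuristic: if neither class is a subtype of the other, the
        // comparison is almost certainly a mistake.
        if (!expectedClass.isAssignableFrom(actualClass)
                && !actualClass.isAssignableFrom(expectedClass)) {
            throw new AssertionError("Incomparable types: expected "
                + describe(expected) + " but was " + describe(actual));
        }

        if (!expected.equals(actual)) {
            throw new AssertionError(
                "expected: " + describe(expected) + " but was: " + describe(actual));
        }
    }

    // Renders a value together with its runtime class name.
    private static String describe(Object o) {
        return o == null ? "null" : o.getClass().getName() + "<" + o + ">";
    }
}
```

The heuristic has false positives: an ArrayList and a LinkedList with the same elements are equal, yet neither class extends the other, so this check would reject them.  You'd want an escape hatch back to plain assertEquals for cases like that.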

5. An Eclipse template for creating a new test case.

Just laziness mostly, but I do think it's important to have that fail("No Asserts") in there from the start.  Too often I'll write a test case as I'm exploring a problem, but then I get distracted and abandon it before I really nail down the asserts.  Without asserts, your test case is mostly just slowing things down.

These little changes seem like overkill in the context of writing a silly little test case.  But if your goal is to write hundreds, I found these tricks to be really worth the investment.

Update #1: In the comments, Crias points out that junit.framework is for backwards compatibility with JUnit 3 and earlier and should be avoided if possible.  I think a custom assertEquals which uses reflection to determine if the two arguments even can be equal is worthwhile, but the new org.junit goes a long way just by printing class names.

Update #2: In the comments, David Karr correctly points out that the test runner will print any exception that bubbles out of a test method created with my template.  The reason I took this weird approach is that the Eclipse JUnit runner does not output the trace to the console but to the JUnit View, which I find is a goofy place to inspect the exception.  More importantly, the Eclipse runner won't filter my stack traces, which I'm addicted to.

Update #3: Tym The Enchanter points out that Hamcrest helps with my #3 and #4 points about better asserts.  I know very little about Hamcrest, but here's my naive reaction:

  1. is/equalTo for safer equal to comparisons.  These basically force the type of the "actual" to be a subclass (or the same class) as the "expected".  This goes a long way to reducing dumb-dumb comparisons like new Integer(2) and "2".  In my experience, these are exactly the kinds of comparison errors that slow you down.  It's not perfect, however - it's quite possible that the compile time type of your expected is not a subclass of the actual, but they ARE assignable - see the "Garfield" example below.  Of course you can fall back to more traditional assertEquals if it bites you, and I'm unsure how often I'd see this in real life.  
  2. Set/List comparisons - For sure, Hamcrest beats JUnit asserts here.  However, these comparisons basically fail fast - they only tell you the first problem they find - which could lead you to running the test several times before you get it right.  Tell me all of the problems right off the bat! 

Thanks for the comments!

Tuesday, April 20, 2010

Another reason Virtual Machines are amazing

The details aren't important, but basically:
1. Clojure is a Lisp-like language which runs on the JVM, Scala a more C/Java style language which also runs on the JVM.
2. Clojure provides high performance immutable data structures which can handle small updates w/o copying the entire structure.
3. Clever Scala programmer just wraps the Clojure collection in a thin Scala wrapper.

I know language interop on the JVM has been evolving for some time, but it's especially neat to see when neither of the languages is Java!  We're entering an era where you will not know or care what language your libraries are written in.

Thursday, March 25, 2010

Filter your stack traces

Probably the first utility I write when starting up a new project is ExceptionUtils.getFilteredStackTrace(Throwable e)
The idea here is to:
1. "Flatten" the trace. e.printStackTrace attempts to display the stack trace of the creation of each Exception in the chain. In my experience, only the stack of the "bottom" exception is interesting.
2. Filter out third party lines. 99% of the time, the offending code is mine, not my library's. So why show those endless lines from Tomcat, JUnit, Hibernate, etc?

Filtering isn't always appropriate. Sometimes the trace provides good insight into what your libraries are trying to do when the messages aren't clear. I also recommend turning filtering off in production, where you really need all of the info you can get. However, most of the time, you mainly just want to see what your own code was up to when the exception was thrown, and this approach removes massive amounts of noise.
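To make the two steps concrete, here's a stripped-down sketch using only the JDK.  It is a simplified stand-in, not my full utility: StackTraceFilter, the two-argument signature, and the list of hidden prefixes are assumptions for illustration, and it only counts skipped lines rather than naming which packages they came from.

```java
import java.util.List;

final class StackTraceFilter {
    private StackTraceFilter() {}

    // Builds a short trace: every exception message in the chain, then only
    // the frames of the bottom exception, minus frames from hidden packages.
    static String getFilteredStackTrace(Throwable e, List<String> hiddenPrefixes) {
        StringBuilder out = new StringBuilder();

        // 1. Flatten: print each message once and walk down to the bottom cause.
        Throwable bottom = e;
        out.append("Exception: ").append(e).append('\n');
        while (bottom.getCause() != null) {
            bottom = bottom.getCause();
            out.append("Caused by: ").append(bottom).append('\n');
        }

        // 2. Filter: collapse runs of third-party frames into a single count.
        int skipped = 0;
        for (StackTraceElement frame : bottom.getStackTrace()) {
            if (isHidden(frame.getClassName(), hiddenPrefixes)) {
                skipped++;
            } else {
                skipped = flushSkipped(out, skipped);
                out.append(" at ").append(frame).append('\n');
            }
        }
        flushSkipped(out, skipped);
        return out.toString();
    }

    private static boolean isHidden(String className, List<String> hiddenPrefixes) {
        for (String prefix : hiddenPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Writes the pending "skipped" line, if any, and resets the counter.
    private static int flushSkipped(StringBuilder out, int skipped) {
        if (skipped > 0) {
            out.append(' ').append(skipped).append(" line(s) skipped\n");
        }
        return 0;
    }
}
```

The bottom exception's getStackTrace() already contains the full call stack (the "... 44 more" you see in printStackTrace is only a display shortcut), which is why flattening to that one trace loses nothing about where your own code was.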

Here's an example. I took the standard Hibernate Person model and hooked it up to the H2 database. I altered the model to make username unique, so that this test fails:

Here's the output from the filtered run:

HibExample.test: Exception: org.hibernate.exception.ConstraintViolationException: could not insert: [hib.Person]
Caused by: org.h2.jdbc.JdbcSQLException: Unique index or primary key violation: "CONSTRAINT_INDEX_8 ON PUBLIC.PERSON(FIRSTNAME)"; SQL statement:
insert into PERSON (PERSON_ID, age, firstname, lastname) values (null, ?, ?, ?) [23001-132]
 37 lines skipped for [org.h2, org.hibernate, sun., java.lang.reflect.Method, $Proxy]
 at hib.HibExample.test(
 24 lines skipped for [sun., java.lang.reflect.Method, org.junit, org.eclipse]

And here's e.printStackTrace

org.hibernate.exception.ConstraintViolationException: could not insert: [hib.Person]
 at org.hibernate.exception.SQLStateConverter.convert(
 at org.hibernate.exception.JDBCExceptionHelper.convert(
 at org.hibernate.persister.entity.AbstractEntityPersister.insert(
 at org.hibernate.persister.entity.AbstractEntityPersister.insert(
 at org.hibernate.action.EntityIdentityInsertAction.execute(
 at org.hibernate.engine.ActionQueue.execute(
 at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(
 at org.hibernate.event.def.AbstractSaveEventListener.performSave(
 at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(
 at org.hibernate.event.def.DefaultPersistEventListener.entityIsTransient(
 at org.hibernate.event.def.DefaultPersistEventListener.onPersist(
 at org.hibernate.event.def.DefaultPersistEventListener.onPersist(
 at org.hibernate.impl.SessionImpl.firePersist(
 at org.hibernate.impl.SessionImpl.persist(
 at org.hibernate.impl.SessionImpl.persist(
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(
 at java.lang.reflect.Method.invoke(
 at org.hibernate.context.ThreadLocalSessionContext$TransactionProtectionWrapper.invoke(
 at $Proxy4.persist(Unknown Source)
 at hib.HibExample.test(
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(
 at java.lang.reflect.Method.invoke(
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(
 at org.junit.internal.runners.statements.RunBefores.evaluate(
 at org.junit.internal.runners.statements.RunAfters.evaluate(
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(
 at org.junit.runners.ParentRunner.runChildren(
 at org.junit.runners.ParentRunner.access$000(
 at org.junit.runners.ParentRunner$1.evaluate(
 at org.junit.internal.runners.statements.RunBefores.evaluate(
 at org.junit.internal.runners.statements.RunAfters.evaluate(
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(
 at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(
Caused by: org.h2.jdbc.JdbcSQLException: Unique index or primary key violation: "CONSTRAINT_INDEX_8 ON PUBLIC.PERSON(FIRSTNAME)"; SQL statement:
insert into PERSON (PERSON_ID, age, firstname, lastname) values (null, ?, ?, ?) [23001-132]
 at org.h2.message.DbException.getJdbcSQLException(
 at org.h2.message.DbException.get(
 at org.h2.message.DbException.get(
 at org.h2.index.BaseIndex.getDuplicateKeyException(
 at org.h2.index.PageBtree.find(
 at org.h2.index.PageBtreeLeaf.addRow(
 at org.h2.index.PageBtreeLeaf.addRowTry(
 at org.h2.index.PageBtreeIndex.addRow(
 at org.h2.index.PageBtreeIndex.add(
 at org.h2.table.RegularTable.addRow(
 at org.h2.command.dml.Insert.insertRows(
 at org.h2.command.dml.Insert.update(
 at org.h2.command.CommandContainer.update(
 at org.h2.command.Command.executeUpdate(
 at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(
 at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(
 ... 44 more

Update: Here's a good starter version of the filtering code. It may have bugs: I quickly ported it. It depends on Google Collections, because I can no longer imagine coding without them!

Friday, March 05, 2010

File upload using Jersey Client

This post is nothing special - I just wanted to do an http file upload using Jersey's client API. There's a lot of "noise"/outdated code out there on this topic, just wanted to show an example that worked for me:

This requires Jersey 1.1.5 and the optional jersey-multipart module (the two jars at the bottom of this section)

If you're just curious about Jersey, look at how simple the server side is for file upload using Jersey:


Embed code in Blogger/Blogspot posts

Many source code snippet sites allow you to easily embed source code in your posts by providing a simple <script> tag for each snippet. Blogger seems to disable these tags, and I was surprised how little info I could find on the topic. Luckily, Pastebin provides <iframe>, which works like a charm on Blogger!

Wednesday, March 03, 2010

Scala features for Java Programmers: Case Classes

I'm a Java programmer who's been casually evaluating Scala for about 6 months.  I find it very hard to objectively compare the two languages.  Would concise collection literals save me significant time every day?  Do boilerplate getter methods really ever cause bugs, or do they just offend my sense of style? In what ways can Java be improved to allow me to write better code? What Java-related factors cause me to waste the most time, day-to-day?

I think that's a very difficult question to answer.  So far, I've come up with three principles to help me evaluate various differences between languages:
1. Would it make my code easier to read?
2. Would I be able to more easily write correct code?
3. Does it address common problems I've encountered that are related to Java?

For my first comparison, can you quickly tell me what the bug is here:

Did you catch the bug?  The problem is I had Eclipse generate hashCode/equals before I added the "_firstName" field.  This mistake is very easy to make, and can be brutal to track down in a bigger system.  Here's an example of how things break when you screw up hashCode:
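A cut-down sketch of that failure mode, assuming a hypothetical two-field Person (the real class in this post has five fields).  equals and hashCode only know about lastName, as if they had been generated before firstName existed.

```java
import java.util.HashSet;
import java.util.Set;

class HashCodeBugDemo {
    // Hypothetical reduced Person: firstName was added after equals/hashCode
    // were generated, so neither method looks at it.
    static final class Person {
        private final String firstName;
        private final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        String getFirstName() { return firstName; }

        @Override
        public int hashCode() {
            return lastName.hashCode(); // stale: ignores firstName
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            return lastName.equals(((Person) o).lastName); // stale: ignores firstName
        }
    }

    // Counts how many of the given people a HashSet considers distinct.
    static int distinctCount(Person... people) {
        Set<Person> set = new HashSet<Person>();
        for (Person person : people) {
            set.add(person);
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Two different people, but the set keeps only one of them.
        System.out.println(distinctCount(
            new Person("John", "Smith"),
            new Person("Jane", "Smith"))); // prints 1
    }
}
```

Nothing throws and nothing warns; Jane just quietly never makes it into the set.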

I feel this is an example of where another language may make it easier to write this code correctly.  At 96 lines to essentially represent a struct of 5 fields, I'd say another language has ample opportunity to help me make this code more readable.  These types of "Bean" objects - heavy on data, light on code - are so common in Java that language-level support seems justifiable.

One of my favorite Scala features addresses the readability and correctness problems mentioned above: Case Classes.  Case Classes build on Scala's already succinct class definition format, and adds some extra goodies such as compiler-generated equals and hashCode methods.  Of course, compiler-generated means that as you add/subtract fields, hashCode and equals automatically adjust accordingly.

I know it seems small, having the compiler maintain hashCode and equals methods on "Bean"-style classes.  However, it contributes much to both the readability and correctness of these common classes.  Here's the full definition of Person and the test main method in Scala:

That one little line at the top gives you: enforced immutable access to all 5 fields, correct hashCode and equals, and even a decent toString!  This is 1/96th of the code, and with fewer bugs.  And I'm holding back some additional tricks that Case Classes have.

Of course, Case Classes are not always appropriate.  I've often written hashCodes that did not consider certain fields, etc.  However, many, many classes I've written in Java would have greatly benefitted from a Case Class-type functionality.

I certainly don't spend my days debugging hashCode methods, but Case Classes absolutely address all 3 of my principles for evaluating language differences.

Friday, January 22, 2010

Tuesday, January 12, 2010

4 Great Resources For Getting Started With Scala

I've been aware of Scala for years, and it never really caught my interest. Every time I'd seen a Scala example, it was an insanely dense block of code solving some problem I'd never had before. However, once the creator of Groovy essentially endorsed Scala, I decided it was worth a closer look. I've since looked at a LOT of Scala content and these 4 resources really stand out for me. If you're interested in Scala, I highly recommend you start here:
  1. Programming In Scala - This is simply a great general-purpose programming book. In addition to being a gentle, clear introduction to the language, it's also a fantastic introduction to functional programming concepts and language design. Even if you hate Scala, this book will make you a better programmer.
  2. Pragmatic Real-World Scala - This video shows off all kinds of things that would make a Java developer drool.
  3. Scala For Java Refugees - Very well-written mostly gentle introduction to major Scala concepts.
  4. Daily Scala - Once you've done 1-3, and have written a few Scala apps, subscribe to Daily Scala. Thoughtful, bite-sized examples nearly every day.
  5. Another tour of Scala - A Java-centric breakdown of fundamental Scala features.