Thursday, February 22, 2007

Timezone problems, a few weeks early

Due to some serious ambiguities (Australia has an EST, China has a CST), Sun deprecated three-letter timezone IDs as of Java 1.4. Well, they really meant it: as of Java 1.5_08, "EST" shifts from traditional Eastern Standard Time to a 1980s System V timezone name for "Indiana, with DST not observed." The bug and related thread offer an interesting insight into how TimeZones are managed in the software world.
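To see the difference between a short ID and a region ID on whatever JDK you happen to run, a small check like the one below may help. It is only a sketch: the class name, the helper and the sample instant are made up for illustration, and what it prints for "EST" depends on the JDK version and its timezone settings, so no output is claimed for that line.

```java
import java.util.Date;
import java.util.TimeZone;

class ZoneIdComparison {
    /** True if the zone with this ID is observing daylight saving time at the given instant. */
    static boolean inDst(String zoneId, long epochMillis) {
        return TimeZone.getTimeZone(zoneId).inDaylightTime(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // Illustrative instant (assumption): falls on 11 June 2006 UTC, well inside US daylight time.
        long midJune2006 = 1150000000000L;

        // The region ID knows about DST in New York.
        System.out.println("America/New_York in DST: " + inDst("America/New_York", midJune2006));

        // The short ID's behaviour is JDK-dependent; compare it on your own VM.
        System.out.println("EST in DST:              " + inDst("EST", midJune2006));
    }
}
```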

For reasons long forgotten, we had some ancient code which allowed us to configure the VM-wide TimeZone, and we defaulted to "EST" - ouch. Changing that to America/New_York fixed the problem, but I didn't understand how the default timezone was affecting our date formatting. I thought of databases as storing Timestamps as what Joda would call an "Instant", or others would call "epoch time": what you get when you call System.currentTimeMillis(). Instants don't have timezones, but represent a moment precisely:

1149188400000 == 6/1/2006 2:00 PM EST == 6/1/2006 11:00 AM PST

I was assuming the database essentially stores "1149188400000". When you read that value out, you get a Timestamp back, with this Instant as its backing time. After all, Timestamp/Date essentially represent Instants: look at their getTime() method. Once you have a Timestamp, you can easily format it to any timezone using DateFormat and TimeZone objects. With these horribly flawed ideas, I didn't see where the default timezone fit in.
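The "one instant, many renderings" idea can be sketched with DateFormat and TimeZone as follows. The class and method names are hypothetical, and fixed-offset IDs ("GMT-05:00", "GMT-08:00") are used as stand-ins for EST and PST so the result does not depend on how a given JDK interprets the short IDs.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class InstantFormatting {
    /** Renders one instant (epoch millis) as wall-clock text in the given zone. */
    static String format(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("M/d/yyyy h:mm a", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(epochMillis));
    }

    /** Builds the instant for 2:00 PM on 6/1/2006 at a fixed UTC-5 offset. */
    static long twoPmUtcMinusFive() {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT-05:00"), Locale.US);
        cal.clear(); // drop the current time, including milliseconds
        cal.set(2006, Calendar.JUNE, 1, 14, 0, 0);
        return cal.getTimeInMillis();
    }

    public static void main(String[] args) {
        long instant = twoPmUtcMinusFive();
        // Same instant, two wall-clock readings three hours apart.
        System.out.println(format(instant, "GMT-05:00"));
        System.out.println(format(instant, "GMT-08:00"));
    }
}
```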

Boy, was I wrong. Consider the following SQL:
INSERT INTO DateTest (DateField) VALUES ({ts '2006-06-01 14:00:00.00'})
How do you convert that to an Instant? Of course, you can't - that could be 2 PM in any one of dozens of timezones - all different Instants. Really, the database is just holding the various month/day/hour/etc values - it's up to the code reading the value to make it into an Instant. Considering that java.sql.Timestamp really is an Instant, it's pretty weird to reconsider this code:
Timestamp ts = rs.getTimestamp("DateField");
Which 2 PM instant will "ts" represent? JDBC will apply the VM default timezone in this case. ResultSet provides an overload of getTimestamp which makes the choice explicit:
public Timestamp getTimestamp(String columnName, Calendar cal);
I'm still unclear as to why this takes Calendar, and not TimeZone, but I'm done with investigating for a while :)
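Without a database at hand, the effect can be imitated by parsing the same wall-clock text under different timezones, which is roughly the decision a driver has to make when it builds a Timestamp. This is a sketch under that assumption, not a description of any particular driver; the class and method names are invented.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

class WallClockToInstant {
    /** Interprets timezone-less text such as "2006-06-01 14:00:00" in the given zone. */
    static long toEpochMillis(String wallClock, String zoneId) {
        SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        parser.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return parser.parse(wallClock).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy-MM-dd HH:mm:ss value: " + wallClock, e);
        }
    }

    public static void main(String[] args) {
        String stored = "2006-06-01 14:00:00"; // what the column really holds: fields, no zone
        long newYork = toEpochMillis(stored, "America/New_York");
        long losAngeles = toEpochMillis(stored, "America/Los_Angeles");
        // Same stored fields, two different instants.
        System.out.println("difference in hours: " + (losAngeles - newYork) / 3600000L);
    }
}
```

With real JDBC, the equivalent explicit choice would look something like rs.getTimestamp("DateField", Calendar.getInstance(TimeZone.getTimeZone("America/New_York"))).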

1. Don't use three-letter timezone codes for looking up timezones!
2. DATE and TIMESTAMP columns do not store TimeZone info, or even Instant info.
3. Web programmers should be very careful with TimeZone.setDefault: its scope is either VM-wide or ThreadLocal, depending on your permissions.
4. Instead of TimeZone.setDefault, consider the Calendar-based overloads of getDate, getTime and getTimestamp in ResultSet. This is much clearer, and avoids the scoping problems described in #3.
5. TimeZone.getTimeZone("someIDThatDoesntExist").equals(TimeZone.getTimeZone("GMT")) - BOO!
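One way to guard against the silent GMT fallback in #5 is to check the ID against TimeZone.getAvailableIDs() before looking it up. The helper below is a hypothetical sketch, not part of the JDK; note that it would also reject custom IDs such as "GMT-05:00", which getTimeZone accepts but getAvailableIDs does not list.

```java
import java.util.Arrays;
import java.util.TimeZone;

class StrictTimeZones {
    /** Looks up a zone by ID, failing loudly instead of quietly returning GMT. */
    static TimeZone require(String id) {
        // getAvailableIDs() lists the IDs this JDK knows about (custom "GMT+hh:mm" forms excluded).
        if (!Arrays.asList(TimeZone.getAvailableIDs()).contains(id)) {
            throw new IllegalArgumentException("Unknown time zone ID: " + id);
        }
        return TimeZone.getTimeZone(id);
    }

    public static void main(String[] args) {
        // The stock lookup falls back to GMT for an ID it does not recognize.
        System.out.println(TimeZone.getTimeZone("someIDThatDoesntExist").getID());

        // The strict lookup returns the requested zone or throws.
        System.out.println(require("America/New_York").getID());
    }
}
```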

Wednesday, February 21, 2007

Excellent VM flag

This is a bit dated, but if you're using Java 1.5 or 1.4.2, there's an excellent flag in more recent VM updates: -XX:+HeapDumpOnOutOfMemoryError. OOMs have to be one of the hardest problems to debug: traditionally, you'll have next to zero information on the cause of the crash. This flag apparently has no runtime cost, and will provide a thorough snapshot of the heap at the time of the crash. I hadn't had an "out of heap space"-style OOM in years, but recently had one. Luckily, this flag was enabled, and I was able to use the "jhat" tool to pretty quickly find the culprit. I had some problems with the HAT tool, but the jhat which is distributed with Mustang worked perfectly. This is one of those simple changes that can really save you later.

All the info

Thursday, February 08, 2007

When generic type inference fails

99% of the time, using generics is perfectly natural and pleasurable. The other 1%, it can be downright humbling.

Look at the signature of Arrays.asList:

public static <T> List<T> asList(T... a)

It takes in an array of type T, and returns a List containing the same type. But what is T? The parameter "a" is just a set of arguments the caller may have defined in-place - with no explicit declaration of T. This is only a problem when the varargs passed in are of diverse types:

Arrays.asList(new Thing<Integer>(), new Thing<String>(), new Thing<String>());

In this case, the compiler must find the most specific common supertype. Unfortunately, with the Eclipse compiler at least, that's a tad insane - it would decide the common type is this:

Thing<? extends Object&Comparable<?>&Serializable>

Yuck! It's true, that in a vacuum, there's not much else the compiler can do. But what about when I give it more info?

public List<Thing<?>> getThings() {
  return Arrays.asList(new Thing<Integer>(), new Thing<String>(), new Thing<String>());
}

The compiler complains that it can't convert its insane type to what would appear to be a proper "super" type: Thing. It doesn't seem to me the compiler is taking advantage of all the info I'm giving it. To be fair, I'm sure there are tons of situations where the compiler has no additional context to take advantage of.

However, there's hope. In his post Java 7 Wish List, the author describes a way (which is in Java 5, btw) to give the compiler more of a nudge:

Arrays.<Thing<?>>asList(new Thing<Integer>(), new Thing<String>());

Hey, it ain't pretty, but the only alternative I can think of is a bit too verbose:
List<Thing> things = new ArrayList<Thing>();
things.add(new Thing<Integer>());
things.add(new Thing<String>());

Actually, you CAN cast:

(List<Thing<?>>)Arrays.asList(new Thing<Integer>(), new Thing<String>(), new Thing<String>());

But this works only on Eclipse, not javac. Plus, doesn't it feel a little too ironic to add a cast to accommodate generics :)
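For reference, here is the explicit type argument approach as a complete, compilable sketch. Thing is a hypothetical empty generic class standing in for the real one, and the wrapper class name is invented.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Thing class discussed above.
class Thing<T> { }

class ExplicitTypeArgument {
    /** Tells the compiler what T is instead of letting it infer a common supertype. */
    static List<Thing<?>> getThings() {
        return Arrays.<Thing<?>>asList(
                new Thing<Integer>(), new Thing<String>(), new Thing<String>());
    }

    public static void main(String[] args) {
        System.out.println(getThings().size());
    }
}
```

The explicit <Thing<?>> pins T up front, so the return type lines up without a cast.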

PS. Dear Google programmers, please use your 15% time to make it a lot easier to add code snippets to blog entries.