Thursday, February 22, 2007

Timezone problems, a few weeks early

Due to some serious ambiguities (Australia has an EST, China has a CST), Sun deprecated three-digit timezone IDs as of Java 1.4. Well, they really meant it: as of Java 1.5_08, "EST" shifts from traditional Eastern Standard Time to a 1980s System V timezone name for "Indiana, with DST not observed." The bug and related thread are an interesting insight into how TimeZones are managed in the software world.

For reasons long forgotten, we had some ancient code which allowed us to configure the VM-wide TimeZone, and we defaulted to "EST" - ouch. Changing that to America/New_York fixed the problem, but I didn't understand how default timezone was affecting our date formatting. I conceptualized databases storing Timestamps as what Joda would call an "Instant", or others would call "epoch time": what you get when you call System.currentTimeMillis. Instants don't have timezones, but represent a moment precisely:

1149138000000 == 6/1/2006 2:00 PM EST == 6/1/2006 11:00 AM PST

I was assuming the database essentially stores "1149138000000". When you read that value out, you get a Timestamp back, with this Instant as its backing time. After all, Timestamp/Date essentially represent Instants: look at their getTime() method. Once you have a Timestamp, you can easily format it to any timezone using DateFormat and TimeZone objects. With these horribly flawed ideas, I didn't see where default timezone fit in.

Boy, was I wrong. Consider the following SQL:
INSERT INTO DateTest (DateField) VALUES {ts '2006-06-01 00:14:00.00'}
How do you convert that to an Instant? Of course, you can't - that could be 2 PM in any one of dozens of timezones - all different Instants. Really, the database is just holding the various month/day/hour/etc values - it's up to the code reading the value to make it into an Instant. Considering that java.sql.Timestamp really is an Instant, it's pretty weird to reconsider this code:
Timestamp ts = rs.getTimestamp("DateField");
Which 2PM instant will "ts" represent?? JDBC will apply the VM default timezone in this case. getTimestamp provides a method which will probably make things more clear:
public Timestamp getTimestamp(String columnName, Calendar cal);
I'm still unclear as to why this takes Calendar, and not TimeZone, but I'm done with investigating for a while :)

Lessons:
1. Don't use three-digit timezone codes for looking up timezones!
2. DATE and TIMESTAMP columns do not store TimeZone info, or even Instant info.
3. Web programmers should be very careful with TimeZone.setDefaultTimeZone, its scope is either VM-wide or ThreadLocal, depending on your permissions.
4. Instead of setDefaultTimeZone, consider the Calendar-based permutations of getTimeXXX(....) in ResultSet. This is much more clear, and will avoid the scoping problems described in #3
5. TimeZone.getTimeZone("someIDThatDoesntExist") .equals(TimeZone.getTimeZone("GMT") )- BOO!

No comments: