Wednesday, February 27, 2013

Dates on a computer

Dates are one of those things in Computer Science that keep finding new and more entertaining ways to behave really badly. So here. Read this XKCD comic, and then everyone, please just follow the ISO standard (ISO 8601: year-month-day, as in 2013-02-27). Hell, if everyone follows the ISO standard, everyone can keep doing that annoying thing where they store a date in an integer. That'll work okay until well after my lifetime is over. Note that other habits for storing a date as a single integer don't sort as well (an ISO date with the dashes taken out sorts perfectly fine using the standard integer comparison), or they have edge cases that'll behave oddly.

For example, if you stored dates as month-day-year, and you need to store January first, 2001, let's first convert that to its numerical equivalent in month-day-year: 01-01-2001. To get your integer, remove the dashes.

That leading zero gets lost, leaving 1012001. If you're using -three- ints to store the date ( a distressingly common thing I see ), -two- leading zeroes get lost in the conversion to a single integer, and we wind up with 112001. I have no idea what that's supposed to mean when your custom date format object gets passed to my code.
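Here's a rough Python sketch of how that goes wrong. The function names `naive_join` and `padded_join` are made up for illustration; they aren't from any library, and this is just one way the lossy conversion might be written.

```python
def naive_join(month: int, day: int, year: int) -> int:
    """Glue three ints together with no zero padding (the lossy approach)."""
    return int(f"{month}{day}{year}")


def padded_join(month: int, day: int, year: int) -> int:
    """Pad month and day to two digits first, then convert to an int."""
    return int(f"{month:02d}{day:02d}{year:04d}")


# January 1st, 2001 from three separate ints: both leading zeroes vanish.
jan_first_naive = naive_join(1, 1, 2001)

# Padding first still loses the one zero at the very front of the integer.
jan_first_padded = padded_join(1, 1, 2001)

# Worse: without padding, two different dates produce the same integer.
jan_eleventh = naive_join(1, 11, 2001)  # January 11th
nov_first = naive_join(11, 1, 2001)     # November 1st

print(jan_first_naive, jan_first_padded, jan_eleventh == nov_first)
```

The last comparison is the real trouble: once January 11th and November 1st collapse into the same number, nobody downstream can tell them apart.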

What I'm asking is this: if you're going to be sloppy about your date formats, store them in a single int, in the ISO format.
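To show why the ISO ordering is the one worth asking for, here's a small Python sketch comparing the two packings. `iso_int` and `mdy_int` are hypothetical helpers I'm inventing for the comparison, and the three sample dates are arbitrary.

```python
from datetime import date


def iso_int(d: date) -> int:
    """Pack a date as YYYYMMDD, e.g. 2001-01-01 becomes 20010101."""
    return d.year * 10000 + d.month * 100 + d.day


def mdy_int(d: date) -> int:
    """Pack a date as MMDDYYYY, e.g. 01-01-2001 becomes 1012001."""
    return d.month * 1000000 + d.day * 10000 + d.year


dates = [date(2001, 1, 1), date(1999, 12, 31), date(2000, 6, 15)]

chronological = sorted(dates)            # the order we actually want
iso_sorted = sorted(dates, key=iso_int)  # plain integer comparison on ISO ints
mdy_sorted = sorted(dates, key=mdy_int)  # plain integer comparison on M-D-Y ints

print(iso_sorted == chronological)  # ISO ints agree with calendar order
print(mdy_sorted == chronological)  # month-first ints do not
```

Putting the year in the most significant digits is what makes integer order line up with calendar order. Month-first packing sorts by month before year, so 2001-01-01 lands ahead of 1999-12-31.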

Though what I -really- want is for your date formats to actually be a robust first-class object in your system, but I understand that's a pain to code for.

If you decided to store dates as the number of seconds since some starting point on your system, well, okay, that's fine, I can work with that. Please don't ignore things like the 2038 problem. ( Hint: If you're going to count seconds, use a bigger integer )

Wait, what's the 2038 problem?

Okay. Well, assuming a single signed 32 bit block of data is storing time information, and assuming you're working on an architecture that assumes the beginning of time is 1970-01-01 at 00:00:00 UTC, ( so, pretty much all Unix-based systems ), AND you're storing time as the number of seconds since this beginning of time, the last date that can be recorded correctly is 2038-01-19, at 03:14:07 UTC. One second later the integer overflows. In practice it wraps around to the most negative value it can hold, which reads as 1901-12-13 at 20:45:52 UTC, so it'll be 1901 again for everyone.
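Python's own integers don't overflow, so this sketch fakes a signed 32 bit counter to show the arithmetic. It assumes the usual two's complement wraparound, and `from_unix` and `wrap_int32` are helper names I made up for the demo.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32 bit integer can hold


def from_unix(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(n: int) -> int:
    """Simulate two's complement wraparound of a signed 32 bit integer."""
    return struct.unpack("<i", struct.pack("<I", n & 0xFFFFFFFF))[0]


last_good = from_unix(INT32_MAX)     # the last representable second
wrapped = wrap_int32(INT32_MAX + 1)  # one tick later, after wraparound
after_wrap = from_unix(wrapped)      # what the clock would claim

print(last_good.isoformat())   # 2038-01-19T03:14:07+00:00
print(wrapped)                 # -2147483648
print(after_wrap.isoformat())  # 1901-12-13T20:45:52+00:00
```

With a signed counter the wrapped value is negative, which is why the clock ends up before 1970 rather than back at it. A 64 bit counter pushes the same problem out by hundreds of billions of years.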


Note to actual professionals reading this blog: I'm still a college student. These are problems I deal with. Somebody tell me better in the comments.

Trivia point: if we're counting the number of years since 'the beginning of time', last year was year 42. I hope everyone remembered their towels.
