Cross-Platform Time Zones in Haskell
FeedPipe supports setting the time zone used to display dates and times, per index. Time zones can be difficult to deal with, but handling them is straightforward on Linux and most other operating systems. Unfortunately, Windows is problematic. I would like FeedPipe to work on Windows as well, so I need to plan accordingly.
Time Zones
What is difficult about time zones? UTC offsets are easy to use, but the UTC offset that applies to a local time in a specific location depends on arbitrary decisions made by local governments, and those rules change over time.
In FeedPipe, a time zone is specified per index. For example, somebody may create an index that should display times in the America/Los_Angeles time zone. Los Angeles observes daylight saving time, so the time zone may be Pacific Standard Time (PST, -08:00) or Pacific Daylight Time (PDT, -07:00) depending on the time of the year. The FeedPipe configuration cannot specify a UTC offset because it is not constant. The time zone is therefore configured as America/Los_Angeles, and the UTC offset is determined for each specific time.
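To make the point concrete, here is a minimal sketch using only the time package. The names losAngeles2020, localIn2020, winter, and summer are hypothetical, and the 2020 transition instants are hard-coded purely for illustration; real code would look them up in the time zone database instead.

```haskell
import Data.Time

-- Fixed offsets are the easy part: a TimeZone is just minutes from UTC.
pst, pdt :: TimeZone
pst = TimeZone (-480) False "PST"
pdt = TimeZone (-420) True "PDT"

-- | Illustration only: the America/Los_Angeles rules for 2020, hard-coded.
-- Daylight saving time ran from 2020-03-08 10:00 UTC to 2020-11-01 09:00 UTC.
losAngeles2020 :: UTCTime -> TimeZone
losAngeles2020 t
  | t >= dstStart && t < dstEnd = pdt
  | otherwise = pst
  where
    dstStart = UTCTime (fromGregorian 2020 3 8) (secondsToDiffTime (10 * 3600))
    dstEnd = UTCTime (fromGregorian 2020 11 1) (secondsToDiffTime (9 * 3600))

-- Two sample instants, both at 12:00 UTC.
winter, summer :: UTCTime
winter = UTCTime (fromGregorian 2020 1 15) (secondsToDiffTime (12 * 3600))
summer = UTCTime (fromGregorian 2020 7 15) (secondsToDiffTime (12 * 3600))

-- | Local time in Los Angeles for an instant in 2020.
localIn2020 :: UTCTime -> LocalTime
localIn2020 t = utcToLocalTime (losAngeles2020 t) t
```

The same UTC time of day maps to 04:00 local time for winter and 05:00 for summer, which is why a single configured offset cannot work.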
The time zone rules are standardized in a time zone database that is updated more often than one might expect. (There were six updates in 2020.) If the most recent database is not used, then any time after the release date of the outdated database may be incorrect.
I have a great example of an arbitrary time zone change. In July 2018, Japan considered observing daylight saving time during the summer of 2020 to make the Tokyo weather more bearable during the 2020 Olympics. The committee even considered offsetting the time by two hours instead of one! It is fortunate that this idea did not get implemented, as I suspect that it would have caused many more problems in Japanese infrastructure than Y2K (at least in the systems that do not run on paper and fax). Since all of Japan is in a single time zone that does not observe daylight saving time, many systems here take a simple approach and do not handle time correctly; there would have been many hours of overtime as developers scrambled to fix systems on short notice, as well as many issues during that summer.
Windows
Keeping a system time zone database up-to-date seems like something that every operating system should do. Most do! Unfortunately, Windows does not.
Cross-platform software that supports Windows usually includes a snapshot of the time zone database in the software. The software can use the system time zone database when one is available and fall back to the included snapshot on Windows. To keep the Windows database current, the software therefore needs to have regular releases including new time zone database snapshots as they are updated (six times in 2020).
For example, Python 3.9 implements time zone support in the zoneinfo module of the standard library, and a current snapshot of the time zone database is available in the tzdata package. By default, the system time zone database is used if it exists, and the snapshot is used as a fallback.
Haskell
Unfortunately, there are no production-ready solutions to this problem in Haskell that provide a high-level API like the Python 3.9 solution. The tz and tzdata packages appear to be working in the right direction, but the package description warns that it is still in an alpha phase, and the time zone database is not kept current.
I will therefore likely use the following packages:
- timezone-series provides the core data types and functionality.
- timezone-olson parses (and renders) the database files.
- timezone-olson-th loads database files at compile time.
This implementation loads binary database files for single locations. When building on Linux, these files can be read from the system time zone database. The /usr/share/zoneinfo directory can be traversed, and a map from time zone name to time zone data can be created. Note that the files must be filtered, as the directory includes some files that are not binary database files.
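The traversal and filtering could be sketched as follows, using only libraries that ship with GHC. The function names isTZif and listZoneNames are hypothetical. The filter relies on the fact that binary database files (TZif format, RFC 8536) begin with the four bytes "TZif"; parsing the files is left to a library such as timezone-olson and is not shown.

```haskell
import qualified Data.ByteString.Char8 as BS8
import Data.List (intercalate)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath ((</>))
import System.IO (IOMode (ReadMode), withBinaryFile)

-- | True when the file starts with the four-byte TZif magic number.
isTZif :: FilePath -> IO Bool
isTZif path = withBinaryFile path ReadMode $ \h ->
  (== BS8.pack "TZif") <$> BS8.hGet h 4

-- | Hypothetical helper: zone names (such as "America/Los_Angeles") of all
-- TZif files below the given root directory.
listZoneNames :: FilePath -> IO [String]
listZoneNames root = go []
  where
    -- 'parts' holds the name components of the current directory, in order.
    go parts = do
      entries <- listDirectory (foldl (</>) root parts)
      concat <$> mapM (visit parts) entries
    visit parts entry = do
      let parts' = parts ++ [entry]
          path = foldl (</>) root parts'
      isDir <- doesDirectoryExist path
      if isDir
        then go parts'
        else do
          ok <- isTZif path
          -- Zone names always use '/', whatever the platform separator is.
          pure [intercalate "/" parts' | ok]
```

Calling listZoneNames "/usr/share/zoneinfo" should skip text files such as zone.tab. Depending on the distribution, the result may also include duplicate subtrees such as posix/ and right/, and this sketch does not guard against symbolic link cycles or unreadable files, so a real implementation would need more care.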
It is not as easy on Windows. IANA distributes the time zone database in source form, along with source code for building a utility that creates binary database files from source files. Compiling and using the C utility on Windows to create the binary database files required to compile FeedPipe does not sound like a good option…
What is a good option? I like the API provided by the timezone-series package, so it might be worthwhile to create a new package that provides snapshots of the time zone database for use with that API. The package could be built on Linux, removing the headache of complex build logic on Windows.
Thinking about the design of such a package, I think it would be nice to create a package named timezone-data that does not rely on the “Olson” database file format. The package can provide a mapping from time zone name to TimeZoneSeries as a low-level API. It would be convenient to provide a high-level API that loads time zone data from the system time zone database when available and uses the snapshot as a fallback, but I think that it would be best implemented by adding a high-level API to the timezone-olson package:
getTimeZoneSeries :: FilePath  -- ^ zoneinfo database directory
                  -> TimeZoneName
                  -> IO (Maybe TimeZoneSeries)
This function differs from the existing low-level function in two ways:
- The time zone name is not part of the path. This allows the time zone name to use standard syntax (with slashes) and map to a file on the filesystem in a platform-independent manner. The TimeZoneName type (alias to String?) should probably be defined in the timezone-series package, as it is also needed in the timezone-data API.
- The return value is a Maybe type, where Nothing indicates that the requested time zone was not found, perhaps because there is no system time zone database available.
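As a rough sketch of the first point, a slash-separated zone name can be mapped to a file path with System.FilePath so that the platform's own separator is used. The names zoneFilePath and splitOnSlash are hypothetical, the TimeZoneName alias is the one floated above, and the rejection of empty, "." and ".." components is my own assumption about sensible validation rather than part of any existing API.

```haskell
import System.FilePath (joinPath, (</>))

-- Assumed alias, as suggested in the text.
type TimeZoneName = String

-- | Split a zone name on '/', keeping empty components so they can be rejected.
splitOnSlash :: String -> [String]
splitOnSlash s = case break (== '/') s of
  (part, []) -> [part]
  (part, _ : rest) -> part : splitOnSlash rest

-- | Hypothetical helper: path of the database file for a zone name, or
-- Nothing when the name has components that could escape the directory.
zoneFilePath :: FilePath -> TimeZoneName -> Maybe FilePath
zoneFilePath root name
  | all valid parts = Just (root </> joinPath parts)
  | otherwise = Nothing
  where
    parts = splitOnSlash name
    valid part = part `notElem` ["", ".", ".."]
```

A getTimeZoneSeries implementation could return Nothing both when zoneFilePath rejects the name and when the resulting file does not exist.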
With this definition, a fallback could be implemented with code like the following:
tzSeries <- maybe tzNotFound return . (<|> TZData.getTimeZoneSeries tzName)
    =<< TZOlson.getTimeZoneSeries "/usr/share/zoneinfo" tzName
In this example, tzName is the name of the time zone and tzNotFound is some action to perform when the time zone is not found. (In this simple example in the IO monad, tzNotFound could throw an exception.)
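The same fallback logic can be written as a small standalone function. This is only a sketch: lookupWithFallback is a hypothetical name, the two lookup functions are passed in as arguments because neither proposed API exists yet, and the result type is left polymorphic so that the code compiles without the timezone-series package.

```haskell
import Control.Applicative ((<|>))
import Control.Exception (throwIO)

-- Assumed alias, as suggested in the text.
type TimeZoneName = String

-- | Try the system database first, then the bundled snapshot.
lookupWithFallback
  :: (TimeZoneName -> IO (Maybe a))  -- ^ system lookup (hypothetical)
  -> (TimeZoneName -> Maybe a)       -- ^ snapshot lookup (hypothetical)
  -> TimeZoneName
  -> IO a
lookupWithFallback fromSystem fromSnapshot name = do
  mSeries <- fromSystem name
  -- (<|>) on Maybe keeps the first Just, so the system data takes priority.
  case mSeries <|> fromSnapshot name of
    Just series -> pure series
    Nothing -> throwIO . userError $ "time zone not found: " ++ name
```

Here throwing an IOError plays the role of tzNotFound; an application would more likely report a configuration error to the user.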
Plans
On one hand, I would like to go ahead and work on this because it is pretty important functionality. On the other hand, I do not want to get “sidetracked” and delay the development of FeedPipe. Perhaps I should continue with FeedPipe and just not support Windows in the first release. (It is not like I can test on Windows anyway…)
When I work on this, perhaps after the first release of FeedPipe, I plan on first developing a prototype and testing out the API. Once satisfied, I can contact the maintainers of the existing packages and present the idea to them. It would probably be worthwhile to post an RFC to the libraries mailing list. I am happy to help maintain the package, but it is definitely a package that should have multiple maintainers since it needs to be updated in a timely manner.