Is there a reputable source that provides mappings of UN/LOCODEs to Olson (IANA) time zones?
C

5

8

I've been researching CLDR and IANA in order to find a centralized mapping of UN/LOCODEs to Olson (IANA) time zones.

Ideally I would like to have for example:

+--------------+--------------------+
|un_locode     |timezone            |
+--------------+--------------------+
|USLAX         | America/Los_Angeles|
+--------------+--------------------+

for every UN/LOCODE.

Are my newbie skills failing me in understanding how to use these sources to reach my goal? (If so, please point me towards scripting that would let me automate producing these mappings.)

Or, do these sources fail to have the data correlation that I'm looking for? (If so please let me know if you have a reliable source).

Catchings answered 10/11, 2015 at 20:20 Comment(0)
B
1

I've not seen such a source. You could try to create one by mapping the lat/lon coordinates for those entries that have them, and correlating to IANA time zone by one of the methods listed here.

However, be sure to read Wikipedia's article about UN/LOCODE, especially the part describing errors with coordinates. Also note that many of the coordinates are simply not in the data. Why? I don't know.

The list of UN/LOCODEs for the US is here, and it shows Los Angeles to be US LAX (not UNLAX). Its coordinates field is blank.

If you can find some other reliable source of UN/LOCODE to lat/lon, then you are in business. A quick search found that GeoNames claims to have this in their premium data subscription, but I haven't investigated further.
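For the entries that do carry coordinates, the first step is turning the UN/LOCODE text field into decimal degrees. The following is a minimal sketch, assuming the field uses the "ddmmN dddmmW" layout (degrees and minutes plus a hemisphere letter); the function name is hypothetical and the sample values in the usage note are made up, not taken from the list. Blank fields, such as the one for Los Angeles, come back as None.

```python
import re
from typing import Optional, Tuple

# Assumed layout of the UN/LOCODE coordinates field: "ddmmN dddmmW"
# (degrees + minutes + hemisphere letter). Verify against the file you use.
_COORD_RE = re.compile(r"^(\d{2})(\d{2})([NS])\s+(\d{3})(\d{2})([EW])$")


def parse_unlocode_coords(raw: str) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) in decimal degrees, or None if blank or malformed."""
    match = _COORD_RE.match(raw.strip())
    if match is None:
        return None
    lat_deg, lat_min, ns, lon_deg, lon_min, ew = match.groups()
    lat = int(lat_deg) + int(lat_min) / 60
    lon = int(lon_deg) + int(lon_min) / 60
    # Southern and western hemispheres are negative in decimal degrees.
    if ns == "S":
        lat = -lat
    if ew == "W":
        lon = -lon
    return (lat, lon)
```

For example, parse_unlocode_coords("3400N 11800W") gives (34.0, -118.0), and parse_unlocode_coords("") gives None. The resulting pair can then be fed to whichever coordinate-to-time-zone method you choose.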

Bartie answered 11/11, 2015 at 2:21 Comment(4)
Thank you Matt. From my research and from the post you referenced, it's not recommended to resolve time zones to locations using lat/lon, because time zones are primarily governed politically, not geographically. China is a good example: the entire country is on UTC+8 by political decision, but if you were to resolve one of the westernmost Chinese cities by lat/lon, you could end up with an offset as far off as UTC+6. - Catchings
@Catchings - That depends entirely on the method you use to resolve it. If you use a simple longitude-based calculation, then you're absolutely correct. But most of the methods on the referenced post use maps with boundaries. Of course maps always carry a political context. If you disagree with those maps from a political perspective, then you can adjust them according to your views, and many solutions indeed do that. - Bartie
With regard to China, note that politically the entire country is on UTC+8 (Asia/Shanghai), but some of the solutions do indeed include Asia/Urumqi (UTC+6) in their results. See Xinjiang - Urumqi Time. In particular, solutions that use tz_world maps include this zone, but others such as Google's API do not. - Bartie
Also note that even if you were to find a list of UN/LOCODE to IANA time zone somewhere else, and even if that list were meticulously handcrafted by some official authority, it would carry the same type of political assertions. For example, Urumqi is "CN URM": does it map to Asia/Urumqi or Asia/Shanghai? The zone definition for Asia/Urumqi has the UTC+6 offset, so the mapping alone already carries political context. - Bartie
M
5

We faced the exact same problem and hence had to provide a solution.

This solution involves linking the UN/LOCODES database with a geolocation/timezone database. There are a few caveats to this approach that were captured by Matt Johnson's answer and the accompanying comments.

Namely:

  • the UN/LOCODE database of coordinates is not complete[1] and sometimes has inaccurate data[2]
  • in some cases, a one-to-one mapping between a UN/LOCODE and a time zone is impossible due to the political nature of time zones.
  • the two points above are worsened by the inaccuracy of free coordinates-to-timezone databases. It helps to get a dataset that also includes territorial waters, so that port time zones can be properly linked to the country they belong to.

The following repository https://github.com/Portchain/un_locodes_sql contains the code to extract and link the data. It outputs a SQL file that can be imported into a PostgreSQL DB. The geolocation/timezone data is based on the geo-tz[3] module which seems to source its data from timezone-boundary-builder[4].

Again, the list provided by our repository is of course incomplete and inaccurate. If you see any error in the data, please open a GitHub issue and let's build an accurate, open-source list of UN/LOCODEs, coordinates and time zone information.

Micronucleus answered 16/5, 2017 at 13:57 Comment(1)
The link to Portchain/un_locodes_sql is sadly down :/Sardonyx
M
2

The GeoNames free database of cities (which is available to download) provides: city names, latitude/longitude and, most importantly, timezone information. You can fairly quickly make your own database connecting this information with the UN/LOCODE code lists based on the name/country/coordinates.
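The name/country part of that join could look roughly like the sketch below. It assumes you have already read the GeoNames file into (country code, city name, time zone) tuples; the function names and the tiny in-memory sample are illustrative only. Names are compared after stripping diacritics and case, and when a name occurs more than once within a country the first row wins, which is exactly the situation where you would want to fall back on coordinates.

```python
import unicodedata
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, str]  # (ISO country code, normalized city name)


def normalize(name: str) -> str:
    """Strip diacritics and case so 'Málaga' and 'malaga' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold().strip()


def build_index(rows: Iterable[Tuple[str, str, str]]) -> Dict[Key, str]:
    """Index GeoNames-style rows of (country, name, timezone) by country and name."""
    index: Dict[Key, str] = {}
    for country, name, timezone in rows:
        # First occurrence wins; duplicates need a coordinate check instead.
        index.setdefault((country.upper(), normalize(name)), timezone)
    return index


def timezone_for(index: Dict[Key, str], country: str, name: str) -> Optional[str]:
    """Look up the time zone for a UN/LOCODE entry's country and place name."""
    return index.get((country.upper(), normalize(name)))


# Illustrative sample, not an extract of the real GeoNames file.
sample_index = build_index([
    ("US", "Los Angeles", "America/Los_Angeles"),
    ("ES", "Málaga", "Europe/Madrid"),
])
```

With this sample, timezone_for(sample_index, "US", "los angeles") returns "America/Los_Angeles", and an unknown place returns None so it can be queued for a manual or coordinate-based match.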

Meggie answered 30/3, 2016 at 20:1 Comment(0)
A
0

CLDR's map is here: https://unicode.org/reports/tr35/#Time_Zone_Identifiers

I saw CLDR tagged but not mentioned.

Amphidiploid answered 22/6, 2018 at 23:1 Comment(0)
C
0

You can't do this perfectly: the quality of the dataset is just too low. For example, USPTY is specified to be Port Washington in New Jersey, but that place doesn't exist. There is a Port Washington in New York and one in Wisconsin, and there is no way to know for sure which is meant.

But we can get close, despite the many wrong and missing coordinates. We can do this by using multiple data sources:

OpenStreetMap's Nominatim

This is really cool because it's clever enough to work around different spellings. For example, this query finds the correct city Troyan, even though it's spelled "Trojan" in UN/LOCODE. You are limited to 1 request per second though, so going through all of the entries takes over a full day.
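At one request per second you get at most 86,400 lookups per day, which is why a full pass takes more than a day. Below is a minimal sketch of a throttled loop, not the code used for the dataset. The helper names are hypothetical, the URL builder assumes Nominatim's public search endpoint with the q, countrycodes, format and limit parameters, and the actual HTTP call is left to a fetch function you supply (remember to send an identifying User-Agent).

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar
from urllib.parse import urlencode

T = TypeVar("T")
R = TypeVar("R")


def nominatim_url(name: str, country_code: str) -> str:
    """Build a search URL (assumed parameters of the public Nominatim API)."""
    params = {
        "q": name,
        "countrycodes": country_code.lower(),
        "format": "json",
        "limit": 1,
    }
    return "https://nominatim.openstreetmap.org/search?" + urlencode(params)


def throttled(
    items: Iterable[T],
    fetch: Callable[[T], R],
    min_interval: float = 1.0,
) -> Iterator[Tuple[T, R]]:
    """Call fetch(item) for each item, starting calls at least min_interval apart."""
    last_start = None
    for item in items:
        if last_start is not None:
            remaining = min_interval - (time.monotonic() - last_start)
            if remaining > 0:
                time.sleep(remaining)
        last_start = time.monotonic()
        yield item, fetch(item)
```

Because the generator yields results as they arrive, you can write each one to disk immediately and resume an interrupted run instead of starting over.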

Wikidata

A hidden treasure trove of data. Basically, everything that has a Wikipedia page has a Wikidata item, and that item can carry properties such as UN/LOCODE. It is just as easy to edit: with a Wikipedia account, you can add or correct properties yourself. This is especially handy for things like airports, which are otherwise often missing.

A basic SPARQL query:

SELECT ?item ?unlocode ?itemLabel ?coords
WHERE {
  ?item wdt:P1937 ?unlocode.
  ?item wdt:P625 ?coords.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

And boom, about 40,000 places with a UN/LOCODE and coordinates.
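The query returns each coordinate as a WKT literal of the form "Point(longitude latitude)", with longitude first. Here is a small sketch of turning the JSON result into a lookup table, assuming the standard SPARQL JSON results layout (results, then bindings, then a value per variable); the function names are hypothetical and the sample document in the usage note is illustrative, not real query output.

```python
from typing import Any, Dict, Tuple


def parse_wkt_point(wkt: str) -> Tuple[float, float]:
    """Convert 'Point(lon lat)' into a (lat, lon) tuple."""
    text = wkt.strip()
    if not (text.lower().startswith("point(") and text.endswith(")")):
        raise ValueError(f"not a WKT point: {wkt!r}")
    # WKT puts longitude first; unpacking fails loudly on anything else.
    lon_text, lat_text = text[len("point("):-1].split()
    return (float(lat_text), float(lon_text))


def coords_by_unlocode(sparql_json: Dict[str, Any]) -> Dict[str, Tuple[float, float]]:
    """Map each UN/LOCODE (without spaces, upper case) to its first (lat, lon)."""
    table: Dict[str, Tuple[float, float]] = {}
    for binding in sparql_json["results"]["bindings"]:
        code = binding["unlocode"]["value"].replace(" ", "").upper()
        # Keep the first coordinate if an item lists several.
        table.setdefault(code, parse_wkt_point(binding["coords"]["value"]))
    return table
```

For instance, a document with a single binding whose unlocode value is "USLAX" and whose coords value is "Point(-118.25 34.05)" yields {"USLAX": (34.05, -118.25)}.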

Combining the three

Basically, don't trust the UN/LOCODE dataset: it contains typos and other errors that can put a location off by thousands of kilometers. Only use its coordinates when the other sources can't find anything. The best approach is to manually review every entry whose coordinates differ between sources by more than 100 km and pick the best one, but that's quite a bit of work.
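The 100 km check can be automated so that only the suspicious entries need a manual look. The sketch below uses the haversine great-circle distance on a spherical Earth with a mean radius of 6371 km, which is an approximation but far more precise than a 100 km threshold requires. The function names are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical approximation

LatLon = Tuple[float, float]


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def needs_review(official: LatLon, alternative: LatLon, threshold_km: float = 100.0) -> bool:
    """True when two sources disagree by more than threshold_km."""
    return haversine_km(official, alternative) > threshold_km
```

One degree of latitude is roughly 111 km, so needs_review((0.0, 0.0), (1.0, 0.0)) is True while needs_review((0.0, 0.0), (0.5, 0.0)) is False.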

An improved dataset

Luckily, I did just that :D See https://github.com/cristan/improved-un-locodes for a dataset where 98% of unlocodes have coordinates and the data quality of the coordinates is much higher than the official dataset. You then only need to convert these coordinates to a timezone, but libraries for that exist.

Colossal answered 28/4 at 10:45 Comment(0)
