You can't: the quality of the dataset is just too low. For example, USPTY is specified as Port Washington in New Jersey, but that place doesn't exist. There is a Port Washington in New York and one in Wisconsin, and there is no way to know for sure which one is meant.
But we can get close, despite the many wrong and missing coordinates, by combining multiple data sources:
OpenStreetMap's Nominatim
This is really cool because it's clever enough to work around different spellings: a Nominatim query for "Trojan", the spelling used in UN/LOCODE, still finds the correct city Troyan. You are limited to 1 request per second though, so going through all of the entries takes over a full day.
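A minimal Python sketch of such a lookup, assuming the requests library (the endpoint and parameters are Nominatim's public search API; the User-Agent value is a hypothetical name, but the usage policy does require you to identify your application and stay under 1 request per second):

import time
import requests

def nominatim_lookup(name, country_code):
    # Query Nominatim's public search API for a place name
    response = requests.get(
        "https://nominatim.openstreetmap.org/search",
        params={"q": name, "countrycodes": country_code, "format": "json", "limit": 1},
        headers={"User-Agent": "unlocode-geocoder"},  # identify yourself per the usage policy
    )
    results = response.json()
    if results:
        return float(results[0]["lat"]), float(results[0]["lon"])
    return None  # Nominatim found nothing for this spelling

print(nominatim_lookup("Trojan", "bg"))  # still resolves to Troyan, Bulgaria
time.sleep(1)  # respect the 1 request/second limit when looping over all entries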
Wikidata
A hidden treasure trove of data. Basically, everything that has a Wikipedia page also has a Wikidata page, and those can be tagged with properties like UN/LOCODE. They are just as easy to edit: with a Wikipedia account, you can add or correct tags yourself. This is especially handy for things like airports, which are otherwise often missing.
A basic SPARQL query:
SELECT ?item ?unlocode ?itemLabel ?coords
WHERE {
  ?item wdt:P1937 ?unlocode.  # P1937 = UN/LOCODE
  ?item wdt:P625 ?coords.     # P625 = coordinate location
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
And boom: about 40,000 places with a UN/LOCODE and coordinates.
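You can also run this query from code against the Wikidata Query Service; a minimal Python sketch (the https://query.wikidata.org/sparql endpoint and its JSON result shape are standard for the service, requests is assumed, and the User-Agent is again a hypothetical name):

import requests

QUERY = """
SELECT ?item ?unlocode ?itemLabel ?coords WHERE {
  ?item wdt:P1937 ?unlocode.
  ?item wdt:P625 ?coords.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "unlocode-geocoder"},
)
for row in response.json()["results"]["bindings"]:
    # P625 coordinates come back as a WKT literal like "Point(24.71 42.89)"
    print(row["unlocode"]["value"], row["coords"]["value"])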
Combining the three
Basically, don't trust the UN/LOCODE dataset: it contains typos and errors that can put you off by thousands of kilometers. Only use its coordinates when the other sources can't find anything. Ideally, you'd manually check every entry where the sources differ by more than 100 km and mark which one is best, but that's quite a bit of work.
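A sketch of that merge logic in Python (the preference order of Wikidata over Nominatim is my own assumption, the function names are placeholders, and the 100 km threshold is the one mentioned above):

import math

def haversine_km(a, b):
    # Great-circle distance in kilometers between two (lat, lon) pairs
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def best_coordinates(wikidata, nominatim, unlocode):
    # Prefer Wikidata, then Nominatim; only fall back to the official
    # UN/LOCODE coordinates when neither of the others found anything.
    candidates = [c for c in (wikidata, nominatim, unlocode) if c is not None]
    if not candidates:
        return None, False
    best = candidates[0]
    # Flag entries where the sources disagree by more than 100 km for manual review
    needs_review = any(haversine_km(best, c) > 100 for c in candidates[1:])
    return best, needs_review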
An improved dataset
Luckily, I did just that :D See https://github.com/cristan/improved-un-locodes for a dataset where 98% of UN/LOCODEs have coordinates and where the quality of those coordinates is much higher than in the official dataset. The only step left is converting the coordinates to a timezone, and libraries for that exist.
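For example, the timezonefinder package in Python does this lookup offline; a minimal sketch (the coordinates are approximately Troyan's, from the example above):

from timezonefinder import TimezoneFinder

tf = TimezoneFinder()
# Map a coordinate pair to its IANA timezone name
print(tf.timezone_at(lat=42.89, lng=24.71))  # Europe/Sofia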