Finding the closest Apache Software Foundation mirror programmatically

For my deployment automation, I would like to programmatically determine the closest Apache Software Foundation mirror. The servers are spread across geographically distinct locations, so it would be ideal to pick the best mirror dynamically rather than hard-code that knowledge somewhere.

The only approach I have come up with so far is to scrape the http://www.apache.org/dyn/closer.cgi page for the closest mirror suggested there, but that seems cumbersome and fragile.

Is there a web API endpoint that provides this functionality in a stable and reliable way?

Fullbodied answered 3/2, 2014 at 18:24 Comment(4)
What is wrong with that approach? What other alternative would you try? Getting the entire mirror list and testing each mirror somehow? - Antoinetteanton
Nothing wrong with the approach. I was just wondering whether a web service existed that serves this information in something more palatable than HTML. The page in question doesn't even have any annotation on the <a> tag of interest (for example, a distinctive class attribute) that would make it easy to identify when scraping. - Fullbodied
You could always diff it with the raw template. - Antoinetteanton
That's actually a very smart suggestion for dealing with changes in the template, which I hadn't considered @ElliotFrisch! - Fullbodied

The mirror URLs in the page are marked up as <strong>, so you can scrape the page to get the top recommendation like this:

curl 'https://www.apache.org/dyn/closer.cgi' |
  grep -o '<strong>[^<]*</strong>' |
  sed 's/<[^>]*>//g' |
  head -1

Additionally, closer.cgi supports an ?as_json=1 query parameter that returns the same information as JSON. The result has a preferred key holding the closest mirror and an http key listing the alternatives.
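
If you would rather consume that JSON from a script than from a shell pipeline, something along these lines might work. It is only a sketch: the function names parse_mirrors and fetch_mirrors are made up for illustration, and it assumes preferred is a single URL string and http is a list of URL strings, as described above.

```python
import json
from urllib.request import urlopen

CLOSER_URL = "https://www.apache.org/dyn/closer.cgi?as_json=1"


def parse_mirrors(payload):
    """Return (preferred, alternatives) from a closer.cgi JSON document.

    Hypothetical helper: assumes "preferred" is a URL string and
    "http" is a list of alternative mirror URLs.
    """
    data = json.loads(payload)
    # Fall back to an empty list if "http" is missing (an assumption,
    # so that a response without alternatives does not raise).
    return data["preferred"], list(data.get("http", []))


def fetch_mirrors(url=CLOSER_URL, timeout=10):
    """Download the JSON document and parse it (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_mirrors(response.read().decode("utf-8"))


# Example usage (performs a real HTTP request):
#   preferred, alternatives = fetch_mirrors()
#   print(preferred)
```

Keeping the parsing separate from the download makes it possible to check the parsing against a saved response without touching the network.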

Onondaga answered 30/7, 2014 at 10:20 Comment(1)
The ?as_json=1 query parameter was actually the elegant alternative I was looking for, in order to avoid the need to do any HTML scraping. - Fullbodied

There is a more elegant way by using jq:

curl -s 'https://www.apache.org/dyn/closer.cgi?as_json=1' | jq --raw-output '.preferred'
Graphitize answered 23/9, 2016 at 22:6 Comment(0)

Here is an alternative using python:

curl -s 'https://www.apache.org/dyn/closer.cgi?as_json=1' \
| python -c "import sys, json; print(json.load(sys.stdin)['preferred'])"
Pretonic answered 9/7, 2018 at 19:20 Comment(0)
