Java: How to easily check if a URL was already shortened?
C

11

7

If I have a general url (not restricted to twitter or google) like this:

http://t.co/y4o14bI

is there an easy way to check if this url is shortened?

In the above case I, as a human, can of course see that it was shortened, but is there an automatic and elegant way?

Cabrales answered 16/11, 2011 at 11:48 Comment(0)
R
11

You could send a request to the URL, check whether you get redirected and, if so, assume it's a shortening service. For this you'd have to read the HTTP status codes.

On the other hand, you could whitelist some URL shortening services (t.co, bit.ly, and so on) and assume all links to those domains are shortened.

The drawback of the first method is that it isn't certain: some sites use redirects internally. The drawback of the second method is that you'd have to keep adding shortening services, although only a few are used widely.
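
The first method could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the class and method names are made up, the 5-second timeouts are arbitrary, and it assumes the shortener answers a HEAD request the same way it would answer a GET.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class RedirectProbe {
    /** True for the HTTP status codes that redirect the client to another location. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Sends a HEAD request without following redirects and reports whether the server redirects. */
    static boolean respondsWithRedirect(String address) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        connection.setInstanceFollowRedirects(false); // we want to see the 3xx response itself
        connection.setRequestMethod("HEAD");          // the body is not needed
        connection.setConnectTimeout(5000);           // arbitrary timeouts for the sketch
        connection.setReadTimeout(5000);
        try {
            return isRedirect(connection.getResponseCode());
        } finally {
            connection.disconnect();
        }
    }
}
```

A call such as RedirectProbe.respondsWithRedirect("http://t.co/y4o14bI") needs network access, and a true result is still only a hint, for the reason given above.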

Remedial answered 16/11, 2011 at 11:54 Comment(0)
P
2

One signal may be to request the URL and see if it results in a redirect to another domain. However, without a good definition of what "shortened" means, there is no generic way.
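
The "redirect to another domain" signal can be checked once you have the Location header of the redirect response. A minimal sketch, with hypothetical class and method names; it compares hosts literally, so www.example.com and example.com count as different hosts.

```java
import java.net.URI;

final class CrossDomainRedirect {
    /**
     * Returns true when the Location header of a redirect points at a different host
     * than the requested URL. A relative Location stays on the same host.
     */
    static boolean leavesHost(String requestedUrl, String locationHeader) {
        URI requested = URI.create(requestedUrl);
        URI target = requested.resolve(locationHeader); // also handles relative Location values
        String from = requested.getHost();
        String to = target.getHost();
        if (from == null || to == null) {
            return false; // host unknown, so treat it as "no signal"
        }
        return !from.equalsIgnoreCase(to);
    }
}
```

For example, leavesHost("http://t.co/y4o14bI", "https://example.com/some/long/article") reports a cross-domain redirect, while a Location of "/login" does not.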

Punctuation answered 16/11, 2011 at 11:55 Comment(0)
K
1

If you know all the domains that can be used to shorten your URLs, check whether the URL starts with one of them:

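// Known shortener domains; extend the list as needed.
String[] domains = {"bit.ly", "t.co" /* ... */};
for (String domain : domains) {
  // The trailing "/" keeps look-alike hosts such as bit.lyingbastard.com from matching.
  if (url.startsWith("http://" + domain + "/")) {
    return true;
  }
}
return false;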
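
A variant of the same idea compares the parsed host instead of a string prefix, so the scheme (http or https) and look-alike hosts do not matter. This is a sketch only: the class and method names are made up for illustration, the host list is just a sample, and Set.of needs Java 9 or later.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class KnownShorteners {
    // Sample list only; a real list would have to be maintained.
    private static final Set<String> HOSTS = Set.of("t.co", "bit.ly");

    /** Returns true when the URL's host is exactly one of the known shortener hosts. */
    static boolean isKnownShortener(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // not a parseable URI
        }
        if (host == null) {
            return false;
        }
        host = host.toLowerCase(Locale.ROOT);
        if (host.startsWith("www.")) {
            host = host.substring(4); // treat www.bit.ly like bit.ly
        }
        return HOSTS.contains(host);
    }
}
```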
Kwarteng answered 16/11, 2011 at 11:53 Comment(1)
The downvote seems a bit harsh to me. Whitelisting known URL shortening services doesn't seem like a bad idea (as hinted at in the most upvoted answer here). Simply editing to + domain + "/" would make the bit.lyingbastard.com "attack" fail ; ) No need to downvote that aggressively I think : ) Catalyze
A
1

You can't.

You can only keep a list of known shorteners and check whether the URL starts with one of them.

You can also try checking whether the URL is shorter than a given length (and contains a path/query string), but some shorteners (tinyurl for example) may have longer URLs than normal sites (aol.com).

I would prefer the list of known shorteners.

Arsenide answered 16/11, 2011 at 11:53 Comment(0)
S
1

You can't: You will have to work by assumption.

Assumption:

  • Does www appear in the URL?
  • Does the server name end with a valid top-level domain (e.g. com, edu, etc.), or does it have the form co.xx, where xx is a valid country or organization code?

You can add more assumptions based on other URL shortening links.

Speight answered 16/11, 2011 at 11:56 Comment(0)
A
1

Here's what you could do in Java, Groovy and the like.

  • Get the url you want to test;
  • Open the url with HttpURLConnection
  • Check the response code
  • If it is a valid code, 200 for example, then you can retrieve the URL string from the connection object: in long form if it was shortened, or in its original form if it wasn't.

We all love to see some code, don't we? It's crude, but hey!

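import java.net.HttpURLConnection;
import java.net.URL;

String addr = "http://t.co/y4o14bI";
URL url = new URL(addr);

// Redirects are followed automatically by default (but not across protocols, e.g. http to https).
HttpURLConnection connection = (HttpURLConnection) url.openConnection();

if (connection.getResponseCode() == 200) {
    // Once the response has been read, getURL() returns the URL that was finally reached.
    String longUrl = connection.getURL().toString();
    System.out.println(longUrl);
} else {
    // You decide what you want to do here!
}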
Altdorfer answered 16/11, 2011 at 12:31 Comment(0)
W
0

Actually, you, as a human, can't. The only way you know that it's shortened is that it's a t.co domain. The y4o14bI could be a CMS identifier for all you know.

The best way would be to use a list of known shortener URLs and look the URL up against that.

And even then you would have problems. I use bit.ly with a personal domain, wtn.gd.

So http://wtn.gd/random would also be a shortened URL.

You could maybe do an HTTP HEAD request and check for a 301/302?

Weathers answered 16/11, 2011 at 11:53 Comment(0)
C
0

If you request a URL like this, your HttpClient should receive an HTTP redirect instead of an HTML page. This wouldn't be proof, but at least a hint.

Confucius answered 16/11, 2011 at 11:54 Comment(0)
A
0

Evaluate the URL and look for some clues:

  • the Path meets certain criteria

    • only has one step (i.e. not multiple slashes)
    • does not end with filename extensions
    • not longer than X characters (would need to evaluate various URL shortening services and adjust the upper bounds for the max token length)
  • HttpURLConnection returns a redirect response code (i.e. 301, 302)
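
The path criteria above might be combined like this. It is a rough sketch rather than a tested heuristic: the names are invented, and the bound of 10 characters is a placeholder for the X that would have to be determined by surveying real shortening services.

```java
import java.net.URI;

final class ShortUrlHeuristics {
    // Assumed upper bound for a shortener token; adjust after looking at real services.
    static final int MAX_TOKEN_LENGTH = 10;

    /** Applies the path-only clues: one step, no filename extension, short token. */
    static boolean looksShortened(String url) {
        String path;
        try {
            path = URI.create(url).getPath();
        } catch (IllegalArgumentException e) {
            return false; // not a parseable URI
        }
        if (path == null || path.length() < 2 || !path.startsWith("/")) {
            return false; // no path token at all
        }
        String token = path.substring(1);
        if (token.contains("/")) {
            return false; // more than one step
        }
        if (token.contains(".")) {
            return false; // looks like a filename extension
        }
        return token.length() <= MAX_TOKEN_LENGTH;
    }
}
```

Note that an ordinary page such as http://example.com/about also passes these checks, which is why the redirect clue in the last bullet is worth combining with them.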

Ables answered 16/11, 2011 at 12:6 Comment(0)
L
0

I would suggest using android.util.Patterns.WEB_URL

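import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;

// Collects every URL that the Android WEB_URL pattern finds in the input text.
public static List<String> findUrls(String input) {
    List<String> links = new ArrayList<>();

    Matcher m = android.util.Patterns.WEB_URL.matcher(input);
    while (m.find()) {
        String url = m.group();
        links.add(url);
    }
    return links;
}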
Linares answered 1/5, 2020 at 16:3 Comment(0)
W
0

Use a URL unshortening service such as https://unshorten.me

They have an API as well: https://unshorten.me/api

If the URL is shortened, it will return the original URL. If not, you will get the same one back.

Warfarin answered 25/8, 2021 at 10:23 Comment(0)
