Python: Convert TinyURL-style short URLs (bit.ly, tinyurl, ow.ly) to full URLs

I am just learning Python and am interested in how this can be accomplished. While searching for an answer, I came across this service: http://www.longurlplease.com

For example:

http://bit.ly/rgCbf can be converted to:

http://webdesignledger.com/freebies/the-best-social-media-icons-all-in-one-place

I did some inspecting with Firefox and saw that the original URL is not in the headers.

Snotty answered 14/4, 2009 at 16:14 Comment(0)

Enter urllib2, which offers the easiest way of doing this:

>>> import urllib2
>>> fp = urllib2.urlopen('http://bit.ly/rgCbf')
>>> fp.geturl()
'http://webdesignledger.com/freebies/the-best-social-media-icons-all-in-one-place'
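
For readers on Python 3, where urllib2 was folded into urllib.request, the same idea can be sketched roughly as follows. The helper name expand_url and its timeout parameter are assumptions made for illustration, not part of the original answer. The first request is sent as HEAD; depending on the Python version, the follow-up request issued after a redirect may be sent as GET instead.

```python
import urllib.request


def expand_url(short_url, timeout=10):
    """Follow redirects starting at short_url and return the final URL.

    Hypothetical helper for illustration only. The initial request uses
    the HEAD method; urllib's redirect handler decides how the follow-up
    requests are issued.
    """
    request = urllib.request.Request(short_url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # geturl() reports the URL that was actually retrieved,
        # i.e. the address reached after any redirects were followed.
        return response.geturl()
```

Calling expand_url('http://bit.ly/rgCbf') would be expected to return the long URL shown above, provided the short link still resolves.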

For reference's sake, however, note that this is also possible with httplib:

>>> import httplib
>>> conn = httplib.HTTPConnection('bit.ly')
>>> conn.request('HEAD', '/rgCbf')
>>> response = conn.getresponse()
>>> response.getheader('location')
'http://webdesignledger.com/freebies/the-best-social-media-icons-all-in-one-place'
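
On Python 3, httplib is named http.client. A minimal sketch of the same single HEAD request is below; the function name redirect_target and the timeout parameter are assumptions for illustration. Unlike the urllib2 approach, this looks at one hop only: it returns whatever the Location header says (or None if there is no redirect) and does not follow a chain of redirects.

```python
import http.client
from urllib.parse import urlsplit


def redirect_target(short_url, timeout=10):
    """Send a single HEAD request and return the Location header, or None.

    Hypothetical helper for illustration only. It does not follow
    redirect chains and returns the header value unmodified.
    """
    parts = urlsplit(short_url)
    # Pick the connection class that matches the URL scheme.
    if parts.scheme == "https":
        connection_class = http.client.HTTPSConnection
    else:
        connection_class = http.client.HTTPConnection
    conn = connection_class(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target from the path and query string.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.getheader("Location")
    finally:
        conn.close()
```

If the shortener redirects through several hops, the function could be called again on the returned value until it yields None.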

And with PycURL, although I'm not sure this is the best way to do it with that library:

>>> import pycurl
>>> conn = pycurl.Curl()
>>> conn.setopt(pycurl.URL, "http://bit.ly/rgCbf")
>>> conn.setopt(pycurl.FOLLOWLOCATION, 1)
>>> conn.setopt(pycurl.CUSTOMREQUEST, 'HEAD')
>>> conn.setopt(pycurl.NOBODY, True)
>>> conn.perform()
>>> conn.getinfo(pycurl.EFFECTIVE_URL)
'http://webdesignledger.com/freebies/the-best-social-media-icons-all-in-one-place'
Bewley answered 14/4, 2009 at 16:17 Comment(5)
It's a better idea to use a HEAD request instead of a GET to avoid transferring the content of the page. urllib and curl can do HEAD, although httplib does not, I believe. - Caresse
Updated, httplib didn't complain about the HEAD... that's what she said. - Bewley
Just a tad confused: in the first example using urllib2, is it making a HEAD request or a GET (in reference to adam's post)? I ask because I see the reference to HEAD in the httplib and PycURL examples. - Snotty
From my research, I don't think urllib2 supports HEAD requests. Everything I found suggested using httplib if you just need the HEAD. - Bewley
Reading through the urllib2 documentation, I get the impression that nothing is actually downloaded until you call read() on the connection, but I'm honestly not sure if this is true... - Bewley
