Python error when using urllib.urlopen

3

25

When I run this:

import urllib

feed = urllib.urlopen("http://www.yahoo.com")

print feed

I get this output in the interactive window (PythonWin):

<addinfourl at 48213968 whose fp = <socket._fileobject object at 0x02E14070>>

I'm expecting to get the source of the above URL. I know this has worked on other computers (like the ones at school) but this is on my laptop and I'm not sure what the problem is here. Also, I don't understand this error at all. What does it mean? Addinfourl? fp? Please help.

Friesen answered 1/3, 2009 at 19:56 Comment(1)
55

Try this:

print feed.read()

See Python docs here.

Sociometry answered 1/3, 2009 at 20:0 Comment(7)
Thanks! That's very helpful! I'm one step closer to finishing this program! The link to the docs is very helpful too! Any idea about the error? Just wondering... trying to gain knowledge about these things. - Friesen
addinfourl is not an error; it's an object. You haven't done anything wrong. Just replace "print feed" with "print feed.read()" and you have your HTML. - Sociometry
OK, thanks. I'll read up on that some. Just don't quite understand why I got that. Thanks again! - Friesen
Think of it this way: some variables, like numbers, strings, and lists, have simple ways of being displayed to the user. But what do you expect to see when you print, say, a file object? It just prints <open file "foo.htm", mode "w" at 0x090232...> to let you know "hey, I'm a file object". - Sociometry
Ok, that makes sense. Thanks for taking the time. This is the first step in a larger project to take the data from TourFilter Dallas (specifically tourfilter.com/dallas/rss/by_concert_date), parse for a band, and geocode that band's events on an ArcGIS map. Thanks for the help! - Friesen
Have a look at wwwsearch.sourceforge.net/mechanize, wiki.python.org/moin/RssLibraries and crummy.com/software/BeautifulSoup to help with the parsing. - Sociometry
Thanks so much. You've been a huge help. I'll let you know how it turns out. I've been thinking of trying to make a google mashup of this same concept. We'll see! - Friesen
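
To illustrate the point made in the comments above about how objects print, here is a minimal sketch. It is written for Python 3, and the class names Plain and Described are made up for this example; they are not part of urllib. An object with no display logic of its own falls back to a generic "<... object at 0x...>" description, which is the same kind of text the question shows for addinfourl.

```python
class Plain:
    """Hypothetical class with no custom display logic."""
    pass


class Described:
    """Hypothetical class that defines how it should be displayed."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # This string is what print() shows for the object.
        return "<Described name=%r>" % self.name


plain_text = repr(Plain())
described_text = repr(Described("foo.htm"))

print(plain_text)      # generic description that includes a memory address
print(described_text)  # the custom description defined in __repr__
```

Printing either object never shows the data it wraps, only a description of the object itself.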
17

urllib.urlopen returns a file-like object, so to retrieve the contents you need to call read() on it:

import urllib

feed = urllib.urlopen("http://www.yahoo.com")

print feed.read()
Liripipe answered 1/3, 2009 at 20:0 Comment(1)
Thanks! That's very helpful! I'm one step closer to finishing this program! - Friesen
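
What "file-like object" means in practice can be tried without a network connection. The sketch below is written for Python 3 and uses io.BytesIO as a stand-in for the object urlopen returns; the HTML content is invented for the example. It shows only the general read() behaviour of file-like objects, not urllib itself.

```python
import io

# Stand-in for a response object; the bytes below are made-up sample content.
fake_response = io.BytesIO(b"<html><body>Hello</body></html>")

summary = repr(fake_response)     # describes the object, not its contents
body = fake_response.read()       # the actual contents
leftover = fake_response.read()   # the stream is already consumed, so this is empty

print(summary)
print(body)
print(leftover)
```

Note that a second read() returns nothing, so store the result of the first call in a variable if you need it more than once.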
7

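In Python 3.0:

import urllib.request

url = "http://www.yahoo.com"  # example URL, taken from the question

fh = urllib.request.urlopen(url)
# read() returns bytes in Python 3; decode them to get a str.
# iso-8859-1 is the encoding this answer assumes; the page may declare another.
html = fh.read().decode("iso-8859-1")
fh.close()

print(html)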
Pluperfect answered 1/3, 2009 at 22:11 Comment(1)
Thanks, the decode("iso-8859-1") was the critical step that put an end to the "Type str doesn't support the buffer API" error I was seeing! - Lighterage
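
The error mentioned in the comment above most likely comes from mixing bytes and str in Python 3, where read() returns bytes. The following is a small sketch of that situation, with made-up sample data and no network access. The exact wording of the TypeError message differs between Python 3 versions, so the code only captures it without relying on the text.

```python
# Sample bytes standing in for what read() returns; the content is invented.
raw = b"caf\xe9 menu"

error_message = None
try:
    # Searching for a str inside bytes is not allowed in Python 3.
    "caf" in raw
except TypeError as exc:
    error_message = str(exc)

# Decoding first turns the bytes into a str, and str operations then work.
text = raw.decode("iso-8859-1")
found = "caf" in text

print(error_message)
print(text)
print(found)
```

iso-8859-1 is used here only because the answer uses it; a real page may be served in a different encoding such as UTF-8.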
