urllib hangs when opening a URL (Python 3)
I am trying to open a URL with Python 3:

import urllib.request
fp = urllib.request.urlopen("http://lebed.com/")

mybytes = fp.read()    
mystr = mybytes.decode("utf8")
fp.close()

print(mystr)

But it hangs on the second line. What causes this, and how can I fix it?

Sanction answered 19/8, 2017 at 6:28 Comment(0)

I suppose the reason is that the site does not respond to requests that look like they come from a robot. You need to imitate a browser visit by sending browser headers along with your request:

import urllib.request
url = "http://lebed.com/"
req = urllib.request.Request(
    url, 
    data=None, 
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)
f = urllib.request.urlopen(req)

I tried this on my system and it works.
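
Since the original symptom was a hang, it may also help to pass a timeout so that the call fails instead of blocking indefinitely. The following is a minimal sketch, not a definitive fix: the helper names `build_request` and `fetch` and the 10-second default are assumptions made for illustration, and the User-Agent string is the one used above.

```python
import socket
import urllib.error
import urllib.request

# Same browser-like User-Agent string as in the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Hypothetical helper: build a GET request that carries a browser User-Agent."""
    return urllib.request.Request(url, data=None, headers={"User-Agent": user_agent})


def fetch(url, timeout=10.0):
    """Hypothetical helper: return the page text, or None if the request fails or times out."""
    req = build_request(url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # Use the charset declared by the server, falling back to UTF-8.
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except (urllib.error.URLError, socket.timeout) as exc:
        print(f"request failed: {exc}")
        return None
```

Calling `fetch("http://lebed.com/")` would then either return the page text or print the error and return `None` once the timeout expires, rather than hanging.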

Noddy answered 19/8, 2017 at 6:48 Comment(0)

I agree with Arpit Solanki. Below are the captured request headers for a failed request versus a successful one.

Failed
    GET / HTTP/1.1
    Accept-Encoding: identity
    Host: www.lebed.com
    Connection: close
    User-Agent: Python-urllib/3.5

Success
    GET / HTTP/1.1
    Accept-Encoding: identity
    Host: www.lebed.com
    Connection: close
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36
Yonina answered 19/8, 2017 at 7:9 Comment(2)
How did you get this output? Sanction
This is the output of a packet capture. I am on a Linux box, so the command I ran was: tcpdump -nn -vv -i eth0. The same can be obtained from Wireshark or another packet capture utility. Yonina
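
A packet capture is not the only way to see what urllib sends. As a rough sketch, the default headers can be read from an opener object, and `HTTPHandler(debuglevel=1)` makes the standard library print the outgoing request when a URL is opened. The variable names below are illustrative only.

```python
import sys
import urllib.request

# A default opener carries the headers urllib adds to every request.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)  # includes a 'User-agent' entry of the form 'Python-urllib/<major>.<minor>'

# An opener whose HTTP handler has debugging enabled; calling
# debug_opener.open(url) would print the request line and headers it sends.
debug_opener = urllib.request.build_opener(urllib.request.HTTPHandler(debuglevel=1))
```

The default `User-agent` value is what appears in the failed request above (`Python-urllib/3.5` on Python 3.5).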