In Python 3.2, I can open and read an HTTPS web page with http.client, but urllib.request is failing to open the same page

I want to open and read https://yande.re/ with urllib.request, but I'm getting an SSL error. I can open and read the page just fine using http.client with this code:

import http.client

conn = http.client.HTTPSConnection('www.yande.re')
conn.request('GET', 'https://yande.re/')
resp = conn.getresponse()
data = resp.read()

However, the following code using urllib.request fails:

import urllib.request

opener = urllib.request.build_opener()
resp = opener.open('https://yande.re/')
data = resp.read()

It gives me the following error: ssl.SSLError: [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list. Why can I open the page with HTTPSConnection but not opener.open?

Edit: Here's my OpenSSL version and the traceback from trying to open https://yande.re/

>>> import ssl; ssl.OPENSSL_VERSION
'OpenSSL 1.0.0a 1 Jun 2010'
>>> import urllib.request
>>> urllib.request.urlopen('https://yande.re/')
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    urllib.request.urlopen('https://yande.re/')
  File "C:\Python32\lib\urllib\request.py", line 138, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python32\lib\urllib\request.py", line 369, in open
    response = self._open(req, data)
  File "C:\Python32\lib\urllib\request.py", line 387, in _open
    '_open', req)
  File "C:\Python32\lib\urllib\request.py", line 347, in _call_chain
    result = func(*args)
  File "C:\Python32\lib\urllib\request.py", line 1171, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python32\lib\urllib\request.py", line 1138, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 1] _ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list>
>>> 
Rafaelrafaela answered 21/5, 2012 at 1:43 Comment(3)
Can you paste the output of import ssl; ssl.OPENSSL_VERSION, and the result of urllib.request.urlopen('https://yande.re/')? - Bencher
FWIW, probably a data point for debugging: the equivalent Python 2.7.x code (shown here) works fine: import urllib2; req = urllib2.Request('yande.re'); resp = urllib2.urlopen(req); resp.read() - Fining
The code for http.client is incorrect. You might mean: conn.request('GET', '/') - Mustang
P
2

What a coincidence! I'm having the same problem as you are, with an added complication: I'm behind a proxy. I found this bug report regarding https-not-working-with-urllib. Luckily, they posted a workaround.

import urllib.request
import ssl

# Force an SSLv3 handshake.
https_sslv3_handler = urllib.request.HTTPSHandler(
    context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)

## If you're behind a proxy, uncomment this block; it replaces the opener above.
## The HTTPS port is 443, but that didn't work for me, so I used port 80 instead.
##proxy_auth = '{0}://{1}:{2}@{3}'.format('https', 'username', 'password',
##                                        'proxy:80')
##proxies = {'https': proxy_auth}
##proxy = urllib.request.ProxyHandler(proxies)
##proxy_auth_handler = urllib.request.HTTPBasicAuthHandler()
##opener = urllib.request.build_opener(proxy, proxy_auth_handler,
##                                     https_sslv3_handler)

urllib.request.install_opener(opener)
resp = opener.open('https://yande.re/')
data = resp.read().decode('utf-8')
print(data)
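
On recent Python versions, ssl.PROTOCOL_SSLv3 may be unavailable or refused by the server, so pinning SSLv3 might not be an option. A rough sketch of the same idea (choosing the protocol version through a custom context) is below. It assumes Python 3.7 or later; the function name build_opener_with_min_tls is made up for illustration, and TLS 1.2 is only an example floor, not something taken from the bug report.

```python
import ssl
import urllib.request


def build_opener_with_min_tls(min_version=ssl.TLSVersion.TLSv1_2):
    """Hypothetical helper: opener whose HTTPS handler enforces a minimum TLS version."""
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than min_version.
    context.minimum_version = min_version
    handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(handler), context


opener, context = build_opener_with_min_tls()
# resp = opener.open('https://yande.re/')  # network call, left commented out
```

The opener is used exactly like the one above; only the context passed to HTTPSHandler differs.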

Btw, thanks for showing how to use http.client. I didn't know that there's another library that can be used to connect to the internet. ;)

Passivism answered 4/12, 2012 at 5:25 Comment(2)
Thank you very much, this actually helped me with a slightly different urllib problem. - Geiger
This code snippet doesn't work for me; I end up with a handshake failure: "ssl.SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:748)" - Showing
R
2

This is due to a bug in the early 1.x OpenSSL implementation of elliptic curve cryptography. Take a closer look at the relevant part of the exception:

_ssl.c:392: error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list

This error comes from the underlying OpenSSL library code, which mishandles the EC point format TLS extension. One workaround is to use the SSLv3 method instead of SSLv23. Another is to use a cipher suite specification that disables all ECC cipher suites (I had good results with ALL:-ECDH; use openssl ciphers for testing). The real fix is to update OpenSSL.
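
A minimal sketch of the cipher suite workaround with urllib.request, assuming a current Python 3 (ssl.create_default_context is not available in 3.2, where ssl.SSLContext(ssl.PROTOCOL_SSLv23) would be the starting point). The helper name build_opener_without_ecdh is hypothetical; the cipher string is the one mentioned above.

```python
import ssl
import urllib.request


def build_opener_without_ecdh(cipher_spec='ALL:-ECDH'):
    """Hypothetical helper: opener whose HTTPS handler excludes ECDH cipher suites."""
    context = ssl.create_default_context()
    # Apply the OpenSSL cipher string; raises ssl.SSLError if nothing matches.
    context.set_ciphers(cipher_spec)
    handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(handler), context


opener, context = build_opener_without_ecdh()
# resp = opener.open('https://yande.re/')  # network call, left commented out
```

Note that set_ciphers does not affect TLS 1.3 cipher suites, so this only restricts what is offered for TLS 1.2 and earlier.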

Resultant answered 20/12, 2012 at 16:30 Comment(1)
Could you please specify how to use a cipher suite specification in the asker's example? - Noseband
L
1

The problem is due to the hostnames that you're giving in the two examples:

import http.client
conn = http.client.HTTPSConnection('www.yande.re')
conn.request('GET', 'https://yande.re/')

and...

import urllib.request
urllib.request.urlopen('https://yande.re/')

Note that in the first example you're asking the client to connect to the host www.yande.re, whereas in the second example urllib first parses the URL 'https://yande.re/' and then sends the request to the host yande.re.
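
To see which host each URL implies, here is a small sketch using urllib.parse.urlsplit. The helper name host_and_path is made up for illustration; it only shows how a URL splits into the host to connect to and the path to request.

```python
from urllib.parse import urlsplit


def host_and_path(url):
    """Split a URL into the host to connect to and the path to request."""
    parts = urlsplit(url)
    # An empty path means the root of the site.
    return parts.hostname, parts.path or '/'


print(host_and_path('https://yande.re/'))      # ('yande.re', '/')
print(host_and_path('https://www.yande.re/'))  # ('www.yande.re', '/')
```

With http.client, the host goes to HTTPSConnection and only the path goes to conn.request, so the two pieces can be compared directly against what urllib derives from the URL.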

Although www.yande.re and yande.re may resolve to the same IP address, from the web server's perspective they are different virtual hosts. My guess is that you had an SNI configuration problem on the web server's side. Seeing as the original question was posted on May 21 and the current cert at yande.re starts May 28, I'm thinking you've already fixed this problem?

Lail answered 30/5, 2012 at 18:47 Comment(0)
R
-1

Try this:

import urllib.request
import urllib.error

url = 'http://www.google.com/'

try:
    # Fetch the page body as bytes.
    webpage = urllib.request.urlopen(url).read()
except urllib.error.URLError:
    # Fall back to a placeholder when the page cannot be opened.
    webpage = b'This webpage is not available!'
print(webpage)
Removable answered 22/8, 2012 at 15:15 Comment(0)
