Feedparser.parse() 'SSL: CERTIFICATE_VERIFY_FAILED'

3

18

I'm getting an SSL error when feedparser parses an HTTPS RSS feed. I don't know what to do, as I can't find any documentation on this error in relation to feedparser:

>>> import feedparser
>>> feed = feedparser.parse(rss)
>>> feed
{'feed': {}, 'bozo': 1, 'bozo_exception': URLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)'),), 'entries': []}
>>> feed["items"]
[]
>>> 
Thermograph answered 2/2, 2015 at 16:59 Comment(6)
Do you have a capture of the SSL handshake? - Maryn
Is this what you need? i.imgur.com/1rYydb6.png - Thermograph
OK, it seems your client is rejecting the server's certificate with an "unknown certificate authority" error. What versions of Python and feedparser do you have? And did you create a self-signed certificate? - Maryn
Python 2.7.9 and feedparser 5.1.3. I didn't create any certificates; this is straight from installing Python and feedparser. I should also note that this worked in Python 3 previously, but I can no longer use Python 3 for this project. - Thermograph
Not sure if this will help, but here is a link describing what I think the problem is; it also discusses a possible workaround: linux.debian.bugs.dist.narkive.com/Fa81q1tS/…. Generally speaking, SSL clients usually have an option to ignore certificate verification. - Maryn
I think Python has backported the default certificate check to 2.7.9 too, as stated here: bugs.python.org/issue22417 - Maryn
T
33

Thank you, cmidi, for the answer, which was to monkey-patch the ssl module with ssl._create_default_https_context = ssl._create_unverified_context:

import feedparser
import ssl

# Disables HTTPS certificate verification for the whole process.
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context
feed = feedparser.parse(rss)  # rss is the feed URL; this now works
Thermograph answered 3/2, 2015 at 10:1 Comment(0)
S
5

This happens because Python started verifying certificates by default in the stdlib HTTP clients (PEP 476, shipped in Python 2.7.9 and 3.4.3).

A great explanation of the rationale for the change can be found in this Red Hat article, which also covers how to control and troubleshoot the new behavior.

Both previous references explain how to avoid certificate verification in single connections (which is not a solution for feedparser users):

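import ssl
import urllib  # Python 2.7.9+; on Python 3 use urllib.request.urlopen instead

# This restores the same behavior as before, for this one connection only.
context = ssl._create_unverified_context()
urllib.urlopen("https://no-valid-cert", context=context)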
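One way to keep the per-connection approach while still using feedparser is to download the feed yourself and hand the bytes to the parser. The following is a minimal Python 3 sketch under that assumption; `make_context` and `fetch_feed` are hypothetical helper names, not part of feedparser or the stdlib.

```python
import ssl
import urllib.request


def make_context(verify=True, cafile=None):
    """Build an SSL context for a single request.

    verify=True keeps certificate and hostname checks on (optionally
    trusting an extra CA bundle via cafile); verify=False turns them off.
    """
    if verify:
        return ssl.create_default_context(cafile=cafile)
    # Private stdlib helper, the same one used by the monkeypatch approach.
    return ssl._create_unverified_context()


def fetch_feed(url, context, timeout=10):
    """Download the raw feed body using the given SSL context."""
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.read()
```

The result can then be parsed with `feedparser.parse(fetch_feed(url, make_context()))`, since `feedparser.parse` also accepts the feed content itself rather than a URL. Passing `cafile` pointing at the server's CA bundle is usually preferable to `verify=False`.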

Currently, feedparser users can only avoid certificate verification by monkeypatching, which is highly discouraged as it affects the whole application.

The code to change the behavior application-wide would be as follows (code taken from PEP-476):

import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Legacy Python that doesn't verify HTTPS certificates by default
    pass
else:
    # Handle target environment that doesn't support HTTPS verification
    ssl._create_default_https_context = _create_unverified_https_context
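If the monkeypatch cannot be avoided, its reach can at least be limited in time. The sketch below wraps the same assignment in a context manager that restores the original factory afterwards; `unverified_https` is a hypothetical name, and the sketch assumes the feed is fetched through the stdlib HTTPS client, which is what the monkeypatch above relies on. The change is still process-wide while the block runs, so it is not safe to use alongside other threads making HTTPS requests.

```python
import contextlib
import ssl


@contextlib.contextmanager
def unverified_https():
    """Temporarily make stdlib HTTPS clients skip certificate verification."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always restore the previous behavior, even if parsing raised.
        ssl._create_default_https_context = original
```

Usage would look like `with unverified_https(): feed = feedparser.parse(rss)`, leaving verification enabled for the rest of the application.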

There is an issue on the feedparser tracker about this, titled "How to fix SSL: CERTIFICATE_VERIFY_FAILED?".

Spoke answered 24/8, 2019 at 12:10 Comment(1)
Thanks for this. That article explains the big picture of why SSL certificate checking is happening.Astrobiology
J
1

Make sure the ca-certificates package is installed.

I ran into this issue when using feedparser in a Docker container that lacked the package; simply installing it solved my problem.
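A quick way to check whether Python can find a CA bundle inside a container is to ask the ssl module where it looks. This is a rough diagnostic sketch; `describe_ca_sources` is a hypothetical helper name, and the paths it reports depend on how OpenSSL was built for that image.

```python
import os
import ssl


def describe_ca_sources():
    """Report where OpenSSL looks for CA certificates and whether they exist."""
    paths = ssl.get_default_verify_paths()
    return {
        # cafile/capath are None when the default location does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_cafile_exists": os.path.isfile(paths.openssl_cafile),
        "openssl_capath": paths.openssl_capath,
        "openssl_capath_exists": os.path.isdir(paths.openssl_capath),
    }


if __name__ == "__main__":
    for key, value in describe_ca_sources().items():
        print(f"{key}: {value}")
```

If both `cafile` and `capath` come back as `None`, the image most likely has no CA bundle installed, which matches the situation described above.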

Julie answered 29/11, 2020 at 4:29 Comment(0)
