requests-html "RuntimeError: There is no current event loop in thread 'Thread-1'" when using it on a Flask endpoint

I have a simple Flask API with one endpoint that calls a method in another file to render some JavaScript from a site using requests-html:

@app.route('/renderJavascript')
def get_attributes():
    return get_item_attributes('https://www.site.com.mx/site.html')

The code of the method looks like this:

from requests_html import HTMLSession
from bs4 import BeautifulSoup

def get_item_attributes(url):
    #Connecting to site.
    session = HTMLSession()
    resp = session.get(url)
    resp.html.render()
    resp.session.close()
    soup = BeautifulSoup(resp.html.html,'lxml')

    ................................
    #Rest of the code is handling the data with bs4 and returning a json.

After calling the endpoint I receive this error:

Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\flask\app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Python37\lib\site-packages\flask\app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Python37\lib\site-packages\flask\app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Python37\lib\site-packages\flask\_compat.py", line 39, in reraise
    raise value
  File "C:\Python37\lib\site-packages\flask\app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Python37\lib\site-packages\flask\app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "API.py", line 35, in get_attributes
    return get_item_attributes('https://www.shein.com.mx/Floral-Print-Raglan-Sleeve-Curved-Hem-Tee-p-858258-cat-1738.html')
  File "C:\Users\xChapx\Desktop\Deving\API\request.py", line 25, in get_item_attributes
    resp.html.render()
  File "C:\Python37\lib\site-packages\requests_html.py", line 586, in render
    self.browser = self.session.browser  # Automatically create a event loop and browser
  File "C:\Python37\lib\site-packages\requests_html.py", line 727, in browser
    self.loop = asyncio.get_event_loop()
  File "C:\Python37\lib\asyncio\events.py", line 644, in get_event_loop
    % threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'Thread-1'.

I read online that HTMLSession doesn't work correctly if it is used outside of the main thread. Since Flask handles the request on a thread of its own, maybe that is what is causing the error.
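
The thread hypothesis can be checked without Flask or requests-html. Below is a minimal sketch using only the standard library; the function names and the `results` dict are made up for illustration. It calls `asyncio.get_event_loop()` (the call at the bottom of the traceback) from a worker thread, first with no loop set and then after creating one explicitly for that thread.

```python
import asyncio
import threading

results = {}  # hypothetical container, only used to collect the outcomes


def without_loop():
    # A freshly started non-main thread has no event loop set,
    # so get_event_loop() is expected to raise RuntimeError here.
    try:
        asyncio.get_event_loop()
        results["without_loop"] = "ok"
    except RuntimeError as exc:
        results["without_loop"] = f"RuntimeError: {exc}"


def with_loop():
    # Creating a loop and registering it for this thread makes
    # get_event_loop() return that loop instead of raising.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        results["with_loop"] = asyncio.get_event_loop() is loop
    finally:
        asyncio.set_event_loop(None)
        loop.close()


for target in (without_loop, with_loop):
    worker = threading.Thread(target=target)
    worker.start()
    worker.join()

print(results)
```

If the first entry reports a RuntimeError and the second is True, the error is about which thread the call runs in rather than about the site being rendered.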

Overture answered 28/10, 2019 at 2:46 Comment(3)
Hi, did you find a solution to this? I am also facing the exact same issue. - Crepe
@ShubhamNaik Not for the moment. I started using selenium and headless chrome to render the JavaScript so I can continue working. - Overture
Same issue here. I suspect it might be due to the Python version, 3.7. - Selfwill

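As far as I can tell, the failure comes from pyppeteer: by default it tries to install SIGINT, SIGTERM and SIGHUP handlers when it launches the browser, and Python only allows that from the main thread, while the Flask view runs in a worker thread. This workaround launches the browser with those handlers disabled, and uses AsyncHTMLSession inside an async view so that an event loop is available: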

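    import asyncio

    import pyppeteer
    from asgiref.wsgi import WsgiToAsgi  # assumed source of WsgiToAsgi
    from bs4 import BeautifulSoup
    from flask import Flask
    from hypercorn.asyncio import serve  # assumed source of serve and Config
    from hypercorn.config import Config
    from requests_html import AsyncHTMLSession


    class AsyncHTMLSessionFixed(AsyncHTMLSession):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            self.__browser_args = kwargs.get("browser_args", ["--no-sandbox"])

        @property
        async def browser(self):
            # Launch the browser once, with pyppeteer's signal handlers disabled.
            if not hasattr(self, "_browser"):
                self._browser = await pyppeteer.launch(
                    ignoreHTTPSErrors=not self.verify,
                    headless=True,
                    handleSIGINT=False,
                    handleSIGTERM=False,
                    handleSIGHUP=False,
                    args=self.__browser_args,
                )
            return self._browser


    async def get_item_attributes(url):
        # Connecting to site.
        session = AsyncHTMLSessionFixed()  # use the fixed subclass, not the base class
        resp = await session.get(url)
        await resp.html.arender()
        await session.close()
        soup = BeautifulSoup(resp.html.html, 'lxml')
        # Rest of the code handles the data with bs4 and returns a json, as in the question.


    app = Flask(__name__)


    # Async views need Flask 2.0+ installed with the async extra.
    @app.route('/renderJavascript')
    async def get_attributes():
        return await get_item_attributes('https://www.site.com.mx/site.html')


    if __name__ == "__main__":
        # serve() blocks until shutdown, so no separate app.run() is needed.
        asgi_app = WsgiToAsgi(app)
        asyncio.run(serve(asgi_app, Config()))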

I also found a note saying that app.run(threaded=False) works, but I couldn't reproduce it myself, and I don't see the point of giving up threading and losing performance.
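
On the `handleSIGINT=False`, `handleSIGTERM=False` and `handleSIGHUP=False` flags: the restriction they work around can be reproduced with the standard library alone. This is a minimal sketch, not pyppeteer's actual code; the `install_handler` function and `outcome` dict are hypothetical names. It tries to register a SIGINT handler from a worker thread, which is roughly what a browser launch with the default flags would attempt from inside a request thread.

```python
import signal
import threading

outcome = {}  # hypothetical container for the result


def install_handler():
    # signal.signal() may only be called from the main thread;
    # from any other thread it raises ValueError.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        outcome["error"] = None
    except ValueError as exc:
        outcome["error"] = str(exc)


worker = threading.Thread(target=install_handler)
worker.start()
worker.join()

print(outcome["error"])
```

The exact wording of the message varies between Python versions, but it should mention the main thread. That would also be consistent with the `threaded=False` note, since the request would then be handled in the main thread, though I have not verified that case.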

Sacramental answered 9/1, 2023 at 9:28 Comment(0)
