psycopg2.OperationalError: SSL SYSCALL error: EOF detected on Flask/SQLAclemy/Celery + PostgreSQL app

M

3

6

I have an app that was written with Flask+SQLALchemy+Celery, RabbitMQ as a broker, database is PostgreSQL (PostgreSQL 10.11 (Ubuntu 10.11-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit). Database is hosting in DigitalOcean (1 CPU, 2Gb RAM). All app workers (flask or celery) are starting in Supervisor.

In my project for connection to DB I'm using flask_sqlalchemy package like this:

from flask_sqlalchemy import SQLAlchemy
from flask import Flask

# Init 
app = Flask(__name__)

# Create the connection to database
db = SQLAlchemy(app)

I wrote some login in Flask app, tested it and then copied it to the Celery project (in which db connection is the same). So now my example celery task looked like this:

@celery.task(name='example_task', queue='default')
def example_task(payload):
    """ Some logic here """
    data = ExampleModel.query.filter(ExampleModel.id == payload["id"]).first()

    """ Some another app logic """
    db.session.add(SecondModel(payload))
    db.session.commit()
    
    return {"success": True}

And the problem is when I'm running my app on my laptop, it's OK, everything is working fine, no errors. When I uploaded my app on VPS and there was not so much users, everything was ok too. But some time later when there are 30+ users on the same time and they calling this example_task, at some time errors began to appear periodically on very simple queries for select some data from database:

  File "/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 509, in do_execute
    cursor.execute(statement, parameters)
psycopg2.OperationalError: SSL SYSCALL error: EOF detected

The above exception was the direct cause of the following exception:

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) SSL SYSCALL error: EOF detected
 [SQL: 'SELECT example_model.id AS example_model_id, example_model.key AS example_model_key
 \nFROM example_model \nWHERE example_model.id = %(id_1)s \n LIMIT %(param_1)s'] [parameters: {'id_1': 2, 'param_1': 1}] (Background on this error at: http://sqlalche.me/e/e3q8)

Sometimes, but very-very rare I saw in logs this error:

psycopg2.OperationalError: SSL error: decryption failed or bad record mac

I wrote a example exception decorator that handle the errors (for example any errors, not only the SQLAlchemy errors), and after it catched the error I do a db.session.rollback()

def exception_log(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as err:
            # Do the rollback
            db.session.rollback()

            # Call function again
            return func(*args, **kwargs)

But it didn't helped me, because yes, it's reloads the connection for db, function worked fine then, but app starts to work slower and slower and at some point I should do a reload of workers in Supervisor. I saw a lot of idle connections in PostgreSQL, I set the idle transaction timeout for 5 mins, but it didn't helped.

SET SESSION idle_in_transaction_session_timeout = '5min';

I dont't know what to do next, because only solution that helps now is to reload app workers in supervisor any time I see that app is working slower and slower.

Mcfall answered 10/1, 2021 at 19:49 Comment(2)

Are you using psycopg2 or psycopg2-binary? – Tarnopol 11/1, 2021 at 13:9

@Tarnopol I've installed both of them. – Mcfall 11/1, 2021 at 15:14

L

12

I encountered the same issue using a managed Postgres database service seemingly dropping my connections every now and then.

psycopg2.OperationalError: SSL SYSCALL error: EOF detected

Thankfully it appears SQLAlchemy has flags for helping out with this.

Pass pre_ping=True to create_engine and it will check all pooled connections before using them for your actual queries.

The pessimistic approach refers to emitting a test statement on the SQL connection at the start of each connection pool checkout, to test that the database connection is still viable. Typically, this is a simple statement like “SELECT 1”, but may also make use of some DBAPI-specific method to test the connection for liveness.

engine = create_engine("mysql+pymysql://user:pw@host/db", pool_pre_ping=True)

https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic

Little answered 25/2, 2021 at 0:27 Comment(0)

M

0

I've fixes this problem by upgrading the Flask-SQLAlchemy package to the latest vervion.

Mcfall answered 21/4, 2021 at 19:1 Comment(0)

L

0

If your Flask, Celery and PostgreSQL is on the same server then increasing your CPU cores might fix the problem.

Lazes answered 21/5, 2024 at 13:36 Comment(1)

This does not provide an answer to the question. Once you have sufficient reputation you will be able to comment on any post; instead, provide answers that don't require clarification from the asker. - From Review – Maestas 23/5, 2024 at 5:22

Recommended topics

Hot tags