Best practices for persistent database connections in Python when using Flask

My question is about the recommended approach for handling database connections when using Flask in a production environment or other environment where performance is a concern. In Flask, the g object is available for storing things, and open database connections can be placed there to allow the application to reuse them in subsequent database queries during the same request. However, the g object doesn't persist across requests, so it seems that each new request would require a new database connection (and the performance hit that entails).
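For reference, the per-request pattern I am describing looks roughly like this (a minimal sketch; connect_to_database is a stand-in for whatever call your driver uses to open a connection):

from flask import Flask, g

app = Flask(__name__)

def connect_to_database():
    ...  # placeholder: open a brand new connection with your database driver

def get_db():
    # the connection is cached on g, so it is reused within a single request only
    if 'db' not in g:
        g.db = connect_to_database()
    return g.db

@app.teardown_appcontext
def close_db(exception=None):
    # g is discarded at the end of the request, so the connection is closed (and lost) here
    db = g.pop('db', None)
    if db is not None:
        db.close()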

The most closely related question I found on this matter is this one: How to preserve database connection in a python web server, but the answers only raise the abstract idea of connection pooling (without tying it to how one might use it within Flask or how it would survive across requests) or propose a solution that is only relevant to one particular type of database or stack.

So my question is about the general approach that should be taken when productionising apps built on Flask that connect to any kind of database. Something involving connection pooling seems to be in the right direction, especially since that would work for a traditional Python application. But I'm wondering what the recommended approach is when using Flask, because of the aforementioned issue of persistence across requests, and also because Flask apps in production are run from WSGI servers, which potentially add further complications.

EDIT: Based on the comments recommending Flask-SQLAlchemy: assuming Flask-SQLAlchemy solves the problem, does it also work for Neo4j, or indeed any arbitrary database that the Flask app uses? Many of the existing database connectors already support pooling natively, so why introduce an additional dependency whose primary purpose is to provide ORM capabilities rather than connection management? And how does SQLAlchemy get around the fundamental problem of persistence across requests?
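To give a concrete example of what I mean by native pooling, the Neo4j driver already keeps its own connection pool behind a single driver object (a rough sketch; the URI, credentials and pool size are placeholders):

from neo4j import GraphDatabase

# one driver per process; it maintains an internal pool of bolt connections
driver = GraphDatabase.driver(
    'bolt://localhost:7687',
    auth=('neo4j', 'password'),
    max_connection_pool_size=50,
)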

Malinger answered 4/4, 2019 at 19:6 Comment(3)
Check out SQLAlchemy. – Dallapiccola
Use Flask-SQLAlchemy. – Reviviscence
See the Flask Mega-Tutorial section on databases. – Pomegranate

Turns out there is a straightforward way to achieve what I was after, although, as the commenters suggested, if it is at all possible to go the Flask-SQLAlchemy route, you may well want to do that. My approach is to save the connection object in a module-level variable that is then imported wherever it is needed. That way it is available both from within Flask and from other modules. Here is a simplified version of what I did:

app.py

from flask import Flask
from extensions import neo4j

app = Flask(__name__)
neo4j.init_app(app)

extensions.py

from neo4j_db import Neo4j

neo4j = Neo4j()

neo4j_db.py

from neo4j import GraphDatabase

class Neo4j:
    def __init__(self):
        self.app = None
        self.driver = None

    def init_app(self, app):
        # called once at application startup (see app.py)
        self.app = app
        self.connect()

    def connect(self):
        # one driver per process; the neo4j driver manages its own connection pool
        self.driver = GraphDatabase.driver('bolt://xxx')
        return self.driver

    def get_db(self):
        # create the driver lazily if connect() has not been called yet
        if not self.driver:
            return self.connect()
        return self.driver

example.py

from extensions import neo4j

driver = neo4j.get_db()

From here, driver will contain the database driver object, and it will persist across Flask requests.
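For example, a view can grab the driver and open a short-lived session on each request (a sketch only; the route, the Person label and the query are made up for illustration):

from flask import jsonify
from app import app
from extensions import neo4j

@app.route('/people')
def people():
    # the same driver object is reused on every request;
    # only the session (a lightweight object) is created per request
    with neo4j.get_db().session() as session:
        result = session.run('MATCH (p:Person) RETURN p.name AS name')
        names = [record['name'] for record in result]
    return jsonify(names)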

Hope that helps anyone who has the same issue.

Malinger answered 5/4, 2019 at 14:9 Comment(7)
This solution works; I have tried it with MongoDB. Is someone willing to elaborate on why it works? From my understanding, from extensions import neo4j would run the entire extensions.py module again, reinitializing the neo4j variable. – Jesus
In Python, "for efficiency reasons, each module is only imported once per interpreter session", as per docs.python.org/3/tutorial/modules.html. So the relevant loading happens once when the Flask app starts, and the objects stay in memory from then on. – Malinger
I am using Postgres, and I basically initiate the psycopg2 connection in app.py during the Flask application's startup. I have not encountered issues so far, and there is only one connection maintained across multiple requests (AFAIK). Any comments on that kind of solution? Is it similar to yours? – Juan
Your approach should work as well. However, as your application grows (e.g. with more types of DB connections), you may want to consider the pattern given here, breaking things out into several files to make your application more structured and modular. – Malinger
What if the connection is lost and we have to reconnect? – Jesusitajet
@Jesusitajet you can simply use a library that manages connections for you. For example, you can use the QueuePool class from sqlalchemy.pool. – Malinger
@Malinger Thanks, that is also what I was thinking of. – Jesusitajet
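To make the QueuePool suggestion from the comments above concrete, here is a minimal sketch that wraps a plain DB-API connect function (psycopg2 and the connection parameters are placeholders):

import psycopg2
from sqlalchemy.pool import QueuePool

def _connect():
    # any zero-argument callable that returns a DB-API connection will do
    return psycopg2.connect(host='localhost', dbname='mydb', user='me', password='secret')

pool = QueuePool(_connect, pool_size=5, max_overflow=10)

conn = pool.connect()    # checked out of the pool
try:
    cur = conn.cursor()
    cur.execute('SELECT 1')
    cur.close()
finally:
    conn.close()         # returns the connection to the pool rather than closing it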

For those who want to avoid ORMs, here's a version of the answer above using Postgres (psycopg2):

app.py

from flask import Flask
from extensions import pgdb

app = Flask(__name__)
pgdb.init_app(app)

extensions.py

from postgres_db import PostgresDB

pgdb = PostgresDB()

postgres_db.py

import os
from psycopg2.pool import SimpleConnectionPool
from contextlib import contextmanager

class PostgresDB:
    def __init__(self):
        self.app = None
        self.pool = None

    def init_app(self, app):
        self.app = app
        self.connect()

    def connect(self):
        # between 1 and 20 pooled connections, created once per process
        self.pool = SimpleConnectionPool(
            1, 20,
            user=os.environ['POSTGRES_USER'],
            password=os.environ['POSTGRES_PASSWORD'],
            host=os.environ['POSTGRES_HOST'],
            port=os.environ['POSTGRES_PORT'],
            database=os.environ['POSTGRES_DATABASE']
        )
        return self.pool

    @contextmanager
    def get_cursor(self):
        if self.pool is None:
            self.connect()
        con = self.pool.getconn()
        try:
            with con.cursor() as cur:
                yield cur
            con.commit()
        except Exception:
            # roll back so the connection goes back to the pool in a clean state
            con.rollback()
            raise
        finally:
            self.pool.putconn(con)

example.py

from extensions import pgdb

with pgdb.get_cursor() as cur:
    cur.execute('SELECT * FROM users WHERE email = %s', (email,))
    res = cur.fetchone()
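As in the other answer, this works from inside a view as well; a sketch of a hypothetical route, with email read from the query string:

from flask import jsonify, request
from app import app
from extensions import pgdb

@app.route('/user')
def get_user():
    email = request.args.get('email')
    # the pool lends this request a connection and takes it back afterwards
    with pgdb.get_cursor() as cur:
        cur.execute('SELECT * FROM users WHERE email = %s', (email,))
        res = cur.fetchone()
    return jsonify(res)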
Rasheedarasher answered 12/8, 2023 at 16:26 Comment(0)
