Postgres closes connection during query after a few hundred seconds when using Psycopg2
I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.

My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.

This works well as long as creating the materialized view does not take too long (i.e. as long as the amount of data being imported isn't too large). However, if the process takes longer than roughly 250 seconds, psycopg2 throws the exception

psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

No error message (or any message concerning this whatsoever) can be found in Postgres's logs.

Materialized view creation completes successfully if I do it from an SQL client (Postico).

This code illustrates roughly what I'm doing in the Python script:

import psycopg2 as pg

db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port
)
with db.cursor() as c:
    # Set up the foreign data wrapper. Note that FOREIGN is a reserved
    # word, so the schema name has to be quoted.
    c.execute("""
        CREATE EXTENSION IF NOT EXISTS postgres_fdw;
        CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
        CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
        CREATE SCHEMA "foreign";
        IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO "foreign";
    """)
    # This is the statement that runs for several minutes.
    c.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
            SELECT (...)
            FROM "foreign".foreign_table
        );
    """)
db.commit()
Secrest answered 24/5, 2019 at 9:39 Comment(0)

Adding the keepalive parameters to the psycopg2.connect call seems to have solved the problem:

self.db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port,
    keepalives=1,
    keepalives_idle=30,
    keepalives_interval=10,
    keepalives_count=5
)

I still don't know why this is necessary. I can't find anyone else describing having to use the keepalives keyword parameters just to run queries that take longer than 4-5 minutes against Postgres in Docker, but maybe it's obvious enough that nobody has noted it?
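For reference, the same keepalive settings can also be carried in a libpq-style key/value connection string instead of keyword arguments. This is a sketch: `build_dsn` is a hypothetical helper written here for illustration (it is not part of psycopg2), and the connection details are placeholders; the keepalive keywords themselves are standard libpq connection parameters.

```python
# Sketch: assemble a libpq-style DSN carrying the same TCP keepalive
# settings as the keyword arguments above. build_dsn is a hypothetical
# helper; psycopg2.connect() also accepts such a DSN string directly
# as its first positional argument.
def build_dsn(**params):
    """Join key=value pairs into a libpq connection string."""
    return " ".join(f"{key}={value}" for key, value in params.items())

dsn = build_dsn(
    dbname="mydb",           # placeholder database name
    host="localhost",        # placeholder host
    port=5432,
    keepalives=1,            # enable TCP keepalives
    keepalives_idle=30,      # seconds of idleness before the first probe
    keepalives_interval=10,  # seconds between probes
    keepalives_count=5,      # failed probes allowed before dropping
)
print(dsn)
```

The resulting string would then be passed as `pg.connect(dsn)`.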

Secrest answered 27/5, 2019 at 11:18 Comment(4)
According to postgresql.org/docs/current/libpq-connect.html the default value of keepalives is already 1, so the connection should be kept alive. But my connection was still closed with an OperationalError. – Succussion
Also, the other parameters can be set to 0 to use the system default. However, I'm not sure where to find those default values; maybe they are OS-specific. – Succussion
I'm having this exact same issue with exactly the same setup, trying to create multiple materialized views based on an FDW table. After 5 minutes, the FDW connection gets closed. I tried the keepalives but they didn't seem to work for me. My Postgres logs show that psycopg2 aborts the transaction because the connection to the FDW server was lost. – Forepart
I'm baffled by this, and would like to know why it works too. I created a question for it if any of you know the answer: https://mcmap.net/q/504749/-why-are-the-keepalives-params-in-psycopg2-connect-required-to-run-long-running-postgres-queries-in-docker-ubuntu-18-04/604048 – Hodosh

We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200 to our docker-compose.yml file:

services:
  myservice:
    image: myimage
    sysctls:
      - net.ipv4.tcp_keepalive_time=200

From what I understand, this makes the kernel send a TCP keepalive probe after 200 seconds of idleness, which is less than the time after which the connection is dropped (300 seconds?), thus preventing it from being dropped.
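To see what value the sysctl above is overriding, the kernel's current keepalive idle time can be read from procfs. A minimal sketch, assuming a Linux host (the path does not exist on e.g. macOS, so the helper returns None there):

```python
# Sketch: read the kernel's TCP keepalive idle time (in seconds) from
# procfs. Linux-only; returns None where the file does not exist.
def read_keepalive_time(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    try:
        with open(path) as f:
            return int(f.read().strip())
    except OSError:
        return None

print(read_keepalive_time())  # typically 7200 on a stock Linux kernel
```

A stock kernel default of 7200 seconds would explain why nothing keeps the connection alive within the roughly 5-minute window after which it is dropped.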

Angeloangelology answered 4/3, 2022 at 7:26 Comment(0)

It might be that PostgreSQL 9.6 kills your connections after the new timeout mentioned at https://mcmap.net/q/504750/-set-transaction-query-timeout-in-psycopg2. In that case, you could set statement_timeout in postgresql.conf, but that is not recommended.

It might work in Postico because that client sets the value itself.

For the error to show up in the logs, log_min_error_statement needs to be set to ERROR or lower.
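As a side note, timeouts such as statement_timeout can also be set per connection rather than globally in postgresql.conf, by passing them through libpq's `options` connection parameter. A sketch, with placeholder connection details:

```python
# Sketch: pass a per-session statement_timeout (in milliseconds) via
# libpq's "options" parameter instead of editing postgresql.conf.
# The dict would be passed as psycopg2.connect(**conn_kwargs); dbname
# and host are placeholders.
conn_kwargs = {
    "dbname": "mydb",       # placeholder
    "host": "localhost",    # placeholder
    "options": "-c statement_timeout=600000",  # 10 minutes, this session only
}
print(conn_kwargs["options"])
```

This scopes the timeout to the one connection, so a long-running materialized-view build can be given its own limit without affecting other clients.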

Armure answered 27/5, 2019 at 8:27 Comment(7)
Hi, idle_in_transaction_session_timeout is already set to 0 (and so are all the other timeouts). log_min_error_statement is already set to ERROR. – Secrest
Just to make sure: statement_timeout is 0 as well? This seems to have been introduced in 9.6. – Armure
Yes, statement_timeout is zero. – Secrest
Did you try setting a connect_timeout as mentioned in https://mcmap.net/q/348312/-python-psycopg2-timeout? – Armure
Yes, but I have no idea why that would help. – Secrest
It mentioned "up to 5 minutes", which approximately matches your reported 250 seconds. Both fail with an OperationalError. – Armure
For a client designed to hold long-running connections, would it be appropriate to set all timeouts to 0? – Succussion

© 2022 - 2024 — McMap. All rights reserved.