Why are the `keepalives` params in `psycopg2.connect(...)` required to run long running postgres queries in docker (ubuntu:18.04)?

We just transitioned to using Docker for development, with the ubuntu:18.04 image, and noticed that queries run through psycopg2 fail after a few minutes. This answer solved the problem using the following keepalives params:

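import psycopg2 as pg

self.db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port,
    keepalives=1,            # enable TCP keepalives on the client socket
    keepalives_idle=30,      # seconds of inactivity before the first probe
    keepalives_interval=10,  # seconds between retransmitted probes
    keepalives_count=5       # lost probes before the connection counts as dead
)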

This works for us as well, but why does it work? The psycopg2 docs give no insight into what the params do; however, this third-party documentation and this Postgres documentation do.

The question is: what is different in the Docker environment compared to the host environment that makes these non-default settings necessary? The same queries succeed without the keepalives params in a standard Ubuntu 18.04 environment, but not in Docker. I am hoping we can reconfigure our Docker image so that these non-standard parameters aren't needed in the first place.


Postgres version: PostgreSQL 13.4 (Ubuntu 13.4-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, 64-bit

psycopg2 version: 2.8.5

Host OS: Windows 10

Docker Image OS: Ubuntu 18.04

Scent answered 3/3, 2022 at 20:8 Comment(3)
Docker should make no difference there. These settings only reduce the time it takes for the client to notice that the network connection or the server went away. - Fracas
I understand. But they do make a difference. If I boot up a VirtualBox VM with Ubuntu 18.04, or run on our native Ubuntu 18.04 servers, the connections stay alive, while the same queries with the same connection parameters (not using keepalives) fail on Docker running the standard Ubuntu 18.04 image on a Windows host. This is also not just something weird with our setup, because others report it in the linked answer: https://mcmap.net/q/494414/-postgres-closes-connection-during-query-after-a-few-hundred-seconds-when-using-psycopg2 I don't want these params set when we develop in Docker, because they are not necessary in deployment. - Generable
The question you reference exhibits a network setup problem. You have to configure your network not to drop idle connections. Examine how the network setup is different in both cases. - Fracas

You are probably using Docker's overlay network feature (or the ingress network for load-balanced services), which is based on Linux IP Virtual Server (IPVS), a.k.a. Linux Virtual Server. IPVS uses a default timeout of 900 seconds (15 minutes) for idle TCP connections.

See: https://github.com/moby/moby/issues/31208

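With the default Linux TCP keep-alive settings, probes only start much later (after 7200 seconds of idle time by default, and only if keep-alives are enabled at all), so a connection that sits idle during a long query is dropped first. That leaves you with these options: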

  • change the TCP Keep-Alive settings on the server or client
  • change the Docker networking to use the host network directly
  • change your software to avoid idle TCP connections, e.g. configure connection pools for databases to remove idle connections or check health more often
  • change the kernel IPVS defaults or TCP defaults
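
To make the timing argument concrete, here is a rough sketch of the arithmetic, not anything psycopg2 or Docker ships. The function names and the "one third of the timeout" rule in `keepalive_kwargs` are my own assumptions for illustration; the 900 second figure is the IPVS timeout mentioned above, and 7200 seconds is the usual Linux default for `net.ipv4.tcp_keepalive_time`.

```python
# Assumed constants: the 900 s IPVS idle timeout quoted in this answer and
# the usual Linux default of 7200 s before the first keepalive probe.
IPVS_IDLE_TIMEOUT_S = 900
LINUX_DEFAULT_KEEPALIVE_IDLE_S = 7200


def survives_idle_timeout(keepalives_idle, idle_timeout=IPVS_IDLE_TIMEOUT_S):
    """True if the first keepalive probe goes out before the intermediate
    hop (here: IPVS) would expire the idle connection."""
    return keepalives_idle < idle_timeout


def dead_peer_detection_time(keepalives_idle, keepalives_interval, keepalives_count):
    """Approximate seconds of silence before the kernel gives up on a peer
    that no longer answers: idle time plus all unanswered probes."""
    return keepalives_idle + keepalives_interval * keepalives_count


def keepalive_kwargs(idle_timeout=IPVS_IDLE_TIMEOUT_S, interval=10, count=5):
    """Hypothetical helper: build keyword arguments for psycopg2.connect()
    so the first probe is sent well before the idle timeout.
    Using one third of the timeout is an arbitrary safety margin."""
    return {
        "keepalives": 1,
        "keepalives_idle": max(1, idle_timeout // 3),
        "keepalives_interval": interval,
        "keepalives_count": count,
    }


if __name__ == "__main__":
    # Settings from the question: first probe after 30 s, well under 900 s.
    print(survives_idle_timeout(30))
    # Linux default: first probe after 7200 s, long after IPVS dropped the entry.
    print(survives_idle_timeout(LINUX_DEFAULT_KEEPALIVE_IDLE_S))
    print(dead_peer_detection_time(30, 10, 5))
    print(keepalive_kwargs())
```

The resulting dict could be passed as `psycopg2.connect(..., **keepalive_kwargs())`. The point is only that `keepalives_idle` has to be smaller than whatever idle timeout sits between client and server; the exact values from the question are not special.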
Octahedrite answered 4/3, 2022 at 6:35 Comment(0)
