Postgres Npgsql Connection Pooling
Asked Answered
R

1

46

I'd like to better understand Connection Pooling when using Npgsql for Postgres. (http://www.npgsql.org/)

When I use the connection string:

UserID=root;Password=myPassword;Host=localhost;Port=5432;Database=myDataBase;Pooling=true;Minimum Pool Size=0;Maximum Pool Size=100;

Where is the "Pooling" going to take place? On my application server or on the database?

When I call connection.Open(), what happens? Is a connection taken from the pool if one exists and if not, a pool is created?

Any other general info around Connection Pooling would be appreciated.

Thanks.

Rosenkrantz answered 30/5, 2017 at 21:44 Comment(1)
Min and Max Pool Size are now named: "Minimum Pool Size" and "Maximum Pool Size" npgsql.org/doc/connection-string-parameters.html#poolingAlyciaalyda
F
71

Npgsql connection pooling is implemented inside your application process - it has nothing to do with PostgreSQL, which is completely unaware of it.

The mechanism is very simple. When you close a pooled connection, instead of physically closing the connection to PostgreSQL the physical connection is kept around idle in memory (in a "pool"). The next time you open a new connection, if its connection string matches a physical connection already present in the pool, that physical connection is reused instead of opening a new physical connection.

Since opening/closing physical connections is an expensive process, this considerably speeds up your application.

Funicle answered 30/5, 2017 at 22:1 Comment(29)
When using pooling in a web application, when the session is closed, does the pool still stay open?Chemoprophylaxis
Database connection pooling has nothing to do with web sessions. When you close/dispose NpgsqlConnection, the physical connection gets returned to the pool and can be reused when someone else calls Open().Funicle
@ShayRojansky, I've set Pooling=True in my connection string and I still see hundreds of instances of postgres.exe Running and Terminated every second on the server. Should the pooling not stop that from happening?Chemoprophylaxis
That's right, it should not be happening - assuming your application isn't restarting or itself spawning processes which connect (the pool is internal to you're process). Make sure that's the case.Funicle
Could you please give me some advice on how to look for that? I have a web application in which I use one class to work with the database. For every query I open the connection, run the query and then close the connection.Chemoprophylaxis
@ShayRojansky, also, once I switch on pooling, within 20 minutes or so, I get this error hundreds of times a minute: The connection pool has been exhausted, either raise MaxPoolSize (currently 100) or Timeout (currently 15 seconds)Chemoprophylaxis
Seems default is set to True 1/100 so there's no need to add those parameters right? npgsql.org/doc/connection-string-parameters.htmlMatilda
Could you tell what happens if all the connections in the pool are occupied by long-running queries? Is it possible to open one more connection immediately, or we wait for a released one? Will it be added to the pool?Strikebreaker
Well, once you reach the maximum pool size, and all connections are busy, then attempting to open another connection will block until one of them is released (that's the point of having a maximum pool size).Funicle
@ShayRojansky I think that when you refer to "physical connection" you are making a reference to a network connection (like a TCP socket connected from client to PostgreSQL Server). They are not really "physical connections" like a USB, Serial or Parallel connection that a computer may have with a device (a USB camera for example).Crowson
@ignacio yes, it's understood within the context of this conversation that "physical connection" refers to a TCP socket connection (which is very expensive to create) as opposed to a pooled connection (which is very lightweight).Funicle
@ShayRojansky thank you for your reply and please, let me congratulate you because of the great job you do with Npgsql. If anyone want to go deeper in how a connection is build can check this link postgresql.org/docs/11/protocol-flow.html . That link describes the process from the PostgreSQL server point of view but we can get an idea of why opening a connection is expensive. We have to also add the tasks the operating systems have to do to open, maintain and close the TCP connection (through sockets).Crowson
@ShayRojansky Say, we have one web server connects to multiple DB servers and we are setting the max pool size to 100. Does it mean that, connection to each DB server can have maximum 100 connections in the pool? Total number of pool connection the web server can go up to 100 * Number of DBs ?Purdah
That's right. Each connection string has its own pool, and the MaxPoolSize is part of that connection string, so also part of that pool's setting. Different pools (and connection strings) can have different maximum pool sizes. So you may have DB1 with MaxPoolSize=5, and DB2 with MaxPoolSize=100, coming up to a total have 105 physical connections in total.Funicle
What happens to "set" variables postgresql.org/docs/current/sql-set.html? Are they unique or re-used from the pool? Say I open a connection connection1 and set a var to "val1". Then I open 2 other connections, both of them re-use connection1. Then I set "val2" and "val3" for the variable in different instances. Is it safe or there might be a race-condition?Schaller
@Schaller by default, when a physical connection is returned to the pool, a DISCARD ALL command is enqueued which will be executed the next time that connection is used; this resets all parameters (and other connection state) to their defaults, as if a new physical connection was opened. This prevents state leakage through the pool.Funicle
Thank you @ShayRojansky, I guess if I try to re-use conn1 by new conn2 and conn3, then I actually re-use it only by conn2 and conn3 will create a new connection as conn1 is already held by conn2.Schaller
I'm not sure what you mean by conn1 being held by conn2... Every time you call NpgsqlConnection.Open an idle physical connection will be taken out of the pool, and when you call Close/Dispose it will be returned, after making sure its state is reset when next used.Funicle
@ShayRojansky Is DISCARD ALL still being done in current versions of npgsql? I'm seeing advisory locks being held by connections in the pool, so apparently DISCARD ALL is not done. I'm using the connection through EF Core DbContext database facade, not directly. The connection is made persistent for the lifetime of the DbContext by explicitly calling OpenConnection().Kirven
Yes, DISCARD ALL is done unless you have No Reset On Close=true in your connection string. If you're seeing something else, please open an issue with a minimal code sample.Funicle
That's the nice answer. But what is the relationship between Appsetings -> Maximum Pool Size=200; with PostgreSQL --> SHOW max_connections;?. btw +1 from my respect. @ShayRojanskyUnderlinen
PG max_connections is the max number of physical connection PG will allow - server-side. Npgsql's Max Pool Size is a client-side setting for the pool, determining how many physical connections it may attempt to hold at any given point (attempts to open more would block until other connections are released).Funicle
Concretely, your PG max_connections should be at least the sum of all Npgsql Max Pool Size settings across all clients. In other words, two app servers with Max Pool Size=50 may require up to 100 total connections on the PG side.Funicle
Btw, Would I add your useful comment on my post? Thanks @ShayRojanskyUnderlinen
@ShayRojansky If each app server has say 4 DbContexts registered, and max pool size is 300 in connection string passed to each dbcontext, would this mean each app server can pool 1200 connections or just 300 connections.Microseism
@ShayRojansky provided connection strings only differ in DB name, DB server is sameMicroseism
Assuming each DbContext has a distinct connection string (e.g. since DB name is different), then yeah - each app server can use up to 1200 connections (which seems like a lot). Each connection string gets its own pool.Funicle
@ShayRojansky When you refer to the process of "opening a connection", do you mean 'var conn = new NpgsqlConnection(s); conn.Open();' or just 'conn.Open()'? In other words, is the act of constructing a NpgsqlConnection an expensive or cheap operation?Maisel
Constructing an NpgsqlConnection is extremely cheap, it's a trivial object allocation. conn.Open is what actually opens it; that's also extremely cheap if pooling is on and an idle connection is available. It's only expensive it you need to open a new physical connection, which should be a generally rare thing.Funicle

© 2022 - 2024 — McMap. All rights reserved.