PostgreSQL DB performance issues with thousands of connections and distributed transactions
We are evaluating PostgreSQL as an alternative to the Oracle database in our application. We use PostgreSQL 9.5 installed on a Linux machine with 128 GB of memory, 32 CPU cores and SSD storage. Connection pools and distributed transactions are managed by the JBoss 7 application server, and SQL queries are generated and executed by Hibernate 4. Most tables have tens of millions of rows, and one of them has hundreds of millions of rows. In total, around 3,000 database connections (pooled by the application server) are active and used concurrently. We modified some queries, created indexes for the slow ones, tuned DB and OS settings based on the documentation, etc. However, throughput is a few times lower than with Oracle, and eventually the DB response time increases 10-20 times.
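For reference, this is roughly the kind of postgresql.conf tuning involved; the values below are only illustrative assumptions (they are not our exact production settings):

    # postgresql.conf excerpt (illustrative values for a 128 GB, 32-core machine)
    max_connections = 3000            # one backend process per connection
    max_prepared_transactions = 3000  # required for XA/two-phase commit; 0 disables it
    shared_buffers = 32GB             # commonly ~25% of RAM
    effective_cache_size = 96GB       # planner hint: memory assumed available for caching
    work_mem = 16MB                   # per sort/hash node, per backend - multiplies across 3000 backends
    maintenance_work_mem = 2GB
    checkpoint_completion_target = 0.9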

I've done some googling and couldn't find information about anyone else (ab)using PostgreSQL in the same way:

  • using thousands of active database connections
  • using such a high number of distributed transactions (PREPARED TRANSACTIONS)
  • storing billions of rows in one table

Oracle doesn't have any problem handling an even higher load. I would appreciate you sharing your experience, suggestions, links, etc.

Thanks

Gouache answered 22/9, 2016 at 0:38 Comment(9)
Zalando uses Postgres and they serve a lot of clients concurrently, so it's not like the engine is cumbersome, I can tell you that :) For dealing with large tables, partitioning comes in handy. This question is pretty broad though; I'm guessing you're not going to get the answers you expect.Sandlin
Thank you for your comment. But based on their Git repo (github.com/zalando/patroni/search?utf8=%E2%9C%93&q=prepared), they don't use distributed transactions: max_prepared_transactions=0. Also, the default value for max_connections is 100. I do understand that the production value is likely to be different.Gouache
Do you have your own transaction manager? Are you using distributed transactions to allow global transactions across different databases on a cluster? Have you considered compensation instead?Sandlin
We use JBoss transaction manager. We need distributed transactions because we use JMS queues.Gouache
"thousands of active database connections" is your problem. You should use a connection pooler to get that down to "hundreds"Oversold
I mentioned that we use a connection pool. We cannot decrease the number of connections, as we have to process a high number of concurrent requests in real time.Gouache
I think you have to try to pinpoint the issue. Is it the data volume (try archiving/purging)? Is it really the concurrent connections (if you test with just a hundred connections, do you get the response times you want)? Try testing with a tool like JMeter or similar, identify the issue, then look for solutions. Have you modified the default parameter settings for PostgreSQL? I migrated an Oracle DB with a couple of hundred GB (I guess it is a small one by your standards) and a few hundred concurrent users and got the same/similar response times when we conducted performance tests.Louanneloucks
Just imagine that having thousands of connections means having thousands of processes running at the same time. But the machine has 32 cores, so it is not able to process them truly simultaneously. The pooler you wrote about is at the application server, so I assume there are even more connections from applications, and then the app server makes 3,000 connections to the database. The previous comment was definitely about a dedicated PostgreSQL pooler, like PgBouncer or Pgpool. This way you can have 3,000 connections to the pooler and about 50 to the database (see the sketch after these comments).Desmund
I don't see how this would help; we could just decrease the connection pool size in JBoss to 50 connections. Also, this will not work with two-phase commits and prepared statements. From the PgBouncer FAQ ("How to use prepared statements with transaction pooling?"): "To make prepared statements work in this mode would need PgBouncer to keep track of them internally, which it does not do. So only way to keep using PgBouncer in this mode is to disable prepared statements in the client." pgbouncer.github.io/…Gouache
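To illustrate the pooler setup suggested in the comments above, here is a minimal sketch; the host names and pool sizes are assumptions, and, as noted in the last comment, transaction pooling conflicts with server-side prepared statements:

    ; pgbouncer.ini (illustrative sketch, not an actual configuration)
    [databases]
    ; forward "appdb" to the local PostgreSQL instance
    appdb = host=127.0.0.1 port=5432 dbname=appdb

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    ; transaction pooling: a server connection is assigned per transaction
    pool_mode = transaction
    ; many client connections from the application server...
    max_client_conn = 3000
    ; ...multiplexed onto a small number of real backend connections
    default_pool_size = 50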
The solution was upgrading the Linux kernel and decreasing the number of DB connections in our Java connection pools from 3,000 to 300 (a configuration sketch is shown below). After this change, we could handle the same traffic as we could with Oracle DB.

By accident, I discovered a precious piece of information that led to the resolution of the problem, in the comments section of the post Did I Say 32 Cores? How about 64?, written by Robert Haas (VP, Chief Architect, Database Server @ EnterpriseDB, PostgreSQL Major Contributor and Committer):

No, I'm saying that to get good performance on a 64-core server, you're going to need PostgreSQL >= 9.2 and Linux >= 3.2. Most of the changes are actually on the PostgreSQL side, but the lseek scaling stuff in the Linux kernel was important, too.
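For illustration, the pool-size part of this change is a small JBoss 7 datasource setting. The datasource name and the other values below are assumptions; only the 3000 -> 300 reduction is the change we actually made:

    <!-- standalone.xml, datasources subsystem (illustrative sketch) -->
    <xa-datasource jndi-name="java:jboss/datasources/AppXADS" pool-name="AppXADS" enabled="true">
        <xa-datasource-property name="ServerName">db-host</xa-datasource-property>
        <xa-datasource-property name="DatabaseName">appdb</xa-datasource-property>
        <driver>postgresql</driver>
        <xa-pool>
            <min-pool-size>50</min-pool-size>
            <max-pool-size>300</max-pool-size>  <!-- previously 3000 -->
        </xa-pool>
    </xa-datasource>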

Gouache answered 13/11, 2017 at 15:54 Comment(0)
Appropriate settings should be provided in the postgresql.conf file to handle a large number of connections. PostgreSQL can also be front-ended by pgpool-II for replication and load balancing. We are using Postgres in a clustered environment and it works well.
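A minimal sketch of such a pgpool-II front end; the host names and sizes below are illustrative assumptions, not our actual cluster settings:

    # pgpool.conf excerpt (illustrative)
    listen_addresses = '*'
    port = 9999
    num_init_children = 300        # concurrent client sessions pgpool will accept
    max_pool = 4                   # cached backend connections per pgpool child
    load_balance_mode = on         # spread read-only queries across the backends
    backend_hostname0 = 'pg-primary'
    backend_port0 = 5432
    backend_weight0 = 1
    backend_hostname1 = 'pg-replica1'
    backend_port1 = 5432
    backend_weight1 = 1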

Lactam answered 30/9, 2016 at 14:19 Comment(3)
Do you have thousands of concurrent active connections, use distributed transactions, and store hundreds of millions of records?Gouache
Yes, there are multiple servers, multiple applications, and a large number of users connected concurrently.Lactam
Can you please define 'large number of users'? Is it hundreds or thousands of concurrently ACTIVE queries?Gouache
