Application stuck on retrieving connections from c3p0?

We recently had an outage in which application threads got stuck while retrieving connections from c3p0. The configuration is as follows:

Version of c3p0 used: 0.9.1.2

  • c3p0.acquireRetryDelay = 10000;
  • c3p0.acquireRetryAttempts = 0;
  • c3p0.breakAfterAcquireFailure = false;
  • c3p0.numHelperThreads = 8;
  • c3p0.idleConnectionTestPeriod = 3;
  • c3p0.preferredTestQuery = "select 1 from dual";
  • c3p0.checkoutTimeout = 3000;
  • c3p0.user = "XYZ"; // changed to XYZ while posting
  • c3p0.password = "XYZ"; // changed to XYZ while posting

Under normal conditions everything works fine, and c3p0 has been serving us well. However, during a recent network event (a network partition in which the application hosts could not talk to the database), we saw that applications were stuck indefinitely trying to get connections from c3p0.

Stacktrace seen in logs:

Caused by: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
    at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:106)
    at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:65)
    at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:527)
    at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
    at amazon.identity.connection.WrappedDataSource.getConnectionWithOptionalCredentials(WrappedDataSource.java:42)
    at amazon.identity.connection.LoggingDataSource.getConnectionWithOptionalCredentials(LoggingDataSource.java:55)
    at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
    at amazon.identity.connection.WrappedDataSource.getConnectionWithOptionalCredentials(WrappedDataSource.java:42)
    at amazon.identity.connection.ConnectionProfilingDataSource.profileGetConnectionWithOptionalCredentials(ConnectionProfilingDataSource.java:118)
    at amazon.identity.connection.ConnectionProfilingDataSource.getConnectionWithOptionalCredentials(ConnectionProfilingDataSource.java:99)
    at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
    at amazon.identity.connection.CallCountTrackingDataSource.getConnectionWithOptionalCredentials(CallCountTrackingDataSource.java:82)
    at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
    at com.amazon.jdbc.FailoverDataSource.doGetConnection(FailoverDataSource.java:133)
    at com.amazon.jdbc.FailoverDataSource.getConnection(FailoverDataSource.java:109)
    at com.amazon.identity.accessmanager.WrappedConnection$1.call(WrappedConnection.java:84)
    at com.amazon.identity.accessmanager.WrappedConnection$1.call(WrappedConnection.java:82)
    at com.amazon.identity.accessmanager.WrappedConnection.getConnection(WrappedConnection.java:110)
    ... 40 more
    Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@185e5c6b -- timeout at awaitAvailable()
    at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1317)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:557)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
    at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
....... (total of 317 such instances of prelimCheckoutResource):

Some excerpts I pulled from the c3p0 documentation:

When a c3p0 DataSource attempts and fails to acquire a Connection, it will retry up to acquireRetryAttempts times, with a delay of acquireRetryDelay between each attempt. If all attempts fail, any clients waiting for Connections from the DataSource will see an Exception, indicating that a Connection could not be acquired. Note that clients do not see any Exception until a full round of attempts fail, which may be some time after the initial Connection attempt. If acquireRetryAttempts is set to 0, c3p0 will attempt to acquire new Connections indefinitely, and calls to getConnection() may block indefinitely waiting for a successful acquisition.

checkoutTimeout limits how long a client will wait for a Connection, if all Connections are checked out and one cannot be supplied immediately

So here's my theory around why this happened:

The network partition lasted several minutes. I am assuming that by then the idle connection tests would have invalidated all active connections in the pool, which means c3p0 would have been busy acquiring new connections. If any application host tried to obtain a connection from the pool, it would have had to wait indefinitely until a connection was acquired (see the excerpt from the c3p0 docs). The checkoutTimeout parameter would also not have helped in this case, since it enforces a timeout only if all connections are checked out (and that was not the case).

My question here is the following:

  1. Is my understanding of the system correct?
  2. If yes, shouldn't checkoutTimeout (or some other parameter) time out such connection requests rather than let them hang forever?
  3. Is there a better way to configure c3p0 to avoid facing this issue again? I could wrap the call that gets a connection from c3p0 in a thread-based timeout, but I want to avoid that if a better c3p0 configuration or a c3p0 patch would do.
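For reference, the thread-based timeout mentioned in point 3 could look roughly like the sketch below. It uses only the JDK (`java.util.concurrent` and `javax.sql.DataSource`) and no c3p0 API. The class name `TimedCheckout` and both method names are made up for illustration and are not part of c3p0 or of the application in the stack trace.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import javax.sql.DataSource;

// Hypothetical helper: bounds how long a caller waits for a task such as getConnection().
final class TimedCheckout {
    // Daemon threads so a checkout that never returns cannot keep the JVM alive.
    private static final ExecutorService EXECUTOR = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-checkout");
        t.setDaemon(true);
        return t;
    });

    private TimedCheckout() {}

    // Runs the task on a helper thread and waits at most timeoutMillis for its result.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        Future<T> future = EXECUTOR.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the helper thread; the task may ignore it
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so callers see the task's own exception where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }

    // Applies the generic helper to DataSource.getConnection().
    static Connection getConnection(DataSource ds, long timeoutMillis) throws SQLException {
        Callable<Connection> checkout = ds::getConnection;
        try {
            return callWithTimeout(checkout, timeoutMillis);
        } catch (SQLException e) {
            throw e;
        } catch (TimeoutException e) {
            throw new SQLException("No connection within " + timeoutMillis + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            throw new SQLException("Interrupted while waiting for a connection", e);
        } catch (Exception e) {
            throw new SQLException("Connection checkout failed", e);
        }
    }
}
```

One caveat with this approach: if the underlying checkout ignores the interrupt and eventually succeeds after the caller has given up, nothing closes that late connection, so the wrapper trades a hang for a possible leak. That is part of why a configuration-level fix is preferable.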

Thanks

Dogmatic answered 19/9, 2013 at 3:51 Comment(3)
Is your database really Oracle? The test query "select 1 from dual" will only work for Oracle, unless you create the dual table. Rustie
That's right. The database I am using is Oracle. Dogmatic
hey, did you manage to get a solution for this?Jill
F
2

The network partitioning existed for several minutes. I am assuming by then, the idle connection tests would have invalidated all active connections in the pool. This means that c3p0 would now be involved in getting new connections. If any application hosts tries to obtain connection from pool, it would have to wait indefinitely until connection has been acquired (see excerpt from the c3p0 docs).

  1. This is incorrect. The checkoutTimeout should control this scenario as well as the case when your system is overloaded (the pool is maxed out and all connections are in use).

Also checkout timeout parameter would not have helped in this case since it enforces timeout only if all connections were checked out (and this was not the case).

  1. According to the c3p0 documentation: this timeout is enforced "at checkout", not when the connection is already checked-out. So it should help you.

  2. The checkoutTimeout is there to help you with client timeouts, so there is no need to implement anything else. However, I would say that trying to obtain a connection indefinitely is a mistake. I'm actually using the default acquire settings (acquireRetryAttempts = 30, acquireRetryDelay = 1000 ms), which gives 30 x 1000 ms = 30 seconds.

I would also say that checkoutTimeout should be greater than or equal to the acquire timeout (acquireRetryAttempts * acquireRetryDelay); otherwise the second will apply.
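As a rough illustration of that rule of thumb, the arithmetic can be sketched in a few lines of Java. This is not c3p0 code: the class and method names are hypothetical, and the window is approximated as attempts times delay, ignoring how long each individual attempt takes. It does rely on two documented c3p0 conventions: acquireRetryAttempts <= 0 means retry indefinitely, and checkoutTimeout = 0 means wait indefinitely.

```java
import java.util.OptionalLong;

// Hypothetical helper for reasoning about c3p0 timing settings; not part of c3p0.
final class AcquireWindow {
    private AcquireWindow() {}

    // Approximate worst-case acquire window in milliseconds.
    // Empty means unbounded (acquireRetryAttempts <= 0 retries forever).
    static OptionalLong worstCaseAcquireMillis(int acquireRetryAttempts,
                                               long acquireRetryDelayMillis) {
        if (acquireRetryAttempts <= 0) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(acquireRetryAttempts * acquireRetryDelayMillis);
    }

    // True when checkoutTimeout is at least as long as the acquire window.
    // A checkoutTimeout of 0 means clients wait indefinitely, which covers any window.
    static boolean checkoutTimeoutCoversAcquire(long checkoutTimeoutMillis,
                                                int acquireRetryAttempts,
                                                long acquireRetryDelayMillis) {
        if (checkoutTimeoutMillis <= 0) {
            return true;
        }
        OptionalLong window =
                worstCaseAcquireMillis(acquireRetryAttempts, acquireRetryDelayMillis);
        return window.isPresent() && checkoutTimeoutMillis >= window.getAsLong();
    }
}
```

With the defaults, `worstCaseAcquireMillis(30, 1000)` gives 30000 ms. With the settings from the question (acquireRetryAttempts = 0, acquireRetryDelay = 10000, checkoutTimeout = 3000), the acquire window is unbounded, so `checkoutTimeoutCoversAcquire(3000, 0, 10000)` is false.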

Folkway answered 30/10, 2014 at 17:45 Comment(0)
G
0

As per documentation, this is the problem.

c3p0.acquireRetryAttempts = 0;

If acquireRetryAttempts is 0, c3p0 retries failed connection acquisitions indefinitely, waiting 10 s between attempts (the acquireRetryDelay you have configured).

Change acquireRetryAttempts to a finite value such as 10, and the wait will be around 100 s (10 attempts x 10 s), followed by an exception.

Greatly answered 6/4, 2021 at 15:24 Comment(0)
