Connection pool and thread pool settings in Java

I have a Spring application using a HikariCP connection pool.

For a single client request I have to query 10+ tables (a business requirement) and then composite the results together. Each table query takes roughly 50 ms to 200 ms. To speed up the response time, I create a FixedThreadPool in my service and query each table in a different thread (pseudocode):

import java.sql.Connection;
import java.util.concurrent.*;
import javax.sql.DataSource;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

class MyService {
    static final int THREAD_POOL_SIZE = 20;
    static final int CONNECTION_POOL_SIZE = 10;

    // Fixed pool used to fan out the per-table queries.
    final ExecutorService pool = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
    protected DataSource ds;

    MyService() throws Exception {
        Class.forName(getJdbcDriverName());
        HikariConfig config = new HikariConfig();
        config.setMaximumPoolSize(CONNECTION_POOL_SIZE);
        ds = new HikariDataSource(config);
    }

    public Items doQuery() throws Exception {
        String[] tables = {"a", "b" /* ... 10+ tables */};
        Items result = new Items();
        CompletionService<Items> completionService = new ExecutorCompletionService<>(pool);

        // Submit one task per table; each task borrows a connection from the Hikari pool.
        for (String tb : tables) {
            Callable<Items> c = () -> {
                try (Connection conn = ds.getConnection()) {
                    Items items = queryTable(conn, tb); // pseudocode: run the query for this table
                    return items;
                }
            };
            completionService.submit(c);
        }

        // Collect the results as they complete and merge them.
        for (int i = 0; i < tables.length; i++) {
            Future<Items> future = completionService.take();
            result.addAll(future.get());
        }
        return result;
    }
}

For a single request, the average response time is around 500 ms.


But for concurrent requests, the average response time increases rapidly: the more concurrent requests there are, the longer the response time becomes.


How should I set the connection pool size and thread pool size so that the app works efficiently?

BTW, the database is a cloud RDS instance with 4 CPUs, 16 GB of memory, 2000 max connections and 8000 max IOPS.

Bantustan answered 12/10, 2019 at 8:21 Comment(10)
If every request you send needs 10 connections to the database, and you have only 10 connections in your pool, then only 1 request can be handled at a time. Your threading strategy might increase the response time for one request. But if you expect continuous concurrent requests, doing things sequentially, without this threading strategy, will probably be simpler, as efficient, need fewer connections, and have proper transactional isolation semantics and thus return coherent results. But sure, you can increase the number of connections in the pool, since you can have 2000 max.Friedcake
Did you try to composite the result in the database?Absolutely
@MarmiteBomber We cannot do that in the database.Bantustan
@JBNizet: I am not sure why "Your threading strategy might increase the response time for one request." For a single request, my threading strategy uses 10 threads, each with a connection from the pool, which I think should decrease the response time.Bantustan
@Bantustan sorry, I meant "improve", not "increase".Friedcake
'To speed up the response time, I create a FixedThreadPool in my service and query each table in a different thread': this is already fallacious. The database is multi-threaded up to a point, but at some point it has to lock tables, or indexes, and in any case the intervening network isn't multi-threaded. Don't overthink this.Allege
@user207421: According to our business logic, there is not much we can do to speed up the response time from the database.Bantustan
Sorry, but querying 10 tables and then processing the data suggests that the database should combine things, and/or that some of the data is not organized optimally. Besides, summary data could be maintained with database triggers. A factor of 10 from threading does not come close to the improvements possible with things like indices. (I understand that sometimes it cannot be helped.)Logy
I think your approach is not correct; what if you optimize the query instead? You are querying 10 tables in different threads, which increases the resource cost. A single thread will work well with a well-written query.Garcon
@huser I didn't say you could speed up the response from the database. I don't know anything about your database and I would not presume to guess. What I did say, as did others, is that your approach to doing that at the client end with threads is fallacious.Allege

You might want to think about a few more parameters:

1. The database's maximum concurrent request limit. Cloud providers have different concurrent-request limits for different tiers; you might want to check yours.

2. When you say 50-200 ms, although it is difficult to say, are there on average 8 queries of about 50 ms and 2 of about 200 ms, or are they all pretty much the same? Why does it matter? Your doQuery is limited by the slowest query (200 ms in the worst case), but the threads running the 50 ms queries are released as soon as their tasks finish, making them available for the next set of requests.

3. What is the QPS you are expecting to receive?

Some calculations: if a single request takes 10 threads (and therefore 10 connections), and you have provisioned 100 connections with a 100-concurrent-query limit, then assuming 200 ms per query you can only handle 10 requests at a time. Maybe a little better than 10 if most queries take around 50 ms (but I wouldn't be optimistic).

Of course, some of these calculations go for a toss if any of your queries takes more than 200 ms (network latency or anything else), in which case I recommend you have a circuit breaker, either at the connection end (if you are allowed to abort the query after a timeout) or at the API end.
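For example, a minimal sketch of that circuit breaker at the API end, meant to be dropped into the question's MyService; Items and the 200 ms budget come from the discussion above, and takeWithBudget is a hypothetical helper, not a real API:

import java.util.concurrent.CompletionService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: wait for one per-table result with a hard time budget
// instead of blocking indefinitely on take()/get().
static Items takeWithBudget(CompletionService<Items> completionService) throws Exception {
    Future<Items> future = completionService.poll(200, TimeUnit.MILLISECONDS);
    if (future == null) {
        // Nothing completed in time: fail fast (or return a partial result)
        // rather than letting waiting threads and connections pile up.
        throw new TimeoutException("table query exceeded the 200 ms budget");
    }
    return future.get();
}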

Note: the max connection limit is not the same as the max concurrent query limit.

Suggestion: since you need a response under 500 ms, you can also set a connectionTimeout of about 100-150 ms on the pool. Worst case: 150 ms connection timeout + 200 ms query execution + 100 ms application processing < 500 ms for your response. That works.
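A minimal sketch of that pool configuration, assuming HikariCP; the JDBC URL and pool size are placeholders for illustration, and HikariCP may enforce a floor on very low connection timeouts:

import javax.sql.DataSource;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

class PoolConfig {
    static DataSource buildPool() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://example-rds-host:3306/mydb"); // hypothetical JDBC URL
        config.setMaximumPoolSize(50);    // sized for overall concurrency, not one request's fan-out
        config.setConnectionTimeout(150); // give up quickly when no connection frees up
        return new HikariDataSource(config);
    }
}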

Childe answered 14/10, 2019 at 13:8 Comment(0)

You can create a custom thread pool executor:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CustomThreadPoolExecutor extends ThreadPoolExecutor {

    private CustomThreadPoolExecutor(int corePoolSize, int maximumPoolSize,
                                     long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }

    /**
     * Returns a fixed thread pool of {@code nThreads} threads backed by this executor;
     * extra submissions queue up in an unbounded LinkedBlockingQueue. (In the full version,
     * task threads also take the Diagnostic Context from the submitting thread; that logic
     * is omitted here.)
     */
    public static ExecutorService newFixedThreadPool(int nThreads) {
        return new CustomThreadPoolExecutor(nThreads, nThreads,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }
}

In your Spring configuration, you can define the ExecutorService bean as below:

@Bean
public ExecutorService executeService() {
    return CustomThreadPoolExecutor.newFixedThreadPool(10);
}

This is the best practice for creating a custom thread pool executor.
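A minimal usage sketch, assuming the ExecutorService bean above is injected into a service; TableQueryService is hypothetical, and queryTable/Items are placeholders standing in for the question's pseudocode:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class TableQueryService {

    private final ExecutorService executorService; // the Spring bean defined above

    TableQueryService(ExecutorService executorService) {
        this.executorService = executorService;
    }

    public Items doQuery(List<String> tables) throws Exception {
        List<Future<Items>> futures = new ArrayList<>();
        // Fan out one task per table on the shared executor instead of creating pools ad hoc.
        for (String tb : tables) {
            Callable<Items> task = () -> queryTable(tb); // placeholder: runs the query for one table
            futures.add(executorService.submit(task));
        }
        Items result = new Items();
        for (Future<Items> future : futures) {
            result.addAll(future.get()); // blocks until that table's result is ready
        }
        return result;
    }

    private Items queryTable(String table) {
        // placeholder for the actual JDBC query against `table`
        return new Items();
    }
}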

Pluckless answered 17/10, 2019 at 6:5 Comment(0)

The proper way to size the connection pool is usually to leave it at the default.

From the HikariCP website:

If you have 10,000 front-end users, having a connection pool of 10,000 would be shear insanity. 1000 still horrible. Even 100 connections, overkill. You want a small pool of a few dozen connections at most, and you want the rest of the application threads blocked on the pool awaiting connections. If the pool is properly tuned it is set right at the limit of the number of queries the database is capable of processing simultaneously -- which is rarely much more than (CPU cores * 2) as noted above.

Given you know that each request is going to consume 10 connections, you may want to break with this advice and go for a bigger pool - keeping it under 100 connections is probably going to provide enough capacity.
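As a rough, illustrative sketch of that sizing reasoning, using the RDS specs from the question (4 cores, 2000 max connections) and the 10-queries-per-request fan-out; the target concurrency is an assumption, not a measurement:

public class PoolSizingSketch {
    public static void main(String[] args) {
        int dbCores = 4;                    // the RDS instance in the question
        int baselinePoolSize = dbCores * 2; // ~8: what the generic (cores * 2) advice suggests
        int tablesPerRequest = 10;          // each request fans out into 10 parallel queries
        int targetConcurrentRequests = 5;   // assumed target; tune against real load

        // Big enough for the fan-out at the target load, capped well under the
        // 2000-connection RDS limit and the "under 100" suggestion above.
        int poolSize = Math.min(100,
                Math.max(baselinePoolSize, tablesPerRequest * targetConcurrentRequests));
        System.out.println("maximumPoolSize ~= " + poolSize); // 50 with these assumptions
    }
}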

I would implement the controller like this:

Make your queries async in your controller / service classes with CompletableFutures and let the connection pool worry about keeping its threads busy.

So the controller could look like this (I'm adapting this from some other code that doesn't work quite like this example, so take this code with a grain of salt):

@RestController
public class AppController {

    @Autowired private DatabaseService databaseService;

    @GetMapping("/thing")
    public ResponseEntity<Thing> getThing() {
        // supplyAsync (not runAsync) so each future carries a result; by default these
        // tasks run on the common ForkJoinPool.
        CompletableFuture<Foo> foo = CompletableFuture.supplyAsync(() -> databaseService.getFoo());
        CompletableFuture<Bar> bar = CompletableFuture.supplyAsync(() -> databaseService.getBar());
        CompletableFuture<Baz> baz = CompletableFuture.supplyAsync(() -> databaseService.getBaz());

        // Wait for all of them, then mash the data up into the response shape.
        CompletableFuture.allOf(foo, bar, baz).join();

        return new ResponseEntity<>(mashUpDbData(foo.join(), bar.join(), baz.join()), HttpStatus.OK);
    }
}

The controller will use as many threads as you allow the common ForkJoinPool to use; they will hammer the DB all at the same time, and the connection pool can worry about keeping the connections active.
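If you would rather keep the blocking JDBC calls off the common ForkJoinPool, CompletableFuture.supplyAsync also accepts an explicit Executor; a minimal sketch, with DatabaseService/Foo as the placeholder types from the example above and the pool size of 20 chosen arbitrarily:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Dedicated pool for the blocking JDBC work so it cannot starve the common ForkJoinPool
// that the rest of the application (parallel streams, other async work) relies on.
ExecutorService dbExecutor = Executors.newFixedThreadPool(20);

CompletableFuture<Foo> foo =
        CompletableFuture.supplyAsync(() -> databaseService.getFoo(), dbExecutor);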

But I think the reason you see the blowout in response times under even small load is that, by its very design, JDBC blocks the thread while waiting for the data to come back from the DB.

To stop the blocking from affecting the response times so drastically, you could try the Spring Boot reactive style. This uses async IO and backpressure to match IO production to consumption; basically, it means the app threads are kept as busy as they can be. This should stop the behaviour under load where response times increase linearly.

Note that if you do go down the reactive path, the JDBC drivers still block, which is why Spring has a big push to create a reactive database driver (R2DBC).
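Until a reactive driver is available for your database, the usual workaround is to wrap each blocking JDBC call and shift it onto a scheduler meant for blocking work; a minimal Project Reactor sketch, with databaseService.getFoo()/getBar() and mashUpDbData as the placeholders from the example above:

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

// Each blocking call is wrapped in a Mono and moved to the boundedElastic scheduler,
// which is intended for blocking workloads; zip waits for both results.
Mono<Thing> thing = Mono.zip(
        Mono.fromCallable(() -> databaseService.getFoo()).subscribeOn(Schedulers.boundedElastic()),
        Mono.fromCallable(() -> databaseService.getBar()).subscribeOn(Schedulers.boundedElastic()))
    .map(tuple -> mashUpDbData(tuple.getT1(), tuple.getT2()));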

Jost answered 21/10, 2019 at 6:21 Comment(0)
