Concurrent use of same JDBC connection by multiple threads
Asked Answered
G

4

15

I'm trying to better understand what will happen if multiple threads try to execute different sql queries, using the same JDBC connection, concurrently.

  • Will the outcome be functionally correct?

  • What are the performance implications?

  • Will thread A have to wait for thread B to be completely done with its query?

  • Or will thread A be able to send its query immediately after thread B has sent its query, after which the database will execute both queries in parallel?


I see that the Apache DBCP uses synchronization protocols to ensure that connections obtained from the pool are removed from the pool, and made unavailable, until they are closed. This seems more inconvenient than it needs to be. I'm thinking of building my own "pool" simply by creating a static list of open connections, and distributing them in a round-robin manner.

I don't mind the occasional performance degradation, and the convenience of not having to close the connection after every use seems very appealing. Is there any downside to me doing this?

Girder answered 3/10, 2015 at 20:46 Comment(4)
connection pooling is much more than just a set of connections. You should really read about what a connection pool provide The parameters here can provide some hint commons.apache.org/proper/commons-dbcp/configuration.htmlBelva
I had browsed through the config and other API on the interface, and none of them interest me for my use-case. Hence why I ended up writing my own implementation, which came out to ~10 lines of code, does exactly what I need, minimizes synchronization overhead, and saves me the trouble of having to add connection.close() in a million places.Girder
Multiplexing multiple threads on a single connections is a really bad idea. Just use a connection pool, check out a connection and return it to the pool (close it) when you are done with it. Have you - for example - considered what happens if one thread wants to commit and the other wants to rollback? You either lose data or the wrong data has been committed. That is a real bad place to be. Also, but not using Connection.close(), you make it really unclear to others what you are doing.Agan
Tried doing that with PostgreSQL 42.2.5 JDBC drivers and both threads got blocked.Monoicous
G
14

I ran the following set of tests using a AWS RDS Postgres database, and Java 11:

  1. Create a table with 11M rows, each row containing a single TEXT column, populated with a random 100-char string

  2. Pick a random 5 character string, and search for partial-matches of this string, in the above table

  3. Time how long the above query takes to return results. In my case, it takes ~23 seconds. Because there are very few results returned, we can conclude that the majority of this 23 seconds is spent waiting for the DB to run the full-table-scan, and not in sending the request/response packets

  4. Run multiple queries in parallel (with different keywords), using different connections. In my case, I see that they all complete in ~23 seconds. Ie, the queries are being efficiently parallelized

  5. Run multiple queries on parallel threads, using the same connection. I now see that the first result comes back in ~23 seconds. The second result comes back in ~46 seconds. The third in ~1 minute. etc etc. All the results are functionally correct, in that they match the specific keyword queried by that thread

To add on to what Joni mentioned earlier, his conclusion matches the behavior I'm seeing on Postgres as well. It appears that all "correctness" is preserved, but all parallelism benefits are lost, if multiple queries are sent on the same connection at the same time.

Girder answered 14/10, 2018 at 0:42 Comment(0)
U
9

Since the JDBC spec doesn't give guarantees of concurrent execution, this question can only be answered by testing the drivers you're interested in, or reading their source code.

In the case of MySQL Connector/J, all methods to execute statements lock the connection with a synchronized block. That is, if one thread is running a query, other threads using the connection will be blocked until it finishes.

Unfounded answered 4/10, 2015 at 8:9 Comment(0)
L
0

Doing things the wrong way will have undefined results... if someone runs some tests, maybe they'll answer all your questions exactly, but then a new JVM comes out, or someone tries it on another jdbc driver or database version, or they hit a different set of race conditions, or tries another platform or JVM implementation, and another different undefined result happens.

If two threads modify the same state at the same time, anything could happen depending on the timing. Maybe the 2nd one overwrites the first's query, and then both run the same query. Maybe the library will detect your error and throw an exception. I don't know and wouldn't bother testing... (or maybe someone already knows or it should be obvious what would happen) so this isn't "the answer", but just some advice. Just use a connection pool, or use a synchronized block to ensure problems don't happen.

Lucid answered 3/10, 2015 at 21:10 Comment(3)
The JDBC Connection spec is thread safe, with respect to issuing and running queries. I'm more concerned about the performance implications really.Girder
I assume thread safe in that context means all its methods are thread safe. So 2 calls by different threads won't interfere, but many calls that rely on a specific order might fail (like rollback() for example).Lucid
I get your point about there being corner cases, if you're using the full range of functionality in Connection. However, I'm not. I have in fact set up a wrapper around Connection, that only allows executing queries, and that's it. I'm more concerned about the performance implications of executing multiple queries on the same connection.Girder
P
0

We had to disable the statement cache on Websphere, because it was throwing ArrayOutOfBoundsException at PreparedStatement level. The issue was that some guy though it was smart to share a connection with multiple threads. He said it was to save connections, but there is no point multithreading queries because the db won't run them parallel.

There was also an issue with a java runnables that were blocking each others because they used the same connection.

So that's just something to not do, there is nothing to gain.

There is an option in websphere to detect this multithreaded access. I implemented my own since we use jetty in developpement.

Pheasant answered 28/9, 2017 at 20:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.