Is asynchronous jdbc call possible?
Asked Answered
S

16

177

I wonder if there is a way to make asynchronous calls to a database?

For instance, imagine that I've a big request that take a very long time to process, I want to send the request and receive a notification when the request will return a value (by passing a Listener/callback or something). I don't want to block waiting for the database to answer.

I don't consider that using a pool of threads is a solution because it doesn't scale, in the case of heavy concurrent requests this will spawn a very large number of threads.

We are facing this kind of problem with network servers and we have found solutions by using select/poll/epoll system call to avoid having one thread per connection. I'm just wondering how to have a similar feature with database request?

Note: I'm aware that using a FixedThreadPool may be a good work-around, but I'm surprised that nobody has developed a system really asynchronous (without the usage of extra thread).

** Update **
Because of the lack of real practical solutions, I decided to create a library (part of finagle) myself: finagle-mysql. It basically decodes/decodes mysql request/response, and use Finagle/Netty under the hood. It scales extremely well even with huge number of connections.

Splenic answered 3/11, 2010 at 13:52 Comment(4)
See code.google.com/p/async-mysql-connector/wiki/UsageExampleTemperamental
See also github.com/mauricio/postgresql-asyncOleomargarine
The problem is how could the db notify the client when the query finishes. One would be (e.g.) for Oracle to use the "Database Query Result Change Notification" feature and get notified when db data change. This applies for SQL queries that modify the db data For read-only queries this wouldn't work. On the other hand I am not sure that making connections async would be a good idea since establishing them is expensive. Of' course this is not a very general solution. Just food for thought ...Adsorb
Does finagle-mysql use JDBC?Polybius
D
175

I don't understand how any of the proposed approaches that wrap JDBC calls in Actors, executors or anything else can help here - can someone clarify.

Surely the basic problem is that the JDBC operations block on socket IO. When it does this it blocks the Thread its running on - end of story. Whatever wrapping framework you choose to use its going to end up with one thread being kept busy/blocked per concurrent request.

If the underlying database drivers (MySql?) offers a means to intercept the socket creation (see SocketFactory) then I imagine it would be possible to build an async event driven database layer on top of the JDBC api but we'd have to encapsulate the whole JDBC behind an event driven facade, and that facade wouldn't look like JDBC (after it would be event driven). The database processing would happen async on a different thread to the caller, and you'd have to work out how to build a transaction manager that doesn't rely on thread affinity.

Something like the approach I mention would allow even a single background thread to process a load of concurrent JDBC exec's. In practice you'd probably run a pool of threads to make use of multiple cores.

(Of course I'm not commenting on the logic of the original question just the responses that imply that concurrency in a scenario with blocking socket IO is possible without the user of a selector pattern - simpler just to work out your typical JDBC concurrency and put in a connection pool of the right size).


Looks like MySql probably does something along the lines I'm suggesting --- http://code.google.com/p/async-mysql-connector/wiki/UsageExample

Dermoid answered 30/9, 2011 at 21:6 Comment(3)
Using Akka does not make calls to relational DBs asynchronous. It allows you run them on a bunch of dedicated threads for DB access easily. This way you do not take the whole site down when the site becomes unresponsive because you have been always making async calls in service layer to the DAO layer with promises and your web server threads are separate from the rest of your application.Bergen
Actors are not the only workarounds (eg. micro-services and async http, which we scale to thousands per second), and I would not be so quick to dismiss them as being non-asynchronous from the client's perspective. If 1k UI threads traffic enter your system and only 10 threads are blocked on the DB, whilst 990 'messages' (or something similar) are queue'd in memory without blocking any of the 1k UI threads (which will probably be released)... isn't that what is required? I would love to see true async JDBC, but that does not mean there are not extremely viable workarounds in the interim.Ence
I'm using Kafka (messaging service) to send data asychronously to another service which makes the DB call. Not sure if this is a good approach. Should work if Kafka calls are really async, but might not be worth the complexity adding another service requiring additional DB sessions.Lecturer
T
42

It's impossible to make an asynchronous call to the database via JDBC, but you can make asynchronous calls to JDBC with Actors (e.g., actor makes calls to the DB via JDBC, and sends messages to the third parties, when the calls are over), or, if you like CPS, with pipelined futures (promises) (a good implementation is Scalaz Promises)

I don't consider that using a pool of threads is a solution because it doesn't scale, in the case of heavy concurrent requests this will spawn a very large number of threads.

Scala actors by default are event-based (not thread-based) - continuation scheduling allows creating millions of actors on a standard JVM setup.

If you're targeting Java, Akka Framework is an Actor model implementation that has a good API both for Java and Scala.


Aside from that, the synchronous nature of JDBC makes perfect sense to me. The cost of a database session is far higher than the cost of the Java thread being blocked (either in the fore- or background) and waiting for a response. If your queries run for so long that the capabilities of an executor service (or wrapping Actor/fork-join/promise concurrency frameworks) are not enough for you (and you're consuming too many threads) you should first of all think about your database load. Normally the response from a database comes back very fast, and an executor service backed with a fixed thread pool is a good enough solution. If you have too many long-running queries, you should consider upfront (pre-)processing - like nightly recalculation of the data or something like that.

Tempest answered 3/11, 2010 at 14:24 Comment(10)
+1. I've wrapped expensive JDBC calls (technically, Querulous) in actors and never looked back.Benzel
I'm targeting scala, and I'm very familiar with the Promise/Actor pattern. Actually I didn't consider the actor model because I still want to execute requests in parallel, I don't want to have a big actor that execute all my requests sequentially. My first idea was to use Promises backed by a FixedThreadPool with same size than the number of connection to the db, in case of heavy usage, every thread of the thread pool is blocked waiting for the db. What's annoy me is that, there is no need of all those threads, It would be completely feasible to have a equivalent system with just one thread.Splenic
Steve, just create a new actor for each call?Cretaceous
@Victor, every actor working in a parallel on a blocking operation (JDBC) will run on a separate thread that Steve's trying to avoidTempest
The actor approach still requires one thread per active database transaction, while the transaction is going on, so that's not really a solution to the OP's problem unless you are willing to constrain the number of parallel database transactions and have some "async" database operations wait for some already executing ones to finish and free up a thread. This isn't a bad idea, though - the database might get overloaded if you open too many connections - so putting your database transaction in a queue for processing instead of blocking your http request processing thread will help.Demarco
I dont think itis impossible... code.google.com/p/async-mysql-connector/wiki/UsageExample - saying that I have yet to try the code out...Temperamental
Actor based solution is still blocking the thread. Dont say its not possible to execute async jdbc call, there are experimental open source libraries that try to implement async jdbc.Jo
+1 "The cost of a database session is far higher than the cost of the Java thread being blocked"Esqueda
For expensive DB calls there is usually not such a big a problem. It is when the call is trivial that the network overhead becomes a problem. If you want to make 100 queries, which take 1 ms on the DB each, but the network overhead is 200 ms, then it is going to take over 20 seconds synchronously, but would take 300 ms asynchronously.Quimby
The cost of a database session is far higher than the cost of the Java thread being blocked (either in the fore- or background) and waiting for a response -- This is an unsubstantiated claim. I remember now after all these years, given this was the first question I saw on stackoverflow!Bil
D
12

Perhaps you could use a JMS asynchronous messaging system, which scales pretty well, IMHO:

  • Send a message to a Queue, where the subscribers will accept the message, and run the SQL process. Your main process will continue running and accepting or sending new requests.

  • When the SQL process ends, you can run the opposite way: send a message to a ResponseQueue with the result of the process, and a listener on the client side accept it and execute the callback code.

Dissatisfaction answered 3/11, 2010 at 14:1 Comment(0)
K
9

It looks like a new asynchronous jdbc API "JDBC next" is in the works.

See presentation here

You can download the API from here

Update:

  • This new jdbc API was later named ADBA. Then on September 2019 work was stopped see mailing list post.
  • R2DBC seems to achieve similar goals. It already supports most major databases (except oracle db). Note that this project is a library and not part of the jdk.
Keister answered 19/3, 2018 at 8:30 Comment(1)
a redirected link that points to a more recent implementation is here: oracle.com/goto/java-async-dbUnconventionality
L
7

There is no direct support in JDBC but you have multiple options like MDB, Executors from Java 5.

"I don't consider that using a pool of threads is a solution because it doesn't scale, in the case of heavy concurrent requests this will spawn a very large number of threads."

I am curious why would a bounded pool of threads is not going to scale? It is a pool not thread-per-request to spawn a thread per each request. I have been using this for quite sometime on a heavy load webapp and we have not seen any issues so far.

Logroll answered 3/11, 2010 at 13:55 Comment(2)
I think that the main argument against threads is that you are basically then outside any standard Java containers constraints, so you lose container managed clustering and fail over capabilities, though you could roll your own, or use something like Terracotta.Achilles
we can tap into app server managed thread polls by using work managers. websphere, weblogic and glassfish support itLogroll
I
5

A solution is being developed to make reactive connectivity possible with standard relational databases.

People wanting to scale while retaining usage of relational databases are cut off from reactive programming due to existing standards based on blocking I/O. R2DBC specifies a new API that allows reactive code that work efficiently with relational databases.

R2DBC is a specification designed from the ground up for reactive programming with SQL databases defining a non-blocking SPI for database driver implementors and client library authors. R2DBC drivers implement fully the database wire protocol on top of a non-blocking I/O layer.

R2DBC's WebSite

R2DBC's GitHub

Feature Matrix

enter image description here

Inflated answered 15/6, 2019 at 20:28 Comment(0)
W
4

As mentioned in other answers JDBC API is not Async by its nature.
However, if you can live with a subset of the operations and a different API there are solutions. One example is https://github.com/jasync-sql/jasync-sql that works for MySQL and PostgreSQL.

Weasel answered 9/9, 2018 at 22:4 Comment(0)
K
3

Ajdbc project seems to answer this problem http://code.google.com/p/adbcj/

There is currently 2 experimental natively async drivers for mysql and postgresql.

Keister answered 3/5, 2013 at 22:50 Comment(3)
I would like to have this approach ready. JDBC has evolved a lot from the beginning (iterators, templates, prepared procedures), but this async approach has never been implemented. It would be particularly interesting for write operations (Insert,Update,Delete), and specially those heavy batch TX we all face. In my opinion, any kind of client based approach (Pooling,Actor, Scheduling, Messaging...) would lead to little rewards in terms of resource usage (probably some gains in throughput or latency).Decorator
Old and abandoned, only two data types supported & not even close to production ready. Unfortunately :(Lassitude
Issue #1 of this library is about the website being not available. It's more than a year old. I suspect this library is pretty dead.Amenity
T
3

An old question, but some more information. It is not possible to have JDBC issue asynchronous requests to the database itself, unless a vendor provides an extension to JDBC and a wrapper to handle JDBC with. That said, it is possible to wrap JDBC itself with a processing queue, and to implement logic that can process off the queue on one or more separate connections. One advantage of this for some types of calls is that the logic, if under heavy enough load, could convert the calls into JDBC batches for processing, which can speed up the logic significantly. This is most useful for calls where data is being inserted, and the actual result need only be logged if there is an error. A great example of this is if inserts are being performed to log user activity. The application won't care if the call completes immediately or a few seconds from now.

As a side note, one product on the market provides a policy driven approach to allowing asynchronous calls like those I described to be made asynchronously (http://www.heimdalldata.com/). Disclaimer: I am co-founder of this company. It allows regular expressions to be applied to data transformation requests such as insert/update/deletes for any JDBC data source, and will automatically batch them together for processing. When used with MySQL and the rewriteBatchedStatements option (MySQL and JDBC with rewriteBatchedStatements=true) this can significantly lower overall load on the database.

Trouper answered 30/11, 2015 at 22:27 Comment(1)
But this still means that JDBC should have at least one separate thread. What about frameworks and stacks that are single-threaded but still callback-based(nodejs comes to mind)? Do you know how they manage JDBC calls?Schutz
B
3

You have three options in my opinion:

  1. Use a concurrent queue to distribute messages across a small and fixed number of threads. So if you have 1000 connections you will have 4 threads, not 1000 threads.
  2. Do the database access on another node (i.e. another process or machine) and have your database client make asynchronous network calls to that node.
  3. Implement a true distributed system through asynchronous messages. For that you will need an messaging queue such as CoralMQ or Tibco.

Diclaimer: I am one of the developers of CoralMQ.

Bordure answered 9/2, 2016 at 20:25 Comment(0)
T
2

The Java 5.0 executors might come handy.

You can have a fixed number of threads to handle long-running operations. And instead of Runnable you can use Callable, which return a result. The result is encapsulated in a Future<ReturnType> object, so you can get it when it is back.

Tramel answered 4/11, 2010 at 9:39 Comment(0)
P
2

Just a crazy idea : you could use an Iteratee pattern over JBDC resultSet wrapped in some Future/Promise

Hammersmith does that for MongoDB.

Plagioclase answered 4/8, 2011 at 22:27 Comment(0)
E
2

Here is an outline about what an non-blocking jdbc api could look like from Oracle presented at JavaOne: https://static.rainfocus.com/oracle/oow16/sess/1461693351182001EmRq/ppt/CONF1578%2020160916.pdf

So it seems that in the end, truly asynchronous JDBC calls will indeed be possible.

Elva answered 23/11, 2017 at 9:20 Comment(1)
It is not JDBC but an additional APIBurnley
O
1

I am just thinking ideas here. Why couldn't you have a pool of database connections with each one having a thread. Each thread has access to a queue. When you want to do a query that takes a long time, you can put on the queue and then one of threads will pick it up and handle it. You will never have too many threads because the number of your threads are bounded.

Edit: Or better yet, just a number of threads. When a thread sees something in a queue, it asks for a connection from the pool and handles it.

Operand answered 3/11, 2010 at 13:57 Comment(0)
C
1

The commons-dbutils library has support for an AsyncQueryRunner which you provide an ExecutorService to and it returns a Future. Worth checking out as it's simple to use and ensure you won't leak resources.

Clockmaker answered 23/11, 2011 at 1:54 Comment(0)
S
1

If you are interested in asynchronous database APIs for Java you should know that there is a new initiative to come up with a set of standard APIs based on CompletableFuture and lambdas. There is also an implementation of these APIs over JDBC which can be used to practice these APIs: https://github.com/oracle/oracle-db-examples/tree/master/java/AoJ The JavaDoc is mentioned in the README of the github project.

Semblable answered 12/4, 2018 at 8:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.