Closing a PreparedStatement after a single execute – is it a design flaw?
Asked Answered
S

5

6

I have looked into various places, and have heard a lot of dubious claims, ranging from PreparedStatement should be preferred over Statement everywhere, even if only for the performance benefit; all the way to claims that PreparedStatements should be used exclusively for batched statements and nothing else.

However, there seems to be a blind spot in (primarily online) discussions I have followed. Let me present a concrete scenario.


We have an EDA-designed application with a DB connection pool. Events come, some of them require persistence, some do not. Some are artificially generated (e.g. update/reset something every X minutes, for example). Some events come and are handled sequentially, but other types of events (also requiring persistence) can (and will) be handled concurrently.

Aside from those artificially generated events, there is no structure in how events requiring persistence arrive.

This application was designed quite a while ago (roughly 2005) and supports several DBMSes. The typical event handler (where persistence is required):

  • get connection from pool
  • prepare sql statement
  • execute prepared statement
  • process the result set, if applicable, close it
  • close prepared statement
  • prepare a different statement, if necessary and handle the same way
  • return connection to pool

If an event requires batch processing, the statement is prepared once and addBatch/executeBatch methods are used. This is an obvious performance benefit and these cases are not related to this question.


Recently, I have received an opinion, that the whole idea of preparing (parsing) a statement, executing it once and closing is essentially a misuse of PreparedStatement, provides zero performance benefits, regardless of whether server or client prepared statements are used and that typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).

Moreover, I had to test certain scenarios in dev environment on MySQL, and it seems that the Connector/J usage analyzer agrees with this idea. For all non-batched prepared statements, calling close() prints:

PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times


Due to application design choices outlined earlier, having a PreparedStatement instance cache that holds every single SQL statement used by any event for each connection in the connection pool sounds like a poor choice.

Could someone elaborate further on this? Is the logic "prepare-execute (once)-close" flawed and essentially discouraged?

P.S. Explicitly specifying useUsageAdvisor=true and cachePrepStmts=true for Connector/J and using either useServerPrepStmts=true or useServerPrepStmts=false still results in warnings about efficiency when calling close() on PreparedStatement instances for every non-batched SQL statement.

Sidsida answered 11/10, 2015 at 13:39 Comment(3)
The parameters passed to the PreparedStatement, where are they coming from? Many times a PreparedStatement is used to avoid building statements based on user input, which may break the structure of the statement (e.g. SQL Injection).Bing
Yes, I am aware of the automated input sanitization. Primarily (integer) PKs arrive with events; sometimes raw input (binary data fields, which are then parsed based on field type and inserted with setXxx), otherwise setXxxs can be called with global or session (aformentioned sequential events) variables, [one-sidedly] managed by the application. Still, the question is whether single-exec-then-close logic is discouraged, regardless of possible side-effect benefits.Sidsida
It all depends on the DB system and driver of what they make out of it. Having said that most drivers (and pools) do statement caching even when you close them. So unless you have a longrunning processing thread (actor style) you want to follow the pattern of closing the PS (as the pool would do it anyway when you return the connection). The biggest problem with keeping PS open is, that they only work for the current connection, so you need to hold onto the connection as well.Unsaid
C
3

Is the logic prepare-execute [once]-close flawed and essentially discouraged?

I don't see that as being a problem, per se. A given SQL statement needs to be "prepared" at some point, whether explicitly (with a PreparedStatement) or "on the fly" (with a Statement). There may be a tiny bit more overhead incurred if we use a PreparedStatement instead of a Statement for something that will only be executed once, but it is unlikely that the overhead involved would be significant, especially if the statement you cite is true:

typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).

What is discouraged is a pattern like this:

for (int thing : thingList) {
    PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
    ps.setInt(1, thing);
    ps.executeUpdate();
    ps.close();
}

because the PreparedStatement is only used once and the same SQL statement is being prepared over and over again. (Although even that might not be such a big deal if the SQL statement and its executation plan are indeed cached.) The better way to do that is

PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
for (int thing : thingList) {
    ps.setInt(1, thing);
    ps.executeUpdate();
}
ps.close();

... or even better, with a "try with resources" ...

try (PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ")) {
    for (int thing : thingList) {
        ps.setInt(1, thing);
        ps.executeUpdate();
    }
}

Note that this is true even without using batch processing. The SQL statement is still only prepared once and used several times.

Caravan answered 11/10, 2015 at 14:49 Comment(1)
Actually for things like this, we use connection metadata to see if batch processing is enabled, get a connection with autocommit disabled and use addBatch() instead of executeUpdate() and do executeBatch() before closing PS, and then commit() before closing the connection. Of course, if metadata says that batch is not supported, we fall back to the logic described here.Sidsida
O
2

As others already stated, the most expensive part is the parsing the statement in the database. Some database systems (this is pretty much DB dependent – I will speak mainly for Oracle) may profit, if the statement is already parsed in the shared pool. (In Oracle terminology this is called a soft parse that is cheaper than a hard parse - a parse of a new statement). You can profit from soft parse even if you use the prepared statement only once.

So the important task is to give the database a chance to reuse the statement. A typical counter example is the handling of the IN list based on a collection in Hibernate. You end with the statement such as

 .. FROM T WHERE X in (?,?,?,  … length based on the size of the collection,?,? ,?,?)

You can’t reuse this statement if the size of the collection differ.

A good starting point to get overview of the spectrum of the SQL queries produced by a running application is (by Oracle) the V$SQL view. Filter the PARSING_SCHEMA_NAME with you connection pool user and check the SQL_TEXT and the EXECUTIONS count.

Two extreme situation should be avoided:

  • Passing parameters (IDs) in the query text (this is well known) and
  • Reusing statement for different access paths.

An example of the latter is a query that with a provided parameter performs an index access to a limited part of the table, while without the parameter all records should be processed (full table scan). In that case is definitively no problem to create two different statements (as the parsing of both leads to different execution plans).

Orthocephalic answered 11/10, 2015 at 15:59 Comment(1)
Well, I am glad to hear that PS re-parsing is at least implementation-dependent. Coupled with the fact that the application supports more than one DBMS and your other answer, the current approach seems to be optimal.Sidsida
T
1

PreparedStatements are preferable because one is needed regardless of whether you create one programmatically or not; internally the database creates one every time a query is run - creating one programatically just gives you a handle to it. Creating and throwing away a PreparedStatement every time doesn't add much overhead over using Statement.

A large effort is required by the database to create one (syntax checking, parsing, permissions checking, optimization, access strategy, etc). Reusing one bypasses this effort for subsequent executions.

Instead of throwing them away, try either writing the query in such a way that it can be reused, eg by ignoring null input parameters:

where someCol = coalesce(?, someCol)

so if you set the parameter to null (ie "unspecified), the condition succeeds)

or if you absolutely must build the query every time, keep references to the PreparedStatements in a Map where the built query is the key and reuse them if you get a hit. Use a WeakHashMap<String, PreparedStatements> for you map implementation to prevent running out of memory.

Taeniacide answered 11/10, 2015 at 14:48 Comment(2)
It does add an extra network round-trip to the database, though, since the application has to wait for the response with the statement handle before it can send the command to execute it. Not a major problem, but worth keeping in mind.Caswell
@Caswell that's a good point, but typically the query is very small (one frame), especially when compared to the result set, and the database is typically "close by" in the network, and also database query interactions are very chatty anyway, so an extra round trip in most cases may not even be noticed.Taeniacide
O
0

PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times

I thing you may safely ignore this warning, it is similar to a claim It is more efficient to work first 40 hour in the week, than sleep next 56 hours, eat following 7 hours and the rest is your free time.

You need exactly one execution per event - should you perform 50 to get a better average?

Orthocephalic answered 11/10, 2015 at 16:23 Comment(1)
Actually, this makes perfect sense.Sidsida
C
0

SQL commands that run only once, in terms of performance, just waste database resources (memory, processing) being sent in a Prepared Statement. In other hand, not using Prepared Statement let app vulnerable to SQL injection.

Are security (protection from SQL injection) working against performance (prepared statement that runs just once) ? Yes, but...

But it should not be that way. It's a choice java does NOT implement an interface to let developers call the right database API: SQL commands that run just once AND are properly protected against SQL injection ! Why Java just not implement the correct tool for this specific task?

It could be as follows:

  • Statement Interface - Different SQL commands could be submitted. One execution of SQL commands. Bind variables not allowed.
  • PreparedStatement Interface - One SQL command could be submitted. Multiple executions of SQL command. Bind variables allowed.
  • (MISSING IN JAVA!) RunOnceStatement - One SQL command could be submitted. One execution of SQL command. Bind variables allowed.

For example, the correct routine (API) could be called in Postgres, by driver mapping to:
- Statement Interface - call PQExec()
- PreparedStatement Interface - call PQPrepare() / PQExecPrepare() / ...
- (MISSING IN JAVA!) RunOnceStatement Interface - call PQExecParams()

Using prepared statement in SQL code that runs just once is a BIG performance problem: more processing in database, waste database memory, by maintaining plans that will not called later. Cache plans get so crowed that actual SQL commands that are executed multiple times could be deleted from cache.

But Java does not implement the correct interface, and forces everybody to use Prepared Statement everywhere, just to protect against SQL injection...

Calpac answered 27/5, 2020 at 16:8 Comment(1)
Please note, that while the PS object is used once in the event handler code, this does not automatically mean that the plan is wasted. As more events arrive, the plan will be reused. The question was about having a snippet where a PS is created and executed once being a flaw or not, completely regardless how often the snippet is executed (and yes, in this case it is often).Sidsida

© 2022 - 2024 — McMap. All rights reserved.