how bad is it to have "extra" database queries?
Asked Answered
M

2

6

I come from the front-end world in web development where we try really hard to limit the number of HTTP requests issued (by consolidating css, js files, images, etc.).

With db connections (MySQL), obviously you don't want to have unnecessary connections, but as a general rule, how bad is it to have multiple small queries? (they execute quickly)

I ask because I'm moving my application to a clustered environment and where before I was caching some stuff in server memory (as I was running on a single server), I am now trying to make my app "stateless" and in my current implementation that means more small db calls. This will help me with load balancing (avoiding sticky sessions) and also keep server memory usage down.

We're not talking a ton of queries, maybe 6-8 db calls instead of 2-4, returning anywhere from a handful of records to a few thousand records. Each of them executes quickly, less than 30ms (some much less), but I don't know if there is some "connection latency" I should be concerned about.

Thanks for your insight.

Merryman answered 2/4, 2016 at 2:25 Comment(5)
Brian I will happily pontificate a bit when I have a chance but not at the momentFortyfour
Thank you Drew, looking forward to your insight.Merryman
All rules have exceptions. Do what is best for the situation at hand.Symmetry
What Dan said. In some situations it's entirely possible that one long-running query is worse for you than two or three small, quick ones. Either way you should be using CachedWithin whenever possible.Duckworth
You could also consider grouping the queries into a stored procedure which you call with the cfstoredproc tag.Symmetry
M
10

Short answer: (1) make sure you're staying at the same big-O level, reuse connections, measure performance; (2) think about how much you care about data consistency.

Long answer:

Performance

Strictly from performance perspective, and generally speaking, unless you are already close to maxing out your database resources, such as max connections, this is not likely to have major impact. But there are certain things you should keep in mind:

  • do the "6-8" queries that replace "2-4" queries stay in the same execution time? e.g. if current database interaction is at O(1) is it going to change to O(n)? Or current O(n) going to change to O(n^2)? If yes, you should think about what that means for your application
  • most application servers can reuse existing database connections, or have persistent database connection pools; make sure your application does not establish a new connection for every query; otherwise this is going to make it even more inefficient
  • in many common cases, mainly on larger tables with complex indexes and joins, doing few queries by primary keys may be more efficient than joining those tables in a single query; this would be the case if, while doing such joins, the server not only takes longer to perform the complex query, but also blocks other queries against affected tables

Generally speaking about performance, the rule of thumb is - always measure.

Consistency

Performance is not the only aspect to consider, however. Also think about how much you care about data consistency in your application.

For example, consider a simple case - tables A and B that have one-to-one relationship and you are querying for a single record using a primary key. If you join those tables and retrieve result using a single query, you'll either get a record from both A and B, or no records from either, which is what your application expects too. Now consider if you split that up into 2 queries (and you're not using transactions with preferred isolation levels) - you get a record from table A, but before you could grab the matching record from table B, it is deleted/updated by another process. Now your application has a record from A but none from B.

General question here is - do you care about ACID compliance of your relational data as it pertains to the queries you are breaking apart? If the answer is yes, you must think about how your application logic will react in these specific cases.

Meggie answered 2/4, 2016 at 6:13 Comment(3)
Wonderful answer, thank you for your insight! Regarding your comment to "always measure," is there a particular tool you use and recommend to measure your db performance?Merryman
When ColdFusion is in debug mode, it can be set to show how long a query takes to run. It can also show if the query is cachedDetoxify
@BrianFitzGerald I would suggest measuring application performance while monitoring database (cpu, ram, connections, slow queries, etc.). If you have an environment you could use for load testing, or if you can create a separate stack for this purpose, it should be quite simple to get started with siege, apachebench, or a similar tool.Meggie
V
5

6-8 queries for one web page? Usually this is fine. I do it all the time.

Thousands of rows returned? Choke! What is the client going to do with that many? Can the SQL do more processing, then return fewer rows?

With rare exceptions, only 1 connection per web page.

Each query has a lot of overhead. For example, INSERTing 100 rows into a table -- 100 INSERT single-row statements will take about 10 times as long as a single 100-row INSERT. So when practical use fewer round-trips to the server. This becomes very important if the network is a WAN. The other side of the globe is 250ms away, just for latency. A server in the same datacenter is probably so close that latency can be ignored. In a WAN, use Stored Routines to minimize round trips.

I like to time each query actively in the code. Then, if I perceive a performance problem, I look to see which query to work on first. Or use the SlowLog.

Verner answered 3/4, 2016 at 0:6 Comment(1)
Thanks Rick! Some great tips in there. And good call on the thousands of rows... it's basically trying to pre-populate the user object so I can do something like user.getFavorites() (for example) and all the user's favorites will be available for use. I get those can be lazy loaded, etc., but before going "stateless" the user was cached on a per-session basis so it was a non-issue to pre-populate once upon session initialization. Anyhow, you have convinced me to make some architectural changes to my app to prevent loading up so many records on each request :)Merryman

© 2022 - 2024 — McMap. All rights reserved.