Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)
Asked Answered
R

8

194

When you limit the number of rows to be returned by a SQL query, usually used in paging, there are two methods to determine the total number of records:

Method 1

Include the SQL_CALC_FOUND_ROWS option in the original SELECT, and then get the total number of rows by running SELECT FOUND_ROWS():

SELECT SQL_CALC_FOUND_ROWS * FROM table WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();  

Method 2

Run the query normally, and then get the total number of rows by running SELECT COUNT(*)

SELECT * FROM table WHERE id > 100 LIMIT 10;
SELECT COUNT(*) FROM table WHERE id > 100;  

Which method is the best / fastest?

Ronnaronnholm answered 9/10, 2008 at 9:53 Comment(0)
M
131

It depends. See the MySQL Performance Blog post on this subject: To SQL_CALC_FOUND_ROWS or not to SQL_CALC_FOUND_ROWS?

Just a quick summary: Peter says that it depends on your indexes and other factors. Many of the comments to the post seem to say that SQL_CALC_FOUND_ROWS is almost always slower - sometimes up to 10x slower - than running two queries.

Mandatory answered 9/10, 2008 at 18:58 Comment(7)
I can confirm this - I just updated a query with 4 joins on a 168,000 row database. Selecting just the first 100 rows with a SQL_CALC_FOUND_ROWS took over 20 seconds; using a separate COUNT(*) query took under 5 seconds (for both count + results queries).Froufrou
Very interesting findings. Since MySQL's documentation explicitly suggests that SQL_CALC_FOUND_ROWS will be faster, I wonder in what situations (if any) it actually is faster!Clavius
I am unsure why SQL_CALC_FOUND_ROWS can be so slower. I think this is just version-related, and with newer version of MySQL SQLC_CALC_FOUND_ROWS will be as fast as 2 query, if not fasterPita
old topic, but for those who still interesting! Just finished my check on INNODB from 10 checks I can tell that it's 26(2query) against 9.2(1 query) SELECT SQL_CALC_FOUND_ROWS tblA.*, tblB.id AS 'b_id', tblB.city AS 'b_city', tblC.id AS 'c_id', tblC.type AS 'c_type', tblD.id AS 'd_id', tblD.extype AS 'd_extype', tblY.id AS 'y_id', tblY.ydt AS y_ydt FROM tblA, tblB, tblC, tblD, tblY WHERE tblA.b = tblC.id AND tblA.c = tblB.id AND tblA.d = tblD.id AND tblA.y = tblY.idPetronia
I just ran this experiment and SQLC_CALC_FOUND_ROWS was much faster than two queries. Now my main table is only 65k and two joins of a few hundreds, but the main query takes 0.18 seconds with or without SQLC_CALC_FOUND_ROWS but when I ran a second query with COUNT(id) it took 0.25 alone.Barbaric
In addition to possible performance issues, consider that FOUND_ROWS() has been deprecated in MySQL 8.0.17. See also @madhur-bhaiya's answer.Subirrigate
The linked article doesn’t seem to test on anything else than a full table search, maybe the result is different when the query is much more complicated, with joins for instance. Also it’s missing an important part of the equation: when you’re doing two queries, you’re adding an important network delay. EDIT: Oh just read it’s deprecated, so yeah I guess we shouldn’t use it in the endRevulsive
I
34

MySQL has started deprecating SQL_CALC_FOUND_ROWS functionality with version 8.0.17 onwards.

So, it is always preferred to consider executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows.

From docs:

The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version.

COUNT(*) is subject to certain optimizations. SQL_CALC_FOUND_ROWS causes some optimizations to be disabled.

Use these queries instead:

SELECT * FROM tbl_name WHERE id > 100 LIMIT 10;
SELECT COUNT(*) WHERE id > 100;

Also, SQL_CALC_FOUND_ROWS has been observed to having more issues generally, as explained in the MySQL WL# 12615 :

SQL_CALC_FOUND_ROWS has a number of problems. First of all, it's slow. Frequently, it would be cheaper to run the query with LIMIT and then a separate SELECT COUNT() for the same query, since COUNT() can make use of optimizations that can't be done when searching for the entire result set (e.g. filesort can be skipped for COUNT(*), whereas with CALC_FOUND_ROWS, we must disable some filesort optimizations to guarantee the right result)

More importantly, it has very unclear semantics in a number of situations. In particular, when a query has multiple query blocks (e.g. with UNION), there's simply no way to calculate the number of “would-have-been” rows at the same time as producing a valid query. As the iterator executor is progressing towards these kinds of queries, it is genuinely difficult to try to retain the same semantics. Furthermore, if there are multiple LIMITs in the query (e.g. for derived tables), it's not necessarily clear to which of them SQL_CALC_FOUND_ROWS should refer to. Thus, such nontrivial queries will necessarily get different semantics in the iterator executor compared to what they had before.

Finally, most of the use cases where SQL_CALC_FOUND_ROWS would seem useful should simply be solved by other mechanisms than LIMIT/OFFSET. E.g., a phone book should be paginated by letter (both in terms of UX and in terms of index use), not by record number. Discussions are increasingly infinite-scroll ordered by date (again allowing index use), not by paginated by post number. And so on.

Irresponsive answered 16/4, 2019 at 12:46 Comment(3)
How to perform this two selects as atomic operation? What if someone inserts a row before the SELECT COUNT(*) query? Thanks.Clydeclydebank
@Clydeclydebank if you have MySQL8+, you can run both the query in a single query using Window functions; but this won't be an optimal solution as indexes won't be used properly. Another option is to surround these two queries with LOCK TABLES <tablename> and UNLOCK TABLES. Third option and (best IMHO) is to rethink pagination. Please read: mariadb.com/kb/en/library/pagination-optimizationIrresponsive
In contrast to MySQL, MariaDB has not (yet) deprecated this feature.Mcnamara
B
23

When choosing the "best" approach, a more important consideration than speed might be the maintainability and correctness of your code. If so, SQL_CALC_FOUND_ROWS is preferable because you only need to maintain a single query. Using a single query completely precludes the possibility of a subtle difference between the main and count queries, which may lead to an inaccurate COUNT.

Boadicea answered 19/12, 2013 at 18:19 Comment(2)
This depends on your set up. If you're using some kind of ORM or query builder, it's very easy to use the same where criteria for both queries, swap the select fields for a count, and drop the limit. You should never write out the criteria twice.Mcburney
I would point out that I'd rather maintain code using two simple fairly standard, easy to understand SQL queries than one which uses a proprietary MySQL feature - which is worth noting is deprecated in newer MySQL versions.Heifetz
C
17

According to the following article: https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/

If you have an INDEX on your where clause (if id is indexed in your case), then it is better not to use SQL_CALC_FOUND_ROWS and use 2 queries instead, but if you don't have an index on what you put in your where clause (id in your case) then using SQL_CALC_FOUND_ROWS is more efficient.

Crusado answered 22/6, 2015 at 14:54 Comment(0)
A
10

IMHO, the reason why 2 queries

SELECT * FROM count_test WHERE b = 666 ORDER BY c LIMIT 5;
SELECT count(*) FROM count_test WHERE b = 666;

are faster than using SQL_CALC_FOUND_ROWS

SELECT SQL_CALC_FOUND_ROWS * FROM count_test WHERE b = 555 ORDER BY c LIMIT 5;

has to be seen as a particular case.

It in facts depends on the selectivity of the WHERE clause compared to the selectivity of the implicit one equivalent to the ORDER + LIMIT.

As Arvids told in comment (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-1174394), the fact that the EXPLAIN use, or not, a temporay table, should be a good base for knowing if SCFR will be faster or not.

But, as I added (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-8166482), the result really, really depends on the case. For a particular paginator, you could get to the conclusion that “for the 3 first pages, use 2 queries; for the following pages, use a SCFR” !

Almira answered 4/8, 2014 at 17:36 Comment(0)
F
6

Removing some unnecessary SQL and then COUNT(*) will be faster than SQL_CALC_FOUND_ROWS. Example:

SELECT Person.Id, Person.Name, Job.Description, Card.Number
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
LEFT JOIN Card ON Card.Person_Id = Person.Id
WHERE Job.Name = 'WEB Developer'
ORDER BY Person.Name

Then count without unnecessary part:

SELECT COUNT(*)
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
WHERE Job.Name = 'WEB Developer'
Fotinas answered 7/4, 2016 at 19:53 Comment(0)
E
4

There are other options for you to benchmark:

1.) A window function will return the actual size directly (tested in MariaDB):

SELECT 
  `mytable`.*,
  COUNT(*) OVER() AS `total_count`
FROM `mytable`
ORDER BY `mycol`
LIMIT 10, 20

2.) Thinking out of the box, most of the time users don't need to know the EXACT size of the table, an approximate is often good enough.

SELECT `TABLE_ROWS` AS `rows_approx`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = DATABASE()
  AND `TABLE_TYPE` = "BASE TABLE"
  AND `TABLE_NAME` = ?
Edmondson answered 29/4, 2019 at 19:16 Comment(0)
T
0

Simple example on table with 2.000.000 rows and query like this :

select fieldname 
from table_add 
where 
descryption_per like '%marihuana%' 
or addiction_per like '%alkohol%';

it is a full table scan every query - so it take time x 2. I mean "select count(*) from .....

Tessi answered 16/10, 2021 at 19:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.