Getting total row count from OFFSET / FETCH NEXT
Asked Answered
V

4

116

So, I've got a function that returns a number of records that I want to implement paging for on my website. It was suggested to me that I use the Offset/Fetch Next in SQL Server 2012 to accomplish this. On our website, we have an area that lists total number of records and what page you're on at the time.

Before, I was getting the entire record set and was able to build the paging on that programatically. But using the SQL way with FETCH NEXT X ROWS ONLY, I am only given back X rows, so I don't know what my total record set is and how to calculate my min and max pages. The only way I can tell of doing this is calling the function twice and doing a count of rows on the first, then running the second with FETCH NEXT. Is there a better way that won't have me running the query twice? I am trying to speed up performance, not slow it down.

Vidette answered 10/9, 2012 at 13:11 Comment(0)
F
142

You can use COUNT(*) OVER() ... here is a quick example using sys.all_objects:

DECLARE 
  @PageSize INT = 10, 
  @PageNum  INT = 1;

SELECT 
  name, object_id, 
  overall_count = COUNT(*) OVER()
FROM sys.all_objects
ORDER BY name
  OFFSET (@PageNum-1)*@PageSize ROWS
  FETCH NEXT @PageSize ROWS ONLY;

However, this should be reserved for small data sets; on larger sets, the performance can be abysmal. See this Paul White article for better alternatives, including maintaining indexed views (which only works if the result is unfiltered or you know WHERE clauses in advance) and using ROW_NUMBER() tricks.

Forestry answered 18/9, 2012 at 3:42 Comment(10)
In a table with 3,500,000 records, the COUNT(*) OVER() took 1 minute and 3 seconds. The approach described below by James Moberg took 13 seconds to retrieve the same data-set. I'm sure the Count Over approach works fine for smaller data-sets, but when you start getting really large it slows down considerably.Prosaic
Or you could just use COUNT(1) OVER() which is a helluvalot faster since it doesn't have to read the actual data from the table, like count(*) doesOleson
one last to consider, I used a couple of larger tables where I work, ran a cross join to explode them (approx. 10.9 mil records) and used this: `select (fields you need)Greenhouse
@Oleson There is absolutely no difference between COUNT(*) and COUNT(1).Forestry
@AaronBertrand Really? that must mean that you either have an index that includes all columns, or that this has been improved a lot since 2008R2. In that version, the count(*) works sequentially, meaning that first * (as in: all columns) is selected, then counted. If you did a count(1), you just select a constant, which is a lot faster than reading the actual data.Oleson
@idx No, that's not how that worked in 2008 R2 either, sorry. I've been using SQL Server since 6.5 and I don't recall a time when the engine wasn't smart enough to just scan the narrowest index for both COUNT(*) or COUNT(1). Certainly not since 2000. But hey, I have an instance of 2008 R2, can you set up a repro on SQLfiddle that demonstrates this difference you claim exists? I'm happy to try it.Forestry
on a sql server 2016 database, searching on a table with about 25 millions rows, paging over about 3000 results (with several joins, including to a table-valued function), this took milliseconds - awesome!Azov
I know it's years later, but just ran into this so it's still relevant. The reason I don't like this is because it returns the record count for each record returned. For anyone concerned with cutting down on network/data size, this isn't ideal. I'd rather do a SELECT @Count = COUNT(*) FROM ... and return @Count as an OUTPUT parameter. On another note, a test on Sql Server 2016 hosted on Azure with 1 million records took 10 seconds to retrieve the last 1000 rows using this method, while it only took 1 second to do the Count(*) by itself and then the OFFSET FETCH NEXT.Svoboda
This is really cool, but on my test dataset it is much slower than running the query twice (once for the paged result, and again using COUNT(*) for the whole query). FWIW, my query is over about 750k records with two unions and multiple criteria in SQL Server 2012.Chaoan
@Chaoan Yes, if you're adding criteria and returning more columns than covered by the index used to satisfy the criteria, that is probably true. You might check out this technique to see if you can get at just the keys and the count with your criteria, then page the rest of the data after you've picked the rows for the current page.Forestry
G
169

I encountered some performance issues using the COUNT() OVER() method. (I'm not sure if it was the server as it took 40 seconds to return 10 records and then later didn't have any issues.) This technique worked under all conditions without having to use COUNT() OVER() and accomplishes the same thing:

DECLARE 
    @PageSize INT = 10, 
    @PageNum  INT = 1;

WITH TempResult AS(
    SELECT ID, Name
    FROM Table
), TempCount AS (
    SELECT COUNT(*) AS MaxRows FROM TempResult
)
SELECT *
FROM TempResult, TempCount
ORDER BY TempResult.Name
    OFFSET (@PageNum-1)*@PageSize ROWS
    FETCH NEXT @PageSize ROWS ONLY
Ghee answered 21/11, 2013 at 19:37 Comment(11)
It would be really awesome if there was a possibility to save COUNT(*) value to a variable. I would be able to set it as an OUTPUT parameter of my Stored Procedure. Any ideas?Flyblown
Is there any way to get the count in a separate table? It seems like you can only use "TempResult" for the first preceding SELECT statement.Prosaic
declare @result table ( ID int, Name nvarchar(max), counter int ) then on the above statement the final outside of CTE select, can be : SELECT * into @result FROM TempResult, TempCount ORDER BY TempResult.Name OFFSET (@PageNum-1)*@PageSize ROWS FETCH NEXT @PageSize ROWS ONLY in case you needed row counts and data re-use through a procedureGreenhouse
Why does this work so well? In the first CTE, all the rows are selected, then pared down by the fetch. I would have guessed that selecting all the row in the first CTE would slow things down significantly. In any case, thanks for this!Illuminative
in my case it slowed down than COUNT(1) OVER().. maybe because a function in the select.Bravissimo
I noticed though that adding ROW_NUMBER() in first CTE.. SELECT * FROM TempResult, TempCount WHERE RowNumber BETWEEN 0 AND 10 is faster then offset etc. in my case.Precipitous
With this approach, if the Offset is larger than the count of records then NO values are returned. How do you return the "MaxRows/Count" in this case?Granduncle
What issue are you encountering? And what are you referring to as the "count of records"? If @PageSize INT=10 and only 6 records exist in the table, only 6 records are returned. Regarding using the post-query rows in scripting, I suppress display of next/prev controls if recordcount <= pagesize.Ghee
The other answer did not even work properly for me b/c I was selecting from a view with a where clause and it would give me total results for the view w/o considering my where conditions. This answer works perfect!Escalate
This works perfect for small database when rows are millions it take too much time.Lachman
This works, but I don't want to put TotalCount on my type I'm deserializaing to... I need to store it to a variable that get's picked up via an output parameter and I can't set a variable to it in that select query because it's retrieving data... So for now I'm been reduced to duplicating the same query twice, one without fetch/offset...Lepidote
F
142

You can use COUNT(*) OVER() ... here is a quick example using sys.all_objects:

DECLARE 
  @PageSize INT = 10, 
  @PageNum  INT = 1;

SELECT 
  name, object_id, 
  overall_count = COUNT(*) OVER()
FROM sys.all_objects
ORDER BY name
  OFFSET (@PageNum-1)*@PageSize ROWS
  FETCH NEXT @PageSize ROWS ONLY;

However, this should be reserved for small data sets; on larger sets, the performance can be abysmal. See this Paul White article for better alternatives, including maintaining indexed views (which only works if the result is unfiltered or you know WHERE clauses in advance) and using ROW_NUMBER() tricks.

Forestry answered 18/9, 2012 at 3:42 Comment(10)
In a table with 3,500,000 records, the COUNT(*) OVER() took 1 minute and 3 seconds. The approach described below by James Moberg took 13 seconds to retrieve the same data-set. I'm sure the Count Over approach works fine for smaller data-sets, but when you start getting really large it slows down considerably.Prosaic
Or you could just use COUNT(1) OVER() which is a helluvalot faster since it doesn't have to read the actual data from the table, like count(*) doesOleson
one last to consider, I used a couple of larger tables where I work, ran a cross join to explode them (approx. 10.9 mil records) and used this: `select (fields you need)Greenhouse
@Oleson There is absolutely no difference between COUNT(*) and COUNT(1).Forestry
@AaronBertrand Really? that must mean that you either have an index that includes all columns, or that this has been improved a lot since 2008R2. In that version, the count(*) works sequentially, meaning that first * (as in: all columns) is selected, then counted. If you did a count(1), you just select a constant, which is a lot faster than reading the actual data.Oleson
@idx No, that's not how that worked in 2008 R2 either, sorry. I've been using SQL Server since 6.5 and I don't recall a time when the engine wasn't smart enough to just scan the narrowest index for both COUNT(*) or COUNT(1). Certainly not since 2000. But hey, I have an instance of 2008 R2, can you set up a repro on SQLfiddle that demonstrates this difference you claim exists? I'm happy to try it.Forestry
on a sql server 2016 database, searching on a table with about 25 millions rows, paging over about 3000 results (with several joins, including to a table-valued function), this took milliseconds - awesome!Azov
I know it's years later, but just ran into this so it's still relevant. The reason I don't like this is because it returns the record count for each record returned. For anyone concerned with cutting down on network/data size, this isn't ideal. I'd rather do a SELECT @Count = COUNT(*) FROM ... and return @Count as an OUTPUT parameter. On another note, a test on Sql Server 2016 hosted on Azure with 1 million records took 10 seconds to retrieve the last 1000 rows using this method, while it only took 1 second to do the Count(*) by itself and then the OFFSET FETCH NEXT.Svoboda
This is really cool, but on my test dataset it is much slower than running the query twice (once for the paged result, and again using COUNT(*) for the whole query). FWIW, my query is over about 750k records with two unions and multiple criteria in SQL Server 2012.Chaoan
@Chaoan Yes, if you're adding criteria and returning more columns than covered by the index used to satisfy the criteria, that is probably true. You might check out this technique to see if you can get at just the keys and the count with your criteria, then page the rest of the data after you've picked the rows for the current page.Forestry
C
4

Apparently results can vary vastly depending on the query. I tested my case with these results: (8 joins, 2 sub queries, 5800 rows in distinct result, 5900 non-distinct):

  • ~0.820 sec using COUNT(1) OVER() (Aaron Bertrand's answer, but with wrong results*)
  • ~0.850 sec using #TEMP table.
  • ~1.590 sec WITH .. AS (James Moberg's anser)
  • ~1.600 sec running twice (first time without ordering, just to count)

*In my case Aaron Bertrand's answer did not work out because COUNT(1) OVER() seems to include the rows filtered out by DISTINCT.

Using a temp table:

DECLARE 
  @PageSize INT = 10, 
  @PageNum  INT = 1;
 
SELECT
  name, object_id
INTO #MY_TEMP
FROM sys.all_objects

SELECT *
FROM #MY_TEMP
ORDER BY name
  OFFSET (@PageNum-1)*@PageSize ROWS
  FETCH NEXT @PageSize ROWS ONLY;

SELECT COUNT(1) FROM #MY_TEMP
-- or
-- SELECT @MY_OUTPUT_PARAM = COUNT(1) FROM #MY_TEMP

DROP TABLE #MY_TEMP

Nice thing about the temp table is that the count can be separated into a different result or output parameter.

Cayuse answered 29/3, 2021 at 16:1 Comment(0)
B
1

Based on James Moberg's answer:

This is an alternative using Row_Number(), if you don't have SQL server 2012 and you can't use OFFSET

DECLARE 
    @PageNumEnd INT = 10, 
    @PageNum  INT = 1;

WITH TempResult AS(
    SELECT ID, NAME
    FROM Tabla
), TempCount AS (
    SELECT COUNT(*) AS MaxRows FROM TempResult
)

select * 
from
(
    SELECT
     ROW_NUMBER() OVER ( ORDER BY PolizaId DESC) AS 'NumeroRenglon', 
     MaxRows, 
     ID,
     Name
    FROM TempResult, TempCount

)resultados
WHERE   NumeroRenglon >= @PageNum
    AND NumeroRenglon <= @PageNumEnd
ORDER BY NumeroRenglon
Bashee answered 27/4, 2016 at 23:11 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.