Is there any performance issue using Row_Number to implement table paging in Sql Server 2008?

3

14

I want to implement table paging using this method:

DECLARE @PageNum int = 2;
DECLARE @PageSize int = 10;

WITH OrdersRN AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY OrderDate, OrderID) AS RowNum
          ,*
      FROM dbo.Orders
)

SELECT * 
  FROM OrdersRN
 WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1 
                  AND @PageNum * @PageSize
 ORDER BY OrderDate ,OrderID;

Is there anything I should be aware of? The table has millions of records.

Thanks.

EDIT: After using the suggested ROWCOUNT method for some time (it is really fast), I had to switch back to the ROW_NUMBER method because of its greater flexibility. I am also very happy with its speed so far (I am working with a view that has more than 1M records and 10 columns). To page over an arbitrary query, I use the following modification:

CREATE PROCEDURE [dbo].[PageSelect]
(
  @Sql nvarchar(512),            -- filter predicate, concatenated as-is (injection risk)
  @OrderBy nvarchar(128) = 'Id', -- ORDER BY list, concatenated as-is (injection risk)
  @PageNum int = 1,
  @PageSize int = 0
)
AS
BEGIN
 SET NOCOUNT ON

 DECLARE @tsql nvarchar(1024)
 DECLARE @i int, @j int

 IF (@PageSize <= 0) OR (@PageSize > 10000)
  SET @PageSize = 10000  -- never return more than 10K records

 SET @i = (@PageNum - 1) * @PageSize + 1  -- first row number of the page
 SET @j = @PageNum * @PageSize            -- last row number of the page

 SET @tsql =
 'WITH MyTableOrViewRN AS
 (
  SELECT ROW_NUMBER() OVER(ORDER BY ' + @OrderBy + ') AS RowNum
     ,*
    FROM MyTableOrView
    WHERE ' + @Sql + '
 )
 SELECT *
  FROM MyTableOrViewRN
  WHERE RowNum BETWEEN ' + CAST(@i AS varchar(10)) + ' AND ' + CAST(@j AS varchar(10)) + '
  ORDER BY RowNum'

 EXEC(@tsql)
END

If you use this procedure, make sure you guard against SQL injection.
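One way to shrink the injection surface is sketched below, under stated assumptions. The procedure name PageSelectSafe is hypothetical. The sort key is restricted to a single column name that is checked against sys.columns and wrapped with QUOTENAME, and the page bounds are passed as real parameters through sp_executesql. The free-form @Sql filter is left out on purpose, because an arbitrary predicate string cannot be parameterized generically.

```sql
-- Hypothetical variant of PageSelect; names are illustrative only.
CREATE PROCEDURE [dbo].[PageSelectSafe]
(
  @OrderBy  sysname = N'Id',   -- a single column name, not an arbitrary expression
  @PageNum  int = 1,
  @PageSize int = 100
)
AS
BEGIN
  SET NOCOUNT ON;

  -- Reject anything that is not an actual column of the target object.
  IF NOT EXISTS (SELECT 1
                   FROM sys.columns
                  WHERE object_id = OBJECT_ID(N'dbo.MyTableOrView')
                    AND name = @OrderBy)
  BEGIN
    RAISERROR(N'Unknown sort column.', 16, 1);
    RETURN;
  END;

  IF @PageNum < 1 SET @PageNum = 1;
  IF (@PageSize <= 0) OR (@PageSize > 10000) SET @PageSize = 10000;

  DECLARE @first int, @last int, @tsql nvarchar(max);
  SET @first = (@PageNum - 1) * @PageSize + 1;
  SET @last  = @PageNum * @PageSize;

  -- QUOTENAME brackets the identifier; the page bounds stay as parameters.
  SET @tsql = N'
  WITH MyTableOrViewRN AS
  (
    SELECT ROW_NUMBER() OVER (ORDER BY ' + QUOTENAME(@OrderBy) + N') AS RowNum, *
      FROM dbo.MyTableOrView
  )
  SELECT *
    FROM MyTableOrViewRN
   WHERE RowNum BETWEEN @first AND @last
   ORDER BY RowNum;';

  EXEC sp_executesql @tsql,
       N'@first int, @last int',
       @first = @first, @last = @last;
END
```

A call such as EXEC dbo.PageSelectSafe @OrderBy = N'OrderDate', @PageNum = 2, @PageSize = 10 would then page by that column, assuming it exists on the target object. Sorting by a single column is only deterministic if that column is unique.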

Trinary answered 22/2, 2010 at 1:2 Comment(3)
Exact Duplicate: #1897936Felker
Pony, I am not very happy with that answer, mostly because it doesn't even mention Row_Number(). The question is, again: I am using Row_Number(). What can you tell me about its performance compared to other methods? (So, don't offer me other methods.)Trinary
BTW, Pony, I find remarks like this very rude. I am sure I know what a good answer to my question is; I don't need you to tell me that. Typical admin BS.Trinary
N
20

I've written about this a few times actually; ROW_NUMBER is by far the most flexible and easy-to-use, and performance is good, but for extremely large data sets it is not always the best. SQL Server still needs to sort the data and the sort can get pretty expensive.

There's a different approach here that uses a couple of variables and SET ROWCOUNT and is extremely fast, provided that you have the right indexes. It's old, but as far as I know, it's still the most efficient. Basically you can do a totally naïve SELECT with SET ROWCOUNT and SQL Server is able to optimize away most of the real work; the plan and cost end up being similar to two MAX/MIN queries, which is usually a great deal faster than even a single windowing query. For very large data sets this runs in less than 1/10th the time.
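For readers who have not seen the technique, here is a minimal sketch of what the SET ROWCOUNT approach might look like. It is not taken from the linked article. It assumes dbo.Orders has a unique, indexed OrderID and that pages are ordered by OrderID alone, and it relies on variable assignment in an ordered SELECT leaving the variable set to the last row read.

```sql
-- Sketch only: assumes dbo.Orders has a unique, indexed OrderID and that
-- pages are ordered by OrderID alone.
DECLARE @PageNum int, @PageSize int, @FirstRow int, @FirstID int;
SET @PageNum  = 2;
SET @PageSize = 10;
SET @FirstRow = (@PageNum - 1) * @PageSize + 1;

-- Step 1: stop after @FirstRow rows; the variable ends up holding the
-- key of the first row on the requested page.
SET ROWCOUNT @FirstRow;
SELECT @FirstID = OrderID
  FROM dbo.Orders
 ORDER BY OrderID;

-- Step 2: read one page starting at that key.
SET ROWCOUNT @PageSize;
SELECT *
  FROM dbo.Orders
 WHERE OrderID >= @FirstID
 ORDER BY OrderID;

-- Always reset, otherwise later statements in the session stay limited.
SET ROWCOUNT 0;
```

Note that if the requested page starts past the end of the table, @FirstID ends up holding the key of the last row, so a production version would need an explicit bounds check.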

Having said that, I still always recommend ROW_NUMBER when people ask about how to implement things like paging or groupwise maximums, because of how easy it is to use. I would only start looking at alternatives like the above if you start to notice slowdowns with ROW_NUMBER.

Northamptonshire answered 22/2, 2010 at 6:43 Comment(7)
First acceptable answer. Thanks m8. I don't need the best of the best. I need good.Trinary
I was using this method with ROWCOUNT and I am very happy with it; it's extremely fast. However, I can't make it work when I have a custom ORDER BY with non-identity columns. Do you know a way around it?Trinary
@majkinetor: Do you mean simply that you want to sort/page by a field other than the ID, or that the table has no ID column or sequential key at all?Northamptonshire
Actually, I guess it doesn't really matter... it definitely works with non-ID columns, but you need to change everything - my suspicion is that you changed the ORDER BY in both lookups but still chose to save and filter by the ID; you need to change the query to save and filter by the actual sort column.Northamptonshire
The problem is 'keeping the last index' so as to know where to continue for the next page. If you don't have identity columns, you don't know where to continue. For instance, imagine I am returning 100 rows of 1 column containing a single constant. Even if the 2nd column is an identity, I didn't find a way to use it to mark the next subset to return. I switched to ROW_NUMBER because of that.Trinary
@majkinetor: You just store the sort column instead of the ID. If you're sorting by a column called Name, then you'd save a @first_name and in the second query write WHERE Name >= @first_name. If the column might have duplicates then you might need to save both @first_name and @first_id, and write WHERE Name >= @first_name AND ID >= @first_id in order to prevent duplicate items in the paged results. It works - try it!Northamptonshire
I did. WHERE Name >= @first_name AND ID >= @first_id simply doesn't work because the selection doesn't have to be in any order. For instance, take the recordset "204, 13", "204, 1", "204, 76", "204, 4". The second column is the identity. PageSize is 2 records. This procedure depends on an ordered column. In contrast, the ROW_NUMBER method produces an ordered column which you can use no matter what the query looks like. It also works equally fast here (both procedures return results in 0ms on my 1M-record single table).Trinary
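A note on the disagreement in the comments above: when the sort column has duplicates, the two saved values have to be compared as a pair rather than with two independent >= tests, and the result has to be ordered by both columns. A minimal sketch, assuming a hypothetical table dbo.People(ID, Name) with a unique ID, and using TOP in place of SET ROWCOUNT for brevity:

```sql
-- Assumes a hypothetical table dbo.People(ID int unique, Name varchar(50))
-- and that the page is ordered by Name, then ID.
DECLARE @first_name varchar(50), @first_id int, @PageSize int;
SET @PageSize = 2;

-- In practice these two values would be captured by the first lookup.
SET @first_name = '204';
SET @first_id   = 13;

SELECT TOP (@PageSize) ID, Name
  FROM dbo.People
 WHERE Name > @first_name
    OR (Name = @first_name AND ID >= @first_id)
 ORDER BY Name, ID;
```

The ID only acts as a tie-breaker among rows with the same Name, which is why it must appear in both the predicate and the ORDER BY.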
B
9

Recently, I used paging in a data warehouse environment with a star schema. I found that the performance was very good when I restricted the CTE to only query the rows necessary to determine the ROW_NUMBER. I had the CTE return the ROW_NUMBER plus the primary keys of the other rows that helped determine the row number.

In the main query, I referenced the ROW_NUMBER for paging, and then joined to the other tables based on the other primary keys from the CTE. I found that the joins were only performed on the rows that satisfied the WHERE clause in the outer query, saving a great deal of time.
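The pattern described in this answer might look roughly like the sketch below. All table and column names (FactSales, DimCustomer, DimProduct, and so on) are hypothetical and not taken from the answer. The CTE touches only the sort columns and keys, and the joins to the wider tables happen in the outer query, where the RowNum filter applies.

```sql
-- Sketch only: FactSales, DimCustomer, DimProduct and their columns are
-- hypothetical names, not taken from the answer.
DECLARE @PageNum int, @PageSize int;
SET @PageNum  = 2;
SET @PageSize = 10;

WITH Keys AS
(
  -- Narrow CTE: only the row number and the keys needed for later joins.
  SELECT ROW_NUMBER() OVER (ORDER BY f.SaleDate, f.SaleID) AS RowNum,
         f.SaleID, f.CustomerID, f.ProductID
    FROM dbo.FactSales AS f
)
SELECT k.RowNum, k.SaleID, c.CustomerName, p.ProductName
  FROM Keys AS k
  JOIN dbo.DimCustomer AS c ON c.CustomerID = k.CustomerID
  JOIN dbo.DimProduct  AS p ON p.ProductID  = k.ProductID
 WHERE k.RowNum BETWEEN (@PageNum - 1) * @PageSize + 1
                    AND @PageNum * @PageSize
 ORDER BY k.RowNum;
```

Whether the optimizer actually defers the joins to the filtered rows depends on the plan, so it is worth checking the execution plan on real data.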

Bertiebertila answered 22/2, 2010 at 1:8 Comment(1)
That should make it even less of a problem. Try it, then look at the execution plan.Bertiebertila
W
-2

Try this solution; it may work better. Adapt it to your needs.

CREATE PROCEDURE sp_PagedItems
    (
     @Page int,
     @RecsPerPage int
    )
AS

-- We don't want to return the # of rows inserted
-- into our temporary table, so turn NOCOUNT ON
SET NOCOUNT ON


--Create a temporary table
CREATE TABLE #TempItems
(
    ID int IDENTITY,
    Name varchar(50),
    Price money
)


-- Insert the rows from tblItems into the temp. table
INSERT INTO #TempItems (Name, Price)
SELECT Name,Price FROM tblItem ORDER BY Price

-- Find out the first and last record we want
DECLARE @FirstRec int, @LastRec int
SELECT @FirstRec = (@Page - 1) * @RecsPerPage
SELECT @LastRec = (@Page * @RecsPerPage + 1)

-- Now, return the set of paged records, plus an indication of whether we
-- have more records or not!
SELECT *,
       MoreRecords =
    (
     SELECT COUNT(*)
     FROM #TempItems TI
     WHERE TI.ID >= @LastRec
    )
FROM #TempItems
WHERE ID > @FirstRec AND ID < @LastRec


-- Turn NOCOUNT back OFF
SET NOCOUNT OFF
Washcloth answered 22/2, 2010 at 6:19 Comment(2)
Copying the entire table into a temp table... with no index? Yeah, that's gonna be slow. Reeeeal slow. Hard to imagine a worse approach, TBH.Northamptonshire
Also, notice the "Row_Number" problem in the question. Although I don't find this useful (no offense), I'll give you a plus just to make OMG Ponnies and his friends happy.Trinary
