ROW_NUMBER Without ORDER BY
Asked Answered
H

4

101

I've to add row number in my existing query so that I can track how much data has been added into Redis. If my query failed so I can start from that row no which is updated in other table.

Query to get data start after 1000 row from table

SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select 1)) as rn ) as X where rn > 1000

Query is working fine. If any way that I can get the row no without using order by.

What is select 1 here?

Is the query optimized or I can do it by other ways. Please provide the better solution.

Heredity answered 22/5, 2017 at 6:7 Comment(0)
A
157

There is no need to worry about specifying constant in the ORDER BY expression. The following is quoted from the Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions written by Itzik Ben-Gan (it was available for free download from Microsoft free e-books site):

As mentioned, a window order clause is mandatory, and SQL Server doesn’t allow the ordering to be based on a constant—for example, ORDER BY NULL. But surprisingly, when passing an expression based on a subquery that returns a constant—for example, ORDER BY (SELECT NULL)—SQL Server will accept it. At the same time, the optimizer un-nests, or expands, the expression and realizes that the ordering is the same for all rows. Therefore, it removes the ordering requirement from the input data. Here’s a complete query demonstrating this technique:

SELECT actid, tranid, val,
 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Transactions;

enter image description here

Observe in the properties of the Index Scan iterator that the Ordered property is False, meaning that the iterator is not required to return the data in index key order


The above means that when you are using constant ordering is not performed. I will strongly recommend to read the book as Itzik Ben-Gan describes in depth how the window functions are working and how to optimize various of cases when they are used.

Adequate answered 22/5, 2017 at 6:56 Comment(2)
so query, SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select NULL)) as rn ) as X where rn > 1000 is always give the correct result. means order remains same for every exection of queryHeredity
@Heredity No, the order does not remain the same for every execution of the query. Using (SELECT constant) means that there is NO ORDER. You are selecting a specific amount of the data - you can execute the query 1 million times and it to return the same data, but there is no guarantee for this. Without specific ordering, the result is not deterministic.Adequate
G
18

You can use any literal value, for example:

order by (select 0)

order by (select null)

order by (select 'test')
Garretgarreth answered 22/5, 2017 at 6:37 Comment(3)
gotqn said in comment that select 1 means random order. So if add condition in query where row no > 1000. So it may be give previous record because it is giving random order. Or it is correct to useHeredity
every time select 0 will give the same order when I execute the query?Heredity
@Heredity - maybe it will sometimes, but it is not guaranteed since there is then no order. To guarantee the same results, use some data that is unique for each record in the ORDER BY clause.Dispatch
L
16

Try just order by 1. Read the error message. Then reinstate the order by (select 1). Realise that whoever wrote this has, at some point, read the error message and then decided that the right thing to do is to trick the system into not raising an error rather than realising the fundamental truth that the error was trying to alert them to.

Tables have no inherent order. If you want some form of ordering that you can rely upon, it's up to you to provide enough deterministic expression(s) to any ORDER BY clause such that each row is uniquely identified and ordered.

Anything else, including tricking the system into not emitting errors, is hoping that the system will do something sensible without using the tools provided to you to ensure that it does something sensible - a well specified ORDER BY clause.

Liguria answered 22/5, 2017 at 6:13 Comment(11)
so I am doing right, select 1.? I did not find anything else for this problemHeredity
I partially disagree with the statement, "Tables have no inherent order." I've found (when selecting from a Table) it will return its Records using the Clustered-Index Order every time. Now start joining tables and making a more complicated query, then I think the Default-Order may be up to whatever route the Query-Optimizer takes. BTW: The Questioner is not asking about leaving the Order-By Clause out all together. You could still have one, she just wanted to know if it could be left unspecified in the Row_Number() expression, which it can: ROW_NUMBER() OVER (ORDER BY (SELECT NULL))Orang
@Orang - it's a topic that has been done to death. There is no inherent user observable order. Often a select from a single table will follow the clustered index order but SQL Server offers no guarantee that it's the case. Even in this case, though, you can end up with a carousel scan - your query piggy-backs on another query that has already started a table scan. You query starts getting the results from the middle of the table because that's where the scan was and then will start a second scan to finish off. Out of order results even with just the clustered index defined.Liguria
Wow, I had no clue. Thanks for the super-informative response! After all these years, I feel like a newb again.Orang
ROW_NUMBER is still useful wo. orderDissert
@Dissert Can you please provide a working example of ROW_NUMBER without an order by clause?Kabul
@Grimm, I'm confused. Please elaborateDissert
@Dissert If I understood you correctly, you said that ROW_NUMBER is useful without an order clause. This seems to contradict the documentation (and this thread) where this clause is mandatory. That's why I was confused and asked for an example of the use of ROW_NUMBER without an order clause.Kabul
@Kabul no I just pointed out that ROW_NUMBER() is still useful, even without deterministic orderDissert
@Dissert Please excuse me - I misunderstood then. It's a pity that English is not as clear as TSQL. ;-)Kabul
@Grimm, no worries. It wasn't super obvious. Don't even remember the exact use I had for ROW_NUMBER() OVER(ORDER BY (SELECT 1)) but I think it's useful in cases where you want to assign each row a unique number in a query, without the penalty of a full table sort. It's an esoteric problem but in such case it's usefulDissert
L
6

What is select 1 here?

In this scenario, the author of query does not really have any particular sorting in mind. ROW_NUMBER requires ORDER BY clause so providing it is a way to satisfy the parser.

Sorting by "constant" will create "undeterministic" order(query optimizer is able to choose whatever order it found suitable).

Easiest way to think about it is as:

ROW_NUMBER() OVER(ORDER BY 1)    -- error
ROW_NUMBER() OVER(ORDER BY NULL) -- error

There are few possible scenarios to provide constant expression to "trick" query optimizer:

ROW_NUMBER() OVER(ORDER BY (SELECT 1)) -- already presented

Other options:

ROW_NUMBER() OVER(ORDER BY 1/0)       -- should not be used
ROW_NUMBER() OVER(ORDER BY @@SPID)
ROW_NUMBER() OVER(ORDER BY DB_ID())
ROW_NUMBER() OVER(ORDER BY USER_ID())

db<>fiddle demo

Laundress answered 22/11, 2020 at 11:44 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.