Implement Rank without using analytic function

Asked 20/10, 2017 at 19:44 Answered 6/2, 2019 at 5:8

I am wondering if there is a method to implement SQL analytic functions without using the inbuilt functions.

SELECT *,
    ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rownum,
    DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS denserank,
    RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk
FROM emp;

Chronicles answered 20/10, 2017 at 19:44 Comment(2)

I dont want to do it in mysql. Will SELF JOIN work? – Chronicles 20/10, 2017 at 19:47

great to see you did the legwork already on this question. You might consider accepting an answer (especially take look at Gordon Linoff's excellent one). – Exceeding 20/4, 2019 at 22:27

Here are the three equivalent expressions:

select emp.*,
       (select count(*)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              (emp2.salary > emp.salary or
               emp2.salary = emp.salary and emp2.emp_id <= emp.emp_id
              )
       ) as "row_number",
       (select 1 + count(*)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              emp2.salary > emp.salary 
              )
       ) as "rank",
       (select count(distinct salary)
        from emp emp2
        where emp2.dept_id = emp.dept_id and
              emp2.salary >= emp.salary
       ) as "dense_rank",
from emp;

This assumes the existence of an emp_id to make the rows unique for "row_number".

Charter answered 20/10, 2017 at 20:2 Comment(5)

@teja . . . All three of these should work, regardless of whether or not there are duplicates. row_number() depends on the existence of a unique key in the table, as described in the answer. – Charter 21/10, 2017 at 19:12

Is there an another solution to do this without use of co related sub query – Autoicous 10/2, 2019 at 14:50

@Autoicous . . . Questions should be asked as questions not in comments, along with sample data, desired results, and a database tag. – Charter 10/2, 2019 at 22:1

I was looking for an authoritative answer to this precise question (since Spark Structured Streaming does not support windowing/analytic functions on a non-timestamp column) . Lo and behold from the horse's mouth Gordon Linoff – Exceeding 20/4, 2019 at 22:25

Ouch- just realized the above use correlated subqueries - which are not supported in spark structured streaming .. #55778182 – Exceeding 20/4, 2019 at 23:21

This would work for all cases

select DEPT_ID, SALARY,

    (select count(*)+1  from emp r where r.SALARY>o.SALARY and r.dept_id=o.dept_id) **rank**,

    (select count(distinct SALARY )+1  from emp r where r.SALARY>o.SALARY and r.dept_id=o.dept_id) *d_rank*,

    (select count(*)+1  from (select x.*,rownum rn from ( select emp.* from emp  order by DEPT_ID asc,salary desc ) x) r where r.rn<o.rn and r.dept_id=o.dept_id) **rownumm**

from (select x.*,rownum rn from ( select emp.* from emp  order by DEPT_ID asc,salary desc ) x) o 

order by DEPT_ID,salary desc;

for rank:- calculated using (count of (values less than current rows)+1

for dense rank:- same as rank (count distinct value less than current rows)+1

row_number:- create the nested query by generating rownum for each row which will be distinct for all rows. Now on top of that do the same logic as rank (count of values greater than previous rownum (rownum of select subquery))+1

Vaasa answered 6/2, 2019 at 5:8 Comment(2)

Please add some explanation why your code will work for all the cases. So, others can learn from it. – Versed 6/2, 2019 at 5:37

sure Jayson I am doing it – Vaasa 6/2, 2019 at 17:31

You can do this with a correlated sub-query.

select dept_id,salary,
(select count(*) from emp e1 where e1.dept_id=e.dept_id and e1.salary>=e.salary) as rnum
from emp e

This works well when there are no ties.

Ambidexter answered 20/10, 2017 at 19:48 Comment(4)

How can I implement rownum when there are ties? – Chronicles 20/10, 2017 at 20:25

Row number with ties is a dense rank. – Resentment 20/10, 2017 at 20:31

@RossBush How can it be a dense rank? If you use ROW_NUMBER() OVER ( PARTITION BY ORDER BY ) it would assign unique numbers even if there are dups.. – Chronicles 20/10, 2017 at 20:34

I meant if you ordered by salary within a department and gave each member of a department a row number writhing that order, however, you kept a placeholder for ties (each tie gets the same row number), then that is row_number with ties by salary and dept_id. – Resentment 20/10, 2017 at 20:44

Recommended topics

Hot tags