Getting values relating to the max and min rows in Oracle
Asked Answered
V

2

5

In Oracle 11g we need to be able to query a table to pull out information from rows with the highest and lowest values in a certain group. For example using the EMP table we'd like to find the name of person with the highest salary and the name of the person with the lowest salary in each department

DEPTNO   MAX_SAL    MAX_EARNER    MIN_SAL    MIN_EARNER
-------------------------------------------------------
10       5000       KING          1300       MILLER
20       3000       FORD          2975       JONES
etc

(if there are two or more staff with the highest or lowest salaray we want to always return the first in alphabetical order).

A Previous Post discussed how to get a value for just the maximum but not both max and min.

We have an untidy solution at the moment based on the link above then applying subsiquent queries but performance is important to us. I predict a good solution will also require analytic functions and possibly a pivot to coalesce the multiple rows into single rows.

Any help greatly appreciated! Richard

Vacuity answered 4/5, 2011 at 8:35 Comment(2)
which previous question? it helps if you give us a link.Phocaea
Sorry link was lost - #121887Vacuity
P
5

This is easily solvable with analytic functions. As you can see, there are two employees earning the maximum salary in DEPT 20; this is an important detail, as some common solutions to this kind of problem miss that information.

SQL> select ename
  2             , deptno
  3             , sal
  4  from (
  5      select ename
  6             , deptno
  7             , sal
  8             , max (sal) over (partition by deptno) max_sal
  9             , min (sal) over (partition by deptno) min_sal
 10      from emp
 11      )
 12  where sal = max_sal
 13  or    sal = min_sal
 14  order by deptno, sal
 15  /

ENAME          DEPTNO        SAL
---------- ---------- ----------
KISHORE            10       1300
SCHNEIDER          10       5000
CLARKE             20        800
RIGBY              20       3000
GASPAROTTO         20       3000
HALL               30        950
LIRA               30       3750
TRICHLER           50       3500
FEUERSTEIN         50       4500

9 rows selected.

SQL>

Oops, I missed an important detail about the result format. My data won't fit the requested output, because there are two employees earning the maximum salary. So this query, which I admit is a bit awkward, gives us the required layout. The MIN() on the employee names returns the alphabetical order :

SQL> select
  2         deptno
  3         , max (case when sal = min_sal then min_sal else null end ) as min_sal
  4         , min (case when sal = min_sal then ename else null end ) as min_name
  5         , max (case when sal = max_sal then max_sal else null end ) as max_sal
  6         , min (case when sal = max_sal then ename else null end ) as max_name
  7  from (
  8      select ename
  9             , deptno
 10             , sal
 11             , max (sal) over (partition by deptno) max_sal
 12             , min (sal) over (partition by deptno) min_sal
 13      from emp
 14      )
 15  where sal = max_sal
 16  or    sal = min_sal
 17  group by deptno
 18  order by deptno
 19  /

    DEPTNO    MIN_SAL MIN_NAME      MAX_SAL MAX_NAME
---------- ---------- ---------- ---------- ----------
        10       1300 KISHORE          5000 SCHNEIDER
        20        800 CLARKE           3000 GASPAROTTO
        30        950 HALL             3750 LIRA
        50       3500 TRICHLER         4500 FEUERSTEIN

SQL>

I don't like this solution. Most datasets will contain such clashes, and we need to acknowledge them. Filtering the result on the basis of some unrelated criteria to fit a Procrustean report layout is misleading. I would prefer a report layout which reflected the whole dataset. Ultimately it depends on the business purpose which the query serves. And, of course, the customer is always right 8-)

Phocaea answered 4/5, 2011 at 9:26 Comment(2)
Thanks APC, that's a good solution. You might guess that in our application we're not actually listing people's salaries, and it is perfectly acceptable to display any example of the thing that we're getting the best and the worst of if there are multiple.Vacuity
APC, can I just check - about the "max" in the outer query for selecting the salaries. I think that is just an aggregate function to pick the non-null value from the case statement and it could be replaced by other functions - eg SUM. Is this correct? I think SUM might be a marginally more intuitive in this instance.Vacuity
O
1

You can use the below query

SELECT
  dept,
  max_sal,
  (SELECT emp_name FROM emp WHERE salary = max_sal AND rownum =1) max_earner,
  min_sal,
  (SELECT emp_name FROM emp WHERE salary = min_sal AND rownum =1) min_earner
FROM
  (SELECT
    dept,
    MAX(salary) max_sal,
    MIN(salary) min_sal
  FROM emp
  GROUP BY dept);

assuming your table looks like this:

CREATE TABLE emp
(
    dept NUMBER,
    emp_name VARCHAR2(20 BYTE),
    salary NUMBER
);

Update

In order to acheive your other requirement "if there are two or more staff with the highest or lowest salaray we want to always return the first in alphabetical order", you may tweak the query a bit as below (I'm sure there's scope for improvement here):

SELECT
  dept,
  max_sal,
  (select emp_name from (SELECT * FROM emp order by emp_name asc) WHERE salary = max_sal AND rownum =1) max_earner,
  min_sal,
  (select emp_name from (SELECT * FROM emp order by emp_name asc) WHERE salary = min_sal AND rownum =1) min_earner
FROM
  (SELECT
    dept,
    MAX(salary) max_sal,
    MIN(salary) min_sal
  FROM emp
  GROUP BY dept);
Organization answered 4/5, 2011 at 9:8 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.