MAX() OVER PARTITION BY in Oracle SQL
Asked Answered
C

4

5

I am trying to utilize the MAX() OVER PARTITION BY function to evaluate the most recent receipt for a specific part that my company has bought. Below is an example table of the information for a few parts from the last year:

| VEND_NUM | VEND_NAME    | RECEIPT_NUM | RECEIPT_ITEM | RECEIPT_DATE |
|----------|--------------|-------------|----------|--------------|
| 100      | SmallTech    | 2001        | 5844HAJ  | 11/22/2017   |
| 100      | SmallTech    | 3188        | 5521LRO  | 12/31/2017   |
| 200      | RealSolution | 5109        | 8715JUI  | 05/01/2017   |
| 100      | SmallTech    | 3232        | 8715JUI  | 11/01/2017   |
| 200      | RealSolution | 2101        | 4715TEN  | 01/01/2017   |

As you can see, the third and fourth row show two different vendors for the SAME part number.

Here is my current query:

WITH

-- various other subqueries above...

    AllData AS
    (
        SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
        FROM tblVend
            INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
        WHERE
            VEND_NUM = '100' OR VEND_NUM = '200' AND RECEIPT_DATE >= '01-Jan-2017'
    ),

    SELECT MAX(RECEIPT_DATE) OVER PARTITION BY(RECEIPT_ITEM) AS "Recent Date", RECEIPT_ITEM
    FROM AllData

My return set looks like:

| Recent Date | RECEIPT_ITEM |
|-------------|--------------|
| 11/22/2017  | 5844HAJ      |
| 12/31/2017  | 5521LRO      |
| 11/01/2017  | 8715JUI      |
| 11/01/2017  | 8715JUI      |
| 01/01/2017  | 4715TEN      |

However, it should look like this:

| Recent Date | RECEIPT_ITEM |
|-------------|--------------|
| 11/22/2017  | 5844HAJ      |
| 12/31/2017  | 5521LRO      |
| 11/01/2017  | 8715JUI      |
| 01/01/2017  | 4715TEN      |

Can anybody please offer advice as to what I'm doing wrong? It looks like it is simply replacing the most recent date, not giving me just the row I want that is most recent.

Ultimately, I would like for my table to look like this. However, I don't know how to use the MAX() or MAX() OVER PARTITION BY() functions properly to allow for this:

| VEND_NUM | VEND_NAME    | RECEIPT_NUM | RECEIPT_ITEM | RECEIPT_DATE |
|----------|--------------|-------------|----------|--------------|
| 100      | SmallTech    | 2001        | 5844HAJ  | 11/22/2017   |
| 100      | SmallTech    | 3188        | 5521LRO  | 12/31/2017   |
| 100      | SmallTech    | 3232        | 8715JUI  | 11/01/2017   |
| 200      | RealSolution | 2101        | 4715TEN  | 01/01/2017   |
Coquette answered 22/3, 2018 at 18:39 Comment(4)
maybe use partition by trim(receipt_item) .. there might have been an extra space(s) in that column.Boatswain
Hi @VamsiPrabhala, thanks a lot for the suggestion. Unfortunately, same thing...I don't know why my MAX() isn't working.Coquette
Sounds like you want to filter out all but the latest receipt for each part? If so, max() is not necessarily the way to do it.Swayne
@WilliamRobertson correct William, sorry for poor wording. I have been researching how to approach this problem for awhile, and I mistakenly thought max was my best approach.Coquette
V
7

Use window function ROW_NUMBER() OVER (PARTITION BY receipt_item ORDER BY receipt_date DESC) to assign a sequence number to each row. The row with the most recent receipt_date for a receipt_item will be numbered as 1.

WITH
-- various other subqueries above...

    AllData AS
    (
        SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE,
        ROW_NUMBER() OVER (PARTITION BY RECEIPT_ITEM ORDER BY RECEIPT_DATE DESC ) AS RN
        FROM tblVend
            INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
        WHERE
            VEND_NUM IN ( '100','200')  AND RECEIPT_DATE >= '01-Jan-2017'
    )
   SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
   FROM AllData WHERE RN = 1
Viper answered 22/3, 2018 at 18:49 Comment(10)
thank you very much for the explanation and the code. This does work. However, I believe I sold myself short in my question and I accept error for that. I would like to also, ultimately, grab the "RECEIPT_NUM" for which the "RECEIPT_ITEM" came in on. Is that possible to be added?Coquette
I've edited my question to reflect my ultimate end goalCoquette
Hi, @kc2018. Thanks for your edit. I keep getting "ORDER_BY not found in window function" errors, and I'm trying to debug now. It has to do with the "t' subqueryCoquette
Do you have to use cte?Viper
Frankly, and I'm sorry, I don't know what cte is. But what I can tell you is that I have no power to edit anything administrative on the DB side. I have to work with just being able to query and read from the database.Coquette
I also got a missing right parenthesis error before I edited your suggestion, just as an FYICoquette
CTE = common table expression; it's MS SQL's made up term for creating subqueries with the WITH expression.Chincapin
@Viper yes, I need to use CTE.Coquette
Moved the row_number() into the CTE. Corrected the syntax error on missing parenthesis. Also removed the comma after the closing ) close to the end. Give it a try?Viper
This answer could do with some explanation in order to be useful to others. Just pasting modified SQL doesn't aid understanding or make it obviously applicable to other areas. Add some explanatory detail and I'll upvote.Foehn
C
4

I see a couple of issues here. One, the syntax for using the aggregate function MAX() as an analytic function (which is what Oracle helpfully calls a window function) looks like this:

MAX(receipt_date) OVER ( PARTITION BY receipt_item )

(note the position of the parentheses). Second, from your desired result set, you don't actually want a window function, you want to aggregate. A window (or analytic) function will always return a row for each row in its partition; that's just the way it works. So I think what you want is this:

WITH
-- various other subqueries above...
AllData AS
(
    SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
      FROM tblVend
     INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
     WHERE ( VEND_NUM = '100' OR VEND_NUM = '200' ) AND RECEIPT_DATE >= DATE'2017-01-01'
)
SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, MAX(RECEIPT_DATE)
  FROM AllData
 GROUP BY VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM;

Now I made some small changes to the above, such as wrapping parentheses around the OR conditions (using IN ('100','200') might be even better) since AND takes precedence over OR (so your query would have gotten results where VEND_NUM = '100' OR ( VEND_NUM = '200' RECEIPT_DATE >= DATE'2017-01-01' ) ... but maybe that's what you wanted?).

Chincapin answered 22/3, 2018 at 19:42 Comment(3)
Hi, @DavidFaber. Thanks a lot for your answer, your inputs, and your explanations. Obviously I'm still a rookie but I am trying to learn. Your solution is akin to the one that kc had originally posted below. However, if ultimately I want to get the final table listed at the end of my question, how can I edit your example?Coquette
I made an update to my answer but I have to caution that I don't have a good sense of your data.Chincapin
Hey, @DavidFaber. I appreciate your input. Unfortunately, that did not yield the results I am looking for. I did not anticipate this problem being so hard.Coquette
Y
1

Just passing by but I think you have to format the date to a 'YYYY-MM-DD' format so that it doesn't consider the "time".

Yim answered 13/4, 2020 at 18:5 Comment(0)
R
0

THIS ANSWERS THE ORIGINAL VERSION OF THE QUESTION.

Your where clause should probably look like this:

 WHERE VEND_NUM IN ('100', '200') AND RECEIPT_DATE >= DATE '2017-01-01'

It is quite possible that what you want is simply:

SELECT DISTINCT RECEIPT_DATE, RECEIPT_ITEM
FROM tblVend INNER JOIN
     tblReceipt
     ON VEND_NUM = RECEIPT_VEND_NUM
WHERE VEND_NUM IN ('100', '200') AND RECEIPT_DATE >= DATE '2017-01-01';

At the very least, this returns what you want to return.

Republic answered 22/3, 2018 at 18:50 Comment(2)
I want to say thank you very much for the explanation and the code. This does work in the situation described above. However, I believe I sold myself short in my question and I accept error for that. I would like to also, ultimately, grab the "RECEIPT_NUM" for which the "RECEIPT_ITEM" came in on. Is that possible to be added? That way, I can use the "RECEIPT_NUM" in later subqueries to add in more data.Coquette
I've edited my question to reflect the intended end goal.Coquette

© 2022 - 2024 — McMap. All rights reserved.