Interesting Oracle analytic query challenge
Asked Answered
A

2

6

I'm fairly experienced with Oracle analytic functions but this one has stumped me. I'll kick myself if there's an obvious solution :)

I have a table, JOURNAL, which records Inserts, Updates and Deletes on another table.

The table that it's a journal for is BOND_PAYMENTS, which represents links between PAYMENTS and BONDS; it stores the amount of money (AMOUNT) that was allocated to a particular bond (identified by BOND_NUMBER) from a particular payment (identified by PAYMENT_ID). In addition, it records what aspect of the bond it was allocated to (BOP_DOMAIN) which might be 'BON', 'PET', or some other code. The BOND_PAYMENTS table has a surrogate key (BOP_ID).

Therefore, my journal table will typically have 1 or more records for each BOP_ID - firstly, an INSert, followed perhaps by some UPDates, followed perhaps by a DELete.

Here is the JOURNAL table:

CREATE TABLE JOURNAL
( JN_DATE_TIME  DATE         NOT NULL,
  JN_OPERATION  VARCHAR2(3)  NOT NULL,
  BOP_ID        NUMBER(9)    NOT NULL,
  PAYMENT_ID    NUMBER(9)    NOT NULL,
  BOND_NUMBER   VARCHAR2(20) NOT NULL,
  BOP_DOMAIN    VARCHAR2(10) NOT NULL,
  AMOUNT        NUMBER(14,2) NOT NULL
);

Here is some sample data:

INSERT INTO JOURNAL VALUES (TO_DATE('01/01/2010','DD/MM/YYYY'),'INS',1242043,1003700,'9995/10','BON',1800);
INSERT INTO JOURNAL VALUES (TO_DATE('03/01/2010','DD/MM/YYYY'),'INS',1242046,1003700,'9998/10','BON',1700);
INSERT INTO JOURNAL VALUES (TO_DATE('04/01/2010','DD/MM/YYYY'),'INS',1242048,1003700,'9999/10','BON',1800);
INSERT INTO JOURNAL VALUES (TO_DATE('05/01/2010','DD/MM/YYYY'),'INS',1242052,1003700,'10003/10','BON',1600);
INSERT INTO JOURNAL VALUES (TO_DATE('08/01/2010','DD/MM/YYYY'),'INS',1242058,1003700,'9998/10','BON',100);
INSERT INTO JOURNAL VALUES (TO_DATE('09/01/2010','DD/MM/YYYY'),'UPD',1242058,1003700,'9998/10','PET',100);
INSERT INTO JOURNAL VALUES (TO_DATE('01/01/2010','DD/MM/YYYY'),'INS',2242043,1003701,'8995/10','BON',1800);
INSERT INTO JOURNAL VALUES (TO_DATE('02/01/2010','DD/MM/YYYY'),'INS',2242046,1003701,'8998/10','BON',1700);
INSERT INTO JOURNAL VALUES (TO_DATE('03/01/2010','DD/MM/YYYY'),'INS',2242048,1003701,'8999/10','BON',1800);
INSERT INTO JOURNAL VALUES (TO_DATE('04/01/2010','DD/MM/YYYY'),'INS',2242058,1003701,'8998/10','BON',100);
INSERT INTO JOURNAL VALUES (TO_DATE('05/01/2010','DD/MM/YYYY'),'UPD',2242046,1003701,'8998/10','BON',1500);
INSERT INTO JOURNAL VALUES (TO_DATE('06/01/2010','DD/MM/YYYY'),'INS',2242052,1003701,'9003/10','BON',1600);
INSERT INTO JOURNAL VALUES (TO_DATE('07/01/2010','DD/MM/YYYY'),'UPD',2242058,1003701,'8998/10','PET',200);

Now, I need to extract a full set of data from this journal table but in a slightly different format. The main requirement is that we don't want the journal table to record BOP_DOMAIN anymore - it's just not required.

I need to generate a history of the total amount for each BOND_PAYMENT record. I cannot use the BOND_PAYMENT table itself because it only shows the latest status of each record. I need to mine this info from the journal.

I can't just take a SUM(amount) over(partition by payment_id, bond_number) because an individual BOP_ID might be updated several times; so at any one moment in time only the latest amount recorded for that BOP_ID should be used.

Given the above sample data, here is an illustration what I'd expect to produce:

SELECT jn_date_time,
       jn_operation,
       bop_id,
       payment_id,
       bond_number,
       bop_domain,
       amount,
       ? as running_total
FROM   JOURNAL
ORDER BY jn_date_time;

sample data and expected results

Here I've reproduced on the left the sample data, for two sample payments. On the right I've got "Running Total", which is the expected output. Next to it (in red) is the logic how it calculated the running total for each row.

The "Running Total" is a snapshot, at the point in time of the journal entry, of the total amount for that combination of PAYMENT_ID and BOND_NUMBER. Remember, a particular BOP_ID might be updated several times; the total amount must consider only the most recent record for that BOP_ID.

Any solution that works will be acceptable, but I suspect an analytic function (or combination of analytic functions) will be the best way to solve this.

Aviate answered 2/12, 2011 at 2:2 Comment(0)
V
6

Try this

WITH inner AS  
  (SELECT jn_date_time,
     jn_operation,
     bop_id,
     payment_id,
     bond_number,
     bop_domain,
     amount,
     amount - coalesce(lag(amount) over (partition by bop_id order by jn_date_time), 0)      
        as delta_bop_amount
  FROM   JOURNAL)
SELECT inner.*, 
sum(delta_bop_amount) 
  over (partition by payment_id, bond_number order by jn_date_time)  as running_total
FROM inner
ORDER BY bond_number, payment_id

This will return the same answer for your examples.

You need two passes - the analytic function in the inner query figures out how much each record changes the total for each BOP_ID. An INS is a straight addition, an UPD has to subtract the most recent value and add the new one.

The second pass then does a running total by bond/payment.

I'm assuming that you wanted to treat the bond/payment as a natural key for the running sum, and that there may be multiple BOP_ID's for any bond/payment combination.

Valenevalenka answered 2/12, 2011 at 2:50 Comment(1)
Very nice :) I see what you're doing there. You first calculate how much the amount changed from the previous relevant entry (via LAG), then it's just a matter of calculating a running sum across the deltas.Aviate
D
0
SELECT a.*,   
lag(amount,1) over (PARTITION BY bond_number ORDER BY 
payment_id,jn_date_time)recent_amount,   
amount + nvl(lag(amount,1) over (PARTITION BY bond_number ORDER BY  
payment_id,jn_date_time),0) running_total  
FROM JOURNAL a  
ORDER BY payment_id,jn_date_time  

This solution provides the exact answer which you are expecting for the above question and that too in a single table pass :).

I have just used a lag analytical function to get the most recent value of amount per bond_number/payment_id combination and then added that recent amount value to the amount to get the running total...SIMPLE!!!..aint it :)

Destruction answered 4/9, 2015 at 9:35 Comment(1)
Nice try, but it incorrectly reports $200 for bop_id = 1242058 for bond 9998/10 - it should show $1800. The reason is your running total is simply adding the amount from the previous row sorted by payment_id / jn_date_time whereas it needs to consider ALL the changes to amounts for that payment over history. Have a look at @wrschneider's answer which gives the correct results.Aviate

© 2022 - 2024 — McMap. All rights reserved.