Oracle SQL query efficiency Improvement
Asked Answered
M

2

6

I have a query which is never finishing (left running for over 24 hours and was still going).

Now there isn't a huge amount of data in each table so I can only assume it is the efficiency of the query I have written.

SELECT DISTINCT s.supplier_id 
FROM supplier_info s
INNER JOIN purchase_order_line_all po ON s.supplier_id = po.vendor_no
INNER JOIN purchase_req_line_all pr ON s.supplier_id = pr.vendor_no
INNER JOIN man_supp_invoice m ON s.supplier_id = m.IDENTITY
WHERE s.creation_date >= TRUNC(SYSDATE) - INTERVAL '6' MONTH    
OR po.state NOT IN ('Closed', 'Cancelled')
OR pr.state NOT IN ('PO Created', 'Cancelled')
OR m.invoice_date >= TRUNC(SYSDATE) - INTERVAL '18' MONTH   

Execution Plan

Plan hash value: 2195330353

-------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                           |  1884 |   318K|       |   112K  (1)| 00:30:34 |
|   1 |  HASH UNIQUE                        |                           |  1884 |   318K|  1299M|   112K  (1)| 00:30:34 |
|*  2 |   HASH JOIN                         |                           |  7484K|  1234M|       |  2474   (8)| 00:00:41 |
|   3 |    INDEX FULL SCAN                  | PURCHASE_REQUISITION_PK   | 45348 |   265K|       |    18   (0)| 00:00:01 |
|*  4 |    HASH JOIN RIGHT OUTER            |                           |  7484K|  1191M|       |  2410   (6)| 00:00:40 |
|   5 |     INDEX FULL SCAN                 | PUR_ORD_LINE_EXT_PK       |     1 |    16 |       |     1   (0)| 00:00:01 |
|*  6 |     HASH JOIN                       |                           |  7484K|  1077M|  3160K|  2364   (4)| 00:00:39 |
|   7 |      VIEW                           | index$_join$_013          | 92445 |  2076K|       |   351   (2)| 00:00:06 |
|*  8 |       HASH JOIN                     |                           |       |       |       |            |          |
|*  9 |        HASH JOIN                    |                           |       |       |       |            |          |
|  10 |         INDEX FAST FULL SCAN        | PURCHASE_REQ_LINE2_IX     | 92445 |  2076K|       |    40   (0)| 00:00:01 |
|  11 |         INDEX FAST FULL SCAN        | PURCHASE_REQ_LINE1_IX     | 92445 |  2076K|       |    71   (0)| 00:00:02 |
|  12 |        INDEX FAST FULL SCAN         | PURCHASE_REQ_LINE_PK      | 92445 |  2076K|       |    57   (0)| 00:00:01 |
|* 13 |      HASH JOIN                      |                           |   387K|    47M|  2984K|  1139   (2)| 00:00:19 |
|  14 |       VIEW                          | index$_join$_015          | 92589 |  1898K|       |   184   (2)| 00:00:04 |
|* 15 |        HASH JOIN                    |                           |       |       |       |            |          |
|  16 |         INDEX FAST FULL SCAN        | PURCHASE_ORDER_LINE_PK    | 92589 |  1898K|       |    57   (0)| 00:00:01 |
|  17 |         INDEX FAST FULL SCAN        | PURCHASE_ORDER_LINE_1_IX  | 92589 |  1898K|       |    64   (2)| 00:00:02 |
|* 18 |       HASH JOIN                     |                           |   172K|    17M|  1008K|   619   (2)| 00:00:11 |
|  19 |        VIEW                         | index$_join$_016          | 41115 |   521K|       |    58   (2)| 00:00:01 |
|* 20 |         HASH JOIN                   |                           |       |       |       |            |          |
|  21 |          INDEX FAST FULL SCAN       | PURCHASE_ORDER2_IX        | 41115 |   521K|       |    17   (0)| 00:00:01 |
|  22 |          INDEX FAST FULL SCAN       | PURCHASE_ORDER_PK         | 41115 |   521K|       |    13   (0)| 00:00:01 |
|* 23 |        HASH JOIN                    |                           | 13700 |  1257K|       |   523   (1)| 00:00:09 |
|  24 |         TABLE ACCESS FULL           | SUPPLIER_INFO_TAB         |  3269 | 45766 |       |    10   (0)| 00:00:01 |
|  25 |         NESTED LOOPS                |                           |       |       |       |            |          |
|  26 |          NESTED LOOPS               |                           | 23568 |  1841K|       |   512   (1)| 00:00:09 |
|  27 |           SORT UNIQUE               |                           |     4 |    76 |       |     1   (0)| 00:00:01 |
|* 28 |            INDEX RANGE SCAN         | USER_PROFILE_ENTRY_SYS_PK |     4 |    76 |       |     1   (0)| 00:00:01 |
|* 29 |           INDEX RANGE SCAN          | INVOICE_IND9              | 15928 |       |       |     6   (0)| 00:00:01 |
|* 30 |          TABLE ACCESS BY INDEX ROWID| INVOICE_TAB               |  6246 |   372K|       |   255   (0)| 00:00:05 |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

   2 - access("A"."REQUISITION_NO"="B"."REQUISITION_NO")
   4 - access("POL"."ORDER_NO"="POLET"."ORDER_NO"(+) AND "POL"."LINE_NO"="POLET"."LINE_NO"(+) AND 
              "POL"."RELEASE_NO"="POLET"."RELEASE_NO"(+) AND "POL"."ORDER_NO"="POLET"."ORDER_NO"(+))
   6 - access("SUPPLIER_ID"="A"."VENDOR_NO")
       filter("CREATION_DATE">=TRUNC(SYSDATE@!)-INTERVAL'+00-06' YEAR(2) TO MONTH OR 
              "PURCHASE_ORDER_LINE_API"."FINITE_STATE_DECODE__"("POL"."ROWSTATE")<>'Closed' AND 
              "PURCHASE_ORDER_LINE_API"."FINITE_STATE_DECODE__"("POL"."ROWSTATE")<>'Cancelled' OR 
              "PURCHASE_REQ_LINE_API"."FINITE_STATE_DECODE__"("A"."ROWSTATE")<>'PO Created' AND 
              "PURCHASE_REQ_LINE_API"."FINITE_STATE_DECODE__"("A"."ROWSTATE")<>'Cancelled' OR 
              "I"."INVOICE_DATE">=TRUNC(SYSDATE@!)-INTERVAL'+01-06' YEAR(2) TO MONTH)
   8 - access(ROWID=ROWID)
   9 - access(ROWID=ROWID)
  13 - access("POL"."ORDER_NO"="PO"."ORDER_NO")
  15 - access(ROWID=ROWID)
  18 - access("SUPPLIER_ID"="PO"."VENDOR_NO")
  20 - access(ROWID=ROWID)
  23 - access("SUPPLIER_ID"="I"."IDENTITY")
  28 - access("USER_NAME"=NVL(RTRIM(SUBSTR(USERENV('CLIENT_INFO'),1,30)),USER@!) AND "ENTRY_CODE"='COMPANY')
  29 - access("I"."COMPANY"="ENTRY_VALUE")
  30 - filter("I"."CREATOR"='MAN_SUPP_INVOICE_API' AND "I"."PARTY_TYPE"='SUPPLIER' AND 
              "I"."ROWSTATE"<>'Cancelled')
Misprize answered 29/7, 2015 at 14:4 Comment(17)
Do u have an index in any of the columns? if so please mentionFolkestone
@Crazy2crack Not that I am aware of. How would i check/know?Misprize
Please edit your question and add the execution plan as formatted text, please do not post screenshotsIvied
@a_horse_with_no_name How can i get the execution plan if it never completes?Misprize
The execution plan should already be generated ahead of the actual execution of the query. What Oracle (and other DBMS) does is that in a small time right before executing the query it generates a list of possible execution plans (with the help of some "magic" and a lot of smart math) and then picks the one with the smallest cost and runs your query against that plan.Haldes
Getting the execution plan does not run the statement. explain plan for select ... ; then use select plan_table_output from table(dbms_xplan.display);Ivied
@a_horse_with_no_name added the execution plan (i think)Misprize
Predicates should also be shown after the execution plan. Can you post those as well? They are the conditions based on which data is filtered and it should be a list of clauses from your query, starting with a number followed by a * (they correlate with the numbers from the first column of your execution plan).Haldes
it would also be helpful if you could tie up the names between your example sql statement and the explain plan.Decapitate
Please post the table layouts for the tables, as well as the actual query. The explain doesn't help if it doesn't match to a real query with real column names.Biel
"USER_PROFILE_ENTRY_SYS_PK"? which table does that belong to?Decapitate
@AndyLester done and updatedMisprize
@RaduGheorghiu added itMisprize
Looks like at least some of these are views, with functions and context filters, which is perhaps doing a lot of work you don't need. What does the speed/plan look like if you replace the joins with where exists ... checks?Michikomichon
We also need to see the table and index definitions. Clearly you have some indexes, but we don't know how they are defined, and we don't know how the tables are defined.Biel
The distinct is killing you: HASH UNIQUE | | 1884 | 318K| 1299M that is writing 1.3GB to the temp tablespace. Do you really need the distinctIvied
@a_horse_with_no_name i did in that query above, the one below didnt create duplicates (probably why its faster!)Misprize
H
4

You have a bunch of OR conditions. I would suggest replacing these by not exists:

SELECT s.supplier_id 
FROM supplier_info s
WHERE s.creation_date >= TRUNC(SYSDATE) - INTERVAL '6' MONTH OR
      NOT EXISTS (SELECT 1
                  FROM purchase_order_line_all po
                  WHERE s.supplier_id = po.vendor_no AND
                        po.state IN ('Closed', 'Cancelled')
                 ) AND
      NOT EXISTS (SELECT 1
                  FROM purchase_req_line_all pr 
                  WHERE s.supplier_id = pr.vendor_no AND
                        r.state IN ('PO Created', 'Cancelled')
                 )
      EXISTS (SELECT 1
              FROM man_supp_invoice m 
              WHERE s.supplier_id = m.IDENTITY AND
                    m.invoice_date >= TRUNC(SYSDATE) - INTERVAL '18' MONTH  
             );

I'm pretty sure your performance problem is caused by cartesian products. If a supplier has 100 order lines, and 100 req lines, and 100 invoices, then the join is creating 100*100*100 = 1,000,000 rows just for that one supplier. This is a big intermediate table.

By using EXISTS instead, Oracle will not be producing gargantuan intermediate tables.

Also, you can test the performance by adding one clause at a time.

Finally, I'm not 100% sure if the logic is correct for the middle two conditions. For instance, you might really want this for first NOT EXISTS:

          EXISTS (SELECT 1
                  FROM purchase_order_line_all po
                  WHERE s.supplier_id = po.vendor_no AND
                        po.state NOT IN ('Closed', 'Cancelled')
                 ) AND

As written, your logic is that at least one state is not 'Closed' or 'Cancelled', which is what the above revision also does. I put in that no states are 'Closed' or 'Cancelled', just because that made more sense to me.

Hypersonic answered 29/7, 2015 at 14:38 Comment(6)
Oh my god your clever! Currently out of up votes but will tomorrow!Misprize
This runs instantly, cross checking the results now!Misprize
Hmm, I don't see any cartesian joins in the original query.Ivied
I think it's the implicit cartesian joins that Gordon is talking about, rather than explicit cartesian joins. If you don't have conditions joining all the tables together in the event of multiple 1-to-many row joins on more than two tables, you're going to end up with multiple rows. E.g. sqlfiddle.com/#!4/9eecb7d/3006Decapitate
Just to bring this full circle, results cross checked exactly. If you read this, can you suggest some reading material for using this method (exists over or conditions), i would like to further my knowledge in this (?? implicit/explicit Cartesian joins??)Misprize
@Misprize . . . The issue here is not about exists. It is about the cartesian products being calculated by the joins for each supplier. If you really understand what is happening (and just take a few rows for one supplier if you need to work it out), then you probably won't make the same mistake again.Hypersonic
L
1

It looks like you are working with IFS.

In the view PURCHASE_ORDER_LINE_ALL the columns objstate and state are defined in the DDL as:

   pol.rowstate                       objstate
   PURCHASE_ORDER_LINE_API.Finite_State_Decode__(pol.rowstate) state

I inherited some code that was using state in the query and it was occasionally timing out. When I changed the query to use objstate it ran much quicker. That trip through the Finite_State_Decode__ function was adding a lot of overhead to my query.

A little bit of background reading on why functions like this exist in the system: https://yourifs.blogspot.com/2007/10/read-only-methods-pragma-methods.html

Liveryman answered 19/8, 2019 at 19:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.