How to inline a variable in PL/SQL?
The Situation

I have some trouble with my query execution plan for a medium-sized query over a large amount of data in Oracle 11.2.0.2.0. In order to speed things up, I introduced a range filter that does roughly this:

PROCEDURE DO_STUFF(
    org_from VARCHAR2 := NULL,
    org_to   VARCHAR2 := NULL)

  -- [...]
  JOIN organisations org
    ON (cust.org_id = org.id
   AND ((org_from IS NULL) OR (org_from <= org.no))
   AND ((org_to   IS NULL) OR (org_to   >= org.no)))
  -- [...]

As you can see, I want to restrict the JOIN of organisations using an optional range of organisation numbers. Client code can call DO_STUFF with the restriction (supposed to be fast) or without it (very slow).
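
For illustration, a call might look like this (the values are made up; they are passed as strings because the parameters are declared VARCHAR2):

BEGIN
  -- restricted call: the range filter applies (supposed to be fast)
  DO_STUFF(org_from => '10', org_to => '20');

  -- unrestricted call: both parameters default to NULL (very slow)
  DO_STUFF;
END;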

The Trouble

The trouble is, PL/SQL will create bind variables for the above org_from and org_to parameters, which is what I would expect in most cases:

  -- [...]
  JOIN organisations org
    ON (cust.org_id = org.id
   AND ((:B1 IS NULL) OR (:B1 <= org.no))
   AND ((:B2 IS NULL) OR (:B2 >= org.no)))
  -- [...]

The Workaround

In this particular case, however, I measured the query execution plan to be a lot better when I just inline the values, i.e. when the query executed by Oracle is actually something like:

  -- [...]
  JOIN organisations org
    ON (cust.org_id = org.id
   AND ((10 IS NULL) OR (10 <= org.no))
   AND ((20 IS NULL) OR (20 >= org.no)))
  -- [...]

By "a lot", I mean 5-10x faster. Note that the query is executed very rarely, i.e. once a month. So I don't need to cache the execution plan.

My questions

  • How can I inline values in PL/SQL? I know about EXECUTE IMMEDIATE (see the sketch after this list), but I would prefer to have PL/SQL compile my query, and not resort to string concatenation.

  • Did I just measure something that happened by coincidence, or can I assume that inlining variables is indeed better (in this case)? The reason I ask is that I think bind variables force Oracle to devise a general execution plan, whereas inlined values would allow it to analyse very specific column and index statistics. So I can imagine that this is not just a coincidence.

  • Am I missing something? Maybe there is an entirely different way to achieve query execution plan improvement, other than variable inlining (note that I have tried quite a few hints as well, but I'm not an expert in that field)?
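
For reference, this is roughly what the dynamic SQL route would look like inside DO_STUFF (a sketch only; l_sql, l_cursor, customers and cust.id are placeholder names, and the real select list is elided):

  -- assuming local declarations:
  -- l_sql VARCHAR2(32767); l_cursor SYS_REFCURSOR;
  l_sql := 'SELECT cust.id
              FROM customers cust
              JOIN organisations org ON cust.org_id = org.id';

  -- append the range conditions only when given, inlining the values
  -- as literals (string concatenation, and an SQL injection risk)
  IF org_from IS NOT NULL THEN
    l_sql := l_sql || ' AND ''' || org_from || ''' <= org.no';
  END IF;
  IF org_to IS NOT NULL THEN
    l_sql := l_sql || ' AND ''' || org_to || ''' >= org.no';
  END IF;

  OPEN l_cursor FOR l_sql;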

Fortyfive answered 18/3, 2011 at 15:0 Comment(13)
I'm pretty sure it's a coincidence. All queries are "compiled" and I'm not entirely sure I follow what you're asking about "inlining variables". Bind variables are Oracle's way of saying "get me the value that's in the shared pool".Estienne
@Sathya, I'll update my question. By "inlining" I mean "inlining", i.e. the variables are no longer variables, but constants to the SQL statementFortyfive
Are you actually getting different query plans consistently with bind variables vs. hard-coded values? Or is your performance difference potentially due to the "slow" query reading a bunch of data into cache (database, OS, or SAN) so that the "fast" query then benefits from faster reads? What version of Oracle? Is it safe to assume that the NO column in ORGANIZATIONS is a unique key? Are the statistics on the NO column accurate?Fairground
@Justin yes, the plans are consistent. I measured 1, 2, 1, 2, 1, 2, etc. Also I checked various bind values. With bind variables I get some FULL TABLE SCANS, whereas with hard-coded values, the plan looks a lot better. Version is 11.2.0.2.0. NO is unique, statistics are accurate.Fortyfive
@Justin, just in case, the trouble is not within the organisations table. That one only has around 550 records. Every organisation has 100k+ records in various other tables that are involved with the full query thoughFortyfive
Maybe post the execution plans?Wack
@Frustrated... Then don't use FormsDesigner :-) It will take some time to post the complete query (29 lines) and execution plans (45-47 execution steps), as I'd need to somehow anonymise them before posting them here. I will do so in a while, if this is really needed. The important question here (for me), however, is whether there is a way of circumventing EXECUTE IMMEDIATE. I rearranged the order of questions...Fortyfive
@Lukas Eder: Ok, I didn't realise it was that complicated! I guess you've already created appropriately designed indexes and updated them?Wack
@Frustrated: Yes the indexes are correct. That's why inlining/hard-coding the values is so fast, because then, they are actually used. I've also tried using all sorts of hints to force Oracle to use those indexes (I should put that in my question).Fortyfive
@Lukas Eder: Hmmm this is a tricky one... Sometimes I've found that the "most correct" index is not an obvious one. I've sometimes fixed similar problems by creating new indices combining columns that are not obviously related, but since they all participate in the query, it greatly speeds up the query in question. But that varies a lot on the queries and underlying table structure. Maybe part of the conditions can be re-written to reference other indexed columns? What about rewriting to use BETWEEN for the range filter? Might that help?Wack
@Frustrated, the BETWEEN keyword results in almost the same execution plan. Probably, some very unorthodox index could resolve the problem beyond my understanding. But it wouldn't explain why inlined values perform so much better than bind variables. I think I'm gonna have to post the query... :-)Fortyfive
@Lukas: Just a shot here, but in your procedure you are passing in VARCHAR2, while your inlining example uses numbers. Possible that Oracle is doing TO_NUMBER conversions behind the scenes when using the varchars passed into the proc? (i.e., set the params as NUMBER instead).Psychodrama
@Tbone, hmm, nice observation. org.no is a VARCHAR2 column, but it only contains 5-digit numbers. So I guess, in my inline example, Oracle does automatic casting or TO_CHAR or something like that. I'll check whether that has any effect on execution plans (I would be surprised though)Fortyfive
Answer (score 7)

In one of your comments you said:

"Also I checked various bind values. With bind variables I get some FULL TABLE SCANS, whereas with hard-coded values, the plan looks a lot better."

There are two paths. If you pass in NULL for the parameters then you are selecting all records. Under those circumstances a Full Table Scan is the most efficient way of retrieving data. If you pass in values then indexed reads may be more efficient, because you're only selecting a small subset of the information.

When you formulate the query using bind variables the optimizer has to take a decision: should it presume that most of the time you'll pass in values or that you'll pass in nulls? Difficult. So look at it another way: is it more inefficient to do a full table scan when you only need to select a sub-set of records, or to do indexed reads when you need to select all records?

It seems as though the optimizer has plumped for full table scans as being the least inefficient operation to cover all eventualities.

Whereas when you hard-code the values, the optimizer knows immediately that 10 IS NULL evaluates to FALSE, and so it can weigh the merits of using indexed reads to find the desired sub-set of records.
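
You can actually see this from the plans themselves (a sketch; I'm only showing the relevant predicate, with the table and columns from your query):

  EXPLAIN PLAN FOR
    SELECT org.id
      FROM organisations org
     WHERE (10 IS NULL) OR (10 <= org.no);

  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

  -- With the literal, the optimizer folds "10 IS NULL" away at parse time.
  -- With a bind, (:B1 IS NULL) OR (:B1 <= org.no), the OR branch has to
  -- stay in the plan, which pushes the optimizer towards a full table scan.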


So, what to do? As you say, this query is only run once a month, so I think it would only require a small change to business processes to have two separate queries: one for all organisations and one for a sub-set of organisations.
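
A sketch of what I mean (l_cursor, customers and the cust.id select list are placeholders, not names from your actual schema); each branch is a separate static statement, so each gets its own plan:

  IF org_from IS NULL AND org_to IS NULL THEN
    OPEN l_cursor FOR
      SELECT cust.id -- [...]
        FROM customers cust
        JOIN organisations org ON cust.org_id = org.id;
  ELSE
    OPEN l_cursor FOR
      SELECT cust.id -- [...]
        FROM customers cust
        JOIN organisations org ON cust.org_id = org.id
       WHERE org.no BETWEEN org_from AND org_to;
  END IF;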


"Btw, removing the :R1 IS NULL clause doesn't change the execution plan much, which leaves me with the other side of the OR condition, :R1 <= org.no where NULL wouldn't make sense anyway, as org.no is NOT NULL"

Okay, so the thing is you have a pair of bind variables which specify a range. Depending on the distribution of values, different ranges might suit different execution plans. That is, this range would (probably) suit an indexed range scan...

WHERE org.no BETWEEN 10 AND 11

...whereas this is likely to be more fitted to a full table scan...

WHERE org.no BETWEEN 10 AND 1199999

That is where bind variable peeking comes into play (depending on the distribution of values, of course).

Alum answered 18/3, 2011 at 15:56 Comment(2)
thanks for the feedback. That's my intuitive thinking as well. I just wanted to be sure. Btw, removing the :R1 IS NULL clause doesn't change the execution plan much, which leaves me with the other side of the OR condition, :R1 <= org.no where NULL wouldn't make sense anyway, as org.no is NOT NULLFortyfive
thanks for the update. The more I read through your suggestions, the more I come to the conclusion that no matter what values I bind to the query, the very same plan is re-used, exactly as in the article posted by EvilTeach. That article also mentions range predicates. In fact, removing the range predicate and replacing it with a simple org.no = :B1 seems to provide an acceptable plan as well. However, since I'm using 11g and not 10g, I'll double-check and also verify ik_zelf's suggestions. There have been many feature additions in Oracle since the 11g upgrade.Fortyfive
Answer (score 4)

Since the query plans are actually consistently different, that implies that the optimizer's cardinality estimates are off for some reason. Can you confirm from the query plans that the optimizer expects the conditions to be insufficiently selective when bind variables are used? Since you're using 11.2, Oracle should be using adaptive cursor sharing, so it shouldn't be a bind variable peeking issue (assuming you are calling the version with bind variables many times with different NO values in your testing).
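
One quick check (a sketch; narrow the LIKE filter to your actual statement, or use the sql_id directly):

  SELECT sql_id, child_number, is_bind_sensitive, is_bind_aware,
         executions, plan_hash_value
    FROM v$sql
   WHERE sql_text LIKE '%JOIN organisations org%';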

Are the cardinality estimates on the good plan actually correct? I know you said that the statistics on the NO column are accurate but I would be suspicious of a stray histogram that may not be updated by your regular statistics gathering process, for example.
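
Something along these lines would show whether a histogram is present on the column (assuming the table is in your own schema; otherwise use dba_tab_col_statistics with an owner filter):

  SELECT column_name, histogram, num_distinct, num_buckets, last_analyzed
    FROM user_tab_col_statistics
   WHERE table_name  = 'ORGANISATIONS'
     AND column_name = 'NO';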

You could always use a hint in the query to force a particular index to be used (though using a stored outline or optimizer plan stability would be preferable from a long-term maintenance perspective). Any of those options would be preferable to resorting to dynamic SQL.
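
For example, something like this (the index name org_no_idx is made up; substitute the real one):

  SELECT /*+ INDEX(org org_no_idx) */ <<something>>
    FROM <<some other table>> cust
    JOIN organisations org
      ON (cust.org_id = org.id
     AND ((org_from IS NULL) OR (org_from <= org.no))
     AND ((org_to   IS NULL) OR (org_to   >= org.no)))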

One additional test to try, however, would be to replace the SQL 99 join syntax with Oracle's old syntax, i.e.

SELECT <<something>>
  FROM <<some other table>> cust,
organisations org
 WHERE cust.org_id = org.id
   AND (    ((org_from IS NULL) OR (org_from <= org.no)) 
        AND ((org_to   IS NULL) OR (org_to   >= org.no)))

That obviously shouldn't change anything, but there have been parser issues with the SQL 99 syntax so that's something to check.

Fairground answered 18/3, 2011 at 15:54 Comment(3)
thanks for your extensive feedback! I'll try the things you suggest and double check statistics and histograms. Long-term maintenance is not that big of an issue in this case, as things rarely change around our database. But then again, I'm not an Oracle expert, so maybe you're pointing out something that I'm entirely unaware of.Fortyfive
Good input about the cardinality estimates not being correct. I always get the same cardinalities for every inlined value, so maybe I happen to run into the same problem as with the bind variables, with the tiny difference that I was lucky to get a "good" plan at the first attempt. I'll keep investigating...Fortyfive
I just tried avoiding the SQL 99 JOIN syntax, replacing it with a cross join and predicates. It completely changes the execution plan, but the results are the same. With bind variables I got a bad plan, with inlined values I got a good planFortyfive
Answer (score 3)

It smells like Bind Peeking, but I am only on Oracle 10, so I can't claim the same issue exists in 11.
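
If you want to check, DBMS_XPLAN can show you which bind values were peeked when the plan was built (available in recent 10g and 11g releases; run it in the same session right after executing the statement):

  SELECT *
    FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'TYPICAL +PEEKED_BINDS'));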

Cheeks answered 18/3, 2011 at 15:45 Comment(4)
thanks for the input. I'll check out that article right away!Fortyfive
check out Justin's answer, he also references "variable peeking"Fortyfive
your article really seems to describe the symptoms I experience. Cardinalities from the various executions' plans don't seem to change even if I change the bind values.Fortyfive
@lukas - The smell is that switching from bind variables to hard-coded values seems to make the performance issue go away. I had a multisession application where some sessions ran fast and some ran slow, due to the plan being generated for a small N (full table scan) which was then applied to a large N. Bind peeking turned out to be the reason.Cheeks
Answer (score 3)

This looks a lot like a need for Adaptive Cursor Sharing, combined with SQL plan stability. I think what is happening is that the optimizer_capture_sql_plan_baselines parameter is true, and the same for optimizer_use_sql_plan_baselines. If this is true, the following is happening:

  1. The first time the query is parsed, it gets a new plan.
  2. The second time, this plan is stored in sql_plan_baselines as an accepted plan.
  3. All subsequent runs of this query use this plan, regardless of what the bind variables are.

If Adaptive Cursor Sharing is already active, the optimizer will generate a new/better plan and store it in the sql_plan_baselines, but it is not able to use it until someone accepts this newer plan as an acceptable alternative plan. Check dba_sql_plan_baselines and see if your query has entries with accepted = 'NO' and verified = null. You can use dbms_spm.evolve to evolve the new plan and have it automatically accepted if the performance of the plan is at least 1.5 times better than without the new plan.
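
A sketch of both steps (the sql_handle below is a placeholder; take the real one from the first query):

  -- 1. find captured plans that were never accepted
  SELECT sql_handle, plan_name, enabled, accepted
    FROM dba_sql_plan_baselines
   WHERE accepted = 'NO';

  -- 2. have Oracle verify the plan and accept it if it is good enough
  DECLARE
    l_report CLOB;
  BEGIN
    l_report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE(
                  sql_handle => 'SQL_1234abcd5678ef90',  -- placeholder
                  verify     => 'YES',
                  commit     => 'YES');
    DBMS_OUTPUT.PUT_LINE(l_report);
  END;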

I hope this helps.

Macguiness answered 18/3, 2011 at 16:27 Comment(2)
I love your expertise more and more every day! Maybe we should hire you at our company! :-) I'll check through your suggestions ASAP. Thanks for the feedback!Fortyfive
@Lukas Eder, I am always open for suggestions. Thanks :-DMacguiness
Answer (score 2)

I added this as a comment, but will offer it up here as well. I hope this isn't overly simplistic, and looking at the detailed responses I may be misunderstanding the exact problem, but anyway...

It seems your organisations table has a column no (org.no) that is defined as a NUMBER. In your hard-coded example, you use numbers to do the comparisons:

JOIN organisations org
    ON (cust.org_id = org.id
   AND ((10 IS NULL) OR (10 <= org.no))
   AND ((20 IS NULL) OR (20 >= org.no)))

In your procedure, you are passing in varchar2:

PROCEDURE DO_STUFF(
    org_from VARCHAR2 := NULL,
    org_to   VARCHAR2 := NULL)

So to compare a VARCHAR2 to a NUMBER, Oracle will have to do implicit conversions, and this may cause the full scans.

Solution: change proc to pass in numbers
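
I.e. something like (a sketch, assuming org.no really is a NUMBER; see the comments below):

  PROCEDURE DO_STUFF(
      org_from NUMBER := NULL,
      org_to   NUMBER := NULL)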

Psychodrama answered 18/3, 2011 at 18:7 Comment(2)
nice observation, I hadn't thought of that. It's the other way round, actually: the inline example should use strings, not numbers, as org.no is of type VARCHAR2Fortyfive
@lucas, thanks. Seemed too simplistic given the other comments/solutions. May not change an explain plan that uses bind vars, but the actual runtime should be affected if comparing numbers to varchars (or vice versa). Anyway, hope that helps somehow.Psychodrama
