When are user defined functions evaluated in a query in Oracle?
I'm experiencing some interesting behaviour with using user-defined functions within a SELECT statement.

I have a couple of stored procedures that read and purge data from a single table. These stored procedures are used by multiple sources.

From what I've observed, the user-defined functions are sometimes evaluated at an arbitrary point in time, not necessarily during or immediately after the execution of the SELECT statement they are used in.

For example, in a stored procedure, I have a SELECT statement that might look something like this:

SELECT Something, MyFunction(Something) FROM Somewhere;
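As described below, MyFunction in my test simply counts the rows in Somewhere. A minimal sketch (the function signature and parameter name here are just illustrative placeholders):

```sql
-- Sketch of the test function: it ignores its argument and simply
-- counts the rows currently visible in Somewhere.
CREATE OR REPLACE FUNCTION MyFunction(p_something IN VARCHAR2)
  RETURN NUMBER
AS
  l_cnt NUMBER;
BEGIN
  SELECT COUNT(*)
    INTO l_cnt
    FROM Somewhere;
  RETURN l_cnt;
END;
/
```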

This is followed by a call to another stored procedure, which purges data from the table. The amount of data purged is governed by another table, which stores the maximum ID read by each instance. This is so that a purge never deletes data that has not yet been read by another executing instance of the stored procedure.
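The purge gate described above might look roughly like this; the table and column names are hypothetical, since the actual purge procedure isn't shown in the question:

```sql
-- Hypothetical sketch of the purge gate: delete only rows whose ID is
-- at or below the smallest "last ID read" recorded by any reader, so
-- no reader loses rows it has not yet seen.
DELETE FROM Somewhere s
 WHERE s.id <= (SELECT MIN(r.last_read_id)
                  FROM reader_progress r);
```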

In my test code, MyFunction just returns the number of rows in the table Somewhere. Thus, I would imagine that it should always be equal to the number of rows that the SELECT statement returns. However, in cases where I run two instances of this stored procedure, I get results something like this:

First query instance:

Something  MyFunction(Something)
---------  ---------------------
A          3
B          3
C          3

Second query instance:

Something  MyFunction(Something)
---------  ---------------------
A          0
B          0
C          0    

Why is it that the second query returns all rows, but the user-defined function that operates on the same table reports that there are no rows left in the table?

Is there any way to ensure that the second query instance is consistent, so that the user-defined functions see the same data that the parent stored procedure is seeing?

Curcio answered 5/6, 2012 at 22:13 Comment(4)
I'm not sure I understand: between the 2 calls, did you purge the table?Re
Sorry about the confusion. To clarify, there is one stored procedure that (1) performs the SELECT statement, and (2) calls the purge procedure after the SELECT statement.Curcio
I see, so the function returns the right result, doesn't it? The SELECT doesn't; or, said differently, no rows should be returned. Confirmed?Re
The purge only deletes data that has been read by all instances of the stored procedure. There is a separate table in which all stored procedure instances update with the last row read. As such, the second SELECT statement is returning what I expect. However, from what I observe, the Function call is only evaluated after the stored procedure calls the purge function (perhaps only when the ref cursor is returned or read?). I suspect that the first function returns the correct result because its purge call doesn't delete any data yet, since it is waiting for the second query to execute.Curcio

In general, the problem you are seeing comes down to this: Oracle's multi-version read consistency guarantees that a single SQL statement always sees a consistent view of the data. That guarantee does not extend to SQL statements issued by a function called from the original statement; each of those inner statements gets its own consistent view, which is not necessarily the same set of data the original statement sees.

In practical terms, that means that something like

SELECT something,
       COUNT(*) OVER ()
  FROM table_name

will always return the correct answer (3 if the query returns 3 rows). If you put exactly the same logic in a function

CREATE OR REPLACE FUNCTION count_table_name
  RETURN NUMBER
AS
  l_cnt INTEGER;
BEGIN
  SELECT COUNT(*)
    INTO l_cnt
    FROM table_name;
  RETURN l_cnt;
END;

then the SQL statement

SELECT something,
       count_table_name
  FROM table_name

will not necessarily return a value that matches the number of rows in the table (nor will it necessarily return the same result for every row). You can see that in action if you build in a delay to your function so that you can modify the data in a separate session. For example

SQL> create table foo( col1 number );

Table created.

SQL> insert into foo select level from dual connect by level <= 3;

3 rows created.

Create a function that counts the rows in foo and then sleeps for 10 seconds, so the outer query is delayed 10 seconds per row returned:

create or replace function fn_count_foo
  return number
is
  l_cnt  integer;
begin
  select count(*)
    into l_cnt
    from foo;
  -- sleep 10 seconds on every invocation
  dbms_lock.sleep(10);
  return l_cnt;
end;
/

Function created.

Now, if in session 1, I start the statement

select col1, fn_count_foo
  from foo;

then switch over to session 2 where I insert a new row

SQL> insert into foo values( 4 );

1 row created.

SQL> commit;

Commit complete.

you can see that the function sees the newly committed row from its second invocation onward, despite the fact that the SQL statement itself only ever sees 3 rows:

SQL> select col1, fn_count_foo
  2    from foo;

      COL1 FN_COUNT_FOO
---------- ------------
         1            3
         2            4
         3            4

You can avoid that problem by having your session use the serializable transaction isolation level before executing the SQL statement. So, for example,

In session 1, set the transaction isolation level to serializable and start the query

SQL> set transaction isolation level serializable;

Transaction set.

SQL> select col1, fn_count_foo
  2    from foo;

In session 2, insert a new row

SQL> insert into foo values( 5 );

1 row created.

SQL> commit;

Commit complete.

and when session 1 returns 40 seconds later (four rows at 10 seconds each), everything is consistent:

SQL> select col1, fn_count_foo
  2    from foo;

      COL1 FN_COUNT_FOO
---------- ------------
         1            4
         2            4
         3            4
         4            4
Materfamilias answered 5/6, 2012 at 23:28 Comment(4)
Ah, thanks for the notes. How would I properly set this in the context of a stored procedure? The procedure is being called by another application.Curcio
@ashyu - Most commonly, the application would set its own transaction isolation level, often as a property of whatever Connection-type object the application uses. You can also add `set transaction isolation level serializable` as the first statement of the procedure, but that will only work if the call to the procedure begins a new transaction -- you'll get an error if an existing transaction tries to call the procedure, because setting the isolation level has to be the first operation a transaction performs.Materfamilias
+1 Nice explanation, Justin. I would just add to be careful of the ORA-08177 (can't serialize access for this transaction) error, which will happen, for example, if session 2 updates (and commits) a row while session 1 is running its long query, and later session 1 tries to update that same row. I guess my point is that most people are used to dealing with read committed isolation, since it's the default.Andersen
Thanks guys! I managed to find the transaction setting in the client application. It appears that I also had to add pragma autonomous_transaction to the function in conjunction with this to produce the result I expected.Curcio
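Putting the comments above together, a hedged sketch of the procedure-side approach (procedure, cursor, and table names are hypothetical; SET TRANSACTION must be the first statement of the transaction, otherwise Oracle raises ORA-01453):

```sql
-- Hypothetical sketch: set serializable isolation inside the procedure.
-- This only works when the procedure call begins a new transaction.
CREATE OR REPLACE PROCEDURE read_and_purge(p_rows OUT SYS_REFCURSOR)
AS
BEGIN
  -- Must be the first statement of the transaction, or ORA-01453
  -- ("SET TRANSACTION must be first statement of transaction") is raised.
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

  OPEN p_rows FOR
    SELECT Something, MyFunction(Something)
      FROM Somewhere;

  -- ... record the last ID read, then call the purge procedure ...
END;
/
```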
