Unkillable Oracle session waiting on "SQL*Net message from client" event
Asked Answered
N

1

5

On Oracle 11gR2, I've recently encountered a very interesting situation involving a blocked (but idle!) MERGE statement that hangs on a "SQL*Net message from client" event, causing subsequent, concurrently executed MERGE statements to block on the first statement via "cursor: pin S wait on X" events. In Oracle Enterprise Manager, the following can be observed:

enter image description here

This situation turns even more severe, as the above Session-ID 1204 cannot be killed with either:

alter system kill session 'sid,serial#';
alter system kill session 'sid,serial#' immediate;

Our DBA is sometimes able to kill the operating system process, but often, the whole database needs to be restarted. Luckily, thus far, only on a test system, never in production.

Note:

I'm aware this is probably a similar issue as reported in this rather vague question: Oracle updates/inserts stuck, DB CPU at 100%, concurrency high, SQL*Net wait message from client. I'll still report it again, as I have a clear reproduction path, which I'll report as an answer.

Necessaries answered 24/7, 2015 at 9:27 Comment(0)
N
9

This seems to be a bug in Oracle when CLOB data types are used as values that are passed to the MERGE statement's ON clause. Assume this database:

CREATE TABLE t (
  v INT, 
  s VARCHAR2(400 CHAR)
);

Reproduction using inlined values

Now, run the following statement in any Oracle client, including SQL*Plus, SQL Developer or from JDBC, which helps reproducing the issue very easily (I'm using Oracle 11g XE 11.2.0.2.0):

MERGE INTO t                      
USING (
  SELECT 
    1 v, 
    CAST('abc' AS CLOB) s 
  FROM DUAL
) s 
ON (t.s = s.s) -- Using a CLOB here causes the bug.
WHEN MATCHED THEN UPDATE SET
  t.v = s.v        
WHEN NOT MATCHED THEN INSERT (v, s) 
VALUES (s.v, s.s);

The example is silly, and the CLOB was bound here by "accident". Nonetheless, such a statement should not create a zombie session in Oracle, but it's there. I'm running the above statement three times in SQL*Plus and then running this...

SELECT 
  s.sid,
  s.serial#,
  s.sql_id,
  s.event,
  s.blocking_session,
  q.sql_text
FROM v$session s
JOIN v$sql q
ON s.sql_id = q.sql_id
WHERE s.username = 'TEST'
AND UPPER(TRIM(q.sql_text)) LIKE 'MERGE%';

... I get:

sid serial# sql_id          event                       blocking_session
9   3       82a2k4sqzy1jq   cursor: pin S wait on X     92
49  89      82a2k4sqzy1jq   cursor: pin S wait on X     92
92  13      82a2k4sqzy1jq   db file sequential read     

Notice how the reported event is different ("db file sequential read") from the original event ("SQL*Net message from client"), which was using bind variables

Reproduction using bind values

var v_s varchar2(50)
exec :v_s := 'abc'

MERGE INTO t                      
USING (
  SELECT 
    1 v, 
    CAST(:v_s AS CLOB) s 
  FROM DUAL
) s 
ON (t.s = s.s) -- Using a CLOB here causes the bug.
WHEN MATCHED THEN UPDATE SET
  t.v = s.v        
WHEN NOT MATCHED THEN INSERT (v, s) 
VALUES (s.v, s.s);

The above statement run in SQL*Plus also produces the bug:

sid serial# sql_id          event                           blocking_session
8   1       4w9zuxrumumgj   SQL*Net message from client     
90  7       4w9zuxrumumgj   cursor: pin S wait on X         8
94  21      4w9zuxrumumgj   cursor: pin S wait on X         8

No reproduction in PL/SQL

Interestingly, the bug is avoided in the following PL/SQL statement:

DECLARE
  v_s CLOB := 'abc';
BEGIN
  MERGE INTO t                      
  USING (
    SELECT 
      1 v, 
      CAST(v_s AS CLOB) s 
    FROM DUAL
  ) s 
  ON (t.s = s.s) -- Using a CLOB here causes the bug.
  WHEN MATCHED THEN UPDATE SET
    t.v = s.v        
  WHEN NOT MATCHED THEN INSERT (v, s) 
  VALUES (s.v, s.s);
END;
/

I'm getting:

          CAST(v_s AS CLOB) s
          *
ERROR at line 8:
ORA-06550: line 8, column 11:
PL/SQL: ORA-00932: inconsistent datatypes: expected - got CLOB
ORA-06550: line 4, column 7:
PL/SQL: SQL Statement ignored

It looks as though the PL/SQL engine saveguards clients from this SQL engine bug.

Necessaries answered 24/7, 2015 at 9:27 Comment(3)
ON (t.s = s.s) is strange. AFAIK Oracle semantic check should not allow LOB comparison ( at least in where clause).Spurtle
@ibre5041: It is strange indeed, but perhaps bind variables circumvent such checks, or perhaps some implicit type coercion is applied but not implemented correctly in the database...Necessaries
when string is not long enough (less than 4000 chars, but depending on oracle version) it is treated as varchar2. So, it may or may not be "inconsistent datatypes" depending on actual data.Gerundive

© 2022 - 2024 — McMap. All rights reserved.