Tuesday, December 21, 2021

How to trace specific SQL statement executed by certain user

A question about exactly this appeared on oracle-l today here. It is widely known how to trace a specific SQL statement at the system level, e.g.:

alter system set events 'sql_trace[sql:<some_sql_id>]';

What if we want to trace the same SQL statement only when it is executed by a specific user? A common approach is to create a LOGON trigger that runs ALTER SESSION SET EVENTS. It turns out that the same can be achieved with event filters and the username() function. For example, to trace the SQL_ID bqka14bvd2zmb run by the user TC1, we can use:

alter system set events 'sql_trace[sql:bqka14bvd2zmb] {streq:username(),"TC1"}';
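
For comparison, the traditional trigger-based variant might look like the following - a minimal sketch, assuming a suitably privileged account; the trigger name is illustrative:

create or replace trigger trace_tc1_logon
after logon on database
begin
  -- fire only for the user of interest
  if sys_context('userenv', 'session_user') = 'TC1' then
    execute immediate
      'alter session set events ''sql_trace[sql:bqka14bvd2zmb]''';
  end if;
end;
/

The event filter approach avoids creating such an object and can be toggled with a single ALTER SYSTEM.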

Here is a full example which is also available as a Gist here:

[oracle@db-21 ~]$ sqlplus /nolog @q

SQL*Plus: Release 21.0.0.0.0 - Production on Tue Dec 21 15:54:22 2021
Version 21.4.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.

SQL> conn / as sysdba
Connected.
SQL> alter session set container=pdb;

Session altered.

SQL>
SQL> drop user tc1 cascade;

User dropped.

SQL> drop user tc2 cascade;

User dropped.

SQL>
SQL> grant create session, select_catalog_role to tc1 identified by tc1;

Grant succeeded.

SQL> grant create session, select_catalog_role to tc2 identified by tc2;

Grant succeeded.

SQL>
SQL> col sql_id old_v sql_id
SQL> select dbms_sql_translator.sql_id(q'#select 'trace_me' from dual#') sql_id from dual;

SQL_ID
--------------------------------------------------------------------------------
bqka14bvd2zmb

SQL>
SQL> alter system set events 'sql_trace[sql:&sql_id.] {streq:username(),"TC1"}';
old   1: alter system set events 'sql_trace[sql:&sql_id.] {streq:username(),"TC1"}'
new   1: alter system set events 'sql_trace[sql:bqka14bvd2zmb] {streq:username(),"TC1"}'

System altered.

SQL>
SQL> conn tc1/tc1@db-21/pdb
Connected.
SQL>
SQL> select 'trace_me' from dual;

'TRACE_M
--------
trace_me

SQL> col trace_file old_v trace_file
SQL> select value trace_file from v$diag_info where name='Default Trace File';

TRACE_FILE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5455.trc

SQL>
SQL> !cat &trace_file.
Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5455.trc
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.4.0.0.0
Build label:    RDBMS_21.3.0.0.0_LINUX.X64_210727
ORACLE_HOME:    /u01/app/oracle/product/21.0.0/dbhome_1
System name:    Linux
Node name:      db-21.localdomain
Release:        5.4.17-2136.300.7.el8uek.x86_64
Version:        #2 SMP Fri Oct 8 16:23:01 PDT 2021
Machine:        x86_64
VM name:        KVM CPUID feature flags: 0x01000089
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 48
Unix process pid: 5455, NID: 4026531836, image: oracle@db-21.localdomain


*** 2021-12-21T15:54:23.397300+00:00 (PDB(3))
*** SESSION ID:(82.1864) 2021-12-21T15:54:23.397327+00:00
*** CLIENT ID:() 2021-12-21T15:54:23.397344+00:00
*** SERVICE NAME:(pdb) 2021-12-21T15:54:23.397360+00:00
*** MODULE NAME:(SQL*Plus) 2021-12-21T15:54:23.397377+00:00
*** ACTION NAME:() 2021-12-21T15:54:23.397394+00:00
*** CLIENT DRIVER:(SQL*PLUS) 2021-12-21T15:54:23.397409+00:00
*** CONTAINER ID:(3) 2021-12-21T15:54:23.397426+00:00
*** CLIENT IP:(10.0.2.15) 2021-12-21T15:54:23.397449+00:00

=====================
PARSING IN CURSOR #140557434715864 len=332 dep=1 uid=0 oct=3 lid=0 tim=1528633612 hv=2698389488 ad='85cb8630' sqlid='acmvv4fhdc9zh'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2, spare3, signature, spare7, spare8, spare9, nvl(dflcollid, 16382), creappid, creverid, modappid, modverid, crepatchid, modpatchid from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
EXEC #140557434715864:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=813480514,tim=1528633611
FETCH #140557434715864:c=19,e=19,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,plh=813480514,tim=1528634393
CLOSE #140557434715864:c=0,e=0,dep=1,type=3,tim=1528634426
=====================
PARSING IN CURSOR #140557434718920 len=27 dep=0 uid=120 oct=3 lid=120 tim=1528635472 hv=4140924523 ad='7aa8feb0' sqlid='bqka14bvd2zmb'
select 'trace_me' from dual
END OF STMT
EXEC #140557434718920:c=30,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1388734953,tim=1528635471
FETCH #140557434718920:c=5,e=6,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=1388734953,tim=1528635612
STAT #140557434718920 id=1 cnt=1 pid=0 pos=1 obj=0 op='FAST DUAL  (cr=0 pr=0 pw=0 str=1 time=1 us cost=2 size=0 card=1)'
FETCH #140557434718920:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=1388734953,tim=1528636086
CLOSE #140557434718920:c=0,e=7,dep=0,type=0,tim=1528637068

SQL>
SQL> conn tc2/tc2@db-21/pdb
Connected.
SQL>
SQL> select 'trace_me' from dual;

'TRACE_M
--------
trace_me

SQL> col trace_file old_v trace_file
SQL> select value trace_file from v$diag_info where name='Default Trace File';

TRACE_FILE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5458.trc

SQL>
SQL> !cat &trace_file.
cat: /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5458.trc: No such file or directory

Here I make use of the streq filter and the username function:

SQL> oradebug doc event filter streq

streq: filter to only fire an event when string s1 = s2 (up to <len> characters)

Usage
-------
{streq:  s1              <string>[256],
         s2              <string>[256],
         len             <ub4> }

SQL> oradebug doc event action username
username
        - Return user log-in name
Usage
-------
username( Action returns: <string>[30])
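
Once the investigation is done, the same event syntax can switch the tracing off again (a one-liner sketch; verify on your version):

alter system set events 'sql_trace[sql:bqka14bvd2zmb] off';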

Saturday, December 18, 2021

Mythbusters: VARRAY faster than CLOB

A recent tweet claimed that VARRAY is faster than CLOB: link. On this data set with SQL*Plus, whether that claim holds depends largely on the hardware underneath the database and on the network between the client and the server. More specifically, VARRAY wins only on a rather slow network.
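
The test data is a table of rows holding roughly 1MB of character data each. A comparable setup can be sketched as follows (an assumed shape for illustration - the authoritative script is in the Gist referenced below):

create table t_lob_1_mb (
  id    int primary key,
  c_lob clob
);

-- 25 rows of ~1,000,000 characters each
insert into t_lob_1_mb (id, c_lob)
select level, rpad(to_clob('x'), 1000000, 'x')
  from dual
 connect by level <= 25;

commit;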

A typical production environment that I work with includes one or more databases deployed across different availability zones (AZ) in the cloud. The applications reside in the same AZ as the database server to avoid inter-AZ traffic, which costs extra money. I tested the script from this Gist on two major cloud providers, and VARRAY was never faster than CLOB; in fact, it was significantly slower. See the output from 19.13 below (I just added the last query with DBMS_LOB to the Gist's script):

[oracle@rac2 ~]$ NLS_LANG=.AL32UTF8 sqlplus tc/tc@rac1:1522/pdb1 @test1

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Dec 18 10:54:03 2021
Version 19.13.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.

Last Successful login time: Sat Dec 18 2021 10:53:17 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.13.0.0.0

SQL> set lobprefetch 32767
SQL> set long 10000000
SQL> set longchunksize 10000000
SQL> set timing on;
SQL> set arraysize 1000;
SQL> --set feedback only
SQL> set autotrace trace stat;
SQL> select id,c_lob from t_lob_1_mb where id<=25;

25 rows selected.

Elapsed: 00:00:01.50

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         82  consistent gets
       6475  physical reads
          0  redo size
   51655001  bytes sent via SQL*Net to client
      15345  bytes received via SQL*Net from client
         52  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL> select id,lob_to_varray(c_lob) c_varray from t_lob_1_mb where id<=25;

25 rows selected.

Elapsed: 00:00:14.10

Statistics
----------------------------------------------------------
         33  recursive calls
          0  db block gets
        199  consistent gets
     653950  physical reads
          0  redo size
   25107664  bytes sent via SQL*Net to client
       9261  bytes received via SQL*Net from client
         61  SQL*Net roundtrips to/from client
         25  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>
SQL> select
  2    c_varray
  3  from t_lob_1_mb
  4       outer apply (
  5         select
  6           cast(
  7             collect(
  8                cast(substr(c_lob,(level-1)*4000 + 1,4000) as varchar2(4000))
  9                )
 10             as sys.odcivarchar2list
 11           ) c_varray
 12         from dual
 13         connect by level<=ceil(length(c_lob)/4000)
 14       )
 15  where id<=25;

25 rows selected.

Elapsed: 00:00:14.16

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        176  consistent gets
     653950  physical reads
          0  redo size
   25106012  bytes sent via SQL*Net to client
       9086  bytes received via SQL*Net from client
         58  SQL*Net roundtrips to/from client
         50  sorts (memory)
          0  sorts (disk)
         25  rows processed

SQL>
SQL> select
  2    c_varray
  3  from t_lob_1_mb
  4       outer apply (
  5         select
  6           cast(
  7             collect(
  8                dbms_lob.substr(c_lob,4000,(level-1)*4000 + 1)
  9                )
 10             as sys.odcivarchar2list
 11           ) c_varray
 12         from dual
 13         connect by level<=ceil(length(c_lob)/4000)
 14       )
 15  where id<=25;

25 rows selected.

Elapsed: 00:00:03.19

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        176  consistent gets
      24900  physical reads
          0  redo size
   25106012  bytes sent via SQL*Net to client
       9071  bytes received via SQL*Net from client
         58  SQL*Net roundtrips to/from client
         50  sorts (memory)
          0  sorts (disk)
         25  rows processed

NB: the DBMS_LOB query is not mentioned in the original tweet, but I wrote about why SUBSTR should not be used against LOBs back in 2019: Temporary LOBs.

As I said, it is quite a typical cloud environment in which the client is on a different VM from the DB server. I can get even better results with CLOB if I run the same script on the DB server itself or use a proximity placement group (Azure)/cluster placement group (AWS).

It can be seen that twice as much data is transferred with CLOB as with the other queries (50MB vs 25MB). This is a known issue that has been observed by several other authors, e.g. LOB reads. In the specific example from this post, the extra CLOB data becomes noticeable only on a slow network, e.g. when I pull data from a different continent over mobile broadband. That is not the case in most environments (including non-production ones) that I work with - there, CLOB is faster than VARRAY (1.5 seconds vs 14.16 seconds). As always, rather than relying on any claim, such as the tweet above, it is better to test for yourself, as this post demonstrates.

Friday, December 17, 2021

Using credentials with database links in 21c

Starting with 21c, it is possible to use credential objects in database links. Here is a short demonstration of this functionality:

SQL> exec dbms_credential.create_credential('TC_CRED', 'TC', 'tc')

PL/SQL procedure successfully completed.

SQL>
SQL> create database link link1 connect with tc_cred using 'localhost/pdb';

Database link created.

SQL> create database link link2 connect with tc_cred using 'localhost/pdb';

Database link created.

SQL>
SQL> select * from dual@link1;

D
-
X

SQL> select * from dual@link2;

D
-
X

The SQL Language Reference has not been updated with the new syntax yet. If we alter the user's password, the existing DB links stop working (I do not consider gradual password rollover here):

SQL> alter user tc identified by tc2;

User altered.

SQL>
SQL> alter session close database link link1;

Session altered.

SQL> alter session close database link link2;

Session altered.

SQL>
SQL> select * from dual@link1;
select * from dual@link1
                   *
ERROR at line 1:
ORA-01017: invalid username/password; logon denied
ORA-02063: preceding line from LINK1

It is enough to update the credential object to make the DB links work again:

SQL> exec dbms_credential.update_credential('TC_CRED', 'PASSWORD', 'tc2')

PL/SQL procedure successfully completed.

SQL>
SQL> select * from dual@link1;

D
-
X

SQL> select * from dual@link2;

D
-
X

Conclusion

This functionality really comes into its own when you reuse one username and password pair in multiple database links. If we want to change the username or password, there is no need to change each link anymore - we can alter a single credential object instead. The functionality has been backported to 19c as well: Bug 29541929 - support credential objects in database links (Doc ID 29541929.8).
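
To see which links reference a given credential, a dictionary query along these lines should work (assuming the CREDENTIAL_NAME and CREDENTIAL_OWNER columns that the *_DB_LINKS views expose in recent releases):

select owner, db_link, credential_name, credential_owner
  from dba_db_links
 where credential_name = 'TC_CRED';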

Thursday, December 9, 2021

gridSetup.sh executeConfigTools fails with PRVG-13606 : chrony daemon is not synchronized with any external time source

The command failed with the following errors in the log:

/u01/app/19.3.0/grid/gridSetup.sh -executeConfigTools -responseFile /opt/rsp/gi_19.3_config.rsp -silent
...
INFO:  [Dec 8, 2021 8:20:51 AM] Verifying Clock Synchronization ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line: Verifying Clock Synchronization ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM]   Verifying Network Time Protocol (NTP) ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:   Verifying Network Time Protocol (NTP) ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM]     Verifying chrony daemon is synchronized with at least one external time
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:     Verifying chrony daemon is synchronized with at least one external time
INFO:  [Dec 8, 2021 8:20:51 AM]     source ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:     source ...FAILED
INFO:  [Dec 8, 2021 8:20:51 AM]     rac2: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:     rac2: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO:  [Dec 8, 2021 8:20:51 AM]           source on node "rac2".
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:           source on node "rac2".
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:
INFO:  [Dec 8, 2021 8:20:51 AM]     rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:     rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO:  [Dec 8, 2021 8:20:51 AM]           source on node "rac1".
INFO:  [Dec 8, 2021 8:20:51 AM] Skipping line:           source on node "rac1".

It can be easily reproduced by running CVU:

[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose

Verifying Clock Synchronization ...
  Node Name                             Status
  ------------------------------------  ------------------------
  rac1                                  passed

  Node Name                             State
  ------------------------------------  ------------------------
  rac1                                  Observer

CTSS is in Observer state. Switching over to clock synchronization checks using NTP

  Verifying Network Time Protocol (NTP) ...
    Verifying '/etc/chrony.conf' ...
    Node Name                             File exists?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying '/etc/chrony.conf' ...PASSED
    Verifying Daemon 'chronyd' ...
    Node Name                             Running?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying Daemon 'chronyd' ...PASSED
    Verifying NTP daemon or service using UDP port 123 ...
    Node Name                             Port Open?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying NTP daemon or service using UDP port 123 ...PASSED
    Verifying chrony daemon is synchronized with at least one external time source ...FAILED (PRVG-13606)
  Verifying Network Time Protocol (NTP) ...FAILED (PRVG-13606)
Verifying Clock Synchronization ...FAILED (PRVG-13606)

Verification of Clock Synchronization across the cluster nodes was unsuccessful on all the specified nodes.


Failures were encountered during execution of CVU verification request "Clock Synchronization across the cluster nodes".

Verifying Clock Synchronization ...FAILED
  Verifying Network Time Protocol (NTP) ...FAILED
    Verifying chrony daemon is synchronized with at least one external time
    source ...FAILED
    rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
          source on node "rac1".


CVU operation performed:      Clock Synchronization across the cluster nodes
Date:                         Dec 9, 2021 10:56:45 AM
CVU home:                     /u01/app/19.3.0/grid/
User:                         grid

To get more details, CVU tracing can be enabled:

[grid@rac1 ~]$ rm -rf /tmp/cvutrace
[grid@rac1 ~]$ mkdir /tmp/cvutrace
[grid@rac1 ~]$ export CV_TRACELOC=/tmp/cvutrace
[grid@rac1 ~]$ export SRVM_TRACE=true
[grid@rac1 ~]$ export SRVM_TRACE_LEVEL=1
[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose

This produces the following lines in the trace file /tmp/cvutrace/cvutrace.log.0:

[main] [ 2021-12-09 10:58:10.179 UTC ] [VerificationUtil.traceAndLogInternal:16755]  [TaskNTP.doChronyTimeSourceCheck:2465] status=SUCCESSFUL; vfyCode=0; output=MS Name/IP address         Stratum Poll Reach Last
Rx Last sample
===============================================================================
^* 169.254.169.123               3   7   377   128  +1141ns[+4603ns] +/-  501us

[main] [ 2021-12-09 10:58:10.179 UTC ] [TaskAnonymousProxy.<init>:119]  Defining proxy task with: 'chrony daemon is synchronized with at least one external time source'
nodeList: 'rac1'
from task: 'TaskNTP'
Called from: 'TaskNTP.performNTPChecks:889'
[main] [ 2021-12-09 10:58:10.179 UTC ] [ResultSet.overwriteResultSet:810]  Overwriting ResultSet, called from: TaskAnonymousProxy.performAnonymousTask:148
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.getCVUVariable:607]  variable name : MODE_API
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.getCVUVariable:643]  Variable found in the CVU and Command Line context
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.resolve:981]  ForcedLookUp not enabled for variable:MODE_API
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.secureVariableValueTrace:789]  getting CVUVariableConstant : VAR = MODE_API VAL = FALSE
[main] [ 2021-12-09 10:58:10.180 UTC ] [ResultSet.traceResultSet:1040]

ResultSet AFTER overwrite ===>
        Overall Status->VERIFICATION_FAILED

        Uploaded Overall Status->UNKNOWN

        HasNodeResults: true


        contents of resultTable

Dumping Result data.
  Status     : VERIFICATION_FAILED
  Name       : rac1
  Type       : Node
  Has Results: No
  Exp. value : null
  Act. value : null

  Errors  :
    PRVG-13606 : chrony daemon is not synchronized with any external time source on node "rac1".

The same configuration steps had completed successfully on Oracle Linux (OL) 8.4, so I started looking for what had changed. It turns out that the chronyc sources output changed.

OL 8.5 with chrony-4.1-1.0.1.el8.x86_64:

[root@rac1 ~]# chronyc sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123               3   7   377    10   -590ns[+4002ns] +/-  494us

OL 8.4 with chrony-3.5-2.0.1.el8.x86_64:

210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
^* 169.254.169.123               3   6   377    39  +4827ns[  +10us] +/-  560us

The "Number of sources" line is absent in OL 8.5.

There are two workarounds that can be used.

Use ORA_DISABLED_CVU_CHECKS

[grid@rac1 ~]$ ORA_DISABLED_CVU_CHECKS=TASKNTP cluvfy comp clocksync -n rac1 -verbose

Verifying Clock Synchronization ...
  Node Name                             Status
  ------------------------------------  ------------------------
  rac1                                  passed

  Node Name                             State
  ------------------------------------  ------------------------
  rac1                                  Observer
  Verifying Network Time Protocol (NTP) ...WARNING (PRVG-11640)
Verifying Clock Synchronization ...WARNING (PRVG-11640)

Verification of Clock Synchronization across the cluster nodes was successful.


Warnings were encountered during execution of CVU verification request "Clock Synchronization across the cluster nodes".

Verifying Clock Synchronization ...WARNING
  Verifying Network Time Protocol (NTP) ...WARNING
  rac1: PRVG-11640 : The check "Network Time Protocol (NTP)" was not performed
        as it is disabled


CVU operation performed:      Clock Synchronization across the cluster nodes
Date:                         Dec 9, 2021 11:03:41 AM
CVU home:                     /u01/app/19.3.0/grid/
User:                         grid

Amend chronyc temporarily to produce the desired output

This is only to validate our hypothesis about the cause of the validation error.

[root@rac1 ~]# mv /usr/bin/chronyc{,.orig}
[root@rac1 ~]# vi /usr/bin/chronyc
[root@rac1 ~]# chmod a+x /usr/bin/chronyc
[root@rac1 ~]# /usr/bin/chronyc sources
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123               3   8   377    21    -32us[-5361ns] +/-  578us
[root@rac1 ~]# cat /usr/bin/chronyc
#!/bin/bash
echo '210 Number of sources = 1'
/usr/bin/chronyc.orig "$@"

Then we can validate the change by running CVU:

[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose

Verifying Clock Synchronization ...
  Node Name                             Status
  ------------------------------------  ------------------------
  rac1                                  passed

  Node Name                             State
  ------------------------------------  ------------------------
  rac1                                  Observer

CTSS is in Observer state. Switching over to clock synchronization checks using NTP

  Verifying Network Time Protocol (NTP) ...
    Verifying '/etc/chrony.conf' ...
    Node Name                             File exists?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying '/etc/chrony.conf' ...PASSED
    Verifying Daemon 'chronyd' ...
    Node Name                             Running?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying Daemon 'chronyd' ...PASSED
    Verifying NTP daemon or service using UDP port 123 ...
    Node Name                             Port Open?
    ------------------------------------  ------------------------
    rac1                                  yes

    Verifying NTP daemon or service using UDP port 123 ...PASSED
    Verifying chrony daemon is synchronized with at least one external time source ...PASSED
  Verifying Network Time Protocol (NTP) ...PASSED
Verifying Clock Synchronization ...PASSED

Verification of Clock Synchronization across the cluster nodes was successful.

CVU operation performed:      Clock Synchronization across the cluster nodes
Date:                         Dec 9, 2021 11:05:36 AM
CVU home:                     /u01/app/19.3.0/grid/
User:                         grid

Conclusion

The requirement to have an external NTP server is questionable: there can be hardware clocks that chrony can use, such as the TimeSync PTP service on Azure. This post demonstrates how to debug NTP issues during Grid Infrastructure installations and/or CVU checks. One generic workaround is to disable the offending check by setting ORA_DISABLED_CVU_CHECKS. If the issue is caused by Oracle utilities expecting certain output, we can also tweak system programs temporarily to produce the desired output.
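
Applied to the original failing command, the variable can simply be exported beforehand - a sketch I have not shown running here, so verify that executeConfigTools honors it in your environment:

[grid@rac1 ~]$ export ORA_DISABLED_CVU_CHECKS=TASKNTP
[grid@rac1 ~]$ /u01/app/19.3.0/grid/gridSetup.sh -executeConfigTools \
      -responseFile /opt/rsp/gi_19.3_config.rsp -silent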

Tuesday, November 30, 2021

Setting database session to read only in 21c

Oracle introduced a new read_only parameter in 21c, which is not yet documented in the Database Reference at the time of this writing.
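
Being an initialization parameter, it should nevertheless show up in V$PARAMETER - a quick sanity check (a sketch; I am assuming the parameter is not hidden in your version):

select name, value, isses_modifiable
  from v$parameter
 where name = 'read_only';

Here is the parameter in action: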

[oracle@db-21 ~]$ sqlplus /nolog @q

SQL*Plus: Release 21.0.0.0.0 - Production on Tue Nov 30 10:39:56 2021
Version 21.4.0.0.0

Copyright (c) 1982, 2021, Oracle.  All rights reserved.

SQL>
SQL> conn tc/tc@db-21/pdb
Connected.
SQL>
SQL> create table t(x int);

Table created.

SQL>
SQL> insert into t values (0);

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> alter session set read_only=true;

Session altered.

SQL>
SQL> update t
  2     set x=1;
update t
       *
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access


SQL>
SQL> create table new_tab(x int);
create table new_tab(x int)
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access


SQL>
SQL> select *
  2    from t
  3    for update;
  from t
       *
ERROR at line 2:
ORA-16000: database or pluggable database open for read-only access


SQL>
SQL> lock table t in exclusive mode;
lock table t in exclusive mode
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access


SQL>
SQL> alter session set read_only=false;

Session altered.

SQL>
SQL> update t
  2     set x=1;

1 row updated.

SQL>
SQL> create table new_tab(x int);

Table created.

SQL>
SQL> select *
  2    from t
  3    for update;

         X
----------
         1

SQL>
SQL> lock table t in exclusive mode;

Table(s) Locked.

I believe it was originally introduced for the Oracle Autonomous Database offerings, since that is the only place where it is documented: Change Autonomous Database Operation Mode to Read/Write Read-Only or Restricted. No other database feature provides the same functionality at this level of granularity. A typical use case is when certain application sessions should be made read-only. We can set read_only=true in a logon trigger for that:

SQL> create or replace trigger logon_trg
  2  after logon on tc.schema
  3  declare
  4  begin
  5    execute immediate 'alter session set read_only=true';
  6  end;
  7  /

Trigger created.

SQL>
SQL> conn tc/tc@db-21/pdb
Connected.
SQL> select *
  2    from t
  3    for update;
select *
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access


SQL>
SQL> conn system/manager@db-21/pdb
Connected.
SQL> select *
  2    from tc.t
  3    for update;

         X
----------
         1

Friday, November 12, 2021

ORA-7445 kpdbfSourceFileSearch with NFSv3 without noac option

Creating a pluggable database (PDB) by plugging in an unplugged PDB in 19.12 can fail with an ORA-7445 kpdbfSourceFileSearch error. Somehow there is no information about this error on MOS, so I am putting it here.

SQL> create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile';
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 302875
Session ID: 140 Serial number: 36632

This produces the following lines in the alert log:

2021-11-11T16:11:59.436812+00:00
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
2021-11-11T16:11:59.662614+00:00
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mo
untproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048
,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally          created  as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn -14
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1C8] [PC:0x99556E2, kpdbfSourceFileSearch()+674] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/racdba/racdba1/trace/racdba1_ora_302875.trc  (incident=16137) (PDBNAME=CDB$ROOT):
ORA-07445: exception encountered: core dump [kpdbfSourceFileSearch()+674] [SIGSEGV] [ADDR:0x1C8] [PC:0x99556E2] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/racdba/racdba1/incident/incdir_16137/racdba1_ora_302875_i16137.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2021-11-11T16:12:01.412859+00:00
Dumping diagnostic data in directory=[cdmp_20211111161201], requested by (instance=1, osid=302875), summary=[incident=16137].
2021-11-11T16:12:47.954425+00:00
**************************************************************
Undo Create of Pluggable Database PDB_TEST with pdb id - 4.
**************************************************************

The partial call stack trace from the incident file is this:

__sighandler()       call     sslsshandler()       000000019 7F66A4F8EBF0
                                                   7F66A4F8EAC0 7F66A4F8EAC0 ?
                                                   7F66A4F8EA30 ? 000000083 ?
kpdbfSourceFileSear  signal   __sighandler()       7F66AA3C09C0 000000000
ch()+674                                           7F66A39B09E0 000000008 ?
                                                   000000701 ? 7F66AA27C450 ?
kpdbcPlugVerifyFile  call     kpdbfSourceFileSear  000000014 7F6600000000
s()+1853                      ch()                 7F66A35CC594 7F66A35CC58C
                                                   7F66AA411140 7F66AA27C450 ?
kpdbcPlugVerifyAndC  call     kpdbcPlugVerifyFile  7FFE906117B0 7F6600000000 ?
opyFiles()+1722               s()                  068618378 7F66A35CC58C ?
                                                   7F66AA411140 ? 7F66AA27C450 ?
kpdbcParseAndVerify  call     kpdbcPlugVerifyAndC  7FFE906117B0 063F4FBA8
XML()+558                     opyFiles()           068618378 ? 7F66A35CC58C ?
                                                   7F66AA411140 ? 7F66AA27C450 ?
kpdbcCreatePdbCbk()  call     kpdbcParseAndVerify  7FFE906117B0 063F4FBA8
+2832                         XML()                068618378 ? 7F66A35CC58C ?
                                                   7F66AA411140 ? 7F66AA27C450 ?
kpdbcCreatePdb()+20  call     kpdbcCreatePdbCbk()  7F66A464F5D0 063F4FBA8 ?
73                                                 7F66A464F5A8 7F66A35CC58C ?
                                                   7F66AA411140 ? 7F66AA27C450 ?

Direct NFS (DNFS) is disabled for the database, and the actual mount options are as follows:

[root@raca1 ~]# mount | grep /mnt/racdba
mikhailstorageaccount.blob.core.windows.net:/mikhailstorageaccount/mikhailstoragecontainer on /mnt/racdba type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)

In this case, remounting the filesystem with the "noac" option (no attribute caching) helps.
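
A remount along these lines adds the option - a sketch that mirrors the mount output above; verify the full option set in your environment:

[root@raca1 ~]# umount /mnt/racdba
[root@raca1 ~]# mount -t nfs -o vers=3,hard,nolock,proto=tcp,port=2048,mountport=2048,rsize=1048576,wsize=1048576,noac \
      mikhailstorageaccount.blob.core.windows.net:/mikhailstorageaccount/mikhailstoragecontainer /mnt/racdba

After the remount, the PDB creation succeeds: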

SQL> create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile';

Pluggable database created.

This produces the following lines in the alert log:

2021-11-11T16:37:07.331022+00:00
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
2021-11-11T16:37:08.442188+00:00
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally          created  as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn -14
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-SYSTEM_FNO-14_020dtklc for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn-14
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally          created  as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/sysaux.279.1088344145 with afn -15
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-SYSAUX_FNO-15_050dtn16 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/sysaux.279.1088344145 with afn-15
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally          created  as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undotbs1.280.1088344145 with afn -16
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-UNDOTBS1_FNO-16_030dtler for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undotbs1.280.1088344145 with afn-16
Creating new file-/mnt/racdba/datafile/temp.282.1088344151 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/TEMPFILE/temp.282.1088344151
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally          created  as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undo_2.283.1088344175 with afn -17
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-UNDO_2_FNO-17_040dtm80 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undo_2.283.1088344175 with afn-17
2021-11-11T16:37:32.703370+00:00
PDB_TEST(4):Endian type of dictionary set to little
****************************************************************
Pluggable Database PDB_TEST with pdb id - 4 is created as UNUSABLE.
If any errors are encountered before the pdb is marked as NEW,
then the pdb must be dropped
local undo-1, localundoscn-0x0000000000000114
****************************************************************
PDB_TEST(4):Pluggable database PDB_TEST pseudo opening
PDB_TEST(4):SUPLOG: Initialize PDB SUPLOG SGA, old value 0x0, new value 0x18
PDB_TEST(4):Autotune of undo retention is turned on.
PDB_TEST(4):This instance was first to open pluggable database PDB_TEST (container=4)
PDB_TEST(4):queued attach DA request 0x8fff7aa0 for pdb 4, ospid 334557
2021-11-11T16:37:33.671956+00:00
Increasing priority of 2 RS
Domain Action Reconfiguration started (domid 4, new da inc 11, cluster inc 8)
Instance 1 is attaching to domain 4
 Global Resource Directory partially frozen for domain action
Domain Action Reconfiguration complete (total time 0.0 secs)
Decreasing priority of 2 RS
2021-11-11T16:37:33.793006+00:00
PDB_TEST(4):Undo initialization recovery: Parallel FPTR complete: start:18298543 end:18298544 diff:1 ms (0.0 seconds)
PDB_TEST(4):Undo initialization recovery: err:0 start: 18298542 end: 18298545 diff: 3 ms (0.0 seconds)
PDB_TEST(4):[334557] Successfully onlined Undo Tablespace 2.
PDB_TEST(4):Undo initialization online undo segments: err:0 start: 18298545 end: 18298678 diff: 133 ms (0.1 seconds)
PDB_TEST(4):Undo initialization finished serial:0 start:18298542 end:18298680 diff:138 ms (0.1 seconds)
PDB_TEST(4):Database Characterset for PDB_TEST is AL32UTF8
PDB_TEST(4):Pluggable database PDB_TEST pseudo closing
PDB_TEST(4):JIT: pid 334557 requesting stop
PDB_TEST(4):Closing sequence subsystem (18298726913).
PDB_TEST(4):Buffer Cache flush started: 4
PDB_TEST(4):Buffer Cache flush finished: 4
PDB_TEST(4):queued detach DA request 0x8fff7a38 for pdb 4, ospid 334557
2021-11-11T16:37:34.520104+00:00
Increasing priority of 2 RS
Domain Action Reconfiguration started (domid 4, new da inc 12, cluster inc 8)
Instance 1 is detaching from domain 4 (lazy abort? 0, recovery member? 0)
 Global Resource Directory partially frozen for domain action
* domain detach - domain 4 valid ? 1
 Non-local Process blocks cleaned out
 Set master node info
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
freeing rdom 4
freeing the fusion rht of pdb 4
freeing the pdb enqueue rht
Domain Action Reconfiguration complete (total time 0.0 secs)
Decreasing priority of 2 RS
Completed: create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'

суббота, 30 октября 2021 г.

Workaround for ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion

Oracle versions 9i and 10.1 have a bug in LOB to CHAR conversion: when the value is larger than 4000 bytes, it is silently truncated without raising an error (see: Ora-22835 - CLOB Larger Than 4000 Inserted Into Varchar2(4000) Column Is Silently Truncated (Doc ID 388512.1)). This can become an issue after a database upgrade, since newer versions raise an ORA-22835 error instead. That is what happened in this topic on SQL.RU. This blog post provides an undocumented workaround for the issue.

SQL> !oerr ora 22836
22836, 00000, "Event to turn on lob to char/raw silent truncation"
// *Document: NO
// *Cause:    N/A
// *Action:   Do not throw error 22835 for truncation during LOB to CHAR/RAW
//            conversion. Truncate the data instead.

SQL>
SQL> create table t(vc varchar2(4000));

Table created.

SQL>
SQL> var c clob
SQL>
SQL> begin
  2    select xmltype.createxml(cursor(select * from all_objects)).getClobVal()
  3      into :c
  4      from dual;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_output.put_line(dbms_lob.getlength(:c))
12273513

PL/SQL procedure successfully completed.

SQL>
SQL> insert into t values (to_char(:c));
insert into t values (to_char(:c))
                      *
ERROR at line 1:
ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual:
12273513, maximum: 4000)


SQL>
SQL> alter session set events '22836 level 1';

Session altered.

SQL>
SQL> insert into t values (to_char(:c));

1 row created.

SQL>
SQL> select length(vc) from t;

LENGTH(VC)
----------
      4000

SQL>
SQL> select banner_full from v$version;

BANNER_FULL
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.9.0.0.0

Conclusion

Event 22836 can be used to suppress the ORA-22835 error and restore the old behavior: LOB values larger than 4000 bytes are silently truncated without raising an error. This cannot be considered a long-term solution; however, it can come in handy when changing the code is not practical or feasible.
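
If the behavior is needed beyond a single session, the same event can presumably be set system-wide or persisted in the spfile - a sketch I have not tested here, so use it with care:

alter system set events '22836 level 1';
alter system set event='22836 trace name context forever, level 1' scope=spfile;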

Wednesday, September 22, 2021

Finding cause of kernel: blk_update_request: I/O error, dev fd0, sector 0

There was a Linux server that got periodic kernel: blk_update_request: I/O error, dev fd0, sector 0 errors in /var/log/messages:

Sep 21 11:38:29 localhost kernel: blk_update_request: I/O error, dev fd0, sector 0

There were no cron jobs or systemd timers that could cause these errors, so I had to resort to other tools.

In this specific example, I decided to use Linux audit:

[root@floppy ~]# auditctl -a exit,always -S open -F path=/dev/fd0 -k floppy
WARNING - 32/64 bit syscall mismatch, you should specify an arch

After I got new blk_update_request errors, I obtained the audit logs (formatted for readability):

[root@floppy ~]# ausearch -k floppy
----
time->Tue Sep 21 11:43:04 2021
type=CONFIG_CHANGE msg=audit(1632224584.979:146): auid=1000 ses=1 op=add_rule key="floppy" list=4 res=1
----
time->Tue Sep 21 11:43:17 2021
type=PROCTITLE msg=audit(1632224597.188:147): proctitle=666469736B002D6C002F6465762F666430
type=PATH msg=audit(1632224597.188:147): item=0 name="/dev/fd0" inode=8972 dev=00:05 mode=060660 ouid=0 ogid=6 rdev=02:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=CWD msg=audit(1632224597.188:147):  cwd="/root"
type=SYSCALL msg=audit(1632224597.188:147): arch=c000003e syscall=2 success=no exit=-6 a0=7ffcd4c7290a a1=80000 a2=1 a3=7ffcd4c70e20 items=1 
                                            ppid=1767 pid=2060 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 
                                            comm="fdisk" exe="/usr/sbin/fdisk" key="floppy"
.. after a while ..
[root@floppy ~]# ausearch -k floppy
...
----
time->Tue Sep 21 11:43:31 2021
type=PROCTITLE msg=audit(1632224611.123:148): proctitle=666469736B002D6C002F6465762F666430
type=PATH msg=audit(1632224611.123:148): item=0 name="/dev/fd0" inode=8972 dev=00:05 mode=060660 ouid=0 ogid=6 rdev=02:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=CWD msg=audit(1632224611.123:148):  cwd="/root"
type=SYSCALL msg=audit(1632224611.123:148): arch=c000003e syscall=2 success=no exit=-6 a0=7fff65dd690a a1=80000 a2=1 a3=7fff65dd4220 items=1
                                            ppid=1767 pid=2073 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1
                                            comm="fdisk" exe="/usr/sbin/fdisk" key="floppy"

As can be seen from the executable, it was fdisk. More specifically, fdisk -l can cause such errors on Azure V1 VMs.
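
Once the culprit is found, the audit rule can be deleted with a matching auditctl -d and - assuming the floppy device is genuinely unused, as on these Azure VMs - the errors can be stopped at the source by blacklisting the floppy kernel module (the file name below is illustrative):

[root@floppy ~]# auditctl -d exit,always -S open -F path=/dev/fd0 -k floppy
[root@floppy ~]# echo 'blacklist floppy' > /etc/modprobe.d/blacklist-floppy.conf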

Tuesday, August 31, 2021

ASM disk group fails to mount with ORA-15017 ORA-15066 errors offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss

This is a test 21c cluster that was not shut down properly yesterday. While mounting the DATA disk group, the following errors are encountered:

SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss

Mounting the disk group with the FORCE option is not possible either and fails with the same errors:

ASMCMD> mount data -f
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss (DBD ERROR: OCIStmtExecute)

The following entries appear in the alert log:

2021-08-31T10:31:52.117810+00:00
SQL> alter diskgroup data mount
2021-08-31T10:31:52.124846+00:00
NOTE: cache registered group DATA 2/0x05D14A8C
NOTE: cache began mount (first) of group DATA 2/0x05D14A8C
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:31:52.294934+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:31:52.297979+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:31:58.301169+00:00
ERROR: disk 2 (RACQ$LUN3) in group 2 cannot be offlined because all disks [2(RACQ$LUN3)] with mirror data would be offline.
2021-08-31T10:31:58.301239+00:00
ERROR: too many offline disks in PST (grp 2)
2021-08-31T10:31:58.301932+00:00
NOTE: cache dismounting (clean) group 2/0x05D14A8C (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 7067, NID: 4026531836, image: oracle@rac1.example.com (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 2/0x05D14A8C (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x05d14a8c
NOTE: cache deleting context for group DATA 2/0x05d14a8c
2021-08-31T10:31:58.303103+00:00
GMON dismounting group 2 at 90 for pid 57, osid 7067
2021-08-31T10:31:58.303346+00:00
NOTE: Disk RAC1$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RACQ$LUN3 in mode 0x1 marked for de-assignment
NOTE: Disk RAC1$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN5 in mode 0x7f marked for de-assignment
NOTE: Disk RAC1$LUN5 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss

2021-08-31T10:31:58.314373+00:00
ERROR: alter diskgroup data mount

DATA is a NORMAL redundancy disk group with two regular failure groups and one quorum failure group. This is one of the cases where some undocumented commands may be needed:

SQL> select name,
       path,
       mount_status,
       header_status,
       mode_status,
       state,
       failgroup
  from v$asm_disk
 where group_number=(select group_number from v$asm_diskgroup where name='DATA')
 order by path;
  2    3    4    5    6    7    8    9   10
NAME                           PATH                           MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
                               /dev/flashgrid/rac1.lun3       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac1.lun5       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac1.lun6       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun3       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun5       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/rac2.lun6       CLOSED  MEMBER       ONLINE  NORMAL
                               /dev/flashgrid/racq.lun2       CLOSED  FORMER       ONLINE  NORMAL
                               /dev/flashgrid/racq.lun3       CLOSED  MEMBER       ONLINE  NORMAL

8 rows selected.

SQL> alter diskgroup data mount restricted for recovery;

Diskgroup altered.

SQL> select name,
       path,
       mount_status,
       header_status,
       mode_status,
       state,
       failgroup
  from v$asm_disk
 where group_number=(select group_number from v$asm_diskgroup where name='DATA')
 order by path;
  2    3    4    5    6    7    8    9   10
NAME                           PATH                           MOUNT_S HEADER_STATU MODE_ST STATE    FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
RAC1$LUN3                      /dev/flashgrid/rac1.lun3       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC1$LUN5                      /dev/flashgrid/rac1.lun5       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC1$LUN6                      /dev/flashgrid/rac1.lun6       CACHED  MEMBER       ONLINE  NORMAL   RAC1
RAC2$LUN3                      /dev/flashgrid/rac2.lun3       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RAC2$LUN5                      /dev/flashgrid/rac2.lun5       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RAC2$LUN6                      /dev/flashgrid/rac2.lun6       CACHED  MEMBER       ONLINE  NORMAL   RAC2
RACQ$LUN3                      /dev/flashgrid/racq.lun3       CACHED  MEMBER       ONLINE  NORMAL   RACQ

7 rows selected.

The alert log:

2021-08-31T10:37:56.317024+00:00
SQL> alter diskgroup data mount restricted for recovery
2021-08-31T10:37:56.323711+00:00
NOTE: cache registered group DATA 2/0xEA114A90
NOTE: cache began mount (first) of group DATA 2/0xEA114A90
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:37:56.529659+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:37:56.532517+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.539918+00:00
NOTE: GMON heartbeating for grp 2 (DATA)
2021-08-31T10:38:02.540568+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
GMON querying group 2 at 125 for pid 57, osid 7067
2021-08-31T10:38:02.540905+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.541129+00:00
NOTE: cache is mounting group DATA created on 2021/07/30 12:37:34
NOTE: cache opening disk 0 of grp 2: RAC1$LUN6 path:/dev/flashgrid/rac1.lun6
NOTE: group 2 (DATA) high disk header ckpt advanced to fcn 0.42883
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 0 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 1 of grp 2: RAC2$LUN6 path:/dev/flashgrid/rac2.lun6
NOTE: cache opening disk 3 of grp 2: RAC1$LUN3 path:/dev/flashgrid/rac1.lun3
NOTE: cache opening disk 4 of grp 2: RAC2$LUN3 path:/dev/flashgrid/rac2.lun3
NOTE: cache opening disk 5 of grp 2: RAC2$LUN5 path:/dev/flashgrid/rac2.lun5
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 5 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 6 of grp 2: RAC1$LUN5 path:/dev/flashgrid/rac1.lun5
2021-08-31T10:38:02.541726+00:00
NOTE: cache mounting (first) normal redundancy group 2/0xEA114A90 (DATA)
2021-08-31T10:38:02.963014+00:00
NOTE: attached to recovery domain 2
2021-08-31T10:38:03.009050+00:00
validate pdb 2, flags x4, valid 0, pdb flags x204
* validated domain 2, flags = 0x200
NOTE: cache recovered group 2 to fcn 0.43425
NOTE: redo buffer size is 512 blocks (2105344 bytes)
2021-08-31T10:38:03.011498+00:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 30.7188 lock domain=0 inc#=0 instnum=1
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
2021-08-31T10:38:03.022814+00:00
NOTE: LGWR opened thread 1 (DATA) at fcn 0.43425 ABA 31.7189 lock domain=2 inc#=2 instnum=1 gx.incarn=3927001744 mntstmp=2021/08/31 10:38:03.012000
2021-08-31T10:38:03.023034+00:00
NOTE: cache mounting group 2/0xEA114A90 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0xea114a90
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.103835+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104063+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.104307+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104463+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105130+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105436+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
WARNING: DATA has too many failure groups for a stretch cluster.
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.148108+00:00
SUCCESS: diskgroup DATA was mounted
2021-08-31T10:38:03.157095+00:00
SUCCESS: alter diskgroup data mount restricted for recovery
2021-08-31T10:38:03.167013+00:00
NOTE: diskgroup resource ora.DATA.dg is online
2021-08-31T10:38:19.113318+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.114528+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.475013+00:00
SQL> ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:19.478374+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.486520+00:00
NOTE: GroupBlock outside rolling migration privileged region
NOTE: initiating resync of disk group 2 disks
RACQ$LUN3 (2)

NOTE: process _user21083_+asm1 (21083) initiating offline of disk 2.4042374244 (RACQ$LUN3) with mask 0x7e in group 2 (DATA) without client assisting
2021-08-31T10:38:19.521378+00:00
NOTE: sending set offline flag message (2320259704) to 1 disk(s) in group 2
2021-08-31T10:38:19.521792+00:00
WARNING: Disk 2 (RACQ$LUN3) in group 2 mode 0x1 is now being offlined
2021-08-31T10:38:19.522023+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x6a, op = clear mandatory
2021-08-31T10:38:19.522202+00:00
GMON updating disk modes for group 2 at 135 for pid 52, osid 21083
2021-08-31T10:38:19.522431+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.523238+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x7e, op = clear mandatory
2021-08-31T10:38:19.523484+00:00
GMON updating disk modes for group 2 at 136 for pid 52, osid 21083
2021-08-31T10:38:19.523608+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.524311+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance membership refresh for group=2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x11, op = assign mandatory
2021-08-31T10:38:19.532576+00:00
GMON updating disk modes for group 2 at 137 for pid 52, osid 21083
2021-08-31T10:38:19.532724+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.540473+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance disk validation for group=2
2021-08-31T10:38:19.540830+00:00
NOTE: disk validation pending for 1 disk in group 2/0xea114a90 (DATA)
NOTE: Found /dev/flashgrid/racq.lun3 for disk RACQ$LUN3
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: completed disk validation for 2/0xea114a90 (DATA)
2021-08-31T10:38:19.677322+00:00
NOTE: running client discovery for group 2 (reqid:16866344061815119730)
NOTE: discarding redo for group 2 disk 2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x19, op = assign mandatory
2021-08-31T10:38:20.176650+00:00
GMON updating disk modes for group 2 at 138 for pid 52, osid 21083
NOTE: group DATA: updated PST location: disks 0005 0000 0002
2021-08-31T10:38:20.225248+00:00
NOTE: PST update grp = 2 completed successfully
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:20.225980+00:00
NOTE: membership refresh pending for group 2/0xea114a90 (DATA)
2021-08-31T10:38:20.227553+00:00
GMON querying group 2 at 139 for pid 31, osid 22344
2021-08-31T10:38:20.237960+00:00
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: cache opening disk 2 of grp 2: RACQ$LUN3 path:/dev/flashgrid/racq.lun3
SUCCESS: refreshed membership for 2/0xea114a90 (DATA)
2021-08-31T10:38:20.238448+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x7f, op = assign mandatory
2021-08-31T10:38:20.239039+00:00
GMON updating disk modes for group 2 at 140 for pid 52, osid 21083
2021-08-31T10:38:20.284132+00:00
NOTE: PST update grp = 2 completed successfully
2021-08-31T10:38:20.284349+00:00
SUCCESS: ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:21.596382+00:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.

The alert log shows that the RACQ$LUN3 disk is taken offline first and then brought back online, and that ASM corrected the issue by itself. The 'for recovery' mount option used to be documented somewhere on MOS, but I cannot find the document anymore. It looks like Oracle Support made it non-public.

I can now remount the disk group cleanly:

SQL> alter diskgroup data dismount;

Diskgroup altered.

SQL> alter diskgroup data mount;

Diskgroup altered.

Wednesday, June 30, 2021

Making DBMS_SCHEDULER.CREATE_JOB Transactional

It is well known that DBMS_SCHEDULER.CREATE_JOB is non-transactional, as opposed to the old DBMS_JOB. There is even an Oracle idea to provide a transactional interface: link. In Oracle 19c, DBMS_JOB jobs are actually DBMS_SCHEDULER jobs, and they can be part of a bigger transaction. I was curious how that is done and whether I could make it work for DBMS_SCHEDULER jobs.
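
As a quick illustration of the old DBMS_JOB behavior, here is a minimal sketch (mine, not part of the test script below) showing that DBMS_JOB.SUBMIT leaves the session in a transaction:

declare
  l_job binary_integer;
begin
  -- DBMS_JOB.SUBMIT does not commit; the job cannot run until
  -- the enclosing transaction is committed
  dbms_job.submit(l_job, 'null;');
  dbms_output.put_line('tx: ' || dbms_transaction.local_transaction_id);
  commit;
end;
/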

Here is the test script that I used for this blog post on 19.9:

SQL> conn / as sysdba
Connected.
SQL>
SQL> alter session set container=pdb;

Session altered.

SQL>
SQL> exec dbms_scheduler.purge_log()

PL/SQL procedure successfully completed.

SQL>
SQL> drop user tc cascade;

User dropped.

SQL>
SQL> grant create job, create session to tc identified by tc;

Grant succeeded.

SQL>
SQL> grant execute on sys.dbms_isched to tc;

Grant succeeded.

First, let us create a traditional DBMS_SCHEDULER job to demonstrate that CREATE_JOB is non-transactional. By "non-transactional" here I mean that it does not leave the session in a transaction.

SQL> conn tc/tc@localhost/pdb
Connected.
SQL>
SQL> doc
DOC>################################################################################
DOC>#  Traditional Job
DOC>################################################################################
DOC>#
SQL>
SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------


SQL>
SQL> exec dbms_scheduler.create_job( -
>   job_name => 'JOB_NON_TX', -
>   job_type => 'PLSQL_BLOCK', -
>   job_action => 'null;', -
>   enabled    => true)

PL/SQL procedure successfully completed.

SQL>
SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------


SQL>
SQL> exec dbms_session.sleep(5)

PL/SQL procedure successfully completed.

SQL>
SQL> col job_name for a10
SQL>
SQL> select job_name, state
  2    from user_scheduler_jobs;

no rows selected

SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
  2    from user_scheduler_job_run_details
  3   order by log_date;

LOG_DATE                            JOB_NAME   STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED

Now I try the same, but this time I call DBMS_ISCHED.SET_NO_COMMIT_FLAG before the CREATE_JOB procedure.

SQL> doc
DOC>################################################################################
DOC>#  Transactional Job
DOC>################################################################################
DOC>#
SQL>
SQL> exec sys.dbms_isched.set_no_commit_flag

PL/SQL procedure successfully completed.

SQL>
SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------


SQL>
SQL> exec dbms_scheduler.create_job( -
>   job_name => 'JOB_TX', -
>   job_type => 'PLSQL_BLOCK', -
>   job_action => 'null;', -
>   enabled    => true)

PL/SQL procedure successfully completed.

SQL>
SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
9.23.604

SQL>
SQL> exec dbms_session.sleep(5)

PL/SQL procedure successfully completed.

SQL>
SQL> col job_name for a10
SQL>
SQL> select job_name, state
  2    from user_scheduler_jobs;

JOB_NAME   STATE
---------- ----------
JOB_TX     SCHEDULED

SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
  2    from user_scheduler_job_run_details
  3   order by log_date;

LOG_DATE                            JOB_NAME   STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED

Things are a bit different this time around:

  • DBMS_SCHEDULER.CREATE_JOB left the session in a transaction.
  • The job has not started.
  • Issuing a rollback removes the job definition (see the sketch below).
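
A sketch of that rollback check (JOB_TX2 is a hypothetical job name; it assumes the same session state as above):

exec sys.dbms_isched.set_no_commit_flag
exec dbms_scheduler.create_job( -
  job_name => 'JOB_TX2', -
  job_type => 'PLSQL_BLOCK', -
  job_action => 'null;', -
  enabled    => true)
rollback;
-- after the rollback, JOB_TX2 is no longer in USER_SCHEDULER_JOBS
select job_name from user_scheduler_jobs;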

Such a job starts as soon as the transaction is committed:

SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_session.sleep(5)

PL/SQL procedure successfully completed.

SQL>
SQL> col job_name for a10
SQL> col state for a10
SQL>
SQL> select job_name, state
  2    from user_scheduler_jobs;

no rows selected

SQL>
SQL> col status for a10
SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
  2    from user_scheduler_job_run_details
  3   order by log_date;

LOG_DATE                            JOB_NAME   STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED
30-JUN-21 12.02.36.835878 PM +01:00 JOB_TX     SUCCEEDED

Conclusion

This post demonstrates that the undocumented DBMS_ISCHED.SET_NO_COMMIT_FLAG procedure can be used to make DBMS_SCHEDULER.CREATE_JOB transactional. Since the package is undocumented, there is no guarantee that it will keep working in future versions. Hopefully, Oracle will provide a supported transactional interface someday.
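
For example, the two calls could be wrapped into a helper along these lines (CREATE_JOB_TX is my name; the sketch relies on the EXECUTE grant on SYS.DBMS_ISCHED shown earlier and on the undocumented flag keeping its current semantics):

create or replace procedure create_job_tx(
  p_job_name   in varchar2,
  p_job_action in varchar2)
as
begin
  -- undocumented: tells the scheduler not to commit inside CREATE_JOB
  sys.dbms_isched.set_no_commit_flag;
  dbms_scheduler.create_job(
    job_name   => p_job_name,
    job_type   => 'PLSQL_BLOCK',
    job_action => p_job_action,
    enabled    => true);
  -- no commit here: the caller decides whether to commit or roll back
end create_job_tx;
/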

Friday, April 23, 2021

How to Send Messages to AWS SQS from PL/SQL Using UTL_HTTP

This blog post provides a complete example showing how to send messages to an AWS SQS queue.

AWS Prerequisites

I created a queue called TestQueue and an SQS VPC endpoint for this example.
The following IAM policy is assigned to a dedicated IAM user created for this example:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "sqs:SendMessage",
            "Resource": "arn:aws:sqs:us-east-1:12<redacted>:TestQueue"
        }
    ]
}
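
For reference, the queue itself can be created with the AWS CLI along these lines (a sketch; the VPC endpoint and IAM setup are not shown):
aws sqs create-queue --queue-name TestQueue --region us-east-1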

Environment

I used Oracle Database 19.11 in my tests. The database user that is going to call the AWS SQS API is TC.

Wallet

First of all, let us load the AWS certificates into a wallet. Creating a new wallet:
[oracle@rac1 ~]$ orapki wallet create -wallet /u01/app/oracle/wallet_sqs -pwd Oracle123 -auto_login
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Operation is successfully completed.
Getting AWS certificates and loading them:
[oracle@rac1 ~]$ openssl s_client -showcerts -connect vpce-010f631ef8f1e24e7-8br55nv1.sqs.us-east-1.vpce.amazonaws.com:443 </dev/null 2>/dev/null | sed '/BEGIN/,/END/!d' | csplit -f aws -b '%d.pem' - /END/+1
2033
4798
[oracle@rac1 ~]$ orapki wallet add -wallet /u01/app/oracle/wallet_sqs -cert aws1.pem -trusted_cert -pwd Oracle123
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Operation is successfully completed.
In order to call AWS SQS, I need to provide both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. I certainly do not want to store them somewhere in the database. Instead, I am going to add the credentials to the wallet:
[oracle@rac1 ~]$ mkstore -wrl /u01/app/oracle/wallet_sqs -createCredential aws_sqs AKI<AWS_ACCESS_KEY_ID redacted>
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
In the end, that is what my wallet looks like:
[oracle@rac1 ~]$ orapki wallet display -wallet /u01/app/oracle/wallet_sqs
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Oracle Secret Store entries:
oracle.security.client.connect_string1
oracle.security.client.password1
oracle.security.client.username1
Trusted Certificates:
Subject:        CN=Amazon,OU=Server CA 1B,O=Amazon,C=US
[oracle@rac1 ~]$ mkstore -wrl /u01/app/oracle/wallet_sqs -listCredential
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:
List credential (index: connect_string username)
1: aws_sqs AKI<AWS_ACCESS_KEY_ID redacted>

Configuring Network Access Control Lists (ACLs)

Two privileges are required for the calling user TC:
  • The user must be able to access the VPC endpoint over the network
  • The user must be able to read the wallet file
The privileges are assigned as follows:
SQL> exec dbms_network_acl_admin.append_host_ace( -
  host => '*.amazonaws.com', -
  ace  => xs$ace_type(privilege_list => xs$name_list('connect'), -
                      principal_name => 'tc', -
                      principal_type => xs_acl.ptype_db))

PL/SQL procedure successfully completed.

SQL> exec dbms_network_acl_admin.append_wallet_ace( -
  wallet_path => 'file:/u01/app/oracle/wallet_sqs', -
  ace         =>  xs$ace_type(privilege_list => xs$name_list('use_client_certificates', 'use_passwords'), -
                              principal_name => 'tc', -
                              principal_type => xs_acl.ptype_db))

PL/SQL procedure successfully completed.
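
If needed, the resulting privileges can be verified through the dictionary views, for example:
select host, principal, privilege from dba_host_aces where principal = 'TC';
select wallet_path, principal, privilege from dba_wallet_aces where principal = 'TC';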

Making UTL_HTTP Call

Finally, I am ready to make an AWS SQS API call:
SQL> declare
  2    req utl_http.req;
  3    resp utl_http.resp;
  4    value varchar2(32767);
  5    endpoint varchar2(32767) := 'https://vpce-010f631ef8f1e24e7-8br55nv1.sqs.us-east-1.vpce.amazonaws.com/12<redacted>/TestQueue/';
  6    parameters varchar2(32767) := 'Action=SendMessage&MessageBody=example';
  7    request_url varchar2(32767) := endpoint || '?' || parameters;
  8  begin
  9    utl_http.set_wallet('file:/u01/app/oracle/wallet_sqs');
 10    req := utl_http.begin_request(request_url);
 11    utl_http.set_header(req, 'x-amz-date', to_char(sysdate,'yyyymmdd"T"hh24miss"Z"'));
 12    utl_http.set_property(req, 'aws-region', 'us-east-1');
 13    utl_http.set_property(req, 'aws-service', 'sqs');
 14    utl_http.set_authentication_from_wallet(req, 'aws_sqs', 'AWS4-HMAC-SHA256');
 15    resp := utl_http.get_response(req);
 16    loop
 17      utl_http.read_line(resp, value, true);
 18      dbms_output.put_line(value);
 19    end loop;
 20  exception
 21    when utl_http.end_of_body
 22    then
 23      utl_http.end_request(req);
 24      utl_http.end_response(resp);
 25    when others
 26    then
 27      utl_http.end_request(req);
 28      raise;
 29  end;
 30  /
<?xml version="1.0"?><SendMessageResponse
xmlns="http://queue.amazonaws.com/doc/2012-11-05/"><SendMessageResult><MessageId
>d59b7cfb-4d9d-4675-bf1c-d64a9cb27929</MessageId><MD5OfMessageBody>1a79a4d60de67
18e8e5b326e338ae533</MD5OfMessageBody></SendMessageResult><ResponseMetadata><Req
uestId>675ba300-5f89-59d8-bf23-fbfcb7bd1fc9</RequestId></ResponseMetadata></Send
MessageResponse>

PL/SQL procedure successfully completed.
Let me explain the code above:
  • line 11:
    11    utl_http.set_header(req, 'x-amz-date', to_char(sysdate,'yyyymmdd"T"hh24miss"Z"'));
    x-amz-date is one of the mandatory headers in Signature Version 4
  • lines 12-13:
    12    utl_http.set_property(req, 'aws-region', 'us-east-1');
    13    utl_http.set_property(req, 'aws-service', 'sqs');
    The lines are provided in accordance with the comments in ?/rdbms/admin/utlhttp.sql
  • line 14:
    14    utl_http.set_authentication_from_wallet(req, 'aws_sqs', 'AWS4-HMAC-SHA256');
    It is a single UTL_HTTP call, but a lot of work is performed inside the procedure, such as loading AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the wallet and constructing a Signature Version 4 request. The AWS4-HMAC-SHA256 scheme is not yet covered in the Oracle documentation, but it is mentioned in ?/rdbms/admin/utlhttp.sql
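
The file referenced above (? stands for $ORACLE_HOME) can be searched for those hints directly, for example:
grep -in 'aws' $ORACLE_HOME/rdbms/admin/utlhttp.sql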

Conclusion

As this blog post demonstrates, it is quite simple to call the AWS SQS API from PL/SQL using UTL_HTTP. Regrettably, some of the parameters I used are not widely known, and there is little documentation for them. It is my hope that Oracle improves its documentation one day.

It should be possible to call other AWS services using the same technique. SQS was used because there was an original question about it on SQL.RU. For instance, I remember calling S3 the other day using a similar script and a Signature Version 2 request, which corresponds to scheme => 'AWS' in the UTL_HTTP.SET_AUTHENTICATION_FROM_WALLET call. When it comes to S3, there is yet another way to utilize it that I blogged about in January: Reading Amazon S3 Data from Oracle on non-OCI Environment.
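
A hedged sketch of that S3 variant (the bucket name and the aws_s3 wallet credential are placeholders; scheme 'AWS' selects a Signature Version 2 request):

declare
  req  utl_http.req;
  resp utl_http.resp;
begin
  utl_http.set_wallet('file:/u01/app/oracle/wallet_sqs');
  req := utl_http.begin_request('https://<bucket>.s3.amazonaws.com/');
  -- 'AWS' here corresponds to Signature Version 2
  utl_http.set_authentication_from_wallet(req, 'aws_s3', 'AWS');
  resp := utl_http.get_response(req);
  utl_http.end_response(resp);
end;
/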

Wednesday, April 14, 2021

Patching with OPatch Failed with Undefined Reference to __libc_csu_fini __libc_csu_init due to Wrong Libc Library

Patching an Oracle Home failed on one cluster node with the following errors:
Patching component oracle.rdbms, 19.0.0.0.0...
Make failed to invoke "/usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1 OPATCH_SESSION=apply"....'/usr/lib64/crt1.o: In function `_start':
(.text+0x12): undefined reference to `__libc_csu_fini'
/usr/lib64/crt1.o: In function `_start':
(.text+0x19): undefined reference to `__libc_csu_init'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(kf.o): In function `kfNotify':
kf.c:(.text+0x26f): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(jskm.o): In function `jskmCheckIMJobBCast':
jskm.c:(.text+0x202d): undefined reference to `stat'
jskm.c:(.text+0x2066): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(qmsqx.o): In function `qmsqxFetchPos':
qmsqx.c:(text.unlikely+0x1048f): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libcore19.a(slzgetevar.o): In function `slzgetevarf_insert_keyval':
slzgetevar.c:(text.unlikely+0x40e): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libgeneric19.a(skgfqsbt.o): In function `xsbtinit':
skgfqsbt.c:(text.unlikely+0x88): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_create_sym_tmp_file':
eobtl.c:(.text+0x106): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_digest_symbol_file':
eobtl.c:(.text+0x1b8): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_build_lookup_tables':
eobtl.c:(.text+0x24ae): undefined reference to `fstat'
eobtl.c:(.text+0x264f): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libnnzst19.a(ext_ztrsaadapter.o): In function `ztcaProcCtx_New':
ztrsaadapter.c:(.text+0xb95): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libnnzst19.a(ccme_ck_rand_load_fileS1.o): In function `r_ck_random_load_file':
ck_rand_load_file.c:(.text+0xd4): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/rdbms/lib//libfthread19.a(fthread_task.o): In function `fthread_task_tab_init':
fthread_task.c:(text.unlikely+0x3df): undefined reference to `atexit'
make: *** [/u01/app/oracle/product/19.3.0/dbhome_1/rdbms/lib/oracle] Error 1
After a bit of searching, I found that the same errors had been reported in a non-Oracle case where a wrong libc library was picked up.
Sure enough, there was a symbolic link $ORACLE_HOME/lib/libc.so pointing to /lib64/libc.so.6:
[oracle@rac1 ~]$ find $ORACLE_HOME -name libc.so -ls
137463494    4 -rw-r--r--   1 oracle   oinstall       83 Apr 25  2014 /u01/app/oracle/product/19.3.0/dbhome_1/lib/stubs/libc.so
88704788    0 lrwxrwxrwx   1 oracle   oinstall       16 Apr 14 15:16 /u01/app/oracle/product/19.3.0/dbhome_1/lib/libc.so -> /lib64/libc.so.6
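The fix is simply to remove the stray link (after double-checking where it points):
rm /u01/app/oracle/product/19.3.0/dbhome_1/lib/libc.so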
Having deleted the link, I was able to patch the Oracle Home successfully.

Thoughts

My main suspect was ldconfig and the files in /etc/ld.so.conf.d. For instance, Oracle Instant Client creates a file there with its library directory in it:
[oracle@rac1 ~]$ cat /etc/ld.so.conf.d/oracle-instantclient.conf
/usr/lib/oracle/19.3/client64/lib
[oracle@rac1 ~]$ rpm -qf /etc/ld.so.conf.d/oracle-instantclient.conf
oracle-instantclient19.3-basic-19.3.0.0.0-1.x86_64
ldconfig is also called when glibc is installed:
# rpm -qp --scripts /tmp/glibc/glibc-2.17-323.0.1.el7_9.x86_64.rpm
preinstall scriptlet (using <lua>):
-- Check that the running kernel is new enough
required = '2.6.32'
rel = posix.uname("%r")
if rpm.vercmp(rel, required) < 0 then
  error("FATAL: kernel too old", 0)
end
postinstall program: /usr/sbin/glibc_post_upgrade.x86_64
postuninstall program: /sbin/ldconfig
Besides, Oracle also ships libc stubs ($ORACLE_HOME/lib/stubs/libc.so), so could it be that some unsuccessful patch application left a symbolic link behind?
Another possibility is that somebody followed an article from My Oracle Support that recommends adding $ORACLE_HOME/lib to /etc/ld.so.conf.d: IF: External Jobs Failed With ORA-27301/ORA-27302 (Doc ID 2083336.1).
It is still unclear what caused the issue in this scenario since there were no references to $ORACLE_HOME/lib in /etc/ld.so.conf.d. For what it's worth, next time I will be better prepared when I encounter a similar error.
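
As a sketch of what I would check first next time:

# is there a stray libc anywhere in the Oracle Home?
find $ORACLE_HOME -name 'libc.so*' -ls
# does any loader configuration reference the Oracle Home?
grep -r "$ORACLE_HOME" /etc/ld.so.conf.d/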

Tuesday, April 13, 2021

Direct Path Read Write Temp while Selecting CLOB from GV$ in RAC

I was investigating why a simple SELECT * FROM GV$SQL started running much slower in 19c compared to 11.2.0.4. It turns out Oracle introduced some changes there in 18c. This blog post demonstrates the new behavior.

A simple test case which I constructed for this example is below:
set echo on
set timing on

select banner from v$version;
alter session set events 'sql_trace wait=true';
begin
  for r in (
    select sql_fulltext
      from gv$sql)
  loop
    null;
  end loop;
end;
/
alter session set events 'sql_trace off';

select value
  from v$diag_info
 where name='Default Trace File';
I ran it in 19.10 and got the following results:
SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

Elapsed: 00:00:00.00
SQL> alter session set events 'sql_trace wait=true';

Session altered.

Elapsed: 00:00:00.00
SQL> begin
  2    for r in (
  3      select sql_fulltext
  4        from gv$sql)
  5    loop
  6      null;
  7    end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.14
SQL> alter session set events 'sql_trace off';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl19/orcl191/trace/orcl191_ora_19616.trc

Elapsed: 00:00:00.00
I processed the trace file using tkprof and got the output below:
SQL ID: 250c8asaww0wz Plan Hash: 1891717107

SELECT SQL_FULLTEXT
FROM
 GV$SQL


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       21      0.44       2.13        288        288      25355        2011
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       23      0.44       2.13        288        288      25355        2011

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
      2011       2011       2011  PX COORDINATOR  (cr=288 pr=288 pw=2259 time=2499846 us starts=1)
         0          0          0   PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1638400 card=100)
         0          0          0    VIEW  GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0     FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1638400 card=100)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PGA memory operation                           18        0.00          0.00
  PX Deq: reap credit                           232        0.00          0.00
  PX Deq: Join ACK                                2        0.00          0.00
  PX Deq: Parse Reply                             2        0.00          0.00
  PX Deq: Execute Reply                         146        0.00          0.00
  Disk file operations I/O                        5        0.00          0.00
  CSS initialization                              2        0.00          0.00
  CSS operation: query                            6        0.00          0.00
  CSS operation: action                           2        0.00          0.00
  asynch descriptor resize                        1        0.00          0.00
  ASM IO for non-blocking poll                 6769        0.00          0.00
  direct path write temp                       2158        0.00          1.59
  db file sequential read                       144        0.00          0.06
  direct path read temp                         144        0.00          0.06
  reliable message                                1        0.00          0.00
  PX Deq: Signal ACK EXT                          2        0.00          0.00
  IPC send completion sync                        1        0.00          0.00
  PX Deq: Slave Session Stats                     2        0.00          0.00
  IPC group service call                          1        0.00          0.00
  enq: PS - contention                            1        0.00          0.00
It is worth noting that the number of rows is 2011 while the number of direct path write temp wait events is 2158. Based on my tests, I believe this is a general pattern rather than a coincidence: the number of direct path write temp events is always slightly greater than the number of rows. That basically means Oracle does at least one temp write per CLOB returned.
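The wait counts can be reproduced from the raw trace file, for example (grep counts all occurrences in the file, so it can slightly exceed the per-statement figure reported by tkprof):
grep -c "direct path write temp" /u01/app/oracle/diag/rdbms/orcl19/orcl191/trace/orcl191_ora_19616.trc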
Let us now compare it with 12.2 (I tested both 12.2 and 12.2 with the latest release update):
SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
PL/SQL Release 12.2.0.1.0 - Production
CORE    12.2.0.1.0      Production
TNS for Linux: Version 12.2.0.1.0 - Production
NLSRTL Version 12.2.0.1.0 - Production

Elapsed: 00:00:00.01
SQL> alter session set events 'sql_trace wait=true';

Session altered.

Elapsed: 00:00:00.00
SQL> begin
  2    for r in (
  3      select sql_fulltext
  4        from gv$sql)
  5    loop
  6      null;
  7    end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.05
SQL> alter session set events 'sql_trace off';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/o122a/o122a1/trace/o122a1_ora_24892.trc
It can be seen that 12.2 is much faster than 19.10.
Here is a tkprof processed output:
SQL ID: 250c8asaww0wz Plan Hash: 1891717107

SELECT SQL_FULLTEXT
FROM
 GV$SQL


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       21      0.02       0.03          0          0          0        2069
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       23      0.03       0.05          0          0          0        2069

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 76     (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
      2069       2069       2069  PX COORDINATOR  (cr=0 pr=0 pw=0 time=138102 us starts=1)
         0          0          0   PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=200200 card=100)
         0          0          0    VIEW  GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0     FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=200200 card=100)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  enq: PS - contention                            3        0.00          0.00
  PX Deq: Join ACK                                5        0.00          0.00
  PX Deq: reap credit                           277        0.00          0.00
  IPC send completion sync                        3        0.00          0.00
  PX Deq: Parse Reply                             2        0.00          0.00
  PGA memory operation                           28        0.00          0.00
  PX Deq: Execute Reply                         132        0.00          0.01
  reliable message                                1        0.00          0.00
  PX Deq: Signal ACK EXT                          2        0.00          0.00
  PX Deq: Slave Session Stats                     2        0.00          0.00
Although the number of rows is roughly the same (2069 in 12.2 vs 2011 in 19.10), the runtime performance and wait events are drastically different. Bear in mind that I was initially asked to investigate why a query that ran in 1 second on 11.2 took 90 seconds on 19.10 with about the same number of rows as in 11.2.
I also reviewed the 19c trace file to see if I could spot any pattern there:
[oracle@rac1 ~]$ grep 'WAIT #139925193433952' /u01/app/oracle/diag/rdbms/orcl19/orcl191/trace/orcl191_ora_19616.trc | grep -E 'direct path (read|write) temp|db file sequential read'
WAIT #139925193433952: nam='direct path write temp' ela= 758 file number=201 first dba=4355 block cnt=1 obj#=-1 tim=20569064799
WAIT #139925193433952: nam='direct path write temp' ela= 674 file number=201 first dba=4356 block cnt=1 obj#=-1 tim=20569065748
..59 direct path write temp in total
WAIT #139925193433952: nam='direct path write temp' ela= 680 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569114649
..then there are groups consisting of one db file sequential read/one direct path read temp (the same block)/multiple direct path write temp:
..group 1
WAIT #139925193433952: nam='db file sequential read' ela= 473 file#=201 block#=4413 blocks=1 obj#=-1 tim=20569115200
WAIT #139925193433952: nam='direct path read temp' ela= 449 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569115739
WAIT #139925193433952: nam='direct path write temp' ela= 699 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569116529
WAIT #139925193433952: nam='direct path write temp' ela= 643 file number=201 first dba=4414 block cnt=1 obj#=-1 tim=20569117342
WAIT #139925193433952: nam='direct path write temp' ela= 753 file number=201 first dba=4415 block cnt=1 obj#=-1 tim=20569118266
WAIT #139925193433952: nam='direct path write temp' ela= 706 file number=201 first dba=4416 block cnt=1 obj#=-1 tim=20569119134
WAIT #139925193433952: nam='direct path write temp' ela= 593 file number=201 first dba=4417 block cnt=1 obj#=-1 tim=20569119882
WAIT #139925193433952: nam='direct path write temp' ela= 626 file number=201 first dba=4418 block cnt=1 obj#=-1 tim=20569120652
WAIT #139925193433952: nam='direct path write temp' ela= 807 file number=201 first dba=4419 block cnt=1 obj#=-1 tim=20569121585
WAIT #139925193433952: nam='direct path write temp' ela= 625 file number=201 first dba=4420 block cnt=1 obj#=-1 tim=20569122348
WAIT #139925193433952: nam='direct path write temp' ela= 605 file number=201 first dba=4421 block cnt=1 obj#=-1 tim=20569123097
WAIT #139925193433952: nam='direct path write temp' ela= 509 file number=201 first dba=4422 block cnt=1 obj#=-1 tim=20569123753
WAIT #139925193433952: nam='direct path write temp' ela= 577 file number=201 first dba=4423 block cnt=1 obj#=-1 tim=20569124476
WAIT #139925193433952: nam='direct path write temp' ela= 667 file number=201 first dba=4424 block cnt=1 obj#=-1 tim=20569125286
WAIT #139925193433952: nam='direct path write temp' ela= 501 file number=201 first dba=4425 block cnt=1 obj#=-1 tim=20569125930
WAIT #139925193433952: nam='direct path write temp' ela= 617 file number=201 first dba=4426 block cnt=1 obj#=-1 tim=20569126698
WAIT #139925193433952: nam='direct path write temp' ela= 634 file number=201 first dba=4427 block cnt=1 obj#=-1 tim=20569127473
WAIT #139925193433952: nam='direct path write temp' ela= 577 file number=201 first dba=4428 block cnt=1 obj#=-1 tim=20569128188
WAIT #139925193433952: nam='direct path write temp' ela= 868 file number=201 first dba=4429 block cnt=1 obj#=-1 tim=20569129188
WAIT #139925193433952: nam='direct path write temp' ela= 609 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569130099
..group 2
WAIT #139925193433952: nam='db file sequential read' ela= 399 file#=201 block#=4430 blocks=1 obj#=-1 tim=20569130544
WAIT #139925193433952: nam='direct path read temp' ela= 454 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569131072
WAIT #139925193433952: nam='direct path write temp' ela= 669 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569131834
WAIT #139925193433952: nam='direct path write temp' ela= 686 file number=201 first dba=4431 block cnt=1 obj#=-1 tim=20569132665
WAIT #139925193433952: nam='direct path write temp' ela= 588 file number=201 first dba=4432 block cnt=1 obj#=-1 tim=20569133416
WAIT #139925193433952: nam='direct path write temp' ela= 794 file number=201 first dba=4433 block cnt=1 obj#=-1 tim=20569134366
WAIT #139925193433952: nam='direct path write temp' ela= 648 file number=201 first dba=4434 block cnt=1 obj#=-1 tim=20569135192
WAIT #139925193433952: nam='direct path write temp' ela= 792 file number=201 first dba=4435 block cnt=1 obj#=-1 tim=20569136134
WAIT #139925193433952: nam='direct path write temp' ela= 682 file number=201 first dba=4436 block cnt=1 obj#=-1 tim=20569136961
WAIT #139925193433952: nam='direct path write temp' ela= 609 file number=201 first dba=4437 block cnt=1 obj#=-1 tim=20569137753
WAIT #139925193433952: nam='direct path write temp' ela= 563 file number=201 first dba=4438 block cnt=1 obj#=-1 tim=20569138501
WAIT #139925193433952: nam='direct path write temp' ela= 645 file number=201 first dba=4439 block cnt=1 obj#=-1 tim=20569139298
WAIT #139925193433952: nam='direct path write temp' ela= 807 file number=201 first dba=4440 block cnt=1 obj#=-1 tim=20569140265
WAIT #139925193433952: nam='direct path write temp' ela= 679 file number=201 first dba=4441 block cnt=1 obj#=-1 tim=20569141079
WAIT #139925193433952: nam='direct path write temp' ela= 626 file number=201 first dba=4442 block cnt=1 obj#=-1 tim=20569141844
WAIT #139925193433952: nam='direct path write temp' ela= 740 file number=201 first dba=4443 block cnt=1 obj#=-1 tim=20569142718
WAIT #139925193433952: nam='direct path write temp' ela= 753 file number=201 first dba=4444 block cnt=1 obj#=-1 tim=20569143615
WAIT #139925193433952: nam='direct path write temp' ela= 686 file number=201 first dba=4445 block cnt=1 obj#=-1 tim=20569144450
WAIT #139925193433952: nam='direct path write temp' ela= 673 file number=201 first dba=4446 block cnt=1 obj#=-1 tim=20569145257
WAIT #139925193433952: nam='direct path write temp' ela= 728 file number=201 first dba=4447 block cnt=1 obj#=-1 tim=20569146129
WAIT #139925193433952: nam='direct path write temp' ela= 589 file number=201 first dba=4448 block cnt=1 obj#=-1 tim=20569146857
WAIT #139925193433952: nam='direct path write temp' ela= 658 file number=201 first dba=4449 block cnt=1 obj#=-1 tim=20569147672
WAIT #139925193433952: nam='direct path write temp' ela= 666 file number=201 first dba=4450 block cnt=1 obj#=-1 tim=20569148478
WAIT #139925193433952: nam='direct path write temp' ela= 602 file number=201 first dba=4451 block cnt=1 obj#=-1 tim=20569149255
WAIT #139925193433952: nam='direct path write temp' ela= 841 file number=201 first dba=4452 block cnt=1 obj#=-1 tim=20569150247
WAIT #139925193433952: nam='direct path write temp' ela= 763 file number=201 first dba=4453 block cnt=1 obj#=-1 tim=20569151167
WAIT #139925193433952: nam='direct path write temp' ela= 779 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569152208
..group 3
WAIT #139925193433952: nam='db file sequential read' ela= 206 file#=201 block#=4454 blocks=1 obj#=-1 tim=20569152457
WAIT #139925193433952: nam='direct path read temp' ela= 711 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569153234
WAIT #139925193433952: nam='direct path write temp' ela= 575 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569153905
WAIT #139925193433952: nam='direct path write temp' ela= 930 file number=201 first dba=4455 block cnt=1 obj#=-1 tim=20569155093
WAIT #139925193433952: nam='direct path write temp' ela= 809 file number=201 first dba=4355 block cnt=1 obj#=-1 tim=20569156069
WAIT #139925193433952: nam='direct path write temp' ela= 820 file number=201 first dba=4356 block cnt=1 obj#=-1 tim=20569157024
WAIT #139925193433952: nam='direct path write temp' ela= 976 file number=201 first dba=4357 block cnt=1 obj#=-1 tim=20569158174
WAIT #139925193433952: nam='direct path write temp' ela= 683 file number=201 first dba=4358 block cnt=1 obj#=-1 tim=20569159165
WAIT #139925193433952: nam='db file sequential read' ela= 638 file#=201 block#=4358 blocks=1 obj#=-1 tim=20569159850
WAIT #139925193433952: nam='direct path read temp' ela= 183 file number=201 first dba=4358 block cnt=1 obj#=-1 tim=20569160107
..
My next thought was to test how 19c processes CLOBs in traditional tables - I found no issues there: there were no direct path read/write temp waits on copies of GV$SQL.
I then tried selecting the same persistent (non-GV$) tables over a database link in 19c - again, everything worked as it should.
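The copy-table test was along these lines (T_SQL is a hypothetical name):

-- materialize the CLOB column into a regular heap table
create table t_sql as select inst_id, sql_id, sql_fulltext from gv$sql;

-- re-run the fetch loop against the copy
begin
  for r in (select sql_fulltext from t_sql) loop
    null;
  end loop;
end;
/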
It seems that only GV$ views are affected. Local selects are not affected at all (the output below is from the first instance):
SQL ID: 87myh1vhjdcf8 Plan Hash: 1891717107

SELECT SQL_FULLTEXT
FROM
 GV$SQL WHERE INST_ID = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       13      0.02       0.02          0          0          0        1231
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       15      0.03       0.03          0          0          0        1231

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
      1231       1231       1231  PX COORDINATOR  (cr=0 pr=0 pw=0 time=13516 us starts=1)
      1231       1231       1231   PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=13283 us starts=1 cost=0 size=1639700 card=100)
      1231       1231       1231    VIEW  GV$SQL (cr=0 pr=0 pw=0 time=12908 us starts=1)
      1231       1231       1231     FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=12657 us starts=1 cost=0 size=1639700 card=100)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PGA memory operation                           99        0.00          0.00
Whereas any select involving a non-local instance utilizes the temporary tablespace:
SQL ID: 4p7vm0hhw6u1v Plan Hash: 1891717107

SELECT SQL_FULLTEXT
FROM
 GV$SQL WHERE INST_ID = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       10      0.21       1.06        118        118      11953         968
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       12      0.21       1.07        118        118      11953         968

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS   (recursive depth: 1)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
       968        968        968  PX COORDINATOR  (cr=118 pr=118 pw=1075 time=907430 us starts=1)
         0          0          0   PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1639700 card=100)
         0          0          0    VIEW  GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
         0          0          0     FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1639700 card=100)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PX Deq: reap credit                           158        0.00          0.00
  PX Deq: Join ACK                                1        0.00          0.00
  PX Deq: Parse Reply                             1        0.00          0.00
  PGA memory operation                           11        0.00          0.00
  PX Deq: Execute Reply                          61        0.00          0.01
  Disk file operations I/O                        3        0.00          0.00
  CSS initialization                              2        0.00          0.00
  CSS operation: query                            6        0.00          0.00
  CSS operation: action                           2        0.00          0.00
  asynch descriptor resize                        1        0.00          0.00
  ASM IO for non-blocking poll                 3208        0.00          0.00
  direct path write temp                       1029        0.00          0.81
  db file sequential read                        59        0.00          0.02
  direct path read temp                          59        0.00          0.02
  reliable message                                1        0.00          0.00
  PX Deq: Signal ACK EXT                          1        0.00          0.00
  IPC send completion sync                        1        0.00          0.00
  PX Deq: Slave Session Stats                     1        0.00          0.00
  IPC group service call                          1        0.00          0.00
  enq: PS - contention                            1        0.00          0.00
I then shut one instance down and rechecked the queries against GV$ - there were no direct path read/write temp wait events.

Conclusion

It is still not clear why Oracle started writing every CLOB from GV$ to a temporary tablespace. I tried several different GV$ views, and the same behavior was observed everywhere. It looks like something was changed inside those GV$ functions.
My initial thought was that it is some kind of temporary tablespace flush for persistence in case some parallel processes get terminated. However, parallel slaves started on remote instances send the blocks to the QC bypassing any temp.
Then I thought it might be done to reduce memory usage, but it makes little sense to write every CLOB to temp anyway. Why not keep a small memory area and spill to temp only what exceeds some threshold?
I also do not know how to alter this behavior to make things work as they did before 18c. I tested 11.2.0.4, 12.2 (vanilla and with the latest RU), 18c, and 19c (vanilla and with the latest RU - 19.10). Both 18c and 19c are affected: they wait on direct path read/write temp when a CLOB column is selected from GV$ with more than one instance running.
I will update the blog post if I get to the bottom of it and identify the root cause.