Страницы

четверг, 31 декабря 2020 г.

Creating RAC Physical Standby for Single Instance Primary

Oracle made some interesting improvements in Database Configuration Assistant (DBCA) in 18c, namely this:
Creating a copy of an Oracle RAC database
A copy of an Oracle RAC database can be created using the -createDuplicateDB command option -databaseConfigType with the value of RAC or RACONENODE.

It has been possible to create a standby database using the createDuplicateDB DBCA command since 12.2.
18c added a new capability to specify the database configuration type of a new standby, such as: RAC, RAC One Node, or a regular single instance database.

This method by itself requires some prerequisite actions. I do not see that these are documented, so that I am writing this blog post.

Setup

Primary

  • Hostname: primary
  • db_name: orcl
  • db_unique_name: orcl
  • Configuration Type: Single Instance

Standby

  • Hostnames: rac1.example.com, rac2.example.com
  • db_unique_name: racdb
  • Configuration Type: RAC
  • SCAN port: 1521
  • Local listener port: 1522
  • Grid Home: /u01/app/19.3.0/grid, owner=grid
  • DB Home: /u01/app/oracle/product/19.3.0/dbhome_1, owner=oracle (role separation)
  • ASM disk groups: single disk group DATA - it is a sandbox
The naming convention might be a bit odd, but the main purpose of this configuration is to provide some initial steps to setup a new RAC database with a view to making it a new primary.

DBCA Command

This is a DBCA command that I will run to create a standby database in 19.9:
dbca -createDuplicateDB -silent \
     -gdbName orcl \
     -primaryDBConnectionString primary:1521/orcl \
     -sid racdb \
     -initParams "dg_broker_start=true" \
     -sysPassword change_on_install \
     -adminManaged \
     -nodelist rac1,rac2 \
     -recoveryAreaDestination +DATA \
       -recoveryAreaSize 10000 \
     -databaseConfigType RAC \
     -useOMF true \
     -storageType ASM \
       -datafileDestination +DATA \
     -createAsStandby \
       -dbUniqueName racdb \
     -createListener rac1.example.com:1522
Firstly, let me go over the keys that are worth mentioning.
  1. -primaryDBConnectionString primary:1521/orcl - it should have a specific port number even if it is default 1521
  2. -createListener rac1.example.com:1522 - this is the most interesting part. This is part of the RMAN block that is run to create a physical standby database:
    duplicate target database
    for standby
    from active database
    dorecover
    nofilenamecheck
    ;
    It is the push-based method of active database duplication. Therefore, the primary database should be able to connect to the new standby. I specified the local listener endpoint (1522) in this example. Then, DBCA always tries to create a new listener in this configuration. I do not know how to avoid that here. Even worse, it attempts to create a new listener in DB home. Thankfully, it silently swallows an error if the listener already exists. Here is a sample excerpt from the DBCA log file to substantiate that remark with some facts:
    [progressPage.flowWorker] [ 2020-12-30 19:04:30.642 UTC ] [ClusterInfo.getHostName:462]  Hostname = rac1
    INFO: Dec 30, 2020 7:04:30 PM oracle.install.commons.system.process.ProcessLauncher launchProcess
    INFO: Executing [/u01/app/oracle/product/19.3.0/dbhome_1/bin/lsnrctl, start, rac1.example.com]
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.install.commons.system.process.ProcessLauncher launchProcess
    INFO: Starting Output Reader Threads for process /u01/app/oracle/product/19.3.0/dbhome_1/bin/lsnrctl
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO:
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO: LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 30-DEC-2020 19:04:30
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO:
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO: Copyright (c) 1991, 2020, Oracle.  All rights reserved.
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO:
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper$2 processLine
    INFO: TNS-01106: Listener using listener name LISTENER has already been started
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.install.commons.system.process.ProcessLauncher launchProcess
    INFO: The process /u01/app/oracle/product/19.3.0/dbhome_1/bin/lsnrctl exited with code 1
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.install.commons.system.process.ProcessLauncher launchProcess
    INFO: Waiting for output processor threads to exit.
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.install.commons.system.process.ProcessLauncher launchProcess
    INFO: Output processor threads exited.
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.common.util.NetworkConfigHelper startListener
    INFO: Exit code of lsnrctl is:1
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.dbca.driver.backend.steps.ListenerConfigStep createStaticListener
    INFO: Static listener created.
    
    INFO: Dec 30, 2020 7:04:30 PM oracle.assistants.dbca.driver.StepDBCAJob$1 update
    INFO: Percentage Progress got for job:Listener config step progress:100.0
    
    There is another little drawback with this approach - it creates listener.ora in DB home which uses a static configuration. It is harmless, but I would rather clean it up after the exercise to have a nice and tidy environment.

Prerequisite Steps

It should be possible to establish a connection from the primary host using the local listener endpoint: rac1.example.com:1522.
I do not discuss DNS setup - it is implied.
I need to configure a static registration for the new RAC instance. For that, I edit my GI listener.ora on the host I am going to run the duplicate command from (rac1):
# /u01/app/19.3.0/grid/network/admin/listener.ora
SID_LIST_LISTENER=
  (SID_LIST=
      (SID_DESC=
         (SID_NAME=racdb1)
         (ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1)
       )
   )
That is basically it. Once I reload the local listener, I am good to continue to the next step.

Creating RAC Physical Standby

[oracle@rac1 ~]$ dbca -createDuplicateDB -silent \
>      -gdbName orcl \
>      -primaryDBConnectionString primary:1521/orcl \
>      -sid racdb \
>      -initParams "dg_broker_start=true" \
>      -sysPassword change_on_install \
>      -adminManaged \
>      -nodelist rac1,rac2 \
>      -recoveryAreaDestination +DATA \
>        -recoveryAreaSize 10000 \
>      -databaseConfigType RAC \
>      -useOMF true \
>      -storageType ASM \
>        -datafileDestination +DATA \
>      -createAsStandby \
>        -dbUniqueName racdb \
>      -createListener rac1.example.com:1522
Prepare for db operation
22% complete
Listener config step
44% complete
Auxiliary instance creation
67% complete
RMAN duplicate
89% complete
Post duplicate database operations
100% complete

Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/racdb/racdb.log" for further details.
That is it:
SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PHYSICAL STANDBY

SQL> select instance_name, host_name from gv$instance;

INSTANCE_NAME    HOST_NAME
---------------- --------------------
racdb1           rac1.example.com
racdb2           rac2.example.com


Further Steps

  1. Delete /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/listener.ora. DBCA created this file with the following content:
    SID_LIST_RAC1.EXAMPLE.COM =
      (SID_LIST =
        (SID_DESC =
          (SID_NAME = racdb1)
        )
      )
    
    RAC1.EXAMPLE.COM =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac1.example.com)(PORT = 1522))
      )
  2. Adding standby redo logs, creating a new Data Guard Configuration, enabling FORCE_LOGGING, Flashback database, etc. - I do not mention it here. The purpose of this post to show how to instantiate a new RAC standby database in one command after completing some simple preliminary steps.

Notable Downsides

There ain't no such thing as a free lunch.
Both slow connection between the sites, and a "huge" database size limits the applicability of this method.
I do not see how this method can utilize the restartable duplication too.
In all other cases, it is possible to quickly spin up a new RAC standby database using this excellent DBCA createDuplicateDB command.

среда, 30 сентября 2020 г.

Determining Oracle Database Edition

The other day I debugged some Oracle provided utilities and came across another way of determining an Oracle Database edition. It does not require a running database instance.

The Oracle code was checking what is inside the $ORACLE_HOME/lib/libedtn19.a library:
# Enterprise Edition
[oracle@rac1 lib]$ ar t /u01/app/oracle/product/19.3.0/dbhome_1/lib/libedtn19.a
vsnfent.o

# Standard Edition
[oracle@rac1 lib]$ ar t /u01/app/oracle/product/19.3.0/dbhome_2/lib/libedtn19.a
vsnfstd.o
As it can be seen, it is vsnfstd.o for Standard vs vsnfent.o for Enterprise.
It appears to be a file for each edition and the file libedtn19.a contains the edition of that specific Oracle Home:
[oracle@rac1 lib]$ find . -name 'libedtn19*a' -print -exec ar t '{}' \;
./libedtn19_xp.a
vsnfxp.o
./libedtn19_ent.a
vsnfent.o
./libedtn19_cse.a
vsnfcse.o
./libedtn19_cee.a
vsnfcee.o
./libedtn19.a
vsnfent.o
./libedtn19_hp.a
vsnfhp.o
./libedtn19_std.a
vsnfstd.o
It has no effect on V$VERSION output whatsoever, and it obviously cannot be used to change one edition to another.
Yet I discovered that srvctl checks that libedtn19.a file in certain scenarios to determine whether or not some edition specific features should be allowed.
Per my research, that library was introduced in Oracle 18c.

суббота, 11 июля 2020 г.

ORA-600 [kglobpg_is_pkp] During Rolling RU 19.7 Apply

I came across this issue while applying RU 19.7 to my RAC database.
That was the old ORACLE_HOME:
[oracle@rac1 dbhome_1]$ OPatch/opatch lspatches
30128191;OJVM RELEASE UPDATE: 19.5.0.0.191015 (30128191)
30122149;OCW RELEASE UPDATE 19.5.0.0.0 (30122149)
30125133;Database Release Update : 19.5.0.0.191015 (30125133)

OPatch succeeded.
That was the new one:
[oracle@rac1 dbhome_3]$ OPatch/opatch lspatches
30805684;OJVM RELEASE UPDATE: 19.7.0.0.200414 (30805684)
30894985;OCW RELEASE UPDATE 19.7.0.0.0 (30894985)
30869156;Database Release Update : 19.7.0.0.200414 (30869156)

OPatch succeeded.
There were two nodes in the cluster: rac1 and rac2.

I set the new ORACLE_HOME for the database and restarted rac2:
Patch Id: 30805684
Patch Description: OJVM RELEASE UPDATE: 19.7.0.0.200414 (30805684)
Patch Apply Time: 2020-07-09T19:38:43Z
Bugs Fixed: 29254623,29445548,29512125,29540327,29774362,29942275,30134746,
30160625,30534662,30855101
===========================================================
2020-07-11T15:39:33.323415+00:00
Resize operation completed for file# 1, old size 993280K, new size 1003520K
Resize operation completed for file# 1, old size 1003520K, new size 1013760K
2020-07-11T15:39:35.107445+00:00
Resize operation completed for file# 1, old size 1013760K, new size 1024000K
Resize operation completed for file# 1, old size 1024000K, new size 1034240K
2020-07-11T15:39:37.061108+00:00
Resize operation completed for file# 1, old size 1034240K, new size 1044480K
Resize operation completed for file# 1, old size 1044480K, new size 1054720K
2020-07-11T15:39:38.695595+00:00
Resize operation completed for file# 1, old size 1054720K, new size 1064960K
jox_pujs ending in pid 23121 cid 1
2020-07-11T15:39:39.024323+00:00
Java patching prepare phase started.
2020-07-11T15:39:39.068031+00:00
## jox_ujs_status: op_instance_patched: returning TRUE in pid 22749
2020-07-11T15:39:44.069088+00:00
## jox_ujs_status: op_instance_patched: returning TRUE in pid 22749
pid 22749 was GEN0 and it was writing that 'jox_ujs_status' message to the alert log every 5 seconds.

That was what I had on rac1:
2020-07-11T15:38:52.976980+00:00
Increasing priority of 1 RS
Domain Action Reconfiguration started (domid 3, new da inc 6, cluster inc 8)
Instance 2 is attaching to domain 3
 Global Resource Directory partially frozen for domain action
 Non-local Process blocks cleaned out
 Set master node info
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
Domain Action Reconfiguration complete (total time 0.1 secs)
Decreasing priority of 1 RS
2020-07-11T15:39:39.024882+00:00
Java patching prepare phase started.
2020-07-11T15:39:39.049868+00:00
## jox_ujs_status: op_instance_patched: UJS active in root, ujs state present, its version does not match executable version, returning FALSE in pid 31373
2020-07-11T15:39:44.069761+00:00
## jox_ujs_status: op_instance_patched: UJS active in root, ujs state present, its version does not match executable version, returning FALSE in pid 31373
2020-07-11T15:39:49.071807+00:00
## jox_ujs_status: op_instance_patched: UJS active in root, ujs state present, its version does not match executable version, returning FALSE in pid 31373
Although it looked suspicious, I decided to continue the patching.
Once I stopped rac1, I was not able to start it anymore:
2020-07-11T15:42:40.981054+00:00
## jox_ujs_status: op_instance_patched: returning TRUE in pid 4744
Starting background process GTX0
2020-07-11T15:42:41.422960+00:00
GTX0 started with pid=63, OS id=4926
2020-07-11T15:42:41.427547+00:00
joxcsys_required_dirobj_exists: directory object exists with required path /u01/app/oracle/product/19.3.0/dbhome_3/javavm/admin/, pid 4844 cid 1
Starting background process RCBG
2020-07-11T15:42:41.492063+00:00
RCBG started with pid=64, OS id=4928
replication_dependency_tracking turned off (no async multimaster replication found)
2020-07-11T15:42:41.831459+00:00
Errors in file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_4844.trc  (incident=5249) (PDBNAME=CDB$ROOT):
ORA-00600: internal error code, arguments: [kglobpg_is_pkp], [0x077DD9BC8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/racdb/racdb1/incident/incdir_5249/racdb1_ora_4844_i5249.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
..
2020-07-11T15:42:42.782978+00:00
## jox_ujs_status: op_instance_patched: returning TRUE in pid 4744
Errors in file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_4844.trc  (incident=5250) (PDBNAME=CDB$ROOT):
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kglobpg_is_pkp], [0x077DD9BC8], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/racdb/racdb1/incident/incdir_5250/racdb1_ora_4844_i5250.trc
2020-07-11T15:42:42.941948+00:00
Dumping diagnostic data in directory=[cdmp_20200711154242], requested by (instance=1, osid=4844), summary=[incident=5249].
2020-07-11T15:42:43.784890+00:00
## jox_ujs_status: op_instance_patched: returning TRUE in pid 4744
2020-07-11T15:42:43.960028+00:00
opiodr aborting process unknown ospid (4844) as a result of ORA-603
2020-07-11T15:42:43.976310+00:00
ORA-603 : opitsk aborting process
License high water mark = 1
USER(prelim) (ospid: 4844): terminating the instance due to ORA error 600
That essentially left me with rac2 whose GEN0 was spamming the following message to the alert log once in 5 seconds:
## jox_ujs_status: op_instance_patched: returning TRUE in pid 22749
Any attempts to run datapatch and finish the patching ended up with the following errors:
ORA-29548: Java system class reported: UJS still running
ORA-06512: at "SYS.DBMS_JAVA_TEST", line 2
ORA-06512: at "SYS.DBMS_JAVA_TEST", line 55
ORA-06512: at line 4
There were some ORA-00600 [kglobpg_is_pkp] errors on My Oracle Support (MOS) but there was nothing specific that I could link to my issue.

When I get no hits, I usually turn PowerView off or extend my search to include bugs, patches, etc.
That worked out well and I found the following patch:
Patch 31359215: INSTANCE STARTUP FAILS ORA-600 [KGLOBPG_IS_PKP] DURING ROLLING RU 19.7 APPLY
I applied that patch to rac1 and was finally able to start that instance successfully. The remaining part of that rolling patch exercise went without a hitch.
The patch itself was released on 7th July. I hope Oracle Support will update the Knowledge Base, or publish some alerts about this issue, or even merge that patch into 19.7 OJVM.

суббота, 13 июня 2020 г.

CURSOR Expressions Return Wrong Results in 18c/19c

That is a wrong results issue I stumbled upon this morning. A developer told me that one of their queries started returning wrong results after we upgraded one of our databases from 12.2.0.1 to 19.7. I constructed a simplified test case to demonstrate this issue which I ran on 19.4. I have been able to reproduce it on 18.3, 19.7, and Live SQL as well.
SQL> create table t1(id int);

Table created.

SQL> create table t2(id int);

Table created.

SQL>
SQL> insert into t1 values (1);

1 row created.

SQL> insert into t2 values (1);

1 row created.
In a nutshell, I created two tables that have one row with a numeric 1 value. Let's now run the following query:
SQL> select query_type
  2    from (select 'A' query_type
  3            from t1 t11
  4           where exists (
  5                   select null
  6                     from t2 t21
  7                    where t21.id = t11.id)
  8           union all
  9          select 'B' query_type
 10            from t1 t12
 11           where not exists (
 12                   select null
 13                     from t2 t22
 14                    where t22.id = t12.id)
 15         );

Q
-
A
Then, I am going to put that query into a CURSOR expression:
SQL> select cursor(
  2           select query_type
  3             from (select 'A' query_type
  4                     from t1 t11
  5                    where exists (
  6                            select null
  7                              from t2 t21
  8                             where t21.id = t11.id)
  9                    union all
 10                   select 'B' query_type
 11                     from t1 t12
 12                    where not exists (
 13                            select null
 14                              from t2 t22
 15                             where t22.id = t12.id)
 16                  )
 17         ) data
 18    from dual;

DATA
--------------------
CURSOR STATEMENT : 1

CURSOR STATEMENT : 1

Q
-
B
That is when it gets interesting. Somehow the query which returned 'A' being run separately starts returning 'B' after it comes as an input subquery into a CURSOR expression.

I initially started looking at the execution plan in an attempt to figure out what changed in the execution plan between 12.2 and 19c. There are some changes but nothing that drew my eye.
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2197603499

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |       |    14   (0)| 00:00:01 |
|   1 |  VIEW                |      |     2 |     6 |    12   (0)| 00:00:01 |
|   2 |   UNION-ALL          |      |       |       |            |          |
|*  3 |    FILTER            |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| T1   |     1 |    13 |     3   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| T2   |     1 |    13 |     3   (0)| 00:00:01 |
|*  6 |    FILTER            |      |       |       |            |          |
|   7 |     TABLE ACCESS FULL| T1   |     1 |    13 |     3   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL| T2   |     1 |    13 |     3   (0)| 00:00:01 |
|   9 |  FAST DUAL           |      |     1 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SET$1 / from$_subquery$_001@SEL$2
   2 - SET$1
   3 - SEL$3
   4 - SEL$3 / T11@SEL$3
   5 - SEL$4 / T21@SEL$4
   6 - SEL$5
   7 - SEL$5 / T12@SEL$5
   8 - SEL$6 / T22@SEL$6
   9 - SEL$1 / DUAL@SEL$1

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$6" "T22"@"SEL$6")
      FULL(@"SEL$4" "T21"@"SEL$4")
      PQ_FILTER(@"SEL$3" SERIAL)
      FULL(@"SEL$3" "T11"@"SEL$3")
      PQ_FILTER(@"SEL$5" SERIAL)
      FULL(@"SEL$5" "T12"@"SEL$5")
      NO_ACCESS(@"SEL$2" "from$_subquery$_001"@"SEL$2")
      OUTLINE_LEAF(@"SEL$1")
      OUTLINE_LEAF(@"SEL$2")
      OUTLINE_LEAF(@"SET$1")
      OUTLINE_LEAF(@"SEL$5")
      OUTLINE_LEAF(@"SEL$6")
      OUTLINE_LEAF(@"SEL$3")
      OUTLINE_LEAF(@"SEL$4")
      ALL_ROWS
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter( EXISTS (SELECT 0 FROM "T2" "T21" WHERE "T21"."ID"=:B1))
   5 - filter("T21"."ID"=:B1)
   6 - filter( NOT EXISTS (SELECT 0 FROM "T2" "T22" WHERE
              "T22"."ID"=:B1))
   8 - filter("T22"."ID"=:B1)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

68 rows selected.

The gather_plan_statistics hint was not really helpful as it is a nested cursor, so that it does not show the rowsource statistics from that level - the nested cursor runs separately and its rowsource statistics are not included in the top-level cursor ones:
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.01 |
|   1 |  VIEW                |      |      0 |      2 |      0 |00:00:00.01 |
|   2 |   UNION-ALL          |      |      0 |        |      0 |00:00:00.01 |
|*  3 |    FILTER            |      |      0 |        |      0 |00:00:00.01 |
|   4 |     TABLE ACCESS FULL| T1   |      0 |      1 |      0 |00:00:00.01 |
|*  5 |     TABLE ACCESS FULL| T2   |      0 |      1 |      0 |00:00:00.01 |
|*  6 |    FILTER            |      |      0 |        |      0 |00:00:00.01 |
|   7 |     TABLE ACCESS FULL| T1   |      0 |      1 |      0 |00:00:00.01 |
|*  8 |     TABLE ACCESS FULL| T2   |      0 |      1 |      0 |00:00:00.01 |
|   9 |  FAST DUAL           |      |      1 |      1 |      1 |00:00:00.01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter( IS NOT NULL)
   5 - filter("T21"."ID"=:B1)
   6 - filter( IS NULL)
   8 - filter("T22"."ID"=:B1)

I found the explanation of this issue in the SQL trace file:
=====================
PARSING IN CURSOR #139973723001152 len=579 dep=0 uid=78 oct=3 lid=78 tim=4622189819 hv=2185287422 ad='68d3c950' sql
id='0mkkpha141pry'
select cursor(
         select query_type
           from (select 'A' query_type
                   from t1 t11
                  where exists (
                          select null
                            from t2 t21
                           where t21.id = t11.id)
                  union all
                 select 'B' query_type
                   from t1 t12
                  where not exists (
                          select null
                            from t2 t22
                           where t22.id = t12.id)
                )
       ) data
  from dual
END OF STMT
PARSE #139973723001152:c=11048,e=12511,p=0,cr=45,cu=12,mis=1,r=0,dep=0,og=1,plh=2197603499,tim=4622189819
EXEC #139973723001152:c=13,e=13,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2197603499,tim=4622189873
=====================
PARSING IN CURSOR #139973720520408 len=272 dep=2 uid=78 oct=3 lid=78 tim=4622190302 hv=3362864010 ad='68d141b0' sqlid='faj0udr472fwa'
SELECT "A2"."QUERY_TYPE" "QUERY_TYPE" 
  FROM  ( 
          (SELECT 'A' "QUERY_TYPE" 
             FROM "T1" "A5"
            WHERE  EXISTS (
                     SELECT 0 
                       FROM "T2" "A6"
                      WHERE "A6"."ID"=:CV1$))
           UNION ALL
          (SELECT 'B' "QUERY_TYPE" 
             FROM "T1" "A4"
            WHERE  NOT EXISTS (
                     SELECT 0 
                       FROM "T2" "A7"
                      WHERE "A7"."ID"=:CV2$))) "A2"
END OF STMT
PARSE #139973720520408:c=375,e=375,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=1,plh=0,tim=4622190302
..skip..
BINDS #139973720520408:

 Bind#0
  oacdty=02 mxl=22(00) mxlc=00 mal=00 scl=00 pre=00
  oacflg=11 fl2=0001 frm=00 csi=00 siz=48 off=0
  kxsbbbfp=7f4e2be3b278  bln=22  avl=00  flg=05
 Bind#1
  oacdty=02 mxl=22(00) mxlc=00 mal=00 scl=00 pre=00
  oacflg=11 fl2=0001 frm=00 csi=00 siz=0 off=24
  kxsbbbfp=7f4e2be3b290  bln=22  avl=00  flg=01
EXEC #139973720520408:c=9732,e=9360,p=0,cr=37,cu=0,mis=1,r=0,dep=2,og=1,plh=489056501,tim=4622199711
FETCH #139973723001152:c=10218,e=9845,p=0,cr=37,cu=0,mis=0,r=1,dep=0,og=1,plh=2197603499,tim=4622199753
STAT #139973723001152 id=1 cnt=0 pid=0 pos=1 obj=0 op='VIEW  (cr=0 pr=0 pw=0 str=0 time=0 us cost=12 size=6 card=2)'
STAT #139973723001152 id=2 cnt=0 pid=1 pos=1 obj=0 op='UNION-ALL  (cr=0 pr=0 pw=0 str=0 time=0 us)'
STAT #139973723001152 id=3 cnt=0 pid=2 pos=1 obj=0 op='FILTER  (cr=0 pr=0 pw=0 str=0 time=0 us)'
STAT #139973723001152 id=4 cnt=0 pid=3 pos=1 obj=25013 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 str=0 time=0 us cost=3 size=13 card=1)'
STAT #139973723001152 id=5 cnt=0 pid=3 pos=2 obj=25014 op='TABLE ACCESS FULL T2 (cr=0 pr=0 pw=0 str=0 time=0 us cost=3 size=13 card=1)'
STAT #139973723001152 id=6 cnt=0 pid=2 pos=2 obj=0 op='FILTER  (cr=0 pr=0 pw=0 str=0 time=0 us)'
STAT #139973723001152 id=7 cnt=0 pid=6 pos=1 obj=25013 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 str=0 time=0 us cost=3 size=13 card=1)'
STAT #139973723001152 id=8 cnt=0 pid=6 pos=2 obj=25014 op='TABLE ACCESS FULL T2 (cr=0 pr=0 pw=0 str=0 time=0 us cost=3 size=13 card=1)'
STAT #139973723001152 id=9 cnt=1 pid=0 pos=2 obj=0 op='FAST DUAL  (cr=0 pr=0 pw=0 str=1 time=0 us cost=2 size=0 card=1)'
FETCH #139973720520408:c=178,e=177,p=0,cr=8,cu=0,mis=0,r=1,dep=0,og=1,plh=489056501,tim=4622200599
STAT #139973720520408 id=1 cnt=1 pid=0 pos=1 obj=0 op='VIEW  (cr=9 pr=0 pw=0 str=1 time=179 us cost=12 size=6 card=2)'
STAT #139973720520408 id=2 cnt=1 pid=1 pos=1 obj=0 op='UNION-ALL  (cr=9 pr=0 pw=0 str=1 time=177 us)'
STAT #139973720520408 id=3 cnt=0 pid=2 pos=1 obj=0 op='FILTER  (cr=1 pr=0 pw=0 str=1 time=8 us)'
STAT #139973720520408 id=4 cnt=0 pid=3 pos=1 obj=25013 op='TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 str=0 time=0 us cost=3 size=0 card=1)'
STAT #139973720520408 id=5 cnt=0 pid=3 pos=2 obj=25014 op='TABLE ACCESS FULL T2 (cr=1 pr=0 pw=0 str=1 time=6 us cost=3 size=13 card=1)'
STAT #139973720520408 id=6 cnt=1 pid=2 pos=2 obj=0 op='FILTER  (cr=8 pr=0 pw=0 str=1 time=164 us)'
STAT #139973720520408 id=7 cnt=1 pid=6 pos=1 obj=25013 op='TABLE ACCESS FULL T1 (cr=7 pr=0 pw=0 str=1 time=144 us cost=3 size=0 card=1)'
STAT #139973720520408 id=8 cnt=0 pid=6 pos=2 obj=25014 op='TABLE ACCESS FULL T2 (cr=1 pr=0 pw=0 str=1 time=9 us cost=3 size=13 card=1)'
FETCH #139973723001152:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=2197603499,tim=4622200901
CLOSE #139973720520408:c=3,e=2,dep=1,type=0,tim=4622201312
CLOSE #139973723001152:c=6,e=193,dep=0,type=0,tim=4622201500
That is where the most remarkable part happens. The subquery is modified in an awfully wrong way: the columns of the outer queries disappear from the inner queries, and they are replaced with some bind variables: CV1$, CV2$.
Those variables, which are of the NUMBER data type, are set to NULL.
Then, the 'A' branch of the UNION ALL, which uses EXISTS, returns nothing. The 'B' branch, which uses NOT EXISTS, returns the 'B' row since the NOT EXISTS subquery brings back an empty result set.

At this stage the root cause of the wrong results issue is known. Obviously, I have no clue why those variables were introduced but I know why that query returns that result.

I tried to tinker around with that query a little bit to see what can be done to make it work, and here is a short summary of my findings:
  • Correlated subqueries are essential. The query starts returning the correct results (one 'A' row) once I decorrelate the subqueries:
    SQL> select cursor(
      2           select query_type
      3             from (select 'A' query_type
      4                     from t1 t11
      5                    where id in (
      6                            select id
      7                              from t2 t21)
      8                    union all
      9                   select 'B' query_type
     10                     from t1 t12
     11                    where id not in (
     12                            select id
     13                              from t2 t22)
     14                  )
     15         ) data
     16    from dual;
    
    DATA
    --------------------
    CURSOR STATEMENT : 1
    
    CURSOR STATEMENT : 1
    
    Q
    -
    A
        
  • Nesting of the subqueries does matter. For instance, the query returns the correct results after I reduce the nesting level:
    SQL> select cursor(
      2           select 'A' query_type
      3             from t1 t11
      4            where exists (
      5                    select null
      6                      from t2 t21
      7                     where t21.id = t11.id)
      8            union all
      9           select 'B' query_type
     10             from t1 t12
     11            where not exists (
     12                    select null
     13                      from t2 t22
     14                     where t22.id = t12.id)
     15         ) data
     16    from dual;
    
    DATA
    --------------------
    CURSOR STATEMENT : 1
    
    CURSOR STATEMENT : 1
    
    Q
    -
    A
    
Based on how Oracle modifies the subquery under the hood, certain CURSOR expressions can return:
  • Less rows:
    SQL> doc
    DOC>##############################
    DOC>   Less rows
    DOC>##############################
    DOC>#
    SQL>
    SQL> select cursor(
      2           select query_type
      3             from (select 'A' query_type
      4                     from t1 t11
      5                    where exists (
      6                            select null
      7                              from t2 t21
      8                             where t21.id = t11.id)
      9                    union all
     10                   select 'B' query_type
     11                     from t1 t12
     12                    where exists (
     13                            select null
     14                              from t2 t22
     15                             where t22.id = t12.id)
     16                  )
     17         ) data
     18    from dual;
    
    DATA
    --------------------
    CURSOR STATEMENT : 1
    
    CURSOR STATEMENT : 1
    
    no rows selected
    
    
    SQL>
    SQL> doc
    DOC>   The cursor subquery
    DOC>#
    SQL> select query_type
      2    from (select 'A' query_type
      3            from t1 t11
      4           where exists (
      5                   select null
      6                     from t2 t21
      7                    where t21.id = t11.id)
      8           union all
      9          select 'B' query_type
     10            from t1 t12
     11           where exists (
     12                   select null
     13                     from t2 t22
     14                    where t22.id = t12.id)
     15         )
     16  ;
    
    Q
    -
    A
    B
        
  • More rows:
    SQL> doc
    DOC>##############################
    DOC>   More rows
    DOC>##############################
    DOC>#
    SQL>
    SQL> select cursor(
      2           select query_type
      3             from (select 'A' query_type
      4                     from t1 t11
      5                    union all
      6                   select 'B' query_type
      7                     from t1 t12
      8                    where not exists (
      9                            select null
     10                              from t2 t22
     11                             where t22.id = t12.id)
     12                  )
     13         ) data
     14    from dual;
    
    DATA
    --------------------
    CURSOR STATEMENT : 1
    
    CURSOR STATEMENT : 1
    
    Q
    -
    A
    B
    
    
    SQL>
    SQL> doc
    DOC>   The cursor subquery
    DOC>#
    SQL> select query_type
      2    from (select 'A' query_type
      3            from t1 t11
      4           union all
      5          select 'B' query_type
      6            from t1 t12
      7           where not exists (
      8                   select null
      9                     from t2 t22
     10                    where t22.id = t12.id)
     11         );
    
    Q
    -
    A
        
  • Different values as it was demostrated in the beginning of this post
The good news is that it is a known documented issue: Bug 30528947 - Nested Query Using CURSOR Expression Returns Wrong Result (Doc ID 30528947.8). There are patches available for some release updates.
The bad news is that the issue is not mentioned on the document I would expect it to be: Things to Consider to Avoid Prominent Wrong Result Problems on 19C Proactively (Doc ID 2606585.1). How is one supposed to avoid such issues? I always read those 'things to consider' before I devise any upgrade plans.
The scripts, a sample trace file, and the spool file for this post are available on: GitHub.

вторник, 9 июня 2020 г.

OR Expansion and Virtual Columns: Changes in 19c

This post is a continuation of my previous one: OR Expansion and Virtual Columns. The last post demonstrated that OR Expansion can somehow be affected by virtual columns in such a way that queries that did not perform OR Expansion can start using it as soon as a virtual column is created. Both an FBI and extended statistics lead to the same result provided that a certain column used in the query is included in them.
This time around, I am going to take a look at how 19c has changed that.

Firstly, I am going to recreate my initial table: or_expand_setup.sql:
SQL> @or_expand_setup
SQL> drop table t1;

Table dropped.

SQL>
SQL> create table t1 (
  2    part_key varchar2(8),
  3    status   varchar2(12),
  4    pad1     char(500)
  5  )
  6  partition by list(part_key) (
  7    partition values ('P1'),
  8    partition values (default)
  9  )
 10  ;

Table created.

SQL>
SQL>
SQL> insert into t1
  2  select 'P1', status, 'X'
  3    from (select 90 pct, 'PROCESSED' status from dual union all
  4          select 1, 'UNPROCESSED' from dual union all
  5          select 9, 'PENDING' from dual) params,
  6         lateral(
  7           select level
  8             from dual
  9             connect by level <= params.pct * 1000
 10         ) duplicator;

100000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> create index t1_status_i on t1(status) local;

Index created.

SQL>
SQL> select status,
  2         count(*),
  3         round(ratio_to_report(count(*)) over () * 100, 2) pct
  4    from t1
  5   group by status
  6   order by 1;

STATUS         COUNT(*)        PCT
------------ ---------- ----------
PENDING            9000          9
PROCESSED         90000         90
UNPROCESSED        1000          1

Secondly, I am running the or_expand_test_vcol.sql script which was previously performed OR Expansion in both 12.2 and 18c:
SQL> @or_expand_test_vcol
SQL> set def on lin 124 pages 100
SQL>
SQL> alter table t1 add lower_status varchar2(128) generated always as (lower(status)) virtual;

Table altered.

SQL>
SQL> exec dbms_stats.gather_table_stats( '', 't1', method_opt => 'for all columns size 254')



PL/SQL procedure successfully completed.

SQL>
SQL> col tfi old_v tfi nopri
SQL>
SQL> select to_char(sysdate, 'yyyymmdd_hh24miss') tfi from dual;




SQL>
SQL> alter session set tracefile_identifier='&tfi.';
old   1: alter session set tracefile_identifier='&tfi.'
new   1: alter session set tracefile_identifier='20200608_180223'

Session altered.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*]';

Session altered.

SQL>
SQL>
SQL> var part_key varchar2(10)='P1'
SQL> var param    varchar2(12)='WAITING'
SQL>
SQL> select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
  2         count(pad1)
  3    from t1
  4   where part_key = :part_key
  5     and (:param = 'WAITING' and status = 'UNPROCESSED'
  6          or
  7          :param = 'ALL' and status <> 'PENDING');

COUNT(PAD1)
-----------
       1000

SQL>
SQL> select *
  2    from dbms_xplan.display_cursor(format=> 'allstats last outline');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  fq8b3jumk131t, child number 0
-------------------------------------
select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
count(pad1)   from t1  where part_key = :part_key    and (:param =
'WAITING' and status = 'UNPROCESSED'         or         :param = 'ALL'
and status <> 'PENDING')

Plan hash value: 1293629841

-----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.03 |    7826 |
|   1 |  SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.03 |    7826 |
|   2 |   PARTITION LIST SINGLE|      |      1 |    920 |   1000 |00:00:00.03 |    7826 |
|*  3 |    TABLE ACCESS FULL   | T1   |      1 |    920 |   1000 |00:00:00.03 |    7826 |
-----------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter((((:PARAM='ALL' AND "STATUS"<>'PENDING') OR
              ("STATUS"='UNPROCESSED' AND :PARAM='WAITING')) AND "PART_KEY"=:PART_KEY))


38 rows selected.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*] off';

Session altered.

SQL>
SQL> col value for a80
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_7253_20200608_180223.trc
As the output above demonstrates, there is no OR Expansion anymore. The trace file orcl_ora_7253_20200608_180223.trc shows the following:
***********************************
Cost-Based OR Expansion
***********************************
ORE: Trying CBQT OR expansion before unnesting
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
OR Expansion on query block SEL$1 (#1)
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Using search type: linear
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Checking validity of disjunct chain 
ORE: Bypassed for disjunct chain: No Index or Partition driver found.
ORE: # conjunction chain - 1
ORE: No state generated.
It is worth noting that even a full outline does not lead to the desired plan. That is how I discovered that issue initially. The query in question did have an SQL Plan Baseline which stopped being reproducible following a 19c upgrade.

A straightforward way to get the desired execution plan is to rewrite the initial query as follows:
SQL> @or_expand_test_rewrite

..skip..

SQL> var part_key varchar2(10)='P1'
SQL> var param    varchar2(12)='WAITING'
SQL>
SQL> select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
  2         count(pad1)
  3    from t1
  4   where 1 = 1
  5     and (part_key = :part_key and :param = 'WAITING' and status = 'UNPROCESSED'
  6          or
  7          part_key = :part_key||'' and :param = 'ALL' and status <> 'PENDING');

COUNT(PAD1)
-----------
       1000

SQL>
SQL> select *
  2    from dbms_xplan.display_cursor(format=> 'allstats last outline');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  99zvump911bc7, child number 0
-------------------------------------
select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
count(pad1)   from t1  where 1 = 1    and (part_key = :part_key and
:param = 'WAITING' and status = 'UNPROCESSED'         or
part_key = :part_key||'' and :param = 'ALL' and status <> 'PENDING')

Plan hash value: 3973059565

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |                 |      1 |        |      1 |00:00:00.01 |      83 |
|   1 |  SORT AGGREGATE                                |                 |      1 |      1 |      1 |00:00:00.01 |      83 |
|   2 |   VIEW                                         | VW_ORE_BA8ECEFB |      1 |  91982 |   1000 |00:00:00.01 |      83 |
|   3 |    UNION-ALL                                   |                 |      1 |        |   1000 |00:00:00.01 |      83 |
|*  4 |     FILTER                                     |                 |      1 |        |   1000 |00:00:00.01 |      83 |
|   5 |      PARTITION LIST SINGLE                     |                 |      1 |    999 |   1000 |00:00:00.01 |      83 |
|*  6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1              |      1 |    999 |   1000 |00:00:00.01 |      83 |
|*  7 |        INDEX RANGE SCAN                        | T1_STATUS_I     |      1 |   1000 |   1000 |00:00:00.01 |       6 |
|*  8 |     FILTER                                     |                 |      1 |        |      0 |00:00:00.01 |       0 |
|   9 |      PARTITION LIST SINGLE                     |                 |      0 |  90983 |      0 |00:00:00.01 |       0 |
|* 10 |       TABLE ACCESS FULL                        | T1              |      0 |  90983 |      0 |00:00:00.01 |       0 |
----------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SET$2A13AF86_2")
      OUTLINE_LEAF(@"SET$2A13AF86_1")
      OUTLINE_LEAF(@"SET$2A13AF86")
      OUTLINE_LEAF(@"SEL$9162BF3C")
      OR_EXPAND(@"SEL$1" (1) (2))
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$9162BF3C" "VW_ORE_BA8ECEFB"@"SEL$BA8ECEFB")
      INDEX_RS_ASC(@"SET$2A13AF86_1" "T1"@"SET$2A13AF86_1" ("T1"."STATUS"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SET$2A13AF86_1" "T1"@"SET$2A13AF86_1")
      FULL(@"SET$2A13AF86_2" "T1"@"SET$2A13AF86_2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(:PARAM='WAITING')
   6 - filter("PART_KEY"=:PART_KEY)
   7 - access("STATUS"='UNPROCESSED')
   8 - filter(:PARAM='ALL')
  10 - filter(("STATUS"<>'PENDING' AND "PART_KEY"=:PART_KEY||'' AND (LNNVL("PART_KEY"=:PART_KEY) OR
              LNNVL(:PARAM='WAITING') OR LNNVL("STATUS"='UNPROCESSED'))))


57 rows selected.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*] off';

Session altered.

SQL>
SQL> col value for a80
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_7253_20200608_180825.trc
That is what is done in or_expand_test_rewrite.sql; the corresponding trace file is: orcl_ora_7253_20200608_180825.trc.
Evidently, this rewrite can be applied in a limited number of cases.
Thankfully, the old OR Expansion transformation, aka CONCATENATION, works just fine for this example which is demonstrated in the listing below: or_expand_test_use_concat.sql
SQL> @or_expand_test_use_concat

..skip..

SQL> var part_key varchar2(10)='P1'
SQL> var param    varchar2(12)='WAITING'
SQL>
SQL> select --+ gather_plan_statistics use_concat(or_predicates(2))
  2         count(pad1)
  3    from t1
  4   where part_key = :part_key
  5     and (:param = 'WAITING' and status = 'UNPROCESSED'
  6          or
  7          :param = 'ALL' and status <> 'PENDING');

COUNT(PAD1)
-----------
       1000

SQL>
SQL> select *
  2    from dbms_xplan.display_cursor(format=> 'allstats last outline');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  cygpq5hyy2uu2, child number 0
-------------------------------------
select --+ gather_plan_statistics use_concat(or_predicates(2))
count(pad1)   from t1  where part_key = :part_key    and (:param =
'WAITING' and status = 'UNPROCESSED'         or         :param = 'ALL'
and status <> 'PENDING')

Plan hash value: 496678280

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |             |      1 |        |      1 |00:00:00.01 |      83 |
|   1 |  SORT AGGREGATE                               |             |      1 |      1 |      1 |00:00:00.01 |      83 |
|   2 |   CONCATENATION                               |             |      1 |        |   1000 |00:00:00.01 |      83 |
|*  3 |    FILTER                                     |             |      1 |        |      0 |00:00:00.01 |       0 |
|   4 |     PARTITION LIST SINGLE                     |             |      0 |  90992 |      0 |00:00:00.01 |       0 |
|*  5 |      TABLE ACCESS FULL                        | T1          |      0 |  90992 |      0 |00:00:00.01 |       0 |
|*  6 |    FILTER                                     |             |      1 |        |   1000 |00:00:00.01 |      83 |
|   7 |     PARTITION LIST SINGLE                     |             |      1 |    990 |   1000 |00:00:00.01 |      83 |
|*  8 |      TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1          |      1 |    990 |   1000 |00:00:00.01 |      83 |
|*  9 |       INDEX RANGE SCAN                        | T1_STATUS_I |      1 |    990 |   1000 |00:00:00.01 |       6 |
-----------------------------------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      OUTLINE_LEAF(@"SEL$1_1")
      USE_CONCAT(@"SEL$1" OR_PREDICATES(2))
      OUTLINE_LEAF(@"SEL$1_2")
      FULL(@"SEL$1_1" "T1"@"SEL$1")
      INDEX_RS_ASC(@"SEL$1_2" "T1"@"SEL$1_2" ("T1"."STATUS"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SEL$1_2" "T1"@"SEL$1_2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(:PARAM='ALL')
   5 - filter(("STATUS"<>'PENDING' AND "PART_KEY"=:PART_KEY))
   6 - filter(:PARAM='WAITING')
   8 - filter("PART_KEY"=:PART_KEY)
   9 - access("STATUS"='UNPROCESSED')
       filter((LNNVL(:PARAM='ALL') OR LNNVL("STATUS"<>'PENDING')))


53 rows selected.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*] off';

Session altered.

SQL>
SQL> col value for a80
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_7253_20200608_181301.trc
The corresponding trace file is: orcl_ora_7253_20200608_181301.trc.
This blog post demonstrates a specific example in which OR Expansion fails to produce the desired execution plan, leading to a suboptimal execution.
By way of contrast, CONCATENATION results in the execution plan I was seeking, performing as effectively as OR Expansion did before 19c.
It is my hope that Oracle will fix this issue eventually, which I am trying to get done in SR 3-22449907431 : OR-Expansion does not work following 19.5 upgrade.

четверг, 21 мая 2020 г.

OR Expansion and Virtual Columns

The following test case can be used to demonstrate a little peculiarity of the OR Expansion transformation, that was introduced in 12.2.
It is important to note that the issue is not present in 19c - I have reproduced it only on 12.2.0.1.190416 and 18c.
The listings that I am using in this article are taken from 18c.

SQL> @or_expand_setup
SQL> drop table t1;

Table dropped.

SQL>
SQL> create table t1 (
  2    part_key varchar2(8),
  3    status   varchar2(12),
  4    pad1     char(500)
  5  )
  6  partition by list(part_key) (
  7    partition values ('P1'),
  8    partition values (default)
  9  )
 10  ;

Table created.

SQL>
SQL>
SQL> insert into t1
  2  select 'P1', status, 'X'
  3    from (select 90 pct, 'PROCESSED' status from dual union all
  4          select 1, 'UNPROCESSED' from dual union all
  5          select 9, 'PENDING' from dual) params,
  6         lateral(
  7           select level
  8             from dual
  9             connect by level <= params.pct * 1000
 10         ) duplicator;

100000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> create index t1_status_i on t1(status) local;

Index created.

SQL>
SQL> select status,
  2         count(*),
  3         round(ratio_to_report(count(*)) over () * 100, 2) pct
  4    from t1
  5   group by status
  6   order by 1;

STATUS         COUNT(*)        PCT
------------ ---------- ----------
PENDING            9000          9
PROCESSED         90000         90
UNPROCESSED        1000          1

In a nutshell, it is a list-partitioned table with a local index. 90% of the rows are in the PROCESSED status, whereas PENDING and UNPROCESSED statuses account only for 9% and 1% correspondingly.
The script itself: or_expand.sql

Next, I am going to run a simple query and check its plan with and without a virtual column on the STATUS column:
 select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
        count(pad1)
   from t1
  where part_key = :part_key
    and (:param = 'WAITING' and status = 'UNPROCESSED'
         or
         :param = 'ALL' and status <> 'PENDING');
The variables PARAM and PART_KEY are set as follows:
var part_key varchar2(10)='P1'
var param    varchar2(12)='WAITING'
Thus, I am going to query a single partition to get the rows in the UNPROCESSED status from it.

Here is the result of the script without the virtual column first:
SQL> @or_expand_test_no_vcol
SQL> set def on lin 124 pages 100
SQL>
SQL> -- alter table t1 add lower_status varchar2(128) generated always as (lower(status)) virtual;
SQL>
SQL> exec dbms_stats.gather_table_stats( '', 't1', method_opt => 'for all columns size 254')

PL/SQL procedure successfully completed.

SQL>
SQL> col tfi old_v tfi nopri
SQL>
SQL> select to_char(sysdate, 'yyyymmdd_hh24miss') tfi from dual;




SQL>
SQL> alter session set tracefile_identifier='&tfi.';
old   1: alter session set tracefile_identifier='&tfi.'
new   1: alter session set tracefile_identifier='20200520_181304'

Session altered.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*]';

Session altered.

SQL>
SQL>
SQL> var part_key varchar2(10)='P1'
SQL> var param    varchar2(12)='WAITING'
SQL>
SQL> select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
  2         count(pad1)
  3    from t1
  4   where part_key = :part_key
  5     and (:param = 'WAITING' and status = 'UNPROCESSED'
  6          or
  7          :param = 'ALL' and status <> 'PENDING');

COUNT(PAD1)
-----------
       1000

SQL>
SQL> select *
  2    from dbms_xplan.display_cursor(format=> 'allstats last outline');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  fq8b3jumk131t, child number 0
-------------------------------------
An uncaught error happened in prepare_sql_statement : ORA-01403: no data found

Plan hash value: 1293629841

-----------------------------------------------------------------------------------------
| Id  | Operation              | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |      1 |        |      1 |00:00:00.02 |    7831 |
|   1 |  SORT AGGREGATE        |      |      1 |      1 |      1 |00:00:00.02 |    7831 |
|   2 |   PARTITION LIST SINGLE|      |      1 |    920 |   1000 |00:00:00.01 |    7831 |
|*  3 |    TABLE ACCESS FULL   | T1   |      1 |    920 |   1000 |00:00:00.01 |    7831 |
-----------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter((((:PARAM='ALL' AND "STATUS"<>'PENDING') OR
              ("STATUS"='UNPROCESSED' AND :PARAM='WAITING')) AND "PART_KEY"=:PART_KEY))


35 rows selected.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*] off';

Session altered.

SQL>
SQL> col value for a80
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2137_20200520_181304.trc



I highlighted the commented ALTER TABLE ADD COLUMN command. I am going to refer to it below.

It is worth noting that the OR Expansion transformation has not been performed and the relevant trace file contains a clue:
***********************************
Cost-Based OR Expansion
***********************************
ORE: Trying CBQT OR expansion before unnesting
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
OR Expansion on query block SEL$1 (#1)
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Using search type: linear
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Checking validity of disjunct chain 
ORE: Bypassed for disjunct chain: No Index or Partition driver found.
ORE: # conjunction chain - 1
ORE: No state generated.
OFF: I came across that ORE: Bypassed for disjunct chain: No Index or Partition driver found. message in one of my previous posts: Concatenation and filter subqueries oddity.

What can go wrong if we now uncomment the aforementioned ALTER TABLE ADD COLUMN command?
The script or_expand_test_vcol.sql:
SQL> @or_expand_test_vcol
SQL> set def on lin 124 pages 100
SQL>
SQL> alter table t1 add lower_status varchar2(128) generated always as (lower(status)) virtual;

Table altered.

SQL>
SQL> exec dbms_stats.gather_table_stats( '', 't1', method_opt => 'for all columns size 254')

PL/SQL procedure successfully completed.

SQL>
SQL> col tfi old_v tfi nopri
SQL>
SQL> select to_char(sysdate, 'yyyymmdd_hh24miss') tfi from dual;




SQL>
SQL> alter session set tracefile_identifier='&tfi.';
old   1: alter session set tracefile_identifier='&tfi.'
new   1: alter session set tracefile_identifier='20200520_181319'

Session altered.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*]';

Session altered.

SQL>
SQL>
SQL> var part_key varchar2(10)='P1'
SQL> var param    varchar2(12)='WAITING'
SQL>
SQL> select --+ gather_plan_statistics or_expand(@sel$1 (1) (2))
  2         count(pad1)
  3    from t1
  4   where part_key = :part_key
  5     and (:param = 'WAITING' and status = 'UNPROCESSED'
  6          or
  7          :param = 'ALL' and status <> 'PENDING');

COUNT(PAD1)
-----------
       1000

SQL>
SQL> select *
  2    from dbms_xplan.display_cursor(format=> 'allstats last outline');

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  fq8b3jumk131t, child number 0
-------------------------------------
An uncaught error happened in prepare_sql_statement : ORA-01403: no data found

Plan hash value: 3973059565

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                      | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                               |                 |      1 |        |      1 |00:00:00.01 |      83 |
|   1 |  SORT AGGREGATE                                |                 |      1 |      1 |      1 |00:00:00.01 |      83 |
|   2 |   VIEW                                         | VW_ORE_BA8ECEFB |      1 |  91982 |   1000 |00:00:00.01 |      83 |
|   3 |    UNION-ALL                                   |                 |      1 |        |   1000 |00:00:00.01 |      83 |
|*  4 |     FILTER                                     |                 |      1 |        |   1000 |00:00:00.01 |      83 |
|   5 |      PARTITION LIST SINGLE                     |                 |      1 |    999 |   1000 |00:00:00.01 |      83 |
|*  6 |       TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1              |      1 |    999 |   1000 |00:00:00.01 |      83 |
|*  7 |        INDEX RANGE SCAN                        | T1_STATUS_I     |      1 |   1000 |   1000 |00:00:00.01 |       6 |
|*  8 |     FILTER                                     |                 |      1 |        |      0 |00:00:00.01 |       0 |
|   9 |      PARTITION LIST SINGLE                     |                 |      0 |  90983 |      0 |00:00:00.01 |       0 |
|* 10 |       TABLE ACCESS FULL                        | T1              |      0 |  90983 |      0 |00:00:00.01 |       0 |
----------------------------------------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      DB_VERSION('18.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SET$9162BF3C_2")
      OUTLINE_LEAF(@"SET$9162BF3C_1")
      OUTLINE_LEAF(@"SET$9162BF3C")
      OR_EXPAND(@"SEL$1" (1) (2))
      OUTLINE_LEAF(@"SEL$BA8ECEFB")
      OUTLINE(@"SEL$1")
      NO_ACCESS(@"SEL$BA8ECEFB" "VW_ORE_BA8ECEFB"@"SEL$BA8ECEFB")
      INDEX_RS_ASC(@"SET$9162BF3C_1" "T1"@"SET$9162BF3C_1" ("T1"."STATUS"))
      BATCH_TABLE_ACCESS_BY_ROWID(@"SET$9162BF3C_1" "T1"@"SET$9162BF3C_1")
      FULL(@"SET$9162BF3C_2" "T1"@"SET$9162BF3C_2")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(:PARAM='WAITING')
   6 - filter("PART_KEY"=:PART_KEY)
   7 - access("STATUS"='UNPROCESSED')
   8 - filter(:PARAM='ALL')
  10 - filter(("STATUS"<>'PENDING' AND "PART_KEY"=:PART_KEY AND (LNNVL(:PARAM='WAITING') OR
              LNNVL("STATUS"='UNPROCESSED'))))


54 rows selected.

SQL>
SQL> alter session set events 'trace[sql_optimizer.*] off';

Session altered.

SQL>
SQL> col value for a80
SQL>
SQL> select value
  2    from v$diag_info
  3   where name='Default Trace File';

VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_2137_20200520_181319.trc


The relevant section of the trace file is as follows:
***********************************
Cost-Based OR Expansion
***********************************
ORE: Trying CBQT OR expansion before unnesting
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
OR Expansion on query block SEL$1 (#1)
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Using search type: linear
ORE: Checking validity of OR Expansion for query block SEL$1 (#1)
ORE: Predicate chain before QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')

ORE: Predicate chain after QB validity check - SEL$1
"T1"."PART_KEY"=:B1 AND (:B2='WAITING' AND "T1"."STATUS"='UNPROCESSED' OR :B3='ALL' AND "T1"."STATUS"<>'PENDING')
ORE: Checking validity of disjunct chain 

ORE: # conjunction chain - 2
ORE: Checking validity of disjunct chain 

ORE:  Predicate list
P1 : "T1"."PART_KEY"=:B1
P2 : :B1='WAITING'
P3 : "T1"."STATUS"='UNPROCESSED'
P4 : :B1='ALL'
P5 : "T1"."STATUS"<>'PENDING'

 DNF Matrix (Before sorting OR branches)
            P1  P2  P3  P4  P5 
CNJ (#1) :   1   1   1   0   0 

CNJ (#2) :   1   0   0   1   1 
I have made the following observations regarding this issue:
  • The virtual column should be on the STATUS column. Neither PART_KEY, nor PAD1 works - they both result in the absence of OR Expansion.
  • The OR Expansion transformation kicks in if there are extended statistics on the STATUS column, such as: SYS_OP_COMBINED_HASH(STATUS, PART_KEY).
  • Unsurprisingly, there is OR Expansion when there is an FBI, such as:
    create index t1_lower_status_i on t1(lower(status)) local;
  • The full outline of the query with OR Expansion does not lead to the OR Expansion transformation in the examples from this article where there was no OR Expansion.
  • Once again, 19c behaves differently - there is no OR Expansion in the given examples when 12.2 and 18c performs it. I am going to talk about in the next blog post.

Although I do expect OR Expansion in the queries from this article, I am not able to make a connection between OR Expansion and virtual columns/extended statistics/FBIs in the provided examples.
Hence, this issue is a bit mixed bag for me: I would love to have OR Expansion - the examples were designed after real application queries after all (they benefit from OR Expansion) - it might as well be a bug that was fixed in 19c.
Time will tell. I have raised an SR to follow up on this issue: SR 3-22449907431 : OR-Expansion does not work following 19.5 upgrade. Hopefully, Oracle Support will explain it one day - we have not made any progress during last 2 months which has become customary while working with Oracle Support.

четверг, 30 апреля 2020 г.

Secure Use of Runtime.exec Functionality in Oracle Database

Developers wanted to call external programs from the database using a Java class. By default, those commands are run under the database user account which is anything but secure.
Following Oracle security guidelines, I created a separate low-privileged OS account and configured the database user access to it.
It turns out that this functionality has some limitations and does not work especially well with Grid Infrastructure.

Here are the details of my setup:
Oracle Grid Infrastructure 19.7 standalone that uses separate OS groups: $ORACLE_HOME/rdbms/lib/config.c:
#define SS_DBA_GRP "asmdba"
#define SS_OPER_GRP "asmoper"
#define SS_ASM_GRP "asmadmin"
#define SS_BKP_GRP "asmadmin"
#define SS_DGD_GRP "asmadmin"
#define SS_KMT_GRP "asmadmin"
#define SS_RAC_GRP "asmadmin"
Oracle Database 19.7 uses the following groups:
#define SS_DBA_GRP "dba"
#define SS_OPER_GRP "oper"
#define SS_ASM_GRP ""
#define SS_BKP_GRP "backupdba"
#define SS_DGD_GRP "dgdba"
#define SS_KMT_GRP "kmdba"
#define SS_RAC_GRP "racdba"
The oracle user is the software owner; its group is oinstall. The database listener runs from the GI ORACLE_HOME (it does not matter for the purpose of the test case, though, as the issue can be reproduced even with bequeath connections).

Here is an SQL script that I use to demonstrate the issue:
def user_connect_string=tc/tc@//localhost/pdb
def sys_connect_string="sys/oracle@//localhost/pdb as sysdba"

set echo on

spool ojvm_test_case

connect &sys_connect_string

doc
  CREATING THE TEST CASE USER
#

drop user tc cascade;
grant create procedure, create session to tc identified by tc;

exec dbms_java.grant_permission('TC', 'SYS:java.io.FilePermission', '/usr/bin/who', 'read,execute')

connect &user_connect_string

CREATE OR REPLACE AND COMPILE JAVA SOURCE NAMED "OSCommand" AS 
import java.io.*; 
public class OSCommand {
    public static String Run(String command) {
        final String crlf = "" + (char) 10 + (char) 13;
        final int max_length = 4000;
        String str_out = "";

        try {
            final Process pr = Runtime.getRuntime().exec(command);
            pr.waitFor();

            BufferedReader br_in = null;
            try {
                br_in = new BufferedReader(new InputStreamReader(pr.getInputStream()));
                String buff = null;
                while (((buff = br_in.readLine()) != null) & (str_out.length() < max_length)) {
                    System.out.println("Process out :" + buff);
                    str_out += buff + crlf;
                }
                br_in.close();
            } catch (IOException ioe) {
                System.out.println("Exception caught printing process output.");
                ioe.printStackTrace();
            } finally {
                try {
                    br_in.close();
                } catch (Exception ex) {}
            }

            BufferedReader br_err = null;
            try {
                br_err = new BufferedReader(new InputStreamReader(pr.getErrorStream()));
                String buff = null;
                while (((buff = br_err.readLine()) != null) & (str_out.length() < max_length)) {
                    str_out += buff + crlf;
                }
                br_err.close();
            } catch (IOException ioe) {
                System.out.println("Exception caught printing process error.");
                ioe.printStackTrace();
            } finally {
                try {
                    br_err.close();
                } catch (Exception ex) {}
            }

        } catch (Exception e) {
            e.printStackTrace();
            return (e.getMessage());
        }
        if (str_out.length() > 4000) str_out = str_out.substring(0, max_length); //limit output
        return str_out;
    }
}
/

CREATE OR REPLACE FUNCTION OSCommand_Run(Command IN STRING) 
RETURN VARCHAR2 IS 
LANGUAGE JAVA 
NAME 'OSCommand.Run(java.lang.String) return int';
/

doc
  Test #1: call OS command
#

select oscommand_run('/usr/bin/who') from dual; 


doc
  ENABLING OJVM SECURITY
#

conn &sys_connect_string

exec dbms_java.set_runtime_exec_credentials('TC', 'oraexec', 'test');
commit;


connect &user_connect_string

doc
  Test #2: call OS command after Java credentials were set
#

select oscommand_run('/usr/bin/who') from dual; 

doc
  Test #3: getting additional output
#
set serverout on
exec dbms_java.set_output(100000)
select oscommand_run('/usr/bin/who') from dual; 


spo off

In a nutshell, the script creates a separate user and grants it necessary privileges to run the external command /usr/bin/who using a Java class (that might be copied from somewhere over the Internet judging by how it handles exceptions).
When I call that command for the first time, I get the following output:
SQL> select oscommand_run('/usr/bin/who') from dual;

OSCOMMAND_RUN('/USR/BIN/WHO')                                                   
--------------------------------------------------------------------------------
oracle   pts/0        2020-04-29 09:06 (:pts/2:S.0)
ec2-user pts/2        2020-04-29 09:06 (172.17.1.27)
ec2-user pts/3        2020-04-29 09:06 (172.17.1.27)

There is an OS account oraexec with a password test that is created on the database server host, and I associate that account to run any external commands initiated by the TC database user (Java Developer's Guide):
exec dbms_java.set_runtime_exec_credentials('TC', 'oraexec', 'test');
The subsequent calls of the previously working function, now starts failing:
SQL> select oscommand_run('/usr/bin/who') from dual;

OSCOMMAND_RUN('/USR/BIN/WHO')                                                   
--------------------------------------------------------------------------------
Cannot run program "/usr/bin/who": cannot execute /u01/app/oracle/product/db_19.
7.0/bin/jssu                                                                    
OJVM can produce additional output, which in this case is not particularly helpful:
SQL> set serverout on
SQL> exec dbms_java.set_output(100000)

PL/SQL procedure successfully completed.

SQL> select oscommand_run('/usr/bin/who') from dual;

OSCOMMAND_RUN('/USR/BIN/WHO')                                                   
--------------------------------------------------------------------------------
Cannot run program "/usr/bin/who": cannot execute /u01/app/oracle/product/db_19.
7.0/bin/jssu                                                                    
                                                                                

java.io.IOException: Cannot run program "/usr/bin/who": cannot execute          
/u01/app/oracle/product/db_19.7.0/bin/jssu                                      
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1057)                    
 at java.lang.Runtime.exec(Runtime.java:620)                                    
 at java.lang.Runtime.exec(Runtime.java:450)                                    
 at java.lang.Runtime.exec(Runtime.java:347)                                    
 at OSCommand.Run(OSCommand:11)                                                 
Caused by: java.io.IOException: cannot execute                                  
/u01/app/oracle/product/db_19.7.0/bin/jssu                                      
 at java.lang.OracleProcess.create(Native Method)                               
 at java.lang.OracleProcess.construct(OracleProcess.java:112)                   
 at java.lang.OracleProcess.<init>(OracleProcess.java:42)                       
 at java.lang.OracleProcess.start(OracleProcess.java:383)                       
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1038)                    
 ... 4 more                                                                     
jssu is an Oracle's way to switch to another user per my knowledge. I know that the database uses it when it runs external jobs through DBMS_SCHEDULER. The jssu name itself was likely inspired by the Linux 'su' command (Switch User).
The first thing that I do is to check its permissions since I am quite familiar with certain issues when those permissions can go sideways due to patching actions:
[oracle@ip-172-17-31-160 lib]$ ls -la $ORACLE_HOME/bin/jssu
-rwsr-x---. 1 root oinstall 2327920 Apr 28 15:14 /u01/app/oracle/product/db_19.7.0/bin/jssu
The permissions are fine - its owned by root and has the setuid bit set.
In order to further diagnose this issue, I run strace against the server process and find this:
stat("/u01/app/oracle/product/db_19.7.0/bin/jssu", {st_mode=S_IFREG|S_ISUID|0750, st_size=2327920, ...}) = 0
geteuid()                         = 54321
getegid()                         = 54331
The uid 54321 is the oracle user, however, the gid 54331 is asmadmin (the OSASM group):
[oracle@ip-172-17-31-160 lib]$ grep 54321 /etc/passwd
oracle:x:54321:54321::/home/oracle:/bin/bash
[oracle@ip-172-17-31-160 lib]$ grep 54331 /etc/group
asmadmin:x:54331:oracle
When there is a role-separation in place, the Oracle binary is owned by the OSASM group:
[oracle@ip-172-17-31-160 lib]$ ls -ls /u01/app/oracle/product/db_19.7.0/bin/oracle
432852 -rwsr-s--x. 1 oracle asmadmin 443232296 Apr 28 15:15 /u01/app/oracle/product/db_19.7.0/bin/oracle
Oracle processes have the following groups:
# ps -o rgid,egid,cmd -p 27280
 RGID  EGID CMD
54321 54331 oracleorcl (LOCAL=NO)
Where RGID=oinstall, EGID=asmadmin (OSASM):
# grep 543[23]1 /etc/group
oinstall:x:54321:oracle
asmadmin:x:54331:oracle
The OJVM code expects that jssu is owned by the OSASM group in this case, so that a possible workaround for this issue is to amend the jssu permissions as follows:
[oracle@ip-172-17-31-160 lib]$ ls -ls /u01/app/oracle/product/db_19.7.0/bin/jssu
2276 -rwsr-x---. 1 root asmadmin 2327920 Apr 28 15:14 /u01/app/oracle/product/db_19.7.0/bin/jssu
Obviously it cannot be used as a long-term solution as I recall that jssu permissions might be changed when relink is run, or when some patches are applied.
There permissions also work fine with external jobs that use credentials.
I opened an SR with Oracle Support a few months ago and it has not gone anywhere yet.

вторник, 3 марта 2020 г.

Using LogMiner to Identify Redo Tag for Transactions

I was asked if there is a way to identify what redo tag was set for a specific transaction.

There is a document that answers the question if any tag was set without clarifying what tag was set specifically:
Using Logminer to Identify if a Non-NULL Tag Has Been Set Using DBMS_STREAMS.SET_TAG (Doc ID 740574.1)
Besides, LogDump can be used to display the tag information when GoldenGate is in use: GGSTOKEN.

My experiments revealed that the required information can be obtained using X$LOGMNR_CONTENTS.DUMP_INFO as the example below demonstrates.
SYS@PDB> grant create session, create table, select_catalog_role to tc identified by tc;

Grant succeeded.

SYS@PDB>
SYS@PDB> grant execute on dbms_streams_adm to tc;

Grant succeeded.

SYS@PDB>
SYS@PDB> alter user tc quota 100m on users;

User altered.

SYS@PDB>
SYS@PDB> conn tc/tc@pdb
Connected.
TC@PDB>
TC@PDB>
TC@PDB> select current_scn from v$database;

CURRENT_SCN
-----------
    2221571

TC@PDB>
TC@PDB> create table t(id int, s varchar2(30))
  2    segment creation immediate;

Table created.

TC@PDB>
TC@PDB> exec dbms_streams_adm.set_tag(hextoraw('112233'))

PL/SQL procedure successfully completed.

TC@PDB>
TC@PDB> insert into t values (1, 'tag_1');

1 row created.

TC@PDB> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
3.29.732

TC@PDB> commit;

Commit complete.

TC@PDB>
TC@PDB> exec dbms_streams_adm.set_tag(hextoraw('deadbeef'))

PL/SQL procedure successfully completed.

TC@PDB>
TC@PDB> insert into t values (11, 'tag_1');

1 row created.

TC@PDB> insert into t values (12, 'tag_1');

1 row created.

TC@PDB> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
8.26.689

TC@PDB> commit;

Commit complete.

TC@PDB>
TC@PDB> select current_scn from v$database;

CURRENT_SCN
-----------
    2221592

Here is summary information for the transactions above:

TRANSACTIONTAG
3.29.732112233
8.26.689DEADBEEF


The following query shows only the redo opcode 5.20 (Transaction Audit) rows for these transactions:
SQL> col log_file old_v log_file
SQL>
SQL> select member log_file 
  2    from v$log l, 
  3         v$logfile f
  4   where l.status = 'CURRENT' 
  5     and f.group# = l.group#;

LOG_FILE
--------------------------------------------------------------------------------
/opt/oracle/oradata/ORCLCDB/redo02.log

SQL>
SQL> exec dbms_logmnr.add_logfile('&log_file.')

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_logmnr.start_logmnr(options => dbms_logmnr.dict_from_online_catalog)

PL/SQL procedure successfully completed.

SQL> select xidusn, xidslt, xidsqn, 
  2        dump_info 
  3   from x$logmnr_contents
  4  where (xidusn, xidslt, xidsqn) in ((3,29,732),(8,26,689)) 
  5    and scn between 2221571 and 2221592 
  6    and component_id = 5
  7    and opcode = 20;

    XIDUSN     XIDSLT     XIDSQN DUMP_INFO                                                                                           
---------- ---------- ---------- ----------------------------------------------------------------------------------------------------
         3         29        732 CHANGE #3 MEDIA RECOVERY MARKER CON_ID:3 SCN:0x0000000000000000 SEQ:0 OP:5.20 ENC:0 FLG:0x0000      
                                 session number   = 264                                                                              
                                 serial  number   = 29041                                                                            
                                 transaction name =                                                                                  
                                 version 318767104                                                                                   
                                 audit sessionid 24                                                                                  
                                 Client Id =                                                                                         
                                 login   username = TC                                                                               
                                 REPL MARKER:                                                                                        
                                  06 00 09 00 00 00 01 00 03 11 22 33                                                                


    XIDUSN     XIDSLT     XIDSQN DUMP_INFO                                                                                           
---------- ---------- ---------- ----------------------------------------------------------------------------------------------------
         8         26        689 CHANGE #3 MEDIA RECOVERY MARKER CON_ID:3 SCN:0x0000000000000000 SEQ:0 OP:5.20 ENC:0 FLG:0x0000      
                                 session number   = 264                                                                              
                                 serial  number   = 29041                                                                            
                                 transaction name =                                                                                  
                                 version 318767104                                                                                   
                                 audit sessionid 24                                                                                  
                                 Client Id =                                                                                         
                                 login   username = TC                                                                               
                                 REPL MARKER:                                                                                        
                                  06 00 0a 00 00 00 01 00 04 de ad be ef     
It appears that the 9th byte of a REPL MARKER value stores the length of the tag. Therefore, it is possible to extract a tag using the following query:
SQL> select xidusn, xidslt, xidsqn, repl_marker, 
  2         utl_raw.substr(repl_marker, 10, to_number(utl_raw.substr(repl_marker, 9, 1), 'xxxxxxxxxx')) tag,
  3         dump_info
  4    from (
  5          select xidusn, xidslt, xidsqn, 
  6                 hextoraw(replace(regexp_substr(dump_info, 'REPL MARKER:.(.*).', 1, 1, 'n', 1), ' ')) repl_marker, 
  7                 dump_info 
  8            from x$logmnr_contents
  9           where (xidusn, xidslt, xidsqn) in ((3,29,732),(8,26,689)) 
 10             and scn between 2221571 and 2221592 
 11             and component_id = 5
 12             and opcode = 20);

    XIDUSN     XIDSLT     XIDSQN REPL_MARKER                TAG        DUMP_INFO                                                                                           
---------- ---------- ---------- -------------------------- ---------- ----------------------------------------------------------------------------------------------------
         3         29        732 060009000000010003112233   112233     CHANGE #3 MEDIA RECOVERY MARKER CON_ID:3 SCN:0x0000000000000000 SEQ:0 OP:5.20 ENC:0 FLG:0x0000      
                                                                       session number   = 264                                                                              
                                                                       serial  number   = 29041                                                                            
                                                                       transaction name =                                                                                  
                                                                       version 318767104                                                                                   
                                                                       audit sessionid 24                                                                                  
                                                                       Client Id =                                                                                         
                                                                       login   username = TC                                                                               
                                                                       REPL MARKER:                                                                                        
                                                                        06 00 09 00 00 00 01 00 03 11 22 33                                                                


    XIDUSN     XIDSLT     XIDSQN REPL_MARKER                TAG        DUMP_INFO                                                                                           
---------- ---------- ---------- -------------------------- ---------- ----------------------------------------------------------------------------------------------------
         8         26        689 06000A000000010004DEADBEEF DEADBEEF   CHANGE #3 MEDIA RECOVERY MARKER CON_ID:3 SCN:0x0000000000000000 SEQ:0 OP:5.20 ENC:0 FLG:0x0000      
                                                                       session number   = 264                                                                              
                                                                       serial  number   = 29041                                                                            
                                                                       transaction name =                                                                                  
                                                                       version 318767104                                                                                   
                                                                       audit sessionid 24                                                                                  
                                                                       Client Id =                                                                                         
                                                                       login   username = TC                                                                               
                                                                       REPL MARKER:                                                                                        
                                                                        06 00 0a 00 00 00 01 00 04 de ad be ef  
It is quite handy that X$LOGMNR_CONTENTS.DUMP_INFO contains the same or, at least, the subset of the information present in the redo log dump, that allows to use it for sophisticated analysis in SQL*Plus without resorting to low-level dumps or the LogDump utility of Oracle GoldenGate.
I was using a vanilla 19.3 database on Linux for these experiments.

суббота, 1 февраля 2020 г.

Tracing SQL Macro Expansion

I decided to take a look at what events Oracle provides to diagnose or troubleshoot SQL Macro Expansions.
For that, I took almost the same code that I used before: SQL Macro script
drop table t;
drop table t1;
drop table t2;

CREATE OR REPLACE FUNCTION sample(t DBMS_TF.Table_t, how_many number DEFAULT 5)
RETURN VARCHAR2 SQL_MACRO
AS 
BEGIN 
  RETURN q'[SELECT * 
              FROM t
      WHERE rownum <= how_many]';
END sample;
/

create table t
as
select n
  from xmltable('1 to 10'
  columns
    n int path '.');

create table t1
as
select n,
       'val_'||to_char(n, 'fm00') val
  from xmltable('1 to 10'
  columns
    n int path '.');

create table t2
as
select 'val_'||to_char(n, 'fm00') s
  from xmltable('1 to 10'
  columns
    n int path '.');

select * from t1;
select * from t2;

select *
  from sample(t1, 3);

select *
  from sample(t1);

select *
  from sample(t2);
A query failed when I was trying to supply the table T - which is the same that is used in the SQL macro's definition:
SQL> select *
  2    from sample(t);
  from sample(t)
               *
ERROR at line 2:
ORA-32039: recursive WITH clause must have column alias list
The errorstack dump for ORA-32039 (level=3) did not provide much information about the underlying cause:
----- Error Stack Dump -----
<error barrier> at 0x7ffd69f7aaf0 placed dbkda.c@296
ORA-32039: recursive WITH clause must have column alias list
----- Current SQL Statement for this session (sql_id=d3ktrfsd4s0sr) -----
select * from sample(t)
SQL Trace was referring to the same statement:
PARSE ERROR #140320112869816:len=24 dep=0 uid=119 oct=3 lid=119 tim=1054524149653 err=32039
select * from sample(t)
It showed the following statement, though:
PARSING IN CURSOR #140320116348624 len=446 dep=1 uid=119 oct=47 lid=119 tim=1054472624364 hv=3547957408 ad='7cda6688' sqlid='9awjhv79rm250'

declare
t0 DBMS_TF.Table_t := DBMS_TF.Table_t(column => :0);

begin
:macro_text := "SAMPLE"(t0);
end;
While studying those qksptfSQM functions, namely: qksptfSQM_GetTxt and qksptfSQM_Template, I found that they are calling to the PTF_Comp UTS component (Polymorphic Table Functions Compilation (qksptf)):
   0x000000000f75c87a <+4202>:  mov    $0x6000,%ecx
   0x000000000f75c87f <+4207>:  mov    $0x205018b,%esi
   0x000000000f75c884 <+4212>:  mov    $0x2,%edx
   0x000000000f75c889 <+4217>:  mov    -0x200(%rbp),%r8
   0x000000000f75c890 <+4224>:  mov    -0xcb58(%rbx),%rdi
   0x000000000f75c897 <+4231>:  callq  0x485cef0 <dbgtCtrl_intEvalCtrlEvent>
Therefore, I enabled PTF_Comp tracing:
alter session set events 'trace[ptf_comp]' tracefile_identifier=ptf_comp;
Here is what I got in the trace file:
qksptfSQM_GetTxt(): Anonymous Block
===================================

declare
t0 DBMS_TF.Table_t := DBMS_TF.Table_t(column => :0);

begin
:macro_text := "SAMPLE"(t0);
end;
qksptfSQM_GetTxt(): Macro Text
==============================

SELECT *
              FROM t
             WHERE rownum <= how_many
qksptfSQM_Template(): Template Text
===================================

with  T as (select  /*+ INLINE */ * from "MY_USER"."T")
select "SYS__$".*
from (select 5 "HOW_MANY" from SYS.DUAL) "SAMPLE",
     lateral(SELECT *
              FROM t
             WHERE rownum <= how_many) "SYS__$"
Obviously, the ORA-32039 error is absolutely legit in this case and the PTF_Comp tracing clearly shows where the issue is. It highlights one of the current limitations of that functionality in Oracle 19.6. Beside it demonstrates how literals (HOW_MANY in the example above) are passed through a subquery and joined to a lateral view - that approach has certain restrictions. Apparently there seem to be more restrictions than that according to $ORACLE_HOME/plsql/mesg/pcmus.msg:
776, 0, "SQL macro can only return character return types"
// INDEX: "SQL macro"
// RELEASE: 20.1
// CAUSE: A SQL macro function returned a type that was not a character type like VARCHAR2.
// ACTION:  Change the return type to one of the character types.
777, 0, "scalar SQL macro cannot have argument of type DBMS_TF.TABLE_T"
// INDEX: "SQL macro"
// RELEASE: 20.1
// CAUSE: DBMS_TF.TABLE_T argument was used for scalar macros.
// ACTION:  Change the argument type to a type other than
//          DBMS_TF.TABLE_T.
778, 0, "PARTITION BY, ORDER BY, CLUSTER BY, or PARALLEL_ENABLE are not allowed for SQL macros"
// MANUAL: partition by, order by, cluster by, deterministic, authid or parallel_enable
//         not allowed for SQL macros
// INDEX: "SQL macro"
// RELEASE: 20.1
// CAUSE: An attempt was made to use the PARTITION BY, ORDER BY, CLUSTER BY,
//        DETERMINISTIC, AUTHID, or PARALLEL_ENABLE clause in a
//        SQL macro.
// ACTION:  Do not specify the PARTITION BY, CLUSTER BY, ORDER BY, DETERMINISTIC, AUTHID, or
//          PARALLEL_ENABLE clause with a SQL macro.
779, 0, "Formal parameters of type TABLE cannot have a default value."
// MANUAL: A default value cannot be specified for parameter of type TABLE.
// INDEX:  "SQL macros"
// RELEASE: 20.1
// CAUSE: A default value was specified for a formal parameter of type TABLE
//        in a SQL macro declaration.
// ACTION: Remove the default value from the parameter of type TABLE in
//         the SQL macro declaration.
//
780, 0, "Formal parameters of type COLUMNS may only have a NULL default value."
// MANUAL:  Parameters of type COLUMNS can only have a NULL default value.
// INDEX:  "SQL macros"
// RELEASE: 20.1
// CAUSE:  A non-NULL default value was specified for a formal parameter of type COLUMNS
//        in a polymorphic table function.
// ACTION: Change the default value for the parameter of type COLUMNS to
//         have a NULL default value in the polymorphic table function
//         specification.
//
//
781, 0, "SQL macro cannot be a type method"
// INDEX:  "SQL macros"
// RELEASE: 20.1
// CAUSE: The SQL macro could not be declared as a method in a type specification.
// ACTION: Use the SQL macro as a package function or a top level function.
//
782, 0, "A package containing SQL macro cannot be an invoker's rights package."
// INDEX:  "SQL macros"
// RELEASE: 20.1
// CAUSE: A package could not be declared as AUTHID CURRENT_USER if it had a SQL
//        macro function declared inside.
// ACTION: Declare the package as AUTHID DEFINER if it contains one or more
//         SQL macro functions.
Keep in mind that this functionality is not officially released and only a limited subset of it is available in 19.6: Bug 30324180 SQL Macro should be backported to 19.5. Hence, the final version of SQL Macro in Oracle 20 might be different from what we have now in Oracle 19.6.