tag:blogger.com,1999:blog-30582405496117020672024-02-19T08:23:51.823+06:00Oracle Notes by MikhailMikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.comBlogger96125tag:blogger.com,1999:blog-3058240549611702067.post-5387211798507206612023-04-15T17:21:00.000+06:002023-04-15T17:21:00.620+06:00X$MESSAGES hidden columns<p>Sometimes X$ tables might have seemingly unused gaps in a row. Here is an example with X$MESSAGES:</p>
<pre class='brush:sql'>
SQL> select kqftarsz
2 from x$kqfta
3 where kqftanam = 'X$MESSAGES'
4 /
KQFTARSZ
----------
32
SQL> select c.kqfconam column_name,
2 c.kqfcodty datatype,
3 c.kqfcosiz size_byte,
4 c.kqfcooff offset
5 from x$kqfta t,
6 x$kqfco c
7 where t.kqftanam = 'X$MESSAGES'
8 and c.kqfcotab = t.indx
9 order by c.indx
10 /
COLUMN_NAME DATATYPE SIZE_BYTE OFFSET
----------- ---------- ---------- ----------
ADDR 23 8 0
INDX 2 4 0
INST_ID 2 4 0
CON_ID 2 2 0
DESCRIPTION 1 64 8
DEST 1 64 16
6 rows selected.
SQL> desc x$messages
Name Null? Type
----------------------------------------- -------- ----------------------------
ADDR RAW(8)
INDX NUMBER
INST_ID NUMBER
CON_ID NUMBER
DESCRIPTION VARCHAR2(64)
DEST VARCHAR2(64)
</pre>
<p>A careful look at the output above reveals several discrepancies:</p>
<ul>
<li>How come the row size is 32 bytes whereas there are only two non-generated columns of 8 bytes each (<b>DESCRIPTION</b> and <b>DEST</b>, which are pointers)?</li>
<li>If the row size is 32 bytes, and <b>DESCRIPTION</b> and <b>DEST</b> start at offsets 8 and 16 respectively, what data is in bytes 0-7 and 24-31?</li>
</ul>
<p>Since <b>X$MESSAGES</b> takes its data from <b>ksbsdt</b> (see <a href="/2023/04/x-tables-starting-address.html">X$ tables: starting address</a>), the structure itself can be inspected to help connect the dots.</p>
<pre class='brush:sql'>
[oracle@rac1 bin]$ readelf -s oracle | grep -w ksbsdt
203009: 00000000152e5760 16992 OBJECT GLOBAL DEFAULT 17 ksbsdt
218757: 00000000152e5760 16992 OBJECT GLOBAL DEFAULT 17 ksbsdt
[oracle@rac1 bin]$ objdump -s --start-address=0x00000000152e5760 --stop-address=$((0x00000000152e5760+16992)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
152e5760 203eee00 00000000 48f12e15 00000000 >......H.......
152e5770 f818be13 00000000 11000000 00000000 ................
152e5780 d04aee00 00000000 60f12e15 00000000 .J......`.......
152e5790 f818be13 00000000 01000000 00000000 ................
152e57a0 f0b11307 00000000 78f12e15 00000000 ........x.......
152e57b0 f818be13 00000000 00000000 00000000 ................
152e57c0 80043401 00000000 90f12e15 00000000 ..4.............
152e57d0 f818be13 00000000 05000000 00000000 ................
152e57e0 f0162707 00000000 acf12e15 00000000 ..'.............
152e57f0 c0b28f15 00000000 00000000 00000000 ................
...
</pre>
<p>It can be seen that there is a symbol at byte 0 and some number at byte 24; bytes 8 and 16 hold the expected pointers:</p>
<pre class='brush:sql'>
[oracle@rac1 bin]$ readelf -s oracle | grep 00ee3e20
30104: 0000000000ee3e20 1344 FUNC GLOBAL DEFAULT 13 ksl_pdb_event_stats_exten
322705: 0000000000ee3e20 1344 FUNC GLOBAL DEFAULT 13 ksl_pdb_event_stats_exten
[oracle@rac1 bin]$ objdump -s --start-address=0x152ef148 --stop-address=$((0x152ef148+32)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
152ef148 70646220 6576656e 74207374 61747320 pdb event stats
152ef158 61637469 6f6e0000 57616974 204f7574 action..Wait Out
[oracle@rac1 bin]$ objdump -s --start-address=0x13be18f8 --stop-address=$((0x13be18f8+32)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
13be18f8 47454e30 00000000 4c4d4643 00000000 GEN0....LMFC....
13be1908 44425730 00000000 4c434b31 00000000 DBW0....LCK1....
...
</pre>
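<p>The 32-byte layout can be confirmed by unpacking the first <b>ksbsdt</b> entry from the dump above with a few lines of Python. This is a sketch: the field meanings are inferred from the dumps, not taken from any Oracle header.</p>

```python
import struct

# First 32-byte ksbsdt entry as shown by objdump at 0x152e5760
# (little-endian byte order)
entry = bytes.fromhex(
    "203eee0000000000"   # qword 0: function pointer
    "48f12e1500000000"   # qword 1: description string pointer
    "f818be1300000000"   # qword 2: destination process pointer
    "1100000000000000"   # qword 3: the unnamed number (bytes 24-31)
)
func, desc, dest, num = struct.unpack("<4Q", entry)
print(hex(func), hex(desc), hex(dest), num)
# prints 0xee3e20 0x152ef148 0x13be18f8 17
```

Here 0xee3e20 resolves to <b>ksl_pdb_event_stats_extend</b>, 0x152ef148 holds "pdb event stats action", and 0x13be18f8 holds "GEN0", matching the readelf/objdump output above.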
<p>Thus, the missing columns can be obtained using another tool that I wrote for such cases - <b>bide</b> (binary data extractor):</p>
<pre class='brush:sql'>
[oracle@rac1 src]$ bide dump-table ksbsdt --format func:symbol description:string dest:string L
+----------------------------------------+-----------------------------------------------------------------+------------------------------------+-----------+
| func | description | dest | _unnamed3 |
+----------------------------------------+-----------------------------------------------------------------+------------------------------------+-----------+
| ksl_pdb_event_stats_extend | pdb event stats action | GEN0 | 17 |
| kslwo_compute_sys_thresholds_bg_action | Wait Outlier thresholds | GEN0 | 1 |
| kslwo_process_sys_wait_bg_action | Process wait outlier | GEN0 | 0 |
| ksuguid_mac_update_act | KSU GUID MAC Address update | GEN0 | 5 |
| ksu_inc_sos_param | session param update action | GEN1 | 0 |
| ksu_inc_sos | session update action | GEN2 | 1 |
| kstmUpdateTranslationCbk | Update KGSTM Translation | MMON | 1 |
| ksustaen_init_act | event nfy timeout action | GEN0 | 8 |
| ksbmsg | KSB action for X-instance calls | * | 2 |
| ksbckbast | KSB action for bast checking | * | 1 |
| ksbsht | generic shutdown background | * | 0 |
...
</pre>
<p>Comparing this output with the data seen in the database, it can be concluded that each "message" has not only a description and a process presumably handling it, but also an underlying function:</p>
<pre class='brush:sql'>
SQL> select * from x$messages;
ADDR INDX INST_ID CON_ID DESCRIPTION DEST
---------------- ---------- ---------- ---------- ---------------------------------------------------------------- -----
00000000152E5760 0 1 0 pdb event stats action GEN0
00000000152E5780 1 1 0 Wait Outlier thresholds GEN0
00000000152E57A0 2 1 0 Process wait outlier GEN0
00000000152E57C0 3 1 0 KSU GUID MAC Address update GEN0
00000000152E57E0 4 1 0 session param update action GEN1
00000000152E5800 5 1 0 session update action GEN2
00000000152E5820 6 1 0 Update KGSTM Translation MMON
00000000152E5840 7 1 0 event nfy timeout action GEN0
00000000152E5860 8 1 0 KSB action for X-instance calls *
00000000152E5880 9 1 0 KSB action for bast checking *
00000000152E58A0 10 1 0 generic shutdown background *
...
</pre>
<p>This allows one to execute those actions by calling the respective function. It also explains what is stored in bytes 0-7 and 24-31 of X$MESSAGES rows.</p>

X$ tables: starting address (2023-04-15)

<p>The starting address of X$ tables makes sense for tables residing in SGA or defined within the Oracle binary. It is usually found by querying an X$ table. This blog post describes an alternative method of obtaining the starting address by calling an Oracle C function.</p>
<p>The Oracle function that can be used here is <b>qerfxArrayMaxSize</b>. Here is a short GDB script demonstrating its usage:</p>
<pre class='brush:sql'>
set pagination off
set trace-commands on
set $ctx = (int *)malloc(4)
def x_desc
printf "result = 0x%x\n", (int)qerfxArrayMaxSize($arg0, (int)strlen($arg0), (int *)$ctx)
printf "start_addr = 0x%x\n", *(long *)($ctx+18)
info symbol *(int *)($ctx+18)
p/a (long[10])(**(long *)($ctx+18))
printf "row_count = %d\n", *(int *)($ctx+20)
end
x_desc "X$KSUPR"
x_desc "X$KSUSE"
x_desc "X$MESSAGES"
</pre>
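<p>Note that <b>$ctx</b> is declared as <b>int *</b>, so <b>$ctx+18</b> and <b>$ctx+20</b> in the script are byte offsets 72 and 80 into the context buffer. A small Python sketch of the same pointer arithmetic (the buffer layout itself is an assumption inferred from the script, not documented behaviour):</p>

```python
import ctypes

# $ctx is an int*, so "+18" advances by 18 * sizeof(int) bytes
INT_SIZE = ctypes.sizeof(ctypes.c_int)
start_addr_off = 18 * INT_SIZE   # where the script reads the start address
row_count_off = 20 * INT_SIZE    # where the script reads the row count
print(start_addr_off, row_count_off)  # prints 72 80
```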
<p>The three tables are going to be examined in separate sections below: X$KSUPR, X$KSUSE, X$MESSAGES.</p>
<h3>X$KSUPR</h3>
<p>The GDB output:</p>
<pre class='brush:sql;highlight:[3,5,9]'>
+x_desc "X$KSUPR"
++printf "result = 0x%x\n", (int)qerfxArrayMaxSize("X$KSUPR", (int)strlen("X$KSUPR"), (int *)$ctx)
result = 0x9a537260
++printf "start_addr = 0x%x\n", *(long *)($ctx+18)
start_addr = 0x9a537260
++info symbol *(int *)($ctx+18)
No symbol matches *(int *)($ctx+18).
++p/a (long[10])(**(long *)($ctx+18))
$1 = {0x9a588070, 0x9a5895e0, 0x9a58ab50, 0x9a58c0c0, 0x9a58d630, 0x9a58eba0, 0x9a590110, 0x9a591680, 0x9a592bf0, 0x9a594160}
++printf "row_count = %d\n", *(int *)($ctx+20)
row_count = 600
</pre>
<p>The SQL*Plus output:</p>
<pre class='brush:sql'>
SQL> select addr from x$ksupr where rownum<=10;
ADDR
----------------
000000009A588070
000000009A5895E0
000000009A58AB50
000000009A58C0C0
000000009A58D630
000000009A58EBA0
000000009A590110
000000009A591680
000000009A592BF0
000000009A594160
10 rows selected.
</pre>
<p>Thus, the result of the function is a pointer to an array storing pointers to <b>X$KSUPR</b> rows (<b>0x9a588070</b>, <b>0x9a5895e0</b>, <b>0x9a58ab50</b>, etc.). The function also conveniently returns the number of rows of the relevant fixed array:</p>
<pre class='brush:sql'>
++printf "row_count = %d\n", *(int *)($ctx+20)
row_count = 600
SQL> select count(*) from x$ksupr;
COUNT(*)
----------
600
SQL> select value from v$parameter where name='processes';
VALUE
--------------------------------------------------------------------------------
600
</pre>
<h3>X$KSUSE</h3>
<p>The GDB output:</p>
<pre class='brush:sql;highlight:[3,5,9]'>
+x_desc "X$KSUSE"
++printf "result = 0x%x\n", (int)qerfxArrayMaxSize("X$KSUSE", (int)strlen("X$KSUSE"), (int *)$ctx)
result = 0x39c
++printf "start_addr = 0x%x\n", *(long *)($ctx+18)
start_addr = 0x9ae5a808
++info symbol *(int *)($ctx+18)
No symbol matches *(int *)($ctx+18).
++p/a (long[10])(**(long *)($ctx+18))
$2 = {0x9a8ec740, 0x9a8eef28, 0x9a8f1710, 0x9a8f3ef8, 0x9a8f66e0, 0x9a8f8ec8, 0x9a8fb6b0, 0x9a8fde98, 0x9a900680, 0x9a902e68}
++printf "row_count = %d\n", *(int *)($ctx+20)
row_count = 924
</pre>
<p>This time around the function returns the number of rows (<b>0x39c</b>=<b>924</b>) of the corresponding array.</p>
<p>The SQL*Plus output:</p>
<pre class='brush:sql'>
SQL> select addr from x$ksuse where rownum<=10;
ADDR
----------------
000000009A8EC740
000000009A8EEF28
000000009A8F1710
000000009A8F3EF8
000000009A8F66E0
000000009A8F8EC8
000000009A8FB6B0
000000009A8FDE98
000000009A900680
000000009A902E68
10 rows selected.
SQL> select count(*) from x$ksuse;
COUNT(*)
----------
924
SQL> select value from v$parameter where name='sessions';
VALUE
--------------------------------------------------------------------------------
924
</pre>
<h3>X$MESSAGES</h3>
<p>The GDB output:</p>
<pre class='brush:sql'>
+x_desc "X$MESSAGES"
++printf "result = 0x%x\n", (int)qerfxArrayMaxSize("X$MESSAGES", (int)strlen("X$MESSAGES"), (int *)$ctx)
result = 0x212
++printf "start_addr = 0x%x\n", *(long *)($ctx+18)
start_addr = 0x152e5760
++info symbol *(int *)($ctx+18)
ksbsdt in section .rodata of /u01/app/oracle/product/19.3.0/dbhome_1/bin/oracle
++p/a (long[10])(**(long *)($ctx+18))
$3 = {0xee3e20 <ksl_pdb_event_stats_extend>, 0x152ef148, 0x13be18f8, 0x11, 0xee4ad0 <kslwo_compute_sys_thresholds_bg_action>, 0x152ef160, 0x13be18f8, 0x1, 0x713b1f0 <kslwo_process_sys_wait_bg_action>, 0x152ef178}
++printf "row_count = %d\n", *(int *)($ctx+20)
row_count = 530
</pre>
<p>The output above is very different from both <b>X$KSUPR</b> and <b>X$KSUSE</b> - the starting address is the actual address of the <b>ksbsdt</b> structure defined in the Oracle binary:</p>
<pre class='brush:sql'>
[oracle@rac1 bin]$ readelf -s oracle | grep 152e5760
203009: 00000000152e5760 16992 OBJECT GLOBAL DEFAULT 17 ksbsdt
218757: 00000000152e5760 16992 OBJECT GLOBAL DEFAULT 17 ksbsdt
</pre>
<p>The SQL*Plus output:</p>
<pre class='brush:sql'>
SQL> select addr from x$messages where rownum<=10;
ADDR
----------------
00000000152E5760
00000000152E5780
00000000152E57A0
00000000152E57C0
00000000152E57E0
00000000152E5800
00000000152E5820
00000000152E5840
00000000152E5860
00000000152E5880
10 rows selected.
SQL> select count(*) from x$messages;
COUNT(*)
----------
530
</pre>
<h2>Conclusion</h2>
<p>The <b>qerfxArrayMaxSize</b> function can be used to determine the starting address of X$ tables residing in the SGA or in the Oracle binary. The return value of the function is not very consistent: the function returned the starting address for <b>X$KSUPR</b> and the number of rows for <b>X$KSUSE</b>/<b>X$MESSAGES</b>. It might be the case that the function returns void and the results are inconsistent because we are just examining <b>$rax</b>. By contrast, the memory area pointed to by the third parameter always stores both the starting address and the number of rows for the tables I tested it with, so I consider it reliable for these types of tables (non-UGA/non-PGA based). I am disposed to think that <b>X$KQFTA.KQFTATYP</b> and maybe <b>KQFTAFLG</b> determine whether a table is SGA-, PGA-, UGA-, or Oracle-binary based:</p>
<pre class='brush:sql'>
SQL> select kqftanam, kqftatyp, kqftaflg
2 from x$kqfta
3 where kqftanam in ('X$KSUPR', 'X$KSUSE', 'X$MESSAGES')
4 /
KQFTANAM KQFTATYP KQFTAFLG
------------------------------ ---------- ----------
X$KSUSE 2 1
X$KSUPR 2 0
X$MESSAGES 1 0
</pre>
<p>However, more experiments need to be conducted to confirm that. I initially started looking at this because I found that I could not determine the static structure behind <b>X$MESSAGES</b> using my xinfo tool:</p>
<pre class='brush:sql'>
[oracle@rac1 ~]$ xinfo list 'X$MESSAGES' --with-kqftap
+------------+-----+------------+------------+-----------------+---------+-----+-----+-----+-----+-----------------------------------------------------+
| obj | ver | nam_ptr | nam | xstruct_nam_ptr | xstruct | typ | flg | rsz | coc | kqftap |
+------------+-----+------------+------------+-----------------+---------+-----+-----+-----+-----+-----------------------------------------------------+
| 4294950992 | 3 | 0x14b845e0 | X$MESSAGES | 0x14b845ec | ksbsd | 1 | 0 | 32 | 6 | {'xstruct_ptr': '0x14a26180', 'xstruct': 'ksbsd_c'} |
+------------+-----+------------+------------+-----------------+---------+-----+-----+-----+-----+-----------------------------------------------------+
</pre>
<p>There are structures such as <b>ksbsd</b>/<b>ksbsd_c</b> but there is no explicit <b>ksbsdt</b>. <b>qerfxArrayMaxSize</b> can be used to identify this missing structure.</p>

Computing ORA_HASH (2023-04-13)

<p>I wrote a short Python <a href="https://github.com/mvelikikh/oracle/blob/master/tools/ora_hash.py">function</a> to compute a hash value for a given expression. The return value of the function matches the return value of <a href="https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/ORA_HASH.html#GUID-0349AFF5-0268-43CE-8118-4F96D752FDE6">ORA_HASH</a> on inputs I tested it with in Oracle 23c on Linux x86-64.</p>
<p>In its simplest form, the invocation can be as follows:</p>
<pre class='brush:sql'>
>>> ora_hash(b'test')
2662839991
SQL> select ora_hash('test');
ORA_HASH('TEST')
----------------
2662839991
</pre>
<h3>Handling non-string datatypes</h3>
<p>In case of a non-string datatype, a conversion has to be performed first. ORA_HASH accepts a variety of datatypes; I use NUMBER and DATE in the examples below. The procedure is essentially the same for both: take the internal representation of the value in Oracle and pass it to Python. Any other supported datatype can be handled the same way.</p>
<h4>NUMBER</h4>
<pre class='brush:sql'>
SQL> select ora_hash(2023);
ORA_HASH(2023)
--------------
2671887358
SQL> select dump(2023, 16);
DUMP(2023,16)
---------------------
Typ=2 Len=3: c2,15,18
>>> ora_hash(b'\xc2\x15\x18')
2671887358
</pre>
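<p>For simple cases, the internal NUMBER representation can be reproduced in Python directly, so the DUMP step can be skipped. The sketch below follows the base-100, excess-64 exponent encoding and handles positive integers only; zero, negative, and fractional values use a more involved encoding that it does not cover.</p>

```python
def oracle_number_bytes(n):
    """Internal (Typ=2) representation of a positive integer NUMBER.

    A sketch: exponent byte is 193 + base-100 exponent, each base-100
    digit is stored as digit + 1, trailing zero digits are not stored.
    """
    assert n > 0
    digits = []
    while n:
        digits.append(n % 100)
        n //= 100
    digits.reverse()
    exponent = len(digits) - 1          # base-100 exponent
    while digits and digits[-1] == 0:   # trailing zero digits are dropped
        digits.pop()
    return bytes([193 + exponent] + [d + 1 for d in digits])

print(oracle_number_bytes(2023).hex(","))  # prints c2,15,18
```

The output matches DUMP(2023, 16) above, so the resulting bytes can be fed straight into ora_hash.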
<h4>DATE</h4>
<pre class='brush:sql'>
SQL> select ora_hash(to_date('2023-03-01 12:34:56', 'yyyy-mm-dd hh24:mi:ss'));
ORA_HASH(TO_DATE('2023-03-0112:34:56','YYYY-MM-DDHH24:MI:SS'))
--------------------------------------------------------------
112410422
SQL> select dump(to_date('2023-03-01 12:34:56', 'yyyy-mm-dd hh24:mi:ss'), 16);
DUMP(TO_DATE('2023-03-0112:34:56
--------------------------------
Typ=13 Len=8: e7,7,3,1,c,22,38,0
>>> ora_hash(b'\xe7\x07\x03\x01\x0c\x22\x38\x00')
112410422
</pre>
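<p>The Typ=13 representation (the 8-byte internal format DUMP reports for a DATE expression, as opposed to Typ=12 for a stored DATE column) is even simpler to reproduce: a little-endian 2-byte year followed by month, day, hour, minute, second, and a zero byte. A Python sketch, assuming this layout holds in general:</p>

```python
import struct
from datetime import datetime

def oracle_date13_bytes(dt):
    # Typ=13: <year:uint16 LE><month><day><hour><minute><second><0>
    return struct.pack("<H6B", dt.year, dt.month, dt.day,
                       dt.hour, dt.minute, dt.second, 0)

print(oracle_date13_bytes(datetime(2023, 3, 1, 12, 34, 56)).hex(","))
# prints e7,07,03,01,0c,22,38,00
```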
<h3>Specifying max_bucket and seed_value</h3>
<p>These parameters are also supported.</p>
<h4>MAX_BUCKET</h4>
<pre class='brush:sql'>
SQL> select ora_hash('abracadabra', 255);
ORA_HASH('ABRACADABRA',255)
---------------------------
82
>>> ora_hash(b'abracadabra', 255)
82
</pre>
<h4>SEED_VALUE</h4>
<pre class='brush:sql'>
SQL> select ora_hash('abracadabra', power(2,32)-1, 123);
ORA_HASH('ABRACADABRA',POWER(2,32)-1,123)
-----------------------------------------
4012392341
>>> ora_hash(b'abracadabra', seed=123)
4012392341
</pre>
<h4>MAX_BUCKET and SEED_VALUE</h4>
<pre class='brush:sql'>
SQL> select ora_hash('abracadabra', 255, 123);
ORA_HASH('ABRACADABRA',255,123)
-------------------------------
149
>>> ora_hash(b'abracadabra', 255, 123)
149
</pre>

X$ tables: reading X$KQFCO from Oracle binary (2023-03-25)

<p>Column definition structures are stored in <b>kqftap</b>:</p>
<pre class='brush:sql;highlight:[17]'>
[oracle@db-21 ~]$ xinfo list 'X$KCCCP' -o json --with-kqftap
{
"457": {
"obj": 4294951392,
"ver": 5,
"nam_ptr": 383999376,
"nam": "X$KCCCP",
"xstruct_nam_ptr": 383999384,
"xstruct": "kctcpx",
"typ": 5,
"flg": 0,
"rsz": 552,
"coc": 25,
"kqftap": {
"xstruct_ptr": 382889376,
"cb2_ptr": 75535856,
"xstruct": "kctcpx_c",
"cb2": "kctxcp"
}
}
}
</pre>
<p>For example, the structure <b>kctcpx_c</b> defines <b>X$KCCCP</b>'s columns:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ readelf -s oracle | grep -E -A1 -w 'Symbol|kctcpx_c' --no-group-separator
Symbol table '.dynsym' contains 225083 entries:
Num: Value Size Type Bind Vis Ndx Name
Symbol table '.symtab' contains 402534 entries:
Num: Value Size Type Bind Vis Ndx Name
80301: 0000000016d77100 1664 OBJECT LOCAL DEFAULT 17 kctcpx_c
80302: 0000000016d77780 2304 OBJECT LOCAL DEFAULT 17 kctmtx_c
</pre>
<p>Here is what it looks like:</p>
<pre class='brush:sql;highlight:[6,10,14,18,22]'>
[oracle@db-21 bin]$ objdump -s --start-address=0x0000000016d77100 --stop-address=$((0x0000000016d77100+1664)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
16d77100 04000000 00000000 9c95e616 00000000 ................
16d77110 17090000 00000000 00000000 00000000 ................
16d77120 00000000 00000000 08000000 00000000 ................
16d77130 00000000 00000000 00000000 00000000 ................
16d77140 04000000 00000000 60d6d715 00000000 ........`.......
16d77150 020b0000 00000000 00000000 00000000 ................
16d77160 00000000 00000000 04000000 00000000 ................
16d77170 00000000 00000000 00000000 00000000 ................
16d77180 07000000 00000000 b0960f16 00000000 ................
16d77190 020b0000 00000000 00000000 00000000 ................
16d771a0 00000000 00000000 04000000 00000000 ................
16d771b0 00000000 00000000 00000000 00000000 ................
16d771c0 06000000 00000000 5c8de716 00000000 ........\.......
16d771d0 020b0000 00000000 00000000 00000000 ................
16d771e0 00000000 00000000 02000000 00000000 ................
16d771f0 00000000 00000000 00000000 00000000 ................
16d77200 05000000 00000000 6c82e916 00000000 ........l.......
16d77210 02000100 00000000 00000000 00000000 ................
16d77220 00000000 00000000 04000000 00000000 ................
16d77230 0c020000 00000000 00000000 00000000 ................
</pre>
<p>Each column is defined within 64 bytes:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ for a in 16e6959c 15d7d660 160f96b0 16e78d5c 16e9826c
> do
> objdump -s --start-address=0x$a --stop-address=$((0x$a+16)) oracle
> done
oracle: file format elf64-x86-64
Contents of section .rodata:
16e6959c 41444452 00000000 6b71665f 66696c6c ADDR....kqf_fill
oracle: file format elf64-x86-64
Contents of section .rodata:
15d7d660 494e4458 00000000 49535355 45440000 INDX....ISSUED..
oracle: file format elf64-x86-64
Contents of section .rodata:
160f96b0 494e5354 5f494400 71656573 46696e64 INST_ID.qeesFind
oracle: file format elf64-x86-64
Contents of section .rodata:
16e78d5c 434f4e5f 49440000 5352435f 434f4e5f CON_ID..SRC_CON_
oracle: file format elf64-x86-64
Contents of section .rodata:
16e9826c 4350544e 4f000000 43505354 41000000 CPTNO...CPSTA...
</pre>
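<p>Putting the pieces together, a 64-byte column entry can be decoded with a short Python sketch. The field positions below (name pointer at offset 8, datatype/type/index bytes at offset 16, size at offset 40, row offset at offset 48) are inferred from comparing the dump with the X$KQFCO rows shown further down; they are an assumption, not an official layout:</p>

```python
import struct

def parse_kqfco_entry(entry):
    """Decode a 64-byte column entry (layout inferred from the dump)."""
    q = struct.unpack("<8Q", entry)
    return {
        "nam_ptr": q[1],             # pointer to the column name string
        "dty": q[2] & 0xff,          # KQFCODTY
        "typ": (q[2] >> 8) & 0xff,   # KQFCOTYP
        "idx": (q[2] >> 16) & 0xff,  # KQFCOIDX
        "siz": q[5],                 # KQFCOSIZ
        "off": q[6],                 # KQFCOOFF
    }

# The CPTNO entry from the dump at 0x16d77200
cptno = bytes.fromhex(
    "0500000000000000" "6c82e91600000000"
    "0200010000000000" "0000000000000000"
    "0000000000000000" "0400000000000000"
    "0c02000000000000" "0000000000000000"
)
print(parse_kqfco_entry(cptno))
```

For CPTNO this yields dty=2, typ=0, idx=1, siz=4, off=524, which matches the X$KQFCO row shown below.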
<p>Here is the structure of <b>X$KCCCP</b>:</p>
<pre class='brush:sql'>
SQL> desc x$kcccp
Name Null? Type
----------------------------------------- -------- ----------------------------
ADDR RAW(8)
INDX NUMBER
INST_ID NUMBER
CON_ID NUMBER
CPTNO NUMBER
CPSTA NUMBER
CPFLG NUMBER
CPDRT NUMBER
CPRDB NUMBER
CPLRBA_SEQ NUMBER
CPLRBA_BNO NUMBER
CPLRBA_BOF NUMBER
CPODR_SEQ NUMBER
CPODR_BNO NUMBER
CPODR_BOF NUMBER
CPODS VARCHAR2(20)
CPODT VARCHAR2(20)
CPODT_I NUMBER
CPHBT NUMBER
CPRLS VARCHAR2(20)
CPRLC NUMBER
CPMID NUMBER
CPSDR_SEQ NUMBER
CPSDR_BNO NUMBER
CPSDR_ADB NUMBER
</pre>
<p>And the actual <b>X$KQFCO</b> rows:</p>
<pre class='brush:sql'>
SQL> select kqfcocno,
2 kqfconam,
3 kqfcosiz,
4 kqfcodty,
5 kqfcotyp,
6 kqfcomax,
7 kqfcolsz,
8 kqfcolof,
9 kqfcooff,
10 kqfcoidx,
11 kqfcoipo
12 from x$kqfta t,
13 x$kqfco c
14 where t.kqftanam = 'X$KCCCP'
15 and c.kqfcotab = t.indx
16 order by 1;
KQFCOCNO KQFCONAM KQFCOSIZ KQFCODTY KQFCOTYP KQFCOMAX KQFCOLSZ KQFCOLOF KQFCOOFF KQFCOIDX KQFCOIPO
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1 ADDR 8 23 9 0 0 0 0 0 0
2 INDX 4 2 11 0 0 0 0 0 0
3 INST_ID 4 2 11 0 0 0 0 0 0
4 CON_ID 2 2 11 0 0 0 0 0 0
5 CPTNO 4 2 0 0 0 0 524 1 0
6 CPSTA 4 2 11 0 0 0 0 0 0
7 CPFLG 4 2 11 0 0 0 4 0 0
8 CPDRT 4 2 11 0 0 0 8 0 0
9 CPRDB 4 2 11 0 0 0 548 0 0
10 CPLRBA_SEQ 4 2 11 0 0 0 12 0 0
11 CPLRBA_BNO 4 2 11 0 0 0 16 0 0
12 CPLRBA_BOF 2 2 11 0 0 0 20 0 0
13 CPODR_SEQ 4 2 11 0 0 0 24 0 0
14 CPODR_BNO 4 2 11 0 0 0 28 0 0
15 CPODR_BOF 2 2 11 0 0 0 32 0 0
16 CPODS 20 1 5 0 0 0 484 0 0
17 CPODT 20 1 5 0 0 0 504 0 0
18 CPODT_I 4 2 11 0 0 0 44 0 0
19 CPHBT 4 2 11 0 0 0 60 0 0
20 CPRLS 20 1 5 0 0 0 528 0 0
21 CPRLC 4 2 11 0 0 0 48 0 0
22 CPMID 4 2 11 0 0 0 64 0 0
23 CPSDR_SEQ 4 2 11 0 0 0 104 0 0
24 CPSDR_BNO 4 2 11 0 0 0 108 0 0
25 CPSDR_ADB 2 2 11 0 0 0 112 0 0
25 rows selected.
</pre>
<p>The <b>xinfo</b> tool has a <b>describe</b> command that produces the same data:</p>
<pre class='brush:sql'>
[oracle@db-21 ~]$ xinfo desc 'X$KCCCP'
+-----+------------+------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------------+
| cno | nam_ptr | nam | siz | dty | typ | max | lsz | lof | off | idx | ipo | kqfcop_indx |
+-----+------------+------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------------+
| 1 | 0x16e6959c | ADDR | 8 | 23 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0x15d7d660 | INDX | 4 | 2 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0x160f96b0 | INST_ID | 4 | 2 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0x16e78d5c | CON_ID | 2 | 2 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0x16e9826c | CPTNO | 4 | 2 | 0 | 0 | 0 | 0 | 524 | 1 | 0 | 0 |
| 6 | 0x16e98274 | CPSTA | 4 | 2 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 0x16e9827c | CPFLG | 4 | 2 | 11 | 0 | 0 | 0 | 4 | 0 | 0 | 0 |
| 8 | 0x16e98284 | CPDRT | 4 | 2 | 11 | 0 | 0 | 0 | 8 | 0 | 0 | 0 |
| 9 | 0x16e9828c | CPRDB | 4 | 2 | 11 | 0 | 0 | 0 | 548 | 0 | 0 | 0 |
| 10 | 0x16e98294 | CPLRBA_SEQ | 4 | 2 | 11 | 0 | 0 | 0 | 12 | 0 | 0 | 0 |
| 11 | 0x16e982a0 | CPLRBA_BNO | 4 | 2 | 11 | 0 | 0 | 0 | 16 | 0 | 0 | 0 |
| 12 | 0x16e982ac | CPLRBA_BOF | 2 | 2 | 11 | 0 | 0 | 0 | 20 | 0 | 0 | 0 |
| 13 | 0x16e982b8 | CPODR_SEQ | 4 | 2 | 11 | 0 | 0 | 0 | 24 | 0 | 0 | 0 |
| 14 | 0x16e982c4 | CPODR_BNO | 4 | 2 | 11 | 0 | 0 | 0 | 28 | 0 | 0 | 0 |
| 15 | 0x16e982d0 | CPODR_BOF | 2 | 2 | 11 | 0 | 0 | 0 | 32 | 0 | 0 | 0 |
| 16 | 0x16e982dc | CPODS | 20 | 1 | 5 | 0 | 0 | 0 | 484 | 0 | 0 | 0 |
| 17 | 0x16e982e4 | CPODT | 20 | 1 | 5 | 0 | 0 | 0 | 504 | 0 | 0 | 0 |
| 18 | 0x16e982ec | CPODT_I | 4 | 2 | 11 | 0 | 0 | 0 | 44 | 0 | 0 | 0 |
| 19 | 0x16e982f4 | CPHBT | 4 | 2 | 11 | 0 | 0 | 0 | 60 | 0 | 0 | 0 |
| 20 | 0x16e982fc | CPRLS | 20 | 1 | 5 | 0 | 0 | 0 | 528 | 0 | 0 | 0 |
| 21 | 0x16e98304 | CPRLC | 4 | 2 | 11 | 0 | 0 | 0 | 48 | 0 | 0 | 0 |
| 22 | 0x16e9830c | CPMID | 4 | 2 | 11 | 0 | 0 | 0 | 64 | 0 | 0 | 0 |
| 23 | 0x16e98314 | CPSDR_SEQ | 4 | 2 | 11 | 0 | 0 | 0 | 104 | 0 | 0 | 0 |
| 24 | 0x16e98320 | CPSDR_BNO | 4 | 2 | 11 | 0 | 0 | 0 | 108 | 0 | 0 | 0 |
| 25 | 0x16e9832c | CPSDR_ADB | 2 | 2 | 11 | 0 | 0 | 0 | 112 | 0 | 0 | 0 |
+-----+------------+------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-------------+
</pre>

ORA-00855 PGA_AGGREGATE_TARGET cannot be set because of insufficient physical memory (2023-03-18)

<p>Given the following amount of memory and %target parameters, what PGA_AGGREGATE_TARGET (PAT) can be set for this system?</p>
<pre class='brush:sql'>SQL> sho parameter ga%target
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
sga_target big integer 16640M
SQL> !grep Mem /proc/meminfo
MemTotal: 32053636 kB
MemFree: 2063524 kB
MemAvailable: 5843840 kB
</pre>
<p>Oracle allows setting the value to more than MemAvailable:</p>
<pre class='brush:sql'>SQL> alter system set pga_aggregate_target=5843841K;
System altered.
</pre>
<p>At the same time, an attempt to set the PAT value to 6GB fails with ORA-00855:</p>
<pre class='brush:sql'>
SQL> alter system set pga_aggregate_target=6G;
alter system set pga_aggregate_target=6G
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00855: PGA_AGGREGATE_TARGET cannot be set because of insufficient physical
memory.
</pre>
<p>with the following message written to the DB alert log:</p>
<pre class='brush:sql'>pga_aggregate_target cannot be set to 6442450944 due to SGA memory requirement and the physical memory size.</pre>
<p>The description of the error is self-explanatory:</p>
<pre>
SQL> !oerr ora 855
00855, 00000, "PGA_AGGREGATE_TARGET cannot be set because of insufficient physical memory."
// *Cause: PGA_AGGREGATE_TARGET value was too high for the current system global area (SGA) size and amount of physical memory available.
// *Action: Reduce the SGA size or increase the physical memory size.
</pre>
<p>This, however, does not explain what the maximum PAT value allowed for this system is.</p>
<p>During my experiments, I found out that Oracle internally calls the <b>ksmc_physmem_pga_target</b> function, which returns the maximum PAT value. The function takes the total host memory as input, which in my case is <b>32053636KB</b> (<b>0x7a4661000</b> bytes in hex). We can use gdb to show the return value:</p>
<pre class='brush:sql'>
(gdb) printf "0x%lx\n", (long) ksmc_physmem_pga_target(0x7a4661000)
0x1686120cc
</pre>
<p>Which is 6,046,163,148 bytes. oradebug outputs just the last 4 bytes, so it is not reliable here (the same value is shown in the trace file):</p>
<pre class='brush:sql'>
SQL> oradebug call ksmc_physmem_pga_target 0x7a4661000
Function returned 686120CC
</pre>
<p>The trace file:</p>
<pre class='brush:sql'>
Oradebug command 'call ksmc_physmem_pga_target 0x7a4661000' console output:
Function returned 686120CC
</pre>
<p>Oracle developers must have used the x format in oradebug:</p>
<pre class='brush:sql'>(gdb) printf "0x%x\n", (long) ksmc_physmem_pga_target(0x7a4661000)
0x686120cc</pre>
<p>The PAT value can be validated in SQL*Plus:</p>
<pre class='brush:sql'>
SQL> alter system set pga_aggregate_target=6046163148;
System altered.
</pre>
<p>While setting the value to more than that is not allowed:</p>
<pre class='brush:sql'>
SQL> alter system set pga_aggregate_target=6046163149;
alter system set pga_aggregate_target=6046163149
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00855: PGA_AGGREGATE_TARGET cannot be set because of insufficient physical
memory.
</pre>
<p>The internal algorithm used by Oracle may change, but at the time of my experiments in 19.18 the maximum PAT value seems to be calculated as follows:</p>
<pre>max PAT = (TotalMemory * _pga_limit_physmem_perc / 100 - SGA_TARGET) * 100 / _pga_limit_target_perc</pre>
<p>Description:</p>
<ul>
<li><b>max PAT</b>: PGA_AGGREGATE_TARGET max value</li>
<li><b>TotalMemory</b>: total host memory (MemTotal in /proc/meminfo)</li>
<li><p><b>_pga_limit_physmem_perc</b>: the parameter limiting total PGA and SGA (90% by default, in other words Oracle reserves 10% for OS and everything else):</p>
<pre class='brush:sql'>
SQL> select indx, ksppdesc from x$ksppi where ksppinm='_pga_limit_physmem_perc';
INDX KSPPDESC
---------- --------------------------------------------------------------------------------
246 default percent of physical memory for pga_aggregate_limit and SGA
SQL> select ksppstvl from x$ksppsv where indx=246;
KSPPSTVL
--------------------------------------------------------------------------------
90
</pre>
</li>
<li><p><b>_pga_limit_target_perc</b>: the default percent of PAT for pga_aggregate_limit (200% by default):</p>
<pre class='brush:sql'>
SQL> select indx, ksppdesc from x$ksppi where ksppinm='_pga_limit_target_perc';
INDX KSPPDESC
---------- --------------------------------------------------------------------------------
234 default percent of pga_aggregate_target for pga_aggregate_limit
SQL> select ksppstvl from x$ksppsv where indx=234;
KSPPSTVL
--------------------------------------------------------------------------------
200
</pre>
</li>
</ul>
<p>Substituting the values from the sample system to the formula, we get the expected value we experimentally found previously:</p>
<pre>max PAT = (32053636 * 1024 * 90 / 100 - 16640 * 1024 * 1024) * 100 / 200 = 6046163148.8</pre>
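<p>The same calculation can be expressed as a small Python helper. This is a sketch of the formula above; the parameter defaults are the 19.18 values observed on this system and may differ between releases:</p>

```python
def max_pat(mem_total_kb, sga_target_bytes,
            pga_limit_physmem_perc=90, pga_limit_target_perc=200):
    """Maximum PGA_AGGREGATE_TARGET per the formula observed in 19.18."""
    usable = mem_total_kb * 1024 * pga_limit_physmem_perc / 100
    return int((usable - sga_target_bytes) * 100 / pga_limit_target_perc)

# MemTotal 32053636 kB and SGA_TARGET 16640M from the sample system
print(max_pat(32053636, 16640 * 1024 * 1024))  # prints 6046163148
```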
<p>Please note that this formula applies when the value is set while the instance is up and running. It is still possible to set a higher value in the spfile and bounce the instance. That works, producing the following output in the alert log:</p>
<pre class='brush:sql'>
2023-03-18T13:20:50.763260+00:00
**********************************************************************
PGA_AGGREGATE_TARGET specified is high
Errors in file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_275709.trc (incident=14402):
ORA-00700: soft internal error, arguments: [pga physmem limit], [6046163149], [6046163148], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/racdb/racdb1/incident/incdir_14402/racdb1_ora_275709_i14402.trc
</pre>
<p>Here the second argument of the ORA-00700 error (6046163149) is the actual PAT value, and the third argument (6046163148) is the max PAT value calculated by the formula above. In such a scenario, it is not possible to re-set the PAT value to itself:</p>
<pre class='brush:sql'>
SQL> sho parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_limit big integer 12092326298
pga_aggregate_target big integer 6046163149
SQL> alter system set pga_aggregate_target=6046163149;
alter system set pga_aggregate_target=6046163149
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00855: PGA_AGGREGATE_TARGET cannot be set because of insufficient physical
memory.
</pre>

X$ tables: table-level callbacks and column definitions (without parsing) (2022-02-19)

<p>Thus far, I have reviewed the <b>kqftab</b> structure, which is used to build the majority of tables in <b>X$KQFTA</b>. It is not yet clear where the actual columns come from. There is an additional structure called <b>kqftap</b> that holds the extra information.</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ readelf -s oracle | grep -E -A1 -w 'Symbol|kqftap' --no-group-separator
Symbol table '.dynsym' contains 225083 entries:
Num: Value Size Type Bind Vis Ndx Name
225049: 0000000016cf4d00 40736 OBJECT GLOBAL DEFAULT 17 kqftap
225050: 0000000007b02100 256 FUNC GLOBAL DEFAULT 13 l9_ippsRLEGetInUseTable_8
Symbol table '.symtab' contains 402534 entries:
Num: Value Size Type Bind Vis Ndx Name
351482: 0000000016cf4d00 40736 OBJECT GLOBAL DEFAULT 17 kqftap
351483: 0000000012aef270 800 FUNC GLOBAL DEFAULT 14 qesxlGetPayloadData
</pre>
<p>Here is what it looks like:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ objdump -s --start-address=0x0000000016cf4d00 --stop-address=$((0x0000000016cf4d00+40736)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
16cf4d00 00000000 00000000 00efcf16 00000000 ................
16cf4d10 7039670e 00000000 00000000 00000000 p9g.............
16cf4d20 00000000 00000000 00f2cf16 00000000 ................
16cf4d30 00000000 00000000 00000000 00000000 ................
16cf4d40 00000000 00000000 00f4cf16 00000000 ................
16cf4d50 00000000 00000000 00000000 00000000 ................
16cf4d60 00000000 00000000 80f5cf16 00000000 ................
16cf4d70 00000000 00000000 00000000 00000000 ................
16cf4d80 00000000 00000000 80f7cf16 00000000 ................
16cf4d90 903b670e 00000000 00000000 00000000 .;g.............
..
</pre>
<p>It is made of 32-byte rows. The third column in the output contains a pointer to the X$ column structure - the structure describing the columns of the corresponding X$ table:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ for a in 16cfef00 16cff200 16cff400 16cff580
> do
> nm oracle | grep $a
> done
0000000016cfef00 r kqfta_c
0000000016cff200 r kqfvi_c
0000000016cff400 r kqfvt_c
0000000016cff580 r kqfdt_c
[oracle@db-21 ~]$ xinfo list | head -10
+------------+-----+------------+-------------------------------+-----------------+---------------------------+-----+------+--------+-----+
| obj | ver | nam_ptr | nam | xstruct_nam_ptr | xstruct | typ | flg | rsz | coc |
+------------+-----+------------+-------------------------------+-----------------+---------------------------+-----+------+--------+-----+
| 4294950912 | 6 | 0x16282d00 | X$KQFTA | 0x16e33810 | kqftv | 4 | 0 | 80 | 11 |
| 4294950913 | 3 | 0x16e33818 | X$KQFVI | 0x16e33810 | kqftv | 1 | 0 | 80 | 7 |
| 4294951149 | 3 | 0x16e33820 | X$KQFVT | 0x16e33828 | kqftp | 1 | 0 | 32 | 5 |
| 4294950914 | 4 | 0x16e33830 | X$KQFDT | 0x16e33838 | kqfdt | 1 | 0 | 40 | 7 |
| 4294951036 | 6 | 0x16e33840 | X$KQFCO | 0x16e33848 | kqfcc | 4 | 2 | 80 | 17 |
| 4294952712 | 2 | 0x16e33850 | X$KQFOPT | 0x16e3385c | kqfopt | 1 | 0 | 24 | 6 |
| 4294952922 | 3 | 0x16e33864 | X$KYWMPCTAB | 0x16e33870 | kywmpctab | 4 | 0 | 88 | 12 |
</pre>
<p>Both <b>kqftab</b> and <b>kqftap</b> define tables and corresponding column structures in the same order: <b>X$KQFTA</b>'s columns are described in <b>kqfta_c</b>, <b>X$KQFVI</b>'s columns are described in <b>kqfvi_c</b>, etc.</p>
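<p>Consistent with this one-to-one correspondence, the symbol sizes reported by readelf line up: at 80 bytes per <b>kqftab</b> row and 32 bytes per <b>kqftap</b> row, both structures hold the same number of entries. A quick sanity check (a sketch; the row sizes are the ones observed in these posts):</p>

```python
# Symbol sizes as reported by readelf for the two structures
kqftab_size = 0x18dd0   # 101,840 bytes, 80 bytes per row
kqftap_size = 40736     # 32 bytes per row

# Both sizes divide evenly by their row sizes and yield the same entry count
assert kqftab_size % 80 == 0 and kqftap_size % 32 == 0
print(kqftab_size // 80, kqftap_size // 32)
```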
<p>Additionally, some X$ tables require extra processing, which is implemented via callback functions:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ nm oracle | grep -E 'e673970|e673b90'
000000000e673970 T kqftbl_cb
000000000e673b90 T kqftco
</pre>
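<p>Unpacking the 32-byte <b>kqftap</b> rows can be sketched in Python. This is an illustration only: I assume each row is four little-endian 64-bit words, with the second word holding the column-structure pointer and the third an optional callback pointer (the remaining words are not decoded here). The pointers of the first row resolve to <b>kqfta_c</b> and <b>kqftbl_cb</b> according to the nm outputs in this post:</p>

```python
import struct

# The first two 32-byte kqftap rows, copied from the objdump output above
rows = bytes.fromhex(
    "0000000000000000" "00efcf1600000000"   # row 1: xstruct_ptr = 0x16cfef00 (kqfta_c)
    "7039670e00000000" "0000000000000000"   #        cb1_ptr     = 0x0e673970 (kqftbl_cb)
    "0000000000000000" "00f2cf1600000000"   # row 2: xstruct_ptr = 0x16cff200 (kqfvi_c)
    "0000000000000000" "0000000000000000"   #        no callback
)

entries = []
for off in range(0, len(rows), 32):
    _w0, xstruct_ptr, cb1_ptr, _w3 = struct.unpack_from("<4Q", rows, off)
    entries.append((xstruct_ptr, cb1_ptr))

print([(hex(x), hex(c)) for x, c in entries])
```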
<p>The <b>xinfo</b> tool was enhanced to output the corresponding <b>kqftap</b> rows when the <b>--with-kqftap</b> option is specified:</p>
<pre class='brush:sql'>
[oracle@db-21 ~]$ xinfo list 'X$KSMLRU' --with-kqftap -o json
{
"88": {
"obj": 4294951099,
"ver": 7,
"nam_ptr": 383991380,
"nam": "X$KSMLRU",
"xstruct_nam_ptr": 383991392,
"xstruct": "ksmlr",
"typ": 4,
"flg": 0,
"rsz": 112,
"coc": 18,
"kqftap": {
"xstruct_ptr": 382474144,
"cb1_ptr": 132689584,
"xstruct": "ksmlru_c",
"cb1": "ksmlrs"
}
}
}
</pre>
<p>I will explain how to extract the column definitions in the next post.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-72360901517030252792022-02-11T22:16:00.001+06:002022-02-11T22:16:48.550+06:00Virtual X$ tables<p>Not all X$ tables come from the <b>kqftab</b> structure. For example, the <b>xinfo</b> tool outputs only 1,272 tables in 21.5, whereas <b>X$KQFTA</b> has 1,377 rows:
<pre class='brush:sql'>
[oracle@db-21 sf_exchange]$ xinfo list -o json |
> jq -r '.[] | .nam' |
> awk '{printf "%4d %s\n",NR,$0}'
1 X$KQFTA
2 X$KQFVI
3 X$KQFVT
4 X$KQFDT
5 X$KQFCO
..
1271 X$BCAPPLY_STATS
1272 X$FSDDBFS
</pre>
What about the extra 105 rows? Let me show what the output of <b>X$KQFTA</b> looks like in my database:</p>
<pre class='brush:sql;highlight:[10]'>
SQL> select * from x$kqfta;
ADDR INDX INST_ID CON_ID KQFTAOBJ KQFTAVER KQFTANAM KQFTATYP KQFTAFLG KQFTARSZ KQFTACOC
---------------- ---------- ---------- ---------- ---------- ---------- ------------------------------ ---------- ---------- ---------- ----------
0000000016CDBF20 0 1 0 4294950912 6 X$KQFTA 4 0 80 11
0000000016CDBF70 1 1 0 4294950913 3 X$KQFVI 1 0 80 7
..
0000000016CF4C00 1270 1 0 4294956360 2 X$BCAPPLY_STATS 4 2 260 16
0000000016CF4C50 1271 1 0 4294956225 2 X$FSDDBFS 4 0 1144 14
000000008F8127C0 1272 1 0 4294953644 0 X$KSIPC_PROC_STATS 9 18 0 0
000000008F812810 1273 1 0 4294953645 0 X$KSIPC_INFO 9 18 0 0
000000008F812860 1274 1 0 4294952215 1 X$KSXPTESTTBL 10 2 146 15
000000008F8128B0 1275 1 0 4294952216 0 X$KSXP_STATS 9 18 0 0
000000008F812900 1276 1 0 4294952217 0 X$SKGXP_PORT 9 18 0 0
000000008F812950 1277 1 0 4294952218 0 X$SKGXP_CONNECTION 9 18 0 0
000000008F8129A0 1278 1 0 4294952219 0 X$SKGXP_MISC 9 18 0 0
000000008F8129F0 1279 1 0 4294952227 1 X$KTCNQROW 9 2 1480 47
..
000000008F8147F0 1375 1 0 4294954950 3 X$DIAG_VTEST_EXISTS 10 34 1536 13
000000008F814840 1376 1 0 4294954964 3 X$DIAG_VADR_CONTROL 10 34 1072 21
1377 rows selected.
</pre>
<p>It can be seen that starting at <b>INDX=1,272</b> (which is row <b>1,273</b> because INDX starts at 0) the ADDR value is quite different: <b>0x8F8127C0</b> is no longer an address within the Oracle binary:</p>
<pre class='brush:sql'>
SQL> select *
2 from x$ksmsp
3 where to_number('8F8127C0','XXXXXXXXXXXXXXXX')
4 between to_number(ksmchptr,'XXXXXXXXXXXXXXXX')
5 and to_number(ksmchptr,'XXXXXXXXXXXXXXXX') + ksmchsiz - 1;
ADDR INDX INST_ID CON_ID KSMCHIDX KSMCHDUR KSMCHCOM KSMCHPTR KSMCHSIZ KSMCHCLS KSMCHTYP KSMCHPAR
---------------- ---------- ---------- ---------- ---------- ---------- ---------------- ---------------- ---------- -------- ---------- ----------------
00007F6C5E50DC10 106576 1 1 1 1 KQF runtime def 000000008F8127B0 8416 perm 0 000000008F101000
</pre>
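<p>The BETWEEN predicate in the query above can be double-checked with plain arithmetic, using the values from the X$KSMSP row:</p>

```python
addr = 0x8F8127C0        # ADDR of the first virtual table row in X$KQFTA
chunk_ptr = 0x8F8127B0   # KSMCHPTR of the "KQF runtime def" chunk
chunk_size = 8416        # KSMCHSIZ

# The table descriptor sits a few bytes into the permanent chunk
assert chunk_ptr <= addr <= chunk_ptr + chunk_size - 1
print(f"ADDR is {addr - chunk_ptr} bytes into the chunk")
```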
<p>The address refers to the <b>KQF runtime def</b> area of the SGA. The X$ tables starting from <b>INDX=1,272</b> are what Oracle calls virtual tables in its code. Of course, the INDX value is version-dependent, but these virtual tables always come after the X$ tables defined in <b>kqftab</b>. The starting address of the virtual tables can be obtained from the <b>kqftvrt_</b> SGA variable:</p>
<pre class='brush:sql'>
SQL> oradebug dumpvar sga kqftvrt_
struct kqftv* kqftvrt_ [0600A7408, 0600A7410) = 8F8127C0 00000000
</pre>
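<p>oradebug prints the 8-byte pointer as two 32-bit words. Assuming the low-order word comes first (which holds on this little-endian x86-64 platform), the value reassembles to the ADDR seen above:</p>

```python
# Words as printed by "oradebug dumpvar sga kqftvrt_", low-order word first
low, high = 0x8F8127C0, 0x00000000
kqftvrt = (high << 32) | low
print(hex(kqftvrt))  # matches the ADDR of the first virtual table row
```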
<p>The virtual tables are built inside the <b>kqfbldtvrt</b> function.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-58310735997131562352022-01-31T17:01:00.001+06:002022-01-31T17:01:50.917+06:00X$ tables: reading X$KQFTA from Oracle binary<p>X$ tables are visible in <b>X$KQFTA</b>, for example:</p>
<pre class='brush:sql'>
SQL> select * from x$kqfta where rownum<=10;
ADDR INDX INST_ID CON_ID KQFTAOBJ KQFTAVER KQFTANAM KQFTATYP KQFTAFLG KQFTARSZ KQFTACOC
---------------- ---------- ---------- ---------- ---------- ---------- ------------ ---------- ---------- ---------- ----------
0000000016CDBF20 0 1 0 4294950912 6 X$KQFTA 4 0 80 11
0000000016CDBF70 1 1 0 4294950913 3 X$KQFVI 1 0 80 7
0000000016CDBFC0 2 1 0 4294951149 3 X$KQFVT 1 0 32 5
0000000016CDC010 3 1 0 4294950914 4 X$KQFDT 1 0 40 7
0000000016CDC060 4 1 0 4294951036 6 X$KQFCO 4 2 80 17
0000000016CDC0B0 5 1 0 4294952712 2 X$KQFOPT 1 0 24 6
0000000016CDC100 6 1 0 4294952922 3 X$KYWMPCTAB 4 0 88 12
0000000016CDC150 7 1 0 4294953009 2 X$KYWMWRCTAB 4 0 72 6
0000000016CDC1A0 8 1 0 4294952923 2 X$KYWMCLTAB 4 0 4076 7
0000000016CDC1F0 9 1 0 4294952924 5 X$KYWMNF 4 0 240 11
</pre>
<p>This information comes mostly from the <b>kqftab</b> structure within the Oracle binary:</p>
<pre class='brush:sql'>
[oracle@db-21 bin]$ readelf -s oracle | grep -E -A1 -w 'Symbol|kqftab' --no-group-separator
Symbol table '.dynsym' contains 225083 entries:
Num: Value Size Type Bind Vis Ndx Name
224955: 0000000016cdbf20 0x18dd0 OBJECT GLOBAL DEFAULT 17 kqftab
224956: 00000000058400a0 288 FUNC GLOBAL DEFAULT 13 kole_length
Symbol table '.symtab' contains 402534 entries:
Num: Value Size Type Bind Vis Ndx Name
344560: 0000000016cdbf20 0x18dd0 OBJECT GLOBAL DEFAULT 17 kqftab
344561: 0000000016bc7a98 40 OBJECT GLOBAL DEFAULT 17 kqreqv
</pre>
<p>This is the actual <b>kqftab</b> structure:</p>
<pre class='brush:sql;highlight:[6,11,16];'>
[oracle@db-21 bin]$ objdump -s --start-address=0x0000000016cdbf20 --stop-address=$((0x0000000016cdbf20+0x18dd0)) oracle
oracle: file format elf64-x86-64
Contents of section .rodata:
16cdbf20 07000000 00000000 80f52c16 00000000 ..........,.....
16cdbf30 05000000 00000000 703be816 00000000 ........p;......
16cdbf40 04000000 00000000 50000000 00000000 ........P.......
16cdbf50 00000000 00000000 0b000000 02000000 ................
16cdbf60 00c0ffff 06000000 00000000 00000000 ................
16cdbf70 07000000 00000000 783be816 00000000 ........x;......
16cdbf80 05000000 00000000 703be816 00000000 ........p;......
16cdbf90 01000000 00000000 50000000 00000000 ........P.......
16cdbfa0 00000000 00000000 07000000 00000000 ................
16cdbfb0 01c0ffff 03000000 00000000 00000000 ................
16cdbfc0 07000000 00000000 803be816 00000000 .........;......
16cdbfd0 05000000 00000000 883be816 00000000 .........;......
16cdbfe0 01000000 00000000 20000000 00000000 ........ .......
16cdbff0 00000000 00000000 05000000 00000000 ................
16cdc000 edc0ffff 03000000 00000000 00000000 ................
...
</pre>
<p>There is a visible pattern in this data. The highlighted lines contain addresses that can easily be dereferenced:</p>
<pre class='brush:sql;highlight:[9,14,19]'>
[oracle@db-21 bin]$ for a in 162cf580 16e83b78 16e83b80
> do
> objdump -s --start-address=0x$a --stop-address=$((0x$a+16)) oracle
> done
oracle: file format elf64-x86-64
Contents of section .rodata:
162cf580 58244b51 46544100 716b736f 7050726f X$KQFTA.qksopPro
oracle: file format elf64-x86-64
Contents of section .rodata:
16e83b78 58244b51 46564900 58244b51 46565400 X$KQFVI.X$KQFVT.
oracle: file format elf64-x86-64
Contents of section .rodata:
16e83b80 58244b51 46565400 6b716674 70000000 X$KQFVT.kqftp...
</pre>
<p>Not surprisingly, the order of the tables is the same as in <b>X$KQFTA</b>. This made me think that the <b>kqftab</b> structure could be used to construct output similar to <b>X$KQFTA</b>, so I wrote a simple program that reads the Oracle binary and does just that:</p>
<pre class='brush:sql'>
[oracle@db-21 ~]$ xinfo list
+------------+-----+------------+-------------------------------+-----------------+---------------------------+-----+------+--------+-----+
| obj | ver | nam_ptr | nam | xstruct_nam_ptr | xstruct | typ | flg | rsz | coc |
+------------+-----+------------+-------------------------------+-----------------+---------------------------+-----+------+--------+-----+
| 4294950912 | 6 | 0x16282d00 | X$KQFTA | 0x16e33810 | kqftv | 4 | 0 | 80 | 11 |
| 4294950913 | 3 | 0x16e33818 | X$KQFVI | 0x16e33810 | kqftv | 1 | 0 | 80 | 7 |
| 4294951149 | 3 | 0x16e33820 | X$KQFVT | 0x16e33828 | kqftp | 1 | 0 | 32 | 5 |
| 4294950914 | 4 | 0x16e33830 | X$KQFDT | 0x16e33838 | kqfdt | 1 | 0 | 40 | 7 |
| 4294951036 | 6 | 0x16e33840 | X$KQFCO | 0x16e33848 | kqfcc | 4 | 2 | 80 | 17 |
| 4294952712 | 2 | 0x16e33850 | X$KQFOPT | 0x16e3385c | kqfopt | 1 | 0 | 24 | 6 |
| 4294952922 | 3 | 0x16e33864 | X$KYWMPCTAB | 0x16e33870 | kywmpctab | 4 | 0 | 88 | 12 |
| 4294953009 | 2 | 0x16e3387c | X$KYWMWRCTAB | 0x16e3388c | kywmwrctab | 4 | 0 | 72 | 6 |
| 4294952923 | 2 | 0x16e33898 | X$KYWMCLTAB | 0x16e338a4 | kywmcltab | 4 | 0 | 4076 | 7 |
| 4294952924 | 5 | 0x16e338b0 | X$KYWMNF | 0x16e338bc | kywmnf | 4 | 0 | 240 | 11 |
...
| 4294956357 | 2 | 0x16e3a834 | X$CONSENSUS_STATS | 0x16e3a848 | kconsstats | 4 | 2 | 64 | 10 |
| 4294956360 | 2 | 0x16e3a854 | X$BCAPPLY_STATS | 0x16e3a864 | kbbaStats | 4 | 2 | 260 | 16 |
| 4294956225 | 2 | 0x16e3a870 | X$FSDDBFS | 0x16e3a87c | fsddbfs | 4 | 0 | 1144 | 14 |
+------------+-----+------------+-------------------------------+-----------------+---------------------------+-----+------+--------+-----+
[oracle@db-21 ~]$ xinfo list 'X$KCCC*'
+------------+-----+------------+---------+-----------------+---------+-----+-----+-----+-----+
| obj | ver | nam_ptr | nam | xstruct_nam_ptr | xstruct | typ | flg | rsz | coc |
+------------+-----+------------+---------+-----------------+---------+-----+-----+-----+-----+
| 4294951110 | 5 | 0x16e3553c | X$KCCCF | 0x16e35544 | kcccf | 4 | 0 | 532 | 9 |
| 4294951215 | 4 | 0x16e3567c | X$KCCCC | 0x16e35684 | kcccc | 5 | 6 | 48 | 14 |
| 4294951392 | 5 | 0x16e35d90 | X$KCCCP | 0x16e35d98 | kctcpx | 5 | 0 | 552 | 25 |
+------------+-----+------------+---------+-----------------+---------+-----+-----+-----+-----+
</pre>
<p>I also make use of other information that is stored in <b>kqftab</b>, and add a column called xstruct to the output.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-50799183925489205452021-12-21T23:40:00.002+06:002021-12-21T23:40:48.101+06:00How to trace a specific SQL statement executed by a certain user<p>A question asking exactly this appeared on oracle-l today <a href="https://www.freelists.org/post/oracle-l/sql-trace-filtering">here</a>. It is widely known how to trace a specific SQL statement at the system level, e.g.:</p>
<pre class='brush:sql'>
alter system set events 'sql_trace[sql:<some_sql_id>]';
</pre>
<p>What if we want to trace the same SQL statement only when it is executed by a specific user? A common approach is to create a LOGON trigger and use <strong>ALTER SESSION SET EVENTS</strong> in it. It turns out that this is also possible using event filters and the <strong>username()</strong> function. For example, to trace the SQL_ID <strong>bqka14bvd2zmb</strong> run by the user <strong>TC1</strong>, we can use:</p>
<pre class='brush:sql'>
alter system set events 'sql_trace[sql:bqka14bvd2zmb] {streq:username(),"TC1"}';
</pre>
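<p>The shape of the event specification is easy to get wrong. A throwaway helper (purely illustrative; the function is mine, not part of any Oracle tooling) shows how the pieces fit together:</p>

```python
def sql_trace_event(sql_id, username=None):
    """Compose a sql_trace event specification, optionally filtered by user."""
    event = f"sql_trace[sql:{sql_id}]"
    if username is not None:
        # streq fires the event only when username() equals the given string
        event += f' {{streq:username(),"{username}"}}'
    return event

print(sql_trace_event("bqka14bvd2zmb", "TC1"))
```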
<p>Here is a full example which is also available as a Gist <a href="https://gist.github.com/mvelikikh/27a488e9104e63a55261c480b70a86f6">here</a>:</p>
<pre class='brush:sql'>
[oracle@db-21 ~]$ sqlplus /nolog @q
SQL*Plus: Release 21.0.0.0.0 - Production on Tue Dec 21 15:54:22 2021
Version 21.4.0.0.0
Copyright (c) 1982, 2021, Oracle. All rights reserved.
SQL> conn / as sysdba
Connected.
SQL> alter session set container=pdb;
Session altered.
SQL>
SQL> drop user tc1 cascade;
User dropped.
SQL> drop user tc2 cascade;
User dropped.
SQL>
SQL> grant create session, select_catalog_role to tc1 identified by tc1;
Grant succeeded.
SQL> grant create session, select_catalog_role to tc2 identified by tc2;
Grant succeeded.
SQL>
SQL> col sql_id old_v sql_id
SQL> select dbms_sql_translator.sql_id(q'#select 'trace_me' from dual#') sql_id from dual;
SQL_ID
--------------------------------------------------------------------------------
bqka14bvd2zmb
SQL>
SQL> alter system set events 'sql_trace[sql:&sql_id.] {streq:username(),"TC1"}';
old 1: alter system set events 'sql_trace[sql:&sql_id.] {streq:username(),"TC1"}'
new 1: alter system set events 'sql_trace[sql:bqka14bvd2zmb] {streq:username(),"TC1"}'
System altered.
SQL>
SQL> conn tc1/tc1@db-21/pdb
Connected.
SQL>
SQL> select 'trace_me' from dual;
'TRACE_M
--------
trace_me
SQL> col trace_file old_v trace_file
SQL> select value trace_file from v$diag_info where name='Default Trace File';
TRACE_FILE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5455.trc
SQL>
SQL> !cat &trace_file.
Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5455.trc
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production
Version 21.4.0.0.0
Build label: RDBMS_21.3.0.0.0_LINUX.X64_210727
ORACLE_HOME: /u01/app/oracle/product/21.0.0/dbhome_1
System name: Linux
Node name: db-21.localdomain
Release: 5.4.17-2136.300.7.el8uek.x86_64
Version: #2 SMP Fri Oct 8 16:23:01 PDT 2021
Machine: x86_64
VM name: KVM CPUID feature flags: 0x01000089
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 48
Unix process pid: 5455, NID: 4026531836, image: oracle@db-21.localdomain
*** 2021-12-21T15:54:23.397300+00:00 (PDB(3))
*** SESSION ID:(82.1864) 2021-12-21T15:54:23.397327+00:00
*** CLIENT ID:() 2021-12-21T15:54:23.397344+00:00
*** SERVICE NAME:(pdb) 2021-12-21T15:54:23.397360+00:00
*** MODULE NAME:(SQL*Plus) 2021-12-21T15:54:23.397377+00:00
*** ACTION NAME:() 2021-12-21T15:54:23.397394+00:00
*** CLIENT DRIVER:(SQL*PLUS) 2021-12-21T15:54:23.397409+00:00
*** CONTAINER ID:(3) 2021-12-21T15:54:23.397426+00:00
*** CLIENT IP:(10.0.2.15) 2021-12-21T15:54:23.397449+00:00
=====================
PARSING IN CURSOR #140557434715864 len=332 dep=1 uid=0 oct=3 lid=0 tim=1528633612 hv=2698389488 ad='85cb8630' sqlid='acmvv4fhdc9zh'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2, spare3, signature, spare7, spare8, spare9, nvl(dflcollid, 16382), creappid, creverid, modappid, modverid, crepatchid, modpatchid from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
EXEC #140557434715864:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=813480514,tim=1528633611
FETCH #140557434715864:c=19,e=19,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,plh=813480514,tim=1528634393
CLOSE #140557434715864:c=0,e=0,dep=1,type=3,tim=1528634426
=====================
PARSING IN CURSOR #140557434718920 len=27 dep=0 uid=120 oct=3 lid=120 tim=1528635472 hv=4140924523 ad='7aa8feb0' sqlid='bqka14bvd2zmb'
select 'trace_me' from dual
END OF STMT
EXEC #140557434718920:c=30,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1388734953,tim=1528635471
FETCH #140557434718920:c=5,e=6,p=0,cr=0,cu=0,mis=0,r=1,dep=0,og=1,plh=1388734953,tim=1528635612
STAT #140557434718920 id=1 cnt=1 pid=0 pos=1 obj=0 op='FAST DUAL (cr=0 pr=0 pw=0 str=1 time=1 us cost=2 size=0 card=1)'
FETCH #140557434718920:c=0,e=0,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=1388734953,tim=1528636086
CLOSE #140557434718920:c=0,e=7,dep=0,type=0,tim=1528637068
SQL>
SQL> conn tc2/tc2@db-21/pdb
Connected.
SQL>
SQL> select 'trace_me' from dual;
'TRACE_M
--------
trace_me
SQL> col trace_file old_v trace_file
SQL> select value trace_file from v$diag_info where name='Default Trace File';
TRACE_FILE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5458.trc
SQL>
SQL> !cat &trace_file.
cat: /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5458.trc: No such file or directory
</pre>
<p>Here I make use of the streq filter and the username function:</p>
<pre class='brush:sql'>
SQL> oradebug doc event filter streq
streq: filter to only fire an event when string s1 = s2 (up to <len> characters)
Usage
-------
{streq: s1 <string>[256],
s2 <string>[256],
len <ub4> }
SQL> oradebug doc event action username
username
- Return user log-in name
Usage
-------
username( Action returns: <string>[30])
</pre>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-7359732883260675142021-12-18T17:36:00.001+06:002021-12-18T17:36:57.686+06:00Mythbusters: VARRAY faster than CLOB<p>A tweet recently claimed that VARRAY is faster than CLOB: <a href="https://twitter.com/dbms_xtender/status/1469476734240178178">link</a>. On this data set with SQL*Plus, whether that statement holds largely depends on the database server hardware and the network between the client and the server. More specifically, VARRAY will only be faster over a rather slow network.</p>
<p>A typical production environment that I work with includes one or more databases deployed across different availability zones (AZ) in the cloud. The applications reside in the same AZ as the database server to avoid inter-AZ traffic, which costs extra money. I tested the script from <a href="https://gist.github.com/xtender/4ccc1bcae16883c0f1e80a0fb8404963">this Gist</a> across two major cloud providers, and VARRAY was never faster than CLOB. In fact, it was significantly slower. See the output from 19.13 below (the script is from <a href="https://gist.github.com/xtender/4ccc1bcae16883c0f1e80a0fb8404963">this Gist</a> - I just added the last query with DBMS_LOB):</p>
<pre class='brush:sql;highlight:[25,45,80,115]'>
[oracle@rac2 ~]$ NLS_LANG=.AL32UTF8 sqlplus tc/tc@rac1:1522/pdb1 @test1
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Dec 18 10:54:03 2021
Version 19.13.0.0.0
Copyright (c) 1982, 2021, Oracle. All rights reserved.
Last Successful login time: Sat Dec 18 2021 10:53:17 +00:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.13.0.0.0
SQL> set lobprefetch 32767
SQL> set long 10000000
SQL> set longchunksize 10000000
SQL> set timing on;
SQL> set arraysize 1000;
SQL> --set feedback only
SQL> set autotrace trace stat;
SQL> select id,c_lob from t_lob_1_mb where id<=25;
25 rows selected.
Elapsed: 00:00:01.50
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
82 consistent gets
6475 physical reads
0 redo size
51655001 bytes sent via SQL*Net to client
15345 bytes received via SQL*Net from client
52 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
25 rows processed
SQL> select id,lob_to_varray(c_lob) c_varray from t_lob_1_mb where id<=25;
25 rows selected.
Elapsed: 00:00:14.10
Statistics
----------------------------------------------------------
33 recursive calls
0 db block gets
199 consistent gets
653950 physical reads
0 redo size
25107664 bytes sent via SQL*Net to client
9261 bytes received via SQL*Net from client
61 SQL*Net roundtrips to/from client
25 sorts (memory)
0 sorts (disk)
25 rows processed
SQL>
SQL> select
2 c_varray
3 from t_lob_1_mb
4 outer apply (
5 select
6 cast(
7 collect(
8 cast(substr(c_lob,(level-1)*4000 + 1,4000) as varchar2(4000))
9 )
10 as sys.odcivarchar2list
11 ) c_varray
12 from dual
13 connect by level<=ceil(length(c_lob)/4000)
14 )
15 where id<=25;
25 rows selected.
Elapsed: 00:00:14.16
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
176 consistent gets
653950 physical reads
0 redo size
25106012 bytes sent via SQL*Net to client
9086 bytes received via SQL*Net from client
58 SQL*Net roundtrips to/from client
50 sorts (memory)
0 sorts (disk)
25 rows processed
SQL>
SQL> select
2 c_varray
3 from t_lob_1_mb
4 outer apply (
5 select
6 cast(
7 collect(
8 dbms_lob.substr(c_lob,4000,(level-1)*4000 + 1)
9 )
10 as sys.odcivarchar2list
11 ) c_varray
12 from dual
13 connect by level<=ceil(length(c_lob)/4000)
14 )
15 where id<=25;
25 rows selected.
Elapsed: 00:00:03.19
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
176 consistent gets
24900 physical reads
0 redo size
25106012 bytes sent via SQL*Net to client
9071 bytes received via SQL*Net from client
58 SQL*Net roundtrips to/from client
50 sorts (memory)
0 sorts (disk)
25 rows processed
</pre>
<p><b>NB</b>: the DBMS_LOB query is not mentioned in the original tweet, but I wrote about why SUBSTR should not be used against LOBs back in 2019: <a href="https://mvelikikh.blogspot.com/2019/10/temporary-lobs.html">Temporary LOBs</a>.</p>
<p>As I said, this is quite a typical cloud environment in which the client is on a different VM from the DB server. I can get even better results with CLOB if I run the same script on the DB server itself, or use a proximity placement group (Azure) or cluster placement group (AWS).</p>
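<p>A back-of-the-envelope calculation makes the network dependency concrete. Taking the bytes sent via SQL*Net from the autotrace statistics above, and ignoring latency and protocol overhead, the extra ~26.5 MB that CLOB ships only costs noticeable time on a slow link:</p>

```python
# Bytes sent via SQL*Net to the client, from the autotrace statistics above
clob_bytes = 51_655_001
varray_bytes = 25_107_664
extra = clob_bytes - varray_bytes

def transfer_seconds(nbytes, mbit_per_s):
    """Idealized transfer time: bytes over raw bandwidth, no latency/overhead."""
    return nbytes / (mbit_per_s * 1_000_000 / 8)

for mbps in (10, 100, 1000):
    print(f"{mbps:>4} Mbit/s: ~{transfer_seconds(extra, mbps):.2f}s extra for CLOB")
```

On a 10 Mbit/s link the extra payload alone costs over 20 seconds; at 1 Gbit/s it is a fraction of a second, which is why the CLOB query wins comfortably in the intra-AZ tests above.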
<p>It can be seen that twice as much data is transferred with CLOB as with the other queries (<b>50MB</b> vs <b>25MB</b>). This is a known issue that has already been observed by several other authors, e.g. <a href="https://jonathanlewis.wordpress.com/2019/08/01/lob-reads/">LOB reads</a>. In the specific example from this post, the extra CLOB data only becomes noticeable on a slow network, e.g. when pulling data from a different continent over mobile broadband. That is not the case in most environments (including non-production ones) that I work with - there CLOB will be faster than VARRAY (<b>1.5 seconds</b> vs <b>14.16 seconds</b>). As always, rather than relying on any single piece of information, such as the tweet above, it is better to test for yourself, as this post demonstrates.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-40828253375247620202021-12-17T23:54:00.003+06:002021-12-17T23:54:36.365+06:00Using credentials with database links in 21c<p>Since 21c it is possible to use credential objects in database links. Here is a short demonstration of this functionality:</p>
<pre class='brush:sql'>
SQL> exec dbms_credential.create_credential('TC_CRED', 'TC', 'tc')
PL/SQL procedure successfully completed.
SQL>
SQL> create database link link1 connect with tc_cred using 'localhost/pdb';
Database link created.
SQL> create database link link2 connect with tc_cred using 'localhost/pdb';
Database link created.
SQL>
SQL> select * from dual@link1;
D
-
X
SQL> select * from dual@link2;
D
-
X
</pre>
<p>The SQL Language Reference has not yet been updated with the new syntax. If we alter the user's password, the existing DB links stop working (I do not consider gradual password rollover here):</p>
<pre class='brush:sql'>
SQL> alter user tc identified by tc2;
User altered.
SQL>
SQL> alter session close database link link1;
Session altered.
SQL> alter session close database link link2;
Session altered.
SQL>
SQL> select * from dual@link1;
select * from dual@link1
*
ERROR at line 1:
ORA-01017: invalid username/password; logon denied
ORA-02063: preceding line from LINK1
</pre>
<p>It is enough to alter the credential object to make the DB links work again:</p>
<pre class='brush:sql'>
SQL> exec dbms_credential.update_credential('TC_CRED', 'PASSWORD', 'tc2')
PL/SQL procedure successfully completed.
SQL>
SQL> select * from dual@link1;
D
-
X
SQL> select * from dual@link2;
D
-
X
</pre>
<h3>Conclusion</h3>
<p>This functionality really comes into its own when you re-use one username and password pair in multiple database links. If we want to change the username or password, there is no need to change each link anymore; we can alter one credential object instead. The functionality has been backported to 19c as well: <a href="https://support.oracle.com/rs?type=doc&id=29541929.8">Bug 29541929 - support credential objects in database links (Doc ID 29541929.8)</a>.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-73021939724741561612021-12-09T17:26:00.000+06:002021-12-09T17:26:26.303+06:00gridSetup.sh executeConfigTools fails with PRVG-13606 : chrony daemon is not synchronized with any external time source<p>The command failed with the following errors in the log:</p>
<pre class='brush:sql'>/u01/app/19.3.0/grid/gridSetup.sh -executeConfigTools -responseFile /opt/rsp/gi_19.3_config.rsp -silent
...
INFO: [Dec 8, 2021 8:20:51 AM] Verifying Clock Synchronization ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: Verifying Clock Synchronization ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] Verifying Network Time Protocol (NTP) ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: Verifying Network Time Protocol (NTP) ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] Verifying chrony daemon is synchronized with at least one external time
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: Verifying chrony daemon is synchronized with at least one external time
INFO: [Dec 8, 2021 8:20:51 AM] source ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: source ...FAILED
INFO: [Dec 8, 2021 8:20:51 AM] rac2: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: rac2: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO: [Dec 8, 2021 8:20:51 AM] source on node "rac2".
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: source on node "rac2".
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line:
INFO: [Dec 8, 2021 8:20:51 AM] rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
INFO: [Dec 8, 2021 8:20:51 AM] source on node "rac1".
INFO: [Dec 8, 2021 8:20:51 AM] Skipping line: source on node "rac1".
</pre>
<p>It can be easily reproduced by running CVU:</p>
<pre class='brush:sql'>
[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose
Verifying Clock Synchronization ...
Node Name Status
------------------------------------ ------------------------
rac1 passed
Node Name State
------------------------------------ ------------------------
rac1 Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Verifying Network Time Protocol (NTP) ...
Verifying '/etc/chrony.conf' ...
Node Name File exists?
------------------------------------ ------------------------
rac1 yes
Verifying '/etc/chrony.conf' ...PASSED
Verifying Daemon 'chronyd' ...
Node Name Running?
------------------------------------ ------------------------
rac1 yes
Verifying Daemon 'chronyd' ...PASSED
Verifying NTP daemon or service using UDP port 123 ...
Node Name Port Open?
------------------------------------ ------------------------
rac1 yes
Verifying NTP daemon or service using UDP port 123 ...PASSED
Verifying chrony daemon is synchronized with at least one external time source ...FAILED (PRVG-13606)
Verifying Network Time Protocol (NTP) ...FAILED (PRVG-13606)
Verifying Clock Synchronization ...FAILED (PRVG-13606)
Verification of Clock Synchronization across the cluster nodes was unsuccessful on all the specified nodes.
Failures were encountered during execution of CVU verification request "Clock Synchronization across the cluster nodes".
Verifying Clock Synchronization ...FAILED
Verifying Network Time Protocol (NTP) ...FAILED
Verifying chrony daemon is synchronized with at least one external time
source ...FAILED
rac1: PRVG-13606 : chrony daemon is not synchronized with any external time
source on node "rac1".
CVU operation performed: Clock Synchronization across the cluster nodes
Date: Dec 9, 2021 10:56:45 AM
CVU home: /u01/app/19.3.0/grid/
User: grid
</pre>
<p>If we want to get more details, the CVU trace could be enabled:</p>
<pre class='brush:sql'>
[grid@rac1 ~]$ rm -rf /tmp/cvutrace
[grid@rac1 ~]$ mkdir /tmp/cvutrace
[grid@rac1 ~]$ export CV_TRACELOC=/tmp/cvutrace
[grid@rac1 ~]$ export SRVM_TRACE=true
[grid@rac1 ~]$ export SRVM_TRACE_LEVEL=1
[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose
</pre>
<p>This produces the following lines in the trace file <b>/tmp/cvutrace/cvutrace.log.0</b>:</p>
<pre class='brush:sql'>
[main] [ 2021-12-09 10:58:10.179 UTC ] [VerificationUtil.traceAndLogInternal:16755] [TaskNTP.doChronyTimeSourceCheck:2465] status=SUCCESSFUL; vfyCode=0; output=MS Name/IP address Stratum Poll Reach Last
Rx Last sample
===============================================================================
^* 169.254.169.123 3 7 377 128 +1141ns[+4603ns] +/- 501us
[main] [ 2021-12-09 10:58:10.179 UTC ] [TaskAnonymousProxy.<init>:119] Defining proxy task with: 'chrony daemon is synchronized with at least one external time source'
nodeList: 'rac1'
from task: 'TaskNTP'
Called from: 'TaskNTP.performNTPChecks:889'
[main] [ 2021-12-09 10:58:10.179 UTC ] [ResultSet.overwriteResultSet:810] Overwriting ResultSet, called from: TaskAnonymousProxy.performAnonymousTask:148
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.getCVUVariable:607] variable name : MODE_API
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.getCVUVariable:643] Variable found in the CVU and Command Line context
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.resolve:981] ForcedLookUp not enabled for variable:MODE_API
[main] [ 2021-12-09 10:58:10.179 UTC ] [CVUVariables.secureVariableValueTrace:789] getting CVUVariableConstant : VAR = MODE_API VAL = FALSE
[main] [ 2021-12-09 10:58:10.180 UTC ] [ResultSet.traceResultSet:1040]
ResultSet AFTER overwrite ===>
Overall Status->VERIFICATION_FAILED
Uploaded Overall Status->UNKNOWN
HasNodeResults: true
contents of resultTable
Dumping Result data.
Status : VERIFICATION_FAILED
Name : rac1
Type : Node
Has Results: No
Exp. value : null
Act. value : null
Errors :
PRVG-13606 : chrony daemon is not synchronized with any external time source on node "rac1".
</pre>
<p>The same configuration steps had completed successfully on Oracle Linux (OL) 8.4, so I started looking for what changed between the releases. It turns out that the <strong>chronyc sources</strong> output format changed.<br/>
OL 8.5 with <b>chrony-4.1-1.0.1.el8.x86_64</b>:</p>
<pre class='brush:sql'>
[root@rac1 ~]# chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 7 377 10 -590ns[+4002ns] +/- 494us
</pre>
<p>OL 8.4 with <b>chrony-3.5-2.0.1.el8.x86_64</b>:</p>
<pre class='brush:sql'>
210 Number of sources = 1
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 6 377 39 +4827ns[ +10us] +/- 560us
</pre>
<p>
The "210 Number of sources" line is absent in OL 8.5, and the CVU parser evidently depends on it.<br/>
There are two workarounds that can be used.</p>
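<p>Independently of CVU, one can verify that chrony is synchronized by counting the selected sources in the <strong>chronyc sources</strong> table: the currently selected source is marked with <b>^*</b> in the first column. A minimal sketch that works with both output formats:</p>

```shell
# Sample 'chronyc sources' output (chrony 4.x format from OL 8.5 above);
# the currently selected source is marked with '^*' in the first column.
sample='MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123               3   7   377    10    -590ns[+4002ns] +/- 494us'

# Count synchronized sources without relying on the '210 Number of sources' header
printf '%s\n' "$sample" | grep -c '^\^\*'    # prints 1
```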
<h3>Use ORA_DISABLED_CVU_CHECKS</h3>
<pre class='brush:sql'>
[grid@rac1 ~]$ ORA_DISABLED_CVU_CHECKS=TASKNTP cluvfy comp clocksync -n rac1 -verbose
Verifying Clock Synchronization ...
Node Name Status
------------------------------------ ------------------------
rac1 passed
Node Name State
------------------------------------ ------------------------
rac1 Observer
Verifying Network Time Protocol (NTP) ...WARNING (PRVG-11640)
Verifying Clock Synchronization ...WARNING (PRVG-11640)
Verification of Clock Synchronization across the cluster nodes was successful.
Warnings were encountered during execution of CVU verification request "Clock Synchronization across the cluster nodes".
Verifying Clock Synchronization ...WARNING
Verifying Network Time Protocol (NTP) ...WARNING
rac1: PRVG-11640 : The check "Network Time Protocol (NTP)" was not performed
as it is disabled
CVU operation performed: Clock Synchronization across the cluster nodes
Date: Dec 9, 2021 11:03:41 AM
CVU home: /u01/app/19.3.0/grid/
User: grid
</pre>
<h3>Amend chronyc temporarily to produce the desired output</h3>
<p>This is done only to validate the hypothesis about the cause of the validation error; it is not meant as a permanent fix.</p>
<pre class='brush:sql'>
[root@rac1 ~]# mv /usr/bin/chronyc{,.orig}
[root@rac1 ~]# vi /usr/bin/chronyc
[root@rac1 ~]# chmod a+x /usr/bin/chronyc
[root@rac1 ~]# /usr/bin/chronyc sources
210 Number of sources = 1
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 169.254.169.123 3 8 377 21 -32us[-5361ns] +/- 578us
[root@rac1 ~]# cat /usr/bin/chronyc
#!/bin/bash
echo '210 Number of sources = 1'
/usr/bin/chronyc.orig "$@"
</pre>
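<p>For illustration, a more targeted variant of the wrapper would prepend the legacy header only for the <strong>sources</strong> subcommand and pass everything else through unchanged. A sketch written as a shell function, where <b>CHRONYC_REAL</b> is a hypothetical variable pointing at the renamed binary:</p>

```shell
# Sketch of a wrapper that emits the legacy '210 Number of sources' header
# only when the 'sources' subcommand is requested; all other subcommands
# are passed to the real binary untouched.
chronyc_wrapper() {
    if [ "$1" = "sources" ]; then
        # chrony 3.x printed this line; chrony 4.x does not
        echo '210 Number of sources = 1'
    fi
    "${CHRONYC_REAL:-/usr/bin/chronyc.orig}" "$@"
}
```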
<p>Then we can validate the change by running CVU:</p>
<pre class='brush:sql'>
[grid@rac1 ~]$ cluvfy comp clocksync -n rac1 -verbose
Verifying Clock Synchronization ...
Node Name Status
------------------------------------ ------------------------
rac1 passed
Node Name State
------------------------------------ ------------------------
rac1 Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Verifying Network Time Protocol (NTP) ...
Verifying '/etc/chrony.conf' ...
Node Name File exists?
------------------------------------ ------------------------
rac1 yes
Verifying '/etc/chrony.conf' ...PASSED
Verifying Daemon 'chronyd' ...
Node Name Running?
------------------------------------ ------------------------
rac1 yes
Verifying Daemon 'chronyd' ...PASSED
Verifying NTP daemon or service using UDP port 123 ...
Node Name Port Open?
------------------------------------ ------------------------
rac1 yes
Verifying NTP daemon or service using UDP port 123 ...PASSED
Verifying chrony daemon is synchronized with at least one external time source ...PASSED
Verifying Network Time Protocol (NTP) ...PASSED
Verifying Clock Synchronization ...PASSED
Verification of Clock Synchronization across the cluster nodes was successful.
CVU operation performed: Clock Synchronization across the cluster nodes
Date: Dec 9, 2021 11:05:36 AM
CVU home: /u01/app/19.3.0/grid/
User: grid
</pre>
<h3>Conclusion</h3>
<p>The requirement to have an external NTP server is questionable. There can be hardware clocks that chrony can use, such as the TimeSync PTP service on Azure. This post demonstrates how to debug NTP issues during Grid Infrastructure installations and/or CVU checks. One generic method to work around such issues is to set <b>ORA_DISABLED_CVU_CHECKS</b>. If the issue is caused by Oracle utilities expecting certain output, we might as well tweak system programs temporarily to produce the desired output.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-79358489696863377192021-11-30T16:53:00.000+06:002021-11-30T16:53:40.133+06:00Setting database session to read only in 21c<p>Oracle introduced a new parameter, <strong>read_only</strong>, in 21c, which is not yet documented in the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/21/refrn/">Database Reference</a> at the time of this writing.</p>
<pre class='brush:sql'>
[oracle@db-21 ~]$ sqlplus /nolog @q
SQL*Plus: Release 21.0.0.0.0 - Production on Tue Nov 30 10:39:56 2021
Version 21.4.0.0.0
Copyright (c) 1982, 2021, Oracle. All rights reserved.
SQL>
SQL> conn tc/tc@db-21/pdb
Connected.
SQL>
SQL> create table t(x int);
Table created.
SQL>
SQL> insert into t values (0);
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> alter session set read_only=true;
Session altered.
SQL>
SQL> update t
2 set x=1;
update t
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access
SQL>
SQL> create table new_tab(x int);
create table new_tab(x int)
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access
SQL>
SQL> select *
2 from t
3 for update;
from t
*
ERROR at line 2:
ORA-16000: database or pluggable database open for read-only access
SQL>
SQL> lock table t in exclusive mode;
lock table t in exclusive mode
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access
SQL>
SQL> alter session set read_only=false;
Session altered.
SQL>
SQL> update t
2 set x=1;
1 row updated.
SQL>
SQL> create table new_tab(x int);
Table created.
SQL>
SQL> select *
2 from t
3 for update;
X
----------
1
SQL>
SQL> lock table t in exclusive mode;
Table(s) Locked.
</pre>
<p>I believe it was originally introduced for Oracle Autonomous Database offerings, since that is the only place where it is documented: <a href="https://docs.oracle.com/en/cloud/paas/autonomous-database/adbsa/autonomous-read-only-mode.html#GUID-F9417F91-0C56-4899-8595-4279F16EEC7D">Change Autonomous Database Operation Mode to Read/Write Read-Only or Restricted</a>. There are no comparable database features that provide the same functionality at this level of granularity. A typical use case is when some application sessions should be set to read-only. We can set <strong>read_only=true</strong> in a logon trigger for that:</p>
<pre class='brush:sql'>
SQL> create or replace trigger logon_trg
2 after logon on tc.schema
3 declare
4 begin
5 execute immediate 'alter session set read_only=true';
6 end;
7 /
Trigger created.
SQL>
SQL> conn tc/tc@db-21/pdb
Connected.
SQL> select *
2 from t
3 for update;
select *
*
ERROR at line 1:
ORA-16000: database or pluggable database open for read-only access
SQL>
SQL> conn system/manager@db-21/pdb
Connected.
SQL> select *
2 from tc.t
3 for update;
X
----------
1
</pre>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-12394441172371588682021-11-12T02:14:00.000+06:002021-11-12T02:14:18.956+06:00ORA-7445 kpdbfSourceFileSearch with NFSv3 without noac option<p>Creating a pluggable database (PDB) by plugging an unplugged PDB in 19.12 can fail with the <b>ORA-7445 kpdbfSourceFileSearch</b> error. There is no information about this error on MOS at the time of writing, so I am documenting it here.</p>
<pre class='brush:sql'>
SQL> create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile';
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 302875
Session ID: 140 Serial number: 36632
</pre>
<p>Producing the following lines in the alert log:</p>
<pre class='brush:sql'>
2021-11-11T16:11:59.436812+00:00
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
2021-11-11T16:11:59.662614+00:00
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mo
untproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048
,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
WARNING:NFS file system /mnt/racdba mounted with incorrect options(rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)
WARNING:Expected NFS mount options: rsize>=32768,wsize>=32768,hard,noac/actimeo=0
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally created as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn -14
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x1C8] [PC:0x99556E2, kpdbfSourceFileSearch()+674] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/racdba/racdba1/trace/racdba1_ora_302875.trc (incident=16137) (PDBNAME=CDB$ROOT):
ORA-07445: exception encountered: core dump [kpdbfSourceFileSearch()+674] [SIGSEGV] [ADDR:0x1C8] [PC:0x99556E2] [Address not mapped to object] []
Incident details in: /u01/app/oracle/diag/rdbms/racdba/racdba1/incident/incdir_16137/racdba1_ora_302875_i16137.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2021-11-11T16:12:01.412859+00:00
Dumping diagnostic data in directory=[cdmp_20211111161201], requested by (instance=1, osid=302875), summary=[incident=16137].
2021-11-11T16:12:47.954425+00:00
**************************************************************
Undo Create of Pluggable Database PDB_TEST with pdb id - 4.
**************************************************************
</pre>
<p>The partial call stack trace from the incident file is this:</p>
<pre class='brush:sql'>
__sighandler() call sslsshandler() 000000019 7F66A4F8EBF0
7F66A4F8EAC0 7F66A4F8EAC0 ?
7F66A4F8EA30 ? 000000083 ?
kpdbfSourceFileSear signal __sighandler() 7F66AA3C09C0 000000000
ch()+674 7F66A39B09E0 000000008 ?
000000701 ? 7F66AA27C450 ?
kpdbcPlugVerifyFile call kpdbfSourceFileSear 000000014 7F6600000000
s()+1853 ch() 7F66A35CC594 7F66A35CC58C
7F66AA411140 7F66AA27C450 ?
kpdbcPlugVerifyAndC call kpdbcPlugVerifyFile 7FFE906117B0 7F6600000000 ?
opyFiles()+1722 s() 068618378 7F66A35CC58C ?
7F66AA411140 ? 7F66AA27C450 ?
kpdbcParseAndVerify call kpdbcPlugVerifyAndC 7FFE906117B0 063F4FBA8
XML()+558 opyFiles() 068618378 ? 7F66A35CC58C ?
7F66AA411140 ? 7F66AA27C450 ?
kpdbcCreatePdbCbk() call kpdbcParseAndVerify 7FFE906117B0 063F4FBA8
+2832 XML() 068618378 ? 7F66A35CC58C ?
7F66AA411140 ? 7F66AA27C450 ?
kpdbcCreatePdb()+20 call kpdbcCreatePdbCbk() 7F66A464F5D0 063F4FBA8 ?
73 7F66A464F5A8 7F66A35CC58C ?
7F66AA411140 ? 7F66AA27C450 ?
</pre>
<p>Direct NFS (DNFS) is disabled for the database, and the actual mount options are as follows:</p>
<pre class='brush:sql'>
[root@raca1 ~]# mount | grep /mnt/racdba
mikhailstorageaccount.blob.core.windows.net:/mikhailstorageaccount/mikhailstoragecontainer on /mnt/racdba type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2048,timeo=600,retrans=2,sec=sys,mountaddr=20.38.122.68,mountvers=3,mountport=2048,mountproto=tcp,local_lock=all,addr=20.38.122.68)
</pre>
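<p>The check that produced the warnings above can be approximated in shell: either <b>noac</b> or <b>actimeo=0</b> must be present in the comma-separated mount option list. A minimal sketch (the <b>check_nfs_opts</b> helper name is for illustration only):</p>

```shell
# Approximate the database's NFS mount-option check: either 'noac' or
# 'actimeo=0' must appear in the comma-separated option list.
check_nfs_opts() {
    case ",$1," in
        *,noac,*|*,actimeo=0,*) echo "OK" ;;
        *) echo "missing noac/actimeo=0" ;;
    esac
}

opts='rw,relatime,vers=3,rsize=1048576,wsize=1048576,hard,nolock,proto=tcp'
check_nfs_opts "$opts"          # prints: missing noac/actimeo=0
check_nfs_opts "$opts,noac"     # prints: OK
```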
<p>In this case, remounting the filesystem with the <b>noac</b> option (do not cache file attributes) helps:</p>
<pre class='brush:sql'>
SQL> create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile';
Pluggable database created.
</pre>
<p>Producing the following lines in the alert log:</p>
<pre class='brush:sql'>
2021-11-11T16:37:07.331022+00:00
create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
2021-11-11T16:37:08.442188+00:00
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally created as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn -14
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-SYSTEM_FNO-14_020dtklc for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/system.281.1088344145 with afn-14
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally created as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/sysaux.279.1088344145 with afn -15
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-SYSAUX_FNO-15_050dtn16 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/sysaux.279.1088344145 with afn-15
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally created as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undotbs1.280.1088344145 with afn -16
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-UNDOTBS1_FNO-16_030dtler for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undotbs1.280.1088344145 with afn-16
Creating new file-/mnt/racdba/datafile/temp.282.1088344151 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/TEMPFILE/temp.282.1088344151
Scanning plugin datafile directory - /mnt/racdba/datafile for file originally created as +DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undo_2.283.1088344175 with afn -17
Using file-/mnt/racdba/datafile/data_D-RACDB_I-1085957449_TS-UNDO_2_FNO-17_040dtm80 for original file-+DATA/RACDBA/D0849B1A0CE20B41E0530101A8C085D0/DATAFILE/undo_2.283.1088344175 with afn-17
2021-11-11T16:37:32.703370+00:00
PDB_TEST(4):Endian type of dictionary set to little
****************************************************************
Pluggable Database PDB_TEST with pdb id - 4 is created as UNUSABLE.
If any errors are encountered before the pdb is marked as NEW,
then the pdb must be dropped
local undo-1, localundoscn-0x0000000000000114
****************************************************************
PDB_TEST(4):Pluggable database PDB_TEST pseudo opening
PDB_TEST(4):SUPLOG: Initialize PDB SUPLOG SGA, old value 0x0, new value 0x18
PDB_TEST(4):Autotune of undo retention is turned on.
PDB_TEST(4):This instance was first to open pluggable database PDB_TEST (container=4)
PDB_TEST(4):queued attach DA request 0x8fff7aa0 for pdb 4, ospid 334557
2021-11-11T16:37:33.671956+00:00
Increasing priority of 2 RS
Domain Action Reconfiguration started (domid 4, new da inc 11, cluster inc 8)
Instance 1 is attaching to domain 4
Global Resource Directory partially frozen for domain action
Domain Action Reconfiguration complete (total time 0.0 secs)
Decreasing priority of 2 RS
2021-11-11T16:37:33.793006+00:00
PDB_TEST(4):Undo initialization recovery: Parallel FPTR complete: start:18298543 end:18298544 diff:1 ms (0.0 seconds)
PDB_TEST(4):Undo initialization recovery: err:0 start: 18298542 end: 18298545 diff: 3 ms (0.0 seconds)
PDB_TEST(4):[334557] Successfully onlined Undo Tablespace 2.
PDB_TEST(4):Undo initialization online undo segments: err:0 start: 18298545 end: 18298678 diff: 133 ms (0.1 seconds)
PDB_TEST(4):Undo initialization finished serial:0 start:18298542 end:18298680 diff:138 ms (0.1 seconds)
PDB_TEST(4):Database Characterset for PDB_TEST is AL32UTF8
PDB_TEST(4):Pluggable database PDB_TEST pseudo closing
PDB_TEST(4):JIT: pid 334557 requesting stop
PDB_TEST(4):Closing sequence subsystem (18298726913).
PDB_TEST(4):Buffer Cache flush started: 4
PDB_TEST(4):Buffer Cache flush finished: 4
PDB_TEST(4):queued detach DA request 0x8fff7a38 for pdb 4, ospid 334557
2021-11-11T16:37:34.520104+00:00
Increasing priority of 2 RS
Domain Action Reconfiguration started (domid 4, new da inc 12, cluster inc 8)
Instance 1 is detaching from domain 4 (lazy abort? 0, recovery member? 0)
Global Resource Directory partially frozen for domain action
* domain detach - domain 4 valid ? 1
Non-local Process blocks cleaned out
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
freeing rdom 4
freeing the fusion rht of pdb 4
freeing the pdb enqueue rht
Domain Action Reconfiguration complete (total time 0.0 secs)
Decreasing priority of 2 RS
Completed: create pluggable database pdb_test as clone using '/mnt/racdba/pdb_test.xml' copy source_file_directory='/mnt/racdba/datafile'
</pre>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-10959338796108963192021-10-30T17:02:00.000+06:002021-10-30T17:02:27.693+06:00Workaround for ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion<p>Oracle versions 9i and 10.1 have a bug in LOB to CHAR conversion. When the value is larger than 4000 bytes, it is silently truncated without raising an error (see: <a href="https://support.oracle.com/rs?type=doc&id=388512.1">Ora-22835 - CLOB Larger Than 4000 Inserted Into Varchar2(4000) Column Is Silently Truncated (Doc ID 388512.1)</a>). This can become an issue following a database upgrade, since newer versions raise an ORA-22835 error instead. That is what happened in this thread on <a href="https://www.sql.ru/forum/1339779/kak-by-maloy-krovu-razreshit-to-char-pri-clob-4000-podobno-kak-eto-bylo-v-oracle-9">SQL.RU</a>. This blog post provides an undocumented workaround for this issue.</p>
<pre class='brush:sql'>
SQL> !oerr ora 22836
22836, 00000, "Event to turn on lob to char/raw silent truncation"
// *Document: NO
// *Cause: N/A
// *Action: Do not throw error 22835 for truncation during LOB to CHAR/RAW
// conversion. Truncate the data instead.
SQL>
SQL> create table t(vc varchar2(4000));
Table created.
SQL>
SQL> var c clob
SQL>
SQL> begin
2 select xmltype.createxml(cursor(select * from all_objects)).getClobVal()
3 into :c
4 from dual;
5 end;
6 /
PL/SQL procedure successfully completed.
SQL>
SQL> exec dbms_output.put_line(dbms_lob.getlength(:c))
12273513
PL/SQL procedure successfully completed.
SQL>
SQL> insert into t values (to_char(:c));
insert into t values (to_char(:c))
*
ERROR at line 1:
ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual:
12273513, maximum: 4000)
SQL>
SQL> alter session set events '22836 level 1';
Session altered.
SQL>
SQL> insert into t values (to_char(:c));
1 row created.
SQL>
SQL> select length(vc) from t;
LENGTH(VC)
----------
4000
SQL>
SQL> select banner_full from v$version;
BANNER_FULL
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.9.0.0.0
</pre>
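<p>With the event set, the conversion behaves like a hard cut at the 4000-byte VARCHAR2 limit; in shell terms the effect on the data is comparable to <b>head -c</b> (a sketch below). Note that with multi-byte character sets such a cut can land in the middle of a character, so a character-set-aware approach such as <b>dbms_lob.substr</b> is preferable whenever changing the code is feasible:</p>

```shell
# Illustrate the truncation semantics: a value longer than the limit is cut
# down to its first 4000 bytes, and the rest is silently discarded.
printf 'a%.0s' $(seq 1 5000) | head -c 4000 | wc -c    # prints 4000
```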
<h3>Conclusion</h3>
<p>Event 22836 can be used to suppress the ORA-22835 error and enable the old behavior - LOB values larger than 4000 bytes will be silently truncated without throwing an error. This is not something that can be considered as a long term solution, however, it can become handy in some cases when changing the code is not practical or not feasible.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-89204932849999251272021-09-22T01:10:00.002+06:002021-09-22T01:10:00.172+06:00Finding cause of kernel: blk_update_request: I/O error, dev fd0, sector 0<p>There was a Linux server that got periodic <strong>kernel: blk_update_request: I/O error, dev fd0, sector 0</strong> errors in <strong>/var/log/messages</strong>:</p>
<pre class='brush:sql'>
Sep 21 11:38:29 localhost kernel: blk_update_request: I/O error, dev fd0, sector 0
</pre>
<p>There were no cron jobs or systemd timers that could cause these errors, so I had to resort to other tools.</p>
<p>In this specific example, I decided to use Linux audit:</p>
<pre class='brush:sql'>
[root@floppy ~]# auditctl -a exit,always -S open -F path=/dev/fd0 -k floppy
WARNING - 32/64 bit syscall mismatch, you should specify an arch
</pre>
<p>The warning suggests specifying the architecture in the rule (e.g. adding <b>-F arch=b64</b>); it is harmless for this exercise. After new <strong>blk_update_request</strong> errors appeared, I obtained the audit logs (formatted for readability):</p>
<pre class='brush:sql;highlight:[12,23]'>
[root@floppy ~]# ausearch -k floppy
----
time->Tue Sep 21 11:43:04 2021
type=CONFIG_CHANGE msg=audit(1632224584.979:146): auid=1000 ses=1 op=add_rule key="floppy" list=4 res=1
----
time->Tue Sep 21 11:43:17 2021
type=PROCTITLE msg=audit(1632224597.188:147): proctitle=666469736B002D6C002F6465762F666430
type=PATH msg=audit(1632224597.188:147): item=0 name="/dev/fd0" inode=8972 dev=00:05 mode=060660 ouid=0 ogid=6 rdev=02:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=CWD msg=audit(1632224597.188:147): cwd="/root"
type=SYSCALL msg=audit(1632224597.188:147): arch=c000003e syscall=2 success=no exit=-6 a0=7ffcd4c7290a a1=80000 a2=1 a3=7ffcd4c70e20 items=1
ppid=1767 pid=2060 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1
comm="fdisk" exe="/usr/sbin/fdisk" key="floppy"
.. after a while ..
[root@floppy ~]# ausearch -k floppy
...
----
time->Tue Sep 21 11:43:31 2021
type=PROCTITLE msg=audit(1632224611.123:148): proctitle=666469736B002D6C002F6465762F666430
type=PATH msg=audit(1632224611.123:148): item=0 name="/dev/fd0" inode=8972 dev=00:05 mode=060660 ouid=0 ogid=6 rdev=02:00 objtype=NORMAL cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=CWD msg=audit(1632224611.123:148): cwd="/root"
type=SYSCALL msg=audit(1632224611.123:148): arch=c000003e syscall=2 success=no exit=-6 a0=7fff65dd690a a1=80000 a2=1 a3=7fff65dd4220 items=1
ppid=1767 pid=2073 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1
comm="fdisk" exe="/usr/sbin/fdisk" key="floppy"
</pre>
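<p>As an aside, the <b>proctitle</b> value is the process command line, hex-encoded, with NUL bytes separating the arguments. A small bash sketch (the <b>decode_proctitle</b> helper name is for illustration only) decodes it:</p>

```shell
# Decode an audit proctitle value: pairs of hex digits, with 00 (NUL)
# separating the command-line arguments.
decode_proctitle() {
    local hex=$1 out= i byte
    for ((i = 0; i < ${#hex}; i += 2)); do
        byte=${hex:i:2}
        if [ "$byte" = "00" ]; then
            out+=' '                     # argument separator
        else
            out+=$(printf "\x$byte")
        fi
    done
    printf '%s\n' "$out"
}

decode_proctitle 666469736B002D6C002F6465762F666430    # prints: fdisk -l /dev/fd0
```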
<p>As can be seen from the <strong>exe</strong> field, the executable was <strong>fdisk</strong>. More specifically, <strong>fdisk -l</strong> can cause such errors on Azure V1 VMs.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-37279232836962867792021-08-31T16:58:00.001+06:002021-08-31T17:09:37.770+06:00ASM disk group fails to mount with ORA-15017 ORA-15066 errors offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss<p>It's a test 21c cluster that wasn't shut down properly yesterday. While mounting the DATA disk group, the following errors are encountered:</p>
<pre class='brush:sql'>
SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss
</pre>
<p>Mounting the disk group with the FORCE option is not possible either and fails with the same errors:</p>
<pre class='brush:sql'>
ASMCMD> mount data -f
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss (DBD ERROR: OCIStmtExecute)
</pre>
<p>The following entries appear in the alert log:</p>
<pre class='brush:sql;highlight:[22]'>
2021-08-31T10:31:52.117810+00:00
SQL> alter diskgroup data mount
2021-08-31T10:31:52.124846+00:00
NOTE: cache registered group DATA 2/0x05D14A8C
NOTE: cache began mount (first) of group DATA 2/0x05D14A8C
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:31:52.294934+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:31:52.297979+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:31:58.301169+00:00
ERROR: disk 2 (RACQ$LUN3) in group 2 cannot be offlined because all disks [2(RACQ$LUN3)] with mirror data would be offline.
2021-08-31T10:31:58.301239+00:00
ERROR: too many offline disks in PST (grp 2)
2021-08-31T10:31:58.301932+00:00
NOTE: cache dismounting (clean) group 2/0x05D14A8C (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 7067, NID: 4026531836, image: oracle@rac1.example.com (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 2/0x05D14A8C (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x05d14a8c
NOTE: cache deleting context for group DATA 2/0x05d14a8c
2021-08-31T10:31:58.303103+00:00
GMON dismounting group 2 at 90 for pid 57, osid 7067
2021-08-31T10:31:58.303346+00:00
NOTE: Disk RAC1$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN6 in mode 0x7f marked for de-assignment
NOTE: Disk RACQ$LUN3 in mode 0x1 marked for de-assignment
NOTE: Disk RAC1$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN3 in mode 0x7f marked for de-assignment
NOTE: Disk RAC2$LUN5 in mode 0x7f marked for de-assignment
NOTE: Disk RAC1$LUN5 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15066: offlining disk "RACQ$LUN3" in group "DATA" may result in a data loss
2021-08-31T10:31:58.314373+00:00
ERROR: alter diskgroup data mount
</pre>
<p>DATA is a NORMAL redundancy disk group with two regular failure groups and one quorum failure group.
This is one of the cases where some undocumented commands may be needed:
</p>
<pre class='brush:sql;highlight:[25]'>
SQL> select name,
path,
mount_status,
header_status,
mode_status,
state,
failgroup
from v$asm_disk
where group_number=(select group_number from v$asm_diskgroup where name='DATA')
order by path;
2 3 4 5 6 7 8 9 10
NAME PATH MOUNT_S HEADER_STATU MODE_ST STATE FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
/dev/flashgrid/rac1.lun3 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/rac1.lun5 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/rac1.lun6 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/rac2.lun3 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/rac2.lun5 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/rac2.lun6 CLOSED MEMBER ONLINE NORMAL
/dev/flashgrid/racq.lun2 CLOSED FORMER ONLINE NORMAL
/dev/flashgrid/racq.lun3 CLOSED MEMBER ONLINE NORMAL
8 rows selected.
SQL> alter diskgroup data mount restricted for recovery;
Diskgroup altered.
SQL> select name,
path,
mount_status,
header_status,
mode_status,
state,
failgroup
from v$asm_disk
where group_number=(select group_number from v$asm_diskgroup where name='DATA')
order by path;
2 3 4 5 6 7 8 9 10
NAME PATH MOUNT_S HEADER_STATU MODE_ST STATE FAILGROUP
------------------------------ ------------------------------ ------- ------------ ------- -------- ------------------------------
RAC1$LUN3 /dev/flashgrid/rac1.lun3 CACHED MEMBER ONLINE NORMAL RAC1
RAC1$LUN5 /dev/flashgrid/rac1.lun5 CACHED MEMBER ONLINE NORMAL RAC1
RAC1$LUN6 /dev/flashgrid/rac1.lun6 CACHED MEMBER ONLINE NORMAL RAC1
RAC2$LUN3 /dev/flashgrid/rac2.lun3 CACHED MEMBER ONLINE NORMAL RAC2
RAC2$LUN5 /dev/flashgrid/rac2.lun5 CACHED MEMBER ONLINE NORMAL RAC2
RAC2$LUN6 /dev/flashgrid/rac2.lun6 CACHED MEMBER ONLINE NORMAL RAC2
RACQ$LUN3 /dev/flashgrid/racq.lun3 CACHED MEMBER ONLINE NORMAL RACQ
7 rows selected.
</pre>
<p>The alert log shows:</p>
<pre class='brush:sql'>
2021-08-31T10:37:56.317024+00:00
SQL> alter diskgroup data mount restricted for recovery
2021-08-31T10:37:56.323711+00:00
NOTE: cache registered group DATA 2/0xEA114A90
NOTE: cache began mount (first) of group DATA 2/0xEA114A90
NOTE: Assigning number (2,5) to disk (/dev/flashgrid/rac2.lun5)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,1) to disk (/dev/flashgrid/rac2.lun6)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,4) to disk (/dev/flashgrid/rac2.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,2) to disk (/dev/flashgrid/racq.lun3)
WARNING: preferred read failure group RAC1 does not exist in diskgroup DATA
NOTE: Assigning number (2,6) to disk (/dev/flashgrid/rac1.lun5)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,0) to disk (/dev/flashgrid/rac1.lun6)
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: Assigning number (2,3) to disk (/dev/flashgrid/rac1.lun3)
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:37:56.529659+00:00
cluster guid (5f307c7210446f13bfcd86fa9d15c5f1) generated for PST Hbeat for instance 1
NOTE: initial disk modes for disk 2 (RACQ$LUN3) in group 2 (DATA) is not completely online: modes 0x1 lflags 0x4
2021-08-31T10:37:56.532517+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.539918+00:00
NOTE: GMON heartbeating for grp 2 (DATA)
2021-08-31T10:38:02.540568+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
GMON querying group 2 at 125 for pid 57, osid 7067
2021-08-31T10:38:02.540905+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:02.541129+00:00
NOTE: cache is mounting group DATA created on 2021/07/30 12:37:34
NOTE: cache opening disk 0 of grp 2: RAC1$LUN6 path:/dev/flashgrid/rac1.lun6
NOTE: group 2 (DATA) high disk header ckpt advanced to fcn 0.42883
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 0 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 1 of grp 2: RAC2$LUN6 path:/dev/flashgrid/rac2.lun6
NOTE: cache opening disk 3 of grp 2: RAC1$LUN3 path:/dev/flashgrid/rac1.lun3
NOTE: cache opening disk 4 of grp 2: RAC2$LUN3 path:/dev/flashgrid/rac2.lun3
NOTE: cache opening disk 5 of grp 2: RAC2$LUN5 path:/dev/flashgrid/rac2.lun5
NOTE: 08/31/21 10:38:02 DATA.F1X0 found on disk 5 au 10 fcn 0.42883 datfmt 2
NOTE: cache opening disk 6 of grp 2: RAC1$LUN5 path:/dev/flashgrid/rac1.lun5
2021-08-31T10:38:02.541726+00:00
NOTE: cache mounting (first) normal redundancy group 2/0xEA114A90 (DATA)
2021-08-31T10:38:02.963014+00:00
NOTE: attached to recovery domain 2
2021-08-31T10:38:03.009050+00:00
validate pdb 2, flags x4, valid 0, pdb flags x204
* validated domain 2, flags = 0x200
NOTE: cache recovered group 2 to fcn 0.43425
NOTE: redo buffer size is 512 blocks (2105344 bytes)
2021-08-31T10:38:03.011498+00:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (DATA)
NOTE: LGWR found thread 1 closed at ABA 30.7188 lock domain=0 inc#=0 instnum=1
NOTE: LGWR mounted thread 1 for diskgroup 2 (DATA)
2021-08-31T10:38:03.022814+00:00
NOTE: LGWR opened thread 1 (DATA) at fcn 0.43425 ABA 31.7189 lock domain=2 inc#=2 instnum=1 gx.incarn=3927001744 mntstmp=2021/08/31 10:38:03.012000
2021-08-31T10:38:03.023034+00:00
NOTE: cache mounting group 2/0xEA114A90 (DATA) succeeded
NOTE: cache ending mount (success) of group DATA number=2 incarn=0xea114a90
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.103835+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104063+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.104307+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:03.104463+00:00
NOTE: Instance updated compatible.asm to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105130+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
2021-08-31T10:38:03.105436+00:00
NOTE: Instance updated compatible.rdbms to 19.0.0.0.0 for grp 2 (DATA).
WARNING: DATA has too many failure groups for a stretch cluster.
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:03.148108+00:00
SUCCESS: diskgroup DATA was mounted
2021-08-31T10:38:03.157095+00:00
SUCCESS: alter diskgroup data mount restricted for recovery
2021-08-31T10:38:03.167013+00:00
NOTE: diskgroup resource ora.DATA.dg is online
2021-08-31T10:38:19.113318+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.114528+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.475013+00:00
SQL> ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:19.478374+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.486520+00:00
NOTE: GroupBlock outside rolling migration privileged region
NOTE: initiating resync of disk group 2 disks
RACQ$LUN3 (2)
NOTE: process _user21083_+asm1 (21083) initiating offline of disk 2.4042374244 (RACQ$LUN3) with mask 0x7e in group 2 (DATA) without client assisting
2021-08-31T10:38:19.521378+00:00
NOTE: sending set offline flag message (2320259704) to 1 disk(s) in group 2
2021-08-31T10:38:19.521792+00:00
WARNING: Disk 2 (RACQ$LUN3) in group 2 mode 0x1 is now being offlined
2021-08-31T10:38:19.522023+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x6a, op = clear mandatory
2021-08-31T10:38:19.522202+00:00
GMON updating disk modes for group 2 at 135 for pid 52, osid 21083
2021-08-31T10:38:19.522431+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.523238+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0xf0f1bc64, mask = 0x7e, op = clear mandatory
2021-08-31T10:38:19.523484+00:00
GMON updating disk modes for group 2 at 136 for pid 52, osid 21083
2021-08-31T10:38:19.523608+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.524311+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance membership refresh for group=2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x11, op = assign mandatory
2021-08-31T10:38:19.532576+00:00
GMON updating disk modes for group 2 at 137 for pid 52, osid 21083
2021-08-31T10:38:19.532724+00:00
NOTE: cache closing disk 2 of grp 2: (not open) RACQ$LUN3
2021-08-31T10:38:19.540473+00:00
NOTE: PST update grp = 2 completed successfully
NOTE: requesting all-instance disk validation for group=2
2021-08-31T10:38:19.540830+00:00
NOTE: disk validation pending for 1 disk in group 2/0xea114a90 (DATA)
NOTE: Found /dev/flashgrid/racq.lun3 for disk RACQ$LUN3
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: completed disk validation for 2/0xea114a90 (DATA)
2021-08-31T10:38:19.677322+00:00
NOTE: running client discovery for group 2 (reqid:16866344061815119730)
NOTE: discarding redo for group 2 disk 2
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x19, op = assign mandatory
2021-08-31T10:38:20.176650+00:00
GMON updating disk modes for group 2 at 138 for pid 52, osid 21083
NOTE: group DATA: updated PST location: disks 0005 0000 0002
2021-08-31T10:38:20.225248+00:00
NOTE: PST update grp = 2 completed successfully
WARNING: DATA has too many failure groups for a stretch cluster.
2021-08-31T10:38:20.225980+00:00
NOTE: membership refresh pending for group 2/0xea114a90 (DATA)
2021-08-31T10:38:20.227553+00:00
GMON querying group 2 at 139 for pid 31, osid 22344
2021-08-31T10:38:20.237960+00:00
WARNING: DATA has too many failure groups for a stretch cluster.
NOTE: cache opening disk 2 of grp 2: RACQ$LUN3 path:/dev/flashgrid/racq.lun3
SUCCESS: refreshed membership for 2/0xea114a90 (DATA)
2021-08-31T10:38:20.238448+00:00
NOTE: initiating PST update: grp 2 (DATA), dsk = 2/0x0, mask = 0x7f, op = assign mandatory
2021-08-31T10:38:20.239039+00:00
GMON updating disk modes for group 2 at 140 for pid 52, osid 21083
2021-08-31T10:38:20.284132+00:00
NOTE: PST update grp = 2 completed successfully
2021-08-31T10:38:20.284349+00:00
SUCCESS: ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
2021-08-31T10:38:21.596382+00:00
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
</pre>
<p>The alert log shows that the RACQ$LUN3 disk is first taken offline and then brought back online; ASM corrected the issue by itself. The 'for recovery' mount option used to be documented somewhere on My Oracle Support, but I can no longer find the document. It looks like Oracle Support made it non-public.</p>
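<p>When reading a busy ASM alert log, it can help to filter out just the offline/resync/online lines to see the timeline at a glance. Below is a minimal sketch run against a synthetic excerpt of the log above (the file is created on the fly purely for demonstration):</p>

```shell
# Filter an ASM alert log down to the disk offline -> resync -> online timeline.
# The excerpt below is copied from the alert log shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
WARNING: Disk 2 (RACQ$LUN3) in group 2 mode 0x1 is now being offlined
NOTE: initiating resync of disk group 2 disks
SUCCESS: ALTER DISKGROUP "DATA" ONLINE QUORUM DISK "RACQ$LUN3"
EOF

# Print only the lines that tell the offline/resync/online story
grep -E 'offlined|resync|ONLINE QUORUM' "$log"
```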
<p>
I can now remount the disk group cleanly:
</p>
<pre class='brush:sql'>
SQL> alter diskgroup data dismount;
Diskgroup altered.
SQL> alter diskgroup data mount;
Diskgroup altered.
</pre>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-43159305688003197812021-06-30T17:46:00.002+06:002021-06-30T17:46:42.783+06:00Making DBMS_SCHEDULER.CREATE_JOB Transactional<p>It is known that <code>DBMS_SCHEDULER.CREATE_JOB</code> is non-transactional as opposed to the old <code>DBMS_JOB</code>. There is even an Oracle idea to provide a transactional interface: <a href="https://community.oracle.com/tech/apps-infra/discussion/4390942/dbms-scheduler-create-job-remove-implicit-commit">link</a>. In Oracle 19c, <code>DBMS_JOB</code> jobs are actually <code>DBMS_SCHEDULER</code> jobs, and they can be part of a bigger transaction. I was curious how this is done and whether I could make it work for <code>DBMS_SCHEDULER</code> jobs.</p>
<p>Here is a test script that I used for this blogpost on 19.9:</p>
<pre class="brush:sql">
SQL> conn / as sysdba
Connected.
SQL>
SQL> alter session set container=pdb;
Session altered.
SQL>
SQL> exec dbms_scheduler.purge_log()
PL/SQL procedure successfully completed.
SQL>
SQL> drop user tc cascade;
User dropped.
SQL>
SQL> grant create job, create session to tc identified by tc;
Grant succeeded.
SQL>
SQL> grant execute on sys.dbms_isched to tc;
Grant succeeded.
</pre>
<p>First, let us create a traditional <code>DBMS_SCHEDULER</code> job to demonstrate that <code>CREATE_JOB</code> is non-transactional. By "non-transactional" I mean that it commits implicitly and does not leave the session in a transaction.</p>
<pre class="brush:sql">
SQL> conn tc/tc@localhost/pdb
Connected.
SQL>
SQL> doc
DOC>################################################################################
DOC># Traditional Job
DOC>################################################################################
DOC>#
SQL>
SQL> select dbms_transaction.local_transaction_id from dual;
LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
SQL>
SQL> exec dbms_scheduler.create_job( -
> job_name => 'JOB_NON_TX', -
> job_type => 'PLSQL_BLOCK', -
> job_action => 'null;', -
> enabled => true)
PL/SQL procedure successfully completed.
SQL>
SQL> select dbms_transaction.local_transaction_id from dual;
LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
SQL>
SQL> exec dbms_session.sleep(5)
PL/SQL procedure successfully completed.
SQL>
SQL> col job_name for a10
SQL>
SQL> select job_name, state
2 from user_scheduler_jobs;
no rows selected
SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
2 from user_scheduler_job_run_details
3 order by log_date;
LOG_DATE JOB_NAME STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED
</pre>
<p>Now, I try the same but I call <code>DBMS_ISCHED.SET_NO_COMMIT_FLAG</code> before calling the <code>CREATE_JOB</code> procedure.</p>
<pre class="brush:sql">
SQL> doc
DOC>################################################################################
DOC># Transactional Job
DOC>################################################################################
DOC>#
SQL>
SQL> exec sys.dbms_isched.set_no_commit_flag
PL/SQL procedure successfully completed.
SQL>
SQL> select dbms_transaction.local_transaction_id from dual;
LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
SQL>
SQL> exec dbms_scheduler.create_job( -
> job_name => 'JOB_TX', -
> job_type => 'PLSQL_BLOCK', -
> job_action => 'null;', -
> enabled => true)
PL/SQL procedure successfully completed.
SQL>
SQL> select dbms_transaction.local_transaction_id from dual;
LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
9.23.604
SQL>
SQL> exec dbms_session.sleep(5)
PL/SQL procedure successfully completed.
SQL>
SQL> col job_name for a10
SQL>
SQL> select job_name, state
2 from user_scheduler_jobs;
JOB_NAME STATE
---------- ----------
JOB_TX SCHEDULED
SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
2 from user_scheduler_job_run_details
3 order by log_date;
LOG_DATE JOB_NAME STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED
</pre>
<p>Things are a bit different this time around:</p>
<ul>
<li><code>DBMS_SCHEDULER.CREATE_JOB</code> left the session in a transaction.</li>
<li>The job is not started yet.</li>
<li>Although it is not demonstrated here, issuing a rollback removes the job definition.</li>
</ul>
<p>Such a job starts as soon as a commit is issued:</p>
<pre class="brush:sql">
SQL> commit;
Commit complete.
SQL>
SQL> exec dbms_session.sleep(5)
PL/SQL procedure successfully completed.
SQL>
SQL> col job_name for a10
SQL> col state for a10
SQL>
SQL> select job_name, state
2 from user_scheduler_jobs;
no rows selected
SQL>
SQL> col status for a10
SQL>
SQL> col log_date for a35
SQL> select log_date, job_name, status
2 from user_scheduler_job_run_details
3 order by log_date;
LOG_DATE JOB_NAME STATUS
----------------------------------- ---------- ----------
30-JUN-21 12.02.26.780546 PM +01:00 JOB_NON_TX SUCCEEDED
30-JUN-21 12.02.36.835878 PM +01:00 JOB_TX SUCCEEDED
</pre>
<h3>Conclusion</h3>
<p>The post demonstrates that one can use the undocumented <code>DBMS_ISCHED.SET_NO_COMMIT_FLAG</code> to make <code>DBMS_SCHEDULER.CREATE_JOB</code> transactional. Since the package is undocumented, there is no guarantee that it will keep working in future versions. Hopefully, Oracle will make a transactional interface available someday.</p>Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-2517431197345740262021-04-23T23:00:00.003+06:002021-04-23T23:14:24.279+06:00How to Send Messages to AWS SQS from PL/SQL Using UTL_HTTPThis blog post provides a complete example showing how to send messages to an AWS SQS queue.
<br/>
<br/>
<h1>AWS Prerequisites</h1>
I created a queue called TestQueue. I also created an SQS VPC endpoint for the purpose of this example.<br/>
There is an IAM policy that I assigned to a dedicated IAM user created for this example:
<pre class="brush:sql">
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": "sqs:SendMessage",
"Resource": "arn:aws:sqs:us-east-1:12<redacted>:TestQueue"
}
]
}
</pre>
<br/>
<h1>Environment</h1>
I used Oracle Database 19.11 in my tests. The database user that is going to call the AWS SQS API is TC.
<br/>
<br/>
<h1>Wallet</h1>
First of all, let us load the AWS certificates into a wallet.
Creating a new wallet:
<pre class="brush:sql">
[oracle@rac1 ~]$ orapki wallet create -wallet /u01/app/oracle/wallet_sqs -pwd Oracle123 -auto_login
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Operation is successfully completed.
</pre>
Getting AWS certificates and loading them:
<pre class="brush:sql">
[oracle@rac1 ~]$ openssl s_client -showcerts -connect vpce-010f631ef8f1e24e7-8br55nv1.sqs.us-east-1.vpce.amazonaws.com:443 </dev/null 2>/dev/null | sed '/BEGIN/,/END/!d' | csplit -f aws -b '%d.pem' - /END/+1
2033
4798
[oracle@rac1 ~]$ orapki wallet add -wallet /u01/app/oracle/wallet_sqs -cert aws1.pem -trusted_cert -pwd Oracle123
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Operation is successfully completed.
</pre>
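<p>The <code>sed</code>/<code>csplit</code> pipeline above deserves a short explanation: <code>sed '/BEGIN/,/END/!d'</code> keeps only the PEM blocks from the <code>s_client</code> output, and <code>csplit</code> then writes one file per certificate. Here is a minimal sketch of the same pipeline run against synthetic input, so it can be tried without a live TLS endpoint (the fake certificate bodies are placeholders):</p>

```shell
# Demonstrate the certificate-splitting pipeline on fake s_client output.
workdir=$(mktemp -d)
cd "$workdir"
cat > chain.txt <<'EOF'
depth=1 CN = Example CA
-----BEGIN CERTIFICATE-----
AAAA
-----END CERTIFICATE-----
verify return:1
-----BEGIN CERTIFICATE-----
BBBB
-----END CERTIFICATE-----
EOF

# Keep only BEGIN..END blocks, then split at the line after the first END:
# aws0.pem receives the first certificate, aws1.pem the second.
sed '/BEGIN/,/END/!d' chain.txt | csplit -s -f aws -b '%d.pem' - '/END/+1'
ls aws*.pem
```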
In order to call AWS SQS, I need to provide both <b>AWS_ACCESS_KEY_ID</b> and <b>AWS_SECRET_ACCESS_KEY</b>. I certainly do not want to store them somewhere in the database. Instead, I am going to add the credentials to the wallet:
<pre class="brush:sql">
[oracle@rac1 ~]$ mkstore -wrl /u01/app/oracle/wallet_sqs -createCredential aws_sqs AKI<AWS_ACCESS_KEY_ID redacted>
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Your secret/Password is missing in the command line
Enter your secret/Password:
Re-enter your secret/Password:
Enter wallet password:
</pre>
In the end, that is what my wallet looks like:
<pre class="brush:sql">
[oracle@rac1 ~]$ orapki wallet display -wallet /u01/app/oracle/wallet_sqs
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Requested Certificates:
User Certificates:
Oracle Secret Store entries:
oracle.security.client.connect_string1
oracle.security.client.password1
oracle.security.client.username1
Trusted Certificates:
Subject: CN=Amazon,OU=Server CA 1B,O=Amazon,C=US
[oracle@rac1 ~]$ mkstore -wrl /u01/app/oracle/wallet_sqs -listCredential
Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2021, Oracle and/or its affiliates. All rights reserved.
Enter wallet password:
List credential (index: connect_string username)
1: aws_sqs AKI<AWS_ACCESS_KEY_ID redacted>
</pre>
<br/>
<h1>Configuring Network Access Control Lists (ACLs)</h1>
There are two privileges required for the calling user, which is called TC:
<ul>
<li>The user must be able to access the VPC endpoint over the network</li>
<li>The user must be able to access the wallet file</li>
</ul>
The privileges are assigned as follows:
<pre class="brush:sql">
SQL> exec dbms_network_acl_admin.append_host_ace( -
host => '*.amazonaws.com', -
ace => xs$ace_type(privilege_list => xs$name_list('connect'), -
principal_name => 'tc', -
principal_type => xs_acl.ptype_db))
PL/SQL procedure successfully completed.
SQL> exec dbms_network_acl_admin.append_wallet_ace( -
wallet_path => 'file:/u01/app/oracle/wallet_sqs', -
ace => xs$ace_type(privilege_list => xs$name_list('use_client_certificates', 'use_passwords'), -
principal_name => 'tc', -
principal_type => xs_acl.ptype_db))
PL/SQL procedure successfully completed.
</pre>
<br/>
<h1>Making UTL_HTTP Call</h1>
Finally, I am ready to make an AWS SQS API call:
<pre class="brush:sql">
SQL> declare
2 req utl_http.req;
3 resp utl_http.resp;
4 value varchar2(32767);
5 endpoint varchar2(32767) := 'https://vpce-010f631ef8f1e24e7-8br55nv1.sqs.us-east-1.vpce.amazonaws.com/12<redacted>/TestQueue/';
6 parameters varchar2(32767) := 'Action=SendMessage&MessageBody=example';
7 request_url varchar2(32767) := endpoint || '?' || parameters;
8 begin
9 utl_http.set_wallet('file:/u01/app/oracle/wallet_sqs');
10 req := utl_http.begin_request(request_url);
11 utl_http.set_header(req, 'x-amz-date', to_char(sysdate,'yyyymmdd"T"hh24miss"Z"'));
12 utl_http.set_property(req, 'aws-region', 'us-east-1');
13 utl_http.set_property(req, 'aws-service', 'sqs');
14 utl_http.set_authentication_from_wallet(req, 'aws_sqs', 'AWS4-HMAC-SHA256');
15 resp := utl_http.get_response(req);
16 loop
17 utl_http.read_line(resp, value, true);
18 dbms_output.put_line(value);
19 end loop;
20 exception
21 when utl_http.end_of_body
22 then
23 utl_http.end_request(req);
24 utl_http.end_response(resp);
25 when others
26 then
27 utl_http.end_request(req);
28 raise;
29 end;
30 /
<?xml version="1.0"?><SendMessageResponse
xmlns="http://queue.amazonaws.com/doc/2012-11-05/"><SendMessageResult><MessageId
>d59b7cfb-4d9d-4675-bf1c-d64a9cb27929</MessageId><MD5OfMessageBody>1a79a4d60de67
18e8e5b326e338ae533</MD5OfMessageBody></SendMessageResult><ResponseMetadata><Req
uestId>675ba300-5f89-59d8-bf23-fbfcb7bd1fc9</RequestId></ResponseMetadata></Send
MessageResponse>
PL/SQL procedure successfully completed.
</pre>
Let me explain the code above:
<ul>
<li><b>line 11</b>: <pre class="brush:sql">11 utl_http.set_header(req, 'x-amz-date', to_char(sysdate,'yyyymmdd"T"hh24miss"Z"'));</pre>
<b>x-amz-date</b> is one of the mandatory headers in <a href="https://docs.aws.amazon.com/general/latest/gr/sigv4-date-handling.html">Signature Version 4</a>. Note that <code>SYSDATE</code> returns the database server time, whereas AWS expects this timestamp in UTC, so the call above assumes the server clock is set to UTC; <code>SYS_EXTRACT_UTC(SYSTIMESTAMP)</code> would be a timezone-safe alternative
</li>
<li><b>lines 12-13:</b><pre class="brush:sql">12 utl_http.set_property(req, 'aws-region', 'us-east-1');
13 utl_http.set_property(req, 'aws-service', 'sqs');</pre>
The lines are provided in accordance with the comments in <b>?/rdbms/admin/utlhttp.sql</b>
</li>
<li><b>line 14:</b><pre class="brush:sql">14 utl_http.set_authentication_from_wallet(req, 'aws_sqs', 'AWS4-HMAC-SHA256');</pre>This is a single UTL_HTTP call, but a lot of work is performed inside the procedure: it loads <b>AWS_ACCESS_KEY_ID</b> and <b>AWS_SECRET_ACCESS_KEY</b> from the wallet and constructs a Signature Version 4 request. The <b>AWS4-HMAC-SHA256</b> scheme is not yet covered in the Oracle documentation, but it is mentioned in <b>?/rdbms/admin/utlhttp.sql</b></li>
</ul>
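<p>To give an idea of what <code>SET_AUTHENTICATION_FROM_WALLET</code> has to compute for <b>AWS4-HMAC-SHA256</b>, here is a sketch of the Signature Version 4 signing-key derivation done with <code>openssl</code>. The secret below is the sample key from the AWS SigV4 documentation, not a real credential, and the shell derivation is my own illustration of the algorithm, not Oracle's actual implementation:</p>

```shell
# Derive a SigV4 signing key: a chain of HMAC-SHA256 over date/region/service.
secret='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'  # AWS docs sample key
datestamp='20150830'   # yyyymmdd; must be the UTC date, matching x-amz-date
region='us-east-1'
service='sqs'

hmac_hex() {  # HMAC-SHA256 of stdin; $1 is "key:<string>" or "hexkey:<hex>"
  openssl dgst -sha256 -mac HMAC -macopt "$1" | sed 's/^.* //'
}

k_date=$(printf '%s' "$datestamp"    | hmac_hex "key:AWS4${secret}")
k_region=$(printf '%s' "$region"     | hmac_hex "hexkey:${k_date}")
k_service=$(printf '%s' "$service"   | hmac_hex "hexkey:${k_region}")
k_signing=$(printf '%s' aws4_request | hmac_hex "hexkey:${k_service}")
echo "signing key: $k_signing"
```

<p>The resulting key is then used to HMAC-sign the canonical request; UTL_HTTP hides all of this behind the single procedure call.</p>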
<br/>
<h1>Conclusion</h1>
As this blog post demonstrates, it is quite simple to call the AWS SQS API from PL/SQL using UTL_HTTP. Regrettably, some of the parameters I used are not widely known, and there is little documentation for them. It is my hope that Oracle improves its documentation one day.<br/><br/>
It should be possible to call other AWS services using the same technique. I used SQS because of an original question about it on <a href="https://www.sql.ru/forum/1335452/aws-sqs-iz-pl-sql">SQL.RU</a>. For instance, I remember calling S3 the other day using a similar script and a Signature Version 2 request, which corresponds to <b>scheme=AWS</b> in the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/arpls/UTL_HTTP.html#GUID-4A74C20F-9544-4123-AF8D-B5588503B6C4">UTL_HTTP.SET_AUTHENTICATION_FROM_WALLET</a> call. When it comes to S3, there is yet another way to utilize it that I blogged about in January: <a href="https://mvelikikh.blogspot.com/2021/01/reading-amazon-s3-data-from-oracle-on.html">Reading Amazon S3 Data from Oracle on non-OCI Environment</a>.Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-87745197488227867242021-04-14T22:40:00.001+06:002021-04-14T22:42:11.345+06:00Patching with OPatch Failed with Undefined Reference to __libc_csu_fini __libc_csu_init due to Wrong Libc LibraryPatching an Oracle Home failed on one cluster node with the following errors:
<pre class="brush:sql;highlight:[3,5]">
Patching component oracle.rdbms, 19.0.0.0.0...
Make failed to invoke "/usr/bin/make -f ins_rdbms.mk ioracle ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1 OPATCH_SESSION=apply"....'/usr/lib64/crt1.o: In function `_start':
(.text+0x12): undefined reference to `__libc_csu_fini'
/usr/lib64/crt1.o: In function `_start':
(.text+0x19): undefined reference to `__libc_csu_init'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(kf.o): In function `kfNotify':
kf.c:(.text+0x26f): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(jskm.o): In function `jskmCheckIMJobBCast':
jskm.c:(.text+0x202d): undefined reference to `stat'
jskm.c:(.text+0x2066): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libserver19.a(qmsqx.o): In function `qmsqxFetchPos':
qmsqx.c:(text.unlikely+0x1048f): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libcore19.a(slzgetevar.o): In function `slzgetevarf_insert_keyval':
slzgetevar.c:(text.unlikely+0x40e): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libgeneric19.a(skgfqsbt.o): In function `xsbtinit':
skgfqsbt.c:(text.unlikely+0x88): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_create_sym_tmp_file':
eobtl.c:(.text+0x106): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_digest_symbol_file':
eobtl.c:(.text+0x1b8): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libjavavm19.a(eobtl.o): In function `eobti_build_lookup_tables':
eobtl.c:(.text+0x24ae): undefined reference to `fstat'
eobtl.c:(.text+0x264f): undefined reference to `fstat'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libnnzst19.a(ext_ztrsaadapter.o): In function `ztcaProcCtx_New':
ztrsaadapter.c:(.text+0xb95): undefined reference to `atexit'
/u01/app/oracle/product/19.3.0/dbhome_1/lib//libnnzst19.a(ccme_ck_rand_load_fileS1.o): In function `r_ck_random_load_file':
ck_rand_load_file.c:(.text+0xd4): undefined reference to `stat'
/u01/app/oracle/product/19.3.0/dbhome_1/rdbms/lib//libfthread19.a(fthread_task.o): In function `fthread_task_tab_init':
fthread_task.c:(text.unlikely+0x3df): undefined reference to `atexit'
make: *** [/u01/app/oracle/product/19.3.0/dbhome_1/rdbms/lib/oracle] Error 1
</pre>
After a bit of searching, I found that the same errors had been reported when a wrong libc library was picked up, in a case unrelated to Oracle.<br/>
Sure enough, there was a symbolic link <b>$ORACLE_HOME/lib/libc.so</b> pointing to <b>/lib64/libc.so.6</b>:
<pre class="brush:sql;highlight:[3]">
[oracle@rac1 ~]$ find $ORACLE_HOME -name libc.so -ls
137463494 4 -rw-r--r-- 1 oracle oinstall 83 Apr 25 2014 /u01/app/oracle/product/19.3.0/dbhome_1/lib/stubs/libc.so
88704788 0 lrwxrwxrwx 1 oracle oinstall 16 Apr 14 15:16 /u01/app/oracle/product/19.3.0/dbhome_1/lib/libc.so -> /lib64/libc.so.6
</pre>
Having deleted the link, I was able to patch the Oracle Home successfully.<br/><br/>
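<p>A quick sanity check before patching can reveal such stray links before <code>opatch</code> fails halfway through a relink. Below is a minimal sketch; it recreates the problematic state in a scratch directory purely for demonstration (a real check would point at <code>$ORACLE_HOME/lib</code>):</p>

```shell
# Recreate the problematic state in a scratch directory and detect it.
libdir=$(mktemp -d)
ln -s /lib64/libc.so.6 "$libdir/libc.so"   # the stray link from the case above

# A libc.so symlink directly in lib/ is suspicious: only the stub
# lib/stubs/libc.so (a regular file) is expected in an Oracle Home.
found=$(find "$libdir" -maxdepth 1 -name 'libc.so' -type l)
if [ -n "$found" ]; then
  echo "WARNING: stray libc link found: $found"
fi
```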
<h3>Thoughts</h3>
My main suspect was <b>ldconfig</b> and the files in <b>/etc/ld.so.conf.d</b>. For instance, Oracle Instant Client creates a file there with the <b>$ORACLE_HOME/lib</b> in it:
<pre class="brush:sql;highlight:[2]">
[oracle@rac1 ~]$ cat /etc/ld.so.conf.d/oracle-instantclient.conf
/usr/lib/oracle/19.3/client64/lib
[oracle@rac1 ~]$ rpm -qf /etc/ld.so.conf.d/oracle-instantclient.conf
oracle-instantclient19.3-basic-19.3.0.0.0-1.x86_64
</pre>
<b>ldconfig</b> is also called when <b>glibc</b> is installed:
<pre class="brush:sql">
# rpm -qp --scripts /tmp/glibc/glibc-2.17-323.0.1.el7_9.x86_64.rpm
preinstall scriptlet (using <lua>):
-- Check that the running kernel is new enough
required = '2.6.32'
rel = posix.uname("%r")
if rpm.vercmp(rel, required) < 0 then
error("FATAL: kernel too old", 0)
end
postinstall program: /usr/sbin/glibc_post_upgrade.x86_64
postuninstall program: /sbin/ldconfig
</pre>
Besides, Oracle also uses some libc stubs (<b>$ORACLE_HOME/lib/stubs/libc.so</b>), so could it be the case that some unsuccessful patch application left a symbolic link there?<br/>
Then, it could be the case that somebody followed an article from My Oracle Support that recommends adding <b>$ORACLE_HOME/lib</b> to <b>/etc/ld.so.conf.d</b>: <a href="https://support.oracle.com/rs?type=doc&id=2083336.1">IF: External Jobs Failed With ORA-27301/ORA-27302 (Doc ID 2083336.1)</a>.<br/>
It is still unclear what caused the issue in the problem scenario since there were no references to <b>$ORACLE_HOME/lib</b> in <b>/etc/ld.so.conf.d</b>.
For what it's worth, next time I will be better prepared when I encounter a similar error.Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-55750072064471375062021-04-13T21:29:00.005+06:002021-04-13T21:29:36.536+06:00Direct Path Read Write Temp while Selecting CLOB from GV$ in RACI was investigating why a simple <code>SELECT * FROM GV$SQL</code> started working much slower in <b>19c</b> compared to <b>11.2.0.4</b>. It turns out Oracle introduced some changes there in <b>18c</b>. This blog post is written to demonstrate the new behavior.<br/><br/>
The simple test case I constructed for this example is shown below:
<pre class="brush:sql">
set echo on
set timing on
select banner from v$version;
alter session set events 'sql_trace wait=true';
begin
for r in (
select sql_fulltext
from gv$sql)
loop
null;
end loop;
end;
/
alter session set events 'sql_trace off';
select value
from v$diag_info
where name='Default Trace File';
</pre>
I ran it in <b>19.10</b> and got the following results:
<pre class="brush:sql;highlight:[25]">
SQL> select banner from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Elapsed: 00:00:00.00
SQL> alter session set events 'sql_trace wait=true';
Session altered.
Elapsed: 00:00:00.00
SQL> begin
2 for r in (
3 select sql_fulltext
4 from gv$sql)
5 loop
6 null;
7 end loop;
8 end;
9 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:02.14
SQL> alter session set events 'sql_trace off';
Session altered.
Elapsed: 00:00:00.00
SQL>
SQL> select value
2 from v$diag_info
3 where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl19/orcl191/trace/orcl191_ora_19616.trc
Elapsed: 00:00:00.00
</pre>
I processed the trace file using tkprof and got the output below:
<pre class="brush:sql;highlight:[12,43,44,45]">
SQL ID: 250c8asaww0wz Plan Hash: 1891717107
SELECT SQL_FULLTEXT
FROM
GV$SQL
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 21 0.44 2.13 288 288 25355 2011
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 23 0.44 2.13 288 288 25355 2011
Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
2011 2011 2011 PX COORDINATOR (cr=288 pr=288 pw=2259 time=2499846 us starts=1)
0 0 0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1638400 card=100)
0 0 0 VIEW GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
0 0 0 FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1638400 card=100)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PGA memory operation 18 0.00 0.00
PX Deq: reap credit 232 0.00 0.00
PX Deq: Join ACK 2 0.00 0.00
PX Deq: Parse Reply 2 0.00 0.00
PX Deq: Execute Reply 146 0.00 0.00
Disk file operations I/O 5 0.00 0.00
CSS initialization 2 0.00 0.00
CSS operation: query 6 0.00 0.00
CSS operation: action 2 0.00 0.00
asynch descriptor resize 1 0.00 0.00
ASM IO for non-blocking poll 6769 0.00 0.00
direct path write temp 2158 0.00 1.59
db file sequential read 144 0.00 0.06
direct path read temp 144 0.00 0.06
reliable message 1 0.00 0.00
PX Deq: Signal ACK EXT 2 0.00 0.00
IPC send completion sync 1 0.00 0.00
PX Deq: Slave Session Stats 2 0.00 0.00
IPC group service call 1 0.00 0.00
enq: PS - contention 1 0.00 0.00
</pre>
It is worth noting that the number of rows is <b>2011</b> while the number of <b>direct path write temp</b> wait events is <b>2158</b>. Based on my tests, I believe this is a general pattern rather than a coincidence: the number of <b>direct path write temp</b> events is always greater than the number of rows, but not by much. That basically means Oracle does at least one temp write per CLOB returned.<br/>
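<p>The per-event counts in the tkprof summary can be cross-checked against the raw trace file by tallying the <code>WAIT</code> lines per event name. Here is a minimal sketch against a synthetic trace excerpt (real trace lines carry more fields, such as object ids and timestamps):</p>

```shell
# Count wait events per name in a raw SQL trace file.
trc=$(mktemp)
cat > "$trc" <<'EOF'
WAIT #140230: nam='direct path write temp' ela= 712
WAIT #140230: nam='direct path write temp' ela= 650
WAIT #140230: nam='direct path read temp' ela= 401
WAIT #140230: nam='db file sequential read' ela= 380
EOF

# Extract the event name from each WAIT line and tally the occurrences
grep -o "nam='[^']*'" "$trc" | sort | uniq -c | sort -rn
```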
Let us now compare it with <b>12.2</b> (I tested both the base <b>12.2</b> release and <b>12.2</b> with the latest release update applied):
<pre class="brush:sql;highlight:[29]">
SQL> select banner from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
PL/SQL Release 12.2.0.1.0 - Production
CORE 12.2.0.1.0 Production
TNS for Linux: Version 12.2.0.1.0 - Production
NLSRTL Version 12.2.0.1.0 - Production
Elapsed: 00:00:00.01
SQL> alter session set events 'sql_trace wait=true';
Session altered.
Elapsed: 00:00:00.00
SQL> begin
2 for r in (
3 select sql_fulltext
4 from gv$sql)
5 loop
6 null;
7 end loop;
8 end;
9 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.05
SQL> alter session set events 'sql_trace off';
Session altered.
Elapsed: 00:00:00.00
SQL>
SQL> select value
2 from v$diag_info
3 where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/o122a/o122a1/trace/o122a1_ora_24892.trc
</pre>
It can be seen that <b>12.2</b> is much faster than <b>19.10</b>.<br/>
Here is a tkprof processed output:
<pre class="brush:sql">
SQL ID: 250c8asaww0wz Plan Hash: 1891717107
SELECT SQL_FULLTEXT
FROM
GV$SQL
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 21 0.02 0.03 0 0 0 2069
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 23 0.03 0.05 0 0 0 2069
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 76 (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
2069 2069 2069 PX COORDINATOR (cr=0 pr=0 pw=0 time=138102 us starts=1)
0 0 0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=200200 card=100)
0 0 0 VIEW GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
0 0 0 FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=200200 card=100)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
enq: PS - contention 3 0.00 0.00
PX Deq: Join ACK 5 0.00 0.00
PX Deq: reap credit 277 0.00 0.00
IPC send completion sync 3 0.00 0.00
PX Deq: Parse Reply 2 0.00 0.00
PGA memory operation 28 0.00 0.00
PX Deq: Execute Reply 132 0.00 0.01
reliable message 1 0.00 0.00
PX Deq: Signal ACK EXT 2 0.00 0.00
PX Deq: Slave Session Stats 2 0.00 0.00
</pre>
Although the number of rows is roughly the same (<b>2069</b> in <b>12.2</b> vs <b>2011</b> in <b>19.10</b>), the runtime performance and wait events are drastically different. Please bear in mind that I was initially asked to investigate why a query that ran in <b>1</b> second in <b>11.2</b> took <b>90</b> seconds in <b>19.10</b>, with about the same number of rows in both versions.<br/>
I also reviewed the trace file in <b>19c</b> to see whether I could spot any pattern there:
<pre class="brush:sql">
[oracle@rac1 ~]$ grep 'WAIT #139925193433952' /u01/app/oracle/diag/rdbms/orcl19/orcl191/trace/orcl191_ora_19616.trc | grep -E 'direct path (read|write) temp|db file sequential read'
WAIT #139925193433952: nam='direct path write temp' ela= 758 file number=201 first dba=4355 block cnt=1 obj#=-1 tim=20569064799
WAIT #139925193433952: nam='direct path write temp' ela= 674 file number=201 first dba=4356 block cnt=1 obj#=-1 tim=20569065748
..59 direct path write temp in total
WAIT #139925193433952: nam='direct path write temp' ela= 680 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569114649
..then there are groups consisting of one db file sequential read/one direct path read temp (the same block)/multiple direct path write temp:
..group 1
WAIT #139925193433952: nam='db file sequential read' ela= 473 file#=201 block#=4413 blocks=1 obj#=-1 tim=20569115200
WAIT #139925193433952: nam='direct path read temp' ela= 449 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569115739
WAIT #139925193433952: nam='direct path write temp' ela= 699 file number=201 first dba=4413 block cnt=1 obj#=-1 tim=20569116529
WAIT #139925193433952: nam='direct path write temp' ela= 643 file number=201 first dba=4414 block cnt=1 obj#=-1 tim=20569117342
WAIT #139925193433952: nam='direct path write temp' ela= 753 file number=201 first dba=4415 block cnt=1 obj#=-1 tim=20569118266
WAIT #139925193433952: nam='direct path write temp' ela= 706 file number=201 first dba=4416 block cnt=1 obj#=-1 tim=20569119134
WAIT #139925193433952: nam='direct path write temp' ela= 593 file number=201 first dba=4417 block cnt=1 obj#=-1 tim=20569119882
WAIT #139925193433952: nam='direct path write temp' ela= 626 file number=201 first dba=4418 block cnt=1 obj#=-1 tim=20569120652
WAIT #139925193433952: nam='direct path write temp' ela= 807 file number=201 first dba=4419 block cnt=1 obj#=-1 tim=20569121585
WAIT #139925193433952: nam='direct path write temp' ela= 625 file number=201 first dba=4420 block cnt=1 obj#=-1 tim=20569122348
WAIT #139925193433952: nam='direct path write temp' ela= 605 file number=201 first dba=4421 block cnt=1 obj#=-1 tim=20569123097
WAIT #139925193433952: nam='direct path write temp' ela= 509 file number=201 first dba=4422 block cnt=1 obj#=-1 tim=20569123753
WAIT #139925193433952: nam='direct path write temp' ela= 577 file number=201 first dba=4423 block cnt=1 obj#=-1 tim=20569124476
WAIT #139925193433952: nam='direct path write temp' ela= 667 file number=201 first dba=4424 block cnt=1 obj#=-1 tim=20569125286
WAIT #139925193433952: nam='direct path write temp' ela= 501 file number=201 first dba=4425 block cnt=1 obj#=-1 tim=20569125930
WAIT #139925193433952: nam='direct path write temp' ela= 617 file number=201 first dba=4426 block cnt=1 obj#=-1 tim=20569126698
WAIT #139925193433952: nam='direct path write temp' ela= 634 file number=201 first dba=4427 block cnt=1 obj#=-1 tim=20569127473
WAIT #139925193433952: nam='direct path write temp' ela= 577 file number=201 first dba=4428 block cnt=1 obj#=-1 tim=20569128188
WAIT #139925193433952: nam='direct path write temp' ela= 868 file number=201 first dba=4429 block cnt=1 obj#=-1 tim=20569129188
WAIT #139925193433952: nam='direct path write temp' ela= 609 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569130099
..group 2
WAIT #139925193433952: nam='db file sequential read' ela= 399 file#=201 block#=4430 blocks=1 obj#=-1 tim=20569130544
WAIT #139925193433952: nam='direct path read temp' ela= 454 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569131072
WAIT #139925193433952: nam='direct path write temp' ela= 669 file number=201 first dba=4430 block cnt=1 obj#=-1 tim=20569131834
WAIT #139925193433952: nam='direct path write temp' ela= 686 file number=201 first dba=4431 block cnt=1 obj#=-1 tim=20569132665
WAIT #139925193433952: nam='direct path write temp' ela= 588 file number=201 first dba=4432 block cnt=1 obj#=-1 tim=20569133416
WAIT #139925193433952: nam='direct path write temp' ela= 794 file number=201 first dba=4433 block cnt=1 obj#=-1 tim=20569134366
WAIT #139925193433952: nam='direct path write temp' ela= 648 file number=201 first dba=4434 block cnt=1 obj#=-1 tim=20569135192
WAIT #139925193433952: nam='direct path write temp' ela= 792 file number=201 first dba=4435 block cnt=1 obj#=-1 tim=20569136134
WAIT #139925193433952: nam='direct path write temp' ela= 682 file number=201 first dba=4436 block cnt=1 obj#=-1 tim=20569136961
WAIT #139925193433952: nam='direct path write temp' ela= 609 file number=201 first dba=4437 block cnt=1 obj#=-1 tim=20569137753
WAIT #139925193433952: nam='direct path write temp' ela= 563 file number=201 first dba=4438 block cnt=1 obj#=-1 tim=20569138501
WAIT #139925193433952: nam='direct path write temp' ela= 645 file number=201 first dba=4439 block cnt=1 obj#=-1 tim=20569139298
WAIT #139925193433952: nam='direct path write temp' ela= 807 file number=201 first dba=4440 block cnt=1 obj#=-1 tim=20569140265
WAIT #139925193433952: nam='direct path write temp' ela= 679 file number=201 first dba=4441 block cnt=1 obj#=-1 tim=20569141079
WAIT #139925193433952: nam='direct path write temp' ela= 626 file number=201 first dba=4442 block cnt=1 obj#=-1 tim=20569141844
WAIT #139925193433952: nam='direct path write temp' ela= 740 file number=201 first dba=4443 block cnt=1 obj#=-1 tim=20569142718
WAIT #139925193433952: nam='direct path write temp' ela= 753 file number=201 first dba=4444 block cnt=1 obj#=-1 tim=20569143615
WAIT #139925193433952: nam='direct path write temp' ela= 686 file number=201 first dba=4445 block cnt=1 obj#=-1 tim=20569144450
WAIT #139925193433952: nam='direct path write temp' ela= 673 file number=201 first dba=4446 block cnt=1 obj#=-1 tim=20569145257
WAIT #139925193433952: nam='direct path write temp' ela= 728 file number=201 first dba=4447 block cnt=1 obj#=-1 tim=20569146129
WAIT #139925193433952: nam='direct path write temp' ela= 589 file number=201 first dba=4448 block cnt=1 obj#=-1 tim=20569146857
WAIT #139925193433952: nam='direct path write temp' ela= 658 file number=201 first dba=4449 block cnt=1 obj#=-1 tim=20569147672
WAIT #139925193433952: nam='direct path write temp' ela= 666 file number=201 first dba=4450 block cnt=1 obj#=-1 tim=20569148478
WAIT #139925193433952: nam='direct path write temp' ela= 602 file number=201 first dba=4451 block cnt=1 obj#=-1 tim=20569149255
WAIT #139925193433952: nam='direct path write temp' ela= 841 file number=201 first dba=4452 block cnt=1 obj#=-1 tim=20569150247
WAIT #139925193433952: nam='direct path write temp' ela= 763 file number=201 first dba=4453 block cnt=1 obj#=-1 tim=20569151167
WAIT #139925193433952: nam='direct path write temp' ela= 779 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569152208
..group 3
WAIT #139925193433952: nam='db file sequential read' ela= 206 file#=201 block#=4454 blocks=1 obj#=-1 tim=20569152457
WAIT #139925193433952: nam='direct path read temp' ela= 711 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569153234
WAIT #139925193433952: nam='direct path write temp' ela= 575 file number=201 first dba=4454 block cnt=1 obj#=-1 tim=20569153905
WAIT #139925193433952: nam='direct path write temp' ela= 930 file number=201 first dba=4455 block cnt=1 obj#=-1 tim=20569155093
WAIT #139925193433952: nam='direct path write temp' ela= 809 file number=201 first dba=4355 block cnt=1 obj#=-1 tim=20569156069
WAIT #139925193433952: nam='direct path write temp' ela= 820 file number=201 first dba=4356 block cnt=1 obj#=-1 tim=20569157024
WAIT #139925193433952: nam='direct path write temp' ela= 976 file number=201 first dba=4357 block cnt=1 obj#=-1 tim=20569158174
WAIT #139925193433952: nam='direct path write temp' ela= 683 file number=201 first dba=4358 block cnt=1 obj#=-1 tim=20569159165
WAIT #139925193433952: nam='db file sequential read' ela= 638 file#=201 block#=4358 blocks=1 obj#=-1 tim=20569159850
WAIT #139925193433952: nam='direct path read temp' ela= 183 file number=201 first dba=4358 block cnt=1 obj#=-1 tim=20569160107
..
</pre>
My next thought was to test how 19c processes CLOBs in traditional tables - I did not find any issues there: there were no <b>direct path read/write temp</b> waits on heap-table copies of <b>GV$SQL</b>.<br/>
I then tried to select the same persistent tables (non-GV$) via a database link in <b>19c</b> - again, everything worked as it should.<br/>
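For reference, the checks mentioned above looked roughly like this (a sketch; the copy-table and database-link names are made up):
<pre class="brush:sql">
-- Copy GV$SQL into a heap table and re-run the CLOB fetch loop against it:
create table sql_copy as
select sql_id, sql_fulltext from v$sql;

alter session set events 'sql_trace wait=true';
begin
  for r in (select sql_fulltext from sql_copy) loop
    null;
  end loop;
end;
/
alter session set events 'sql_trace off';

-- The same table selected via a database link behaved the same way:
-- select sql_fulltext from sql_copy@remote_db;
</pre>
Neither variant produced any <b>direct path read/write temp</b> waits in the trace.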
It seems that only GV$ views are affected. Local selects are not affected at all (the output below is from the first instance):
<pre class="brush:sql">
SQL ID: 87myh1vhjdcf8 Plan Hash: 1891717107
SELECT SQL_FULLTEXT
FROM
GV$SQL WHERE INST_ID = 1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 13 0.02 0.02 0 0 0 1231
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 15 0.03 0.03 0 0 0 1231
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1231 1231 1231 PX COORDINATOR (cr=0 pr=0 pw=0 time=13516 us starts=1)
1231 1231 1231 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=13283 us starts=1 cost=0 size=1639700 card=100)
1231 1231 1231 VIEW GV$SQL (cr=0 pr=0 pw=0 time=12908 us starts=1)
1231 1231 1231 FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=12657 us starts=1 cost=0 size=1639700 card=100)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PGA memory operation 99 0.00 0.00
</pre>
Any select involving a non-local instance, however, goes through the temporary tablespace:
<pre class="brush:sql;highlight:[12,43,44,45]">
SQL ID: 4p7vm0hhw6u1v Plan Hash: 1891717107
SELECT SQL_FULLTEXT
FROM
GV$SQL WHERE INST_ID = 2
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 10 0.21 1.06 118 118 11953 968
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 12 0.21 1.07 118 118 11953 968
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
968 968 968 PX COORDINATOR (cr=118 pr=118 pw=1075 time=907430 us starts=1)
0 0 0 PX SEND QC (RANDOM) :TQ10000 (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1639700 card=100)
0 0 0 VIEW GV$SQL (cr=0 pr=0 pw=0 time=0 us starts=0)
0 0 0 FIXED TABLE FULL X$KGLCURSOR_CHILD (cr=0 pr=0 pw=0 time=0 us starts=0 cost=0 size=1639700 card=100)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: reap credit 158 0.00 0.00
PX Deq: Join ACK 1 0.00 0.00
PX Deq: Parse Reply 1 0.00 0.00
PGA memory operation 11 0.00 0.00
PX Deq: Execute Reply 61 0.00 0.01
Disk file operations I/O 3 0.00 0.00
CSS initialization 2 0.00 0.00
CSS operation: query 6 0.00 0.00
CSS operation: action 2 0.00 0.00
asynch descriptor resize 1 0.00 0.00
ASM IO for non-blocking poll 3208 0.00 0.00
direct path write temp 1029 0.00 0.81
db file sequential read 59 0.00 0.02
direct path read temp 59 0.00 0.02
reliable message 1 0.00 0.00
PX Deq: Signal ACK EXT 1 0.00 0.00
IPC send completion sync 1 0.00 0.00
PX Deq: Slave Session Stats 1 0.00 0.00
IPC group service call 1 0.00 0.00
enq: PS - contention 1 0.00 0.00
</pre>
I then shut one instance down and rechecked the queries against GV$ - there were no <b>direct path read/write temp</b> wait events.<br/><br/>
<h3>Conclusion</h3>
It is still not clear why Oracle started writing every CLOB selected from GV$ to a temporary tablespace. I tried several different GV$ views and observed the same behaviour everywhere. It looks like something changed inside those GV$ functions.<br/>
My initial thought was that it is some kind of temporary tablespace spill for persistence in case some parallel processes get terminated. However, parallel slaves started on remote instances send their blocks to the QC bypassing any temp.<br/>
Then I thought it might be done to reduce memory usage, but writing every single CLOB to temp makes little sense for that either - why not keep a small memory area and spill to temp only what exceeds some threshold?<br/>
I also do not know how to alter this behavior to make things work as they did before <b>18c</b>. I tested <b>11.2.0.4</b>, <b>12.2</b> (vanilla and with the latest RU), <b>18c</b>, and <b>19c</b> (vanilla and with the latest RU - <b>19.10</b>). Both <b>18c</b> and <b>19c</b> are affected: they wait on <b>direct path read/write temp</b> whenever a CLOB column is selected from a GV$ view with more than one instance running.<br/>
I will update the blog post if I get to the bottom of it and identify the root cause.
<h1>Strict Redaction Semantics</h1>
A DBMS_REDACT issue has recently popped up on <a href="https://www.sql.ru/forum/1325742/pravilnoe-ispolzovanie-dbms-redact-add-policy">SQL.RU</a>.
I constructed a simple test case to demonstrate the problem in 19.9:
<pre class="brush:sql">
set echo on lin 120
conn tc/tc@localhost/pdb
grant create table to scott identified by tiger;
alter user scott quota unlimited on users;
drop table scott.emp;
create table scott.emp(client_name varchar2(30));
insert into scott.emp values ('Larry Ellison');
insert into scott.emp values ('Jeff Bezos');
insert into scott.emp values ('Elon Musk');
insert into scott.emp values ('Vladimir Putin');
insert into scott.emp values (null);
exec dbms_redact.add_policy( -
object_schema => 'SCOTT', -
object_name => 'emp', -
column_name => 'CLIENT_NAME', -
policy_name => 'client_fio', -
function_type => dbms_redact.regexp, -
regexp_pattern => '(\S{3})(\S+)', -
regexp_replace_string => '\1***', -
expression => '1 = 1' -
)
col client_name for a16
col c1 for a16
col c2 like c1
col c3 like c1
create or replace view scott.vw as
select client_name
, nvl(client_name, 'zyzy') c1
, case when client_name='Vla*** Put***' then 'y' else client_name end c2
,'Hello '||client_name c3
from scott.emp;
grant alter session, create session to tc2 identified by tc2;
grant select on scott.vw to tc2;
conn tc2/tc2@localhost/pdb
select * from scott.vw;
</pre>
In a nutshell, there is a table that has a data redaction policy. There is a view built on top of it. However, when a user tries to select from the view, it returns an error:
<pre class="brush:sql">
SQL> select * from scott.vw;
select * from scott.vw
*
ERROR at line 1:
ORA-28094: SQL construct not supported by data redaction
</pre>
This restriction is mentioned in <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/asoag/oracle-data-redaction-use-with-oracle-database-features.html#GUID-FFCACD39-626A-4851-B4B2-D38225C91707">the Advanced Security Guide</a>:
<blockquote cite="https://docs.oracle.com/en/database/oracle/oracle-database/19/asoag/oracle-data-redaction-use-with-oracle-database-features.html#GUID-FFCACD39-626A-4851-B4B2-D38225C91707">
To avoid the ORA-28094 error, ensure that the query has the following properties:
<br/>
In a CREATE VIEW definition or an inline view, there cannot be any SQL expression that involves a redacted column.
</blockquote>
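Taken literally, the restriction also hints at a supported workaround: keep the view itself free of expressions on the redacted column and move the expressions into the outer query. A sketch based on the test case above (I have not exhaustively verified it):
<pre class="brush:sql">
-- The view exposes the redacted column as-is...
create or replace view scott.vw_plain as
select client_name from scott.emp;

-- ...and the expressions move into the top-level query,
-- which is outside the scope of the documented restriction:
select nvl(client_name, 'zyzy') c1,
       'Hello '||client_name c3
  from scott.vw_plain;
</pre>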
When I come across such restrictions, it is always interesting to explore whether there is a way around them. Let us enable Data Redaction tracing and rerun the query:
<pre class="brush:sql">
SQL> alter session set events 'trace[radm]';
Session altered.
SQL> select * from scott.vw;
select * from scott.vw
*
ERROR at line 1:
ORA-28094: SQL construct not supported by data redaction
</pre>
Here is the trace file:
<pre class="brush:sql;highlight:[50]">
kzdmpci: Table obj# is: 23656
kzdmpci: Policy Name is: client_fio
kzdmpci: Policy Expression ID radm_mc$.pe# = 5001
kzdmpci: intcol# is: 1
kzdmpci: Masking Function is: 5
kzdmpci: RegExp pattern: (\S{3})(\S+) (len=12)
kzdmpci: RegExp replace: \1*** (len=5)
kzdmpci: RegExp match : (len=0), mpl=17
kzdmpci: no RADM policies exist, starting new RADM Policy Chain.
kzdmpci: new RADM Policy Chain ci->kkscdmpc=0x12da37ec started with obj#=23656
kzdmpci:2: radm_pe$ lookup: for Policy Expression ID pe# 5001, pe_pexpr is [1 = 1], pe_name = [], pe_obj# = 23656
kzdmpci: added column 1 for object 23656
kzradm_vc_replace_expr_allowed: policy on table EMP
kzdmrfs: entered with opntyp of OPNTCOL
kzdmrfs: found redacted col CLIENT_NAME at 7
kzradm_vc_replace_expr_allowed: 0
kzdmchkerpna: kzpcxp(KZSERP) gave FALSE, no bypass.
kzdmchkerpna: kzpcxp(KZSERP) gave FALSE, no bypass.
kzdmchkerpna: kzpcxp(KZSERP) gave FALSE, no bypass.
kzdmchkerpna: kzpcxp(KZSERP) gave FALSE, no bypass.
kzdmchkerp: kzpcsp(KZSERP) gave FALSE, no bypass.
kzdmprqb: entered
kzdminsc: entered with opntyp=OPNTCOL
kzdminsc: view column CLIENT_NAME with KCCF2MASKDEP detected, calling kzdmpvc
kzdmpvc: level 0, column 'CLIENT_NAME' has KCCF2MASKDEP flag
kzdmpvc: level 0, calling kzdmcrmo to insert MASK operator around view column with name 'CLIENT_NAME'
kzdmcrmo: the pe# is 5001
kzdmcrmo: entered, creating MASK operator for column CLIENT_NAME of table EMP
kzdmcrmo: policy chain = 0x6319aa30
kzdmcrmo: obj# = 23656, intcol# = 1
kzdmcrmo: New ctxmask chain pgactx->ctxmask=0x62485db0 started with obj#=23656
kzdmcrmo: found object 23656
kzdmcrmo: found column 1, metadata=0x6319a988
kzdmcrmo: New ctxmask_re chain 0x62485d80 started with col=(23656,1)
kzdmcrmo: finished creating MASK operator
kzdmpvc: level 0, kzdmcrmo inserted MASK operator.
kzdminsc: entered with opntyp=OPNTCOL
kzdminsc: view column C1 with KCCF2MASKDEP detected, calling kzdmpvc
kzdmpvc: level 0, column 'C1' has KCCF2MASKDEP flag
kzdmpvc: level 0, processing SQL expression...
kzdmpvc: strict semantics disallow expression in VIEW
kzdmchkerpna: kzpcxp(KZSERP) gave FALSE, no bypass.
</pre>
This <b><i>strict semantics disallow expression in VIEW</i></b> bit sounded intriguing, so I decided to research what it means and how to change it:
<pre class="brush:sql">
SQL> select ksppinm, ksppdesc from x$ksppi where ksppinm like '%redaction%';
KSPPINM KSPPDESC
------------------------------ ----------------------------------------
_strict_redaction_semantics Strict Data Redaction semantics
</pre>
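The query above only shows the name and description; the commonly used sketch below also joins <code>X$KSPPCV</code> to see the current value (run as SYS):
<pre class="brush:sql">
select i.ksppinm  name,
       v.ksppstvl value,
       v.ksppstdf is_default
  from x$ksppi  i,
       x$ksppcv v
 where i.indx = v.indx
   and i.ksppinm = '_strict_redaction_semantics';
</pre>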
I set the parameter to <code>FALSE</code> and bounced the database:
<pre class="brush:sql">
SQL> alter system set "_strict_redaction_semantics"=false scope=spfile;
System altered.
$ srvctl stop database -db orcl ; srvctl start database -db orcl
</pre>
Sure enough, that is what I got after selecting the data from the view:
<pre class="brush:sql">
SQL> select * from scott.vw;
CLIENT_NAME C1 C2 C3
---------------- ------------------- ---------------- -------------------
Lar*** Ell*** Lar*** Ell*** Lar*** Ell*** Hello Lar*** Ell***
Jef*** Bez*** Jef*** Bez*** Jef*** Bez*** Hello Jef*** Bez***
Elo*** Mus*** Elo*** Mus*** Elo*** Mus*** Hello Elo*** Mus***
Vla*** Put*** Vla*** Put*** y Hello Vla*** Put***
zyzy Hello
</pre>
<h1>Conclusion</h1>
Oracle obviously makes such restrictions on purpose. There is not much information about this parameter, but there are presumably corner cases due to which Oracle decided to keep the redaction semantics strict by default. Oracle may make this functionality officially available in a future release once it is ready for production use.
<h1>Oracle to PostgreSQL Replication: Numbers Truncated to 6 Decimal Places</h1>
I configured an initial load from Oracle to PostgreSQL and found that some numbers were truncated after the load, e.g. the Oracle number <code>123.0123456789</code> ended up as <code>123.012346</code> in PostgreSQL.<br/>
<h1>Setup</h1>
<h3>Source: Oracle Goldengate 19.1.0.0.4</h3>
<pre class="brush:sql">
GGSCI (ogg-hub as c##ggadmin@orcl/CDB$ROOT) 9> versions
Operating System:
Linux
Version #2 SMP Fri Mar 29 17:05:02 PDT 2019, Release 4.1.12-124.26.7.el7uek.x86_64
Node: ogg-hub
Machine: Linux
Database:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
</pre>
<h3>Target: Oracle GoldenGate 19.1.0.0.200714 for PostgreSQL</h3>
<pre class="brush:sql">
GGSCI (ogg-hub as ggadmin@testdb) 8> versions
Operating System:
Linux
Version #2 SMP Fri Mar 29 17:05:02 PDT 2019, Release 4.1.12-124.26.7.el7uek.x86_64
Node: ogg-hub
Machine: Linux
Database:
PostgreSQL
Version 12.06.0000 PostgreSQL 12.6
ODBC Version 03.52.0000
Driver Information:
GGpsql25.so
Version 07.16.0353 (B0519, U0369)
ODBC Version 03.52
</pre>
<h3>odbc.ini</h3>
<pre class="brush:sql">
[ODBC Data Sources]
testdb=PostgreSQL on testdb
[ODBC]
IANAAppCodePage=106
InstallDir=/u01/app/oracle/product/19/ogg_pg_home1
[testdb]
Driver=/u01/app/oracle/product/19/ogg_pg_home1/lib/GGpsql25.so
Description=PostgreSQL 12.6
Database=testdb
HostName=postgres
PortNumber=5432
TransactionErrorBehavior=2
</pre>
<h1>Initial Data</h1>
There is a table in the Oracle database:
<pre class="brush:sql">
SQL> create table tc.test(
2 id int primary key,
3 n1 number,
4 n2 number);
Table created.
SQL>
SQL> insert into tc.test values (0, 123.0123456789, 123.0123456789);
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> select * from tc.test;
ID N1 N2
--------------- --------------- ---------------
0 123.0123456789 123.0123456789
</pre>
The corresponding PostgreSQL table is this:
<pre class="brush:sql">
testdb=# create table tc.test(
testdb(# id bigint primary key,
testdb(# n1 numeric,
testdb(# n2 numeric(38,10));
CREATE TABLE
testdb=# \d tc.test
Table "tc.test"
Column | Type | Collation | Nullable | Default
--------+----------------+-----------+----------+---------
id | bigint | | not null |
n1 | numeric | | |
n2 | numeric(38,10) | | |
Indexes:
"test_pkey" PRIMARY KEY, btree (id)
</pre>
<h1>GoldenGate Initial Load Processes</h1>
<h3>Extract</h3>
<pre class="brush:sql">
GGSCI (ogg-hub) 1> view params einit
EXTRACT einit
USERIDALIAS oggadmin
RMTHOST ogg-hub, MGRPORT 7810
RMTTASK REPLICAT, GROUP rinit
TABLE pdb.tc.test;
</pre>
<h3>Replicat</h3>
<pre class="brush:sql">
GGSCI (ogg-hub) 1> view params rinit
REPLICAT rinit
SETENV (PGCLIENTENCODING="UTF8")
SETENV (ODBCINI="/home/ogg/pg/odbc.ini" )
TARGETDB testdb, USERIDALIAS ggadmin
DISCARDFILE ./dirrpt/rinit.dsc, PURGE
MAP pdb.*.*, TARGET *.*;
</pre>
<h1>Performing Initial Load</h1>
<h3>Extract</h3>
<pre class="brush:sql">
GGSCI (ogg-hub) 1> start extract einit
Sending START request to MANAGER ...
EXTRACT EINIT starting
</pre>
<h3>Replicat Report File</h3>
<pre class="brush:sql">
***********************************************************************
Oracle GoldenGate Delivery for PostgreSQL
Version 19.1.0.0.200714 OGGCORE_19.1.0.0.0OGGBP_PLATFORMS_200628.2141
Linux, x64, 64bit (optimized), PostgreSQL on Jun 29 2020 04:06:46
Copyright (C) 1995, 2019, Oracle and/or its affiliates. All rights reserved.
Starting at 2021-03-08 18:05:04
***********************************************************************
Operating System Version:
Linux
Version #2 SMP Fri Mar 29 17:05:02 PDT 2019, Release 4.1.12-124.26.7.el7uek.x86_64
Node: ogg-hub
Machine: x86_64
soft limit hard limit
Address Space Size : unlimited unlimited
Heap Size : unlimited unlimited
File Size : unlimited unlimited
CPU Time : unlimited unlimited
Process id: 15745
Description:
***********************************************************************
** Running with the following parameters **
***********************************************************************
2021-03-08 18:05:09 INFO OGG-03059 Operating system character set identified as UTF-8.
2021-03-08 18:05:09 INFO OGG-02695 ANSI SQL parameter syntax is used for parameter parsing.
2021-03-08 18:05:09 INFO OGG-02095 Successfully set environment variable PGCLIENTENCODING=UTF8.
2021-03-08 18:05:09 INFO OGG-02095 Successfully set environment variable ODBCINI=/home/ogg/pg/odbc.ini.
2021-03-08 18:05:09 INFO OGG-01360 REPLICAT is running in Remote Task mode.
REPLICAT rinit
SETENV (PGCLIENTENCODING="UTF8")
SETENV (ODBCINI="/home/ogg/pg/odbc.ini" )
TARGETDB testdb, USERIDALIAS ggadmin
2021-03-08 18:05:09 INFO OGG-03036 Database character set identified as UTF-8. Locale: en_US.UTF-8.
2021-03-08 18:05:09 INFO OGG-03037 Session character set identified as UTF-8.
DISCARDFILE ./dirrpt/rinit.dsc, purge
MAP pdb.*.*, TARGET *.*;
2021-03-08 18:05:09 INFO OGG-01815 Virtual Memory Facilities for: COM
anon alloc: mmap(MAP_ANON) anon free: munmap
file alloc: mmap(MAP_SHARED) file free: munmap
target directories:
/u01/app/oracle/product/19/ogg_pg_home1/dirtmp.
2021-03-08 18:05:09 INFO OGG-25340
Database Version:
PostgreSQL
Version 12.06.0000 PostgreSQL 12.6
ODBC Version 03.52.0000
Driver Information:
GGpsql25.so
Version 07.16.0353 (B0519, U0369)
ODBC Version 03.52.
2021-03-08 18:05:09 INFO OGG-25341
Database Language and Character Set:
SERVER_ENCODING = "UTF8"
LC_CTYPE = "en_US.UTF-8"
TIMEZONE = "Europe/Dublin".
***********************************************************************
** Run Time Messages **
***********************************************************************
2021-03-08 18:05:10 INFO OGG-06506 Wildcard MAP resolved (entry pdb.*.*): MAP "PDB"."TC"."TEST", TARGET "TC"."TEST".
2021-03-08 18:05:10 INFO OGG-02756 The definition for table PDB.TC.TEST is obtained from the trail file.
2021-03-08 18:05:10 INFO OGG-06511 Using following columns in default map by name: id, n1, n2.
2021-03-08 18:05:10 INFO OGG-06510 Using the following key columns for target table tc.test: id.
</pre>
<h3>PostgreSQL</h3>
<pre class="brush:sql;highlight:[4,12]">
testdb=# select * from tc.test;
id | n1 | n2
----+------------+----------------
0 | 123.012346 | 123.0123456789
(1 row)
testdb=# \d tc.test
Table "tc.test"
Column | Type | Collation | Nullable | Default
--------+----------------+-----------+----------+---------
id | bigint | | not null |
n1 | numeric | | |
n2 | numeric(38,10) | | |
Indexes:
"test_pkey" PRIMARY KEY, btree (id)
</pre>
It can be seen that the column <code>n1</code> has value <b>123.012346</b> rather than the expected value <b>123.0123456789</b>.<br/>
It is also worth noting that the column <code>n2</code>, defined as <code>numeric(38,10)</code>, retains the proper value as it is in the source Oracle database.
<h1>Thoughts and Fix</h1>
The issue itself is that the number was truncated to 6 decimal places. It is quite easy to find that the same issue was reported back in 2018 and is still not resolved: <a href="https://community.oracle.com/mosc/discussion/4126912/golden-gate-truncating-decimal-places-for-number-data-type-post-migration">Golden Gate truncating decimal places for number data type post migration</a>.<br/><br/>
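A quick way to rule out PostgreSQL itself is to insert the same value directly with psql - the unbounded <code>numeric</code> column keeps full precision (a sketch; the id value is arbitrary):
<pre class="brush:sql">
testdb=# insert into tc.test values (99, 123.0123456789, 123.0123456789);
INSERT 0 1
testdb=# select n1 from tc.test where id = 99;
       n1
----------------
 123.0123456789
(1 row)
</pre>
So the precision loss has to happen somewhere on the replication side rather than in the database server.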
When I first encountered the issue, I tried to identify which component was causing it: GoldenGate for Oracle, GoldenGate for PostgreSQL, or maybe PostgreSQL itself.
GoldenGate for PostgreSQL uses the Progress DataDirect driver, which has an interesting setting: <a href="https://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2Funbounded-numeric-scale.html%23">Unbounded Numeric Scale</a>, with a default value of <b>6</b>. That looked promising, so I added the parameter to <b>odbc.ini</b>:
<pre class="brush:sql;highlight:[8]">
[testdb]
Driver=/u01/app/oracle/product/19/ogg_pg_home1/lib/GGpsql25.so
Description=PostgreSQL 12.6
Database=testdb
HostName=ora
PortNumber=5432
TransactionErrorBehavior=2
UnboundedNumericScale=10
</pre>
Having rerun the load from scratch, I finally got the correct number in PostgreSQL:
<pre class="brush:sql;highlight:[4]">
testdb=# select * from tc.test;
id | n1 | n2
----+----------------+----------------
0 | 123.0123456789 | 123.0123456789
(1 row)
</pre>
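Incidentally, the "truncated" value observed earlier is really the source value rounded at the 6th decimal place (<code>...45|6789</code> rounds up to <code>...46</code>), which is consistent with the driver's default scale of <b>6</b>. The same result can be reproduced with a plain cast in psql:
<pre class="brush:sql">
testdb=# select 123.0123456789::numeric(38,6);
  numeric
------------
 123.012346
(1 row)
</pre>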
<h1>Conclusion</h1>
The ODBC driver used by GoldenGate truncates unbounded numeric columns to 6 decimal places by default. Thankfully, there is an option to change that: <a href="https://media.datadirect.com/download/docs/odbc/allodbc/index.html#page/odbc%2Funbounded-numeric-scale.html%23">Unbounded Numeric Scale</a>. Still, I am inclined to think that it is usually more reliable to specify an explicit scale in the target database than to rely on such settings.
<h1>Create Standby Skip Pluggable Database</h1>
<h2>Goal</h2>
Let us assume there is a need to create a standby database that excludes some pluggable databases of the primary, i.e. those PDBs must not be restored during standby database creation, and the standby must ignore any redo related to them.<br/>
It is known that the DUPLICATE command does not allow that: <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/rcmrf/DUPLICATE.html#d43747e1626">Backup and Recovery Reference</a>
<blockquote cite="https://docs.oracle.com/en/database/oracle/oracle-database/19/rcmrf/DUPLICATE.html#d43747e1626">
<b>Note:</b> RMAN does not support partial PDB duplication. Therefore, you cannot use the <code>SKIP TABLESPACE</code>, <code>TABLESPACE</code>, <code>SKIP PLUGGABLE DATABASE</code>, and <code>PLUGGABLE DATABASE</code> options when creating a standby database.
</blockquote>
This blog post describes how that can be accomplished.
<h2>Idea</h2>
<ul>
<li>Skip the tablespaces of the pluggable databases that should not be protected by Data Guard: <code>SKIP [FOREVER] TABLESPACE</code> clause of the <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/rcmrf/RESTORE.html#d82198e2317">RESTORE</a> command can be used</li>
<li>Drop the excluded data files on the standby</li>
<li>Set <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/ENABLED_PDBS_ON_STANDBY.html#GUID-A2EC2381-D9C0-4612-87C9-A225C7BCE8C4">ENABLED_PDBS_ON_STANDBY</a> to apply redo only to certain PDBs</li>
</ul>
<h2>Setup</h2>
<h3>Primary and Standby</h3>
<ul>
<li>db_name: <code>orcl</code></li>
<li>Oracle version: <code>19.10</code></li>
</ul>
<h3>Primary</h3>
<ul>
<li>Hostname: <code>servera</code></li>
<li>db_unique_name: <code>orcla</code></li>
<li>instance_name: <code>orcla1</code></li>
<li>Pluggable Databases: <code>PDB1-3</code></li>
</ul>
<h3>Standby</h3>
<ul>
<li>Hostname: <code>serverb</code></li>
<li>db_unique_name: <code>orclb</code></li>
<li>instance_name: <code>orclb1</code></li>
<li>Pluggable Database: <code>PDB1</code></li>
</ul>
<h3>Creating Primary</h3>
The primary database was created by the following DBCA command:
<pre class="brush:sql">
dbca -silent -createDatabase \
-responseFile NO_VALUE \
-gdbName orcl \
-sid orcla1 \
-templateName New_Database.dbt \
-createAsContainerDatabase true \
-numberOfPDBs 3 \
-pdbName pdb \
-pdbAdminPassword Oracle123 \
-sysPassword Oracle123 \
-systemPassword Oracle123 \
-recoveryAreaDestination +FRA \
-recoveryAreaSize 10000 \
-storageType ASM \
-datafileDestination +DATA \
-enableArchive true \
-memoryMgmtType AUTO_SGA \
-dbOptions "APEX:false,DV:false,CWMLITE:false,IMEDIA:false,JSERVER:false,OMS:false,ORACLE_TEXT:false,SAMPLE_SCHEMA:false,SPATIAL:false" \
-initParams "db_unique_name=orcla,dg_broker_start=true"
</pre>
The actual output from the terminal:
<pre class="brush:sql">
[oracle@servera ~]$ dbca -silent -createDatabase \
> -responseFile NO_VALUE \
> -gdbName orcl \
> -sid orcla1 \
> -templateName New_Database.dbt \
> -createAsContainerDatabase true \
> -numberOfPDBs 3 \
> -pdbName pdb \
> -pdbAdminPassword Oracle123 \
> -sysPassword Oracle123 \
> -systemPassword Oracle123 \
> -recoveryAreaDestination +FRA \
> -recoveryAreaSize 10000 \
> -storageType ASM \
> -datafileDestination +DATA \
> -enableArchive true \
> -memoryMgmtType AUTO_SGA \
> -dbOptions "APEX:false,DV:false,CWMLITE:false,IMEDIA:false,JSERVER:false,OMS:false,ORACLE_TEXT:false,SAMPLE_SCHEM
A:false,SPATIAL:false" \
> -initParams "db_unique_name=orcla,dg_broker_start=true"
[WARNING] [DBT-06801] Specified Fast Recovery Area size (10,000 MB) is less than the recommended value.
CAUSE: Fast Recovery Area size should at least be three times the database size (8,897 MB).
ACTION: Specify Fast Recovery Area Size to be at least three times the database size.
[WARNING] [DBT-06801] Specified Fast Recovery Area size (10,000 MB) is less than the recommended value.
CAUSE: Fast Recovery Area size should at least be three times the database size (4,502 MB).
ACTION: Specify Fast Recovery Area Size to be at least three times the database size.
Prepare for db operation
6% complete
Registering database with Oracle Restart
10% complete
Creating and starting Oracle instance
12% complete
16% complete
Creating database files
17% complete
23% complete
Creating data dictionary views
25% complete
29% complete
32% complete
34% complete
36% complete
38% complete
42% complete
Creating cluster database views
43% complete
55% complete
Completing Database Creation
59% complete
61% complete
Creating Pluggable Databases
65% complete
69% complete
73% complete
81% complete
Executing Post Configuration Actions
100% complete
Database creation complete. For details check the logfiles at:
/u01/app/oracle/cfgtoollogs/dbca/orcla.
Database Information:
Global Database Name:orcla
System Identifier(SID):orcla1
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/orcla/orcla.log" for further details.
[oracle@servera ~]$
</pre>
Three pluggable databases were created: <code>PDB1-3</code>. <code>PDB1</code> should be replicated as usual, while <code>PDB2</code> and <code>PDB3</code> should not be available on the standby.
<h2>Creating Standby</h2>
<h3>Adding Static Service to Listener</h3>
I edited <code>/u01/app/19.3.0/grid/network/admin/listener.ora</code> and added a static registration for my standby instance:
<pre class="brush:sql">
SID_LIST_LISTENER=
(SID_LIST=
(SID_DESC=
(GLOBAL_DBNAME=orclb)
(SID_NAME=orclb1)
(ORACLE_HOME=/u01/app/oracle/product/19.3.0/dbhome_1)
)
)
</pre>
<h3>Copying password file</h3>
<pre class="brush:sql">
[oracle@serverb ~]$ scp servera:$ORACLE_HOME/dbs/orapworcla1 $ORACLE_HOME/dbs/orapworclb1
Warning: Permanently added 'servera' (ECDSA) to the list of known hosts.
orapworcla1 100% 2048 860.9KB/s 00:00
</pre>
<h3>Bootstrap init.ora</h3>
<pre class="brush:sql;highlight:[8]">
db_name=orcl
db_unique_name=orclb
instance_name=orclb1
enable_pluggable_database=true
db_recovery_file_dest=+FRA
db_recovery_file_dest_size=10G
dg_broker_start=true
enabled_pdbs_on_standby=pdb1
</pre>
Only <code>PDB1</code> will be enabled on the standby database.
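As an aside, my reading of the documented matching rules for this parameter (a comma-separated list of glob-style patterns, with a leading minus sign excluding matches) can be sketched as follows. This is only an illustration of my understanding, not Oracle's actual implementation:

```shell
#!/bin/sh
# Sketch: is a PDB enabled under an ENABLED_PDBS_ON_STANDBY-style value?
# Assumed rule: a PDB is enabled if it matches a positive pattern and
# no "-"-prefixed negative pattern; comparison is case-insensitive.
pdb_enabled() {  # $1 = PDB name, $2 = parameter value
  set -f       # disable filename globbing; case still pattern-matches
  name=$(echo "$1" | tr 'A-Z' 'a-z')
  enabled=no
  for pat in $(echo "$2" | tr 'A-Z,' 'a-z '); do
    case $pat in
      -*) case $name in ${pat#-}) echo no; return;; esac ;;
      *)  case $name in $pat) enabled=yes;; esac ;;
    esac
  done
  echo $enabled
}
pdb_enabled PDB1 'pdb1'       # -> yes
pdb_enabled PDB2 'pdb1'       # -> no
pdb_enabled PDB3 '*,-pdb3'    # -> no
pdb_enabled PDB4 '*,-pdb3'    # -> yes
```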
<h3>Starting Standby Instance</h3>
<pre class="brush:sql">
[oracle@serverb ~]$ sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Fri Feb 26 15:00:55 2021
Version 19.10.0.0.0
Copyright (c) 1982, 2020, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup nomount pfile=/tmp/init.ora
ORACLE instance started.
Total System Global Area 457176832 bytes
Fixed Size 9135872 bytes
Variable Size 306184192 bytes
Database Buffers 134217728 bytes
Redo Buffers 7639040 bytes
SQL> Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.10.0.0.0
</pre>
<h3>Restoring Standby Control File and Mounting Standby Database</h3>
<pre class="brush:sql">
[oracle@serverb ~]$ rman
Recovery Manager: Release 19.0.0.0.0 - Production on Fri Feb 26 15:01:04 2021
Version 19.10.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
RMAN> connect target sys/Oracle123@servera/orcla
connected to target database: ORCL (DBID=1593309578)
RMAN> connect auxiliary sys/Oracle123@serverb/orclb
connected to auxiliary database: ORCL (not mounted)
RMAN> run
{
sql clone 'create spfile from memory';
shutdown clone immediate;
startup clone nomount;
restore clone from service 'servera/orcla' standby controlfile;
}
2> 3> 4> 5> 6> 7>
using target database control file instead of recovery catalog
sql statement: create spfile from memory
Oracle instance shut down
connected to auxiliary database (not started)
Oracle instance started
Total System Global Area 457176832 bytes
Fixed Size 9135872 bytes
Variable Size 306184192 bytes
Database Buffers 134217728 bytes
Redo Buffers 7639040 bytes
Starting restore at 26.02.2021 15:02:27
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=76 device type=DISK
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:02
output file name=+FRA/ORCLB/CONTROLFILE/current.256.1065538951
Finished restore at 26.02.2021 15:02:31
RMAN>
RMAN> run
{
sql clone 'alter database mount standby database';
}
2> 3> 4>
sql statement: alter database mount standby database
RMAN>
</pre>
<h3>Restoring Standby Database</h3>
Here are the primary data files:
<pre class="brush:sql;highlight:[22,23,24,25,26,27,28,29]">
alter session set "_exclude_seed_cdb_view"=false;
select p.pdb_name pdb_name, f.file_id, f.file_name
from cdb_data_files f, dba_pdbs p
where p.pdb_id(+) = f.con_id
order by f.file_id, p.pdb_id;
PDB_NAME FILE_ID FILE_NAME
---------- ---------- --------------------------------------------------------------------------------
1 +DATA/ORCLA/DATAFILE/system.260.1065528853
PDB$SEED 2 +DATA/ORCLA/BC3D34CE047E62F0E0530600000AF770/DATAFILE/system.261.1065528859
3 +DATA/ORCLA/DATAFILE/sysaux.262.1065528863
PDB$SEED 4 +DATA/ORCLA/BC3D34CE047E62F0E0530600000AF770/DATAFILE/sysaux.263.1065528865
5 +DATA/ORCLA/DATAFILE/undotbs1.264.1065528867
PDB$SEED 6 +DATA/ORCLA/BC3D34CE047E62F0E0530600000AF770/DATAFILE/undotbs1.265.1065528869
7 +DATA/ORCLA/DATAFILE/users.268.1065528883
PDB1 8 +DATA/ORCLA/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/system.272.1065531169
PDB1 9 +DATA/ORCLA/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/sysaux.270.1065531169
PDB1 10 +DATA/ORCLA/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/undotbs1.271.1065531169
PDB1 11 +DATA/ORCLA/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/users.274.1065531177
PDB2 12 +DATA/ORCLA/BC3DBF5BB0310A08E0530600000A91B8/DATAFILE/system.277.1065531179
PDB2 13 +DATA/ORCLA/BC3DBF5BB0310A08E0530600000A91B8/DATAFILE/sysaux.275.1065531179
PDB2 14 +DATA/ORCLA/BC3DBF5BB0310A08E0530600000A91B8/DATAFILE/undotbs1.276.1065531179
PDB2 15 +DATA/ORCLA/BC3DBF5BB0310A08E0530600000A91B8/DATAFILE/users.279.1065531185
PDB3 16 +DATA/ORCLA/BC3DBFD5112A0AB0E0530600000AE27E/DATAFILE/system.282.1065531187
PDB3 17 +DATA/ORCLA/BC3DBFD5112A0AB0E0530600000AE27E/DATAFILE/sysaux.280.1065531187
PDB3 18 +DATA/ORCLA/BC3DBFD5112A0AB0E0530600000AE27E/DATAFILE/undotbs1.281.1065531187
PDB3 19 +DATA/ORCLA/BC3DBFD5112A0AB0E0530600000AE27E/DATAFILE/users.284.1065531193
</pre>
All data files of pluggable databases <code>PDB2</code> and <code>PDB3</code> should not be restored on the standby site, so I want to skip data files 12-19.<br/>
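The SKIP FOREVER TABLESPACE list can also be generated from the excluded PDB names. A small sketch, assuming each excluded PDB contains only the four standard tablespaces seen in the listing above:

```shell
#!/bin/sh
# Build the SKIP FOREVER TABLESPACE clause for the RMAN RESTORE command.
# Assumption (matches this setup): each excluded PDB has only the
# SYSAUX, SYSTEM, UNDOTBS1 and USERS tablespaces.
skip_clause() {  # $@ = excluded PDB names
  clause=""
  for pdb in "$@"; do
    for ts in sysaux system undotbs1 users; do
      [ -n "$clause" ] && clause="$clause, "
      clause="$clause$pdb:$ts"
    done
  done
  echo "skip forever tablespace $clause"
}
skip_clause pdb2 pdb3
```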
The command to restore the standby database and its output are as follows:
<pre class="brush:sql">
run
{
set newname for tempfile 1 to '+DATA';
set newname for tempfile 2 to '+DATA';
set newname for tempfile 3 to '+DATA';
set newname for tempfile 4 to '+DATA';
set newname for tempfile 5 to '+DATA';
switch clone tempfile all;
set newname for database to '+DATA';
restore
from service 'servera/orcla' clone database
skip forever tablespace
pdb2:sysaux, pdb2:system, pdb2:undotbs1, pdb2:users,
pdb3:sysaux, pdb3:system, pdb3:undotbs1, pdb3:users
;
}
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
executing command: SET NEWNAME
renamed tempfile 1 to +DATA in control file
renamed tempfile 2 to +DATA in control file
renamed tempfile 3 to +DATA in control file
renamed tempfile 4 to +DATA in control file
renamed tempfile 5 to +DATA in control file
executing command: SET NEWNAME
Starting restore at 26.02.2021 15:03:50
using channel ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00001 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00002 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00003 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00004 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00005 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00006 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:04
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00007 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00008 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00009 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00010 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03
channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service servera/orcla
channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_AUX_DISK_1: restoring datafile 00011 to +DATA
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 26.02.2021 15:04:33
RMAN>
</pre>
It can be seen that data files 12-19 were not restored.
<h3>Updating Standby Control File</h3>
I reconnect to the standby database and run <code>SWITCH TO COPY</code> commands:
<pre class="brush:sql">
RMAN>
Recovery Manager complete.
[oracle@serverb ~]$ rman target sys/Oracle123@serverb/orclb
Recovery Manager: Release 19.0.0.0.0 - Production on Fri Feb 26 15:06:55 2021
Version 19.10.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCL (DBID=1593309578, not open)
RMAN> switch database root to copy;
Starting implicit crosscheck backup at 26.02.2021 15:06:59
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=76 device type=DISK
Crosschecked 1 objects
Finished implicit crosscheck backup at 26.02.2021 15:06:59
Starting implicit crosscheck copy at 26.02.2021 15:06:59
using channel ORA_DISK_1
Crosschecked 11 objects
Finished implicit crosscheck copy at 26.02.2021 15:07:00
searching for all files in the recovery area
cataloging files...
no files cataloged
datafile 1 switched to datafile copy "+DATA/ORCLB/DATAFILE/system.257.1065539031"
datafile 3 switched to datafile copy "+DATA/ORCLB/DATAFILE/sysaux.258.1065539045"
datafile 5 switched to datafile copy "+DATA/ORCLB/DATAFILE/undotbs1.259.1065539055"
datafile 7 switched to datafile copy "+DATA/ORCLB/DATAFILE/users.260.1065539063"
RMAN> switch pluggable database "PDB$SEED", pdb1 to copy;
datafile 2 switched to datafile copy "+DATA/ORCLB/BC3D34CE047E62F0E0530600000AF770/DATAFILE/system.261.1065539039"
datafile 4 switched to datafile copy "+DATA/ORCLB/BC3D34CE047E62F0E0530600000AF770/DATAFILE/sysaux.262.1065539053"
datafile 6 switched to datafile copy "+DATA/ORCLB/BC3D34CE047E62F0E0530600000AF770/DATAFILE/undotbs1.256.1065539059"
datafile 8 switched to datafile copy "+DATA/ORCLB/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/system.263.1065539063"
datafile 9 switched to datafile copy "+DATA/ORCLB/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/sysaux.264.1065539067"
datafile 10 switched to datafile copy "+DATA/ORCLB/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/undotbs1.265.1065539069"
datafile 11 switched to datafile copy "+DATA/ORCLB/BC3DBECD0A3C07D3E0530600000A8D82/DATAFILE/users.266.1065539073"
</pre>
<h3>Dropping Excluded Data Files</h3>
I can drop all data files of pluggable databases <code>PDB2</code> and <code>PDB3</code> on the standby:
<pre class="brush:sql">
SQL> alter session set container=pdb2;
Session altered.
SQL> alter database datafile 12,13,14,15 offline drop;
Database altered.
SQL> alter session set container=pdb3;
Session altered.
SQL> alter database datafile 16,17,18,19 offline drop;
</pre>
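When there are many excluded PDBs, the OFFLINE DROP statements can be scripted. A minimal sketch using the file numbers from the listing above:

```shell
#!/bin/sh
# Emit the ALTER DATABASE ... OFFLINE DROP statements for the excluded
# data files, one batch per PDB (file numbers taken from cdb_data_files).
offline_drop_sql() {  # $1 = container, $2 = comma-separated file numbers
  printf 'alter session set container=%s;\n' "$1"
  printf 'alter database datafile %s offline drop;\n' "$2"
}
offline_drop_sql pdb2 12,13,14,15
offline_drop_sql pdb3 16,17,18,19
```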
<h3>Creating Data Guard Configuration</h3>
<pre class="brush:sql">
[oracle@serverb ~]$ dgmgrl
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Fri Feb 26 15:13:12 2021
Version 19.10.0.0.0
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/Oracle123@servera/orcla
Connected to "orcla"
Connected as SYSDBA.
DGMGRL>
DGMGRL> create configuration orcl as primary database is orcla connect identifier is '//servera/orcla';
Configuration "orcl" created with primary database "orcla"
DGMGRL> add database orclb as connect identifier is '//serverb/orclb';
Database "orclb" added
DGMGRL> enable configuration;
Enabled.
DGMGRL> show configuration;
Configuration - orcl
Protection Mode: MaxPerformance
Members:
orcla - Primary database
Warning: ORA-16789: standby redo logs configured incorrectly
orclb - Physical standby database
Warning: ORA-16809: multiple warnings detected for the member
Fast-Start Failover: Disabled
Configuration Status:
WARNING (status updated 50 seconds ago)
DGMGRL> show database verbose orclb;
Database - orclb
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 14 seconds (computed 5 seconds ago)
Apply Lag: 14 seconds (computed 5 seconds ago)
Average Apply Rate: 3.65 MByte/s
Active Apply Rate: 0 Byte/s
Maximum Apply Rate: 0 Byte/s
Real Time Query: OFF
Instance(s):
orclb1
Database Warning(s):
ORA-16789: standby redo logs configured incorrectly
Properties:
DGConnectIdentifier = '//serverb/orclb'
ObserverConnectIdentifier = ''
FastStartFailoverTarget = ''
PreferredObserverHosts = ''
LogShipping = 'ON'
RedoRoutes = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyLagThreshold = '30'
TransportLagThreshold = '30'
TransportDisconnectedThreshold = '30'
ApplyParallel = 'AUTO'
ApplyInstances = '0'
StandbyFileManagement = ''
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '0'
LogArchiveMinSucceedDest = '0'
DataGuardSyncLatency = '0'
LogArchiveTrace = '0'
LogArchiveFormat = ''
DbFileNameConvert = ''
LogFileNameConvert = ''
ArchiveLocation = ''
AlternateLocation = ''
StandbyArchiveLocation = ''
StandbyAlternateLocation = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
LogXptStatus = '(monitor)'
SendQEntries = '(monitor)'
RecvQEntries = '(monitor)'
HostName = 'serverb.example.com'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=serverb.example.com)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orclb_DGMGRL)(INSTANCE_NAME=orclb1)(SERVER=DEDICATED)))'
TopWaitEvents = '(monitor)'
SidName = '(monitor)'
Log file locations:
Alert log : /u01/app/oracle/diag/rdbms/orclb/orclb1/trace/alert_orclb1.log
Data Guard Broker log : /u01/app/oracle/diag/rdbms/orclb/orclb1/trace/drcorclb1.log
Database Status:
WARNING
</pre>
I can ignore the standby redo log warnings; they are not important for this exercise.
<h2>Running Tests</h2>
<h3>Opening Standby Database Read-Only</h3>
Let us open the standby database:
<pre class="brush:sql;highlight:[34]">
DGMGRL> edit database orclb set state='apply-off';
Succeeded.
SQL> alter database open read only;
Database altered.
SQL> select name from v$tempfile;
NAME
--------------------------------------------------------------------------------
+DATA/ORCLB/TEMPFILE/temp.267.1065539757
+DATA/ORCLB/BC3D34CE047E62F0E0530600000AF770/TEMPFILE/temp.268.1065539759
+DATA
+DATA
+DATA
SQL> sho pdbs
CON_ID CON_NAME OPEN MODE RESTRICTED
---------- ------------------------------ ---------- ----------
2 PDB$SEED READ ONLY NO
3 PDB1 MOUNTED
4 PDB2 MOUNTED
5 PDB3 MOUNTED
SQL> alter pluggable database pdb1 open read only;
Pluggable database altered.
SQL> alter pluggable database pdb2 open read only;
alter pluggable database pdb2 open read only
*
ERROR at line 1:
ORA-01147: SYSTEM tablespace file 12 is offline
</pre>
While <code>PDB1</code> can be opened just fine, pluggable database <code>PDB2</code> cannot be opened, which is expected. Neither can <code>PDB3</code>.<br/><br/>
Enabling standby recovery:
<pre class="brush:sql">
DGMGRL> edit database orclb set state='apply-on';
Succeeded.
</pre>
<h3>Testing Data Guard Replication</h3>
Finally, I can check that Data Guard replication works as expected:
<pre class="brush:sql;highlight:[5,9,27,31]">
STANDBY> alter session set container=pdb1;
Session altered.
STANDBY> select count(*) from new_table;
select count(*) from new_table
*
ERROR at line 1:
ORA-00942: table or view does not exist
PRIMARY> alter session set container=pdb1;
Session altered.
PRIMARY> create table new_table as select * from dba_objects;
Table created.
PRIMARY> alter session set container=cdb$root;
Session altered.
PRIMARY> alter system archive log current;
System altered.
STANDBY> select count(*) from new_table;
COUNT(*)
----------
23043
</pre>
<h3>Adding new Pluggable Database</h3>
A newly added pluggable database will not be replicated due to the current <code>ENABLED_PDBS_ON_STANDBY</code> setting. Depending on how the parameter is set, specific PDBs can be enumerated explicitly, or all PDBs can be enabled by default with certain ones excluded.
<pre class="brush:sql">
SQL> create pluggable database pdb4 admin user pdb_admin identified by pdb_admin;
Pluggable database created.
SQL> alter pluggable database pdb4 open;
Pluggable database altered.
SQL> alter system archive log current;
System altered.
</pre>
The standby alert log shows that the PDB data files were not created:
<pre class="brush:sql">
2021-02-26T15:19:29.147366+00:00
PR00 (PID:4163): Media Recovery Log +FRA/ORCLB/ARCHIVELOG/2021_02_26/thread_1_seq_16.263.1065539909
Recovery created pluggable database PDB4
PDB4(6):File copy for ts-SYSTEM skipped for excluded/offline tablespace
PDB4(6):File #20 added to control file as 'UNNAMED00020'. Originally created as:
PDB4(6):'+DATA/ORCLA/BC3FCA17DFEB23DEE0530600000AF95F/DATAFILE/system.287.1065539949'
PDB4(6):because the pluggable database was created with nostandby
PDB4(6):or the tablespace belonging to the pluggable database is
PDB4(6):offline.
PDB4(6):File copy for ts-SYSAUX skipped for excluded/offline tablespace
PDB4(6):File #21 added to control file as 'UNNAMED00021'. Originally created as:
PDB4(6):'+DATA/ORCLA/BC3FCA17DFEB23DEE0530600000AF95F/DATAFILE/sysaux.285.1065539949'
PDB4(6):because the pluggable database was created with nostandby
PDB4(6):or the tablespace belonging to the pluggable database is
PDB4(6):offline.
PDB4(6):File copy for ts-UNDOTBS1 skipped for excluded/offline tablespace
PDB4(6):File #22 added to control file as 'UNNAMED00022'. Originally created as:
PDB4(6):'+DATA/ORCLA/BC3FCA17DFEB23DEE0530600000AF95F/DATAFILE/undotbs1.286.1065539949'
PDB4(6):because the pluggable database was created with nostandby
PDB4(6):or the tablespace belonging to the pluggable database is
PDB4(6):offline.
PDB4(6):File copy for ts-TEMP skipped for excluded/offline tablespace
PR00 (PID:4163): Media Recovery Waiting for T-1.S-17 (in transit)
</pre>
<h2>Conclusion</h2>
Oracle already has enough features to create a standby database that contains only a subset of the primary's PDBs. Although it is possible to create a clone (non-standby) database that excludes certain PDBs using the <code>DUPLICATE</code> command, the <code>SKIP PLUGGABLE DATABASE</code> clause cannot be used while duplicating for standby, so this blog post provided an alternative solution. Another option is to use <a href="https://docs.oracle.com/en/database/oracle/oracle-database/19/multi/introduction-to-the-multitenant-architecture.html#GUID-D5AE0EBD-76CA-4E7C-8637-1D6B56597021">Refreshable Clone PDBs</a> - they can serve as a standby replacement with certain restrictions.<br/><br/>
The situation reminds me of the Snapshot Standby feature, which I used even before Oracle introduced it with much fuss. Depending on customer demand, Oracle may extend <code>DUPLICATE DATABASE FOR STANDBY</code> to support the <code>SKIP PLUGGABLE DATABASE</code> clause.Mikhail Velikikhhttp://www.blogger.com/profile/12809585068205118395noreply@blogger.com0tag:blogger.com,1999:blog-3058240549611702067.post-91461223104234547202021-02-24T01:13:00.000+06:002021-02-24T01:13:01.179+06:00CHA Alert CHA_DBWR_CKPT_IO_COMPLETE in OEM
A customer was getting the following alerts from Oracle Enterprise Manager (OEM):
<pre class="brush:sql">
Host=rac1.example.com
Target type=Cluster
Target name=myrac
Message=DB Checkpoint Performance on Host rac1 Database/Cluster racdba Instance racdba1.
The Cluster Health Advisor (CHA) detected that Database Writer (DBW) checkpoints were slow because the database writes took longer than expected to complete.
Increase the number of DBWR processes. Add additional disks to the disk group for the database.
Relocate the database files to faster disks or to Solid State Devices.
If the storage subsystem supports a storage write back cache, check that the storage cache is functioning properly.
Severity=Warning
Event reported time=Feb 23, 2021 12:04:53 PM GMT
Operating System=Linux
Platform=x86_64
Event Type=Metric Alert
Event name=cha_alerts:cha_alert_level
Metric Group=CHA Alerts
Metric=Alert Level
Metric value=Warning
Key Value=CHA_INCIDENT_STATE_CHANGE_DATABASE_racdba_racdba1_CHA_DBWR_CKPT_IO_COMPLETE
Key Column 1=Key
Rule Name=SYSMAN Monitor Targets,rule 61
Rule Owner=SYSMAN
Update Details:
DB Checkpoint Performance on Host rac1 Database/Cluster racdba Instance racdba1.
The Cluster Health Advisor (CHA) detected that Database Writer (DBW) checkpoints were slow because the database writes took longer than expected to complete.
Increase the number of DBWR processes. Add additional disks to the disk group for the database.
Relocate the database files to faster disks or to Solid State Devices.
If the storage subsystem supports a storage write back cache, check that the storage cache is functioning properly.
</pre>
I observed some of the infamous "Checkpoint not complete" messages in the alert log, so I suspected that they might be the real cause.<br/>
To reproduce the issue, I created quite small redo log files, 20MB each:
<pre class="brush:sql">
SQL> select group#, thread#, bytes from v$log;
GROUP# THREAD# BYTES
---------- ---------- ----------
5 1 20971520
6 1 20971520
7 1 20971520
8 2 20971520
9 2 20971520
10 2 20971520
6 rows selected.
</pre>
Then, I wrote a script that generates a lot of redo:
<pre class="brush:sql">
create table t
as
select o.*
from dba_objects o,
xmltable('1 to 50');
update t
set object_name=object_name;
</pre>
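A rough back-of-envelope check that this update generates enough redo to cycle the 20MB logs many times over (the ~23,000-row count of dba_objects and the ~200 bytes of redo per updated row are assumptions, not measurements):

```shell
#!/bin/sh
# Back-of-envelope estimate of the redo generated by the update above.
# Assumptions: dba_objects holds ~23,043 rows; a single-column update
# generates roughly 200 bytes of redo per row.
rows=$((23043 * 50))                    # rows in table t
redo_mb=$((rows * 200 / 1024 / 1024))   # estimated redo volume
switches=$((redo_mb / 20))              # 20 MB online logs
echo "rows=$rows redo=~${redo_mb}MB log_switches=~$switches"
# -> rows=1152150 redo=~219MB log_switches=~10
```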
I ran the script and here is an excerpt from the alert log showing checkpoint not complete messages:
<pre class="brush:sql;highlight:[8,17,26,35]">
2021-02-23T12:04:47.234452+00:00
Thread 1 advanced to log sequence 114 (LGWR switch), current SCN: 7677512
Current log# 6 seq# 114 mem# 0: +DATA/RACDBA/ONLINELOG/group_6.290.1065267481
2021-02-23T12:04:47.506174+00:00
PDB(3):Resize operation completed for file# 12, old size 645120K, new size 650240K
2021-02-23T12:04:47.848002+00:00
Thread 1 cannot allocate new log, sequence 115
Checkpoint not complete
Current log# 6 seq# 114 mem# 0: +DATA/RACDBA/ONLINELOG/group_6.290.1065267481
2021-02-23T12:04:50.392351+00:00
Thread 1 advanced to log sequence 115 (LGWR switch), current SCN: 7677540
Current log# 7 seq# 115 mem# 0: +DATA/RACDBA/ONLINELOG/group_7.291.1065267483
2021-02-23T12:04:50.661220+00:00
PDB(3):Resize operation completed for file# 12, old size 650240K, new size 655360K
2021-02-23T12:04:50.993950+00:00
Thread 1 cannot allocate new log, sequence 116
Checkpoint not complete
Current log# 7 seq# 115 mem# 0: +DATA/RACDBA/ONLINELOG/group_7.291.1065267483
2021-02-23T12:04:53.576282+00:00
Thread 1 advanced to log sequence 116 (LGWR switch), current SCN: 7677574
Current log# 5 seq# 116 mem# 0: +DATA/RACDBA/ONLINELOG/group_5.289.1065267477
2021-02-23T12:04:53.845381+00:00
PDB(3):Resize operation completed for file# 12, old size 655360K, new size 660480K
2021-02-23T12:04:54.183174+00:00
Thread 1 cannot allocate new log, sequence 117
Checkpoint not complete
Current log# 5 seq# 116 mem# 0: +DATA/RACDBA/ONLINELOG/group_5.289.1065267477
2021-02-23T12:04:56.721595+00:00
Thread 1 advanced to log sequence 117 (LGWR switch), current SCN: 7677600
Current log# 6 seq# 117 mem# 0: +DATA/RACDBA/ONLINELOG/group_6.290.1065267481
2021-02-23T12:04:56.995439+00:00
PDB(3):Resize operation completed for file# 12, old size 660480K, new size 665600K
2021-02-23T12:04:57.332102+00:00
Thread 1 cannot allocate new log, sequence 118
Checkpoint not complete
Current log# 6 seq# 117 mem# 0: +DATA/RACDBA/ONLINELOG/group_6.290.1065267481
</pre>
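Messages like these can be counted per minute to gauge how often log switches are being throttled. A quick awk sketch run against a tiny inlined sample (point it at a real alert log in practice):

```shell
#!/bin/sh
# Count "Checkpoint not complete" occurrences per minute in an alert log.
count_ckpt() {
  awk '
    /^20..-..-..T/ { ts = substr($0, 1, 16) }   # e.g. 2021-02-23T12:04
    /Checkpoint not complete/ { cnt[ts]++ }
    END { for (t in cnt) print t, cnt[t] }
  ' "$1"
}
# Tiny sample taken from the excerpt above:
cat > /tmp/alert_sample.log <<'EOF'
2021-02-23T12:04:47.848002+00:00
Thread 1 cannot allocate new log, sequence 115
Checkpoint not complete
2021-02-23T12:04:50.993950+00:00
Thread 1 cannot allocate new log, sequence 116
Checkpoint not complete
EOF
count_ckpt /tmp/alert_sample.log   # -> 2021-02-23T12:04 2
```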
Shortly afterwards, I got a new CHA alert in OEM:
<div class="separator" style="clear: both;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0r3Fey571SDo70prus73BXl66NPH7K8QMxi3RMW9BqTpTY-QsgOZxmDAQGH5gIgyaSOAIiMgiql418B3Hy0Qlq0wlB5yo2Kq7_XIlevVRn3clId-N5jVdpet0mnz6SHV2cCL9Ciaf8bKM/s0/oem_cha_dbwr.png" style="display: block; padding: 1em 0; text-align: center; "><img alt="" border="0" width="1200" data-original-height="942" data-original-width="1897" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0r3Fey571SDo70prus73BXl66NPH7K8QMxi3RMW9BqTpTY-QsgOZxmDAQGH5gIgyaSOAIiMgiql418B3Hy0Qlq0wlB5yo2Kq7_XIlevVRn3clId-N5jVdpet0mnz6SHV2cCL9Ciaf8bKM/s0/oem_cha_dbwr.png"/></a></div>
<br/>
To understand what was happening, I first looked for clues in the CHA log: <code>/u01/app/grid/crsdata/rac1/trace/chad/ochad.trc.0</code>, where <code>/u01/app/grid</code> is the ORACLE_BASE of the grid user.
<pre class="brush:sql">
[Thread-22] [ 2021-02-23 12:04:53.390 UTC ] [BNet.validate:422] INFO: Tue Feb 23 12:04:50 UTC 2021 Decided CHA_DBWR_CKPT_IO_COMPLETE NORMAL (Belief= 94.900925 %) --> ABNORMAL (Belief= 81.315659 %)
[Thread-22] [ 2021-02-23 12:04:53.390 UTC ] [BNet.validate:429] INFO: Tue Feb 23 12:04:50 UTC 2021 Children
[Thread-22] [ 2021-02-23 12:04:53.391 UTC ] [BNet.validate:446] INFO: Tue Feb 23 12:04:50 UTC 2021 DBWR IO Completion= NORMAL
[Thread-22] [ 2021-02-23 12:04:53.391 UTC ] [BNet.validate:446] INFO: Tue Feb 23 12:04:50 UTC 2021 log_file_switch_checkpoint_incompleteRANGE3= HIGH
[Thread-22] [ 2021-02-23 12:04:53.391 UTC ] [EventPublisher$EventRequest.dump:193] INFO: type:DATABASE, target:racdba, host:rac1, instance:racdba1,
problem:CHA_DBWR_CKPT_IO_COMPLETE, state:ABNORMAL, time:2021-02-23 12:04:50,
evidence:[{"signal":"log_file_switch_checkpoint_incomplete", "fd":"log_file_switch_checkpoint_incompleteRANGE3", "state":"HIGH", "value":1.09745E06, "pred":null, "since":"2021-02-23 12:04:50"}]
</pre>
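Transitions like the one above can be pulled out of the trace without reading it line by line. Here is a minimal sketch; a tiny generated sample stands in for the real file, and on a live system you would point <code>TRC</code> at the actual ochad trace path shown earlier:

```shell
# Sketch: count belief-state transitions (NORMAL --> ABNORMAL) in an ochad
# trace. A generated sample stands in for the real file here; on a live
# system, set TRC to /u01/app/grid/crsdata/<node>/trace/chad/ochad.trc.0.
TRC=$(mktemp)
cat > "$TRC" <<'EOF'
[Thread-22] INFO: Decided CHA_DBWR_CKPT_IO_COMPLETE NORMAL (Belief= 94.9 %) --> ABNORMAL (Belief= 81.3 %)
[Thread-22] INFO: DBWR IO Completion= NORMAL
EOF
transitions=$(grep -cE 'NORMAL \(Belief=.*\) --> ABNORMAL' "$TRC")
echo "transitions: $transitions"
rm -f "$TRC"
```

The pattern only matches lines where a belief state actually flipped, so plain `NORMAL` status lines are not counted.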
Based on the evidence above, I can conclude that CHA is checking the <b>log file switch (checkpoint incomplete)</b> wait event:
<pre class="brush:sql">
SQL> select event, total_waits, total_timeouts, time_waited_micro
2 from v$system_event
3 where event='log file switch (checkpoint incomplete)'
4 /
EVENT TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED_MICRO
---------------------------------------- ----------- -------------- -----------------
log file switch (checkpoint incomplete) 166 0 196362926
</pre>
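As a sanity check, the average wait implied by these cumulative counters is roughly in line with the ~1.1E06 µs value CHA put in the evidence (CHA works on a recent window, so an exact match is not expected). A quick back-of-the-envelope calculation:

```shell
# Back-of-the-envelope: average 'log file switch (checkpoint incomplete)'
# wait, using TIME_WAITED_MICRO and TOTAL_WAITS from the v$system_event
# output above.
avg_us=$(awk 'BEGIN { printf "%.0f", 196362926 / 166 }')
echo "average wait: ${avg_us} us"
```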
The following process accesses this file:
<pre class="brush:sql;highlight:[8]">
[grid@rac1 grid]$ fuser /u01/app/grid/crsdata/rac1/trace/chad/ochad.trc.0
/u01/app/grid/crsdata/rac1/trace/chad/ochad.trc.0: 13858
[grid@rac1 grid]$ ps fww -p 13858
PID TTY STAT TIME COMMAND
13858 ? Sl 0:37 /u01/app/19.3.0/grid/jdk/bin/java -server -Xms30M -Xmx512M
-Djava.awt.headless=true -Ddisable.checkForUpdate=true -DTRACING.ENABLED=false
-XX:ParallelGCThreads=1 -cp /u01/app/19.3.0/grid/jlib/cha.jar:/u01/app/19.3.0/grid/jlib/chaconfig.jar:/u01/app/19.3.0/grid/jlib/cha-diag-msg.jar:/u01/app/19.3.0/grid/jlib/clsce.jar:/u01/app/19.3.0/grid/jlib/srvm.jar:/u01/app/19.3.0/grid/jlib/srvmhas.jar:/u01/app/19.3.0/grid/jlib/srvmasm.jar:/u01/app/19.3.0/grid/jlib/netcfg.jar:/u01/app/19.3.0/grid/jdbc/lib/ojdbc8.jar:/u01/app/19.3.0/grid/ucp/lib/ucp.jar:/u01/app/19.3.0/grid/jlib/fte.jar:/u01/app/19.3.0/grid/jlib/jai_core.jar:/u01/app/19.3.0/grid/jlib/mlibwrapper_jai.jar:/u01/app/19.3.0/grid/jlib/vecmath.jar:/u01/app/19.3.0/grid/jlib/jai_codec.jar:/u01/app/19.3.0/grid/jlib/jh.jar
oracle.cha.server.CHADDriver
</pre>
The corresponding CRS resource is ora.chad:
<pre class="brush:sql">
[grid@rac1 trace]$ crsctl stat res ora.chad -t
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.chad
ONLINE ONLINE rac1 STABLE
ONLINE ONLINE rac2 STABLE
--------------------------------------------------------------------------------
</pre>
It can be managed with srvctl: <code>srvctl stop/status/start cha</code>.<br/><br/>
CHA is tightly integrated with the Grid Infrastructure Management Repository (GIMR). In fact, most <b>chactl</b> commands cannot work without it. However, I do not use GIMR, and it appears that some CHA functionality is still available without it, at least the part that publishes database-related events to OEM. This is explained in the following whitepaper:
<a href="https://www.oracle.com/technetwork/database/options/clustering/ahf/learnmore/cluster-health-advisor-3867511.pdf">Oracle Database 12c Rel. 2 Cluster Health Advisor - Deep Dive How it Works and How to Use It</a><br/>
<h3>Conclusion</h3>
I do not think that the recommendations in the CHA_DBWR_CKPT_IO_COMPLETE alert message are the most practical solutions:
<blockquote>
DB Checkpoint Performance on Host rac1 Database/Cluster racdba Instance racdba1.
The Cluster Health Advisor (CHA) detected that Database Writer (DBW) checkpoints were slow because the database writes took longer than expected to complete.
Increase the number of DBWR processes. Add additional disks to the disk group for the database.
Relocate the database files to faster disks or to Solid State Devices.
If the storage subsystem supports a storage write back cache, check that the storage cache is functioning properly.
</blockquote>
It is usually much easier to increase the total amount of redo, either by adding log groups, resizing the existing ones, or both. Any of these gives DBWR a better chance to catch up.<br/><br/>
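For example, larger groups can be added and the old ones dropped once they become INACTIVE. A hedged sketch for this RAC setup (the group numbers, size, and the <code>+DATA</code> disk group here are illustrative assumptions; adapt them to your own layout, and repeat for each thread):

```sql
-- Illustrative only: group numbers, size, and disk group are assumptions.
alter database add logfile thread 1 group 7 ('+DATA') size 2g;
alter database add logfile thread 1 group 8 ('+DATA') size 2g;
-- Later, once v$log shows the old group as INACTIVE:
-- alter database drop logfile group 5;
```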
When there are CHA alerts in OEM, it makes sense to check the CHA log: <code>/u01/app/grid/crsdata/rac1/trace/chad/ochad.trc.0</code>. It contains invaluable information that can reveal the exact cause.