Node 1 Does Not Accept Connections (RAC)

This morning a problem was reported to me in a test environment: some connections hung when trying to log in. The environment is a two-node cluster, version 12c. After connecting and taking a look, I saw that node 1 was the one not accepting connections, and the sessions there were stuck waiting for a log switch. Curiously, no error was raised. This situation usually shows an ORA-00257, but not in this case, maybe because of the engine version.

Checking the situation at the instance level:

 
SQL> select inst_id, version, status, thread#, archiver, log_switch_wait, logins from gv$instance;

   INST_ID VERSION           STATUS          THREAD# ARCHIVE LOG_SWITCH_WAIT LOGINS
---------- ----------------- ------------ ---------- ------- --------------- ----------
         2 12.1.0.2.0        OPEN                  2 STARTED                 ALLOWED
         1 12.1.0.2.0        OPEN                  1 STARTED ARCHIVE LOG     ALLOWED
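
When LOG_SWITCH_WAIT shows ARCHIVE LOG, a natural next step is to find out why the archiver cannot write. The following is only a sketch, not the query used in this case. The post does not say where the archive logs are written, so the second query assumes the fast recovery area is in use and is meaningless otherwise.

```sql
-- Archive destinations in use, with any error the archiver has reported
select inst_id, dest_id, status, destination, error
  from gv$archive_dest
 where status <> 'INACTIVE'
 order by inst_id, dest_id;

-- Assumption: archive logs go to the fast recovery area.
-- Shows how full it is and how much space could be reclaimed.
select name, space_limit, space_used, space_reclaimable, number_of_files
  from v$recovery_file_dest;
```

A destination that is full would be consistent with the symptom here, since the backup with delete is what released the sessions.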

 
Sessions are waiting on the event “log file switch (archiving needed)”, as shown below. The first column is the node (in this case the problem is on node 1):

 
N     SID TIME    SQL_ID            EXECS EVENT                              Marker
- ------- ------- -------------- -------- ---------------------------------- --------
1      22 0       6vm7g6qj4mqhd        0  PX Deq: Execution Msg
2     327 0       6vm7g6qj4mqhd        4  PX Deq: Execution Msg
2     649 0       6vm7g6qj4mqhd        4  PX Deq: Execute Reply
1     361 1471    6228pzdt28kzd     1008  log file switch (archiving needed)      <<<
2     121 106     61tssjb6hj8x7      329  gc buffer busy acquire
1      88 1645    9zg9qd9bm4spu     7787  log file switch (archiving needed)      <<<
1     804 1704    3mptsg6h27zg9        1  log file switch (archiving needed)      <<<
1      89 1705    0bfdn75zn75pw        2  log file switch (archiving needed)      <<<
2      20 1699    6nauzjpthp1w7        3  db file sequential read
1       3 +1H     1aa2fpqtx557g     3877  log file switch (archiving needed)      <<<
1     328 1705    g8bkp70myp46t        2  log file switch (archiving needed)      <<<
1     257 +1H     dr6d1upgkc1g3        1  log file switch (archiving needed)      <<<
2     615 505     aq8yqxyyb40nn     1255  gc current request
1     800 817     aq8yqxyyb40nn     1282  buffer busy waits
1     480 817     aq8yqxyyb40nn     1282  buffer busy waits
2     342 +1H     5ms6rbzdnq16t    15644  gc buffer busy acquire
1     377 817     aq8yqxyyb40nn     1282  buffer busy waits
1     345 817     aq8yqxyyb40nn     1282  log file switch (archiving needed)      <<<
1      21 817     aq8yqxyyb40nn     1282  buffer busy waits
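
The query behind the listing above is not shown. A rough approximation that keeps only the blocked sessions, using standard gv$session columns (the EXECS and Marker columns are left out), might look like this:

```sql
-- Sessions blocked on the log switch, across all RAC instances,
-- longest waiters first
select inst_id, sid, seconds_in_wait, sql_id, event
  from gv$session
 where event = 'log file switch (archiving needed)'
 order by seconds_in_wait desc;
```

If every row returned has the same inst_id, the problem is confined to that node's redo thread, as it was here.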

 
After performing an archive log backup (with the delete option):

 
SQL> select inst_id, version, status, thread#, archiver, log_switch_wait, logins from gv$instance;

   INST_ID VERSION           STATUS          THREAD# ARCHIVE LOG_SWITCH_WAIT LOGINS
---------- ----------------- ------------ ---------- ------- --------------- ----------
         2 12.1.0.2.0        OPEN                  2 STARTED                 ALLOWED
         1 12.1.0.2.0        OPEN                  1 STARTED                 ALLOWED

 
HTH – Antonio NAVARRO

 

Legato Cannot Connect To The Database

Today I was setting up a database backup using Legato, one of the most powerful backup tools. Everything was apparently well configured, as shown in the following image:

[Image: fail_setup_legato]

But this error appeared:

[Image: fail_setup_legato_2]

It is strange, because connecting directly with sqlplus works without any problem. After investigating for a while I realized that the connection was possibly being made as SYSDBA or an equivalent privilege. In my case, granting SYSBACKUP was enough:

GRANT SYSBACKUP TO USERBACKUP;

You can see these privileges in the view v$pwfile_users.
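
As a quick check after the grant, a query along these lines lists who is in the password file and with which administrative privileges. It is a minimal sketch and assumes 12c or later, where the SYSBACKUP column exists.

```sql
-- Users in the password file and the administrative privileges they hold
select username, sysdba, sysoper, sysbackup
  from v$pwfile_users
 order by username;
```

After the GRANT above, USERBACKUP should appear with SYSBACKUP set to TRUE.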

 

HTH – Antonio NAVARRO

 

 

 

Errors ORA-19625 ORA-27037 ORA-19600 ORA-19601

Yesterday a co-worker was creating a Data Guard standby using the duplicate for standby command and got the following errors:

 
channel c1: restoring control file
ORA-19625: error identifying file /tbe/prod/uti/bck/controle_cdn_standby.ctl
ORA-27037: unable to obtain file status
HPUX-ia64 Error: 2: No such file or directory
Additional information: 3
ORA-19600: input file is control file  (/tbe/prod/uti/bck/controle_cdn_standby.ctl)
ORA-19601: output file is control file  (/tbe/prod/data/cdn/tbectrl1.con)
failover to previous backup

He was running a duplicate as shown below:

DUPLICATE TARGET DATABASE FOR STANDBY

He was using a tape backup of the primary database, but the file RMAN was trying to restore was on disk instead of tape, so RMAN could not find it.

After a bit of investigation, we saw that my colleague, after the tape backup, had executed the following command:

ALTER DATABASE BACKUP CONTROLFILE TO '/xxxx/xxxx/control.bck';

This ALTER DATABASE was registered, as a backup of the control file, in the control file of the primary database, to which we are connected from the standby to run the duplicate. The duplicate does not include any SET UNTIL clause, so Oracle looks for the most recent backup of the control file. In this case that is the copy on disk (with the same path), which had not been copied from the source machine to the destination (the rest of the backup is on tape).

Copying this control file backup to disk, at the same path, on the machine that hosts the standby solves the problem. Another solution would be to SET UNTIL TIME (or SCN) to a point before the control file backup to disk.
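
To spot this situation before running the duplicate, one option is to look on the primary for control file copies recorded on disk. This is a sketch, not something we ran at the time. It assumes that a copy made with ALTER DATABASE BACKUP CONTROLFILE TO a file is listed in v$datafile_copy with file# = 0, which is worth verifying on your version.

```sql
-- Control file copies known to the primary's control file, newest first.
-- Assumption: control file copies are recorded with file# = 0.
select name, controlfile_type, completion_time, status
  from v$datafile_copy
 where file# = 0
 order by completion_time desc;
```

From RMAN, LIST COPY OF CONTROLFILE should give the same information. A recent entry pointing to a disk path that does not exist on the standby host would explain the ORA-19625.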

HTH – Antonio NAVARRO

ORA-19511, ORA-19870, ORA-19501 And ORA-27190 Errors

Today when I arrived at work I saw an email from the Backup department about a restore that had failed last Saturday. The error was the following:

 
channel aux12: ORA-27192: skgfcls: sbtclose2 returned error - failed to close file
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
   We could not read the checksum. (0:3:2)
ORA-19870: error while restoring backup piece WEBP_k3sq8u31_1_1
ORA-19501: read error on file "CLOUD_k3sf8t32_1_1", block number 1 (block size=512)
ORA-27190: skgfrd: sbtread2 returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
   asdf_

After a bit of research I discovered the problem was in the network. This restore connects to several servers to work (the Legato server, a recovery catalog, and another database, because it is a duplicate). A network interruption of perhaps only a few milliseconds was enough to crash the restore. Verifying the network stability and rerunning the RMAN script solved the problem.

HTH – Antonio NAVARRO

 

RMAN-11003 ORA-32001 Errors When Duplicating Database

The other day I was performing a duplicate from the production environment to the development environment when I received the following error:

 
released channel: aux3
released channel: aux4
released channel: aux5
released channel: aux6
released channel: aux7
released channel: aux8
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 09/11/2017 17:09:24
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of sql command on clone_default channel at 09/11/2017 17:09:24
RMAN-11003: failure during parse/execution of SQL statement: alter system set  db_name =  'SAPP' comment= 'Modified by RMAN duplicate' scope=spfile
ORA-32001: write to SPFILE requested but no SPFILE is in use

In this case the problem was the SPFILE, as the error shows. I had removed the previous clone of the database before duplicating again, but removing the previous image did not remove the service. It is a RAC, so the service also had to be removed from CRS. Removing the service and repeating the duplicate command solved the problem.
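
A simple way to confirm the ORA-32001 diagnosis is to ask the auxiliary instance whether it was started from an SPFILE. This is a minimal sketch using the standard v$parameter view, run on the auxiliary instance before launching the duplicate.

```sql
-- An empty VALUE means the instance was started from a pfile,
-- so ALTER SYSTEM ... SCOPE=SPFILE fails with ORA-32001
select name, value
  from v$parameter
 where name = 'spfile';
```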

HTH – Antonio NAVARRO

 

RMAN-06136 ORA-17627 ORA-01017 ORA-17629 Errors When Duplicating DB

Yesterday morning I was improving a duplicate script, from the production environment to development, when I got the following error:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 08/31/2017 08:41:54
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-06136: ORACLE error from auxiliary database: ORA-17629: Cannot connect to the remote database server
ORA-17627: ORA-01017: invalid username/password; logon denied
ORA-17629: Cannot connect to the remote database server

After many tests I discovered that the password file was wrong. I had copied it from prod to dev with scp. When I sent the file again, it worked fine.

 
HTH – Antonio NAVARRO

ORA-12720, RMAN-06136 And RMAN-05501 Errors When Duplicating Database

Today I was performing a duplicate database when I got the following error.

 
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 08/30/2017 10:32:22
RMAN-05501: aborting duplication of target database
RMAN-06136: ORACLE error from auxiliary database: ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode

Okay, it was my fault: I forgot to set cluster_database to FALSE. The parameter came from a RAC and had the value TRUE. It is as easy as changing this parameter.
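
For reference, the check and the fix can be sketched as follows. The ALTER SYSTEM form assumes the auxiliary instance uses an SPFILE. If it was started from a pfile, edit cluster_database in the init file instead. Either way the instance has to be restarted for the change to take effect.

```sql
-- Current setting on the auxiliary instance
select name, value
  from v$parameter
 where name = 'cluster_database';

-- Assumption: an spfile is in use. Takes effect at the next startup.
alter system set cluster_database = false scope = spfile sid = '*';
```

If the duplicated database is meant to run as a RAC again, remember to set the parameter back to TRUE once the duplicate has finished.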

HTH – Antonio NAVARRO