Oracle DataGuard
Focus on Logical Standby Support Issues
Maintains a standby database
www.brianhitchcock.net
Brian Hitchcock October 23, 2007 Page 2
DataGuard
Must be SYS to make changes
sqlplus / as sysdba
Some can't be made while the apply process is running
Change Guard status
Create physical standby
Support Issues
DataGuard Errors
DataGuard reports a lot of errors
Verify archiving enabled
Backup db (hot or cold)
Create standby control file
Copy db backup files from primary
Copy standby control file from primary
Setup init.ora/spfile parameters
Start physical standby db
Trace file
Build LogMiner dictionary
Stop redo apply
- Errors, no impact
Convert database to logical standby
- Two trace files
Restart db
Open resetlogs
Verify logical standby working
On Standby database
Standby frozen most of the day
Standby catches up once per day
Alert log messages while catching up
Disk space for archived redo logs
Other issues
Constraint violations
Errors, resolution
No data found
Errors, resolution
ORA-16211
Errors, Oracle Support
After refresh
Thu Oct 18 16:59:20 2007
Error 1034 received logging on to the standby
Thu Oct 18 16:59:20 2007
Errors in file /shared/orahome01/admin/BRHPROD/bdump/brhprod_arc1_2635.trc:
ORA-01034: ORACLE not available
PING[ARC1]: Heartbeat failed to connect to standby 'BRHPRSB'. Error is 1034.
On primary and standby
cat $ORACLE_HOME/dbs/orapw<SID>
Alter user SYS identified by <password>
Update password file
Start log apply process
Trace file created
Stops when log apply process stops
See file contents later
Primary configured, sending redo logs
Standby not yet created/running
Our scripts maintain primary archived redo logs
Compress to save disk space, delete after 2 days
Manually register
How long will you need to store redo logs?
Not an issue if converting to logical soon
Starts trace file
When physical standby first created
Ends when log apply stops
Trace file looks like a problem
Normal processing
Generates error
Why is this an error?
Typical of DataGuard
Everything seems to be an error
- Even when it is perfectly routine
Makes support more difficult
- When is an error something to worry about?
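One way to make the "when is an error something to worry about?" question concrete is to keep an explicit list of ORA codes that these slides show during routine operation and to flag everything else for review. The following is a minimal sketch, not an Oracle-provided tool: the helper name `triage` is hypothetical, and the routine list is an assumption drawn only from the alert log and trace excerpts in this deck.

```python
import re

# Codes this deck shows while the standby is behaving normally (assumption:
# based on the slides' excerpts only, not on Oracle documentation).
ROUTINE_CODES = {
    "ORA-16037",  # user requested cancel of managed recovery operation
    "ORA-16111",  # log mining and apply setting up
    "ORA-16116",  # no work available
    "ORA-16127",  # stalled waiting for additional transactions to be applied
    "ORA-16222",  # automatic Logical Standby retry of last action
    "ORA-16226",  # DDL skipped due to lack of support
}

ORA_CODE = re.compile(r"ORA-\d{5}")


def triage(lines):
    """Split the ORA codes found in log lines into (routine, needs_review)."""
    routine, needs_review = [], []
    for line in lines:
        for code in ORA_CODE.findall(line):
            if code in ROUTINE_CODES:
                routine.append(code)
            else:
                needs_review.append(code)
    return routine, needs_review
```

Anything that lands in the second list (for example ORA-01403 or ORA-16211) still needs a person to read the surrounding log text; the list only filters out the noise.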
Trace File
$ more /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc
/orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /orahome01/product/10.2.0
System name: SunOS
Node name: brh-beta1-zone04
Release: 5.10
Version: Generic_118833-36
Machine: sun4u
Instance name: BRHBETA
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 13474, image: oracle@beta1-zone04 (MRP0)
*** SERVICE NAME:() 2007-10-09 16:34:36.298
*** SESSION ID:(394.1) 2007-10-09 16:34:36.298
ARCH: Connecting to console port...
*** 2007-10-09 16:34:36.299 60639 kcrr.c
Start applying redo logs to physical standby
MRP0: Background Managed Standby Recovery process started
Trace File
*** 2007-10-09 16:34:41.302 1018 krsm.c
Managed Recovery: Initialization posted.
*** 2007-10-09 16:34:41.303 60639 kcrr.c
Managed Standby Recovery not using Real Time Apply
Recovery target incarnation = 2, activation ID = 0
Influx buffer limit = 27762 (50% x 55524)
Successfully allocated 7 recovery slaves
Using 158 overflow buffers per recovery slave
Start recovery at thread 1 ckpt scn 8257757517457 logseq 1956 block 5
*** 2007-10-09 16:34:42.124
Media Recovery add redo thread 1
*** 2007-10-09 16:34:42.124 1018 krsm.c
Recreating redo logs
Managed Recovery: Active posted.
ORA-00367: checksum error in log file header
ORA-00305: log 1 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 1 thread 1: '/shared/oralogs01/BRHBETA/redo01a.log'
*** 2007-10-09 16:34:42.147 60639 kcrr.c
Clearing online redo logfile 1 /shared/oralogs01/BRHBETA/redo01a.log
*** 2007-10-09 16:36:15.066
*** 2007-10-09 16:36:15.066 60639 kcrr.c
Clearing online redo logfile 1 complete
ORA-00367: checksum error in log file header
ORA-00305: log 2 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 2 thread 1: '/shared/oralogs01/BRHBETA/redo02a.log'
Trace File
*** 2007-10-09 16:36:15.100 60639 kcrr.c
Clearing online redo logfile 2 /shared/oralogs01/BRHBETA/redo02a.log
*** 2007-10-09 16:37:51.473
*** 2007-10-09 16:37:51.473 60639 kcrr.c
Clearing online redo logfile 2 complete
ORA-00367: checksum error in log file header
ORA-00305: log 3 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 3 thread 1: '/shared/oradata02/BRHBETA/redo03b.log'
*** 2007-10-09 16:37:51.479 60639 kcrr.c
Clearing online redo logfile 3 /shared/oradata02/BRHBETA/redo03b.log
*** 2007-10-09 16:39:26.048
*** 2007-10-09 16:39:26.048 60639 kcrr.c
Clearing online redo logfile 3 complete
ORA-00367: checksum error in log file header
ORA-00305: log 4 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 4 thread 1: '/shared/oradata02/BRHBETA/redo04b.log'
*** 2007-10-09 16:39:26.488 60639 kcrr.c
Clearing online redo logfile 4 /shared/oradata02/BRHBETA/redo04b.log
*** 2007-10-09 16:41:00.447
*** 2007-10-09 16:41:00.447 60639 kcrr.c
Clearing online redo logfile 4 complete
*** 2007-10-09 16:41:00.469 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1956
*** 2007-10-09 16:41:00.469 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1956-1976
*** 2007-10-09 16:41:30.782
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 16:54:31.045
*** 2007-10-09 16:54:31.045 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1956-1956
*** 2007-10-09 16:55:01.154
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 16:56:31.179
Media Recovery Log /oraarch01/BRHBETA/LOG_1956_1_629245032.arc
*** 2007-10-09 16:56:33.431
Media Recovery Log /oraarch01/BRHBETA/LOG_1957_1_629245032.arc
*** 2007-10-09 16:56:44.495
Trace File
Trace File
*** 2007-10-09 16:56:44.495 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1958
*** 2007-10-09 16:56:44.495 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1958-1976
*** 2007-10-09 16:57:14.647
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 17:05:14.785
Media Recovery Log /oraarch01/BRHBETA/LOG_1958_1_629245032.arc
*** 2007-10-09 17:05:18.043 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1959
*** 2007-10-09 17:05:18.043 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1959-1976
*** 2007-10-09 17:05:48.284
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 17:07:18.309
Media Recovery Log /oraarch01/BRHBETA/LOG_1959_1_629245032.arc
*** 2007-10-09 17:07:21.114
Media Recovery Log /oraarch01/BRHBETA/LOG_1960_1_629245032.arc
*** 2007-10-09 17:07:22.945
Media Recovery Log /oraarch01/BRHBETA/LOG_1961_1_629245032.arc
*** 2007-10-09 17:07:27.300
Media Recovery Log /oraarch01/BRHBETA/LOG_1962_1_629245032.arc
*** 2007-10-09 17:07:29.637
Media Recovery Log /oraarch01/BRHBETA/LOG_1963_1_629245032.arc
*** 2007-10-09 17:07:29.709 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1964
*** 2007-10-09 17:07:29.709 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1964-1976
*** 2007-10-09 17:07:59.858
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 17:08:29.866
Media Recovery Log /oraarch01/BRHBETA/LOG_1964_1_629245032.arc
Trace File
Trace File
*** 2007-10-09 17:08:31.924
Media Recovery Log /oraarch01/BRHBETA/LOG_1965_1_629245032.arc
*** 2007-10-09 17:09:12.510
Media Recovery Log /oraarch01/BRHBETA/LOG_1966_1_629245032.arc
*** 2007-10-09 17:09:21.050
Media Recovery Log /oraarch01/BRHBETA/LOG_1967_1_629245032.arc
*** 2007-10-09 17:09:40.234
Media Recovery Log /oraarch01/BRHBETA/LOG_1968_1_629245032.arc
*** 2007-10-09 17:09:45.055
Media Recovery Log /oraarch01/BRHBETA/LOG_1969_1_629245032.arc
*** 2007-10-09 17:09:50.572
Media Recovery Log /oraarch01/BRHBETA/LOG_1970_1_629245032.arc
*** 2007-10-09 17:09:58.968
Media Recovery Log /oraarch01/BRHBETA/LOG_1971_1_629245032.arc
*** 2007-10-09 17:10:03.922
Media Recovery Log /oraarch01/BRHBETA/LOG_1972_1_629245032.arc
*** 2007-10-09 17:10:13.196
Media Recovery Log /oraarch01/BRHBETA/LOG_1973_1_629245032.arc
*** 2007-10-09 17:10:21.927
Media Recovery Log /oraarch01/BRHBETA/LOG_1974_1_629245032.arc
*** 2007-10-09 17:10:34.064
Media Recovery Log /oraarch01/BRHBETA/LOG_1975_1_629245032.arc
*** 2007-10-09 17:10:42.420 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1976
*** 2007-10-09 17:10:42.421 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1976-1976
*** 2007-10-09 17:11:12.538
----------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter is defined to a value that is sufficiently large enough to maintain adequate log switch information to resolve archivelog gaps.
----------------------------------------------------------
*** 2007-10-09 17:12:42.563
Media Recovery Log /oraarch01/BRHBETA/LOG_1976_1_629245032.arc
*** 2007-10-09 17:12:45.563
Media Recovery Log /oraarch01/BRHBETA/LOG_1977_1_629245032.arc
*** 2007-10-09 17:12:48.534
Media Recovery Log /oraarch01/BRHBETA/LOG_1978_1_629245032.arc
*** 2007-10-09 17:13:00.505
Media Recovery Log /oraarch01/BRHBETA/LOG_1979_1_629245032.arc
*** 2007-10-09 17:13:02.054
Media Recovery Log /oraarch01/BRHBETA/LOG_1980_1_629245032.arc
*** 2007-10-09 17:13:03.231
Media Recovery Log /oraarch01/BRHBETA/LOG_1981_1_629245032.arc
*** 2007-10-09 17:13:03.902
Media Recovery Log /oraarch01/BRHBETA/LOG_1982_1_629245032.arc
Trace File
Trace File
*** 2007-10-09 17:13:04.492
Media Recovery Log /oraarch01/BRHBETA/LOG_1983_1_629245032.arc
*** 2007-10-09 17:13:08.171
Media Recovery Log /oraarch01/BRHBETA/LOG_1984_1_629245032.arc
*** 2007-10-09 17:13:26.860
*** 2007-10-09 17:13:26.860 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1985
*** 2007-10-09 17:16:07.172
Media Recovery Log /oraarch01/BRHBETA/LOG_1985_1_629245032.arc
*** 2007-10-09 17:16:08.067
Media Recovery Log /oraarch01/BRHBETA/LOG_1986_1_629245032.arc
*** 2007-10-09 17:16:08.131
Media Recovery Log /oraarch01/BRHBETA/LOG_1987_1_629245032.arc
*** 2007-10-09 17:16:08.195 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1988
*** 2007-10-09 17:16:13.202
Media Recovery Log /oraarch01/BRHBETA/LOG_1988_1_629245032.arc
*** 2007-10-09 17:16:13.268 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1989
*** 2007-10-09 21:14:01.119
Media Recovery Log /oraarch01/BRHBETA/LOG_1989_1_629245032.arc
*** 2007-10-09 21:14:16.922
*** 2007-10-09 21:14:16.922 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 1990
*** 2007-10-10 09:32:33.399
*** 2007-10-10 09:32:33.399 60639 kcrr.c
Fetching gap sequence in thread 1, gap sequence 1990-1990
*** 2007-10-10 09:33:05.187
Media Recovery Log /oraarch01/BRHBETA/LOG_1990_1_629245032.arc
*** 2007-10-10 09:33:22.505
Media Recovery Log /oraarch01/BRHBETA/LOG_1991_1_629245032.arc
*** 2007-10-10 09:33:22.570
Media Recovery Log /oraarch01/BRHBETA/LOG_1992_1_629245032.arc
*** 2007-10-10 09:33:22.631
Media Recovery Log /oraarch01/BRHBETA/LOG_1993_1_629245032.arc
*** 2007-10-10 09:33:22.693
Media Recovery Log /oraarch01/BRHBETA/LOG_1994_1_629245032.arc
*** 2007-10-10 09:33:22.761
Media Recovery Log /oraarch01/BRHBETA/LOG_1995_1_629245032.arc
*** 2007-10-10 09:33:22.807
Media Recovery Log /oraarch01/BRHBETA/LOG_1996_1_629245032.arc
*** 2007-10-10 09:33:22.864
Media Recovery Log /oraarch01/BRHBETA/LOG_1997_1_629245032.arc
*** 2007-10-10 09:33:22.918
Media Recovery Log /oraarch01/BRHBETA/LOG_1998_1_629245032.arc
*** 2007-10-10 09:33:23.199
Media Recovery Log /oraarch01/BRHBETA/LOG_1999_1_629245032.arc
Trace File
*** 2007-10-10 09:33:23.255 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 2000
*** 2007-10-10 10:11:07.685
Media Recovery Log /oraarch01/BRHBETA/LOG_2000_1_629245032.arc
*** 2007-10-10 10:11:08.422 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 2001
*** 2007-10-10 10:14:48.843
Media Recovery Log /oraarch01/BRHBETA/LOG_2001_1_629245032.arc
*** 2007-10-10 10:14:49.013 60639 kcrr.c
Media Recovery Waiting for thread 1 sequence 2002
*** 2007-10-10 10:15:19.072
*** 2007-10-10 10:15:19.072 60639 kcrr.c
MRP0: Background Media Recovery cancelled with status 16037
ORA-16037: user requested cancel of managed recovery operation
----- Redo read statistics for thread 1 -----
Read rate (ASYNC): 619732Kb in 63640.12s => 0.01 Mb/sec
Total physical reads: 619732Kb
Longest record: 28Kb, moves: 0/2001133 (0%)
Change moves: 779641/4101685 (19%), moved: 141Mb
Longest LWN: 1023Kb, moves: 117/175493 (0%), moved: 23Mb
Last redo scn: 0x0782.a8f27f37 (8257761607479)
----------------------------------------------
*** 2007-10-10 10:15:19.088
Media Recovery drop redo thread 1
Trace File
Trace File
*** 2007-10-10 10:15:20.864 1018 krsm.c
Managed Recovery: Not Active posted.
ORA-16037: user requested cancel of managed recovery operation
ARCH: Connecting to console port...
*** 2007-10-10 10:15:20.871 60639 kcrr.c
MRP0: Background Media Recovery process shutdown
*** 2007-10-10 10:15:20.871 1018 krsm.c
oraarch01/BRHBETA $
When applying redo logs
Generates 2 trace files
What are they?
One shows start of kcrrwkx
Second shows end of kcrrwkx
What are these for?
Neither shows up in alert log
Both continue as long as SQL apply process runs
Trace files
*** SERVICE NAME:() 2007-10-10 10:40:26.358
*** SESSION ID:(396.1) 2007-10-10 10:40:26.358
kcrrwkx: nothing to do (start)
*** 2007-10-10 10:41:26.315
kcrrwkx: nothing to do (end)
*** 2007-10-10 10:42:26.322
kcrrwkx: nothing to do (end)
While applying archived redo logs
Trace file documents everything standby does
Once converted to logical standby
Two trace files generated
Contain messages for start/stop of each log apply
Logical Standby
Why not have DataGuard alert logs?
Trace files tell me that something is wrong
Normal Operation
Logical Standby catching up to Primary
Redo log from primary registered with DG
Redo logs applied to standby
Redo logs deleted from standby
Standby Catching Up
Stop SQL Apply process
Start SQL Apply process after skipping table
Tue Oct 16 15:13:22 2007
Completed: ALTER DATABASE STOP LOGICAL STANDBY APPLY
Tue Oct 16 15:14:16 2007
Incremental checkpoint up to RBA [0x7.a0aa2.0], current log tail at RBA [0x7.b8e2c.0]
Tue Oct 16 15:14:45 2007
ALTER DATABASE START LOGICAL STANDBY APPLY
Tue Oct 16 15:14:45 2007
ALTER DATABASE START LOGICAL STANDBY APPLY (BRHBETA)
Tue Oct 16 15:14:45 2007
No optional part
Attempt to start background Logical Standby process
LSP0 started with pid=21, OS id=5041
LOGSTDBY status: ORA-16111: log mining and apply setting up
Tue Oct 16 15:14:46 2007
LOGMINER: Parameters summary for session# = 1
LOGMINER: Number of processes = 3, Transaction Chunk Size = 201
LOGMINER: Memory Size = 30M, Checkpoint interval = 150M
Tue Oct 16 15:14:46 2007
Completed: ALTER DATABASE START LOGICAL STANDBY APPLY
LOGMINER: session# = 1, builder process P001 started with pid=7 OS id=10018
LOGMINER: session# = 1, reader process P000 started with pid=34 OS id=10014
LOGMINER: session# = 1, preparer process P002 started with pid=36 OS id=10020
LSP2 started with pid=23, OS id=5043
Standby Catching Up
Tue Oct 16 15:14:48 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc
LOGSTDBY Analyzer process P003 started with pid=13 OS id=10051
Tue Oct 16 15:14:48 2007
LOGMINER: Turning ON Log Auto Delete
LOGSTDBY Apply process P004 started with pid=40 OS id=10054
LOGSTDBY Apply process P006 started with pid=42 OS id=10062
LOGSTDBY Apply process P007 started with pid=17 OS id=10064
LOGSTDBY Apply process P005 started with pid=15 OS id=10060
Tue Oct 16 15:22:02 2007
Beginning log switch checkpoint up to RBA [0x8.2.10], SCN: 8295181217591
Thread 1 advanced to log sequence 8
Current log# 4 seq# 8 mem# 0: /shared/oradata02/BRHBETA/redo04b.log
Current log# 4 seq# 8 mem# 1: /shared/oralogs01/BRHBETA/redo04a.log
Tue Oct 16 15:25:28 2007
Completed checkpoint up to RBA [0x8.2.10], SCN: 8295181217591
Tue Oct 16 15:34:32 2007
Incremental checkpoint up to RBA [0x8.4cbae.0], current log tail at RBA [0x8.65553.0]
Tue Oct 16 15:42:40 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc
Tue Oct 16 15:42:40 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc
...
Standby Catching Up
Processing redo logs
Deleting redo logs
Tue Oct 16 17:20:48 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc
Tue Oct 16 17:20:48 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2050_1_629245032.arc
Tue Oct 16 17:20:54 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2050_1_629245032.arc
...
Tue Oct 16 18:39:13 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2048_1_629245032.arc
...
Tue Oct 16 18:43:40 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2082_1_629245032.arc
Tue Oct 16 18:43:59 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2082_1_629245032.arc
Tue Oct 16 18:43:59 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2083_1_629245032.arc
Standby Catching Up
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2056_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2056_1_629245032.arc
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2057_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2057_1_629245032.arc
Tue Oct 16 18:44:01 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2058_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2058_1_629245032.arc
...
Deleting redo logs
Standby Catching Up
Standby is at 2087
Primary is at 2153
Tue Oct 16 18:44:15 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2087_1_629245032.arc
Tue Oct 16 18:48:37 2007
Completed checkpoint up to RBA [0xa.2.10], SCN: 8295181577382
Tue Oct 16 18:55:18 2007
Incremental checkpoint up to RBA [0xa.12dad.0], current log tail at RBA [0xa.1314b.0]
Tue Oct 16 19:01:31 2007
RFS[1]: No standby redo logfiles created
RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2153_1_629245032.arc'
Tue Oct 16 19:01:32 2007
RFS LogMiner: Registered logfile [/oraarch01/BRHBETA/LOG_2153_1_629245032.arc] to LogMiner session id [1]
Tue Oct 16 19:15:22 2007
Incremental checkpoint up to RBA [0xa.142b2.0], current log tail at RBA [0xa.143fe.0]
Tue Oct 16 19:29:01 2007
LSP0: warning -- apply server 2, sid 384 waiting on user sid 196 for event (since 0 seconds):
Tue Oct 16 19:29:01 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2087_1_629245032.arc
...
Standby Catching Up
Tue Oct 16 19:30:58 2007
LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19144.1192413665' FROM SPFILE =
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0003.02d.00013e70, hSCN 0x0782.a9c2fdb8, lSCN 0x0782.a9c2fdb8, Thread 1, RBA
LOGSTDBY stmt: create pfile='/orahome01/oradba/tmp/ora_adm_sqlbt_bkp.tmp1.17449.BRHBETA' from spfile
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x000b.001.000126cf, hSCN 0x0782.a9c2fe15, lSCN 0x0782.a9c2fe15, Thread 1, RBA
LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19695.1192413687' FROM SPFILE =
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0003.00c.00013e62, hSCN 0x0782.a9c2fe4a, lSCN 0x0782.a9c2fe4a, Thread 1, RBA
LOGSTDBY stmt: ALTER DATABASE BACKUP CONTROLFILE TO '/tmp/datatools/dtodump_
LOGSTDBY status: ORA-16226: DDL skipped due to lack of support
LOGSTDBY id: XID 0x0009.007.00011453, hSCN 0x0782.a9c2feb4, lSCN 0x0782.a9c2feb4, Thread 1, RBA
Tue Oct 16 19:30:58 2007
ALTER TABLESPACE "SYSTEM" BEGIN BACKUP
Completed: ALTER TABLESPACE "SYSTEM" BEGIN BACKUP
Tue Oct 16 19:30:58 2007
ALTER TABLESPACE "SYSTEM" END BACKUP
Completed: ALTER TABLESPACE "SYSTEM" END BACKUP
...
Unsupported DDL
Standby doesn't execute
Standby Catching Up
Standby catches up at 2158
Tue Oct 16 21:29:19 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc
Tue Oct 16 21:30:03 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc
Tue Oct 16 21:35:52 2007
Incremental checkpoint up to RBA [0xa.f41b7.0], current log tail at RBA [0xa.f41cc.0]
Tue Oct 16 21:55:56 2007
Incremental checkpoint up to RBA [0xa.f43b5.0], current log tail at RBA [0xa.f43b5.0]
Tue Oct 16 22:11:16 2007
RFS[1]: No standby redo logfiles created
RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2158_1_629245032.arc'
Tue Oct 16 22:11:16 2007
RFS LogMiner: Registered logfile [/oraarch01/BRHBETA/LOG_2158_1_629245032.arc] to LogMiner session id [1]
Tue Oct 16 22:11:16 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc
Tue Oct 16 22:11:20 2007
LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc
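The "Begin mining" and "End mining" messages in these alert log excerpts can be paired up to see how long each archived redo log takes to apply, which is a quick way to spot the one log that is holding up the catch-up. The following is a minimal sketch, assuming the alert log layout shown in these slides (a timestamp line followed by its message lines); the function name `mining_durations` is hypothetical.

```python
import re
from datetime import datetime

# Timestamp lines in the alert log look like "Tue Oct 16 21:29:19 2007".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
MINING = re.compile(r"LOGMINER: (Begin|End) mining logfile: (\S+)")


def mining_durations(lines):
    """Map each archived log path to the seconds between Begin and End mining."""
    current_ts = None
    started = {}
    durations = {}
    for raw in lines:
        line = raw.strip()
        try:
            # A line that parses as a timestamp applies to the lines after it.
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass
        match = MINING.search(line)
        if match and current_ts is not None:
            event, path = match.groups()
            if event == "Begin":
                started[path] = current_ts
            elif path in started:
                delta = current_ts - started.pop(path)
                durations[path] = delta.total_seconds()
    return durations
```

A log whose "Begin mining" message has no matching "End mining" yet is simply absent from the result, so the still-running log has to be found separately.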
How long will you need to store redo logs?
If standby frozen all day
Weekends? Holidays?
If standby fails
How many days to fix failures?
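The retention questions above reduce to simple arithmetic once the inputs are known. The following is a rough sketch under stated assumptions: the function names and every number in the usage line are hypothetical, and adding the freeze period to the repair time is a deliberately conservative choice (a failure noticed at the end of a long weekend), not something the slides prescribe.

```python
def retention_days(longest_frozen_days, days_to_fix_failure):
    """Days of archived redo to keep, assuming a failure can follow a freeze."""
    if longest_frozen_days < 0 or days_to_fix_failure < 0:
        raise ValueError("day counts must be non-negative")
    return longest_frozen_days + days_to_fix_failure


def archive_space_mb(logs_per_day, log_size_mb, days):
    """Disk space in MB needed to hold every archived redo log for `days`."""
    if logs_per_day < 0 or log_size_mb < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return logs_per_day * log_size_mb * days


# Hypothetical example: 3-day holiday weekend, 2 days to repair,
# 30 logs per day at 100 MB each.
needed_mb = archive_space_mb(30, 100, retention_days(3, 2))
```

Compressing older logs reduces the total, but the compressed size depends on the redo content, so it is left out of this sketch.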
If not on disk when needed for standby
Recover from backup
DataGuard may not see these redo logs
Register redo logs
Logical standby
Also generates its own archived redo logs
Needed to recover standby db
Unique standby db objects?
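Before recovering redo logs from backup and registering them, it helps to know exactly which sequence numbers are missing from the archive directory. The following is a minimal sketch, assuming file names of the form seen in these slides (`LOG_1956_1_629245032.arc`) and assuming the three numbers are sequence, thread, and resetlogs identifier, which the slides do not state explicitly; the function name `missing_sequences` is hypothetical.

```python
import re

# Assumed layout: LOG_<sequence>_<thread>_<resetlogs id>.arc
ARCHIVE_NAME = re.compile(r"LOG_(\d+)_(\d+)_(\d+)\.arc$")


def missing_sequences(filenames, first_needed, last_needed, thread=1):
    """Return the sequence numbers in the range that have no file on disk."""
    present = set()
    for name in filenames:
        match = ARCHIVE_NAME.search(name)
        if match and int(match.group(2)) == thread:
            present.add(int(match.group(1)))
    return [
        seq
        for seq in range(first_needed, last_needed + 1)
        if seq not in present
    ]
```

The file list can come from a directory listing of the archive destination; the range would be the gap reported in the trace file or alert log.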
SQL apply process applying redo log 2049
Doesn't move on within a few minutes
Current time is Tue Oct 16 08:09:55 2007
Shows start time for this redo log
TYPE         STATUS                                                                HIGH_SCN
-----------  --------------------------------------------------------------------  -------------
COORDINATOR  ORA-16116: no work available                                          8257767540953
READER       ORA-16127: stalled waiting for additional transactions to be applied  8257767541085
BUILDER      ORA-16127: stalled waiting for additional transactions to be applied  8257767540965
Compute Estimate
Tue Oct 16 08:09:55 -- Tue Oct 16 11:17:39
APPLIER has moved from 39247 to 39991
3 hours --> roughly 750 SCNs, 250 per hour
It still needs to go from 539991 to 754044
Over 200,000 SCNs -- at 250 per hour, this would take 800 hours --> 33 days
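The estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming a constant apply rate (the same simplification the slide makes) and assuming the slide's SCN figures are trailing digits of the full SCN, so that 39247 and 39991 correspond to 539247 and 539991; the function name `estimate_catch_up` is hypothetical.

```python
from datetime import datetime


def estimate_catch_up(t0, scn0, t1, scn1, target_scn):
    """Return (scns_per_hour, hours_remaining) assuming a constant apply rate."""
    elapsed_hours = (t1 - t0).total_seconds() / 3600.0
    if elapsed_hours <= 0 or scn1 <= scn0:
        raise ValueError("need two observations with forward progress")
    rate = (scn1 - scn0) / elapsed_hours
    return rate, (target_scn - scn1) / rate


# Observations from the slide; the leading digit 5 is an assumption.
rate, hours = estimate_catch_up(
    datetime(2007, 10, 16, 8, 9, 55), 539247,
    datetime(2007, 10, 16, 11, 17, 39), 539991,
    754044,
)
days = hours / 24
```

With the unrounded inputs this comes out near 238 SCNs per hour and about 37 days, slightly worse than the slide's rounded 250 per hour and 33 days; either way the conclusion is the same.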
39991 to 40857 in the last 4 hours, 866 SCNs, roughly in line with 250/hr we computed earlier
Assumes all SCNs take same amount of time
Compute estimate
Confirm that it will take a long time
Compare with business requirements for standby
Must be in synch once per day
SQL Apply Process restarts with redo 2048
Standby catches up quickly
Primary/Standby Interactions
Logical standby backup starts
Tablespaces put into backup mode
Contain transactions for primary backup
Tries to put tablespaces into backup mode
Wait for standby backup to finish
Restart apply process
Apply process runs longer than normal
Oscillating updates
Oracle docs explain this (I can't)
No Data Found
What does it mean?
No Data Found
Wed Sep 19 12:09:23 2007
LOGSTDBY stmt: update "PO"."PO_LINE_LOCATIONS_ALL" ...
...SQL, values ...
LOGSTDBY status: ORA-01403: no data found
LOGSTDBY id: XID 0x0008.01e.0000c437, hSCN 0x0789.eacde6c1, lSCN 0x0789.eacde6c1
LOGSTDBY Apply process P007 pid=29 OS id=3447 stopped
Wed Sep 19 12:09:23 2007
Errors in file /shared/orahome01/admin/BRHPRSB/bdump/brhprsb_lsp0_12386.trc:
ORA-12801: error signaled in parallel query server P004
ORA-01403: no data found
LOGSTDBY Analyzer process P003 pid=24 OS id=3439 stopped
LOGSTDBY Apply process P006 pid=27 OS id=3445 stopped
LOGSTDBY Apply process P005 pid=26 OS id=3443 stopped
No Data Found
What happened?
For some reason, table data is not the same on primary vs standby
Logical standby is read-write
SYS can change anything at any time
How to fix?
No Data Found
Logical Standby
No way to find out what happened
No utility to verify primary, standby in synch
Differences can exist for a long time
- Won't cause error until table updated on primary
Can you depend on this for your reports?
How do you know what is in the standby?
What has been skipped?
Standby db
Transactions came through to standby
Standby doesn't have java class files
Apply process fails
Identify and skip transaction(s)
ORA-07445 Errors
SR opened
Results
Known bug fixed in 11g
Apply patch on standby
Impact
None, no effect on standby
Apply patch?
No, refresh would wipe out patch
Don't want to patch primary db
- Primary doesn't have this error
ORA-07445 Errors
SQL Apply Process stops
Logical Standby is not for the faint of heart!
Tue Oct 16 21:27:50 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_p004_6577.trc:
ORA-07445: exception encountered: core dump [krvsmso()+1212] [SIGSEGV] [Address not mapped to object]
Tue Oct 16 21:29:06 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_5041.trc:
ORA-12805: parallel query server died unexpectedly
Tue Oct 16 21:29:06 2007
TLCR process death detected. Shutting down TLCR
logminer process death detected, exiting logical standby
LOGSTDBY Analyzer process P003 pid=13 OS id=10051 stopped
LOGSTDBY Apply process P005 pid=15 OS id=10060 stopped
LOGSTDBY Apply process P006 pid=42 OS id=10062 stopped
LOGSTDBY Apply process P007 pid=17 OS id=10064 stopped
Tue Oct 16 21:29:06 2007
LOGSTDBY status: ORA-16222: automatic Logical Standby retry of last action
LOGSTDBY status: ORA-16111: log mining and apply setting up
SQL Apply Process automatically restarts
ORA-07445 Errors
SQL Apply Process continues processing
Tue Oct 16 21:29:07 2007
LOGMINER: Parameters summary for session# = 1
LOGMINER: Number of processes = 3, Transaction Chunk Size = 201
LOGMINER: Memory Size = 30M, Checkpoint interval = 150M
LOGMINER: session# = 1, builder process P001 started with pid=7 OS id=10018
LOGMINER: session# = 1, reader process P000 started with pid=34 OS id=10014
LOGMINER: session# = 1, preparer process P002 started with pid=36 OS id=10020
Tue Oct 16 21:29:10 2007
LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc
Tue Oct 16 21:29:10 2007
LOGMINER: Turning ON Log Auto Delete
LOGSTDBY Analyzer process P003 started with pid=13 OS id=10051
LOGSTDBY Apply process P006 started with pid=42 OS id=10062
LOGSTDBY Apply process P004 started with pid=30 OS id=10219
LOGSTDBY Apply process P005 started with pid=15 OS id=10060
LOGSTDBY Apply process P007 started with pid=17 OS id=10064
Refresh Process
Export unique standby db objects
Scripts to recreate
Create standby control file
Use standby control file
Backup primary db
Create physical standby
Convert to logical standby
Import unique standby db objects
Unsupported Record
ORA-16211
Unsupported Record
Thu Oct 11 10:11:58 2007
LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2005_1_629245032.arc
Deleted file /oraarch01/BRHBETA/LOG_2005_1_629245032.arc
Thu Oct 11 10:15:55 2007
** LOGMINER WARNING - Invalidated 4 LCRs **
Thu Oct 11 10:20:29 2007
LOGSTDBY stmt: "BRH"."XXSUN_INV_ITEMS_INT": unsupported
LOGSTDBY status: ORA-16211: unsupported record found in the archived redo log
ORA-06512: at "SYS.DBMS_INTERNAL_LOGSTDBY", line 4717
ORA-06512: at line 1
LOGSTDBY id: XID 0x0009.02e.0001127d, hSCN 0x0782.a9016545, lSCN 0x0782.a9016545, Thread 1
LOGSTDBY Apply process P007 pid=23 OS id=16578 stopped
Thu Oct 11 10:20:29 2007
Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_13625.trc:
ORA-12801: error signaled in parallel query server P007
ORA-16211: unsupported record found in the archived redo log
LOGSTDBY Analyzer process P003 pid=19 OS id=16570 stopped
LOGSTDBY Apply process P005 pid=21 OS id=16574 stopped
LOGSTDBY Apply process P006 pid=36 OS id=16576 stopped
LOGSTDBY Apply process P004 pid=34 OS id=16572 stopped
Unsupported Record
What causes this?
Is this a standby?
Execute utlrp.sql
2 hours go by
Not much changed
Alter session disable guard
Alter database guard standby;
Recompile runs in 2 minutes
Alter session enable guard
Perhaps guard enabled is the problem
Guard level is the problem (all vs standby)
Alter session disable guard;
Unique db objects exported before refresh
Must be imported after refresh
Alter database guard standby;
Conclusion
Logical standby
Lots of errors
Many require refreshing standby
Lots of DBA support needed
Conclusion
Physical standby
- Is solid, dependable
- No issues
Logical standby
- Is it really a standby?
- Is it ready for failover?
- Is it providing complete data for reports?
- Lots of issues
- Is it worth the effort/risk?