
http://dba-expert.blogspot.in/search?updated-min=2009-01-01T00:00:00%2B04:00&updated-max=2010-01-01T00:00:00%2B04:00&max-results=15

ASM Automatic Storage Management


File Systems:

Disadvantages of Raw Devices:


1. A raw device supports storage of only one file per device. Hence archived redo log files &
flashback logs, which are generated in large numbers, are not suitable candidates for raw devices.
2. General O/S commands like cp, ls, mv, du etc. will not work on raw devices.
3. Only dd (disk dump) can be used to format, back up and restore raw devices.
4. Raw devices do not support collection of I/O statistics.
5. They cannot be resized online.
6. In a Linux environment, out of 15 partitions we can use only 14 for creation of raw
devices; in Solaris we can use only 6 out of 7 partitions per disk.

To overcome all these disadvantages we use an LVM (Logical Volume Manager).

1. It is a logical storage area created from a collection of multiple disk partitions, onto
which we can create any type of file system.
2. It supports storage of multiple files in a single volume.
3. Online resizing is possible.
4. Supports collection of I/O statistics.
5. It improves I/O performance & availability with the help of software-level RAID
techniques.
Types of LVMs & Vendors:
LVMs and their vendors:
1. VERITAS Volume Manager - Symantec
2. Tivoli Volume Manager - IBM
3. Sun Volume Manager (SVM) - Sun
4. ASM (from Oracle 10g) - Oracle

ASM:
It is a type of LVM supported from Oracle 10g and has a special type of instance,
INSTANCE_TYPE=ASM, with a small SGA footprint of around 100-128 MB.
It supports creation of logical volumes known as disk groups, and internally uses both
striping and mirroring.
It does not have a control file to mount, so its last (and highest) stage is nomount;
instead of mounting a control file, it mounts the disk groups.
A diskgroup is a logical storage area which is created from a collection of multiple disk
partitions.
ASM supports storage of multiple database-related files like control files, redo logs, data files,
archived logs, flashback logs, RMAN backup pieces, spfile etc., but it will not support the
storage of static files like pfile, listener.ora, tnsnames.ora, sqlnet.ora etc.
From 11.2 onwards, by using ADVM (ASM Dynamic Volume Manager) & ACFS (ASM
Cluster File System) we can store static files also.

Note: Sometimes ASM instance may contain large pool also.


One ASM instance supports creation of multiple disk groups and provides services to
multiple clients.
ASM Clients: These are the regular DB instances which depend on the ASM instance in
order to access the diskgroups.

ASM Instance Background processes:


RBAL (Rebalance Master): It is responsible for managing and coordinating the disk group
activities and for generating the plans for even distribution of ASM extents for better load
balancing whenever a disk is added or removed.
ARBn (ASM Rebalancer): It is a slave process of the RBAL background process and is
responsible for the actual rebalancing of ASM extents across the disks.
ASMB (ASM Background): It is responsible for successful establishment of the communication
channel between the ASM instance & its ASM clients.
GMON (Group Monitor): It is responsible for coordinating the disk group activities
whenever a disk group goes offline or is dropped.
KATE (Konductor of ASM Temporary Errands): It is responsible for bringing disks and
disk groups back online.
ASM client background processes:
RBAL (Rebalance Master): On the client side it is responsible for opening and closing the
diskgroups whenever read or write operations occur.
PZ9X: It is responsible for gathering dynamic view information globally across all the
instances of the database.

ASM-related dynamic views:

In a RAC environment, all the dynamic views start with gv$; in non-RAC they start with v$.
1. gv$asm_disk
2. gv$asm_diskgroup
3. gv$asm_disk_iostat
4. gv$asm_client
5. gv$asm_template
(around 19 ASM-related views in total)
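For example (a minimal sketch, assuming at least one disk group has already been created), space usage and connected clients can be checked from the ASM instance:

SQL> select name, state, type, total_mb, free_mb from v$asm_diskgroup;
SQL> select group_number, instance_name, db_name, status from v$asm_client;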

ASM in RAC Environment:

Working on Datapump Export


Create a directory at O/S Level

$ mkdir dump_dir (for dump directory)


$ mkdir log_dir (for log directory)
$ chmod 775 dump_dir log_dir
Connect sqlplus and execute:
Sql> Create directory datapump_dir as '/u01/dump_dir';
Sql> Create directory datapump_log as '/u01/log_dir';
Sql> grant read,write on directory datapump_dir to public; -- to take expdp for any schema
Sql> grant read,write on directory datapump_log to public;
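To confirm the directory objects and their paths (a quick check, not part of the original steps), query the data dictionary:

SQL> select directory_name, directory_path from dba_directories;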
$ more expdp.sh
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="abc"

echo export started at `date` >> /u05/abc/export/dailyexpdp_abc.log


$ORACLE_HOME/bin/expdp system/password dumpfile=datapump_dir:abc-expdp-`date '+
%Y%m%d'`.dmp logfile=datapump_log:abc-expdp-`date '+%Y%m%d'`.log schemas=aet3
echo export stopped at `date` >> /u05/abc/export/dailyexpdp_abc.log
echo tape archiving started at `date` >> /u05/abc/export/dailyexpdp_abc.log
tar Ecvf /dev/rmt/0 /u05/abc/export
echo tape archiving stopped at `date` >> /u05/abc/export/dailyexpdp_abc.log

$ crontab -l
50 23 * * 0,1,2,3,4,5,6 /u06/abc/scripts/expdp.sh
It will generate dumpfile datewise.
Enable Crontab
Giving Crontab permission to new os user
Login as root
Goto /etc/cron.d directory
Add the user entry (ex. ota ) in cron.allow file
Note: if the cron.allow file doesn't exist then create it
Login as o/s user
Set the editor in .profile or .bash_profile
Example: EDITOR=vi;export EDITOR
Now you can schedule cron jobs.
To setup cronjob
crontab -l( list current jobs in cron)
crontab -e ( edit current jobs in cron)
_1_ _2_ _3_ _4_ _5_ executable_or_job
Where
1 Minutes (0-59)
2 Hours ( 0-23)
3 day of month ( 1- 31 )
4 Month ( 1-12)
5 A day of week ( 0- 6 ) 0 -> sunday 1-> monday
e.g. 0 3 * * 6 Means run job at 3AM every saturday
This is useful for scheduling tablespace threshold, ftp, rman backup or removed old log files,
or other scripts regularly.
Sample Scheduled backup:
$ crontab -l

OTA Database:
50 23 * * 0,2,3,6       /u01/ota/dailyexp_ota.sh
50 23 * * 1,4           /u01/ota/offbkup_ota.sh
15 14 * * 0,1,2,3,4,6   /u01/ota/morning_arch.sh

Upgrade Oracle from 10.2.0.1 To 10.2.0.4 on Linux x86 AS4


Screen shots below attached for production upgrade on solaris 64-BIT
Download 6810189 [p6810189_10204_Linux-x86]
$ unzip p6810189_10204_Linux-x86.zip
Shut down all the databases / listeners / services / Enterprise Manager
Backup your database
Start Patching:
cd patchset_directory/Disk1
./runInstaller
OUI starts and patch gets installed; When prompted, run the $ORACLE_HOME/root.sh script
as the root user.
Upgrading a Release 10.2 Database using Oracle Database Upgrade Assistant
After you install the patch set, you must perform the following steps on every database:

If you do not run the Oracle Database Upgrade Assistant then the
following errors are displayed:
ORA-01092: ORACLE instance terminated.
ORA-39700: database must be opened with UPGRADE option.

Start the listener as follows:


$ lsnrctl start
Run Oracle Database Upgrade Assistant

$ dbua
Complete the following steps displayed in the Oracle DBUA
On the Welcome screen, click Next.
On the Databases screen, select the name of the Oracle Database that you want to
update, then click Next.
On the Recompile Invalid Objects screen, select the Recompile the invalid objects at
the end of upgrade option, then click Next.
If you have not taken a backup of the database earlier, on the Backup screen, select
the option to let this tool back up the database.
On the Summary screen, check the summary, and then click Finish.

Upgrade the Database Manually


After you install the patch set, you must perform the following steps on every database
associated with the upgraded Oracle home:
Start Listener
Connect as sys user
Sql> Startup Upgrade
Sql> Spool patch.log
Sql> @$ORACLE_HOME/rdbms/admin/catupgrd.sql
Sql> Spool Off
Review the patch.log file for errors and inspect the list of components that is displayed at
the end of catupgrd.sql script.
This list provides the version and status of each SERVER component in the database.
If necessary, rerun the catupgrd.sql script after correcting any problems.
4. Restart the database:
Sql> Shutdown
Sql> Startup
5. Compile Invalid Objects
Run the utlrp.sql script to recompile all invalid PL/SQL packages now instead of when the
packages are accessed for the first time. This step is optional but recommended.

SQL> @$ORACLE_HOME/rdbms/admin/utlrp.sql
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Release 10.2.0.4.0 - Production
PL/SQL Release 10.2.0.4.0 - Production
CORE 10.2.0.4.0 Production
TNS for 32-bit Windows: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
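As an additional post-upgrade check (using standard dictionary views; not part of the original steps), confirm component versions and remaining invalid objects:

SQL> select comp_name, version, status from dba_registry;
SQL> select count(*) from dba_objects where status = 'INVALID';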

----Screenshots----

LOG APPLY SERVICES (LAS)

LOG APPLY SERVICE:

i. Applying redo immediately
ii. Time delay for redo apply

Applying redo data to a Physical Standby Database:
1. Start redo apply
2. Stop redo apply
3. Monitor redo apply

Applying redo data to a Logical Standby Database:
iii. Start SQL apply
iv. Stop SQL apply
v. Monitor SQL apply
LOG APPLY SERVICES (LAS): the process is automatic.
1. Redo Apply (Physical Standby Database only)
Uses media recovery to keep the Primary Database & Standby Database synchronized.
The standby is kept in the mounted state & can be opened read-only for reporting.
2. SQL Apply (Logical Standby Database only)
Reconstructs SQL statements from the redo data received from the Primary Database &
applies them to the Logical Standby Database.
Can be opened in R/W mode.
On the Standby Database, the redo transport process receives the redo data and writes it
to standby redo log files or archived redo log files.
RFS - Remote File Server process
MRP - Managed Recovery Process (performs recovery, i.e. applies the redo data)
LSP - Logical Standby Process

FIG 6-1: Oracle Dataguard (B14239-04)


1. Applying Redo Data Immediately (Real-Time Apply):
In this process the redo data is applied immediately as it is received, without waiting for the
current standby redo log file to be archived.
Enabling real-time apply for a Physical Standby Database:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE;
Enabling real-time apply for a Logical Standby Database:
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
2. Specifying a time delay for applying redo logs:
Parameter used: LOG_ARCHIVE_DEST_n
Attribute: DELAY
Unit: minutes
Default value: 30 minutes
DELAY is used to protect the Standby Database from having corrupted data applied to it.
The delay interval starts after the redo is received and completely archived.
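Ex (a sketch, reusing the boston destination from the earlier examples; 240 minutes is only an illustration):

SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston DELAY=240' SCOPE=BOTH;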

If real-time apply is enabled & a delay is specified, then the delay is ignored.

The delay can be cancelled using NODELAY.
Ex:
Physical Standby Database
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY;
Logical Standby Database
SQL> ALTER DATABASE START LOGICAL STANDBY APPLY NODELAY;
Alternate option for delaying:
Using flash back on standby database.
Applying Redo data to a Physical Standby Database:
By default redo is always applied from the archived redo logs of the standby DB.
In case of real-time apply, redo is applied directly from the standby redo log files before they are
archived.
Redo data cannot be applied if the Physical Standby Database is open in read-only mode.
Therefore start the Physical Standby Database and keep it in the mounted state to apply
the redo.
Applying redo as a foreground process (control is not returned):
Sql> Alter database recover managed standby database;
Applying redo as a background process (control is returned):
Sql> Alter database recover managed standby database disconnect;
Using Real-time Apply:
Sql> Alter database recover managed standby database using current logfile;
Cancel Real-time Apply:
Sql> Alter database recover managed standby database cancel;
Monitoring:
Use OEM for monitoring log apply services.
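Apart from OEM, the apply processes can also be checked from SQL*Plus on the standby (a minimal sketch using a standard view):

SQL> select process, status, thread#, sequence#, block# from v$managed_standby;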

Applying redo data to Logical Standby Database:


SQL Apply converts the data from archived redo log files or standby redo log files on the Logical
Standby Database into SQL statements, and then these SQL statements are applied to the Logical
Standby Database.
The Logical Standby Database always remains open, as the SQL statements have to be executed.
Used for reporting, summations and querying purposes.
Starting SQL apply:
SQL> ALTER DATABASE START LOGICAL STANDBY APPLY;
Real-time:
SQL> ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;
Stopping:
SQL> ALTER DATABASE STOP LOGICAL STANDBY APPLY;
Note: This command is delayed, as SQL Apply will wait to apply all the committed transactions.
For stopping immediately use:
SQL> ALTER DATABASE ABORT LOGICAL STANDBY APPLY;
Monitoring:
Use OEM for monitoring log apply services.
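SQL Apply progress can also be checked from the dictionary on the logical standby (a sketch; columns as per the 10g documentation):

SQL> select applied_scn, newest_scn from dba_logstdby_progress;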
REDO TRANSPORT SERVICES (RTS)
Redo Transport Service:
Automates the transfer of redo data to one or more destinations.
Resolves gaps in redo transport in case of network failures.

FIG 5-1: Oracle Dataguard (B14239-04)


Destinations types for Redo Transport Service:
Oracle Data Guard Standby Database
Archived redo log repository
Oracle Streams real-time downstream capture database
Oracle Change Data Capture staging database
LOG_ARCHIVE_DEST_n parameter:
Maximum number of destinations: 10
Up to 9 Standby Databases can be configured (one primary plus up to nine standbys per configuration).
Attributes:
LOCATION= specifies a local destination
SERVICE= specifies a remote destination
LOG_ARCHIVE_DEST_n is used along with the LOG_ARCHIVE_DEST_STATE_n parameter.
Attributes of the LOG_ARCHIVE_DEST_STATE_n parameter:

ENABLE: Redo transport services can transmit redo data to this destination. This is the default.

DEFER: This is a valid but unused destination (redo transport services will not transmit redo data to this destination).

ALTERNATE: This destination is not enabled, but it will become enabled if communication to its associated destination fails.

RESET: Functions the same as DEFER.

Example 5-1: Specifying a Local Archiving Destination

LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago/'
LOG_ARCHIVE_DEST_STATE_1=ENABLE

Example 5-2: Specifying a Remote Archiving Destination

LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago/'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_2='SERVICE=boston'
LOG_ARCHIVE_DEST_STATE_2=ENABLE
We can change the destination attributes:
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=boston
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)';
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;
(This command defers the Redo Transport Service.)
The modifications take effect after the next log switch on the primary database.
The parameter for configuring the Flash Recovery Area is DB_RECOVERY_FILE_DEST = /
If no destination for local archiving is specified, then LOG_ARCHIVE_DEST_10 is
implicitly mapped to the DB_RECOVERY_FILE_DEST location by Oracle Data Guard.
A Primary Database cannot write redo data to the Flash Recovery Area of a Logical
Standby Database.
Note: The Flash Recovery Area is the directory that stores the files related to recovery.
To map the Flash Recovery Area to a destination other than LOG_ARCHIVE_DEST_10, use:
LOG_ARCHIVE_DEST_9='LOCATION=USE_DB_RECOVERY_FILE_DEST ARCH MANDATORY
REOPEN=5'
Specifying the Flash Recovery Area on a Physical Standby Database:
STANDBY_ARCHIVE_DEST='LOCATION=USE_DB_RECOVERY_FILE_DEST'
Sharing a Flash Recovery Area between Physical Standby Database and Primary Database
DB_UNIQUE_NAME should be specified to each database and it should have a unique name.

Example 5-3: Primary Database Initialization Parameters for a Shared Recovery Area


DB_NAME=PAYROLL
LOG_ARCHIVE_DEST_1=LOCATION=USE_DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST=/arch/oradata
DB_RECOVERY_FILE_DEST_SIZE=20G
Example 5-4: Standby Database Initialization Parameters for a Shared Recovery Area
DB_NAME=PAYROLL
DB_UNIQUE_NAME=boston
LOG_ARCHIVE_DEST_1=LOCATION=USE_DB_RECOVERY_FILE_DEST
STANDBY_ARCHIVE_DEST=LOCATION=USE_DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST=/arch/oradata
DB_RECOVERY_FILE_DEST_SIZE=5G
Sending Redo:
Redo can be transmitted by the archiver process (ARCn) or the log writer process (LGWR), but both
cannot be used for the same destination, i.e. ARCn can send redo to one destination and
LGWR to another.
Using ARCn to send redo:
This is the default method & 4 ARCn processes are used by default.
Supports only the Maximum Performance level of data protection.
Specify the LOCATION attribute for local archiving and the SERVICE attribute for remote
archiving.
Ex:
LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago/'
LOG_ARCHIVE_DEST_2='SERVICE=boston'
Another parameter:
LOG_ARCHIVE_MAX_PROCESSES (dynamic parameter; the maximum is 30 processes)
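Since it is dynamic, the number of archiver processes can be raised without a restart, for example (a sketch; the value 6 is only an illustration):

SQL> ALTER SYSTEM SET LOG_ARCHIVE_MAX_PROCESSES=6;
SQL> select process, status from v$archive_processes where status <> 'STOPPED';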
Archival Processing:

FIG 5-3: Oracle Dataguard (B14239-04)


Note: use v$archived_log to verify that the redo data has been received on the Standby Database.
A minimum of 2 archiver processes are required; the default is 4 & the maximum is 30.
RFS: On the remote destination, the remote file server process (RFS) will, in turn, write the
redo data to an archived redo log file from a standby redo log file. Log apply services use
Redo Apply (MRP process1) or SQL Apply (LSP process2) to apply the redo to the standby
database.
MRP: The managed recovery process applies archived redo log files to the physical standby
database, and automatically determines the optimal number of parallel recovery processes
at the time it starts. The number of parallel recovery slaves spawned is based on the
number of CPUs available on the standby server.
LSP: The logical standby process uses parallel execution (Pnnn) processes to apply archived
redo log files to the logical standby database, using SQL interfaces.
Using LGWR to Send Redo
- LGWR SYNC
- LGWR ASYNC
LGWR SYNC archival processing:
Parameter: LOG_ARCHIVE_DEST_n
Attributes: LGWR, SYNC,SERVICE

Example 5-5: Initialization Parameters for LGWR Synchronous Archival


LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR SYNC NET_TIMEOUT=30'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
SYNC: Network I/O is synchronous (default)
Waits until each write operation is completed
Note: if LGWR process does not work for some reason then redo transport will automatically
shift to ARCn process.
NET_TIMEOUT: waits the specified number of seconds for the network write operation and gives
an error if it does not complete.
LGWR ASYNC archival processing:
Ex: same as above, but without the SYNC & NET_TIMEOUT attributes;
use ASYNC instead of SYNC.
NET_TIMEOUT is not needed with ASYNC in 10.2.
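Ex (a sketch based on Example 5-5, with the same illustrative boston destination switched to asynchronous shipping):

LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE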
Diagram showing SYNC & ASYNC LGWR archival process:

FIG 5-4: Oracle Dataguard (B14239-04)

FIG 5-5: Oracle Dataguard (B14239-04)


Note: LOG_ARCHIVE_DEST & LOG_ARCHIVE_DUPLEX_DEST should not be used for
configuring the Flash Recovery Area.
Providing security while transmitting redo:
$ orapwd file=orapw password=xyz entries=10
Note: Make the SYS user password identical for all DBs in the Oracle Data Guard configuration. Also set
remote_login_passwordfile=EXCLUSIVE (or SHARED).
VALID_FOR attribute of the LOG_ARCHIVE_DEST_n parameter:
VALID_FOR=(redo_log_type,database_role)
redo_log_type: ONLINE_LOGFILE, STANDBY_LOGFILE, or ALL_LOGFILES
database_role: PRIMARY_ROLE, STANDBY_ROLE, or ALL_ROLES
The VALID_FOR attribute is required for role transitions:
- it configures the destination attributes for both the Primary Database and the Standby Database in one
SPFILE
- if VALID_FOR is not used, then we need to use two spfiles each time we do a role
transition
- this attribute makes switchover and failover easy.
Ex:

LOG_ARCHIVE_DEST_1='LOCATION=/ARCH1/CHICAGO/
VALID_FOR=(ALL_LOGFILES,ALL_ROLES)'
DB_UNIQUE_NAME: specifies a unique database name in the Oracle Data Guard configuration.
Used along with LOG_ARCHIVE_CONFIG.
Ex:
DB_NAME=chicago
DB_UNIQUE_NAME=chicago
LOG_ARCHIVE_CONFIG='DG_CONFIG=(chicago,boston)'
LOG_ARCHIVE_DEST_1='LOCATION=/arch1/chicago/ VALID_FOR=(ALL_LOGFILES,ALL_ROLES)'
LOG_ARCHIVE_DEST_2='SERVICE=boston LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
DB_UNIQUE_NAME=boston'
The LOG_ARCHIVE_CONFIG parameter also has SEND, NOSEND, RECEIVE, and
NORECEIVE attributes:
- SEND enables a database to send redo data to remote destinations
- RECEIVE enables the standby database to receive redo from another database
To disable these settings, use the NOSEND and NORECEIVE keywords.
Ex: LOG_ARCHIVE_CONFIG='NORECEIVE, DG_CONFIG=(chicago,boston)'
Use of these attributes can affect role transitions. Therefore try to remove them before
doing any role transitions.
Handling errors while transmitting redo:
Options when archiving fails:
Retry the archival operation (and control the number of retry attempts)
Use an alternate destination
Ex: LOG_ARCHIVE_DEST_1='LOCATION=/arc_dest REOPEN=60 MAX_FAILURE=3'
Attributes used:
REOPEN: default value is 300 seconds; 0 turns this option off
MAX_FAILURE: maximum number of failures before the destination is deferred
ALTERNATE: alternate destination

Note: ALTERNATE takes precedence over the MANDATORY attribute, i.e. even if the archiving
destination is mandatory and it fails, archiving automatically moves to the alternate
destination.
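Ex (a sketch; the second destination and its path are illustrative, not from the original note):

LOG_ARCHIVE_DEST_1='LOCATION=/arc_dest MANDATORY REOPEN=60 ALTERNATE=LOG_ARCHIVE_DEST_2'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_2='LOCATION=/arc_dest_alt'
LOG_ARCHIVE_DEST_STATE_2=ALTERNATE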
DATA PROTECTION MODES:

MAXIMUM PROTECTION
- Provides the highest level of data protection (no data loss if the Primary Database fails)
- Redo data needed for recovery has to be written both to the online redo log files and to the
standby redo log files before commit
- At least one Standby Database should be available
- If any fault happens, the Primary Database will shut down
- Configure the LGWR, SYNC & AFFIRM attributes of the LOG_ARCHIVE_DEST_n
parameter for at least one Standby Database

MAXIMUM AVAILABILITY
- Provides the highest level of data protection without compromising the availability of the
Primary Database
- The Primary Database does not shut down; it continues to work in maximum performance
mode until the fault is corrected
- Once all gaps in the redo log files are resolved, it goes back to maximum availability mode
- At least one Standby Database should be available
- Use network links with sufficient bandwidth to get maximum availability with minimal
impact on the performance of the Primary Database
- Configure the LGWR, SYNC & AFFIRM attributes of the LOG_ARCHIVE_DEST_n
parameter for at least one Standby Database

MAXIMUM PERFORMANCE (default)
- Does not affect performance
- The transaction is committed as soon as the redo data is written to the online redo log file
- Redo is also written to at least one Standby Database asynchronously

Setting the Data Protection Mode of a Data Guard Configuration

At least one standby destination should meet the following minimum requirements:

Requirement                  MAXIMUM PROTECTION   MAXIMUM AVAILABILITY   MAXIMUM PERFORMANCE
Redo archival process        LGWR                 LGWR                   LGWR or ARCH
Network transmission mode    SYNC                 SYNC                   SYNC or ASYNC when using LGWR; SYNC if using ARCH
Disk write option            AFFIRM               AFFIRM                 AFFIRM or NOAFFIRM
Standby redo log required?   YES                  YES                    No, but recommended

Note: Oracle recommends that an Oracle Data Guard configuration running in
maximum protection mode contain at least two Standby Databases meeting the above
requirements, so that the Primary Database can continue processing without shutting down if
one of the Standby Databases cannot receive redo data from the Primary Database.
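The mode itself is changed on the primary (with the database mounted) and can then be verified; a minimal sketch for maximum availability:

SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
SQL> select protection_mode, protection_level from v$database;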
Managing log files:
1. Specify an alternate directory for archived redo logs.
- Redo received from the Primary Database is placed in the location given by the LOCATION attribute of the
parameter LOG_ARCHIVE_DEST_n.
- An alternate directory can be specified by using the parameter STANDBY_ARCHIVE_DEST.
- If both parameters are specified, then STANDBY_ARCHIVE_DEST overrides the
LOG_ARCHIVE_DEST_n parameter.
- Query v$archive_dest to check the value of the STANDBY_ARCHIVE_DEST parameter:
SQL> SELECT DEST_NAME, DESTINATION FROM V$ARCHIVE_DEST
WHERE DEST_NAME='STANDBY_ARCHIVE_DEST';
- File names are generated in the format specified by LOG_ARCHIVE_FORMAT, e.g. log%t_%s_%r.arc
Note: The Redo Transport Service stores the fully qualified file names in the
Standby Database control file, and redo apply uses this information to perform recovery.
- View v$archived_log to check the archived redo log files that are on the standby system:
SQL> SELECT NAME FROM V$ARCHIVED_LOG;
2. Reusing online redo log files:
For reusing the online redo log files we have to set the OPTIONAL or MANDATORY option with the
LOG_ARCHIVE_DEST_n parameter.
Ex: LOG_ARCHIVE_DEST_3='LOCATION=/arch_dest MANDATORY'
Note: By default, remote destinations are OPTIONAL.
By default, one local destination is MANDATORY.
If MANDATORY is specified, the online redo log file is not overwritten until it has been
successfully archived to that destination.
If OPTIONAL is specified, the online redo log file can be overwritten even if archiving to that
destination failed.
3. Managing standby redo log files:
Check the RFS process trace file or the database alert log file to determine whether we have adequate
standby redo log files or not.
i.e. if these files indicate that the RFS process has to wait frequently for a group because archiving is not
getting completed, then add more log file groups to the standby redo log.
Note: whenever an online redo log file group is added to the Primary Database, we must add a
corresponding standby redo log file group to the Standby Database.
If the number of standby redo log file groups is inadequate, the Primary Database will
shut down if it is in maximum protection mode, or switch to maximum performance mode if it
is in maximum availability mode.
Ex: Adding a member to a standby redo log group:
Sql> Alter database add standby logfile member '/disk1/oracle/dbs/log2b.rdo' to group 2;
4. Planning for growth & reuse of the control file:
The maximum control file size is 20,000 database blocks.

If the block size is 8K (8192 bytes), then the maximum control file size will be about 156 MB.
As archived redo logs are generated or RMAN backups are taken, records are
added to the control file. If the control file reaches its maximum size then these records are
reused.
The parameter used to specify how long control file records are kept is
CONTROL_FILE_RECORD_KEEP_TIME; its value ranges from 0-365 days (default is 7 days).
Note: Keep the CONTROL_FILE_RECORD_KEEP_TIME value at least as long as the last 2 full
backup periods.
If redo is planned to be applied with a delay, then set this value to a larger number of days.
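A quick sketch of checking and adjusting it (the 14-day value is only an illustration):

SQL> show parameter control_file_record_keep_time
SQL> ALTER SYSTEM SET control_file_record_keep_time=14 SCOPE=BOTH;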
5. Sharing a log file destination among multiple Standby Databases:
Ex:
LOG_ARCHIVE_DEST_1='LOCATION=disk1 MANDATORY'
LOG_ARCHIVE_DEST_2='SERVICE=standby1 OPTIONAL'
LOG_ARCHIVE_DEST_3='SERVICE=standby2 OPTIONAL DEPENDENCY=LOG_ARCHIVE_DEST_2'
In this case the DEPENDENCY attribute is set on the second standby DB, which takes the redo data
from LOG_ARCHIVE_DEST_2.
This kind of setup can be used if:
The Primary Database & Standby Database reside on the same system.
A Physical Standby Database & Logical Standby Database reside on the same system.
When a clustered file system is used.
When a network file system is used.
MANAGING ARCHIVE GAPS:
Oracle Data Guard resolves gaps automatically.
Gaps can happen due to network failures or archiving problems on the Primary Database.
The Primary Database polls the Standby Database every minute to detect gaps (polling
mechanism).
In case the Primary Database is not available, we have to resolve the gaps manually by
applying redo from one of the Standby Databases.
No extra configuration is required to resolve gaps automatically.
1. Using FAL (fetch archive log mechanism) to resolve gaps:

Set the parameters FAL_SERVER and FAL_CLIENT to net service names, e.g.:

FAL_SERVER=standby2_db, standby3_db
FAL_CLIENT=standby1_db
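Both parameters are dynamic, so they can be set without a restart; a sketch using the service names above:

SQL> ALTER SYSTEM SET FAL_SERVER='standby2_db,standby3_db';
SQL> ALTER SYSTEM SET FAL_CLIENT='standby1_db';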
2. Manually resolving archive gaps:
We have to resolve the gaps manually if the Primary Database is not available and if we are
using a Logical Standby Database (case 1); this is also applicable in some other cases.
Resolving gaps on a Physical Standby Database:

1. Query the gap on the Physical Standby Database:

SQL> SELECT * FROM V$ARCHIVE_GAP;

THREAD#   LOW_SEQUENCE#   HIGH_SEQUENCE#
-------   -------------   --------------
      1               7               10

2. Find the missing logs on the Primary Database and copy them to the Physical Standby
Database:

SQL> SELECT NAME FROM V$ARCHIVED_LOG WHERE THREAD#=1 AND DEST_ID=1 AND
SEQUENCE# BETWEEN 7 AND 10;
NAME
-------------------------------------------------------------------
/primary/thread1_dest/arcr_1_7.arc
/primary/thread1_dest/arcr_1_8.arc
/primary/thread1_dest/arcr_1_9.arc

3. Once these log files are copied to the Physical Standby Database, register them with the
Physical Standby Database:

SQL> ALTER DATABASE REGISTER LOGFILE
'/physical_standby1/thread1_dest/arcr_1_7.arc';
SQL> ALTER DATABASE REGISTER LOGFILE
'/physical_standby1/thread1_dest/arcr_1_8.arc';

4. Restart Redo Apply.

Resolving gaps on a Logical Standby Database:

Same procedure as for a Physical Standby Database, but the view used is DBA_LOGSTDBY_LOG
instead of V$ARCHIVE_GAP.
Steps:
a. Find the gap:
SQL> COLUMN FILE_NAME FORMAT a55
SQL> SELECT THREAD#, SEQUENCE#, FILE_NAME FROM DBA_LOGSTDBY_LOG L
WHERE NEXT_CHANGE# NOT IN
(SELECT FIRST_CHANGE# FROM DBA_LOGSTDBY_LOG WHERE L.THREAD# = THREAD#)
ORDER BY THREAD#,SEQUENCE#;
THREAD#  SEQUENCE#  FILE_NAME
-------  ---------  ----------------------------------------
      1          6  /disk1/oracle/dbs/log-1292880008_6.arc
      1         10  /disk1/oracle/dbs/log-1292880008_10.arc
Note: If there is a gap then only one file is shown for each thread; otherwise it shows two
files for each thread.
In the above example the missing files are 7, 8 and 9.
b. Copy these files to the Logical Standby Database location.
c. Register these files with the Logical Standby Database:
SQL> ALTER DATABASE REGISTER LOGICAL LOGFILE 'file_name';
d. Restart SQL Apply.
Verification:
1. Check the status of the online redo log files on the Primary Database:
SQL> SELECT THREAD#, SEQUENCE#, ARCHIVED, STATUS FROM V$LOG;
2. Determine the most recent archived redo log file on the Primary Database:
SQL> SELECT MAX(SEQUENCE#), THREAD# FROM V$ARCHIVED_LOG GROUP BY
THREAD#;
3. Use the following query on the Primary Database to check the most recently
transmitted archived redo log file for each destination:
SQL> SELECT DESTINATION, STATUS, ARCHIVED_THREAD#, ARCHIVED_SEQ#
FROM V$ARCHIVE_DEST_STATUS
WHERE STATUS <> 'DEFERRED' AND STATUS <> 'INACTIVE';
4. Use the following query on the Primary Database to find the archived redo log files not
received at each destination:
SQL> SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM
(SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1)
LOCAL WHERE LOCAL.SEQUENCE# NOT IN
(SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND
THREAD# = LOCAL.THREAD#);
5. Set the LOG_ARCHIVE_TRACE parameter on the Primary Database & Standby Database to trace the
transmission of redo data.
Monitoring the Performance of Redo Transport Services:
View: v$system_event
Parameter: LOG_ARCHIVE_DEST_n
Attributes: ARCH, LGWR (SYNC/ASYNC)
Wait events to monitor:
1. ARCn wait events
2. LGWR SYNC wait events
3. LGWR ASYNC wait events

Note: Use OEM to monitor Oracle Data Guard in a GUI.

CONFIGURING DATA GUARD

Configuring Oracle 10g Data Guard on Linux AS4

Creating a physical standby database:

A. Preparing the primary database for standby database creation:
1. Enable forced logging
2. Create a password file
3. Configure a standby redo log
4. Use identical log file sizes on the primary and standby databases
5. Determine the appropriate number of log file groups
6. Verify parameters related to log files
7. Create standby redo logs
8. Verify standby redo log file group creation
9. Set initialization parameters for the primary database
10. Enable archiving

B. Creation of the physical standby database:
1. Create a backup copy of the Primary Database
2. Create a control file for the Standby Database
3. Prepare an initialization parameter file for the Standby Database
4. Copy files from the Primary Database to the Standby Database system
5. Set up the environment to support the Standby Database
6. Start the physical standby DB
7. Verify the physical Standby Database

C. Post-creation steps:
1. Upgrade the data protection mode
2. Enable Flashback Database

A. Preparing the Primary Database for PHYSICAL STANDBY DATABASE creation:

1. Place the Primary Database in force logging mode:
SQL> Alter database force logging;
2. Create a password file for the SYS user.
Every database in an Oracle Data Guard configuration must have a password file for the SYS user.
$ orapwd file=orapwprod password=oracle entries=100

3. Configure standby redo logs for maximum availability & data protection.
The LGWR ASYNC transport mode is preferred.
If possible, multiplex the standby redo log files.
Use identical sizes for the primary and standby redo log files and determine the appropriate number
of standby redo log groups.
Formula: (maximum number of log file groups per thread + 1) x number of threads.
Ex: 2 threads with 2 log file groups each: (2 + 1) x 2 = 6 standby redo log groups.
Check the MAXLOGFILES & MAXLOGMEMBERS clauses. If you hit the limit, then you have to
recreate the database or the control file.
Adding a standby redo log file group to a specific thread:
SQL> alter database add standby logfile thread 5 size 20m;
Adding a standby redo log file group with a specific group number:
SQL> alter database add standby logfile group 10 size 20m;
Note: If we skip group numbers, using 10, 20 & so on, we will end up using additional space in
the Standby Database control file.
Here we have configured the standby redo log files the same as on the primary DB, in order to make
switchover easy if required, i.e. Primary Database = Standby Database.
Verify standby redo log file group creation:
SQL> select group#, thread#, sequence#, archived, status from v$standby_log;
D. Setting Primary Database initialization parameters:
prod.__db_cache_size=184549376
prod.__java_pool_size=4194304
prod.__large_pool_size=4194304
prod.__shared_pool_size=88080384
prod.__streams_pool_size=0
*.audit_file_dest=/oracle10g/product/10.2.0/db_1/admin/prod/adump
*.background_dump_dest=/oracle10g/product/10.2.0/db_1/admin/prod/bdump
*.compatible=10.2.0.1.0

*.control_files=/oracle10g/product/10.2.0/oradata/prod/control01.ctl
*.core_dump_dest=/oracle10g/product/10.2.0/db_1/admin/prod/cdump
*.db_block_size=8192
*.db_domain=
*.db_file_multiblock_read_count=16
*.db_flashback_retention_target=3600
*.db_name=prod
*.db_recovery_file_dest=/oracle10g/arch
*.db_recovery_file_dest_size=21474836480
*.db_unique_name=standby
*.dispatchers=(PROTOCOL=TCP) (SERVICE=prodXDB)
*.fal_client=STANDBY
*.fal_server=PROD
*.fast_start_mttr_target=17
*.job_queue_processes=10
*.log_archive_config=DG_CONFIG=(prod,standby)
*.log_archive_dest_1=LOCATION=/oracle10g/arch1_prod
VALID_FOR=(ALL_LOGFILES,ALL
_ROLES) DB_UNIQUE_NAME=standby
*.log_archive_dest_2=SERVICE=prod
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_U
NIQUE_NAME=prod
*.log_archive_dest_state_1=ENABLE
*.log_archive_dest_state_2=ENABLE
*.log_archive_format=prod_%s_%t_%r.arc
*.log_archive_max_processes=4
*.log_buffer=262144

*.open_cursors=300
*.pga_aggregate_target=94371840
*.processes=150
*.remote_login_passwordfile=EXCLUSIVE
*.sga_target=283115520
*.standby_archive_dest=/oracle10g/arch1_prod/
*.standby_file_management=AUTO
*.undo_management=AUTO
*.undo_retention=3600
*.undo_tablespace=UNDOTBS
*.user_dump_dest=/oracle10g/product/10.2.0/db_1/admin/prod/udump
Creation of the physical standby database:
1. Make a backup copy of the Primary Database data files
2. Create a control file for the standby DB
3. Prepare a pfile for the Standby Database
4. Copy the files from the primary system to the standby system
5. Set up the environment on the standby system to support the Standby Database
6. Start the Physical Standby Database
7. Verify the Physical Standby Database

1. Making a backup copy of the Primary Database data files (on the Primary Database
system):
Any backup copy can be used.
Use the RMAN utility.
2. Creating the control file for the Standby Database (on the Primary Database system):
SQL> shutdown immediate   (then take an offline/cold backup of the datafiles)
SQL> startup mount;
SQL> alter database create standby controlfile as '/u01/oracle/standby.ctl';

SQL> alter database open;


3. Prepare the pfile for the Standby Database (on the Primary Database system):
Copy the pfile from the primary DB to the standby DB & change a few parameters as follows.
Note:
The COMPATIBLE parameter should be at least 9.2.0.1.0.
To take advantage of Oracle 10g features, set it to 10.2.0.0 or higher.
It should be the same on both the Primary Database & Standby Database;
if the values differ then redo will not be transmitted.
Check that the bdump, cdump and udump destinations point to proper locations on the PDB and SDB.
Initstandby.ora
prod.__db_cache_size=436207616
standby.__db_cache_size=427819008
prod.__java_pool_size=4194304
standby.__java_pool_size=4194304
prod.__large_pool_size=4194304
standby.__large_pool_size=4194304
prod.__shared_pool_size=146800640
standby.__shared_pool_size=155189248
prod.__streams_pool_size=0
standby.__streams_pool_size=0
*.audit_file_dest='/u00/app/oracle/admin/standby/adump'
*.background_dump_dest='/u00/app/oracle/admin/standby/bdump'
*.compatible='10.2.0.1.0'
*.control_files= '/u01/oracle/control01.ctl',
'/u01/oracle/control02.ctl',
'/u01/oracle/control03.ctl'
*.core_dump_dest='/u00/app/oracle/admin/standby/cdump'

*.db_block_size=8192
*.db_domain='global.com'
*.db_file_multiblock_read_count=16
*.db_flashback_retention_target=3600
*.db_name='prod'
*.db_recovery_file_dest_size=21474836480
*.db_recovery_file_dest='/u01/oracle/flash'
*.db_unique_name='standby'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=prodXDB)'
*.FAL_CLIENT='standby'
*.FAL_SERVER='prod'
*.global_names=TRUE
*.job_queue_processes=10
*.log_archive_config='DG_CONFIG=(prod,standby)'
*.log_archive_dest_1='LOCATION=/u01/oracle/arch
VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=standby'
*.log_archive_dest_2='SERVICE=prod
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prod'
*.log_archive_dest_state_1='ENABLE'
*.log_archive_dest_state_2='ENABLE'
*.log_archive_format='standby_%s_%t_%r.arc'
*.log_archive_max_processes=4
*.open_cursors=300
*.pga_aggregate_target=197132288
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=592445440

*.standby_archive_dest='/u01/oracle/arch'
*.standby_file_management='AUTO'
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/u00/app/oracle/admin/standby/udump'
4. Copy files from the primary DB:
Copy the data files, standby control file and pfile to the standby system using O/S commands
(e.g. ftp or scp).
5. Setting up the environment to support the Standby Database (on the Standby
Database system):
On Windows, create the standby service and password file using the oradim utility.
On UNIX, just create the password file:
$ orapwd file=orapwprod password=oracle entries=100
Note: the password file should have the same SYS password on both the PDB and SDB.
Configure listener.ora.
Configure net service names on both systems for both databases.
Note: the connect descriptor should specify the use of a dedicated server.
Listener.ora on primary db:
# listener.ora Network Configuration File:
/u00/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
# Generated by Oracle configuration tools.
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /u00/app/oracle/product/10.2.0/db_1)
(PROGRAM = extproc)
)

(SID_DESC =
(GLOBAL_DBNAME = prod.global.com)
(ORACLE_HOME = /u00/app/oracle/product/10.2.0/db_1)
(SID_NAME = prod)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = ans.global.com)(PORT = 1521))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
)
Listener.ora on standby db:
# listener.ora Network Configuration File:
/u00/app/oracle/product/10.2.0/db_1/network/admin/listener.ora
# Generated by Oracle configuration tools.
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /u00/app/oracle/product/10.2.0/db_1)
(PROGRAM = extproc)
)
(SID_DESC =
(GLOBAL_DBNAME = standby.global.com)
(ORACLE_HOME = /u00/app/oracle/product/10.2.0/db_1)

(SID_NAME = standby)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 100.100.112.72)(PORT = 1521))
)
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
)
tnsnames.ora on primary db:
# tnsnames.ora Network Configuration File:
/u00/app/oracle/product/10.2.0/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
PROD =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = ans.global.com)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = prod.global.com)
(INSTANCE_NAME = prod)
)
(HS = OK)
)

STANDBY =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 100.100.112.72)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = standby.global.com)
(INSTANCE_NAME = standby)
)
(HS = OK)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
tnsnames.ora on standby db:
# tnsnames.ora Network Configuration File:
/u00/app/oracle/product/10.2.0/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
PROD =
(DESCRIPTION =

(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 100.100.101.44)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = prod.global.com)
)
)
STANDBY =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 100.100.112.72)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = standby.global.com)
)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
6. Start the Physical Standby Database:

Create an spfile on the standby DB from the text parameter file that was edited for the
Standby Database:
SQL> create spfile from pfile='initstandby.ora';
Start the Standby Database:
SQL> startup mount;
Start applying redo:
SQL> alter database recover managed standby database disconnect from session;
Test archival operations to the Standby Database:
SQL> alter system switch logfile;   (on the Primary Database)
7. Verify the Physical Standby Database:
SQL> select sequence#, first_time, next_time from v$archived_log order by sequence#;
(on the Primary Database and Standby Database)
Verify that the logs have been applied with the following command (on the Standby Database):
SQL> select sequence#, applied from v$archived_log order by sequence#;
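If a log does not show up as applied, recent transport and apply messages can be reviewed on the standby; a sketch using a standard view:

SQL> select timestamp, message from v$dataguard_status order by timestamp;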
RMAN Script: POINT IN TIME RECOVERY

[point_intime_recovery.scp]
# This scenario assumes that all initialization files and the current controlfile are in place and
# you want to recover to the point in time '2001-04-09 14:30:00'.

# Ensure you set your NLS_LANG environment variable

STARTUP MOUNT FORCE;

RUN
{
SET UNTIL TIME "TO_DATE('2001-04-09:14:30:00','yyyy-mm-dd:hh24:mi:ss')";
RESTORE DATABASE;
RECOVER DATABASE;

ALTER DATABASE OPEN RESETLOGS;

}

# You must take a new whole database backup after resetlogs, since backups of the previous
# incarnation are not easily usable

RMAN SCRIPT: DISASTER RECOVERY

[disaster_recovery.scp]
# The commands below assume that all initialization parameter files are in place and the complete
# directory structure for the datafiles is recreated
# Ensure you set your NLS_LANG environment variable
# e.g. in unix (csh):
# >setenv NLS_LANG american_america.we8dec
# Start RMAN without the target option, and use the following commands to restore and recover the database
# SET DBID; use the database id from the RMAN output
# (not required if using a recovery catalog)
connect target sys/password@omr
startup nomount;
run
{
# you need to allocate channels if not using a recovery catalog.
allocate channel c1 type disk;
# optionally you can use set newname and switch commands to restore datafiles to a new location
restore controlfile from autobackup;
alter database mount;
restore database;
recover database;
alter database open resetlogs;
}
# you must take a new whole database backup after resetlogs, since backups of the previous incarnation
# are not easily usable.

RMAN SCRIPT: CONTROLFILE RECOVERY

[controlfile_recovery.scp]
# Oracle strongly recommends that you specify multiple controlfiles, on separate physical disks and
# controllers, in the CONTROL_FILES initialization parameter.
# - If one copy is lost due to media failure, copy one of the others over the lost controlfile and restart the
#   instance.
# - If you lose all copies of the controlfile, you must re-create it using the CREATE CONTROLFILE SQL command.

# You should use RMAN to recover a backup controlfile only if you have lost all copies of the current
# controlfile, because after restoring a backup controlfile, you will have to open RESETLOGS and take a
# new whole database backup.
# This section assumes that all copies of the current controlfile have been lost, and that all initialization
# parameter files, datafiles and online logs are intact.
# Ensure you set your NLS_LANG environment variable e.g. in unix (csh):
# >setenv NLS_LANG american_america.we8dec
# Start RMAN without the TARGET option, and use the following commands to restore and recover the
# database;
# SET DBID; use the database id from the RMAN output; not required if using a recovery catalog
connect target sys/password@omr
startup nomount;
run
{
# you need to allocate channels if not using a recovery catalog.
# allocate channel foo type sbt parms'';
allocate channel c1 type disk;
restore controlfile from autobackup;
alter database mount;
recover database;
alter database open resetlogs;
}
# you must take a new whole database backup after resetlogs, since backups of the previous
# incarnation are not easily usable

RMAN Script: DATAFILE RECOVERY

[datafile_recovery.scp]
# This section assumes that datafile 5 has been damaged and needs to be restored
# and recovered, and that the current controlfile and all other datafiles are intact.
# The database is mounted during the restore and recovery.
# The steps are:
# - take the datafile that needs recovery offline
# - restore the datafile from backups
# - apply incrementals and archivelogs as necessary to recover
# - bring the recovered datafile back online
run
{
sql 'alter database datafile 5 offline';
# if you want to restore to a different location, uncomment the following command
# set newname for datafile 5 to '/newdirectory/new_filename.f';

restore datafile 5;
# if you restored to a different location, uncomment the command below to
# switch the controlfile to point to the file in the new location
# SWITCH DATAFILE ALL;
recover datafile 5;
sql 'alter database datafile 5 online';
}

RMAN Script: OMR Database Full Backup (Database Mounted)


[ dbbkup_full.scp]
run {
allocate channel c1 type disk;
backup
tag weekly_omr_full
format '/u07/omr/backup/full_%d_%s_%p_%t'
(database);
release channel c1;
configure controlfile autobackup format for device type disk to '/u07/omr/backup/auto_cntrl_%F';
configure controlfile autobackup on;
allocate channel c2 type disk;
backup
format '/u07/omr/backup/archive_%d_%s_%p_%t'
(archivelog all);
release channel c2;
}
startup;

RMAN Script: Cumulative level 2 backup


[ call_dbbkup_cm2.scp]
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="omr"
PATH=$PATH:$ORACLE_HOME/bin
echo rman backup cm level2 for CATDB started `date` >> /u07/catdb/rmanbkup.log
rman target sys/password@omr catalog rman/rman@catdb cmdfile='/u04/catdb/scripts/dbbkup_cm2.scp'
echo rman backup cm level2 for CATDB ended `date` >> /u07/catdb/rmanbkup.log
exit
[dbbkup_cm2.scp]

run
{
allocate channel c1 type disk;
backup incremental level 2 cumulative
tag omr_cm2
format '/u10/catdb/backup/cm2_%d_%s_%p_%t'
(database);
release channel c1;
#backing up the controlfile to the specified destination, keeping an autobackup copy
configure controlfile autobackup format for device type disk to '/u10/catdb/backup/auto_cntrl_%F';
configure controlfile autobackup on;
#backup up archivelog files
allocate channel c2 type disk;
backup
format '/u10/catdb/backup/cm2_%d_%s_%p_%t'
(archivelog all);
release channel c2;
}

RMAN Script: Cumulative level 1 backup


[ call_dbbkup_cm1.scp]
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="omr"
PATH=$PATH:$ORACLE_HOME/bin
echo rman backup cm level1 for CATDB started `date` >> /u07/catdb/rmanbkup.log
rman target sys/password@omr catalog rman/rman@catdb cmdfile='/u04/catdb/scripts/dbbkup_cm1.scp'
echo rman backup cm level1 for CATDB ended `date` >> /u07/catdb/rmanbkup.log
exit
[dbbkup_cm1.scp]
run
{
allocate channel c1 type disk;
backup incremental level 1 cumulative
tag omr_cm1
format '/u10/catdb/backup/cm1_%d_%s_%p_%t'
(database);
release channel c1;
#backing up the controlfile to the specified destination, keeping an autobackup copy
configure controlfile autobackup format for device type disk to '/u10/catdb/backup/auto_cntrl_%F';

configure controlfile autobackup on;


#backup up archivelog files
allocate channel c2 type disk;
backup
format '/u10/catdb/backup/cm1_%d_%s_%p_%t'
(archivelog all);
release channel c2;
}

RMAN Script: Cumulative level 0 backup


[ call_dbbkup_cm0.scp]
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="omr"
PATH=$PATH:$ORACLE_HOME/bin
echo rman backup cm level0 for CATDB started `date` >> /u07/catdb/rmanbkup.log
rman target sys/password@omr catalog rman/rman@catdb cmdfile='/u04/catdb/scripts/dbbkup_cm0.scp'
echo rman backup cm level0 for CATDB ended `date` >> /u07/catdb/rmanbkup.log
exit
[dbbkup_cm0.scp]
run
{
allocate channel c1 type disk;
backup incremental level 0 cumulative
tag omr_cm0
format '/u10/catdb/backup/cm0_%d_%s_%p_%t'
(database);
release channel c1;
#backing up the controlfile to the specified destination, keeping an autobackup copy
configure controlfile autobackup format for device type disk to '/u10/catdb/backup/auto_cntrl_%F';
configure controlfile autobackup on;
#backup up archivelog files
allocate channel c2 type disk;
backup
format '/u10/catdb/backup/cm0_%d_%s_%p_%t'
(archivelog all);
release channel c2;
}

RMAN Script: Deleting archivelog when catalog exists.


[call_omr_archflush.scp]

#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="omr"
PATH=$PATH:$ORACLE_HOME/bin
rman target sys/password@omr catalog rman/rman1956@rman
cmdfile='/u04/rman/rman/scripts/omr_archflush.scp'
exit
[omr_archflush.scp]
# RMAN SCRIPT: DELETING ARCHIVE LOGS
run
{
allocate channel c1 type disk;
delete archivelog until time 'SYSDATE-8';
# OR delete archivelog until sequence=;
#or RMAN> BACKUP ARCHIVELOG ALL DELETE INPUT;
release channel c1;
}

RMAN Script: Deleting the old archives when no catalog exists


[ call_catdb_archflush.scp]
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="catdb"
PATH=$PATH:$ORACLE_HOME/bin
rman target sys/password@catdb cmdfile='/u04/catdb/scripts/catdb_archflush.scp'
exit
[ catdb_archflush.scp]
run
{
allocate channel c1 type disk;
delete archivelog until time 'SYSDATE-8';
# OR delete archivelog until sequence=;
release channel c1;
}

RMAN Script: Backing up all the archivelog files


[call_arch_bkup.scp] (this scripts is calling the script arch_bkup.scp)
#!/bin/ksh
export ORACLE_HOME="/u01/app/oracle/product/10.2.0"
export ORACLE_SID="omr"
PATH=$PATH:$ORACLE_HOME/bin
echo rman ARCHIVE backup for CATDB started `date` >> /u07/catdb/rmanbkup.log
rman target sys/password@omr catalog rman/rman@catdb cmdfile='/u04/catdb/scripts/arch_bkup.scp'
echo rman ARCHIVE backup for CATDB ended `date` >> /u07/catdb/rmanbkup.log
exit
[arch_bkup.scp]
run
{

allocate channel c1 type disk;


backup
format '/u10/catdb/backup/arch_%d_%s_%p_%t'
(archivelog all);
release channel c1;
# deleting archive logs older than 5 days
allocate channel c2 type disk;
delete archivelog until time 'SYSDATE-5';
release channel c2;
}

Switching Undo Tablespace


CREATE UNDO TABLESPACE "UNDOTBS"
DATAFILE '/u05/omr/oradata/undotbs.dbf' SIZE 4000M;
SQL> select name from v$tablespace where name like 'UNDO%';
NAME
--------
UNDOTBS1
UNDOTBS
SQL> ALTER SYSTEM SET undo_tablespace='UNDOTBS' SCOPE=BOTH;
SQL> show parameter undo
NAME             TYPE     VALUE
undo_management  string   AUTO
undo_retention   integer  21600
undo_tablespace  string   UNDOTBS
SQL> select status from v$rollstat;
STATUS
ONLINE
ONLINE
PENDING OFFLINE
PENDING OFFLINE
PENDING OFFLINE
PENDING OFFLINE
ONLINE
ONLINE
ONLINE
ONLINE
ONLINE
ONLINE
If the status is PENDING OFFLINE, you cannot drop the old undo tablespace UNDOTBS1:
SQL> drop tablespace undotbs1 including contents and datafiles;
If you try to drop it, you will get ORA-30013: undo tablespace 'UNDOTBS1' is currently in use
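To see which undo segments are still pending offline, the two views can be joined (a sketch using standard dynamic views):

SQL> select a.name, b.status from v$rollname a, v$rollstat b
     where a.usn = b.usn and b.status = 'PENDING OFFLINE';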
Note: You can find the following messages in the alert.log after issuing the ALTER SYSTEM SET command:
Sat Jul 18 15:38:38 2009
Successfully onlined Undo Tablespace 7.
Undo Tablespace 1 moved to Pending Switch-Out state.

*** active transactions found in undo tablespace 1 during switch-out.


Sat Jul 18 15:46:28 2009
Undo Tablespace 1 successfully switched out.

Multiplexing Control File


Multiplexing the Control File When Using an SPFILE
Sql> shutdown immediate
$ cp /u00/app/oracle/oradata/ota/control01.ctl /u04/ota/control04.ctl
Sql> Startup nomount
Sql> ALTER SYSTEM SET control_files =
'/u00/app/oracle/oradata/ota/control01.ctl',
'/u00/app/oracle/oradata/ota/control02.ctl',
'/u00/app/oracle/oradata/ota/control03.ctl',
'/u04/ota/control04.ctl'
SCOPE=SPFILE;
Sql> shutdown immediate
Sql> startup
Sql> select name from v$controlfile;
Sql> create PFILE from SPFILE;

Multiplexing the Control File When Using a PFILE

$ sqlplus /nolog
Sql> Connect / as sysdba
Sql> shutdown immediate
$ cp /u00/app/oracle/oradata/ota/control01.ctl /u0X/ota/control04.ctl
Add this entry in the PFILE:
control_files = ('/u00/app/oracle/oradata/ota/control01.ctl',
'/u00/app/oracle/oradata/ota/control02.ctl',
'/u00/app/oracle/oradata/ota/control03.ctl',
'/u0X/ota/control04.ctl')
SQL> startup
Frequently used OS commands for DBA

As DBA you need to use frequent OS command, here I would like to share you
some of the important day to day commands:

1. To delete files older than N days (useful for deleting log, trace, tmp files):
find . -name '*.*' -mtime +N -exec rm {} \;
Example: find . -mtime +5 -exec rm {} \;
(This command will delete files older than N days in that directory.)

2. To list files modified in the last N days:

find . -mtime -N -exec ls -lt {} \;
Example: find . -mtime -3 -exec ls -lt {} \;
(finds files modified in the last 3 days)
3. To sort files based on Size of file ?
ls -l | sort -nk 5 | more
useful to find large files in log directory to delete in case disk is full
4. To find files changed in the last N days:
find . -mtime -N -print
Example: find . -mtime -2 -print

5. To extract cpio file


cpio -idmv < file_name (Don't forget to use the < sign before the file name)

6. To find CPU & Memory detail of linux


cat /proc/cpuinfo (CPU)
cat /proc/meminfo (Memory)

7. To find if Operating system in 32 bit or 64 bit ?


For solaris use command
isainfo -v
If you see output like
32-bit sparc applications
that means your O.S. is only 32-bit,
but if you see output like

64-bit sparcv9 applications
32-bit sparc applications

the above means your O.S. is 64-bit & can support both 32- & 64-bit applications.

8. To find if any service is listening on particular port or not ?


netstat -an | grep {port no}
Example: netstat -an | grep 1523

9. To find Process ID (PID) associated with any port ?


This command is useful if any service is running on a particular port (389, 1521..) and that
is run away process which you wish to terminate using kill command
lsof | grep {port no.} (lsof should be installed and in path)

10. To change a particular pattern in a file:

Open the file using vi or any other editor, go into escape mode (by pressing Escape) and use
:1,$s/old_pattern/new_pattern/gc   (g will change globally, c will ask for
confirmation before each change)

11. To find a pattern in files in a directory:


grep pattern file_name ( to find pattern in particular file )
grep pattern * ( in all files in that directory )
If you know how to find a pattern in files in that directory recursively please answer that as
comment

12. To create a symbolic link to a file:


ln -s pointing_to symbolic_name
Example : ln -s b a
If you want to create symbolic link from a -> b
(Condition:you should have file b in that directory & there should not be any file with name
a)

13. To set up a cronjob (cron is used to schedule jobs in Unix at the O.S. level):
crontab -l (list current jobs in cron)
crontab -e (edit current jobs in cron)

_1_ _2_ _3_ _4_ _5_ executable_or_job

Where
1 Minutes (0-59)
2 Hours (0-23)
3 Day of month (1-31)
4 Month (1-12)
5 Day of week (0-6), 0 -> Sunday, 1 -> Monday
e.g. 0 3 * * 6 means run the job at 3AM every Saturday.
This is useful for scheduling tablespace threshold checks, ftp, RMAN backups, removal of old log files,
or other scripts regularly.
Sample scheduled backups:
$ crontab -l
Rman Database:
00 20 * * 1,4       /u07/rman/scripts/call_dbbkup_cm0.scp
00 15               /u07/rman/scripts/offbkup_rman.sh
00 20 * * 0,2,3,6   /u07/rman/scripts/call_arch_bkup.scp

OTA Database:
50 23 * * 0,2,3,6       /u01/ota/dailyexp_ota.sh
50 23 * * 1,4           /u01/ota/offbkup_ota.sh
15 14 * * 0,1,2,3,4,6   /u01/ota/morning_arch.sh

How to kill all similar processes with a single command (in this case opmn):
ps -ef | grep opmn | grep -v grep | awk '{print $2}' | xargs -i kill -9 {}

Locating files under a particular directory:
find . -print | grep -i test.sql

Using AWK in UNIX
To extract a specific column from the output of a UNIX command, for example to list the process IDs of all Oracle processes on the server (second column):
ps -ef | grep -i oracle | awk '{ print $2 }'

Changing the standard prompt for the oracle user:
Edit the .profile for the oracle user:
PS1="`hostname`*$ORACLE_SID:$PWD>"

Display the top 10 CPU consumers using the ps command:
/usr/ucb/ps auxgw | head -11

Show the number of active Oracle dedicated connections for a particular ORACLE_SID:
ps -ef | grep $ORACLE_SID | grep -v grep | grep -v ora_ | wc -l

Display the number of CPUs in Solaris:
psrinfo -v | grep "Status of processor" | wc -l

Display the number of CPUs in AIX:
lsdev -C | grep Process | wc -l

Display RAM memory size on Solaris:
prtconf | grep -i mem

Display RAM memory size on AIX:
First determine the name of the memory device:
lsdev -C | grep mem
then, assuming the name of the memory device is mem0:
lsattr -El mem0

Swap space allocation and usage:
Solaris: swap -s or swap -l
AIX: lsps -a

Total number of semaphores held by all instances on the server:
ipcs -as | awk '{sum += $9} END {print sum}'

View allocated RAM memory segments:
ipcs -pmb

Manually deallocate shared memory segments:
ipcrm -m <shared memory id>

Show mount points for a disk in AIX:
lspv -l hdisk13

Display the amount of occupied space (in KB) for a file or collection of files in a directory or sub-directory:
du -ks * | sort -n | tail

Display total file space in a directory:
du -ks .

Clean up any unwanted trace files more than seven days old:
find . -name "*.trc" -mtime +7 -exec rm {} \;

Locate Oracle files that contain certain strings:
find . -print | xargs grep rollback

Locate recently created UNIX files (in the past one day)
find . -mtime -1 -print

Finding large files on the server (more than 100MB in size):
find . -size +204800 -print
(the size is given in 512-byte blocks; 204800 blocks = 100MB)

Crontab:
To submit a task every Tuesday (day 2) at 2:45PM:
45 14 * * 2 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1

To submit a task to run every 15 minutes on weekdays (days 1-5):
0,15,30,45 * * * 1-5 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1

To submit a task to run every hour at 15 minutes past the hour on weekends (days 6 and 0):
15 * * * 0,6 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1

CLUSTER ADMINISTRATION

OCR Updation: Three utilities to perform OCR updates
1. SRVCTL (recommended) - remote administration utility
2. DBCA (till 10.2)
3. OEM

SRVCTL - Service Control:
- It is the most widely used utility in a RAC environment
- It is used to perform administration & control of the OCR file

Registry sequence of services into OCR (automatically done in 11.2):
1. Node applications
2. ASM instances
3. Databases
4. Database instances
5. Database services

Note: To unregister, follow the reverse order. Sample srvctl registration commands are shown below.
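A minimal sketch of how a database and its instances might be registered with srvctl; the database name rac, instance names rac1/rac2, node names node1/node2 and the ORACLE_HOME path are all hypothetical:
$ srvctl add database -d rac -o /u01/app/oracle/product/11.2.0/dbhome_1
$ srvctl add instance -d rac -i rac1 -n node1
$ srvctl add instance -d rac -i rac2 -n node2
$ srvctl start database -d rac
$ srvctl status database -d rac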

OLR - Oracle Local Registry

- Both the OLR & the GPNP profile are needed by the lower/HAS stack, whereas the OCR & VD are needed by the upper/CRS stack.
- If the OLR or GPNP profile gets corrupted, the corresponding node will go down, whereas if the OCR or VD gets corrupted the complete cluster will go down.
- Every daemon of a node will communicate with the peer (same) daemon on the other nodes.
- Oracle automatically performs an OLR backup at the time of execution of the root.sh script of the grid infrastructure installation & stores it in the location $GRID_HOME/cdata/<hostname>/backup_<date>_<time>.olr.
- The default location of the OLR file is $GRID_HOME/cdata/<hostname>.olr.

OLR Backup (as the root user, from $GRID_HOME/bin):
# ./ocrconfig -local -manualbackup
# ./ocrconfig -local -backuploc <new backup location>
# ./ocrcheck -local

Restoring OLR (see the command sketch below):
- Bring the init level to either init 1 or init 2
- Stop the cluster on the specific node
- Restore the OLR from the backup location: # ./ocrconfig -local -restore <backup file>
- Start the cluster
- Change the init level back to either 3 or 5 (init 3 for CLI mode, init 5 for GUI mode)
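A minimal command-level sketch of the same restore sequence on the affected node, run as root from $GRID_HOME/bin; the backup file name only illustrates the naming pattern:
# init 1
# ./crsctl stop crs
# ./ocrconfig -local -restore $GRID_HOME/cdata/<hostname>/backup_<date>_<time>.olr
# ./crsctl start crs
# init 3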

OCR - Oracle Cluster Registry (or repository)

It is a critical & shared Clusterware file and contains the complete cluster information such as the cluster node names, their corresponding IPs, CSS parameters, OCR autobackup information & registered resources like nodeapps, ASM instances with their corresponding node names, databases, database instances & database services.

The CRSD daemon is responsible for updating the OCR file whenever utilities like srvctl, dbca, oem, netca etc. make configuration changes.

The CSSD daemon automatically brings online all the cluster resources that are registered in the OCR file.
To find the OCR location:
# ./ocrcheck
# cat /etc/oracle/ocr.loc      (in Linux & HP-UX)
# cat /var/opt/oracle/ocr.loc  (in Solaris & IBM-AIX)
OCR backup methods: 3 ways to perform a backup
1. Automatic
2. Physical
3. Logical

1. Automatic:
Oracle automatically performs an OCR backup at a regular interval of 4 hours from the CRS start time and stores it on the master node.

Identifying the master node:
# vi $GRID_HOME/log/<hostname>/crsd/crsd.log
Look for messages such as:
I AM THE NEW OCR MASTER
or
THE NEW OCR MASTER NODE IS <node>

Backup location: $GRID_HOME/cdata/<cluster name>/
backup00.ocr (latest)
backup01.ocr
backup02.ocr
day.ocr
week.ocr

Oracle retains the latest three 4-hour backups, plus one latest day backup and one latest week backup, purging all the remaining backups.

Note: It is not possible to change the automatic backup interval. The existing backups can be listed as shown below.
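A quick way to see which automatic (and manual) OCR backups exist is to run, as root from $GRID_HOME/bin:
# ./ocrconfig -showbackup
The output lists the backup file names with their node and timestamp, which also hints at which node is currently acting as the OCR master, since that is where the automatic backups are written.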

Manual backup:
# ./ocrconfig -manualbackup
(this creates a backup in the default location $GRID_HOME/cdata/<cluster name>/backup_<date>_<time>.ocr)
# ./ocrconfig -backuploc <new location> (a shared storage location is recommended)

Restoring OCR:
- Stop the complete cluster on all the nodes: # ./crsctl stop crs
- Identify the latest backup (backup00.ocr)
- Restore the backup: # ./ocrconfig -restore <backup file>
- Start the cluster on all the nodes
- Check the integrity of the restored OCR: # ./cluvfy comp ocr -n all -verbose
2. Physical backup:
Oracle supports an image or sector-level backup of the OCR using the dd utility (if the OCR is on raw devices) or cp (if the OCR is on a general file system).
# cp <ocr file> <backup location>
# dd if=<ocr file> of=<backup location>   // if: input file, of: output file

Restoring (copy in the reverse direction):
# cp <backup copy> <ocr file>
# dd if=<backup copy> of=<ocr file>

3. Logical backup:
# ./ocrconfig -export <file name>
# ./ocrconfig -import <file name>

Note: Oracle recommends taking a backup of the OCR file whenever the cluster configuration is modified (e.g. adding or deleting a node).

OCR Multiplexing: To avoid losing the OCR and taking the complete cluster down because of a single point of failure (SPF), Oracle supports OCR multiplexing: from 10.2 onwards in a maximum of 2 locations (1 primary, the other a mirror copy), and from 11.2 onwards in a maximum of 5 locations (1 primary and the rest as mirror copies).

Note: From 11.2 onwards, Oracle supports storage of the OCR in ASM diskgroups, so mirroring is provided by the diskgroup redundancy level. Example commands for adding or removing OCR locations are shown below.
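A sketch of how additional OCR locations might be added or removed in 11.2, run as root from $GRID_HOME/bin; the diskgroup name +OCRMIRROR is hypothetical:
# ./ocrconfig -add +OCRMIRROR
# ./ocrconfig -delete +OCRMIRROR
# ./ocrcheck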

GPNP - Grid Plug and Play profile:
- It contains basic cluster information such as the location of the voting disk, the ASM spfile location, and all the IP addresses with their subnet masks.
- It is a node-specific file.
- It is an XML-formatted file.
- Backup location: $GRID_HOME/gpnp/<hostname>/profiles/peer/profile.xml
- Actual location: $GRID_HOME/gpnp/profiles/peer/profile.xml
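Assuming an 11.2 grid home, the profile contents can be dumped with the gpnptool utility; a minimal sketch:
$ $GRID_HOME/bin/gpnptool get
This prints the XML profile (including the voting disk discovery string and the ASM spfile location) to the screen, which is a quick way to verify the values described above.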

Voting Disk (VD):
- It is another critical & shared Clusterware file which contains the node membership information of all the nodes within the cluster.
- The CSSD daemon is responsible for sending heartbeat messages to the other nodes every 1 second and writing the responses into the VD.

VD Backup:
- Oracle supports only the physical method to take a backup of the VD.
- From 11.2 onwards, Oracle does not recommend taking a backup of the VD because the VD data is automatically backed up into the OCR file.

Restoring VD:
1. Stop the CRS on all the nodes
2. Restore the VD: # ./crsctl restore vdisk
3. Start the CRS on all the nodes
4. Check the integrity of the restored VD: # ./cluvfy comp vdisk -n all -verbose

VD Multiplexing: To avoid losing the VD and taking the complete cluster down because of a SPF of the VD, Oracle supports multiplexing of the VD: from 10.2 onwards in a maximum of 31 locations, and from 11.2 in a maximum of 15 locations. The current voting disk configuration can be listed as shown below.
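A short sketch for listing and (on non-ASM storage) adding voting disk copies, run from $GRID_HOME/bin; the path /dev/raw/raw3 is hypothetical:
# ./crsctl query css votedisk
# ./crsctl add css votedisk /dev/raw/raw3
When the voting disks are kept in an ASM diskgroup (11.2), the usual approach is instead to move them with crsctl replace votedisk <+diskgroup>.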

Node Eviction:
It is the process of automatically rebooting a cluster node, due to private network or VD access failure, in order to avoid data corruption.
If node1 & node2 can communicate with each other but not with node3 through the private network, a split-brain syndrome can occur: two sub-clusters form and try to master a single resource, thereby causing data corruption. To avoid this split-brain syndrome, the master node evicts the affected node based on the handshake/node membership information in the VD.

CSS Parameters:
1. misscount (default 30 sec): the maximum private network latency to tolerate before the master node triggers the node eviction process.
2. disktimeout (default 200 sec): the maximum VD access latency; if it elapses, the master node triggers the node eviction process.
3. reboottime (default 3 sec): the affected node waits until the reboot time elapses before the actual node reboot (this allows third-party applications to go down properly). These values can be inspected with crsctl as shown below.
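A minimal sketch for checking (and, if really required, changing) these CSS parameters as root from $GRID_HOME/bin; changing misscount is normally done only under Oracle Support guidance:
# ./crsctl get css misscount
# ./crsctl get css disktimeout
# ./crsctl get css reboottime
# ./crsctl set css misscount 60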

CLUSTER COMPONENTS

OHASD - Oracle High Availability Services Daemon:
- It is the first and only daemon started by the parent init process, and in turn it is responsible for starting the other agents & daemons by reading the OLR (Oracle Local Registry) file.
- It needs access to the OLR file, which contains the startup sequence of the other child daemons.

CRSD - Cluster Ready Services Daemon:
- It is responsible for maintaining the cluster configuration and HA (High Availability) operations by reading the OCR file (Oracle Cluster Registry).
- The OCR file contains the complete cluster information required by the CRSD daemon.

CSSD - Cluster Synchronization Services Daemon:
- It is responsible for updating the node membership of all the nodes within the cluster into the VD (Voting Disk).
- In a non-RAC environment, the CSSD daemon is responsible for maintaining the communication between the ASM instance and its ASM client database instances.

VD - Voting Disk:
- It contains the up-to-date node membership information of all the cluster nodes.
- Both the OCR & the VD require about 280 MB of space each.

EVMD - Event Manager Daemon:
- It is responsible for publishing & subscribing the events generated by the CRSD daemon to the other nodes.

OCTSSD - Oracle Cluster Time Synchronization Services Daemon:
- It is responsible for maintaining time consistency across the cluster nodes.
- It has two modes:
  Observer: if NTP (network time protocol) is enabled
  Active: if NTP is disabled

GPNPD - Grid Plug and Play Daemon

GSD: From 10g onwards it is deprecated; it is responsible for performing administrative tasks whenever a GUI application like NETCA or DBCA is invoked.

ONS - Oracle Notification Server: It is responsible for publishing notification events through FAN (Fast Application Notifications).

VIP - Virtual IP:
- It is registered as a resource in the OCR and its status is maintained in the OCR.
- From 11g Release 2 onwards, each node requires one private IP in one subnet, plus one public IP & one VIP in another subnet, and the cluster requires 3 unused SCAN VIPs in the same subnet as the public IPs. A hypothetical /etc/hosts layout is sketched below.
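A purely illustrative /etc/hosts sketch for a 2-node cluster; all hostnames and addresses are made up, and in a real 11.2 setup the SCAN name is normally resolved by DNS to three addresses rather than placed in /etc/hosts:
# Public IPs
192.168.1.101  rac1.example.com   rac1
192.168.1.102  rac2.example.com   rac2
# Virtual IPs (same subnet as public)
192.168.1.111  rac1-vip.example.com  rac1-vip
192.168.1.112  rac2-vip.example.com  rac2-vip
# Private interconnect (different subnet)
10.0.0.1       rac1-priv
10.0.0.2       rac2-priv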

Types of Storage:
- NAS (Network Attached Storage): supports file-level I/O
- iSCSI (Internet Small Computer System Interface)
- SAN (Storage Area Network): supports block-level I/O
(NAS and SAN are compared on SPAM: Scalability, Performance, Availability, Manageability.)

I/O performance: NAS < iSCSI < SAN


Types of Clusters:
1. Operating system level
2. Hardware level
3. Network level
4. Application level
   a. Failover cluster
   b. Parallel cluster
   c. Hybrid cluster

Failover Cluster:
Example: VCS (VERITAS Cluster Server)

Disadvantages of a failover cluster:
1. More downtime for the users
2. No load balancing
3. Wastage of resources
4. It supports a maximum of a 2-node cluster setup

Parallel cluster (or scalable high-performance cluster):
- In 9i, it supports 67 nodes per cluster
- In 10g, it supports 100 nodes per cluster
- In 11g, it supports 100+ nodes per cluster

Advantages:
1. Zero or negligible downtime for the users
2. Better load balancing
3. No wastage of resources
4. It supports multiple nodes per cluster
5. Best example: Oracle RAC

Hybrid Cluster:
It is a combination of both parallel & failover clusters.
Example: Data Guard in a RAC environment

Types of Cluster Software (type - level - vendor):
1. HP Service Guard - kernel level - HP
2. Sun Cluster - kernel level - Oracle SUN
3. VERITAS Cluster (failover + parallel) - user/application level - Symantec
4. Oracle RAC (truly parallel) - user/application level - Oracle

Real Application Clusters - RAC

The global cache is maintained by:
1. In 8i, OPS (Oracle Parallel Server)
2. From 9i, it is maintained by the GRD (Global Resource Directory)

Global Resource Directory (GRD) = Global Cache Services (GCS) + Global Enqueue Services (GES)
GRD = GCS + GES

RAC is not software; it is a concept in which multiple instances (each on a separate node) access a common (single) database.

Advantages of RAC:
1. It offers SPAM:
   a. S - Scalability
   b. P - Performance
   c. A - Availability
   d. M - Manageability
2. It supports HA (high availability) operations for services like VIPs, SCAN IPs etc.
3. Automatic error detection
4. Automatic restart of failed services
5. It supports TAF - Transparent Application Failover

Cluster Components:
1. Clusterware software
2. Private interconnect
3. Shared storage

Clusterware software:
It coordinates and manages all the nodes of a cluster by treating all the nodes as a single large logical server.

Version - Name of Cluster S/W:
- 9i: Oracle Cluster Manager (supports only Linux/Windows)
- 10.1: Oracle CRS (Cluster Ready Services)
  a. Mandatory for building a 10g RAC setup
  b. Supports all the O/S
- 10.2: Oracle CRS renamed to Oracle Clusterware
  a. Supports HA operations
  b. High performance
- 11.2: Grid Infrastructure
  - Combination of Clusterware binaries and ASM binaries

Oracle Homes:
- 10.1: CRS_HOME (Clusterware), ASM_HOME (ASM), ORACLE_HOME (RDBMS)
- 10.2: CRS_HOME (Clusterware), ORACLE_HOME (ASM + RDBMS)
- 11.2: GRID_HOME (Clusterware + ASM), ORACLE_HOME (RDBMS)

High Availability

Availability levels:
1. Low Availability: some downtime + data loss (incomplete recovery)
2. Medium Availability: some downtime + no data loss (complete recovery)
3. High Availability: no downtime + no data loss

Low Availability:
ICR (incomplete recovery) - e.g. loss of redo log files or of the current control file.

Medium Availability:
It is a complete recovery; example: Data Guard, which has three protection modes:
- Max Performance: this is the default mode
- Max Availability: guarantees database availability by compromising data availability
- Max Protection: guarantees data availability by compromising database availability

High Availability: it means no downtime & no data loss.
- Example: RAC (it gives the solution for an instance crash)
- Data Guard: data availability
- RAC: instance availability
- MAA (Maximum Availability Architecture) = Data Guard + RAC

Oracle ASM disk failure - Part 1

Introduction
Oracle Automatic Storage Management (ASM) was introduced in Oracle 10g. ASM provides advanced storage management features such as disk I/O re-balancing, volume management and easy database file name management. It can also provide MIRRORING of data for high availability and redundancy in the event of a disk failure (mirroring is optional). ASM guarantees that data extents (table and index row data etc.) in one disk are mirrored in another disk (normal redundancy) or in two other disks (high redundancy).

A few times I have faced ASM disk failures when redundancy (mirroring) was enabled, and none of them resulted in an issue for an end user. ASM automatically detects the disk failure and services Oracle SQL requests by retrieving information from the mirrored (other) disk. Such a failure is handled gracefully and entirely managed by Oracle. I am very impressed by the fault tolerance capability in ASM.
But soon the Oracle DBA must work with the system administrator to replace the failed disk. If the mirrored disk also fails before the replacement, then Oracle SQL run by end users will fail with errors because both the primary and mirrored disks have failed.
This post assumes that you are using ASM redundancy (Normal or High) and that you are not using the ASMLib program. The commands and syntax could be different if you are using ASMLib.

How to identify a failed disk

An ASM disk failure as noted below is transparent to end users, and one can be caught unaware if one is not proactive in database monitoring. The DBA can write a program that constantly checks the database alert logfile, or a SQL script that checks for any read/write errors.
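For example, a minimal cron-able sketch that scans the ASM alert log for I/O related messages; the log path follows the usual 11g ADR layout but is an assumption, so adjust it for your environment:
#!/bin/sh
# hypothetical ASM alert log location
ALERT_LOG=/u01/app/oracle/diag/asm/+asm/+ASM/trace/alert_+ASM.log
# show the most recent I/O related warnings and ORA- errors
grep -iE 'ORA-|WARNING: IO Failed|offline of disk' "$ALERT_LOG" | tail -20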

If either of the below queries returns rows, then it is confirmed that one or more ASM disks have failed.


select path, name, mount_status, header_status
from v$asm_disk
where write_errs > 0;

select path, name, mount_status, header_status
from v$asm_disk
where read_errs > 0;

But despite the read/write errors, the header_status column value may still be shown as "MEMBER".
Drop the failed disk
1) alter diskgroup #name# drop disk #disk name#;
Caution: Do NOT physically remove the failed disk YET from the disk enclosure of the server. The above command returns immediately, but ASM also starts a lengthy re-balance operation. The disk should be physically removed only after the header_status for the failed disk becomes FORMER. This status is set after the re-balance operation is completed. One can monitor the progress of the re-balance operation by checking v$asm_operation:
select state, power, group_number, est_minutes
from v$asm_operation;
After a few minutes/hours the above operation will complete (no rows returned). Then verify that the header_status is now FORMER and request the System Administrator to physically remove the disk from the disk enclosure. The LED light for the failed disk should turn off, and this indicates the physical location of the failed disk in the enclosure.

Add the replacement disk
1) Get the replacement device name, partition it and change its ownership to the database owner. For example, let the disk path after partitioning be /dev/sdk1.
2) select distinct header_status from v$asm_disk where path = '/dev/sdk1'; (must show CANDIDATE)
3) alter diskgroup #name# add disk '/dev/sdk1';
4) ASM starts the re-balancing operation due to the above disk add command. One can monitor the progress of the re-balance operation by checking v$asm_operation:
select state, power, group_number, est_minutes
from v$asm_operation;
After a few minutes/hours the above completes (no rows returned).
5) The disk add operation is now considered complete.
How to decrease the ASM re-balance operation time
While the above ASM re-balancing operation is in progress, the DBA can make it complete more quickly by changing the 'ASM power', for example by running the command below.
alter diskgroup #name# rebalance power 8;
The default power is 1 (i.e. ASM starts one re-balance background process, called an ARB process, to handle the re-balancing work). The above command dynamically starts 8 ARB processes (ARB0 to ARB7), which can dramatically decrease the time to re-balance. The maximum power limit in 11g R1 is 11 (up to 11 ARB processes can be started).
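The default power for re-balance operations comes from the asm_power_limit initialization parameter of the ASM instance; a quick sketch for checking and adjusting it (the value 4 is only an example):
SQL> show parameter asm_power_limit
SQL> alter system set asm_power_limit = 4;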
Conclusion
None of the above maintenance operations (disk drop, disk add) causes downtime to the end user, and therefore they can be completed during normal business hours. The re-balance operation can cause a slight degradation of performance, hence increase the power limit to let it complete quickly.
Oracle ASM disk failure - Part 2

Introduction
In Part 1, I wrote about a scenario where ASM detects READ_ERRS/WRITE_ERRS and updates these columns in v$asm_disk for the ASM disk, and the DBA has to explicitly drop the disk in ASM. This article is about a different scenario where the ASM instance itself performs the 'drop disk' operation.

This post assumes that you are using ASM redundancy (Normal or High) and that you are
not using ASMLib program. The commands and syntax could be different if you are using
ASMLib.

Scenario

In this scenario, ASM drops the disk automatically. Furthermore, READ_ERRS/WRITE_ERRS in v$asm_disk could show a value of NULL (instead of an actual count of the READ or WRITE errors noticed).
How to identify the failed disk

Unlike scenario 1 discussed in Part 1 of the ASM series, the ASM instance can initiate the 'drop disk' by itself in some situations. Let the failed disk be '/dev/sds1'.
select path
from v$asm_disk
where read_errs is NULL;
/dev/sds1

select path
from v$asm_disk
where write_errs is NULL;
/dev/sds1

Additionally, the HEADER_STATUS in v$asm_disk returns a value of UNKNOWN.
select mount_status, header_status, mode_status, state
from v$asm_disk
where path = '/dev/sds1';
CLOSED UNKNOWN ONLINE NORMAL
Compare this scenario with that of the scenario mentioned in Part 1, when the
HEADER_STATUS is still shown as MEMBER and the READ_ERRS/WRITE_ERRS has a value >
0.
The following are the errors mentioned in the +ASM alert log file when the failure was first
noticed.

WARNING: initiating offline of disk
NOTE: cache closing disk
WARNING: PST-initiated drop disk
ORA-27061: waiting for async I/Os failed
WARNING: IO Failed. subsys:System dg:0, diskname:/dev/sds1

No "drop disk" command required by the DBA

The disk has already been dropped by the ASM instance. There is no need for an "alter diskgroup ... drop disk" command again. Instead the DBA has to work with the system administrator to physically locate the failed disk in the disk enclosure and remove it.

Add the replacement disk
1) Get the replacement/new device name, partition it and change its ownership to the database owner. For example, let the disk path after partitioning be /dev/sdk1.
2) select distinct header_status from v$asm_disk where path = '/dev/sdk1'; (must show CANDIDATE)
3) alter diskgroup #name# add disk '/dev/sdk1';
4) ASM starts the re-balancing operation due to the above disk add command. One can monitor the progress of the re-balance operation by checking v$asm_operation:
select state, power, group_number, est_minutes
from v$asm_operation;
After a few minutes/hours the above completes (no rows returned).
5) The disk add operation is now considered complete.
How to decrease the ASM re-balance operation time
While the above ASM re-balancing operation is in progress, the DBA can make it complete more quickly by changing the 'ASM power', for example by running the command below.
alter diskgroup #name# rebalance power 8;
The default power is 1 (i.e. ASM starts one re-balance background process, called an ARB process, to handle the re-balancing work). The above command dynamically starts 8 ARB processes (ARB0 to ARB7), which can dramatically decrease the time to re-balance. The maximum power limit in 11g R1 is 11 (up to 11 ARB processes can be started).
Conclusion
I am not exactly sure why ASM shows the status of a failed disk in different ways, but these are the two scenarios that I am aware of so far.
None of the above maintenance operations (failed disk removal from the disk enclosure, new disk add) causes downtime to the end user, and therefore they can be completed during normal business hours. The re-balance operation can cause a slight degradation of performance, hence increase the power limit to let it complete quickly.

Oracle DBA Activities


1) What is a typical day at your job?

I start my day checking any system alerts such as database performance problems, backup
failures etc. We are using Oracle's Enterprise Manager which is a web-based software that
sends email alerts to us automatically whenever it detects a problem based on certain
criteria. I spend most of the day working on current projects such as database upgrades,

migrations, new installations etc. I also help application developers and end-users whenever
they have a database related question or problem.
2) What does it take to be a successful Oracle DBA?
Most of today's E-Business and IT applications are entirely web-based and hence the
underlying databases have to be highly available 24*7. Responsibility, proactive
attitude and emergency preparedness are some of the key characteristics that can
make a successful Oracle DBA. IT application developers and the end-user communities rely
heavily on the database administrator for their day-to-day database issues, questions and
projects. An Oracle DBA should be polite and must treat every one in the organization with
courtesy and respect.
3) Has your job description evolved over time?
Yes indeed! The definition of an Oracle DBA has a much broader scope today. I started
with just "database work" in my first job. Today my responsibilities include Oracle systems
design and architecture, including Oracle E-Business Suite administration, Oracle
Application Server setup and administration, setting up of Continuity of Business systems
(Disaster Recovery preparedness), setup and administration of Oracle Fusion Middleware
components such as Oracle Portal Server, Identity Management etc. I am also expected to
work on hardware specifications and requirements for upgrading existing Oracle installations
or setting new ones. Whereas the traditional "Oracle DBA" designation has remained the
same, it has a much wider scope and responsibility today.
4) How do you keep up with new features and changes & advancements in database
technology?
Every major Oracle database release comes with a lot of exciting new features which can be
leveraged for simplicity, automation or better database management.
a) I am an avid reader of the bi-monthly Oracle Magazine. The subscription is free and it is available online as well. The magazine covers the latest in Oracle and contains a lot of expert articles with a practical outlook on tackling business problems.

b) I have also subscribed to RSS feeds on http://otn.oracle.com/ so that I get updated whenever there is a new knowledge-base article. This is a popular site for the Oracle community and most of the technology articles are posted by Oracle ACEs and Oracle ACE Directors, who are proven and recognized individuals by Oracle Corporation.

c) I also recommend aspiring DBAs to register in the official Oracle Forum; thanks to the many experts who generously contribute to this discussion board, virtually any of your database related questions can get answered there.
5. What is the best feature you like about the Oracle DB, and what needs improvement compared to other databases in the market?

My favorite Oracle database feature is Real Application Clusters (RAC). Using RAC
technology, Oracle databases can be setup for high availability and virtually unlimited
scalability. I did not get a chance to fully evaluate other databases in the market vis-a-vis
the Oracle database. Oracle is the recognized Industry leader as per various results
published by market research companies such as IDC and Gartner.
6. Has any of the following major macro trends affected you personally, and what's your opinion?
a.Outsourcing & Offshoring

No. Oracle DBA is one of the few jobs that has had a lesser impact from outsourcing. A DBA is critical to the success of an IT department, requiring a lot of technical understanding, emotional maturity and the ability to handle pressure and crisis, and the role comes with a lot of responsibility. In fact, all the Dice Reports this year show the Oracle database as one of the top technology skills in the market in the USA.

b.Virtualization

Remote Service and Tele-commuting are only for low profile work such as after-hours
support etc. Most of the managers prefer Oracle DBAs to work onsite and with direct
supervision.

c.Moving from client-server to web-based

The Oracle DBA is usually less impacted by Client-server to Web-based migrations. Oracle
databases can work with both client-server systems and web-based systems.
7. Your advice to people who are evaluating Oracle DB administration as a career.

The IT industry is facing a shortage of quality Oracle DBAs. Oracle database administration is a good career option with long-term benefits. I have been working as an Oracle database administrator for more than 6 years and the experience is very rewarding. It has also given me the confidence to architect and build large-scale IT systems. I was able to positively impact the experience of the end-user community and positively contribute to various IT departments.
