Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Oracle Data Guard Switchover and Failover
Internals: How Fast Can You Go?
Presenting with
Michael Smith, Oracle
Srinagesh Battula, Intel
Latin America 2011
December 6–8, 2011

Tokyo 2012
April 4–6, 2012

Oracle OpenWorld Bookstore

• Visit the Oracle OpenWorld Bookstore for a fabulous selection of books on many of the conference topics and more!
• Bookstore located at Moscone West, Level 2
• All Books at 20% Discount

Oracle Products Available Online

Oracle Store

Buy Oracle license and support online today at oracle.com/store

Program Agenda

• Overview
– What is Data Guard
– Disaster Recovery and High Availability
– Switchover / Failover, how fast?
– Alerting / Monitoring
• Intel Case Study
– HA / DR Objectives
– Overall Architecture
– Observed Opportunities / Benefits
– HA / DR Metrics
• Failover / Switchover Details
– Failover process flow
– Switchover process flow
– Demo
– Best Practices

Oracle Data Guard – What Is It?
• Data Availability & Data Protection solution for Oracle
• Automates the creation and maintenance of one or more
synchronized copies (standby) of the production (or primary)
database
• If the primary database becomes unavailable, a standby
database can easily assume the primary role
• Feature of Oracle Database Enterprise Edition (EE)
– Basic feature available at no extra cost
– Active Data Guard is an extra license option
– Both primary and standby databases need to be licensed for EE

Oracle Data Guard
Best Data Protection
[Diagram labels: Primary Database, Sync / Async Redo Transport, Standby Database, Active Data Guard, Data Guard Broker]

• Data availability and data protection for the Oracle Database
• Up to thirty standby databases in a single configuration
• Standby database used for queries, reports, test, or backups

Data Guard Capabilities
Essential for HA
1. Built-in Oracle integration: ensures transactional consistency
2. Extremely high performance
3. Transparent operation, supports all Oracle features and data types
4. Application-integrated failover
5. Combined HA/DR solution
6. Ensures fault isolation: protection from data corruptions
7. Ensures zero data loss
8. DR servers utilized for real-time reports and testing while providing DR
9. Addresses both planned and unplanned downtime
10. No vendor lock-in for storage
11. Minimal network consumption
12. No distance limitation
[Diagram captions: LAN & MAN deployments provide Local HA and DR; Extend to a Wide Area Network and add remote DR]

Data Guard Switchover
How Fast Can You Go
• Switchover
– Planned role reversal, never any data loss
– No database reinstantiation required
– Used for database upgrades, tech refresh, data center moves, OS or hardware maintenance …
• Manually execute via SQL or Enterprise Manager GUI, or Broker CLI
[Chart: switchover time in seconds (axis 50 to 64) for Database and Application, comparing SQL*Plus and Broker]

Data Guard Failover
How Fast Can You Go
• Failover
– Unplanned failure of primary
– Use Flashback Database to reinstate original primary
• Manually execute via SQL or Enterprise Manager GUI, or Broker CLI
• Automate failover using Data Guard (Fast-Start Failover)
[Chart: failover time in seconds (axis 0 to 25) for Database and Application, comparing Manual - SQL*Plus, Manual - Broker, and Automatic - FSFO]

Reinstate of Failed Primary
How Fast Can You Go
• Observer automatically reinstates the failed primary as a standby database
• Built-in controls prevent any possibility of a split-brain condition
• Manually execute via SQL or Enterprise Manager GUI, or Broker CLI
[Chart: reinstate time in seconds (axis 135 to 185) for Database, comparing Manual - SQL*Plus and Manual - Broker]
Oracle Enterprise Manager
Data Guard Management Page

High Availability & Disaster Recovery with Oracle Data Guard Fast-Start Failover
An Implementation at Intel
Srinagesh Battula
Sr. Database Architect
Technology Manufacturing Group
Oct 6, 2011

Agenda
• Profile of Intel’s DB Ecosystem & HA/DR Challenges
• HA/DR Objectives
• HA 1.0 vs HA 2.0
• HA/DR Metrics with HA 2.0
• Opportunities with HA 2.0
• HA 2.0 Blue Print & Stack Composition
• HA 2.0 Platform Reference Architecture
• Lessons Learned & Next Steps
• Conclusion

Intel Corporation
The World’s Largest Semiconductor Manufacturer

• Leading Manufacturer of Computer, Networking & Communications Products


• Over $43B in Annual Revenues World-Wide
• One of the Top Ten Most Valuable Brands in the World for 10 Consecutive Years
• The Single-Largest Voluntary Purchaser of Green Power in the United States

Intel’s DB Ecosystem Profile & HA/DR Challenges
• Intel’s Factory Automation DBs are used for critical manufacturing decisions
(Operational and Planning, Engineering Analysis, Process Control)
• They include both Mission Critical OLTP and Mission Important DSS type systems
(ranging from a few hundred GB up to multiple TB)
• Application clients include both vendor and homegrown apps (ODP.NET based)
• High availability of data pertinent to the advanced manufacturing processes of Intel’s ever-expanding product pipeline is vital.
• Current blue print of DB HA/DR for Mission Critical Databases
– Oracle Fail Safe / storage-based replication and “Vanilla” Data Guard (codename: HA 1.0)
• HA/DR Challenges
– Current HA implementations are too complex to leverage for operational needs such as MTTR and Maintenance Downtime reduction.
– Activation of the redundant environment is time consuming and requires decisions to be made during failure incidents.

HA/DR Objectives
• Comprehensive High Availability & Disaster Recovery
– Holistic end-to-end HA/DR across all tiers of the stack, i.e., Storage, Network, Server, Database and Application
• Simplified Manageability of the Stack
• Leverage HA/DR investment for Managed DT Reduction during OS/Server/SAN and DB patches/migrations
• Cost Effective
• HA/DR Target Metrics
– Recovery Point Objective (RPO): Zero Data Loss
– Recovery Time Objective (RTO) for Failovers: < 60 secs
– Recovery Time Objective (RTO) for Switchovers: < 180 secs
– Mean Time Between Failures (MTBF): High (i.e., Reliable and Fault Tolerant Stack)

HA 1.0 DB Architecture vs HA 2.0 DB Architecture
[Diagram, HA 1.0: Primary in Data Center A and Standby in Data Center B, each an active/passive Oracle Failsafe pair of DB instances on public, private and storage networks; database copies via Data Guard/ASYNC and SAN storage replication on NTFS.]
[Diagram, HA 2.0: Primary in Data Center A and Standby in Data Center B, one DB instance each on Oracle Grid Infrastructure with ASM DATA & FRA; Broker-enabled Data Guard/SYNC; Observer; storage-mirrored LUN of REDO/Control/Arch logs for double failure coverage (SAN1, SAN2).]
Comparison of the current HA 1.0 vs HA 2.0 DB HA/DR architecture illustrates the simplicity of the HA 2.0 stack and its thin database infrastructure footprint.

HA/DR Metrics with HA 2.0
Observed outages: HA 1.0 (SAN Replication/MSCS/OFS/DG) vs HA 2.0 (DataGuard/Broker - FSFO)

Outage                                                     HA 1.0             HA 2.0
Unplanned: MTTR on Storage Failure                         > 1 hour           74 seconds
Unplanned: MTTR on Site Failure                            > 1 hour           39 seconds
Unplanned: MTTR on Node Failure                            70 – 90 seconds    59 seconds
Unplanned: MTTR on Instance Failure                        70 – 90 seconds    50 seconds
Planned: DT needed for applying Oracle patches and
infra maintenance (SAN/OS/Server etc.) activities          2+ hours           120 seconds

• HA 2.0 allows lights-out role transition for DB/ASM/NODE/Storage/Network/Data Center failovers.
• Typical failover time is around 60 seconds (includes the 30-second FSFO threshold) for unplanned outages.
• Observed failover time was as low as 17 seconds (including the 15-second FSFO threshold used for that FMEA test).
• Graceful switchovers (for planned outages) take 120 seconds.

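The failover figures quoted on this slide combine two parts: the Fast-Start Failover threshold (how long the loss of the primary must persist before failover begins) and the role transition itself. The following is a minimal Python sketch of that arithmetic, using only the numbers from the slide. The function name and the clean split into "threshold" and "transition" are simplifying assumptions for illustration, not Oracle's internal accounting.

```python
def transition_seconds(total_failover_s: float, fsfo_threshold_s: float) -> float:
    """Time remaining for the role transition once the FSFO threshold has elapsed.

    Assumes total failover time = FSFO threshold + role transition time.
    """
    if fsfo_threshold_s > total_failover_s:
        raise ValueError("threshold cannot exceed the total failover time")
    return total_failover_s - fsfo_threshold_s


# (total failover seconds, FSFO threshold seconds), as quoted on the slide
observations = {
    "typical unplanned failover": (60, 30),
    "fastest FMEA test": (17, 15),
}

for name, (total, threshold) in observations.items():
    work = transition_seconds(total, threshold)
    print(f"{name}: {threshold}s threshold + {work}s transition = {total}s")
```

Under this assumption, most of the observed failover time is the configured detection threshold, so lowering the threshold shortens failover at the cost of reacting to shorter outages.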
Opportunities with HA 2.0

Opportunity: Server Cost Reduction
Characteristics: 50% DB server cost reduction to achieve the desired DB HA/DR
Notes:
• Number of servers for a given DB application (for Mission Critical DBs) reduced from 4 to 2.

Opportunity: Storage Cost Reduction
Characteristics: 33% DB storage cost reduction compared to HA 1.0
Notes:
• Number of copies of the DB for a given application reduced from 3 to 2 (eliminated storage mirroring for the DB).
• Leveraged ASM for dynamic capacity rightsizing.
• In HA 1.0, lack of clear visibility into data growth forces the databases' NTFS LUNs to be pre-allocated and oversized.
• Eliminates the issue of pockets of unused free space on various NTFS LUNs.

Opportunity: Leverage DR investment for MDT Reduction
Characteristics: Reduction of Maintenance DT for SAN and DB patching
Notes:
• SAN and DB patching take 2+ hours in HA 1.0 compared to 120 seconds in HA 2.0.
• Rolling upgrade for OS/patching/server maintenance takes 120 seconds in HA 2.0.

Opportunity: Enhanced Data Protection
Characteristics: Superior data protection and corruption prevention capability
Notes:
• Automatic validation of redo blocks before they are applied.
• Fast failover to an uncorrupted standby database upon prod DB corruption.

Opportunity: Simplified Manageability
Characteristics: Operational efficiency via reduced manageability overhead
Notes:
• Switchovers and failovers are performed via a single broker CLI command, compared to the numerous steps that need to be executed otherwise.
• Automatic reinstatement of the standby (via Flashback technology) upon failover, compared to 1 day to rebuild the standby.

Key Message: The architecture is a demonstrated, cost-effective High Availability solution.

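The 50% and 33% figures above follow directly from the unit counts on the slide (4 servers to 2, 3 database copies to 2). A small Python sketch of the calculation follows. It assumes that cost scales linearly with the number of servers or copies, which the slide implies but does not state. The function name is illustrative.

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage reduction when a count drops from `before` to `after`."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return 100.0 * (before - after) / before


servers = percent_reduction(4, 2)  # servers per mission-critical DB application
copies = percent_reduction(3, 2)   # database copies per application

print(f"Server reduction: {servers:.0f}%")
print(f"Storage copy reduction: {copies:.0f}%")
```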
HA 2.0 Blue Print
[Diagram: Data Center A (Primary) and Data Center B (Standby), each with a DB instance on Grid Infrastructure, ASM DATA & FRA, and public, private and storage networks; Broker-enabled Data Guard/SYNC between the two; OBSERVER/OB-MAN on an OEM GC node in each data center; storage-mirrored LUN of REDO/Control/Arch logs for double failure coverage.]
• Data Guard (Broker-enabled) Fast-Start Failover zero data loss configuration (SYNC / Max Availability mode / Real-Time Apply).
• Cost-effective HA/DR architecture with a single standby database running on Intel XEON™ servers.
• High availability of Observers driven by Enterprise Manager Grid Control.

HA 2.0 Stack Composition
[Diagram: Node A and Node B connected by the public and private networks. Each node's stack, top to bottom: Service, Listeners, Database Instance with Data Guard Broker, ASM Instance, Oracle Restart, Operating System. Storage per node: DATA disks, FRA disks, Redo LUN.]
HA 2.0 Stack Composition: Oracle 11gR2, Windows 2008 R2, Oracle ASM, Oracle Restart, Oracle Data Guard Broker – Fast-Start Failover, Enterprise Manager Grid Control for monitoring and FSFO Observer High Availability.

HA 2.0 Platform Reference Architecture
Stack Component: Technology Description
Database: Oracle 11g R2 64-bit (11.2.0.2)
HA/DR: Broker-managed Oracle Data Guard (Max Availability mode) in a Fast-Start Failover configuration; Oracle Restart (part of Grid Infra install)
Observer HA: Via OEM GC; custom script (OBMAN) to automatically move the Observer away from the PRY data center upon role transition
Monitoring: Oracle Enterprise Manager - Grid Control (10.2.0.5 OMS; 10.2.0.4 OMR) and 11gR1 OMA or later; separate Observer home on GC nodes – 11gR2 32-bit client
Backups: RMAN integration with GC; centralized RCAT for all OLTP apps running on the GC repository DB as a separate schema; basic compressed backup sets, weekly full and daily incrementals; backups run on PRY (with Block Change Tracking)
OS: 64-bit Windows 2008 R2 Server
Storage/File System:
• EVA8400; app-specific Data/Index, Backup, Redo, Archive, Control → ASM
• Two ASM disk groups (+DATA, +FRA) and a REDO LUN (RAID 1)
• 2nd member of Redo/Arch/control file on a dedicated SAN mirrored LUN for double failure coverage
• Oracle and Grid Infrastructure (ASM) binaries → NTFS
Network: Teamed private network, 1 GigE amongst the HA 2.0 nodes; distance between data centers typically < 1 km; network latency 2 ms; public network 1 GigE between the HA 2.0 nodes and the clients
Server Config: Windows 2008 R2; DL360 G7 w/ 2 - 6 core Westmere processors; 36G RAM, Hyper-Threading enabled

Lessons Learned
Intel created homegrown custom scripts/utilities to:
– Auto-relocate the FSFO Observer (to the data center of the current PSB) upon role transition
– Upload OMA dynamic properties so that Grid Control reflects the real state of PRY/PSB

Oracle Development Team helped by providing fixes/patches to:
– Automatically transfer GC RMAN jobs to the new primary upon Data Guard failover
– Fix the auto-reinstatement (of the old PRY as a new standby) bug

Next Steps:
• Automatic failover for ODP.NET client apps via FAN/FCF
• Optimized connect-time failover

Conclusion
Auto-failovers via “Data Guard Broker enabled Fast-Start Failover” allowed:
• Elimination of the human element in role transition decisions during DB infra failures
• Simplified manageability to perform planned role transitions
• Reduction in the number of multi-vendor integration touch points in the DB HA/DR stack
– Very thin database infrastructure footprint
• Zero data loss and near-zero (few seconds) downtime for auto-failovers
• A single cost-effective vehicle for database HA, DR, data protection and MDT reduction, with superior operational efficiency and low MTTR

Thanks to TMG Engg Team for the contributions.

Thank you for attending the session.

Contact Info:
srinagesh.battula@intel.com
Failover – Steady State
[Diagram: App Server Farm connected to a 2-node RAC Primary (AUSTIN Data Center) and a 2-node RAC Standby (HOUSTON Data Center), linked by Data Guard Redo Transport and managed by Data Guard Broker; an ONS daemon on each RAC node; Service: OrderEntry at both sites.]
Primary Connection
• JDBC: Subscribe to ONS
• Connected to OrderEntry Service
Standby Connection
• JDBC: Subscribe to ONS

Partial Site Failover – Database Failover
[Diagram: same topology as the steady state, with step markers 1 (at the database tier), 2 (Primary Connection) and 3 (Failover Started by Data Guard Broker).]
Primary Connection (2)
• JDBC: Timeout ensues
• No longer connected to OrderEntry Service
Standby Connection
• JDBC: Subscribe to ONS
Failover Started by Data Guard Broker (3)

Failover – New Primary
[Diagram: same topology, with step markers 4 (Failover Completed by Data Guard Broker), 5 (Service: OrderEntry auto-started on new primary) and 6 (New Connections Directed to New Primary).]
Failover Completed by Data Guard Broker (4)
Service: OrderEntry auto-started on new primary (5)
New Connections Directed to New Primary (6)
• JDBC: Subscribe to ONS
• Connected to OrderEntry Service
Old Primary Existing Connections
• JDBC: Timeout continues
• No longer connected to OrderEntry Service

Failover – Notification
[Diagram: same topology, with step marker 7 (Broker Notification to Terminate Existing Connections).]
New Connections to New Primary
Broker Notification to Terminate Existing Connections (7)
• JDBC: FAN event sent to ONS subscribers
Old Primary Existing Connections
• JDBC: Timeout continues
• No longer connected to OrderEntry Service

Failover – Application Redirect
[Diagram: same topology, with step marker 8 (All Connections Directed to New Primary).]
All Connections Directed to New Primary (8)
• JDBC: Subscribe to ONS
• Connected to OrderEntry Service
Apps no longer connected to old primary

Integrated, Automatic Client Failover
• Use SRVCTL to configure Clusterware managed services
– srvctl add service -d <db_unique_name> -s <service_name>
[-l [PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY]
[,SNAPSHOT_STANDBY]]
[-y {AUTOMATIC | MANUAL}][-r <instance1,instance2…>]
• Data Guard Broker-managed failovers:
– CRS starts/stops services appropriate for database role
– All FAN compliant clients are automatically notified
– Eliminates need for custom client notifications and database triggers
• Data Guard Broker is required for complete automation
• Oracle Restart (11.2.0.1 or later) is required for non-RAC configurations
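As a rough illustration of the SRVCTL syntax above, the Python sketch below assembles a `srvctl add service` argument list for a role-based service. It only builds and prints the command and does not run it. The flags (`-d`, `-s`, `-l`, `-y`, `-r`) and the role names come from the slide. The database unique name `austin` and the instance names `austin1` and `austin2` are hypothetical placeholders.

```python
import shlex

# Role and policy values exactly as listed in the syntax on the slide
VALID_ROLES = ("PRIMARY", "PHYSICAL_STANDBY", "LOGICAL_STANDBY", "SNAPSHOT_STANDBY")
VALID_POLICIES = ("AUTOMATIC", "MANUAL")


def srvctl_add_service(db_unique_name, service_name, roles=("PRIMARY",),
                       policy="AUTOMATIC", instances=()):
    """Build (but do not execute) a 'srvctl add service' argument list."""
    for role in roles:
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role}")
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    args = ["srvctl", "add", "service",
            "-d", db_unique_name,
            "-s", service_name,
            "-l", ",".join(roles),
            "-y", policy]
    if instances:
        args += ["-r", ",".join(instances)]
    return args


# Hypothetical names: a service that should run only while the database is primary
cmd = srvctl_add_service("austin", "OrderEntry",
                         roles=("PRIMARY",), instances=("austin1", "austin2"))
print(shlex.join(cmd))
```

Defining the same service with the same role on both the primary and standby clusters is what lets Clusterware start it only on whichever database currently holds the primary role.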

Considerations for OCI
Oracle Net Alias
• Oracle Net alias should specify both the primary and
standby SCAN hostnames
SALES=
 (DESCRIPTION_LIST=
  (LOAD_BALANCE=off)(FAILOVER=on)
  (DESCRIPTION=
   (LOAD_BALANCE=on)(CONNECT_TIMEOUT=10)(RETRY_COUNT=3)
   (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=TCP)(HOST=Austin-scan)(PORT=1521)))
   (CONNECT_DATA=(SERVICE_NAME=OrderEntry)))
  (DESCRIPTION=
   (LOAD_BALANCE=on)(CONNECT_TIMEOUT=10)(RETRY_COUNT=3)
   (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=TCP)(HOST=Houston-scan)(PORT=1521)))
   (CONNECT_DATA=(SERVICE_NAME=OrderEntry))))
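Hand-editing nested parentheses in an alias like this is error prone, so one option is to generate the entry. The Python sketch below builds a single-line version of the alias from a list of SCAN hostnames and checks that the parentheses balance. The helper names `description` and `alias` are made up for this example, and the default timeout and retry values simply mirror the slide.

```python
def description(host, service, connect_timeout, retry_count, port=1521):
    """One DESCRIPTION block pointing at a single SCAN hostname."""
    return (
        "(DESCRIPTION="
        f"(LOAD_BALANCE=on)(CONNECT_TIMEOUT={connect_timeout})(RETRY_COUNT={retry_count})"
        f"(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )


def alias(name, scan_hosts, service, connect_timeout=10, retry_count=3):
    """Oracle Net alias that tries each site's DESCRIPTION in the order given."""
    body = "".join(
        description(host, service, connect_timeout, retry_count)
        for host in scan_hosts
    )
    return f"{name}=(DESCRIPTION_LIST=(LOAD_BALANCE=off)(FAILOVER=on){body})"


entry = alias("SALES", ["Austin-scan", "Houston-scan"], "OrderEntry")

# Cheap sanity check against the most common hand-editing mistake
assert entry.count("(") == entry.count(")")
print(entry)
```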

Considerations for OCI
New Oracle Net Parameters
• Three new parameters in Oracle Database 11g Release 2, used in the previous example
– CONNECT_TIMEOUT controls the overall time to connect to the service
– TRANSPORT_CONNECT_TIMEOUT is the amount of time for the TCP connection to complete
• Set CONNECT_TIMEOUT to a value slightly greater than TRANSPORT_CONNECT_TIMEOUT
– RETRY_COUNT specifies the number of times an address list is traversed before the connection attempt is terminated

Considerations for OCI
New Oracle Net Parameters – Example
• Be careful with optimal values, e.g. if Austin server/clusterware is down:
SALES=
 (DESCRIPTION_LIST=
  (LOAD_BALANCE=off)(FAILOVER=on)
  (DESCRIPTION=
   (LOAD_BALANCE=on)(CONNECT_TIMEOUT=5)(RETRY_COUNT=2)
   (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=TCP)(HOST=Austin-scan)(PORT=1521)))
   (CONNECT_DATA=(SERVICE_NAME=OrderEntry)))
  (DESCRIPTION=
   (LOAD_BALANCE=on)(CONNECT_TIMEOUT=5)(RETRY_COUNT=2)
   (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=TCP)(HOST=Houston-scan)(PORT=1521)))
   (CONNECT_DATA=(SERVICE_NAME=OrderEntry))))

• A new connection spends 5 x 3 = 15 seconds to iterate through 3 Austin-SCAN VIPs


• This is retried 2 times: additional 2 x 15 = 30 seconds
• So connection fails over to Houston after 15 + 30 = 45 seconds
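The arithmetic above generalizes to any combination of timeout, retry count and number of SCAN VIPs. The Python sketch below reproduces the slide's 45-second figure and applies the same formula to the earlier alias (CONNECT_TIMEOUT=10, RETRY_COUNT=3). It assumes, as the slide does, that every attempt runs to the full timeout, and that the earlier alias also resolves to 3 SCAN VIPs. The function name is illustrative.

```python
def seconds_before_next_description(connect_timeout_s, address_count, retry_count):
    """Worst-case wait before Oracle Net moves on to the next DESCRIPTION.

    One full pass over the addresses, plus `retry_count` further passes,
    with every attempt assumed to hit the full timeout.
    """
    first_pass = connect_timeout_s * address_count
    retries = retry_count * first_pass
    return first_pass + retries


# Values from this slide: 5 s timeout, 3 SCAN VIPs, 2 retries
print(seconds_before_next_description(5, 3, 2))

# Values from the earlier alias, assuming 3 SCAN VIPs there as well
print(seconds_before_next_description(10, 3, 3))
```

With the earlier settings the same formula gives 120 seconds, which is why the timeout and retry values deserve tuning against the failover time objective.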

Considerations for Non FAN Clients
• Follow all client failover best practices
– Role based services
– Outbound Connect Timeout (pre 11.2 version of
CONNECT_TIMEOUT)
– Description list of primary and standby SCAN names
• Configure client operating system for efficient TCP
timeouts
• Configure application code to automatically retry
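For clients that do not receive FAN events, the last bullet usually means wrapping connection attempts in a retry loop. The Python sketch below is generic: `connect` stands for whatever callable the application uses to open a database connection, and the attempt count, delay and the use of `ConnectionError` are assumptions chosen for illustration, not settings from the presentation.

```python
import time


def connect_with_retry(connect, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Call `connect()` until it succeeds or `attempts` is exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)  # give the role transition time to complete
    raise last_error


# Stand-in for a real driver call: fails twice, then succeeds,
# mimicking a service that comes back on the new primary.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("service not available yet")
    return "connected"


print(connect_with_retry(flaky_connect, sleep=lambda s: None))
```

The total retry window (attempts times delay, plus connect timeouts) should comfortably exceed the expected failover time, otherwise the application gives up before the service restarts on the new primary.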

Demo
Integrated Failover of PeopleSoft Application with Data Guard Fast-Start Failover

Zero Data Loss with ASYNC Redo Transport
Flush Redo
• ASYNC allows redo writes to commit locally without
waiting for remote writes
– Little to no performance impact on the primary
– Allows for remote standby over WAN
• Failover in ASYNC configurations can result in data loss
• New in 11.2 – Use ALTER SYSTEM FLUSH REDO to
flush any unsent redo to the standby
– Provides for zero data loss in ASYNC configurations
– Must be able to mount primary

Switchover for Planned Maintenance
• Switchover can be used to reduce downtime for many
planned maintenance scenarios
– Hardware and Operating Systems upgrades
– RAC and Clusterware upgrades
– Platform migrations
– Standby First Patch Apply for Grid Infrastructure and Database
patching (MOS Note 1265700.1)
• Integral part of database rolling upgrades

Client Failover for Data Guard Switchover

• You get client failover for switchover for free if you have followed
the steps (well... sort of)
• Physical standby
– Clients disconnected as primary is converted to a standby
– Clients go through TAF retry logic (OCI) or application retry logic (JDBC)
– Clients connected to the standby disconnected as it is converted to primary
– Once both databases come up in new roles, services start and clients reconnect

• Logical standby
– Services are stopped automatically if the switchover is performed by Data Guard Broker
– Manually disconnect connections to both primary and standby
– Perform switchover
– Once both databases come up in new roles, services start and clients reconnect

Role Transition Best Practices

• Enable Flashback Database to reinstate failed primary
• Use real-time apply
– Allows for very aggressive Recovery Time Objective
– Use flashback database to protect against logical corruptions
• Set LOG_FILE_NAME_CONVERT parameter
– Pre-clears online redo logs when media recovery is first started
– Eliminates additional I/O at time of failover
– Not necessary if using Oracle Managed Files

Role Transition Best Practices

• Consider using multiple standby databases


– Provides for continued data protection post failover
– Allows more time to diagnose primary issues prior to reinstate
• Use the Data Guard Broker
– Minimizes exposure to operator errors
– Best practices built in

Summary

• Data Guard Failover


– Fast and reliable
– Best option to quickly resolve outages
– Integrated and automated application redirection
• Data Guard Switchover
– Reduce time for planned maintenance
– Application failover built-in

Resources
• OTN HA Portal:
http://www.oracle.com/goto/availability
• Maximum Availability Architecture (MAA):
http://www.oracle.com/goto/maa
• MAA Blogs:
http://blogs.oracle.com/maa
• Exadata on OTN:
http://www.oracle.com/technetwork/database/exadata/index.html
• Oracle HA Customer Success Stories on OTN:
http://www.oracle.com/technetwork/database/features/ha-casestudies-098033.html

Key HA Sessions, Demos, Labs by Oracle Development

Monday, 3 Oct – Moscone South *
11:00a Auto Detect, Prevent and Repair Data Corruptions, Rm 102
12:30p Future of Oracle Exadata, Rm 104
12:30p RMAN: Not Just for Backups Anymore, Rm 304
2:00p Extreme Data Management, Moscone North Hall D
5:00p Oracle High-Availability System Overview, Rm 104
5:00p GoldenGate Product Update and Strategy, Intercontinental-Sutter

Tuesday, 4 Oct – Moscone South *
10:15a Oracle Secure Backup - Best practices, Rm 304
11:45a Oracle Exadata Technical Deep Dive, Rm 104
3:30p RMAN & Data Guard: Seven Cool Tips from Oracle, Rm 304
3:30p Consolidation on Oracle Exadata, Rm 103

Wednesday, 5 Oct – Moscone South *
10:15a Oracle Active Data Guard - Lessons Learned, Rm 102
1:15p Data Guard for Planned Maintenance, Rm 102
1:15p Understanding Oracle RAC Internals, Rm 103
1:15p Clone Oracle with CloneDB and Direct NFS, Rm 270

Thursday, 6 Oct – Moscone South *
9:00a Exadata Backup and Recovery, Rm 304
10:30a Deduplication and Compression for Backups, Rm 304
12:00p Data Guard Switchover/Failover, Rm 103
3:00p Configure, Size, Monitor Fast Recovery Area, Rm 304
3:00p PeopleSoft with Active Data Guard, Moscone West 2022

Demos: Moscone South DEMOGrounds
Mon 10:00a - 5:30p, Tue 9:45a - 6:00p, Wed 9:45a - 4:00p
• Maximum Availability Architecture (MAA)
• Active Data Guard
• Recovery Manager & Flashback
• Real Application Clusters
• Exadata
• Oracle Secure Backup
• GoldenGate
• ASM

Hands-on Labs: Marriott Marquis, Salon 14 / 15
Monday, Oct 3, 5:00 pm - 6:00 pm Oracle Active Data Guard
Tuesday, Oct 4, 10:15 am - 11:15 am Oracle Active Data Guard

*All session rooms at Moscone South unless otherwise noted
*After Oracle OpenWorld, ref. http://www.oracle.com/goto/availability

Q&A
