Corporate Headquarters Redback Networks Inc. 100 Headquarters Drive San Jose, CA 95134-1362 USA http://www.redback.com Tel: +1 408 750 5000
1996 to 2008, Ericsson AB. All rights reserved. Redback and SmartEdge are trademarks registered at the U.S. Patent & Trademark Office and in other countries. AOS, NetOp, SMS, and User Intelligent Networks are trademarks or service marks of Redback Networks Inc. All other products or services mentioned are the trademarks, service marks, registered trademarks or registered service marks of their respective owners. All rights in copyright are reserved to the copyright owner. Company and product names are trademarks or registered trademarks of their respective owners. Neither the name of any third party software developer nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission of such third party.
Disclaimer
No part of this document may be reproduced in any form without the written permission of the copyright owner. The contents of this document are subject to revision without notice due to continued progress in methodology, design and manufacturing. Redback or Ericsson shall have no liability for any error or damage of any kind resulting from the use of this document.
Contents
Chapter 1: NetOp Database High Availability Overview 1-1
  Manual Failover Overview 1-2
  Fast-Start Failover Overview 1-2
    Data Guard Broker Components 1-2
    Circumstances That Cause Fast-Start Failover 1-3
  NetOp Database Directory Path Substitutions 1-3
Chapter 2: NetOp Database Failover Configuration 2-1
  Verify Network Communication Between Database Hosts 2-1
  Configure Hostname Resolution 2-2
  Prepare a Host for the Standby NetOp Database 2-2
  Create and Configure the Standby NetOp Database 2-3
  Set Up the Primary NetOp Database for Failover 2-4
  Configure the Frequency of Data Transfer from the Primary to Standby Database 2-6
  Set Up the Data Guard Observer Host 2-7
  Configure and Start the Fast-Start Failover Observer Process 2-8
Chapter 3: NetOp Database Failover Management 3-1
  Manage Manual Failover 3-1
    Recover from a Failed Primary NetOp Database 3-1
    Recover from a Failed Standby NetOp Database 3-2
    Recover from a Simultaneous Failure of Both Primary and Standby NetOp Databases 3-2
    Switch Databases During Normal Operations 3-4
      Switch the Primary NetOp Database to the Standby Database 3-5
      Switch the Standby Database to the Primary Database 3-5
  Manage Fast-Start Failover 3-6
    Detect a Database or Network Communication Failure 3-7
    Recover from a Database or Network Communication Failure 3-8
    Force a Failover in Fast-Start Failover Configuration 3-9
    Manually Disable a Fast-Start Failover Configuration 3-10
    Switch Databases During Normal Operations 3-11
  Prepare a Failed NetOp Database Host for a Return to Service 3-12
  Re-create Manual Failover 3-13
  Re-create Fast-Start Failover 3-14
Chapter 1
This chapter describes the two database high-availability features for the NetOp software: manual failover (known in previous releases of the NetOp product as the warm standby feature) and fast-start failover. Both configurations use Oracle Data Guard to allow you to replace a failed primary NetOp database with a standby NetOp database.
Both high-availability features use primary and standby databases, each residing on a separate Oracle database host. The standby NetOp database host must conform to the specifications of the Oracle database host used for the primary, or production, NetOp database. For information on the hardware requirements for the Oracle database hosts, see the section on server host requirements in the Prepare to Install the NetOp PM System chapter in the NetOp Policy Manager Installation Guide.
In both the fast-start failover and manual failover configurations, if a network outage occurs or the primary database fails, the NetOp components automatically switch to the active database host.
Fast-start failover requires a third host called the Data Guard Observer host. This host runs the Oracle Data Guard Broker observer process. The observer process communicates with both database servers and automates the failover process when it detects that a failover is needed. For information on the hardware requirements for the Data Guard Observer host, see the Set Up the Data Guard Observer Host section on page 2-7.
Table 1-1
Term
Original database
Primary database
Standby database
Role transition
Switchover
Failover
Fast-start failover
Manual failover
Warm standby
In either fast-start or manual failover, you can schedule a switchover between database servers if you need to maintain one of the database hosts. Before you create and configure the standby database to use for the high-availability option you choose to implement, the original NetOp database (which will become the primary database) must already be deployed in a stand-alone configuration. For information on how to create the original NetOp database, see the Install the NetOp Database chapter in the NetOp Policy Manager Installation Guide. Configuring a database high-availability feature disrupts service; you are required to stop and restart the NetOp server after you have reconfigured the original stand-alone database as the primary database.
Data Guard Broker Components
- The Data Guard broker command-line interface (CLI), which provides commands for Data Guard operations (for example, to switch over or fail over to the standby database manually). It also provides reporting commands that display the status of all Data Guard components in real time.
- The observer process, which runs on the Data Guard Observer host. The Data Guard Observer host is separate from, but connected to, both the primary and standby databases. This process communicates with both database servers and automates the failover process when it detects that a failover is needed. If the process detects and confirms a failure of the primary database, or of the network links between the two databases and the observer process, that may cause synchronization problems, it initiates a fast-start failover.
Fast-start failover does not occur when:
- There is a loss of communication between the primary and standby database hosts. Instead, the primary database, in coordination with the observer process, resets its state to indicate that it must synchronize, and the observer process sets the standby database state to unsynchronized. If communication between the two database hosts is restored, the databases synchronize. If there is also a loss of communication between the observer process and the primary database before the primary-to-standby database link is restored, failover is not initiated.
- There is a loss of communication between the primary database and the observer process. However, if there is a subsequent loss of communication between the primary and standby databases, the primary database stalls because it cannot coordinate with the observer process to reset its state to indicate that it must synchronize. If the standby database detects that its link to the primary database has failed, it informs the observer process, and a fast-start failover occurs.
- There is a simultaneous loss of communication between the primary and standby databases and between the primary database and the observer process. Instead, the primary database stalls after a short time period. If the standby database cannot communicate with the observer process, failover does not occur, and administrator intervention is required to make one of the databases active.
To reduce the risk of simultaneous failure, ensure that there is no single point of failure in your system architecture. To ensure the reliability of fast-start failover components and the communication between them, follow standard fast-start failover practices: verify that network paths run through different equipment and that fast-start failover components are placed in separate locations.
Table 1-2
Chapter 2
To configure database failover, perform the tasks described in the following sections in the order shown:
1. Verify Network Communication Between Database Hosts
2. Configure Hostname Resolution
3. Prepare a Host for the Standby NetOp Database
4. Create and Configure the Standby NetOp Database
5. Set Up the Primary NetOp Database for Failover
6. Configure the Frequency of Data Transfer from the Primary to Standby Database
At this point, configuration for manual failover is complete. If you are setting up fast-start failover, complete the following additional tasks:
1. Set Up the Data Guard Observer Host
2. Configure and Start the Fast-Start Failover Observer Process
If you experience problems with the observer process, see Chapter 3, NetOp Database Failover Configuration Issues in the NetOp Policy Manager Database Troubleshooting Guide.
The observer host must be able to resolve the host names of the primary and standby database hosts to their IP addresses.
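As a rough illustration of checking this requirement, the sketch below reports which host names have no entry in a hosts file. It assumes that resolution is done through a hosts file such as /etc/hosts, which this guide does not state. The function name and the host names npm-primary and npm-standby are hypothetical.

```shell
#!/bin/sh
# missing_hosts_entries HOSTS_FILE NAME...
# Prints each NAME that has no entry in HOSTS_FILE (hypothetical helper).
missing_hosts_entries() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any host-name column
        # (field 2 onward; field 1 is the IP address).
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Example (assumed host names):
#   missing_hosts_entries /etc/hosts npm-primary npm-standby
```

An empty result means every listed name has an entry. If your site resolves names through DNS instead, a different check is needed.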
2. Ensure that no recent network changes have occurred for a standby database that was previously working.
2. Confirm that the original NetOp database is operational:
a. Log on to the original NetOp database host as root.
b. Enter the following command:
ps -ef | grep smon
3. Create a backup of the original NetOp database by running the backup_npm_db.sh script on the original NetOp database host; see Chapter 2, Manual Database Maintenance Tasks in the NetOp Policy Manager Database Administration Guide. Use the latest backup set of the original NetOp database; that is, one created after the most recent significant database event, such as a failover, because such an event invalidates all older backup sets.
4. Generate a .tar file of the backup set, specifying the full path to the directory, and compress the file.
5. Copy the backup set from the backup directory on the primary NetOp database host to the standby NetOp database host or to a Network File System (NFS) directory shared by the primary and standby NetOp database hosts.
6. Extract the backup set on the standby NetOp database host.
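Steps 4 and 6 can be sketched as a pair of shell functions. This is a minimal sketch, not the procedure the guide prescribes: the function names are hypothetical, gzip is assumed as the compression tool, and the archive is written relative to the parent directory so that it can be extracted under any parent on the standby host.

```shell
#!/bin/sh
# pack_backup_set BACKUP_DIR ARCHIVE
# Writes a gzip-compressed tar archive of one backup set directory.
pack_backup_set() {
    backup_dir=$1   # full path to the backup set directory
    archive=$2      # output file (name chosen by you)
    # Stream through stdout so tar never has to interpret the archive name.
    tar -cf - -C "$(dirname "$backup_dir")" "$(basename "$backup_dir")" \
        | gzip > "$archive"
}

# unpack_backup_set ARCHIVE DEST_PARENT
# Extracts the backup set directory under DEST_PARENT.
unpack_backup_set() {
    archive=$1
    dest_parent=$2
    gzip -dc "$archive" | tar -xf - -C "$dest_parent"
}

# Example (paths follow the examples in this guide; archive name is assumed):
#   pack_backup_set /export/home/dbback/npm/2007_01_31:12:30 /tmp/npm_set.tar.gz
#   unpack_backup_set /tmp/npm_set.tar.gz /export/home/dbback/npm
```

Extracting into the same parent directory on the standby host keeps the backup set path identical on both hosts, which matters because the setup scripts take the same -backup_dir value.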
3. Run the setup_standby_db.sh script, indicating the location of the backup set:
setup_standby_db.sh [-archive_lag_time lag_time_in_seconds] -backup_dir primary_backup_directory [-db database_name] [-h] -primary_db_ip primary_ip_address
Table 2-1
Syntax: Description
-archive_lag_time lag_time_in_seconds: Optional. Data transfer interval, in seconds, from the primary to the standby database.
-backup_dir primary_backup_directory: Directory in which the backup set from the primary NetOp database is stored; can be a local directory on the database host or an NFS directory shared by the primary and standby database hosts. (See note 1.)
-db database_name: Optional. Name of the NetOp database.
-h: Optional. Prints usage information and exits.
-primary_db_ip primary_ip_address: IP address of the primary NetOp database host.
Note 1: When you run the setup_standby_db.sh and setup_primary_db.sh scripts, use the same backup set on both the primary and standby database hosts.
For example, to create and configure the standby NetOp database for a NetOp PM deployment using default values:
./setup_standby_db.sh -archive_lag_time 1800 -backup_dir /export/home/dbback/npm/2007_01_31:12:30 -primary_db_ip 10.192.32.210
Note
For information on how to reset the value for the -archive_lag_time keyword after this script has been run, see the Configure the Frequency of Data Transfer from the Primary to Standby Database section on page 2-6.
To set up the original NetOp database as the primary database:
1. Log on to the original NetOp database host as root and open a terminal window.
2. Navigate to the Administration directory.
3. Run the setup_primary_db.sh script:
setup_primary_db.sh [-archive_lag_time lag_time_in_seconds] -backup_dir primary_backup_directory [-db database_name] [-enable_fast_start_failover] [-h] [-standby_db_acct standby_database_sysdba_account_name] -standby_db_ip standby_ip_address -standby_db_passwd standby_database_sysdba_account_password
Table 2-2
Syntax: Description
-archive_lag_time lag_time_in_seconds: Optional. Data transfer interval, in seconds, from the primary to the standby database.
-backup_dir primary_backup_directory: Directory in which the backup set from the primary NetOp database is stored; can be a local directory on the database host or an NFS directory shared by the primary and standby database hosts. (See note 1.)
-db database_name: Optional. Name of the NetOp database.
-enable_fast_start_failover: Optional. Enables the fast-start failover high-availability feature. Note: Use this option only to enable fast-start failover after setting up a standby database.
-h: Optional. Prints usage information and exits.
-standby_db_acct standby_database_sysdba_account_name: Optional. Account name for the standby database sysdba; the default value is sys.
-standby_db_ip standby_ip_address: IP address of the standby NetOp database host.
-standby_db_passwd standby_database_sysdba_account_password: Password for the standby database sysdba account; the default password is redback.
Note 1: When you run the setup_standby_db.sh and setup_primary_db.sh scripts, use the same backup set on both the primary and standby database hosts.
Note
The values for the -archive_lag_time and -backup_dir keywords must be the same for both the primary and standby databases. If they are different, the setup_primary_db.sh script fails.
For example, to set up a fast-start failover configuration for a NetOp PM deployment using default values:
./setup_primary_db.sh -archive_lag_time 1800 -backup_dir /export/home/dbback/npm/2007_01_31:12:30 -standby_db_ip 10.192.33.178 -standby_db_passwd redback -enable_fast_start_failover
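Because the -archive_lag_time and -backup_dir values must match on both hosts, one way to avoid a mismatch is to generate both command lines from a single set of variables. The following is a minimal sketch under that idea; the function and variable names are hypothetical, and it only prints the commands rather than running them. All option names are the ones documented for the two scripts.

```shell
#!/bin/sh
# Shared values: must be identical on both hosts, or setup_primary_db.sh fails.
ARCHIVE_LAG_TIME=1800
BACKUP_DIR=/export/home/dbback/npm/2007_01_31:12:30

# standby_setup_cmd PRIMARY_IP
# Prints the command to run on the standby database host.
standby_setup_cmd() {
    printf './setup_standby_db.sh -archive_lag_time %s -backup_dir %s -primary_db_ip %s\n' \
        "$ARCHIVE_LAG_TIME" "$BACKUP_DIR" "$1"
}

# primary_setup_cmd STANDBY_IP STANDBY_SYSDBA_PASSWORD
# Prints the command to run on the primary database host.
primary_setup_cmd() {
    printf './setup_primary_db.sh -archive_lag_time %s -backup_dir %s -standby_db_ip %s -standby_db_passwd %s\n' \
        "$ARCHIVE_LAG_TIME" "$BACKUP_DIR" "$1" "$2"
}

# Example:
#   standby_setup_cmd 10.192.32.210
#   primary_setup_cmd 10.192.33.178 redback
```

For a fast-start failover configuration, append -enable_fast_start_failover to the primary command, as in the example above.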
Note
After you set up the fast-start failover configuration, do not use the scripts provided for switching databases in a manual failover configuration. You must use the Data Guard broker CLI instead.
4. Configure the NetOp PM system to recognize both the primary and standby NetOp PM database hosts:
a. Run the config_npm.sh script with the -db_host database_host, [database_host] construct on each NetOp PM application host.
b. Restart the NetOp PM components.
Configure the Frequency of Data Transfer from the Primary to Standby Database
By default, archive logs are transferred to the standby NetOp database host when 100 MB of changed information has accumulated on the primary NetOp database host or at the data transfer interval set in the Oracle server parameter file (SPFILE), which by default is 1,800 seconds (once every half hour). Depending on the database workload and other factors, this interval may need to be shorter. Ideally, monitor how often archive logs are generated and, if they are not generated often enough, reduce the interval in the SPFILE. A shorter interval reduces the potential data loss, but it also degrades database performance, because the database does more work generating archive logs and transferring them from the primary to the standby database.
You can customize the data transfer interval, to generate archive logs more or less frequently, in two ways:
- By setting the archive_lag_target option in the SPFILE, as documented in this procedure.
- By using the -archive_lag_time keyword in the setup_primary_db.sh or setup_standby_db.sh script when you configure the manual failover feature.
To change how often data is transferred from the primary to the standby database, modify the SPFILE on both the primary host and the standby host:
1. Log on to the primary NetOp database host as root.
2. Log on to the NetOp database as the Oracle system administrator (for example, sys) using the following command:
sqlplus sys/password@database-name as sysdba
where the value of the password argument is the password of the sys database account (the default value is redback), and the value of the database-name argument is the database name.
3. In the SQL*Plus session, enter the following command:
alter system set archive_lag_target=number_of_seconds scope=both;
where number_of_seconds is the data transfer interval in seconds.
4. Exit the SQL*Plus session.
5. Log on to the standby NetOp database host as oracle.
6. Log on to the NetOp database as the Oracle system administrator (for example, sys) using the following command:
sqlplus sys/password@database-name as sysdba
where the value of the password argument is the password of the sys database account (the default value is redback), and the value of the database-name argument is the database name.
7. In the SQL*Plus session, enter the following command:
alter system set archive_lag_target=number_of_seconds scope=both;
where number_of_seconds is the data transfer interval in seconds.
8. Exit the SQL*Plus session.
After you complete this procedure, manual failover configuration is complete. Confirm that the primary and standby databases are running; for more information, see Chapter 2, NetOp Database Issues in the NetOp Policy Manager Database Troubleshooting Guide.
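Since the same statement is entered on both hosts, it can help to generate it once and reject values that Oracle would refuse. The sketch below is an illustration only: the function name is hypothetical, and the accepted range (0, or 60 through 7200 seconds) comes from Oracle's reference documentation for ARCHIVE_LAG_TARGET, not from this guide.

```shell
#!/bin/sh
# lag_target_sql SECONDS
# Prints the ALTER SYSTEM statement for the given data transfer interval,
# or returns 1 if the value is not acceptable.
lag_target_sql() {
    secs=$1
    case $secs in
        ''|*[!0-9]*)
            echo "not a whole number of seconds: $secs" >&2
            return 1 ;;
    esac
    # Assumption from Oracle's reference: 0 (disabled) or 60..7200 seconds.
    if [ "$secs" -ne 0 ] && { [ "$secs" -lt 60 ] || [ "$secs" -gt 7200 ]; }; then
        echo "out of range (0 or 60-7200): $secs" >&2
        return 1
    fi
    printf 'alter system set archive_lag_target=%s scope=both;\n' "$secs"
}

# Example: print the statement for a 15-minute interval.
#   lag_target_sql 900
```

The printed statement is the one to enter in the SQL*Plus session on each host.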
Minimum Partition Sizes for the Oracle Data Guard Observer Host
Partition    Purpose                                                               GB
swap         swap (at least the same size as the amount of installed RAM)          2
/            root (Solaris root, Oracle binaries)                                  10
To set up the Data Guard Observer process on a Solaris host:
1. Install the Solaris 10 OS and partition the disk to meet the Data Guard Observer host requirements. For more information, see the Install Solaris OS chapter in the NetOp Policy Manager Installation Guide.
2. After the Solaris host has rebooted, copy the netop_install.sh script to the root (/) directory from the delivery medium and install the additional Solaris software; for more information, see the Install Solaris OS chapter in the NetOp Policy Manager Installation Guide.
3. Install the Oracle DBMS; for more information, see the Install the Oracle DBMS section in the Install the NetOp PM Database chapter in the NetOp Policy Manager Installation Guide.
4. Run the config_db.sh script with the same parameters you used when configuring the primary database host.
5. Verify that the Observer process host is operational. See Chapter 3, NetOp Database Failover Configuration Issues in the NetOp Policy Manager Database Troubleshooting Guide.
Note
Do not configure and create a NetOp database on the Data Guard Observer host.
4. Configure the NetOp PM system to recognize both the primary and standby NetOp PM database hosts:
a. Run the config_npm.sh script with the -db_host database_host, [database_host] construct on each NetOp PM application host.
b. Restart the NetOp PM components.
After you complete this procedure, fast-start failover configuration is complete, and you can proceed to the Verify That the Primary and Standby Databases Are Running section on page 2-1 of the NetOp Policy Manager Database Troubleshooting Guide.
Chapter 3
This chapter describes how to manage the two optional database high-availability features for the NetOp software: manual failover (known in previous releases of the NetOp product as the warm standby feature) and fast-start failover. Both configurations use Oracle Data Guard to allow you to replace a failed primary NetOp database with a standby NetOp database.
Regular preventative maintenance activities, such as monitoring database size and cleaning up the database, help prevent database outages. If a database host does fail, the steps you must take depend on which database host has failed and which database high-availability feature is implemented.
This chapter consists of the following topics:
- Manage Manual Failover
- Manage Fast-Start Failover
- Prepare a Failed NetOp Database Host for a Return to Service
- Re-create Manual Failover
- Re-create Fast-Start Failover
Note
Using the -failover keyword breaks the link between the primary and standby databases, and because this breakage is irreversible, you must set up a new standby database.
To switch the role of the original standby database to be the primary database, perform the following tasks:
1. Log on to the standby database host as oracle and open a terminal window.
2. Navigate to the Administration directory.
3. Run the standby_to_primary.sh script according to the following syntax:
standby_to_primary.sh [-db database_name] [{-primary_db_passwd primary_db_sysdba_account_password | -failover}] [-h]
For example:
./standby_to_primary.sh -failover
Table 3-3 on page 3-6 describes the syntax and usage guidelines for this script.
4. Restart the NetOp components.
Note
If the NetOp components are configured to recognize both database hosts, then the NetOp components automatically switch to the active database host.
You can continue to use the stand-alone database configuration until the failed database host is returned to service. For information on how to prepare the failed database before restoring it to service, see the Prepare a Failed NetOp Database Host for a Return to Service section on page 3-12. When the second database host is ready, follow the procedure in the Re-create Manual Failover section on page 3-13.
Recover from a Simultaneous Failure of Both Primary and Standby NetOp Databases
In a manual failover configuration, you rely on a database switchover to recover from a database failure on the primary database host. However, in the rare case that both the primary and standby databases fail, you must use the remove_npm_db.sh script to remove the primary NetOp database and then use the recover_npm_db.sh script to recover the primary NetOp database using the most recent backup set taken from the primary database before the failure.
Note
We recommend that you attempt the recovery procedure on a test server before a real failure occurs, to verify that the set of backup files is complete and accurate.
Caution Risk of data loss. Do not use the optional -remove_backups keyword if you do not have any other backup sets available or if these are the backup sets that you want to recover from.
Optionally, you can run the remove_npm_db.sh script with the -remove_backups keyword to remove backups and archive files in addition to the database files.
Before you recover the primary NetOp database from backup files, decide whether the archive log directory, /ora_archive/npm/arc01, should be emptied (or renamed):
- Empty the archive log directory to restore the database to the state it was in when the backup was made. You may want to do this if the database failure is the result of invalid or misconceived database transactions.
- Do not empty the archive log directory if you want to apply transactions that were committed after the backup was created. You may want to do this if the database failure results from a media failure, such as a hard drive failure or database block corruption.
To use a set of backup files to recover from a NetOp database failure on both the primary and standby database hosts, perform the following steps:
1. Log on to the primary NetOp database host as root.
2. Use the remove_npm_db.sh script to remove the NetOp database; for more information, see the Remove a NetOp Database section on page 2-10 of the NetOp Policy Manager Database Administration Guide.
3. Retrieve the most recent set of backup files from the external storage device and place it on the primary database host. The set of backup files must be placed in the directory in which the original backup files were created. The default backup directory is /export/home/dbback/database_name.
4. Open a terminal window and navigate to the Maintenance directory.
5. Log on to the primary NetOp database host as root.
6. Run the recover_npm_db.sh script according to the following syntax:
recover_npm_db.sh -backup_dir backup_directory [-db database_name] [-h]
Table 3-1
Syntax
-backup_dir backup_directory
-db database_name
-h
-standby
For example, to recover the primary database for a NetOp PM deployment using default values:
./recover_npm_db.sh -backup_dir /export/home/dbback/npm/2007_01_31:12:30
7. Verify that the NetOp database works with the NetOp client.
8. Remove the restored backup files retrieved in step 3.
You can continue to use the stand-alone database configuration until the second database host is ready to be returned to service. For information on how to prepare the failed database before restoring it to service, see the Prepare a Failed NetOp Database Host for a Return to Service section on page 3-12.
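When several backup sets are present in the backup directory, the most recent one can be selected by name. The sketch below assumes that backup set directories are named with the timestamp pattern seen in the examples in this guide (such as 2007_01_31:12:30), so that sorting the names also sorts them by time. The function name is hypothetical.

```shell
#!/bin/sh
# latest_backup_set BACKUP_DIR
# Prints the full path of the newest backup set directory under BACKUP_DIR,
# assuming names of the form YYYY_MM_DD:HH:MM. Prints nothing if none exist.
latest_backup_set() {
    for d in "$1"/[0-9][0-9][0-9][0-9]_[0-9][0-9]_[0-9][0-9]:[0-9][0-9]:[0-9][0-9]; do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$d" ] && echo "$d"
    done | sort | tail -n 1
}

# Example: pass the newest set to the recovery script.
#   ./recover_npm_db.sh -backup_dir "$(latest_backup_set /export/home/dbback/npm)"
```

Check the selected path before running the recovery, because a set created before a failover is no longer valid.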
To switch the primary and standby NetOp databases, perform the tasks described in this section in the following order:
1. Switch the Primary NetOp Database to the Standby Database
2. Switch the Standby Database to the Primary Database
For example:
./primary_to_standby.sh
3. Run the standby_to_primary.sh script according to the following syntax:
standby_to_primary.sh [-db database_name] [-primary_db_passwd primary_db_sysdba_account_password | -failover] [-h]
Table 3-3
Syntax
-db database_name
-failover
-primary_db_passwd primary_db_sysdba_account_password
-h
For example:
./standby_to_primary.sh -primary_db_passwd redback
Note
Before you complete the remaining steps, wait five minutes to allow the NetOp database switchover to complete.
4. Configure the frequency of data transfer on the primary database host; see the Configure the Frequency of Data Transfer from the Primary to Standby Database section on page 2-6.
5. Verify that the primary and standby NetOp databases are operating properly by running the verify_npm_standby.sh script; see the Verify That the Primary and Standby Databases Are Running section on page 2-1 of the NetOp Policy Manager Database Troubleshooting Guide.
Note
If the NetOp components are configured to recognize both database hosts, then the NetOp components automatically switch to the active database host.
A failover is unlike a switchover (a planned role transition between the primary and standby databases), in which the observer process coordinates all the role changes and ensures that database synchronization is properly maintained; for details, see the Switch Databases During Normal Operations section on page 3-4.
Caution
Risk of database failure. In a fast-start failover configuration, using the scripts provided for switching databases in a manual failover configuration can cause database corruption. To reduce this risk, do not use those scripts; use the Data Guard broker CLI instead.
This section describes how to recover from each of the possible types of failure in a fast-start failover configuration (specifically, a database failure or a network outage), and how to force a failover, disable fast-start failover, and switch databases during normal operations.
To ensure the reliability of fast-start failover components and communication between them, make use of standard fast-start failover practices. Verify that network paths run through different equipment, and fast-start failover components are placed in separate locations. For more information, see the remaining topics in this section. Caution Risk that database maintenance scripts fail to run. After a failover, the database on the failed database host is not operational. If you run any of the database maintenance scripts from the failed primary database host while it is not operational, the scripts fail. To avoid this risk, fix the problem that caused the failover and re-create a new standby database on the former primary database host before you try to run a database maintenance script on this host.
d. Exit the Data Guard broker CLI.
3. Stop the stand-alone NetOp database host; for details on how to stop a NetOp database host, see the Start and Stop the NetOp Database section on page 2-3 of the NetOp Policy Manager Database Administration Guide.
4. Start the stand-alone NetOp database host; for details on how to start a NetOp database host, see the Start and Stop the NetOp Database section on page 2-3 of the NetOp Policy Manager Database Administration Guide.
5. If the IP address or the name of the failed database host has changed and the NetOp PM components:
- Have been configured with both database hosts, then the NetOp PM components automatically switch to the active database host.
- Have not been configured with both database hosts, run the config_npm.sh script with the -db_host database_host, [database_host] construct on each NetOp PM application host and restart the NetOp PM components.
For a description of the config_npm.sh script and its syntax, see the Change NetOp Database Account Passwords section on page 2-1 of the NetOp Policy Manager Database Administration Guide. For information on starting and stopping NetOp PM components, see the NetOp PM Components chapter in the NetOp Policy Manager Configuration Guide.
You can continue to use the stand-alone database configuration until the failed database host is ready to be returned to service. For information on how to prepare the failed database before restoring it to service, see the Prepare a Failed NetOp Database Host for a Return to Service section on page 3-12.
After you force a failover, the former standby database functions as a stand-alone NetOp database and must be reconfigured. You can use the stand-alone database until a new fast-start failover configuration is set up.
To force a failover in a fast-start failover configuration, on the standby NetOp database host:
1. Log on as oracle.
2. Start a Data Guard broker CLI session by entering the following command at the UNIX command line:
   dgmgrl
   The Data Guard broker CLI appears.
3. Enter the following command to connect to the failed database:
   connect sys/password@database-name;
   where the value of the password argument is the password of the sys Oracle database account (the default value is redback), and the value of the database-name argument is the database name.
4. Verify which database is primary and which is standby:
   show configuration;
   A message similar to the following appears:
   Configuration
     Name:                npmcold_DG
     Enabled:             YES
     Protection Mode:     MaxAvailability
     Fast-Start Failover: ENABLED
     Databases:
       npm       - Primary database
       npm_clone - Physical standby database
                 - Fast-Start Failover target
   Current status for "npmcold_DG":
   SUCCESS
5. Force a failover from the primary database to the standby database by entering the following commands:
   disable fast_start failover force;
   failover to current_standby_database immediate;
6. Enter the following command to exit the Data Guard broker CLI and return to the UNIX command line:
   exit;
Proceed to the Recover from a Database or Network Communication Failure section on page 3-8.
Note If the NetOp components are configured to recognize both database hosts, then the NetOp components automatically switch to the active database host.
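The verification in step 4 can also be scripted. The following is a minimal sketch, not part of the NetOp scripts: dg_config_ok is a hypothetical helper that reads show configuration output on standard input and succeeds only when fast-start failover is reported as ENABLED and the status is SUCCESS. It assumes the output resembles the sample message shown in this procedure.

```shell
#!/bin/sh
# dg_config_ok (hypothetical helper): read "show configuration" output on
# standard input; return 0 only if fast-start failover is ENABLED and the
# configuration status is SUCCESS.
dg_config_ok() {
    awk '
        /Fast-Start Failover:[ \t]*ENABLED/ { fsfo = 1 }
        # The status word may follow the header on the same line or the next.
        /^[ \t]*Current status for/ {
            want = 1
            if ($0 ~ /SUCCESS/) ok = 1
            next
        }
        want && NF { if ($1 == "SUCCESS") ok = 1; want = 0 }
        END { exit !(fsfo && ok) }
    '
}

# Example use (assumption: the installed dgmgrl accepts a command on its
# command line; otherwise capture the output of an interactive session):
#   dgmgrl -silent sys/password@database-name "show configuration;" | dg_config_ok \
#       && echo "configuration is healthy"
```

Checking the status before you disable fast-start failover gives you a record of the configuration state at the moment the failover was forced.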
4. Enter the following commands at the DGMGRL prompts to disable fast-start failover:
   disable fast_start failover force;
   edit configuration set protection mode as maxperformance;
   remove configuration;
   exit;
5. Log on to the NetOp database as the Oracle system administrator (for example, sys) using the following command:
   sqlplus sys/password@database-name as sysdba
   where the value of the password argument is the password of the sys database account (the default value is redback), and the value of the database-name argument is the database name.
6. Enter the following commands at the SQL prompts:
   alter system set dg_broker_start=false scope=both;
   alter system set log_archive_config=nodg_config scope=both;
Use the Data Guard broker CLI to exchange roles, so that the standby database becomes the new primary database and the primary database becomes the new standby database. During a switchover, the observer process coordinates all of the role changes and ensures that database synchronization is properly maintained.
To switch the primary and standby databases, perform the following steps:
1. If you are not already using the Data Guard broker CLI on the primary database host, perform these steps:
   a. Log on to the primary NetOp database host as oracle and open a terminal window.
   b. Start a Data Guard broker CLI session by entering the following command at the UNIX command line:
      dgmgrl
      The Data Guard broker CLI appears.
   c. Enter the following command to connect to the primary database:
      connect sys/password@database-name;
      where the value of the password argument is the password of the sys Oracle database account (the default value is redback), and the value of the database-name argument is the database name.
2. Switch the roles of the primary and standby databases by entering the following command:
   switchover to current_standby_database;
3. Enter the following command to exit the Data Guard broker CLI and return to the UNIX command line:
   exit;
Note If the NetOp components are configured to recognize both database hosts, then the NetOp components automatically switch to the active database host.
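After a switchover, you may want to confirm that the roles actually changed before you exit the Data Guard broker CLI. The sketch below is illustrative only: dg_primary_db and switchover_done are hypothetical helper names, and the parsing assumes that show configuration lists databases in the form "name - Primary database", as in the sample output in the forced-failover procedure.

```shell
#!/bin/sh
# dg_primary_db (hypothetical helper): print the name of the database that
# "show configuration" output on standard input reports as primary.
dg_primary_db() {
    awk '/- Primary database/ { print $1; exit }'
}

# switchover_done EXPECTED_PRIMARY (hypothetical helper): succeed when the
# output on standard input names EXPECTED_PRIMARY as the primary database.
switchover_done() {
    [ -n "$1" ] && [ "$(dg_primary_db)" = "$1" ]
}

# Example use, with npm_clone as the former standby from the sample output
# (assumption: dgmgrl accepts a command on its command line):
#   dgmgrl -silent sys/password@database-name "show configuration;" \
#       | switchover_done npm_clone && echo "npm_clone is now primary"
```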
We recommend that after a failed NetOp database is prepared for a return to service, you restore the database as the standby database, regardless of its previous role. After it is prepared for a return to service, you can re-create the appropriate failover configuration:
To re-create a manual failover configuration, see the Re-create Manual Failover section on page 3-13.
To re-create a fast-start failover configuration, see the Re-create Fast-Start Failover section on page 3-14.
When you have fixed or replaced the failed NetOp database host, restore it as a standby database:
1. On the current database host, create a backup of the database by running the backup_npm_db.sh script using the -standby_failed keyword; see the Back Up the NetOp Database section on page 2-5 of the NetOp Policy Manager Database Administration Guide.
2. Generate a .tar file of the backup set, specifying the full path to the directory, and compress the file.
3. Copy and extract the backup set from the backup directory on the current NetOp database host to the NetOp database host you are preparing to become the standby host or to an NFS directory shared by both NetOp database hosts.
4. Complete the configuration of the manual failover option on both database hosts. See the Create and Configure the Standby NetOp Database section on page 2-3 and the Set Up the Primary NetOp Database for Failover section on page 2-4.
5. If the IP address or the name of the failed database host has changed and the NetOp PM components:
   Have been configured with both database hosts, then the NetOp PM components automatically switch to the active database host.
   Have not been configured with both database hosts, run the config_npm.sh script with the -db_host database_host, [database_host] construct on each NetOp PM application host and restart the NetOp PM components. For a description of the config_npm.sh script and its syntax, see the Change NetOp Database Account Passwords section on page 2-1 of the NetOp Policy Manager Database Administration Guide. For information on starting and stopping NetOp PM components, see the NetOp PM Components chapter in the NetOp Policy Manager Configuration Guide.
6. On the standby database host, verify that both the primary and standby NetOp databases are operational; for more information, see the Verify That the Primary and Standby Databases Are Running section on page 2-1 of the NetOp Policy Manager Database Troubleshooting Guide.
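Steps 2 and 3 (archive, compress, copy, and extract the backup set) can be sketched with standard UNIX tools. This is an illustration under stated assumptions, not a NetOp-supplied script: pack_backup_set and unpack_backup_set are hypothetical helper names, the archive name is your choice, and the archive path must be absolute because the helper changes directory before running tar.

```shell
#!/bin/sh
# pack_backup_set BACKUP_DIR ARCHIVE (hypothetical helper):
# archive the backup set directory, given by its full path, into ARCHIVE
# (an absolute path ending in .tar), then compress it to ARCHIVE.gz.
pack_backup_set() {
    backup_dir=$1
    archive=$2
    (cd "$(dirname "$backup_dir")" &&
        tar -cf "$archive" "$(basename "$backup_dir")") &&
        gzip -f "$archive"
}

# unpack_backup_set ARCHIVE_GZ DEST_DIR (hypothetical helper):
# extract a compressed backup set into DEST_DIR on the standby host
# or on an NFS directory shared by both hosts.
unpack_backup_set() {
    archive_gz=$1
    dest=$2
    mkdir -p "$dest" &&
        gzip -dc "$archive_gz" | (cd "$dest" && tar -xf -)
}
```

Between the two calls, copy the .tar.gz file to the other host with whatever transfer tool your site uses (for example scp or an NFS mount); the host names and directories involved are site-specific.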
8. Generate a .tar file of the backup set, specifying the full path to the directory, and compress the file.
9. Copy and extract the backup set from the backup directory on the current NetOp database host to the NetOp database host that you are preparing to become the standby host or to an NFS directory shared by both NetOp database hosts.
10. If you are creating the standby database on the same server where the initial primary database failed, run the remove_db.sh script, then clean up the database installation by manually removing the netop and oracle directories from the /u01/app directory.
11. Log on to the NetOp database host as root and open a command shell. Install the new standby database from the NetOp PM Platform Software media or ISO images using the netop_install.sh script with the database keyword. For information on running this script, see the Install the Oracle DBMS section of the NetOp Policy Manager Installation Guide.
12. Configure the NetOp database by running the config_npm_db.sh script.
13. On the standby database server host, run the setup_standby_db.sh script, referencing the directory location of the backup files created when you backed up the primary database:
    ./setup_standby_db.sh -backup_dir backup_directory -primary_db_ip primary_ip_address
14. On the primary server host, run the setup_primary_db.sh script, identifying the location of the backup files created when you backed up the primary database:
    ./setup_primary_db.sh -backup_dir backup_directory -standby_db_ip standby_ip_address -enable_fast_start_failover -standby_db_passwd sysdba_account_password
15. Set up and start the observer process by performing the following tasks:
    a. On the Data Guard observer host, log on as root.
    b. Navigate to the Administration directory.
    c. Run the setup_observer.sh script to configure the fast-start failover observer process. For more information, see the Configure and Start the Fast-Start Failover Observer Process section on page 2-8.
16. Verify the configuration by repeating steps 1-3 of this procedure to connect to the database as the oracle user, then typing show configuration at the DGMGRL prompt. If the configuration is correct, a success message is returned.
17. If the IP address or the name of the failed database host has changed and the NetOp PM components:
    Have been configured with both database hosts, then the NetOp PM components automatically switch to the active database host.
    Have not been configured with both database hosts, run the config_npm.sh script with the -db_host database_host, [database_host] construct on each NetOp PM application host and restart the NetOp PM components. For a description of the config_npm.sh script and its syntax, see the Change NetOp Database Account Passwords section on page 2-1 of the NetOp Policy Manager Database Administration Guide. For information on starting and stopping NetOp PM components, see the NetOp PM Components chapter in the NetOp Policy Manager Configuration Guide.