
Cisco CallManager Database Replication

Vajrender (Sunny) Akkera

Presentation_ID

2006 Cisco Systems, Inc. All rights reserved.

Cisco Confidential

Agenda
- CallManager Database Architecture
- DB Replication Flow Diagram
- What could possibly break DB replication
- How to verify if DB replication is broken
- Troubleshooting database replication issues
- Replication Logs
- Closing


DB Architecture: Install/Upgrade
In 5.0 and 5.1:
- The publisher upgrade migrates data to the new version prior to reboot.
- The subscriber starts replication setup after it is upgraded and rebooted.
- Replication setup pushes data from the publisher to the subscriber.
- The subscriber's local database is ready for failover only after replication is complete.


DB Architecture: Install/Upgrade
In 6.X and later:
- The publisher upgrade migrates data and performs an ontape (Informix utility) backup prior to reboot to the new version.
- The subscriber upgrade gets the publisher ontape backup via SFTP and restores that data to the subscriber. (This gets the data close in content, which is imperative for services that read data locally.)
- The subscriber starts replication setup after the upgrade and reboot.
- Replication setup audits the data and pushes differences between the publisher and subscriber to the subscriber. Change notification is sent to the local services for each change.
- The local database is ready before replication is complete.
- The replication setup timeout is settable via the CLI: utils dbreplication setrepltimeout 900 (15 minutes).
- User Facing Features (listed on a later slide) are backed up locally on all servers prior to upgrade and reboot, and restored after reboot, so that any changes made by users during the upgrade are not lost.


DB Architecture CallManager 5.X


DB Architecture CallManager 6.X


User Facing Features (UFF)
This data can be written into the local DB:
- Call Forward All (CFA)
- Message Waiting Indication (MWI)
- Privacy Enable/Disable
- Do Not Disturb Enable/Disable (DND)
- Extension Mobility Login (EM)
- Monitor (for future use, currently no updates at the user level)
- Hunt Group Logout
- Device Mobility
- CTI CAPF status for end users and application users
- Credential hacking and authentication


DB Architecture: Replication 6.X


- Replication is now fully meshed. A change on any server gets propagated to every other server.
- Only UFF data is writable on a subscriber, so that is the only data that will replicate from a subscriber.
- Logically, most data is still hub-and-spoke from a replication perspective, since most data is still only updatable on the publisher.
- Replication queues on the subscriber are now used.
- Perfmon counters for replication are now used on subscribers.
- Replication now impacts data availability and change notification.


DB Architecture: Replication 5.X


DB Architecture: Replication 6.X


DB Replication Flow Diagram


DB Replication Flow Diagram


Steps to DB Replication
These steps are done automatically by the replication scripts when the system is installed. When we run utils dbreplication reset all, these steps are done again.
1. Tear down replication (cdr delete server). This can cause corruption of the syscdr database.
2. Define the publisher. This sets it up to start replicating.
3. Define the template on the publisher and realize it. This tells the publisher which tables to replicate.
4. Define each subscriber.
5. Realize the template on each subscriber. This tells the subscribers which tables they will get/send data for.
6. Synchronize the data.
When we look at the log files, we see output from steps 3, 4, and 5. Each subscriber runs its own define step, but the realize and sync steps show up in the dbl_repl_output_Broadcast_.log file. There may be one subscriber or many in the "batch".

What could possibly break Replication


- Connectivity issues between nodes
- Host files mismatch
- Communication on UDP port 8500 not in phase 2
- DNS not configured properly (forward/reverse lookup)
- NTP not reachable
- A Cisco DB and A Cisco DB Replicator not running/working
- Cisco Database Layer Monitor (dbmon) hung/stopped


DB Replication Troubleshooting
- How do we verify if replication is broken?
- Commands to diagnose and fix replication
- If you cannot fix it, which trace files do we collect?
- If the customer needs an RCA, we would have to run the special ercollect script on the server.


How to verify if Replication is broken?


- Replication failure alert
- Replication status counter not in a good state (can be watched proactively)
- CLI for replication status shows suspect tables or missing servers
- CM Database Status Report under Unified Reporting
- Verify the output of utils dbreplication runtimestate on the publisher (if the command is available)


How to verify if Replication is broken?

What the replication state counter means:
0 = Initialization
1 = Number of replicates is not correct (old sys)
2 = Replication is good
3 = Replication is bad
4 = Replication setup did not succeed (this meaning is for 5.1.3 and all 6.X versions)
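The counter values listed here can be kept as a small lookup table when scripting health checks. This is a minimal sketch, not a Cisco tool: the names REPLICATE_STATE, describe_state, and is_healthy are hypothetical, and only the meanings themselves come from the slide.

```python
# Meanings copied from the slide; everything else here is illustrative.
REPLICATE_STATE = {
    0: "Initialization",
    1: "Number of replicates is not correct (old sys)",
    2: "Replication is good",
    3: "Replication is bad",
    4: "Replication setup did not succeed",
}


def describe_state(state: int) -> str:
    """Return the slide's meaning for a Replicate_State value."""
    return REPLICATE_STATE.get(state, "Unknown state")


def is_healthy(state: int) -> bool:
    """Only state 2 is described as good on the slide."""
    return state == 2


if __name__ == "__main__":
    for value in sorted(REPLICATE_STATE):
        print(value, describe_state(value), is_healthy(value))
```

A script that polls the counter on every node could call is_healthy on each value and report any node that returns False.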


How to verify if Replication is broken?


How to verify if Replication is broken?


admin:show perf query class "Number of Replicates Created and State of Replication"
==>query class :
 - Perf class (Number of Replicates Created and State of Replication) has instances and values:
    ReplicateCount -> Number of Replicates Created = 348
    ReplicateCount -> Replicate_State = 2
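If the perf query output is saved to a text file, the two counters can be pulled out with a regular expression. The sketch below assumes the output has the "ReplicateCount -> name = value" shape shown on this slide; parse_counters and SAMPLE are hypothetical names used only for illustration.

```python
import re

# Sample text modelled on the slide output (one counter per line).
SAMPLE = """\
==>query class :
 - Perf class (Number of Replicates Created and State of Replication) has instances and values:
    ReplicateCount -> Number of Replicates Created = 348
    ReplicateCount -> Replicate_State = 2
"""

# Captures the counter name and its integer value.
COUNTER_RE = re.compile(r"ReplicateCount\s*->\s*(.+?)\s*=\s*(\d+)")


def parse_counters(text: str) -> dict:
    """Map each counter name to its integer value."""
    return {name: int(value) for name, value in COUNTER_RE.findall(text)}


counters = parse_counters(SAMPLE)

if __name__ == "__main__":
    print(counters)
```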


How to verify if Replication is broken?


How to verify if Replication is broken?


admin:utils dbreplication runtimestate

DB and Replication Services: ALL RUNNING

Cluster Replication State: Replication status command started at: 2010-05-13-15-53
    Replication status command COMPLETED 427 tables checked out of 427
    No Errors or Mismatches found.

DB Version: ccm7_1_3_10000_11
Number of replicated tables: 427

Cluster Detailed View from PUB (2 Servers):

                           PING          REPLICATION  REPL.  DBver&   REPL.  REPLICATION SETUP
SERVER-NAME  IP ADDRESS    (msec)  RPC?  STATUS       QUEUE  TABLES   LOOP?  (RTMT) & details
-----------  ------------  ------  ----  -----------  -----  -------  -----  -----------------
Publisher    14.128.62.72  0.063   Yes   Connected    0      match    N/A    (2) PUB Setup Completed
subscriber   14.128.62.73  0.384   Yes   Connected    0      match    N/A    (2) Setup Completed

Troubleshooting Steps
- Verify connectivity
- Verify host files are in sync
- Verify connectivity on UDP port 8500
- Verify NTP reachability and network validation
- Is the publisher failing to define the template or realize the template?
- DB replication commands


Troubleshooting : Verify Connectivity


utils network connectivity
This command can take up to 3 minutes to complete. Continue (y/n)?y
Running test, please wait ...
.
Network connectivity test with the publisher completed successfully.
Note: This command can be run only on the subscribers.

utils network host <hostname/ipaddress>
Verifies DNS resolution.

utils network ping <hostname/ipaddress>
Helps verify connectivity between nodes.


Troubleshooting : Verify Host Files


- /etc/hosts
- /etc/services
- /home/informix/.rhosts
- /usr/local/cm/db/informix/etc/sqlhosts


Troubleshooting : Verify Host Files


admin:show tech network hosts
-------------------- show platform network --------------------
/etc/hosts File:
#This file was generated by the /etc/hosts cluster manager.
#It is automatically updated as nodes are added, changed, removed from the cluster.
127.0.0.1 localhost
14.128.62.3 CM613
14.128.62.6 CM613SUB


Troubleshooting : Verify Host Files


admin:show tech dbstateinfo
Database State Info
Output is in cm/trace/dbl/showtechdbstateinfo20593.out

admin:file view activelog cm/trace/dbl/showtechdbstateinfo20593.out
(Hit e to go to the end of the file)

#SQL Hosts:
g_hdr group i=1
g_cm613_ccm6_1_3_1000_16 group i=2
cm613_ccm6_1_3_1000_16 onsoctcp CM613 cm613_ccm6_1_3_1000_16 g=g_cm613_ccm6_1_3_1000_16 b=32767
g_cm613sub_ccm6_1_3_1000_16 group i=3
cm613sub_ccm6_1_3_1000_16 onsoctcp CM613SUB cm613sub_ccm6_1_3_1000_16 g=g_cm613sub_ccm6_1_3_1000_16 b=32767

# .rhosts:
localhost
CM613
CM613SUB



Troubleshooting : Verify Host Files


Troubleshooting : Data Access Failure


admin:utils firewall list
ACCEPT  tcp  --  CM613SUB  anywhere  tcp dpt:cm613_ccm6_1_3_1000_16
ACCEPT  udp  --  CM613SUB  anywhere  udp dpt:1500
ACCEPT  tcp  --  CM613SUB  anywhere  tcp dpt:1501
ACCEPT  udp  --  CM613SUB  anywhere  udp dpt:1501

The example above is from a pub (CM613) where CM613SUB is the sub. The sub should have similar entries for the pub. If it does not, it is probably a network issue.
TCP port 1501 is used by the CallManager database at the time of migration (upgrade).
Ensure all servers in the cluster have good status (TCP and ACCEPT on port 1500, named by server). Otherwise, verify the Cluster Manager logs:
- file list activelog platform/log/clustermgr*
- file view activelog platform/log/clustermgr00000002.log
Example:
06/14/2010 23:22:03.009 clm|HMAC_SHA1 match failed IP(14.128.62.6)| (Failed)
03/25/2010 06:52:39.864 clm|hostname: CM613SUB state POLICY_INJECTED| (Success)
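When a Cluster Manager log is long, the two markers from the example (HMAC_SHA1 match failed and POLICY_INJECTED) can be picked out with a short script. This is a rough sketch that assumes the log lines look like the two example lines on this slide; summarize_clm is a hypothetical helper, and the script reads a downloaded copy of the log rather than running on the server.

```python
import re

# Patterns based only on the two example log lines from the slide.
PEER_OK_RE = re.compile(r"hostname:\s*(\S+)\s+state\s+POLICY_INJECTED")
AUTH_FAIL_RE = re.compile(r"HMAC_SHA1 match failed IP\(([^)]+)\)")

SAMPLE = [
    "06/14/2010 23:22:03.009 clm|HMAC_SHA1 match failed IP(14.128.62.6)|",
    "03/25/2010 06:52:39.864 clm|hostname: CM613SUB state POLICY_INJECTED|",
]


def summarize_clm(lines):
    """Return (peers that reached POLICY_INJECTED, IPs with HMAC failures)."""
    ok_peers, failed_ips = [], []
    for line in lines:
        ok = PEER_OK_RE.search(line)
        if ok and ok.group(1) not in ok_peers:
            ok_peers.append(ok.group(1))
        bad = AUTH_FAIL_RE.search(line)
        if bad and bad.group(1) not in failed_ips:
            failed_ips.append(bad.group(1))
    return ok_peers, failed_ips


if __name__ == "__main__":
    print(summarize_clm(SAMPLE))
```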

Troubleshooting : Data Access Failure


// Cluster Manager Log (file list activelog platform/log/clustermgr*)
03/25/2010 06:52:24.547 clm|exec'ing: /root/.security/drf/setdrfdetails.sh
03/25/2010 06:52:24.636 clm|Binding to /usr/local/platform/conf/clm/unix_socket
03/25/2010 06:52:24.636 clm|creating 2 state machines
03/25/2010 06:52:24.637 clm|succeeded to create sm for: CM613SUB
03/25/2010 06:52:24.637 clm|exec'ing: sudo /root/.security/ipsec/disable_ipsec.sh -desthostName=CM613SUB --op=delete
03/25/2010 06:52:26.215 clm|hostname: CM613SUB state INITIATOR|
03/25/2010 06:52:26.356 clm|exec'ing: /etc/init.d/iptables start
03/25/2010 06:52:27.340 clm|ignoring initiation from other side peer hostname(CM613SUB)
03/25/2010 06:52:33.804 clm|exec'ing: /etc/init.d/iptables start
03/25/2010 06:52:35.750 clm|for initator(CM613SUB): entering the policy injected state
03/25/2010 06:52:39.864 clm|hostname: CM613SUB state POLICY_INJECTED

Troubleshooting : Data Access Failure


admin:utils network capture port 8500
Executing command with options:
 size=128 count=1000 interface=eth0
 src= dest= port=8500
 ip=
22:09:10.479943 CM613.8500 > CM613SUB.8500: isakmp: phase 2/others ? #71[C] (DF)
22:09:10.481232 CM613SUB.8500 > CM613.8500: isakmp: phase 2/others ? #71[C] (DF)
22:09:15.474954 CM613SUB.8500 > CM613.8500: isakmp: phase 2/others ? #71[C] (DF)
22:09:15.475677 CM613.8500 > CM613SUB.8500: isakmp: phase 2/others ? #71[C] (DF)

Verify that the communication is in phase 2 in both directions (pub->sub, sub->pub). If you have multiple nodes in the cluster, every node must be in phase 2 with every other node in the cluster. You can also check the CM system logs to verify that the server is in the policy injected state with the other nodes.
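The both-directions check on the capture can be automated once the capture lines are saved as text. The sketch below assumes each line has the same layout as the four sample lines on this slide (timestamp, source.8500 > destination.8500, then "isakmp: phase 2"); phase2_directions and both_directions are hypothetical helper names.

```python
import re

# Matches "<time> SRC.8500 > DST.8500: isakmp: phase 2..." as in the slide.
LINE_RE = re.compile(r"^\S+\s+(\S+)\.8500 > (\S+)\.8500: isakmp: phase 2")

SAMPLE = [
    "22:09:10.479943 CM613.8500 > CM613SUB.8500: isakmp: phase 2/others ? #71[C] (DF)",
    "22:09:10.481232 CM613SUB.8500 > CM613.8500: isakmp: phase 2/others ? #71[C] (DF)",
]


def phase2_directions(lines):
    """Return the set of (source, destination) pairs seen in phase 2."""
    seen = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            seen.add((match.group(1), match.group(2)))
    return seen


def both_directions(lines, node_a, node_b):
    """True when phase 2 traffic was captured a->b and b->a."""
    seen = phase2_directions(lines)
    return (node_a, node_b) in seen and (node_b, node_a) in seen


if __name__ == "__main__":
    print(both_directions(SAMPLE, "CM613", "CM613SUB"))
```

For a multi-node cluster, the same check would be repeated for every pair of nodes, using a capture taken on each server.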

Troubleshooting : Verify NTP reachability and Network Validity


admin:utils diagnose test

Log file: /var/log/active/platform/log/diag4.log

Starting diagnostic test(s)
===========================
test - disk_space        : Passed (available: 849 MB, used: 4998 MB)
skip - disk_files        : This module must be run directly and off hours
test - service_manager   : Passed
test - tomcat            : Passed
test - validate_network  : Passed
test - system_info       : Passed (Collected system information in diagnostic log)
test - ntp_reachability  : Passed
test - ntp_clock_drift   : Passed
test - ntp_stratum       : Passed

Diagnostics Completed

Troubleshooting : Is the publisher failing to define the template or realize the template
Check the logs to see at what point replication is failing.
admin:file list activelog /cm/trace/dbl date det
15 Jun,2010 10:45:17  <dir>    dblj
15 Jun,2010 10:45:17  <dir>    ncsj
15 Jun,2010 10:45:17  <dir>    sdi
19 Nov,2009 18:53:44  1,847    2010_09_15_11_14_58_ne042_ccm_164_ccm8_6_0_96000_16_dbl_repl_cdr_define.log
19 Nov,2009 18:59:57  299,786  2010_09_15_13_10_20_dbl_repl_cdr_Broadcast.log
19 Nov,2009 18:59:57  1,261    2010_09_15_13_10_20_dbl_repl_output_Broadcast.log

These logs are covered in more detail in the Replication Logs slides later in this deck.


DB Replication Commands


DB Replication Commands
utils dbreplication status
This command displays the status of database replication by comparing the database content of the subscribers to the publisher. It indicates whether the servers in the cluster are connected and whether the data is in sync.

utils dbreplication stop
This command stops automatic replication setup on the local server, waits for the replication timeout, and stops the automatic replication setup again.
- Wait for it to finish before running the reset commands that follow.
- This command is typically run prior to running reset.
- Stop replication on the subs first and then on the pub.
- After we stop on the pub, it waits for the repl timeout to start replication.
- We would then have to run reset to initiate replication, because all the automatic setup processes are stopped.


DB Replication Commands
utils dbreplication repair
This command repairs data that is out of sync. Run it when utils dbreplication status shows the servers as connected but a few tables are out of sync.
Syntax: utils dbreplication repair {all | hostname}

utils dbreplication reset
This command tears down and rebuilds replication when the system has not set up properly.
Syntax: utils dbreplication reset {all | hostname}


DB Replication Commands
utils dbreplication setrepltimeout
Syntax: utils dbreplication setrepltimeout timeout
timeout - The new database replication timeout, in seconds. The value range is 300 to 7200.

The default database replication timeout is 5 minutes (value of 300). This timer applies to both the replication stop and replication reset commands. For reset, it waits for the timer after defining the servers and then realizes the template.
When the first subscriber requests replication with the pub, this timer is set. When the timer expires, the first sub plus any other subs that requested replication within that time period begin data replication with the pub in a "batch". For large clusters, you can use the command to increase the default timeout value so that more subs are included in the batch.

Set this timer on the publisher after the publisher has been upgraded and booted on the upgraded partition, but before the first sub has been switched over to the new release. Then, when the first sub requests replication, the pub sets the timer based on the new value.
Note: It is recommended that you restore this value to the default of 300 (5 minutes) once the entire cluster is upgraded successfully and the subs have successfully set up replication.
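The batching behaviour of the replication timer can be illustrated with a toy model. This is only a sketch of the rule as described on this slide, not the publisher's actual logic: first_batch and validate_repl_timeout are hypothetical names, request times are seconds on an arbitrary clock, and treating a request that arrives exactly at expiry as part of the batch is an assumption.

```python
from typing import Dict, List

MIN_TIMEOUT_S = 300   # lower bound of the documented range
MAX_TIMEOUT_S = 7200  # upper bound of the documented range


def validate_repl_timeout(timeout_s: int) -> int:
    """Reject values outside the 300..7200 second range from the slide."""
    if not MIN_TIMEOUT_S <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError("timeout must be between 300 and 7200 seconds")
    return timeout_s


def first_batch(request_times: Dict[str, float], timeout_s: int) -> List[str]:
    """Return the subs that land in the first replication batch.

    The timer starts at the earliest request; every sub that requested
    replication before it expires joins the batch (boundary is assumed
    inclusive).
    """
    validate_repl_timeout(timeout_s)
    if not request_times:
        return []
    deadline = min(request_times.values()) + timeout_s
    return sorted(name for name, t in request_times.items() if t <= deadline)


if __name__ == "__main__":
    requests = {"sub1": 0, "sub2": 200, "sub3": 600}
    print(first_batch(requests, 300))
    print(first_batch(requests, 900))
```

In this model, raising the timeout from 300 to 900 seconds pulls the third subscriber into the first batch, which is the effect the slide describes for large clusters.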

DB Replication Commands
admin:show tech repltimeout
-------------------- show tech repltimeout --------------------
The Replication timeout is set to 300 seconds

This command shows the repltimeout set on the cluster.


DB Replication Commands
utils dbreplication runtimestate
This command helps confirm that the publisher can communicate with the DBLRPC service (aka Database Replicator) on every subscriber. Check the RPC? column. It is typically run before running the reset command.

admin:utils dbreplication runtimestate

DB and Replication Services: ALL RUNNING

Cluster Replication State: Replication status command started at: 2010-05-13-15-53
    Replication status command COMPLETED 427 tables checked out of 427
    No Errors or Mismatches found.

DB Version: ccm7_1_3_10000_11
Number of replicated tables: 427

Cluster Detailed View from PUB (2 Servers):

                           PING          REPLICATION  REPL.  DBver&   REPL.  REPLICATION SETUP
SERVER-NAME  IP ADDRESS    (msec)  RPC?  STATUS       QUEUE  TABLES   LOOP?  (RTMT) & details
-----------  ------------  ------  ----  -----------  -----  -------  -----  -----------------
Publisher    14.128.62.72  0.063   Yes   Connected    0      match    N/A    (2) PUB Setup Completed
subscriber   14.128.62.73  0.384   Yes   Connected    0      match    N/A    (2) Setup Completed

DB Replication Commands: Last Resort
utils dbreplication clusterreset
This command can be used to debug database replication, but should only be used if "utils dbreplication reset all" has already been tried and has failed to restart replication on the cluster.
- This command tears down and rebuilds replication for the entire cluster.
- After using this command, each sub needs to be rebooted. Once the subs have been rebooted, go to the pub and issue the CLI command "utils dbreplication reset all".
- RCA cannot be determined once you run this command.
Syntax: utils dbreplication clusterreset

utils dbreplication dropadmindb
This command drops the Informix syscdr database on any server in the cluster.
- Run this command only if database replication reset or clusterreset fails to define a particular node in the replication process.
- RCA cannot be determined once you run this command.
Syntax: utils dbreplication dropadmindb

DB Replication Command : Example


utils dbreplication status - Good Status
1. Check the output to be sure each server is connected and no tables are suspect.
2. The status should list all the subscribers as connected at the top of the file, with no suspect tables.

SERVER            ID  STATE   STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_bldr_ccm4_ccm    2  Active  Local          0
g_bldr_ccm5_ccm    3  Active  Connected      0  Sep 6 16:27:15


DB Replication Command : Example


- Bad Status: Servers out of sync
1. If the RTMT counter value for replication state is 2 or 3 for all nodes of the cluster, then replication is set up.
2. Replication state 3 means that a few tables are out of sync.
3. Run utils dbreplication repair to clear this issue (see the DB Replication Commands slides).

SERVER            ID  STATE   STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_bldr_ccm4_ccm    2  Active  Local          0
g_bldr_ccm5_ccm    3  Active  Connected      0  Sep 6 16:27:15
---------- Suspect Replication Summary ----------
For table: ccmdbtemplate_bldr_ccm4_ccm_1_27_processnode replication is suspect for node(s): g_bldr_ccm5_ccm
For table: ccmdbtemplate_bldr_ccm4_ccm_1_34_replicationdynamic replication is suspect for node(s): g_bldr_ccm5_ccm
-------------------------------------------------
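The Suspect Replication Summary is easy to scan by eye for two tables, but on a busy cluster a short script helps group suspect tables by node. The sketch below assumes each entry follows the "For table: ... replication is suspect for node(s): ..." wording from this slide and names a single node; suspect_tables is a hypothetical helper.

```python
import re

# One entry per suspect table, as worded in the status output on the slide.
SUSPECT_RE = re.compile(
    r"For table:\s*(\S+)\s+replication is suspect for node\(s\):\s*(\S+)"
)

SAMPLE = (
    "For table: ccmdbtemplate_bldr_ccm4_ccm_1_27_processnode "
    "replication is suspect for node(s): g_bldr_ccm5_ccm\n"
    "For table: ccmdbtemplate_bldr_ccm4_ccm_1_34_replicationdynamic "
    "replication is suspect for node(s): g_bldr_ccm5_ccm\n"
)


def suspect_tables(text: str) -> dict:
    """Group suspect table names by the node they are suspect on."""
    by_node = {}
    for table, node in SUSPECT_RE.findall(text):
        by_node.setdefault(node, []).append(table)
    return by_node


if __name__ == "__main__":
    for node, tables in suspect_tables(SAMPLE).items():
        print(node, len(tables), "suspect table(s)")
```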


DB Replication Command : Example


- Bad Status: Replication not set up properly
1. One or more servers show "Quiescent" or "Dropped" status. This status is not necessarily bad, as the server could have been shut down or be in the middle of replication.
2. If the server's replication status is Failed, it is in a bad state.
3. This would typically show a replicate state of 0 or 4.
4. Run utils dbreplication reset to clear this issue.

SERVER            ID  STATE   STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_bldr_ccm4_ccm    2  Active  Local          0
g_bldr_ccm5_ccm    3  Active  Dropped      636  Sep 10 14:01:20

Possible causes:
1. A communications problem or network error (the publisher and subscriber cannot talk).
2. One or more ports required by the database are not open on the firewall.
3. Host files not set up properly.
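The server table at the top of the status output can be parsed to flag any server that is not Local or Connected. This is a minimal sketch that assumes whitespace-separated columns in the order SERVER, ID, STATE, STATUS, QUEUE, CONNECTION CHANGED, as in the sample on this slide; ServerRow, parse_rows, and needs_attention are hypothetical names. A flagged row is a prompt to investigate, since a Quiescent or Dropped status is not always a fault.

```python
from typing import List, NamedTuple


class ServerRow(NamedTuple):
    server: str
    server_id: int
    state: str
    status: str
    queue: int
    changed: str


SAMPLE = """\
SERVER ID STATE STATUS QUEUE CONNECTION CHANGED
-----------------------------------------------------------------------
g_bldr_ccm4_ccm 2 Active Local 0
g_bldr_ccm5_ccm 3 Active Dropped 636 Sep 10 14:01:20
"""


def parse_rows(text: str) -> List[ServerRow]:
    """Keep only lines whose ID and QUEUE columns are integers."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[1].isdigit() and parts[4].isdigit():
            rows.append(ServerRow(parts[0], int(parts[1]), parts[2],
                                  parts[3], int(parts[4]), " ".join(parts[5:])))
    return rows


def needs_attention(rows: List[ServerRow]) -> List[str]:
    """Servers whose STATUS is neither Local nor Connected."""
    return [r.server for r in rows if r.status not in ("Local", "Connected")]


if __name__ == "__main__":
    print(needs_attention(parse_rows(SAMPLE)))
```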

Commands introduced in CM 7.X


utils dbreplication forcedatasyncsub
This command forces a subscriber server to have its data restored from the data on the publisher server.
- Use this command only after you have run utils dbreplication repair several times and utils dbreplication status still shows non-dynamic tables that are not in sync.
- This command can take a significant amount of time to execute and can affect system-wide IOWAIT.
- It takes a database backup of the publisher server and restores that data into the database on the subscriber server.
- It erases all existing data on the subscriber server and replaces it with the database from the publisher server, which makes it impossible to determine the original root cause of the subscriber tables going out of sync.
- After you run this command, you must restart the restored subscriber servers.
- This command causes an outage on the server on which it is run.
- It is a last resort; once it is used, RCA cannot be done.
Syntax: utils dbreplication forcedatasyncsub {all | hostname}

Commands introduced in CM 7.X


utils dbreplication quickaudit
This command runs a quick database check on selected content of dynamic tables:
- Number of configured devices, directory numbers, and users
- Number of mobility devices changing device pool
- Number of extension mobility users logged in
- Number of active extensions with DND set
- Number of active extensions with MWI set
- Number of active extensions with CFA set
Syntax: utils dbreplication quickaudit nodename | all

utils dbreplication dropadmindbforce
Drops the Informix syscdr database on the server on which it is run.
Syntax: utils dbreplication dropadmindbforce

Commands introduced in CM 7.X


utils dbreplication repairreplicate
This command repairs mismatched data between cluster nodes and changes the node data to match the publisher data. It does not repair replication setup.
Syntax: utils dbreplication repairreplicate replicatename [nodename]|all

utils dbreplication repairtable
This command repairs mismatched data between cluster nodes and changes the node data to match the publisher data. It does not repair replication setup.
Syntax: utils dbreplication repairtable tablename [nodename]|all


Replication Logs
From the publisher:
1. file get activelog cm/log/informix/*dbl_repl*.log
2. file get activelog cm/trace/dbl/*dbl_repl*.log
3. file get activelog cm/log/informix/ccm.log*
4. file get activelog cm/log/informix/ats/*
5. file get activelog cm/log/informix/ris/*
6. file get activelog cm/trace/dbl/sdi/dbmon*.txt
7. file get activelog cm/log/info
8. Run utils dbreplication status, then file get activelog cm/trace/dbl/sdi/ReplicationStatus*
9. file get activelog cm/trace/dbl/sdi/ReplicationRepair*
10. file get activelog cm/trace/dbl/sdi/replication_scripts_output.log
11. utils diagnose test output (file get activelog /platform/log/diag2.log)

Replication Logs
From the subscribers:
1. file get activelog cm/log/informix/ccm.log*
2. file get activelog cm/trace/dbl/sdi/dbmon*.txt
3. file get activelog cm/log/informix/ats/*
4. file get activelog cm/log/informix/ris/*

Download the following Unified Reports:
1. Database Status
2. Cluster Overview
3. Replication Debug


Replication Logs
admin:file list activelog /cm/trace/dbl date det
15 Jun,2010 10:45:17  <dir>    dblj
15 Jun,2010 10:45:17  <dir>    ncsj
15 Jun,2010 10:45:17  <dir>    sdi
19 Nov,2009 18:53:44  1,847    dbl_repl_cdr_define_subscriber_ccm7_1_3_10000_112009_11_19_18_53_21.log
19 Nov,2009 18:59:57  299,786  dbl_repl_cdr_Broadcast_2009_11_19_18_58_44.log
19 Nov,2009 18:59:57  1,261    dbl_repl_output_Broadcast_2009_11_19_18_58_44.log


Replication Logs : Sample Define


[# cat 2010_09_15_11_14_58_ne042_ccm_164_ccm8_6_0_96000_16_dbl_repl_cdr_define.log
passed dbname [ccm6_1_0_9901_391]
dbname passed[ccm6_1_0_9901_391]
local_dbname [ccm6_1_0_9901_391]
-------Inside deleteQuiescent-------
subscriber name: g_nw104a_196_ccm
sucmd to execute [su -c 'cdr list serv > /tmp/cdr_list_serv_local_quiescent' informix]
-------Exiting deleteQuiescent-------
sucmd_err [su -c 'ulimit -c 0;cdr err --zap' - informix ]
Executing [su -c 'ulimit -c 0;cdr define server --connect=nw104a_196_ccm --idle=0 --init --sync=g_nw104a_212_ccm g_nw104a_196_ccm --ats=/var/log/active/cm/log/informix/ats --ris=/var/log/active/cm/log/informix/ris;' informix]
After Executing [su -c 'ulimit -c 0;cdr define server --connect=nw104a_196_ccm --idle=0 --init --sync=g_nw104a_212_ccm g_nw104a_196_ccm --ats=/var/log/active/cm/log/informix/ats --ris=/var/log/active/cm/log/informix/ris;' informix]
---------------START-------------
-------Inside getServCountonpublisher-------
sucmd to execute [su -c 'cdr list serv > /tmp/cdr_list_serv_local' - informix]
---Inside-------locateFailure
servcount_on_publisher is [1]
sleeptime is[10]
SERVER            ID  STATE   STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_nw104a_196_ccm  17  Active  Local          0
g_nw104a_212_ccm   2  Active  Connected      0  Sep 24 16:43:20
Count on node [g_nw104a_196_ccm] is [1]
count_on_publisher [1]
-------LocateFailure-------Returning-------------
servcount_on_publisher is [1]
--------------END-------------
sucmd [su -c 'ulimit -c 0;cdr err -a' - informix >> /usr/local/cm/db/cdr_err_define.out 2>&1]
size of cdr_err.out is [64]

Replication Logs : Sample Define


In the 2010_09_15_11_14_58_ne042_ccm_164_ccm8_6_0_96000_16_dbl_repl_cdr_define.log output above:
- The servers show Local or Connected, which is good.
- The log shows "size of cdr_err.out is [64]", which is good.


Replication : Sample dbl_repl_output_broadcast


[root@nw104a-212 dbl]# cat 2010_09_15_13_10_20_dbl_repl_cdr_Broadcast.log
sucmd [su -c 'ulimit -c 0;cdr err --zap' - informix >> /var/log/active/cm/trace/dbl/dbl_repl_cdr_Broadcast_2007_09_24_16_59_57.log 2>&1]
Starting Broadcast RT...(g_nw104a_196_ccm g_nw104a_198_ccm g_nw104a_199_ccm g_nw104a_201_ccm g_nw104a_202_ccm g_nw104a_200_ccm g_nw104a_203_ccm g_nw104a_205_ccm g_nw104a_206_ccm g_nw104a_194_ccm g_nw104a_208_ccm g_nw104a_209_ccm )
sucmd [su -c 'ulimit -c 0;cdr realize template ccmdbtemplate g_nw104a_196_ccm g_nw104a_198_ccm g_nw104a_199_ccm g_nw104a_201_ccm g_nw104a_202_ccm g_nw104a_200_ccm g_nw104a_203_ccm g_nw104a_205_ccm g_nw104a_206_ccm g_nw104a_194_ccm g_nw104a_208_ccm g_nw104a_209_ccm ' - informix >> /var/log/active/cm/trace/dbl/dbl_repl_cdr_Broadcast_2007_09_24_16_59_57.log 2>&1]
realizeclockstart [1190671197.81]
Time taken to do realize template [116.477200985]
cmd[rm -f /usr/local/cm/db/cdr_err_realize.out]
sucmd [su -c 'ulimit -c 0;cdr err -a' - informix >> /usr/local/cm/db/cdr_err_realize.out 2>&1]
size of cdr_err.out is [64]
Before cdr check
sucmd [su -c 'ulimit -c 0;cdr err --zap' - informix >> /var/log/active/cm/trace/dbl/dbl_repl_cdr_Broadcast_2007_09_24_16_59_57.log 2>&1]
sucmd [su -c 'ulimit -c 0; cdr check replicateset -m g_nw104a_212_ccm -s ccmdbtemplate -e delete -R g_nw104a_196_ccm g_nw104a_198_ccm g_nw104a_199_ccm g_nw104a_201_ccm g_nw104a_202_ccm g_nw104a_200_ccm g_nw104a_203_ccm g_nw104a_205_ccm g_nw104a_206_ccm g_nw104a_194_ccm g_nw104a_208_ccm g_nw104a_209_ccm --firetrigger=follow' - informix >> /var/log/active/cm/trace/dbl/dbl_repl_cdr_Broadcast_2007_09_24_16_59_57.log 2>&1]
Time taken to do cdr check[2038.29240179]
cmd[rm -f /usr/local/cm/db/cdr_check.out]
sucmd [su -c 'ulimit -c 0;cdr err -a' - informix >> /usr/local/cm/db/cdr_check.out 2>&1]
size of cdr_check.out is [64]
Realize Template Successful.


Replication : Sample dbl_repl_output_broadcast


In the output file above, look for:
- A successful realize
- A successful sync or check
- size of cdr_check.out is [64], which is good
- Realize Template Successful.
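The markers to look for in the broadcast log can be checked with a few lines of script once the log has been downloaded. This sketch assumes the "size of <file> is [N]" and "Realize Template Successful." wording seen in the sample log; check_broadcast_log is a hypothetical helper. The slides only state that a size of 64 is good, so treating any other size as a reason to read the file is an assumption.

```python
import re

# Matches lines such as "size of cdr_check.out is [64]".
SIZE_RE = re.compile(r"size of (\S+) is \[(\d+)\]")

SAMPLE = """\
Time taken to do realize template [116.477200985]
size of cdr_err.out is [64]
Before cdr check
Time taken to do cdr check[2038.29240179]
size of cdr_check.out is [64]
Realize Template Successful.
"""


def check_broadcast_log(text: str) -> dict:
    """Report whether the realize marker and the [64] sizes are present."""
    sizes = SIZE_RE.findall(text)
    return {
        "realize_ok": "Realize Template Successful." in text,
        # Assumption: any size other than 64 deserves a closer look.
        "sizes_ok": bool(sizes) and all(int(n) == 64 for _, n in sizes),
    }


if __name__ == "__main__":
    print(check_broadcast_log(SAMPLE))
```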

