HUAWEI
Emergency Maintenance
Manual Version
T2-010348-20050530-C-2.00
Product Version
V200R001
BOM
31033848
Huawei Technologies Co., Ltd. provides customers with comprehensive technical support
and service. Please feel free to contact our local office or company headquarters.
Trademarks
Notice
The information in this manual is subject to change without notice. Every effort has
been made in the preparation of this manual to ensure the accuracy of its contents,
but the statements, information, and recommendations in this manual do not
constitute a warranty of any kind, express or implied.
Organization
The manual describes the maintenance principles, emergency maintenance tasks
and operations of SG7000 V200R001.
There are three chapters and three appendixes in the manual.
Intended Audience
The manual is intended for the following readers:
Engineering personnel
Maintenance personnel
Conventions
The manual uses the following conventions:
I. General conventions
Convention
Description
Arial
Boldface
Courier New
Description
<>
[]
Window names, menu items, data table and field names are
inside square brackets. For example, pop up the [New User]
window.
Description
Press the left button or right button quickly (left button by
default).
IV. Symbols
Eye-catching symbols are also used in the manual to highlight the points worthy of
special attention during the operation. They are defined as follows:
description.
Table of Contents
Chapter 1 Classification of Emergent Faults
1.1 Definitions of Emergent Faults
1.1.1 Equipment Fault
1.1.2 Extremely Heavy Traffic Fault
1.2 Basic Principles of Emergency Maintenance
Chapter 2 Emergency Maintenance Flow
2.1 Overview of Emergency Maintenance Flow
2.1.1 Overall Flow
2.1.2 Collecting Fault Information
2.2 Handling Equipment Fault
2.2.1 Handling Breakdown of Cabinet or Frame
2.2.2 Handling BAM Breakdown
2.2.3 Handling Restart of Equipment
2.3 Handling Extremely Heavy Traffic Fault
Chapter 3 Related Operations of Emergency Maintenance
3.1 Power-on Operations
3.1.1 Powering On and Restarting Cabinet
3.1.2 Powering On and Restarting Frames
3.2 Pulling Out and Inserting Boards
3.3 Resetting Operations
3.3.1 Resetting Frames
3.3.2 Resetting Boards
3.4 Backing Up and Recovering Database
3.4.1 Automatic Backup of Database
3.4.2 Manual Backup of Database
3.4.3 Safe Data Recovery
Appendix A Record Tables of Emergency Maintenance
A.1 Emergency Maintenance Note
A.2 Troubleshooting Record Table
Appendix B Power Supply System of Cabinet
B.1 Power Supply Loop of Power Distribution Frame
B.2 Power Supply Loop Inside Cabinet
Appendix C Acronyms and Abbreviations
The power-off of the basic cabinet (cabinet 0) or master service frame (frame 0 of
the basic cabinet) leads to the breakdown of the whole system.
The simultaneous breakdown of the active and standby HSYS boards of the
master service frame leads to the breakdown of the whole system.
The simultaneous breakdown of the active and standby HSYS boards of the
slave service frame leads to the breakdown of the service frame.
Note:
In the default delivery configuration, the basic cabinet refers to cabinet 0, the master
service frame refers to frame 0 of the basic cabinet, and the slave service frame refers
to all the service frames excluding the master service frame.
The CPU utilization of the BAM stays near 100% for a long time.
The Windows operating system of the BAM breaks down during operation or
fails to boot when the BAM starts.
Huawei Technologies Proprietary
1-1
The BAM Manager runs abnormally. For example, the BAM Manager cannot be
started, none of the service processes can be started (their statuses are
"Stopped"), or all the service processes enter the Exception state after being
started repeatedly.
You need to power on the equipment again after powering it off for other
reasons.
You find that the data in the host and the data in the BAM are inconsistent, and
you cannot recover the data through commands.
You must attend the emergency maintenance training which is mandatory for
maintenance personnel. You must learn the basic methods of judging emergent
faults and learn how to handle emergent faults.
When there is an emergent fault, check whether the equipment and the bearer
network are working normally. Then you need to judge whether the emergent
fault is caused by the equipment. If yes, you can handle the fault according to the
pre-prepared schemes or the procedures in this manual.
When an emergent fault occurs on the BAM, do not reinstall the system or format
the hard disk on the BAM before consulting Huawei for technical support.
Otherwise, important data may be lost.
In order to get quick technical support when handling an emergent fault, you
need to contact the customer service center (see page 2 of the cover of this
manual) or the regional office of Huawei.
After handling an emergent fault, you need to collect the alarm information
related to this fault and send the fault handling report, equipment alarm files and
log files to Huawei for analysis. By doing this, you can help Huawei improve the
after-sales services.
[Figure: overall emergency maintenance flow. Boxes: Collecting fault information;
Hardware OK?; Bearer network OK?; Handling equipment fault; Handling bearer
network fault; Handling extremely heavy traffic fault; Fault solved?; Collecting
fault information; Reporting for analysis; Reporting for help.]
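The decision points of the overall flow can be sketched in code. This is a minimal illustration only. It assumes that a failed hardware check leads to equipment fault handling, that a failed bearer network check leads to bearer network fault handling, and that the extremely heavy traffic branch applies when both checks pass. It also assumes that a solved fault is reported for analysis and an unsolved fault is reported for help. The function names are hypothetical and are not part of the product.

```python
def select_handling_branch(hardware_ok: bool, bearer_network_ok: bool) -> str:
    """Pick a handling branch from the two checks (assumed branch mapping)."""
    if not hardware_ok:
        return "handle equipment fault"
    if not bearer_network_ok:
        return "handle bearer network fault"
    # Both checks passed, so the remaining branch of the flow applies.
    return "handle extremely heavy traffic fault"


def select_report(fault_solved: bool) -> str:
    """Fault information is collected either way; only the report differs."""
    return "report for analysis" if fault_solved else "report for help"
```

For example, select_handling_branch(True, False) returns "handle bearer network fault".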
Check if the power supply to the equipment is normal. That is, check if the power
supply to the cabinet, components within the cabinet, and frames is normal.
Check if the components, including the LAN Switch and BAM, in the basic
cabinet run normally.
Check if the frames run normally. In the system navigation window of the client,
select System Settings -> Board Position Management. Observe the running
statuses of the boards (front boards, back boards, and subboards), power frame,
and fans.
Check if the alarm management system reports any transmission fault alarm.
Use the ping command to check if the connection between related devices is
normal.
If necessary, use the tracert command to locate the IP address of the faulty
router in the bearer network.
Check the transmission delay, the bit error rate, the packet loss ratio and the
jitter. Check whether network congestion, a network storm or a virus attack exists
in the bearer network.
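The ping check above can be scripted when several devices must be tested in a row. The following is a minimal sketch, not a tool shipped with the product. The function names are hypothetical, and 192.0.2.1 is a documentation address standing in for a real device. The sketch only wraps the operating system's own ping command and treats exit code 0 as "reachable", which is a rough signal; always read the ping output itself before drawing conclusions.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Build the ping command line; Windows uses -n for the count, others use -c."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def is_reachable(host, run=subprocess.run):
    """Return True if ping exits with code 0 (the runner is injectable for testing)."""
    result = run(
        build_ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A call such as is_reachable("192.0.2.1") runs four echo requests and reports whether the command succeeded.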
Caution:
In this version, if the server is in single-system mode, the default storage path for the
system debugging information is E:\MSSQL\SGData. For the actual path, refer to the
actual installation directory of the system.
In this version, if the server is in dual-system mode, the default storage path for the
system debugging information is F:\MSSQL\DATALOG.
These debug log files are crucial to locating the fault. When the size of these files
reaches a certain threshold, the system automatically deletes the earliest saved
information. Therefore, after an emergent fault occurs, promptly copy these files to
another location.
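Copying the debug log directory can be done by hand or with a short script. The sketch below is one possible approach and is not part of the product. The function name and the "debuglog_" folder prefix are hypothetical, and the source directory must be replaced with the actual path on your BAM. It copies the whole directory into a new, time-stamped folder so that an earlier copy is never overwritten.

```python
import shutil
from datetime import datetime
from pathlib import Path


def preserve_debug_logs(source_dir, archive_root, now=None):
    """Copy source_dir into a new time-stamped folder under archive_root."""
    now = now or datetime.now()
    target = Path(archive_root) / f"debuglog_{now:%Y%m%d_%H%M%S}"
    # copytree creates the target (and missing parents) and fails if it exists,
    # so an earlier copy cannot be silently overwritten.
    shutil.copytree(source_dir, target)
    return target
```

For instance, preserve_debug_logs(r"E:\MSSQL\SGData", r"D:\FaultArchive") would copy the default single-system directory; the D:\FaultArchive destination is only an example.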
1)
Select Alarm Query -> Query on the menu bar of the alarm management
system, and set the type and time segment of the alarm to be queried in the
pop-up window. Click OK.
2)
In the query result window, right-click and select Save as on the shortcut
menu. You can then export the alarm log information and save it as a text file.
If the green RUN indicator on the front panel of the power distribution frame is
on and blinks once every two seconds, it indicates that the power supply to the
cabinet is normal. That is, the power supply to the power distribution frame and
the circuit in it is normal.
If the green RUN indicator on the front panel of the power distribution frame is
off, it indicates that the power supply to the cabinet is abnormal. That is, neither
of the two channels of the power supply to the power distribution frame has any
voltage.
II. Powering On and Restarting Cabinet After Repairing Power Supply System
Abnormal power supply to the cabinet is generally caused by faults of the power
supply system of the equipment room. In this case, proceed as follows to restore the
power supply:
1)
To prevent accidents, turn off all the power switches (SW1 to SW6) on the front
panel of the power distribution frame before the power supply system of the
equipment room returns to normal.
2)
3)
After the power supply system of the equipment room returns to normal, power
on and restart the cabinet according to the methods described in section 3.1
Power-On Operations.
Because the HSYS and SBPI are configured in the master service frame, a power
failure of the master service frame directly leads to the breakdown of the equipment.
Proceed as follows to check if the power supply to the master service frame is normal:
If the indicators on the front panels of all the boards in the master service frame
are on, it indicates that the power supply to the master service frame is normal.
If no indicator on the front panels of the boards in the master service frame is on,
it indicates that the power supply to the master service frame is abnormal.
If the power supply to the master service frame is abnormal, do not perform any
operations before locating the fault. Contact Huawei immediately for technical
support.
IV. Checking If HSYSs in Master Service Frame and Slave Service Frame
Have Broken Down
The HSYS module supports the normal operation of service frames (OSTA frames). If
all the HSYSs break down, no boards in the frames can work normally.
If the following situations occur, it indicates that the HSYSs have broken down:
The RUN indicators on the front panels of the active and standby HSYSs are off
or constantly on.
The FAIL indicator on the front panels of the active and standby HSYSs lights
up.
The high-speed bus status indicators DOMA and DOMB on the front panels of
the active and standby HSYSs are all off.
The CPU utilizations of the active and standby HSYSs are close to 50% for a
long time.
If the power indicator of the BAM is green, it indicates that the power supply to
the BAM is normal and the BAM is on.
If the power indicator of the BAM is yellow, it indicates that the power supply to
the BAM is normal and the BAM is in the standby state.
If the power indicator of the BAM is off, it indicates that the power supply to the
BAM is abnormal.
If the power supply to the BAM is abnormal, do not perform any operations before
locating the fault cause. Contact Huawei immediately for technical support.
If the BAM is in single-system mode and the system is not equipped with a hard
disk array cabinet, you can skip this step.
If the BAM is in dual-system mode and the system is equipped with a hard disk
array cabinet, check the power indicator on the front panel of the hard disk array
cabinet. When the power supply is normal, the indicator is constantly green.
1)
If the BAM is in the standby state, press the power switch of the BAM to restart
the BAM.
2)
If the Windows operating system of the BAM is still running, use the restart
function of the operating system to restart the BAM.
3)
If the Windows operating system of the BAM breaks down, press the Reset
button of the BAM to restart the BAM.
[Figure: equipment restart flow. Boxes: Restart mandatorily?; Querying software
parameter switches of all boards; Setting software parameter switches; Powering
off frames one by one starting from frame 0 and then powering them on one by
one; Boards are normally loaded?; Data is consistent?; End.]
1)
2)
Use the DSP SFTSWT command to query the software parameter switches of all
the boards.
3)
Confirm whether loading programs and data from the background is required. If it
is required, check if the FTP program in the BAM is running. If not, start the FTP
program manually.
4)
Use the SET SFTSWT command to set the software parameter switches. After
completing the setting, use the DSP SFTSWT command to query if the setting is
successful. If it is not, reset the software parameter switches.
5)
Power off the service frames one by one, starting with master service frame 0.
Then power them on one by one, again starting with master service frame 0.
6)
Check if the boards can be normally loaded and normally run, and if the links and
connections are normal.
7)
8)
If insolvable problems occur during the restart, contact Huawei for technical
support.
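The power cycling order in step 5 can be written out as a checklist before you start. The sketch below is illustrative only. It assumes that, after frame 0, the remaining frames are handled in ascending frame number, which the procedure does not state explicitly. The function name is hypothetical.

```python
def frame_power_cycle_plan(frame_numbers):
    """Return (action, frame) pairs: all power-offs first, then all power-ons."""
    ordered = sorted(frame_numbers)  # frame 0 (master service frame) comes first
    power_off = [("power off", frame) for frame in ordered]
    power_on = [("power on", frame) for frame in ordered]
    return power_off + power_on
```

For frames 0, 1 and 2, the plan lists three power-off actions in the order 0, 1, 2 followed by three power-on actions in the same order.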
Caution:
Extremely heavy traffic may cause no response for commands delivered. In the
process of emergency maintenance, if the system does not respond after a command
has been delivered many times, contact Huawei immediately for technical support.
[Figure: flow for handling an extremely heavy traffic fault. Boxes: Start;
Extremely heavy traffic fault?; Other processing flow; Clearing activation
statistics, canceling all message tracing; Link C has a large number of
messages?; Deactivating link C; Opposite plane discards a large number of
messages sent from end office?; Modifying data of end office, directing route
to the local SG; Foreground/background communication times out?; Checking
corresponding board; End.]
Caution:
If there is no response to the command, re-send the command several times. If there
is still no response, contact Huawei for technical support.
faulty. In this case, modify the data of the end office, and direct the corresponding
translation point to the local equipment.
Major faults of the power supply system cause power failures of the equipment.
When the power supply system becomes normal, it is necessary to power on the
equipment.
Now the KVM/PWR indicator on the panel of the KVM should be on (green),
indicating that the power supply to the KVM/LCD converter is normal.
If the power indicator of the server is orange, the server is in the standby
state. Press the power switch of the server. The indicator then turns green,
indicating that the power supply to the server is normal.
In the dual-system mode, the system is equipped with a hard disk array cabinet.
Turn on the two power switches on the back of the hard disk array cabinet. If the
power supply is normal, the power indicator on the front panel of the hard disk
array cabinet will be constantly green.
After confirming the hard disk array cabinet is normally started, start the server.
Turn on the power switch of the active server. After the active server starts its
operating system, start the standby server.
Note:
Check the label on the server. By default, the server named ServerA is the active
server, and the one named ServerB is the standby server.
The active and standby HSYSs in the frame are faulty or down.
Major faults of the power supply system cause power failures of the frame. After
repairing the power supply system of the frame, power on the frame.
Caution:
If the lower ejector lever of a board is pressed down or the hot-swap indicator (blue) is
on, it indicates that the board is powered off. At this time, if you just press the lower
ejector lever, you cannot power on the board. Instead, you must pull out the board and
then insert it into the backplane again. After that, press the lower ejector lever to
power the board on.
1)
Put on an ESD-preventive wrist strap. Insert its grounding terminal into the ESD
jack of the rack.
2)
Use a cross screwdriver to loosen the fixing screws in the upper and lower
ejector levers of the board, as shown by step (a) in Figure 3-1.
3)
Grasp the upper and lower ejector levers with both hands. Press down the red
lockers on the ejector levers with thumbs to release the ejector levers. After that,
you can pull out the board from the frame.
4)
Press the ejector levers outward firmly with both hands. When the two ejector
levers form an angle of 45°, the board connectors break away from the
backplane, as shown by step (b) in Figure 3-1.
5)
Grasp the ejector levers with both hands, and pull the board out smoothly for one
to two centimeters along the slide rails of the frame until the board breaks
away from the backplane.
6)
After confirming that the board has broken away from the backplane, insert it into
the backplane again. Stop pushing when the positioning pin on the front panel of
the board touches the pin positioning hole on the frame.
7)
Turn the ejector levers of the board inward firmly with both hands. When the
ejector levers are perpendicular to the front panel, the locking keys lock the
ejector levers. This indicates that the board has been inserted into the frame.
If the RUN indicator on the front panel of the board is on and blinks regularly, it
indicates that the board is running normally.
Open the system navigator pane on the left of the maintenance console window,
and select System Setting -> Board Position Management. If the board
indicator is green or light blue (when in the standby state), it indicates that the
board is running normally. If the indicator is red, it indicates that the board is
running abnormally.
If the fault indicator ALM (yellow) on the front panel of the board is on for one
second and then off for one second, it indicates that the board is running
abnormally.
If the board is running abnormally, you need to replace it. For details, refer to U-SYS
SG7000 Signaling Gateway Maintenance Manual Parts Replacement Guide.
Caution:
The frames are reset in the following situations:
I. Preparations
Before resetting a frame, use the BKP DB command to back up the BAM data.
1)
Press the Reset button on the panel of the HSYS boards to reset the active and
standby HSYSs.
2)
Press the Reset button on the panel of the service boards (front boards) to
reset the service boards.
III. Method 2: Resetting a Frame by Pulling Out and Inserting the Active and
Standby HSYSs
If all the boards in the frame are faulty or down and method 1 does not work, you
can reset the frame by pulling out and reinserting the active and standby HSYSs.
Caution:
Only in the following situations can boards be reset:
The board is faulty or down. In this case, you can reset the board directly.
I. Preparations
If the board to be reset is an HSYS, use the BKP DB command to back up the BAM
data before resetting the board.
Caution:
In this version, the default storage path for the BAM database and registry is
E:\MSSQL\SGDATA. For the specific path, refer to the actual installation
directory.
1)
The backup of the BAM database is performed in a cyclic manner. At most eleven
database files can be backed up. The backup file of the BAM database includes the
following contents.
Daily backups for the past seven consecutive days counting back from the current
day, excluding Sunday. These six files are named BamYYYYMMDD.dat.
Sunday backups for the four Sundays in the past 28 days. These four files are
also named BamYYYYMMDD.dat.
A monthly backup made on the first day of each month and stored in the file
BamMonthBak.dat.
2)
The system automatically backs up the BAM registration information in the Windows NT
system registry every day. Only the latest backup is retained, in the file
BamReg.bak.
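To check that the cyclic database backup described in item 1) is complete, you can list the file names you would expect to find on a given day. The sketch below is an illustration under stated assumptions, not a product tool. It assumes that the seven-day and 28-day windows both include the current day, and the function name is hypothetical. Any seven consecutive days contain exactly one Sunday and any 28 consecutive days contain exactly four, so the result is always six daily files, four Sunday files and the monthly file: eleven in total.

```python
from datetime import date, timedelta


def expected_backup_files(today):
    """List the eleven database backup file names expected on the given day."""
    last_7 = [today - timedelta(days=i) for i in range(7)]
    last_28 = [today - timedelta(days=i) for i in range(28)]
    # date.weekday() returns 6 for Sunday.
    daily = [d for d in last_7 if d.weekday() != 6]
    sundays = [d for d in last_28 if d.weekday() == 6]
    names = [f"Bam{d:%Y%m%d}.dat" for d in sorted(daily + sundays)]
    return names + ["BamMonthBak.dat"]
```

Comparing this list with the actual directory contents shows at a glance whether a scheduled backup was missed.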
Note:
In the BAM.ini configuration file, you can set the start time of the automatic BAM
database backup and registry backup by modifying the values of BkpDbStartHour
(start hour of the database backup) and BkpDbStartMin (start minute of the
database backup) in DataMan. After modifying these values, restart the DataMan
process for the new settings to take effect.
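Before restarting the DataMan process, it can help to confirm that the edited values are valid. The sketch below is an assumption-laden illustration: it supposes that BAM.ini follows ordinary INI syntax and that DataMan is a section name, neither of which is confirmed here. The sample text, the values 2 and 30, and the function name are hypothetical. Only the two key names come from this manual.

```python
import configparser

# Hypothetical excerpt; the real BAM.ini layout may differ.
SAMPLE_INI = """
[DataMan]
BkpDbStartHour = 2
BkpDbStartMin = 30
"""


def read_backup_start(ini_text):
    """Return (hour, minute) of the automatic backup and validate the ranges."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["DataMan"]
    hour = section.getint("BkpDbStartHour")
    minute = section.getint("BkpDbStartMin")
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid backup start time {hour}:{minute}")
    return hour, minute
```

With the sample text above, read_backup_start(SAMPLE_INI) returns the pair (2, 30).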
In the Windows NT interface of BAM, click Start -> Programs -> Microsoft SQL
Server 7.0 -> Enterprise Manager to log in to the database server. The
database server window is as shown in Figure 3-2.
Suppose the BAM name is 2203 (Windows NT). Expand the navigation tree to
open the node Console Root -> Microsoft SQL Servers -> 2203 (Windows
NT) -> Databases. Right-click the node to display a floating menu as shown in
Figure 3-3.
Click All Tasks -> Backup Database to open the window shown in Figure 3-4.
Select Bam in the Database field as the database to be backed up. In the
Backup field, choose the backup method: all the data, or only the data that has
been modified. In the Destination field, select the system's default backup file
name and click Remove to remove it. Then click Add to specify the file name and
the path of the backup file, as shown in Figure 3-5.
As shown in Figure 3-6, select Overwrite existing media in the Overwrite area
to overwrite the previously backed up data. If you do not want to overwrite the
data, select Append to media instead. Click OK.
The system will back up the data after confirmation. Moreover, an interface
displaying the backup progress will pop up as shown in Figure 3-7.
When the system has successfully backed up the data, it gives a success
prompt.
Note:
When the BKP DB command is executed for the BAM database backup, the
system does not back up the operation log. When the Enterprise Manager Tool
menu of the SQL Server is used for the data backup, the system will back up the
operation log.
The file generated by a manual backup is named after the date. If you need to
back up more than once in one day, manually rename the file from the previous
backup.
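Renaming the earlier same-day backup can be scripted. The sketch below is one possible convention and not a product feature. It appends the current time to the existing file name; the "_HHMMSS" suffix and the function name are hypothetical, and the path passed in must be the actual backup file on your system.

```python
from datetime import datetime
from pathlib import Path


def set_aside_previous_backup(backup_path, now=None):
    """Rename an existing backup file by adding a time suffix; return the new path."""
    path = Path(backup_path)
    if not path.exists():
        return None  # nothing to set aside
    now = now or datetime.now()
    renamed = path.with_name(f"{path.stem}_{now:%H%M%S}{path.suffix}")
    path.rename(renamed)
    return renamed
```

Run it just before the second backup of the day so that the date-based file name is free again.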
1)
In the terminal system of the BAM server, click Start -> Programs -> Microsoft
SQL Server 7.0 -> Service Manager. The system displays a dialog box as
shown in Figure 3-8.
2)
Click the stop button to stop the SQL Server program. The system then displays a
dialog box. Click Yes to interrupt the communication between the foreground
and the background.
3)
Click
In the Windows NT interface, click Start -> Programs -> Microsoft SQL Server
-> Enterprise Manager to log on to the database server.
2)
Enter Console Root -> Microsoft SQL Servers -> SQLserver Group ->
L17918B (Windows NT) -> Databases. Right-click and select All Tasks ->
Restore Database.
3)
In the window as shown in Figure 3-9, select Bam in the Restore as database
field and From equipment in the Restore field. Recover the data setting
according to actual situations (complete database recovery, recovery of different
If the equipment file displayed in Figure 3-9 is not the equipment data file to be
restored, proceed as follows:
5)
Select the equipment file in the list. Click Remove All and Add. The system
displays a window as shown in Figure 3-11.
8)
Select the data backup file to be recovered and click OK. The system will display
a window showing the progress of the data recovery.
9)
After the data recovery is finished, set the BAM Service to the automatic startup
mode and restart the BAM.
Caution:
After using the SQL Service Manager tool to recover BAM data, you must run the
FMT command, which converts the format of all the data.
Equipment type
Capacity
Complainer
Contact
telephone
Version
Responding
date and time
required
Whether
it
has passed
the warranty
period
Yes No
Auditor:
Stamp (your department):
The following contents are to be filled by Huawei
by telephone
on-site support
Handling method
by remote maintenance
Operator:
Date:
Unsolved problems:
by
IP address:
Time of solution:
Person on duty:
Operator:
Fault type:
Hardware fault
Power fault
Relaylink fault
Clock fault
Other faults
Fault source:
User complaint
Alarm system
Other sources
Description of Fault:
In single-system mode, the relationships between the PDF and the components
inside the cabinet are as shown in Table B-1.
Table B-1 Relationship between cabinet components and controlling switches (1)
Cabinet type
Basic cabinet
Extension cabinet
Component
Controlling switch
BAM
SW5, SW6
LAN Switch 0
SW3
LAN Switch 1
SW2
KVM/LCD
SW1
SW1, SW4
SW2, SW5
SW1, SW4
SW2, SW5
In dual-system mode, the relationships between the PDF and the components
inside the cabinet are as shown in Table B-2.
Table B-2 Relationship between cabinet components and controlling switches (2)
Cabinet type
Component
BAM0
SW5, SW6
BAM1
SW5, SW6
Extension cabinet
Controlling switch
SW3, SW4
LAN Switch 0
SW3
LAN Switch 1
SW2
KVM/LCD
SW1
SW1, SW4
SW2, SW5
SW1, SW4
SW2, SW5
Full spelling
AS
Application Server
ALUI
BAM
DPC
LAN
LCD
KVM
MML
Man-Machine Language
MTP
NGN
SBPI
SBPU
SCCP
SCTP
SG
Signaling Gateway
SLPU
SN
Sequence Number
SQL
STP
TFA
TransFer-Allowed signal
TFP
TransFer-Forbidden signal
WS
Work Station