
Security Level:

Service Drop
Optimization Guide

www.huawei.com

HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential


Change History

Date Version Description Reviewer Author


2012-01-10 1.0 Completed the draft.



R&D Personnel
Name Employee ID Contact Information
He Yongzhen 00107656 See Huawei telephone directory.



Abstract
This slide describes formulas of key performance indicator
(KPI) counters, mechanism of traffic measurement counters,
service drop rate and factors affecting KPIs, common fault
location methods, and deliverables to be submitted for
further fault location if a fault causing the service drop
cannot be located based on routine troubleshooting
operations.



Contents
Formulas of Service-Drop-Related Counters

Common Symptoms of Service Drops

Causes of Service Drops and Data Handling

Checklist and Deliverables for Service Drops

Service Drop Cases



Whole Call flow
If the eNodeB initiates the release procedure, three release messages are exchanged over S1:

UE_Context_Rel_Req, UE_Context_Rel_Cmd, UE_Context_Rel_Cmp

If the MME initiates the release, two messages are exchanged:

UE_Context_Rel_Cmd, UE_Context_Rel_Cmp



Mechanism of Call Drop
Definition from protocol
When the E-RAB or UE context release cause is Normal Release, User Inactivity,
or successful mobility, the release is defined as a normal release.



Formulas of Service-Drop-Related
Counters on the UE Side (1/2)
On the UE side
Call Drop Rate = eRAB AbnormRel/ eRAB Setup Success *100%
eRAB AbnormRel: indicates the number of abnormal E-RAB releases.

eRAB Setup Success: indicates the number of successful E-RAB setups.

Definition Stated in Huawei Genex PA


1. The UE receives the RRC Connection Reconfiguration message in a scenario where no
Non-Access Stratum (NAS) message "DEACTIVATE EPS BEARER CONTEXT REQUEST"
is received, no NAS message "DETACH REQUEST" is received from the MME, and no
NAS message "DETACH REQUEST" is sent to the network side. The RRC Connection
Reconfiguration message carries a "drb-ToReleaseList" information element (IE), and the
ERABAbnormalRel counter is incremented by 1. The number of eps-BearerIdentity entries under
the release list is recorded. E-RAB num indicates the number of released E-RABs. The E-RAB
num is decremented by 1 for each abnormal release. If the E-RAB num becomes 0,
the UE state becomes RRC_IDLE; otherwise, the UE state does not change.



Formulas of Service-Drop-Related
Counters on the UE Side (2/2)
2. The UE receives the RRC connection release message in a scenario where no NAS message
"DEACTIVATE EPS BEARER CONTEXT REQUEST" is received, no NAS message "DETACH
REQUEST" is received from the MME, and no NAS message "DETACH REQUEST" is sent to the
network side. In this case, an abnormal release is counted into the ERABAbnormalRel counter if RLC
transmission exists in 4s before receiving the RRC connection release message (both uplink and
downlink transmission must be considered; the condition is met as long as data transmission is
performed in either direction). Then, the UE state becomes RRC_IDLE.
3. An abnormal release is counted into the ERABAbnormalRel counter if the UE is in the RRC_IDLE
state before receiving the RRC connection release message. The ERABAbnormalRel counter is
incremented by 1 and the E-RAB num is incremented based on the number of releases.
4. An abnormal release is counted into the ERABAbnormalRel counter if the UE sends an RRC
connection request message in a scenario where no RRC Connection Reconfiguration, DEACTIVATE
EPS BEARER CONTEXT REQUEST, DETACH REQUEST, RRC State, and RRC Connection
release message is received.
5. An Abnormal E-RAB release event is simultaneously recorded along with an RRC connection
reestablishment failure event.
Note that some sites may have the UE-initiated reestablishments counted into service drops because
different acceptance conditions are used in various sites.



Formulas of Service-Drop-Related
Counters on the Network Side
On the network side
Call Drop Rate = L.E-RAB.AbnormRel / (L.E-RAB.NormRel + L.E-RAB.AbnormRel) * 100%
L.E-RAB.AbnormRel: indicates the total number of abnormal E-RAB releases.

L.E-RAB.NormRel: indicates the total number of normal E-RAB releases.
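
As a minimal sketch, the network-side formula can be computed as follows. The function name and the handling of an empty measurement period are illustrative assumptions and are not part of the counter definitions.

```python
def call_drop_rate(abnorm_rel: int, norm_rel: int) -> float:
    """Return the network-side call drop rate in percent.

    abnorm_rel: value of L.E-RAB.AbnormRel for the period
    norm_rel:   value of L.E-RAB.NormRel for the period
    """
    total = abnorm_rel + norm_rel
    if total == 0:
        # Assumption: report 0.0 when no E-RAB was released in the period.
        return 0.0
    return abnorm_rel / total * 100.0


# Example: 5 abnormal and 995 normal releases in the period.
print(round(call_drop_rate(abnorm_rel=5, norm_rel=995), 2))  # 0.5
```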



Abnormal Release Counter on the
Network Side
As shown by point A in figure 1, when the eNodeB sends an E-RAB Release Indication and the cause
value is not Normal Release, User Inactivity, cs fallback triggered, or inter-RAT redirection, the
L.E-RAB.AbnormRel counter is incremented by 1. If the E-RAB Release Indication requires the release of
multiple E-RABs, related counters are incremented based on the number of releases.

As shown by point A in figure 2, after the eNodeB sends a UE Context Release Request to the MME, all
E-RABs of the UE are released. If the release cause value is not Normal Release, User Inactivity, cs
fallback triggered, or inter-RAT redirection, related counters are incremented.

Note: In the E-RAB release procedure, one or multiple E-RABs are released. At least one default bearer
remains after the E-RAB release procedure is complete.
In the UE Context Release procedure, all E-RABs of the UE are released. No bearer, even no default
bearer, remains after the UE Context Release procedure is complete.
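
The counting rule above can be sketched as follows. This is an illustration only: the cause strings are hypothetical spellings and not S1AP encodings, and the function name is an assumption.

```python
# Cause values that the rule above excludes from L.E-RAB.AbnormRel.
# The string spellings are illustrative, not S1AP encodings.
NON_DROP_CAUSES = {
    "normal-release",
    "user-inactivity",
    "cs-fallback-triggered",
    "inter-rat-redirection",
}


def abnormal_release_increment(cause: str, released_erabs: int) -> int:
    """Return how much L.E-RAB.AbnormRel grows for one release message.

    released_erabs is the number of E-RABs released by the message: the
    listed E-RABs for an E-RAB Release Indication, or all E-RABs of the UE
    for a UE Context Release Request.
    """
    if cause.lower() in NON_DROP_CAUSES:
        return 0
    return released_erabs
```

For example, a UE Context Release Request with a radio-related cause for a UE holding three E-RABs would add 3 to the counter, whereas the same request with the cause User Inactivity would add nothing.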



Counters Indicating Causes of Abnormal
Releases on the Network Side (1/2)
By abnormal-release cause, the counters can be classified into five
types:
L.E-RAB.AbnormRel.Radio: number of abnormal E-RAB releases caused by radio-
layer problems
L.E-RAB.AbnormRel.TNL: number of abnormal E-RAB releases caused by transport-
layer problems
L.E-RAB.AbnormRel.Cong: number of abnormal E-RAB releases caused by network
congestion
L.E-RAB.AbnormRel.HOFailure: number of abnormal E-RAB releases caused by
handover failures
L.E-RAB.AbnormRel.MME: number of abnormal E-RAB releases caused by EPC
problems
Abnormal E-RAB releases caused by EPC problems
As shown by points A in figures 1 and 2 on the right, the MME initiates an E-RAB or
UE context release procedure. If the cause value of the E-RAB Release Command or
the UE Context Release Command message received by the eNodeB from the MME
is not Normal Release, Detach, User Inactivity, cs fallback triggered, or inter-RAT
redirection, the cause is counted into the L.E-RAB.AbnormRel.MME counter.
Note: The L.E-RAB.AbnormRel.MME counter is not included in the L.E-RAB.AbnormRel
counter. That is, from eRAN2.1 V100R003C00SPC400 onward, abnormal E-RAB releases
caused by EPC problems are not recorded as service drops.



Counters Indicating Causes of Abnormal
Releases on the Network Side (2/2)
Abnormal E-RAB releases caused by non-EPC problems
As shown by point A in figure 3, when the eNodeB sends an E-RAB Release
Indication to the MME:
If the cause value indicates a radio error, the L.E-RAB.AbnormRel.Radio counter is incremented.
If the cause value indicates a transport-layer problem, the L.E-RAB.AbnormRel.TNL counter is incremented.
If the cause value indicates congestion, the L.E-RAB.AbnormRel.Cong counter is incremented.
If the E-RAB Release Indication requires the release of multiple E-RABs,
related counters are incremented based on the number of releases of corresponding
causes.
As shown by point A in figure 4, after the eNodeB sends a UE Context Release
Request to the MME, all E-RABs of the UE are released:
If the cause value indicates a radio error, the L.E-RAB.AbnormRel.Radio counter is incremented.
If the cause value indicates a transport-layer problem, the L.E-RAB.AbnormRel.TNL counter is incremented.
If the cause value indicates congestion, the L.E-RAB.AbnormRel.Cong counter is incremented
and records abnormal releases caused by preemption and resource congestion.
If the cause value indicates a handover failure, the L.E-RAB.AbnormRel.HOFailure
counter is incremented.
Related counters are incremented based on the number of releases of corresponding
causes. Releases are not counted again when the MME responds with a UE Context
Release Command message.
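
A minimal sketch of the per-cause breakdown described above. The cause category names are hypothetical labels for the decoded cause value, and the sketch does not distinguish between the E-RAB Release Indication and the UE Context Release Request.

```python
from collections import Counter

# Hypothetical cause categories mapped to the counters named above.
CAUSE_TO_COUNTER = {
    "radio": "L.E-RAB.AbnormRel.Radio",
    "transport": "L.E-RAB.AbnormRel.TNL",
    "congestion": "L.E-RAB.AbnormRel.Cong",
    "handover_failure": "L.E-RAB.AbnormRel.HOFailure",
}


def count_by_cause(releases):
    """Tally abnormal releases per cause counter.

    releases: iterable of (cause_category, released_erabs) tuples, one per
    release message sent by the eNodeB. Unknown categories are ignored.
    """
    counters = Counter()
    for category, released_erabs in releases:
        name = CAUSE_TO_COUNTER.get(category)
        if name is not None:
            # Each released E-RAB is counted once under its cause.
            counters[name] += released_erabs
    return counters
```

Because each release is counted when the eNodeB sends its message, the later UE Context Release Command from the MME does not need to be fed into this tally.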



Contents
Definition of Service-Drop-Related Counters

Common Symptoms of Service Drops

Causes of Service Drops and Data Handling

Checklist and Deliverables for Service Drops

Service Drop Cases



Symptoms of Service Drops Observed in
Drive Tests
In a drive test, use the Probe, Huawei test UEs or Huawei data card (if a commercial
UE is used, install the corresponding UE signaling tracing software), to observe the
following information.
The Event List contains an ERABAbnormalRel event.
KPI Statistic
Event Statistic



Symptoms of Service Drops Observed From
trace
In the eNodeB S1 trace, if the cause in the S1AP_UE_CONTEXT_REL_REQ or
ERAB_Release_Indication message is not Normal Release, Detach, User Inactivity, cs
fallback triggered, or Inter-RAT redirection, the release is a call drop.

[Figure: S1 trace examples of a normal release and an abnormal release]



Symptoms of Service Drops Observed From
trace

After the eNodeB sends RRC_CONN_REESTAB, the wait for RRC_CONN_REESTAB_CMP times out.



Symptoms of Service Drops Observed in the
Traffic Measurement Data
Service drops are monitored by means of traffic measurement on commercial networks. The service
drop rate and number of service drops are observed for determining a fault. The traffic
measurement result exported from the M2000 displays the following information.
Entire-network service drop rate, number of service drops, number of successful connection
establishments
Service drop rate, number of service drops, and service drop time of top cells

[Figure callouts]
Top cells contribute a lot to service drops.
Service drop occurrence period of top cells.
The entire-network service drop rate is high.



Contents
Definition of Service-Drop-Related Counters

Common Symptoms of Service Drops

Causes of Service Drops and Data Handling

Checklist and Deliverables for Service Drops

Service Drop Cases



Procedure of Analyzing Service Drops
Step 1: Identify the range of service drops. Analyze the traffic measurement data or CHR data
to confirm the range where service drops occur, that is, to check whether it is a top-cell or
top-eNodeB problem, entire-network problem, a comprehensive problem, or a top-UE-type or
top-UE problem.
Note 1: The method of analyzing service drops varies between different scenarios.
If the service drop rate deteriorates after the upgrade, compare the difference of the service drop rate before
and after the upgrade and analyze the overall range where the deterioration occurs.
In an existing site to be optimized (counters related to the service drop rate do not meet requirements or
need to be improved), analyze only the range with a high service drop rate. Comparing the service drop
rate before and after an upgrade is not required.
Step 2: Break down causes of service drops. Use various data sources to identify major
causes of service drops.
Step 3: Perform routine troubleshooting operations for service drops. Follow the routine
troubleshooting operation checklist to locate root causes and determine rectification
measures to solve this problem.
Note that the routine troubleshooting operations for service drops are described in detail in the next section.
Step 4: Perform rectification measures. Perform rectification measures to solve the problem
and evaluate the effect. If the rectification target is not met, repeat the preceding steps for
further analysis.



Determining the Range of Service Drops:
Top Cell Selection Principle
Top cells are selected according to different principles in different scenarios.

Scenario 1: The service drop rate deteriorates, for example, after an upgrade or suddenly
for an unknown reason.

TOP cell selection principle: Calculate the service drop rate and difference in the number of
abnormal E-RAB releases before and after the specified time (by subtracting the value before
deterioration from that after deterioration). Sort deviation values of the service drop rate and number
of abnormal E-RAB releases in a descending order to determine top cells with service drop rate
deterioration and top cells with abnormal E-RAB releases.

Scenario 2: Existing sites are to be optimized. Counters related to the service drop rate do not
meet requirements or need to be improved to reach target values.

TOP cell selection principle: Sort the service drop rate and number of abnormal E-RAB releases
in a descending order to determine top cells with service drop rate deterioration and top cells
with abnormal E-RAB releases.
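
The two selection principles can be sketched as follows. This is a minimal illustration: the per-cell input layout (abnormal releases, normal releases) and all function names are assumptions, and the drop rate uses the network-side formula.

```python
def drop_rate(abnorm: int, norm: int) -> float:
    """Network-side drop rate in percent for one cell."""
    total = abnorm + norm
    return abnorm / total * 100.0 if total else 0.0


def top_cells_by_deterioration(before, after, n=10):
    """Scenario 1: rank cells by the increase after the deterioration.

    before/after: dict cell_id -> (abnormal_releases, normal_releases).
    Returns (top cells by drop-rate increase, top cells by increase in
    abnormal E-RAB releases), each sorted in descending order.
    """
    cells = before.keys() & after.keys()
    rate_delta = {c: drop_rate(*after[c]) - drop_rate(*before[c]) for c in cells}
    abnorm_delta = {c: after[c][0] - before[c][0] for c in cells}
    by_rate = sorted(cells, key=rate_delta.get, reverse=True)[:n]
    by_abnorm = sorted(cells, key=abnorm_delta.get, reverse=True)[:n]
    return by_rate, by_abnorm


def top_cells_absolute(stats, n=10):
    """Scenario 2: rank cells by current drop rate and abnormal releases."""
    by_rate = sorted(stats, key=lambda c: drop_rate(*stats[c]), reverse=True)[:n]
    by_abnorm = sorted(stats, key=lambda c: stats[c][0], reverse=True)[:n]
    return by_rate, by_abnorm
```

Keeping the two rankings separate matters because a low-traffic cell can have a high drop rate with few abnormal releases, while a busy cell can contribute many abnormal releases at a moderate rate.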



Determining the Range of Service Drops:
Criteria
Top-cell problem: If 10% of top cells with service drop rate deterioration and 10% of top cells with
abnormal E-RAB releases are subtracted and the entire-network service-drop-rate counters are
significantly improved to reach original values or target values, service drops are caused by top-cell
problems.
Entire-network problem: If 10% of top cells with service drop rate deterioration and 10% of top
cells with abnormal E-RAB releases are subtracted and the entire-network service-drop-rate
counters are not improved, service drops are caused by entire-network problems.
Comprehensive problem: If 10% of top cells with service drop rate deterioration and 10% of top
cells with abnormal E-RAB releases are subtracted and the entire-network service-drop-rate
counters are improved to a certain extent but are not as good as original values (still cannot meet
target values), service drops are caused by comprehensive (top-cell + entire-network) problems.
Top-UE problem: If 10% of top cells with abnormal E-RAB releases are subtracted and the entire-
network service-drop-rate counters are significantly improved to reach original values or target
values, service drops are caused by top-UE problems.
Note:
Currently, the UE type cannot be obtained from the CHR. Query complaints to check whether this type of problem occurs and then analyze
symptoms to check whether known problems occur on related terminals.
The eNodeB cannot obtain international mobile subscriber identifiers (IMSIs) of top UEs due to security restrictions and needs to use temporary
mobile subscriber identifiers (TMSIs) to determine top UEs.
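
The criteria for top-cell, entire-network, and comprehensive problems can be sketched as a recomputation of the entire-network drop rate with the top cells subtracted. This is a rough illustration: the `min_gain` threshold that decides what counts as "not improved" is a hypothetical parameter, and the input layout is an assumption.

```python
def network_drop_rate(stats, excluded=frozenset()):
    """Drop rate in percent over all cells not in `excluded`.

    stats: dict cell_id -> (abnormal_releases, normal_releases).
    """
    abnorm = sum(a for c, (a, _) in stats.items() if c not in excluded)
    norm = sum(n for c, (_, n) in stats.items() if c not in excluded)
    total = abnorm + norm
    return abnorm / total * 100.0 if total else 0.0


def classify_range(stats, top_cells, target_rate, min_gain=0.1):
    """Classify the problem range after subtracting the top cells.

    top_cells:   cells to subtract (for example, the top 10%).
    target_rate: original or target drop rate in percent.
    min_gain:    hypothetical threshold in percentage points below which
                 the improvement is treated as "not improved".
    """
    full = network_drop_rate(stats)
    reduced = network_drop_rate(stats, excluded=set(top_cells))
    if reduced <= target_rate:
        return "top-cell"
    if full - reduced < min_gain:
        return "entire-network"
    return "comprehensive"
```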



Determining the Range of Service Drops
Based on the M2000 project, use FMA to find the call drop rate and the top 10 cells.



Classification of Service Drop Causes:
Tracing Tool Interface

Signaling tracing
interface on the
M2000

Probe interface



Classification of Service Drop Causes:
Analysis Tool Interface

Probe is used to trace and analyze the data of Huawei UEs.

TrafficReview is used to analyze the eNodeB tracing data.



Classification of Service Drop Causes:
Identifying Reconfiguration Messages
RRC RECONFIGURATION
Use the message query software to display the details.
If the cqi-ReportConfig IE exists, the message is a Channel Quality Indicator
(CQI) reconfiguration message.

If the measConfig IE exists, the message is a measurement configuration message.

If the targetPhysCellId IE exists, the RRCConnectionReconfiguration message is a
handover command.
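
The identification rules above can be sketched as a simple IE presence check. This assumes that the decoded message has already been flattened into a collection of IE names; that input format and the function name are hypothetical.

```python
def classify_reconfiguration(ies):
    """Return labels for a decoded RRCConnectionReconfiguration message.

    ies: collection of IE names present anywhere in the decoded message
         (assumed to be extracted from the message query software output).
    """
    labels = []
    if "targetPhysCellId" in ies:
        labels.append("handover command")
    if "measConfig" in ies:
        labels.append("measurement configuration")
    if "cqi-ReportConfig" in ies:
        labels.append("CQI reconfiguration")
    # A message may match several rules; "other" means none matched.
    return labels or ["other"]
```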



Classifying Service Drop Causes
Based on Traffic Measurement Data
Trend Analysis
Obtain the entire-network service drop rate of at least one to
two weeks. If an upgrade is performed, collect and analyze
the service drop rate of two weeks before the upgrade and
that of one week after the upgrade, as shown in the figure on
the right.

Cause Analysis
Analyze traffic measurement counters to check whether the
E-RAB release is caused by a radio fault or a cell resource
problem, as shown in the figure on the bottom left.

Top cell analysis


Analyze traffic measurement data to determine top cells and
top periods of RRC connection or E-RAB establishment
failures, as shown in the figure on the bottom right.



Analyzing Service Drop Causes by
Using Signaling Tracing
Signaling tracing can be used to locate the procedure in which a service drop occurs
and is especially effective for locating drive test problems and repeatable
problems. However, signaling tracing must be started before a problem
occurs and requires manual analysis. Therefore, signaling tracing does not apply to
unrepeatable problems or low-probability problems.
Standard interface tracing (major): After top cells and top periods are determined by
using traffic measurement, perform standard interface tracing for the corresponding cells
and periods to check which step triggers the service drop.

Single-UE entire-network tracing (minor): Obtain the IMSI of a top UE from the EPC
based on the known TMSI, and then perform entire-network tracing on the UE. This
method is especially effective for subsequent VIP maintenance. For details about the
operation method, see chapter 6 in LTE OM Tracing and Data Collection Guide.doc.



Analyzing Service Drop Causes by
Using Drive Test Data
Compared with eNodeB signaling tracing, the drive test has the advantage of
providing not only signaling messages but also the uplink signal strength, uplink
transmit power, bit error rate, and scheduling information (the information
depends on the drive test software and UE). Its disadvantage is that only
Uu tracing results (RRC and NAS messages) are available, which need to
be analyzed along with the eNodeB signaling tracing results.
Differentiating an uplink problem from a downlink problem
The drive test software can be used to determine whether the UE does not receive a
message from the eNodeB or the eNodeB does not receive the response from the UE.
The downlink RSRP and SINR can be observed to check the quality of the downlink
channel. The uplink transmit power can be observed to check whether signal
demodulation on the uplink is restricted.

Isolating UE faults from non-UE faults


Logs are analyzed to determine whether received signaling messages are properly
processed or the UE encounters faults such as suddenly stopped data transmissions.



Contents
Definition of Service-Drop-Related Counters

Common Symptoms of Service Drops

Causes of Service Drops and Data Handling

Checklist and Deliverables for Service Drops

Service Drop Cases



Entire-Network Service Drop: Routine Operation
Checklist
Columns: Routine Operation | Analysis Operation | Deliverables | Solution Operation

Routine operation: Preliminary analysis on traffic measurement data related to service drops
  Analysis operation:
    1. Quickly analyze the traffic measurement data and export the range and causes of service drops.
    2. Analyze the service drop rate trend to identify the turning point.
  Deliverables:
    1. Distribution of service drop causes and top causes.
    2. Operations performed at the turning point of the service drop rate.
  Solution operation:
    1. Perform corresponding optimization operations based on top service drop causes.
    2. Provide operations performed at the turning point of the service drop rate and evaluate the impact of each operation on the service drop rate.

Routine operation: Version check
  Analysis operation:
    1. Check whether the eNodeB is upgraded or has patches installed.
    2. Check whether the EPC is upgraded or has patches installed.
  Deliverables:
    Version No. before and after the upgrade.
  Solution operation:
    Provide modifications before and after the upgrade possibly affecting the service drop rate by referring to the release notes.

Routine operation: Equipment and transport alarms
  Analysis operation:
    Check alarms on the entire network.
  Deliverables:
    List critical and major alarms.
  Solution operation:
    Analyze the impact of alarms on the service drop rate and check whether the service drop rate is recovered after alarms are cleared.

Routine operation: Data configuration check
  Analysis operation:
    1. Check parameter settings on the entire network.
    2. Check modified parameters on the EPC.
  Deliverables:
    1. Parameter differences before and after the upgrade.
    2. Parameter differences in comparison with the baseline parameters of the new version.
    3. Objective and impact of parameter modification on the EPC.
  Solution operation:
    1. Check whether parameter modification affects the service drop rate.
    2. Revert parameters and check whether the service drop rate is recovered.

Routine operation: Operation record check
  Analysis operation:
    Check whether a great amount of operation records exist on the entire network and whether neighboring cells and PCIs are replanned.
  Deliverables:
    Records of operations on the entire network.
  Solution operation:
    Analyze the impact of operations on the service drop rate and check whether the operations can be reverted.

Routine operation: Neighboring relationship check
  Analysis operation:
    Check whether neighboring cells are missing. Deployment of a great number of eNodeBs between existing eNodeBs in a scattered manner may make the neighboring relationships of many adjacent sites become improper.
  Deliverables:
    Information of missing neighboring cells.
  Solution operation:
    Add missing neighboring cells and check whether the service drop rate is recovered.

Routine operation: Major events check
  Analysis operation:
    Check whether large-scale telephone number release is implemented or other important activities such as ceremonies, holidays, and sport events are held.
  Deliverables:
    1. Verify the UE type involved in the telephone number release, number release amount, and subscription policies.
    2. Confirm the range and period of time of important activities.
  Solution operation:
    Confirm the relationship between the important event and the deterioration of service drop rate.

Note: For details about routine troubleshooting operations for a comprehensive (entire-network + top-cell) problem, see the checklists of the top-cell problem and the entire-network problem.



Top-Cell Service Drop: Routine Troubleshooting
Operation Checklist
Columns: Routine Operation | Analysis Operation | Deliverables | Solution Operation

Routine operation: Preliminary analysis on the traffic measurement data related to top-eNodeB service drops
  Analysis operation:
    1. Quickly analyze the traffic measurement data and export the range and causes of service drops.
    2. Analyze the service drop rate trend to identify the turning point.
  Deliverables:
    1. Distribution of service drop causes and top causes.
    2. Operations performed at the turning point of the service drop rate.
  Solution operation:
    1. Perform corresponding optimization operations based on top service drop causes.
    2. Provide operations performed at the turning point of the service drop rate and evaluate the impact of each operation on the service drop rate.

Routine operation: Top-eNodeB version check
  Analysis operation:
    Check whether the eNodeB is upgraded or has patches installed.
  Deliverables:
    Version No. before and after the upgrade.
  Solution operation:
    Provide modifications before and after the upgrade possibly affecting the service drop rate by referring to the release notes.

Routine operation: Equipment and transport alarms of top eNodeBs
  Analysis operation:
    Check alarms of top eNodeBs.
  Deliverables:
    List critical and major alarms.
  Solution operation:
    Analyze the impact of alarms on the service drop rate and check whether the service drop rate is recovered after alarms are cleared.

Routine operation: Top-eNodeB parameter settings check
  Analysis operation:
    Check parameter settings of top eNodeBs.
  Deliverables:
    1. Parameter differences before and after the upgrade.
    2. Parameter differences in comparison with the baseline parameters of the new version.
  Solution operation:
    1. Check whether parameter modification affects the service drop rate.
    2. Revert parameters and check whether the service drop rate is recovered.

Routine operation: Top-eNodeB operation record check
  Analysis operation:
    Check whether a great amount of operation records exist on the entire network and whether neighboring cells and PCIs are replanned.
  Deliverables:
    Records of operations on the entire network.
  Solution operation:
    Analyze the impact of operations on the service drop rate and check whether the operations can be reverted.

Routine operation: Top-eNodeB neighboring relationship check
  Analysis operation:
    Check whether neighboring cells are missing. Deployment of a great number of eNodeBs between existing eNodeBs in a scattered manner may make the neighboring relationships of many adjacent sites become improper.
  Deliverables:
    Information of missing neighboring cells.
  Solution operation:
    Add missing neighboring cells and check whether the service drop rate is recovered.

Routine operation: Top-cell coverage check
  Analysis operation:
    Analyze the MCS and CQI information in the traffic measurement data, CHR data, and drive test data to check whether top cells encounter cross coverage or weak coverage.
  Deliverables:
    Top-cell coverage evaluation report.
  Solution operation:
    If weak coverage exists, adjust the coverage by means of network optimization.

Routine operation: Top-cell interference check
  Analysis operation:
    Analyze the real-time tracing data to check whether top cells encounter intermodulation interference and external interference.
  Deliverables:
    Top-cell interference evaluation report.
  Solution operation:
    If interference exists, solve the problem by referring to the interference check manual.

Routine operation: Major events check
  Analysis operation:
    Check whether large-scale telephone number release is implemented or other important activities such as ceremonies, holidays, and sport events are held.
  Deliverables:
    1. Verify the UE type involved in the telephone number release, number release amount, and subscription policies.
    2. Confirm the range and period of time of important activities.
  Solution operation:
    Confirm the relationship between the important event and the deterioration of service drop rate.



Fault Location: Parameter Settings Check

Check the parameters:


1. Parameter differences before and after the upgrade;
2. Parameter differences in comparison with the baseline parameters of the
new version.

Some key parameters for call drop (for more parameters, see the attachment):

Parameter                 Value                   MML Command
UeMaxRetxThreshold        Maxretx_Threshold_t32   MOD RLCPDCPPARAGROUP
ENodeBMaxRetxThreshold    Maxretx_Threshold_t32   MOD RLCPDCPPARAGROUP
S1MessageWaitingTimer     20                      MOD ENODEBCONNSTATETIMER
X2MessageWaitingTimer     20                      MOD ENODEBCONNSTATETIMER
UuMessageWaitingTimer     35                      MOD ENODEBCONNSTATETIMER
T304ForEutran             ms500                   MOD RRCCONNSTATETIMER
UeInactiveTimer           20                      MOD RRCCONNSTATETIMER
T310                      MS1000_T310             MOD UETIMERCONST
N311                      n1                      MOD UETIMERCONST
N310                      n10                     MOD UETIMERCONST
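
The comparison against baseline parameters can be sketched as a dictionary diff. The baseline values below are copied from the table above and serve only as an example; how the current settings are exported from the eNodeB is outside this sketch, and the function name is an assumption.

```python
# Example baseline taken from the table above (values kept as strings).
BASELINE = {
    "UeMaxRetxThreshold": "Maxretx_Threshold_t32",
    "ENodeBMaxRetxThreshold": "Maxretx_Threshold_t32",
    "S1MessageWaitingTimer": "20",
    "X2MessageWaitingTimer": "20",
    "UuMessageWaitingTimer": "35",
    "T304ForEutran": "ms500",
    "UeInactiveTimer": "20",
    "T310": "MS1000_T310",
    "N311": "n1",
    "N310": "n10",
}


def diff_against_baseline(current, baseline=BASELINE):
    """Return {parameter: (current_value, baseline_value)} for mismatches.

    current: dict of parameter name -> value as exported from the eNodeB
             (export format is an assumption). Missing parameters are
             reported with a current value of None.
    """
    return {
        name: (current.get(name), expected)
        for name, expected in baseline.items()
        if current.get(name) != expected
    }
```

The same function can be used for the before-and-after upgrade comparison by passing the pre-upgrade settings as `baseline`.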



Fault Location: Alarm Check

Check the Alarms and faults:

Performance log Alarm log

Fault log Operation log



Fault Location: Radio Problems
Symptom:
According to the definition of the traffic measurement counter on the eNodeB, if abnormal releases are counted into
the L.E-RAB.AbnormRel.Radio counter, the service drop is caused by the radio interface problem on the wireless
network side.

Possible causes
A service drop with the cause value being radio occurs when RLC retransmissions reach the
maximum number, out-of-synchronization occurs, or signaling message exchange fails due to weak coverage, uplink
interference, or UE faults. For details about interference elimination, see LTE RF Channel Test and Check Manual.

Handling procedure
Analyze the CHR data to check whether top UEs exist.
Analyze the CHR data to verify inner causes of abnormal releases.
If a service drop is caused by a failure in the exchange of non-procedure messages, view the L2 DRB scheduling data to
check whether weak coverage or interference occurs.
If a procedure message exchange fails, observe the last ten messages to locate the faulty point and determine whether the
UE does not receive the message from the eNodeB, the UE receives but does not process the message, or the eNodeB does not
receive the response from the UE.
Inner release cause values in the CHR are: UEM_UECNT_REL_UE_RLC_UNRESTORE_IND,
UEM_UECNT_REL_UE_RESYNC_TIMEROUT_REL_CAUSE,
UEM_UECNT_REL_UE_RESYNC_DATA_IND_REL_CAUSE,
UEM_UECNT_REL_UE_RLF_RECOVER_FAIL_REL_CAUSE, UEM_UECNT_REL_RRC_REEST_SRB1_FAIL,
and UEM_UECNT_REL_RB_RECFG_FAIL.



Fault Location: Handover Failures
Symptom:
According to the definition of the traffic measurement counter on the eNodeB, if abnormal
releases are counted into the L.E-RAB.AbnormRel.HOFailure counter, service drops are
caused by handover failures.

Possible causes
A service drop with the cause value being handover failure is caused by an abnormal release
due to a failure in handover out of the serving cell.

Handling procedure
Use inter-specific-cell outgoing handover counters to determine the target cell with the largest
service drop rate.
Analyze the CHRs of the serving cell and the target cell to check whether the UE fails to
receive the handover command or fails in random access to the target cell. The
corresponding inner release cause values in the CHR are
UEM_UECNT_REL_HO_OUT_X2_REL_BACK_FAIL and
UEM_UECNT_REL_HO_OUT_S1_REL_BACK_FAIL.
Optimize the handover relationship including handover parameters and neighboring
relationship and then check whether related counters are recovered.



Fault Location: Transport Problems
Symptom:
According to the definition of the traffic measurement counter on the eNodeB, if
abnormal releases are counted into the L.E-RAB.AbnormRel.TNL counter, service
drops are caused by transport-layer problems.

Possible causes
A service drop with the cause value being TNL is caused by a transport fault
between the eNodeB and the MME, for example, intermittently disrupted S1 link.

Handling procedure
Query alarms to check whether there are transport-related alarms, clear the
alarms if any, and then check whether related counters are recovered.
Check whether the eNodeB encounters transport-related alarms on the M2000.
Clear alarms by referring to the alarm help.
If alarms are cleared and the L.E-RAB.AbnormRel.TNL counter still has a large
value, collect and send the following information to the next fault location station.



Fault Location: Congestion Problems
Symptom
According to the definition of the traffic measurement counter on the eNodeB, if abnormal
releases are counted into the L.E-RAB.AbnormRel.Cong counter, service drops are
caused by congestion problems.
Possible Causes
A service drop with the cause value being congestion is caused by congestion of radio
resources on the eNodeB, for example, when the maximum number of users is reached
or when preemption occurs.
Handling Procedure
If a top cell encounters service drops caused by long-term congestion, enable the load
balancing or interoperation function to reduce the load of the serving cell for a short-term
solution. For a long-term solution, expand the capacity. After solving the problem, check
whether related counters are recovered.
Turn on the MLB algorithm switch and check whether the situation is improved.



Fault Location: MME Problems
Symptom
According to the definition of the traffic measurement counter on the eNodeB, if abnormal
releases are counted into the L.E-RAB.AbnormRel.MME counter, the service drop is caused by
an abnormal release initiated by the EPC. This type of abnormal releases is not counted into
the L.E-RAB.AbnormRel counter.

Possible Causes
A service drop with the cause value being MME is caused by an abnormal release initiated by the EPC.

Handling Procedure
This type of service drop is caused by non-eNodeB problems and needs to be located by using
EPC-related information.
Internal release cause value in the CHR: UEM_UECNT_REL_MME_CMD. The service drop is caused by
a release initiated by the EPC. Work with the EPC technical support personnel to solve this problem.
Obtain the S1 tracing result of top cells and analyze the distribution of the causes of abnormal
releases initiated by the EPC.
Send measurement results and related signaling procedures to the EPC technical support personnel for
further analysis.
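
The cause distribution mentioned above can be tallied with a short script. The following is a minimal sketch in Python, not a Huawei tool: the record layout (a list of dicts with `cell` and `cause` keys) and the placeholder cause strings are assumptions made for illustration, so the parsing of a real S1 trace export must be adapted separately.

```python
from collections import Counter


def cause_distribution(records):
    """Return {cause: share of releases} for a list of release records.

    Each record is assumed to be a dict with a "cause" key (hypothetical layout).
    """
    counts = Counter(r["cause"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cause: n / total for cause, n in counts.items()}


# Hypothetical EPC-initiated releases parsed from an S1 trace of one top cell.
# The cause strings are placeholders, not values taken from a real trace.
sample_releases = [
    {"cell": 0, "cause": "cause-A"},
    {"cell": 0, "cause": "cause-A"},
    {"cell": 0, "cause": "cause-B"},
    {"cell": 0, "cause": "cause-A"},
]

for cause, share in sorted(cause_distribution(sample_releases).items()):
    print(f"{cause}: {share:.0%}")
```

The resulting shares can be attached to the signaling procedures that are sent to the EPC technical support personnel.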



Fault Location: Coverage Problem
The symptom is poor link quality caused by uplink/downlink imbalance or
weak coverage.
The symptoms of a poor uplink are the minimum RB count, MCS 0, PHR below 0 dB, a high uplink
BLER, a high CRC error rate, and negative SINR, as shown in the CHR.
The symptoms of a poor downlink are poor CQI or a large number of DTX and NACK
results received by HARQ from the UE.
Use FMA to analyze the CHR for weak coverage:
In the L1SegInfo, msg3 UL RSRP <= -130 during the RRC phase.
In the L2_DRB_STRU, UL Average Rsrp < 0 and MAX{UL Average Rsrp} <= -130 during the E-RAB
phase.
Use Probe in the DT to analyze weak coverage: DL RSRP < -119.
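
The weak-coverage thresholds above can be written as simple checks. This is a minimal sketch under stated assumptions: the function names are hypothetical, RSRP values are assumed to be in dBm, and the condition "UL Average Rsrp < 0" is interpreted here as a validity filter on samples (non-negative values are treated as unfilled fields). FMA itself may apply the rules differently.

```python
UL_RSRP_WEAK_DBM = -130  # threshold quoted for the CHR analysis
DL_RSRP_WEAK_DBM = -119  # threshold quoted for Probe drive-test data


def weak_coverage_rrc(msg3_ul_rsrp):
    """RRC phase (L1SegInfo): msg3 UL RSRP <= -130."""
    return msg3_ul_rsrp <= UL_RSRP_WEAK_DBM


def weak_coverage_erab(ul_avg_rsrp_samples):
    """E-RAB phase (L2_DRB_STRU): the best valid UL average RSRP is <= -130.

    Assumption: only samples below 0 are valid measurements.
    """
    valid = [v for v in ul_avg_rsrp_samples if v < 0]
    return bool(valid) and max(valid) <= UL_RSRP_WEAK_DBM


def weak_coverage_dl(dl_rsrp):
    """Drive test (Probe): DL RSRP < -119."""
    return dl_rsrp < DL_RSRP_WEAK_DBM


print(weak_coverage_rrc(-132))
print(weak_coverage_erab([-135, -131, 0]))
print(weak_coverage_dl(-110))
```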



Fault Location: Coverage Problem
If more than 10% of the CQI reports fall within CQI 0 to 4, the cell probably has weak coverage.

[Figure: CQI distribution of cells MXLP40B, MXL471A, and MXL413A, with labels "good coverage" and "weak coverage"]
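
The 10% rule for CQI 0 to 4 can be checked per cell from a CQI histogram. The following sketch assumes a hypothetical input: a list of 16 report counts indexed by CQI value (0 to 15). How these counts are obtained from counters or traces is outside the scope of the example.

```python
def low_cqi_share(cqi_counts):
    """Return the share of CQI reports that fall within CQI 0..4.

    cqi_counts: list of 16 counts, where index i holds the number of
    reports with CQI value i (hypothetical input layout).
    """
    total = sum(cqi_counts)
    if total == 0:
        return 0.0
    return sum(cqi_counts[:5]) / total


def exceeds_low_cqi_threshold(cqi_counts, threshold=0.10):
    """True if more than `threshold` of the reports fall within CQI 0..4."""
    return low_cqi_share(cqi_counts) > threshold


# Hypothetical histograms: 25% and 5% of reports within CQI 0..4.
cell_a = [5, 5, 5, 5, 5] + [0] * 10 + [75]
cell_b = [1, 1, 1, 1, 1] + [0] * 10 + [95]

print(exceeds_low_cqi_threshold(cell_a))
print(exceeds_low_cqi_threshold(cell_b))
```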



Fault Location: Interference Problem
Check for interference:
From the counter: if L.UL.Interference.Avg >= -115 dBm during idle hours, uplink
interference possibly exists.
From the CHR: use FMA to identify interference if either of the following conditions is met:
In the L1SegInfo, (UL RSRP > -130 and SINR <= 0) or
(SINR <= 5 and UL RSRP - SINR >= -124) during the RRC phase.
In the L2_DRB_STRU, UL Average Rsrp < 0 and MIN{UL Average Rsrp} >= -130 and
MAX{UL Average SRS Sinr} <= 0 during the E-RAB phase.

[Figures: KPIs of top cells in counters; FMA interference analysis result]
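
The interference conditions above translate directly into boolean checks. This is a sketch only: the function names are hypothetical, RSRP is assumed to be in dBm and SINR in dB, and "UL Average Rsrp < 0" is interpreted as a validity filter on samples. It illustrates the thresholds quoted on this slide, not the internal logic of FMA.

```python
def counter_suggests_ul_interference(avg_interference_dbm):
    """Idle-hour check: L.UL.Interference.Avg >= -115 dBm."""
    return avg_interference_dbm >= -115


def rrc_phase_interference(ul_rsrp, sinr):
    """L1SegInfo rule for the RRC phase."""
    return (ul_rsrp > -130 and sinr <= 0) or (sinr <= 5 and ul_rsrp - sinr >= -124)


def erab_phase_interference(ul_avg_rsrp_samples, ul_avg_srs_sinr_samples):
    """L2_DRB_STRU rule for the E-RAB phase.

    Assumption: only RSRP samples below 0 are valid measurements.
    """
    valid_rsrp = [v for v in ul_avg_rsrp_samples if v < 0]
    if not valid_rsrp or not ul_avg_srs_sinr_samples:
        return False
    return min(valid_rsrp) >= -130 and max(ul_avg_srs_sinr_samples) <= 0


print(counter_suggests_ul_interference(-112))
print(rrc_phase_interference(-120, -2))
print(erab_phase_interference([-125, -128], [-3, -1]))
```

Note that the RSRP condition here (>= -130) is the complement of the weak-coverage condition (<= -130), so a sample with adequate RSRP but non-positive SINR points to interference rather than coverage.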



Deliverables for Service Drops
Check result based on the routine troubleshooting operation checklist
for service drops
For some difficult problems, collect more logs for further location.
BRD log (mandatory)
Indicates logs of the LMPT and LBBP on the eNodeB to which top cells belong.

Standard interface signaling (mandatory)
Indicates S1, X2, and Uu interface tracing results.

Network configuration (mandatory)
Includes networking information, engineering parameters, and configuration files of top eNodeBs.

TTI tracing (optional; depending on fault location requirements)
Indicates IFTS tracing results and cell tracing results. Only information of top cells in top periods needs to
be collected because there is a great amount of data.

Single-UE tracing (optional; depending on fault location requirements)
Used for in-depth top-UE location and is performed on the entire network by using the IMSI that is obtained
from the EPC based on the TMSI of the top UE.



Contents
Definition of Service-Drop-Related Counters

Common Symptoms of Service Drops

Causes of Service Drops and Data Handling

Checklist and Deliverables for Service Drops

Service Drop Cases



Case 1: Service drops are caused by
top UEs that continuously fail in
reestablishment.
As shown in the figure on the upper right, most abnormal releases
on the eNodeB are caused by failures in exchanging the first three
signaling messages during the reestablishment process.
As shown in the figure on the middle right, from the perspective of
fault occurrence time, most service drops occur continuously
within the period from 11:51 to 18:49 in cell 0.
As shown in the figure on the bottom right, from the perspective of
TMSI information, the service drops are caused by a single UE (TMSI
C2 B0 B0 40) and the main reestablishment cause value is
reconfiguration failure.
As shown in the figure on the bottom left, from the perspective of
reconfiguration message type, the messages are not handover
commands or measurement configuration messages but may be
CQI, sounding, or transmission mode (TM) reconfiguration
messages. In addition, the UE does not respond to the RRC CONN
REESTAB message and therefore the eNodeB releases the E-RABs 5 s
later.
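
The TMSI-based analysis in this case amounts to grouping abnormal releases by UE. The following minimal sketch shows the idea; the record layout (dicts with `tmsi` and `cause` keys) and the sample records are hypothetical, and only the TMSI C2 B0 B0 40 is taken from this case.

```python
from collections import Counter


def top_ues(drop_records, n=1):
    """Return the n TMSIs with the most abnormal releases as (tmsi, count) pairs."""
    return Counter(r["tmsi"] for r in drop_records).most_common(n)


def main_cause(drop_records, tmsi):
    """Return the most frequent cause recorded for one TMSI, or None."""
    causes = Counter(r["cause"] for r in drop_records if r["tmsi"] == tmsi)
    return causes.most_common(1)[0][0] if causes else None


# Hypothetical abnormal-release records extracted from the CHR.
sample_drops = [
    {"tmsi": "C2 B0 B0 40", "cause": "reconfiguration failure"},
    {"tmsi": "C2 B0 B0 40", "cause": "reconfiguration failure"},
    {"tmsi": "C2 B0 B0 40", "cause": "other"},
    {"tmsi": "C2 B0 B0 40", "cause": "reconfiguration failure"},
    {"tmsi": "00 00 00 01", "cause": "other"},
]

worst_tmsi, drop_count = top_ues(sample_drops)[0]
print(worst_tmsi, drop_count, main_cause(sample_drops, worst_tmsi))
```

If one TMSI accounts for most of the releases in a cell, the problem is more likely specific to that UE than to the cell.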



Case 2: Top UEs
encounter continuous
faults.
The CHR of the eNodeB shows that most abnormal releases occur because
RLC retransmissions reach the maximum number of times, that is,
DRB retransmissions reach the maximum number of times (8 retransmissions).
From the perspective of fault occurrence time, most service drops occur
continuously within the period from 10:51 to 13:49 in cell 2.
From the perspective of TMSI information, the service drops are caused by a single
UE (TMSI C2 7F 20 56).
The last 16 64-ms messages of DRB scheduling information show a similar
problem, that is, a fault (similar to suddenly stopped data transmission) occurs
soon after access. The release occurs within tens of seconds to two minutes after
access and is unlikely to be caused by a command-based test. In addition, the
access type is MO-DATA. This type of release occurs during actual service
use.



Case 3: The uplink quality is poor.
The figure on the right shows that, from the last
four 512-ms messages of DRB scheduling
information to the last 16 64-ms messages of
DRB scheduling information, the uplink RSRP
and SINR are poor. The uplink RSRP is
-135 dBm or below. The sounding SINR and
demodulation reference signal (DMRS) SINR
are 3 dB or less. The service drop is possibly
caused by weak uplink coverage.

The figure on the left shows that, from the
last four 512-ms messages to the last 16
64-ms messages, the uplink RSRP is around
-130 dBm. The sounding SINR and DMRS
SINR are 3 dB or less. The service drop is
possibly caused by slight uplink interference
in a weak-coverage area.



Case 4: Reconfiguration of the target
cell fails.
Release cause (Unspecified displayed in the S1 tracing result)
TGT_ENB_RB_RECFG_FAIL indicates an abnormal release caused by an RB reconfiguration failure on the
target eNodeB during the handover process.
After the UE successfully hands over to the target cell, the target eNodeB receives a PATH SWITCH REQ ACK
message from the MME and sends a UE CONTEXT REL REQ message about 100 ms later,
carrying the S1-AP cause value Unspecified. The figure on the left displays the last ten messages.

Problem analysis
During the handover process, the MME sends a PATH_SWITCH_ACK message carrying a downlink AMBR
value inconsistent with that carried in the S1 or X2 handover request. This is a defect of the RR module. The
upper-layer RR control module sends an AMBR update message to the lower-layer RB module. The RB
module determines not to send a Uu reconfiguration message to the UE and then responds with a null value
to the upper-layer RR control module. In this case, the upper-layer RR control module treats this
response as a fault and then releases the UE. This problem is included in eRAN2.1 V100R003C00SPC430.



Case 5: A service drop is caused
by inter-RAT redirection.
Release cause (Inter-RAT redirection
displayed in the S1 tracing result)
IRHO_REIDRECTION_TRIGER indicates a
release caused by inter-RAT redirection. Releases
with this cause are mistakenly counted as
service drops in eRAN2.1 V100R003C00SPC400
and eRAN2.1 V100R003C00SPC401. The
following figure shows related messages.
This problem will be solved in eRAN2.1
V100R003C00SPC420.



Case 6: Releases are counted into
the L.E-RAB.AbnormRel.TNL
counter due to transport faults.
On December 11, 2011, the entire-network service drop rates at 900 MHz and 2.6 GHz
deteriorated in the Tele2 and Telenor networks, as shown in the following figure.
Field personnel have discussed this problem with the operator. It is likely that this
problem is caused by EPC faults. However, no response has been received from the operator.



Case 7: Service drops are caused
by radio problems.
Release cause
UE_RESYNC_TIMEROUT_REL_CAUSE (Radio Connection With UE Lost displayed in the S1 tracing result): indicates an L2-reported
release caused by timeout of the resynchronization timer after out-of-synchronization.
UE_RLC_UNRESTORE_IND (Radio resources not available displayed in the S1 tracing result): indicates the L2-reported RLC
unrestore indication that is sent after the maximum number of RLC retransmissions is reached.
UE_RESYNC_DATA_IND_REL_CAUSE (Unspecified displayed in the S1 tracing result): indicates an L2-reported release caused by
data-triggered resynchronization after out-of-synchronization.

Cause analysis
From the last four 512-ms messages of DRB scheduling information to the last 16 64-ms messages of DRB scheduling information,
abnormal releases are in most cases caused by faults similar to suddenly stopped data transmission. Possibly, the SIM card is
removed or the UE is faulty. The following figure shows information recorded in the CHR.



Case 8: The reestablishment
procedure fails.
Release cause (Radio Connection With UE Lost displayed in the S1 tracing result)
RRC_REEST_SRB1_FAIL: indicates a release occurring at the SRB 1 restoration stage
during the RRC connection reestablishment.
As shown by the last ten messages in the following figure, after the eNodeB sends an
RRC_CONN_REESTAB message, it does not receive the
RRC_CONN_REESTAB_CMP message from the UE before the 5 s radio interface timer
expires.
From the perspective of L2 scheduling, the UE responds with an ACK message after receiving
the RRC_CONN_REESTAB message from the eNodeB.
This is possibly because some UEs do not send the RRC_CONN_REESTAB_CMP message.
For example, Samsung UEs have this problem.
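
The cases in this section pair each internal release cause in the CHR with the cause displayed in the S1 tracing result. The following sketch collects only the pairs stated in these cases into a lookup table (identifiers are spelled as they appear in the slides). The helper function name is hypothetical, and the table is not a complete list of eNodeB release causes.

```python
# Internal CHR release cause -> cause displayed in the S1 tracing result.
# Only pairs stated in the cases of this section are included.
INTERNAL_TO_S1_CAUSE = {
    "TGT_ENB_RB_RECFG_FAIL": "Unspecified",                            # target-cell reconfiguration case
    "IRHO_REIDRECTION_TRIGER": "Inter-RAT redirection",                # inter-RAT redirection case
    "UE_RESYNC_TIMEROUT_REL_CAUSE": "Radio Connection With UE Lost",   # radio problem case
    "UE_RLC_UNRESTORE_IND": "Radio resources not available",           # radio problem case
    "UE_RESYNC_DATA_IND_REL_CAUSE": "Unspecified",                     # radio problem case
    "RRC_REEST_SRB1_FAIL": "Radio Connection With UE Lost",            # reestablishment failure case
}


def internal_causes_for(s1_cause):
    """Return the internal causes (sorted) that map to a given S1 cause."""
    return sorted(k for k, v in INTERNAL_TO_S1_CAUSE.items() if v == s1_cause)


print(internal_causes_for("Unspecified"))
print(internal_causes_for("Radio Connection With UE Lost"))
```

Because several internal causes share one S1 cause, the S1 tracing result alone may not identify the release reason, and the CHR is needed to tell them apart.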



Thank you
www.huawei.com
