
Software Test & Performance

September 24-26, 2008

608 Performance Bugs and Investigation Strategies

Alfred Wong

About me…
• 5+ years of experience in performance testing /
engineering area
• EE → SDET → PTE → PTM → PTA

• Performance Test Advisor role:


– Provide technical consultation to new projects on performance
aspects.
– Advise performance certification and testing of web layer,
middle-tier and backend systems.
– Responsible for performance test team improvements on
standards and quality.
– Drive cross team performance tuning and capacity planning
efforts, as well as overall performance improvements across the
company.

608 Performance Bugs and Investigation Strategies | © 2008 by Alfred Wong; made available under the EPL v1.0

About you…
• Test Engineer / QA Analyst
• Performance Analyst
• Performance Test Engineer
• Performance Test Lead / Manager
• QA or Test Lead / Manager
• QA or Test Director

• Performance team structure?


– Centralized / Service based
– Integrated


Agenda
• Scope
• ‘Simple’ Performance Testing
• Bug investigation models
• ‘Big 5’ of Performance bugs
• Performance bugs
• More bugs
• More…
• Q&A + Comments


Scope

• Sample app: an online bookstore


– User accounts & profile
– Search (by title, author, keyword, ISBN etc)
– Availability & Prices
– Check-out

• Performance bugs = !Functional bugs (ambiguous)


Scope
• N-tier ecommerce architecture & all modules /
components / services associated

[Diagram: N-tier architecture with Web, App Server, Services, Content, Media, E-mail, Database, Payment Processing, Storage, Management, etc.]


• Yet, principles are applicable to any software projects

‘Simple’ Performance Testing

Input:
• Log in
• Search
• Buy
• Profile
• etc

Output:
• Response to ‘client’
• System behavior

[Diagram label: Services]

Performance Testing:
Investigation + Validation = Risk Reduction

I am assuming that you…


• understand the overall system architecture
• know ‘enough’ about the dev design
• have performance testing goals / objectives /
requirements
• are familiar with test & production environments,
as well as their differences
• have done proper performance test design
• can model user behavior / work load pattern
• are able to use test tools correctly
• are experienced in interpreting test data (test
tool & server metrics)


Performance bugs investigation


• Two major models:

– Over the fence / bug in bug out

– Collaborative / Integrated

• What’s your current model?


• How does the defect / bug look different?

Quick Comparison
• Over the fence
– Advantages
• Simple
• Fast
• Transferable
– Disadvantages
• Developer experience
• Performance team experience

• Collaborative
– Advantages
• Relationship
• Visibility
• Knowledge / Understanding
• Accomplishment
– Disadvantages
• Ownership
• Potential time sink
• Technical limitation


‘Big 5’ of Performance Bugs

• What are the symptom(s)?


• Why do we care? Do we?
• Why does it happen?
– symptoms & hypothesis
• What can we do to investigate / fix?
• What can be done to mitigate? trade offs?


Memory Leaks
• What are the symptom(s)?
[Chart: Memory Consumption (MB) over Time]

• Anything else?

Memory Leaks

• Why do we care? Do we?


– What’s the rate of the leak?
– Does it stabilize after ‘some’ time?
– How quickly does the app run out of memory?

• Why does it happen?


– allocation without de-allocation
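
The two questions above (rate of the leak, time until the app runs out of memory) can be estimated from soak-test samples. This is a minimal sketch, not a definitive method: the sample values, the one-reading-per-hour interval, and the 2000 MB limit are all hypothetical, and a straight-line fit is only one possible growth model.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def hours_until_exhaustion(current_mb, limit_mb, rate_mb_per_hour):
    """Hours left before the limit is reached; None if memory is not growing."""
    if rate_mb_per_hour <= 0:
        return None
    return (limit_mb - current_mb) / rate_mb_per_hour


# Hypothetical samples: one memory reading per hour during a soak test.
hours = [0, 1, 2, 3, 4, 5]
memory_mb = [500, 520, 540, 560, 580, 600]

rate = slope(hours, memory_mb)  # MB per hour
remaining = hours_until_exhaustion(memory_mb[-1], limit_mb=2000,
                                   rate_mb_per_hour=rate)
print(f"leak rate: {rate:.1f} MB/h, hours until limit: {remaining:.1f}")
```

If the estimated time to exhaustion is much longer than the planned restart or deployment cycle, the leak may matter less than its raw rate suggests.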


Memory Leaks
• What can we do to investigate / fix?
– Memory profiling tool? Debugger?
– Do we have previous test results to eliminate ‘safe
areas’?
– Areas / modules added or changed by project?
– Is the leak linear?
– Is rate of the leak proportional to ‘load’?
– Does the leak occur with single virtual user?
– Is the leak occurring only for certain types of requests?
– Is the rate dependent on the response size?
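
Two of the questions above (is the leak linear, and is its rate proportional to load) lend themselves to a quick numeric check. The sketch below is illustrative only: `r_squared`, `leak_per_request`, and all sample numbers are assumptions, not taken from a real test run.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares straight-line fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def leak_per_request(runs):
    """For each run, memory growth in KB divided by requests served."""
    return {name: growth_kb / requests
            for name, (growth_kb, requests) in runs.items()}


# Hypothetical readings: memory (MB) sampled every 10 minutes.
minutes = [0, 10, 20, 30, 40]
memory_mb = [100, 110, 121, 129, 140]
linearity = r_squared(minutes, memory_mb)  # close to 1.0 suggests a linear leak

# Hypothetical runs: (memory growth in KB, requests served).
runs = {"1 user": (1000, 10000), "50 users": (50000, 500000)}
per_request = leak_per_request(runs)
print(f"R^2 = {linearity:.3f}", per_request)
```

A similar per-request figure across load levels hints that the leak is tied to request handling rather than to elapsed time.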


Memory Leaks

• What can be done to mitigate? trade offs?


– Restart service / application periodically?
– Add more memory?
– What’s the rate of the leak?
– Any other side-effects?
• Performance degradation
• Operations overhead
• Virtual address space limit
• Other processes / apps


More Memory leaks?


[Chart: Memory Consumption (MB) over Time]

Is this a memory leak?


What are the symptoms?


Do we care?
[Chart: Memory Consumption (MB) over Time]

What’s the rate of the leak?


Does it stabilize after ‘some’ time?
How quickly does the app run out of memory?


Why does it happen?


[Chart: Memory Consumption (MB) over Time]

hmm… it looks different. Does it matter?


Why does it look different?
What do we do now?


Memory Leaks
• What can we do to investigate / fix?
– Memory profiling tool / Debugger
– Do we have previous test results to eliminate ‘safe
areas’?
– Areas / modules added or changed by project?
– Is the leak linear?
– Is rate of the leak proportional to ‘load’?
– Does the leak occur with single virtual user?
– Is the leak occurring only for certain types of requests?
– Is the rate dependent on the response size?
– Does the leak stop immediately after the test is stopped?
– Are the stair-steps appearing at regular intervals?
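
The last question (do the stair-steps appear at regular intervals?) can be checked mechanically. A minimal sketch, assuming memory is sampled once per minute and that a jump of at least 40 MB counts as a step; both the threshold and the sample values are hypothetical.

```python
def find_steps(samples, min_jump_mb):
    """Indexes where memory jumps by at least min_jump_mb over the previous sample."""
    return [i for i in range(1, len(samples))
            if samples[i] - samples[i - 1] >= min_jump_mb]


def step_intervals(step_indexes):
    """Distance, in samples, between consecutive steps."""
    return [b - a for a, b in zip(step_indexes, step_indexes[1:])]


# Hypothetical stair-step pattern, one sample per minute.
memory_mb = [100, 100, 100, 150, 150, 150, 200, 200, 200, 250, 250]

steps = find_steps(memory_mb, min_jump_mb=40)
intervals = step_intervals(steps)
regular = len(intervals) > 0 and len(set(intervals)) == 1
print(steps, intervals, regular)
```

Regular intervals point toward something scheduled (a timer, a cache refresh, a batch job), which narrows where to look.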

What about this one?


[Chart: Memory Consumption (MB) over Time]

Is this a bug?
What are the symptoms?

Do we care?
[Chart: Memory Consumption (MB) over Time]

How big are the spikes?


How frequently does it occur?
Any side effects when it spikes?
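
Spike size and frequency can be pulled out of the raw samples before discussing whether they matter. This is a sketch under stated assumptions: the median is used as the baseline, a spike is anything above 1.5 times that baseline, and the readings are invented.

```python
from statistics import median


def find_spikes(samples, factor=1.5):
    """Return the median baseline and (index, value) pairs above factor * baseline."""
    baseline = median(samples)
    threshold = baseline * factor
    spikes = [(i, v) for i, v in enumerate(samples) if v > threshold]
    return baseline, spikes


# Hypothetical memory readings (MB), one per minute.
memory_mb = [200, 205, 198, 600, 202, 199, 201, 650, 200, 203]

baseline, spikes = find_spikes(memory_mb)
sizes = [value - baseline for _, value in spikes]           # how big
gaps = [b[0] - a[0] for a, b in zip(spikes, spikes[1:])]    # how frequent
print(baseline, spikes, sizes, gaps)
```

The spike timestamps are then the natural starting point for looking at side effects such as response time or errors in the same window.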

Other memory performance bugs


• Memory corruption
– un-initialized memory
– un-owned memory
– thread contention / race-condition
– buffer overrun
– stack overflow
• Do we care? What happens then?
– Users get garbage
– Entire app crashes
• What can we do to investigate / fix?


Other system resources


• Handle
• Thread
• Connection
• Disk
• Processor

Can these leak? Disk?!?


Any abnormal behavior / ‘weirdness’?


Processor
[Chart: CPU utilization (0 to 100) over Time]

What are the symptoms?


Do we care?
[Chart: CPU utilization (0 to 100) over Time]

Why does it happen?


– CPU is busy! What does it mean?
What can we do to investigate / fix?
– What happens when the test is stopped?
– How quickly does it repro?
– Profiler / Debugger
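
One way to answer "what happens when the test is stopped?" is to compare CPU after the load ends against an idle threshold. A minimal sketch: the 10% threshold, the sample values, and the function name are assumptions. CPU that stays high with no load is often read as a spinning thread or runaway loop rather than load-driven work, but that is only a hypothesis to confirm with a profiler.

```python
def still_busy_after_stop(cpu_samples, stop_index, idle_threshold=10.0):
    """True if mean CPU from stop_index onward stays above idle_threshold."""
    after = cpu_samples[stop_index:]
    if not after:
        return False
    return sum(after) / len(after) > idle_threshold


# Hypothetical CPU % readings; the load test stops at index 4 in both runs.
stuck_run = [5, 95, 98, 97, 96, 95, 94, 96]
healthy_run = [5, 60, 62, 61, 6, 5, 4, 5]

print(still_busy_after_stop(stuck_run, stop_index=4))    # CPU never drops
print(still_busy_after_stop(healthy_run, stop_index=4))  # CPU returns to idle
```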


Processor
[Chart: CPU utilization (0 to 100) over Time]

What are the symptoms?


Do we care?
[Chart: CPU utilization (0 to 100) over Time]

Why does it happen?


– CPU is idle! What does it mean?
What can we do to investigate / fix?
– How quickly does it repro?
– Is there a simpler repro?
– Profiler / Debugger


What about our users?

Q: Response time for searching a book is 3 seconds. Is this a perf bug?

• What are the symptom(s)?


• Why do we care? Do we?
• Why does it happen?
• What can we do to investigate / fix?
• What can be done to mitigate? trade offs?
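
Whether 3 seconds is a bug depends on the stated goal and on which statistic the goal refers to. The sketch below assumes a hypothetical goal of "90th percentile search response time under 2.0 seconds" and uses a nearest-rank percentile; the goal, the samples, and the percentile method are all illustrative choices.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile, with pct as an integer from 1 to 100."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical search response times in seconds.
search_times = [1.2, 1.4, 1.1, 3.0, 1.3, 1.5, 2.8, 1.2, 1.6, 3.4]

GOAL_P90_SECONDS = 2.0  # assumed performance goal, not from the source
p50 = percentile(search_times, 50)
p90 = percentile(search_times, 90)
violates_goal = p90 > GOAL_P90_SECONDS
print(f"p50={p50}s p90={p90}s violates goal: {violates_goal}")
```

Here the median looks fine while the 90th percentile does not, which is why a single "response time is 3 seconds" figure is hard to judge on its own.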


Other End-User Issues…


• Time-out
• Invalid response (garbage / internal server error)
• Response size

• What are the symptom(s)?


• Why do we care? Do we?
• Why does it happen?
• What can we do to investigate / fix?
• What can be done to mitigate? trade offs?


Other Performance ‘Issues’…


• Hardware (model / configuration / compatibility)
• Scalability (throughput & response time)
• Capacity
• Bottleneck
• Stress recovery
• ‘Functional issues’ under load:
– Thread Pool
– Queuing & Queues
– Partial system failure / recovery
– Failover
– Load-balancing
• And others …

Performance Bugs Investigation Strategy


• What are the symptom(s)?
– Consolidate all details around the ‘symptoms’
– Acquire overall view of client & system behavior
– Primary vs secondary symptoms
– Correlate symptoms as appropriate
• Why do we care? Do we?
– Do symptoms go against any project / performance goals?
– Anything bad happening as a result?
• Why does it happen?
– Be familiar with how your application works internally / Get help
– Understand how software works in general
– Symptoms & hypothesis
• What can we do to investigate / fix?
– Gather Information vs Confirm Hypothesis
– Divide and conquer
• What can be done to mitigate? trade offs?
– Think big picture
– Involve Stakeholders
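
"Correlate symptoms as appropriate" can start with a plain correlation between two metrics sampled at the same times. A minimal sketch using a hand-written Pearson coefficient; the metric names and values are hypothetical, and a high correlation is a lead for a hypothesis, not proof of cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


# Hypothetical metrics sampled at the same five points in a test run.
memory_mb = [500, 600, 700, 800, 900]
cpu_pct = [40, 42, 39, 41, 40]
response_s = [1.0, 1.2, 1.4, 1.7, 2.1]

mem_vs_response = pearson(memory_mb, response_s)
cpu_vs_response = pearson(cpu_pct, response_s)
print(f"memory vs response: {mem_vs_response:.2f}, "
      f"CPU vs response: {cpu_vs_response:.2f}")
```

In this made-up data, response time tracks memory growth but not CPU, which would make memory the primary symptom to investigate first.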


Q&A + Comments

as a performance engineer!