and the
Geeks who love them
Kyle Hailey
http://perfvision.com
Wait Events
In this Presentation:
Tuning Methodology
Plan of Action
Statspacks for Collection Data
Based on Waits
Word of Wisdom
Half of the game is knowing when to act and
how much effort to put in
Database is Hung!
Everybody blames the database
Poor Database
Yet 9 out of 10 DBAs agree it's not the
database
How do you prove it to management?
On the off chance it's the database, now
we are in some serious trouble!
*$%@!!
Oracle's Defense
After years of false accusations
Oracle took action and created a defense
system:
WAIT EVENTS
To the rescue
Locks
Network
IO
Waits
Instrumented code to indicate bottlenecks
Number of times waited
Amount of time waited
Examples
IO
Locks
SQL*Net
Statspack
Look at Top 5 Timed Events (~50 lines down)
Get to work if
CPU + WAIT >= Available CPU
Available CPU
Available CPU = # CPUs * Elapsed time
# of CPUs
SQLPLUS> show parameters cpu_count
Example
CPU + WAITS
= 250 + 32 + 15 + 8 + 5 = 310 secs
Available CPU was 120 secs
310 >> 120
Get to work tuning!

Event                          Time (s)
--------------------------- -----------
buffer busy waits                   250
CPU time                             32
free buffer waits                    15
write complete waits                  8
log buffer space                      5
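The test above can be sketched as one query. Only the event times and the 120-second total come from the example; splitting that total into 2 CPUs over a 60-second interval is an assumption made for illustration.

```sql
-- Sketch of the "CPU + WAIT >= Available CPU" test using the example numbers.
-- cpu_count = 2 and elapsed_secs = 60 are assumed; the example only states 120 secs.
with inputs as (
  select 2                as cpu_count,
         60               as elapsed_secs,
         32               as cpu_secs,
         250 + 15 + 8 + 5 as wait_secs
  from   dual
)
select cpu_count * elapsed_secs as available_cpu_secs,
       cpu_secs + wait_secs     as cpu_plus_wait_secs,
       case
         when cpu_secs + wait_secs >= cpu_count * elapsed_secs
         then 'get to work tuning'
         else 'leave it alone'
       end                      as verdict
from   inputs;
```

With these inputs the query compares 310 seconds of CPU plus waits against 120 seconds of available CPU.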
Copyright 2006 Kyle Hailey
Data Sources
Statspack
Top 5 Timed Events
10g ASH
OEM
ASH Report : ashrpt.sql
Custom
V$session,v$session_wait
v$active_session_history
Statspack
OEM 10g
9i & 10g
Licensed but worth it
Does all the work
Aggregates Wait time
Sums with CPU time
Displays available CPU
Groups Waits
Filters out unusable waits
CPU + WAIT
Available CPU
Tuning Methodology
Machine
Run queue (CPU)
reduce CPU usage or add CPUs
Paging
Reduce memory usage or add memory
Oracle
Waits + CPU > Available CPU
Tune waits
CPU 100%
Tune SQL
We are going to concentrate here on WAITS
Waits
360 waits in 9i
36 waits represent 99% of all Bottlenecks
From Anjo Kolk's site www.oraperf.com
Based on total wait times
Wait Areas
Buffer Cache
I/O
Locks
Library Cache
Redo
SQL*Net
[Diagram: wait areas by component: Redo, Library Cache, Buffer Cache, IO, Locks, Network. Lock types shown: HW, TM, TX 4, TX 6, ST, SQ, TS]
CPU
Different from Waits
Still a Timed Event
High CPU & Low Waits
Tune SQL
100% CPU
tune highest SQL
[Diagram: users waiting on Locks, Network, IO]
Background Waits
Filter Out Background Waits
Statspack
ASH : SESSION_TYPE='FOREGROUND'
V$session : type='USER'
Background Waits
ASH 10g
Avoid Background waits in ASH with
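One way to leave background sessions out of an ASH query is sketched below. The 5-minute window and the restriction to waiting samples are assumptions for illustration, and the query assumes a release where SESSION_TYPE is exposed as the string 'FOREGROUND'.

```sql
-- Top foreground wait events over the last 5 minutes of ASH samples.
-- Assumes SESSION_TYPE is exposed as the string 'FOREGROUND'.
select event,
       count(*) as samples
from   v$active_session_history
where  session_type  = 'FOREGROUND'
and    session_state = 'WAITING'
and    sample_time   > sysdate - 5/1440
group  by event
order  by samples desc;
```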
Idle Waits
Filter Out
10g
where wait_class != 'Idle'
Create a list
select name from v$event_name where
wait_class='Idle';
9i
Create a list with
Documentation
List created from 10g
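A possible use of the 10g idle filter is sketched below. Applying it to v$system_event is an assumption made for illustration; the same join on event name could be applied to other wait views.

```sql
-- System-wide waits since instance startup, idle events excluded (10g).
-- TIME_WAITED is reported in centiseconds, hence the division by 100.
select e.event,
       e.total_waits,
       round(e.time_waited / 100) as time_waited_secs
from   v$system_event e,
       v$event_name   n
where  n.name        = e.event
and    n.wait_class != 'Idle'
order  by e.time_waited desc;
```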
RAC Waits
You are on your own
Check documentation
If you are not using RAC then no worries
Intermediate
Library Cache locks & Pins
Run queries to find lockers
IO
Check average read times per file
Should be between 5-20 ms
Data in Statspack under File IO Stats
Intermediate - IO
1. db file sequential read
Tune SQL, speed up disks (5-15ms), increase buffer cache
2. db file parallel read
Tune SQL, tune io subsystem, increase buffer cache
3. db file scattered read
FTS , Tune SQL, add indexes, speed up disks (5-15ms)
4. direct path write (lob)
Improve IO, reduce lob write size
5. direct path read
sorts or PQO - tune IO, sort less
6. direct path write
direct path load or temp io, improve disk speed
IO
1. db file sequential read
Tune SQL, speed up disks (10-20ms), increase buffer cache
2. db file parallel read
Tune SQL, tune io subsystem, increase buffer cache
3. db file scattered read
FTS , Tune SQL, add indexes, speed up disks (5-15ms)
4. direct path write (lob)
Improve IO, reduce lob write size
5. direct path read
sorts or PQO - tune IO, sort less
6. direct path write
direct path load or temp io, improve disk speed
IO Solutions
If
db file scattered read
db file sequential read
db file parallel read
Then
Check average read times per file
Should be between 5-15 ms
Data in Statspack under File IO Stats
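Outside Statspack, a rough per-file average can be pulled from v$filestat. This is a sketch, not the Statspack calculation: the figures are cumulative since instance startup rather than per snapshot interval, and the join to v$datafile is only there to show file names.

```sql
-- Average read time per datafile, in milliseconds, since instance startup.
-- READTIM is in centiseconds and is only populated when timed_statistics is on.
select d.name,
       f.phyrds,
       round(f.readtim * 10 / nullif(f.phyrds, 0), 1) as avg_read_ms
from   v$filestat f,
       v$datafile d
where  d.file# = f.file#
order  by avg_read_ms desc nulls last;
```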
Advisory
Buffer Pool Advisory
    Size for   Size  Buffers for   Read      Estimated
P    Est (M) Factor     Estimate Factor Physical Reads
--- -------- ------ ------------ ------ --------------
D         56     .1        6,986    2.3         58,928
D        112     .2       13,972    1.6         42,043
D        224     .4       27,944    1.0         25,772
D        336     .6       41,916    1.0         25,715
D        448     .8       55,888    1.0         25,715
D        596    1.0       74,351    1.0         25,715
D        728    1.2       90,818    1.0         25,715
D        840    1.4      104,790    1.0         25,715
D        952    1.6      118,762    1.0         25,715
D      1,064    1.8      132,734    1.0         25,715
IO Solutions
After Checking
FileIO response times
Buffer Cache Hit Ratio
Advanced
1. buffer busy waits
Buffer wait Statistics
v$waitstat
P1 file#, p2 block#, p3 class
2. row cache lock
Dictionary Cache Stats
v$rowcache
P1 rowcache #
3. latch free
Latch Sleep breakdown
V$latch
P2 latch#
4. Enqueue
Statspack doesn't help
V$lock
P1 lock type and mode
SQL Statements
Session IDS
select
s.sid, /*SESSION */
w.event , /* WAIT */
s.sql_hash_value, /* SQL */
w.p1, w.p2, w.p3 /* P1, P2 , P3 */
from
v$session s,
v$session_wait w
where
w.sid=s.sid
/
Difficult Waits
Multiple causes and solutions
Latches
Locks
Buffer Busy
Row Cache Lock
Latches
Protect memory for concurrent use
protect lines of code
Light weight locks
Bit in memory
Atomic processor call
Fast and cheap
Gone if memory is lost
Exclusive Generally
Shared reading has been introduced for some latches
Finding Latches
latch free
Covers many latches, find the problem latch by
1. select name from v$latchname where latch# = p2;
OR
2. Find highest sleeps in Statspack latch section
In 10g, important latches have a wait event
latch: cache buffers chains
latch: shared pool
latch: library cache
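Outside Statspack, the latches with the most sleeps can be listed directly from v$latch. A minimal sketch; the cutoff of ten rows is arbitrary, and the counts are cumulative since instance startup.

```sql
-- Ten latches with the most sleeps since instance startup.
select *
from  (select name, gets, misses, sleeps
       from   v$latch
       order  by sleeps desc)
where  rownum <= 10;
```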
Shared Pool
Too much hard parsing, too small a shared pool
select segment_name,
segment_type
from dba_extents
where file_id = P1
and P2 between
block_id and block_id + blocks - 1;
CACHE# PARAMETER
---------- --------------------------------
1 dc_free_extents
4 dc_used_extents
2 dc_segments
0 dc_tablespaces
5 dc_tablespace_quotas
6 dc_files
7 dc_users
3 dc_rollback_segments
8 dc_objects
17 dc_global_oids
12 dc_constraints
Example
row cache : sequence
sql : select seq.nextval
problem : sequence had cache of 1
solution: increase sequence cache to 20
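The fix in this example could look like the sketch below. The sequence name seq is taken from the example; the check against user_sequences is an added illustration that reuses the example's value of 20 as a threshold.

```sql
-- List sequences in the current schema whose cache is smaller than 20.
select sequence_name,
       cache_size
from   user_sequences
where  cache_size < 20
order  by cache_size;

-- Raise the cache for the sequence from the example.
alter sequence seq cache 20;
```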
Statspack no help
V$session_wait needs lots of decoding
P1 tells Lock Type and Mode
P2,P3 give more data
Usually Need SQL to solve
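The P1 decoding can be sketched as follows. The arithmetic relies on the usual enqueue layout: the two high-order bytes of P1 hold the ASCII characters of the lock type and the two low-order bytes hold the mode. The event name filters are assumptions covering 9i ('enqueue') and 10g ('enq: ...').

```sql
-- Decode lock type and requested mode from P1 for sessions waiting on an enqueue.
-- The top two bytes of P1 are the two ASCII characters of the lock type;
-- the low two bytes are the mode.
-- e.g. P1 = 1431502854 (hex 55530006) decodes to type 'US', mode 6.
select sid,
       event,
       chr(bitand(trunc(p1 / 16777216), 255)) ||
       chr(bitand(trunc(p1 / 65536), 255))     as lock_type,
       bitand(p1, 65535)                       as lock_mode,
       p2,
       p3
from   v$session_wait
where  event = 'enqueue'       -- 9i
or     event like 'enq:%';     -- 10g
```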
Locks 10g
10g breaks Enqueues out
enq: HW - contention             Configuration
enq: TM - contention             Application
enq: TX - allocate ITL entry     Configuration
enq: TX - index contention       Concurrency
enq: TX - row lock contention    Application
enq: UL - contention             Application
Locks : TM & TX
Utllockt.sql
@?/rdbms/admin/catlock.sql
@?/rdbms/admin/utllockt.sql
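When the scripts are not installed, a rough view of holders and waiters can be read straight from v$lock. This is a sketch and not a replacement for utllockt.sql, which prints the wait chain as a tree.

```sql
-- Holders (BLOCK = 1) and waiters (REQUEST > 0), grouped by the
-- resource (ID1, ID2) they contend for; holders sort first.
select sid,
       type,
       id1,
       id2,
       lmode,
       request,
       block
from   v$lock
where  block = 1
or     request > 0
order  by id1, id2, request;
```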
10g v$active_session_history
Best source
10g only
Data exists since v7
Can simulate v$active_session_history
ASH report
@?/rdbms/admin/ashrpt.sql
Pick interval over last 7 days !
1) General info                9) Top SQL using literals
2) Top User Events ***        10) Top Sessions ***
3) Top Background Events      11) Top Blocking Sessions
4) Top Event P1/P2/P3 Values  12) Top Sessions running PQs
5) Top Service/Module         13) Top DB Objects
6) Top Client IDs             14) Top DB Files
7) Top SQL Command Types      15) Top Latches
8) Top SQL Statements ***     16) Activity Over Time ***
V$session_wait
Moment in Time data
select
decode(w.wait_time, 0, w.event , 'CPU') as "TOP 5 Timed Events",
count(*)
from v$session s,
v$session_wait w
where w.sid=s.sid
and s.status='ACTIVE'
and s.type='USER'
and w.event not in ('jobq slave wait',
'rdbms ipc reply')
group by
decode(w.wait_time, 0, w.event , 'CPU')
order by count(*) desc;
V$session_wait
col status for a35
select s.sid,
s.sql_hash_value,
decode(w.wait_time, 0, w.event , 'CPU') as status,
w.p1, w.p2, w.p3
from v$session s,
v$session_wait w
where w.sid=s.sid
and s.status='ACTIVE'
and s.type='USER'
and w.event not in ('jobq slave wait',
'rdbms ipc reply');
V$session_wait
Moment in Time data
       SID       SQL_HASH STATUS                                   P1         P2         P3
---------- -------------- -------------------------------- ---------- ---------- ----------
       234       82347421 CPU                              1431502854         39          0
       235     3336613934 enq: US - contention             1431502854         44          0
       236     1772152815 enq: US - contention             1431502854         42          0
       238     2750335498 enq: US - contention             1431502854         44          0
       240      343101472 enq: US - contention             1431502854         44          0
       246     1782401401 enq: US - contention             1431502854         44          0
       248     3333220954 CPU                              1650815232          1          0
       252      323960517 enq: US - contention             1431502854         44          0
       260     1272059733 CPU                              1431502854         44          0
Waits 28-36
28. file identify
Keep log files open, reduce checkpoints
29. pipe put
Speed up pipe readers
30. switch logfile command
Avoid switching log files
31. SQL*Net break/reset to dblink
Check for errors in sql statement sent
32. log file switch (archiving needed)
Archive log running out of space
33. Wait for a undo record
??
34. direct path write (lob)
Improve IO, reduce lob write size
35. undo segment extension
Use UNDO or with RBS, increase RBS size, avoid OPTIMAL
36. undo segment tx slot
Use UNDO, increase # of RBS segs
Custom Collecting
collecting:
ash.collect(sleep,loops)
save data in "ash_data" table
sleep = wait time between loops
loops = # of loops
Debug or testing:
ash.print(sleep,loops)
prints with dbms_output
to see output, run
set serveroutput on
execute dbms_output.enable(1000000)
Sampling
select
to_char(sysdate,'SSSSS')+
trunc(sysdate-to_date('JAN-01-1970 00:00:00','MON-DD-YYYY HH24:MI:SS'))*86400 ,
sysdate,
s.indx "SID",
decode(w.ksusstim,
0,decode(n.kslednam,
'db file sequential read', 'I/O',
'db file scattered read','I/O',
'WAITING'),
'CPU') "STATE",
s.ksuseser "SERIAL#",
s.ksuudlui "USER#",
s.ksusesql "SQL_ADDRESS",
s.ksusesqh "SQL_HASH_VALUE" ,
s.ksuudoct "COMMAND" /* aka SQL_OPCODE */,
s.ksuseflg "SESSION_TYPE" ,
w.ksussopc "EVENT#",
w.ksussseq "SEQ#" /* xksuse.ksuseseq */,
w.ksussp1 "P1" /* xksuse.ksusep1 */,
w.ksussp2 "P2" /* xksuse.ksusep2 */,
w.ksussp3 "P3" /* xksuse.ksusep3 */,
w.ksusstim "WAIT_TIME" /* xksuse.ksusetim */,
s.ksuseobj "ROW_WAIT_OBJ#",
s.ksusefil "ROW_WAIT_FILE#",
s.ksuseblk "ROW_WAIT_BLOCK#",
s.ksusepnm "PROGRAM",
s.ksuseaph "MODULE_HASH" /* ASH collects string */
from
x$ksuse s ,
x$ksusecst w,
x$ksled n
where
s.indx != ( select distinct sid from v$mystat ) and
bitand(s.ksspaflg,1)!=0 and
bitand(s.ksuseflg,1)!=0 and
n.indx=w.ksussopc and
s.indx = w.indx and
( (
/* status Active - seems inactive & "on cpu"=> not on CPU */
w.ksusstim != 0 and /* on CPU */
bitand(s.ksuseidl,11)=1 /* ACTIVE */
)
or
w.ksussopc not in /* waiting and the wait event is not idle */