
Session:

Problem Determination and Database Monitoring Using db2pd
Jorge Daniel Vaquero
Jairo Balart
DAMA - UPC

"What's happening in the engine?"


Monitoring scenarios
Presented to better understand requirements

Tools available in v8.2 and v9.1


System monitor
db2pd
db2pd: tool to monitor and troubleshoot DB2
Standalone utility shipped with the DB2 engine starting with DB2 v8.2
Used by customers to monitor and troubleshoot
Gives the user a closer view into the DB2 engine

Advantages of using db2pd

The tool collects information without acquiring any latches or using any engine resources,
which has two major benefits:
Faster retrieval
No competition for engine resources

Determine whether instance or database is up

"db2pd -" tells whether the instance is up and for how long
db2pd -
Database Partition 0 -- Active -- Up 4 days 05:34:53

db2pd -db <database> - tells how long the database has been active
db2pd -db sample
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 19:11:07
db2pd -alldbs
Database Partition 0 -- Database XMLDB -- Active -- Up 0 days 19:11:25
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 19:11:43

Determine whether database is up (contd)

Attempting to take an offline backup of an activated database


db2 backup db sample to /home/db2inst1
SQL1035N The database is currently in use. SQLSTATE=57019
db2diag.log will contain
2006-01-04-02.04.17.162649-300 I22772A342         LEVEL: Error
PID     : 6275140              TID  : 1           PROC : db2bp
INSTANCE: db2inst1             NODE : 000
FUNCTION: DB2 UDB, database utilities, sqlubConnectDatabase, probe:1259
DATA #1 : Hexdump, 4 bytes
0x0FFFFFFFFFFFF490 : FFFF FBF5

The db2 list applications command will not help here:
it only tells you whether or not any applications are active on the database
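
The four bytes in the hexdump above appear to be the return code of the failed connect. A minimal sketch of decoding them, assuming they form a big-endian signed 32-bit integer (the function name is hypothetical and not part of DB2):

```shell
# hex_to_sqlcode HEX: interpret 8 hex digits as a signed 32-bit integer.
# Assumption: the bytes are printed in big-endian order.
hex_to_sqlcode() {
    value=$(( 0x$1 ))
    # Values with the top bit set are negative in two's complement.
    if [ "$value" -ge 2147483648 ]; then
        value=$(( value - 4294967296 ))
    fi
    echo "$value"
}

hex_to_sqlcode FFFFFBF5    # prints -1035
```

Read this way, the value lines up with the SQL1035N returned by the backup command.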

Monitoring progress and behavior of DB2 agents
db2pd -agents [db=<database>] [ [agent=<agentid>] | [application=<appid>] ]
Agents:
Current agents:     8
Idle agents:        5
Active agents:      2
Coordinator agents: 2

Address AppHandl [nod-index] AgentPid Priority Type State ClientPid Userid ClientNm Rowsread Rowswrtn LkTmOt DBName
0x0780000001A35540 0 [000-00000] 2809968 0 Idle n/a n/a n/a 0 0 NotSet n/a
0x0780000001A16A00 1406 [000-01406] 1134686 0 Coord Inst-Active 598024 dabrashk db2bp 2 0 NotSet SAMPLE
0x0780000001A17460 581 [000-00581] 884778 0 Coord Inst-Active 2969776 dabrashk db2bp 14 0 NotSet SAMPLE
0x0780000001A34AE0 0 [000-00000] 2076776 0 Coord Pooled n/a n/a n/a 0 0 NotSet SAMPLE

SQL1226N due to too many db2agent processes

If many db2agent processes remain attached to an instance
They may use up all of the agents (MAXAGENTS)
New connections will trigger SQL1226N
The maximum number of client connections are already started
An ADM7009E error message will be logged in the notify log

Use db2pd -agents to find AppHandl, ClientPid, Userid and ClientNm


db2pd -agents | awk '/Address|^0x/ { print $2, $8, $9, $10}'

AppHandl ClientPid Userid ClientNm
643 3612794 dabrashk db2bp
642 3227856 dabrashk db2bp

Force the identified AppHandl off with the command
db2 "force application (642)"
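
When several handles belong to one user, the lookup and the force command can be combined. The sketch below is only an illustration: build_force_cmd is a hypothetical helper, and it assumes the 14-column -agents layout shown earlier ($2 = AppHandl, $9 = Userid). It prints the command for review instead of running it.

```shell
# build_force_cmd USER: read `db2pd -agents` output on stdin and print one
# FORCE APPLICATION command covering every AppHandl owned by USER.
# Assumption: $2 is AppHandl and $9 is Userid, as in the awk example above.
build_force_cmd() {
    awk -v user="$1" '
        /^0x/ && $9 == user && $2 != 0 {
            if (list != "") list = list ", "
            list = list $2
        }
        END {
            if (list != "") printf "db2 \"force application (%s)\"\n", list
        }
    '
}
```

Usage: db2pd -agents | build_force_cmd dabrashk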

Repeat Option: db2pd -rep [num sec] [count]


db2pd -age -rep 10 3
Database Partition 0 -- Active -- Up 0 days 00:10:42
AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    18943    23998    SAMPLE
13       1085632  1675376   jmcmahon db2bp    3489     9898     SAMPLE

AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    98941    100091   SAMPLE
13       1085632  1675376   jmcmahon db2bp    4100     9898     SAMPLE

AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    100999   230448   SAMPLE
13       1085632  1675376   jmcmahon db2bp    6777     9898     SAMPLE

The repeat option is handy for watching activity. The example above watches the agents'
reads and writes
Combine the -repeat option with file redirection
db2pd -age file=agents.out -rep 10 3

Combine multiple options and use mixed scope options
db2pd -db sample -loc -tra -age -fil lock.txt
Use -file for multiple options
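
Repeated samples are easier to read as per-interval deltas. A minimal sketch: rows_delta is a hypothetical helper that expects "AppHandl Rowsread Rowswrtn" triples, one per line, from successive samples.

```shell
# rows_delta: read "AppHandl Rowsread Rowswrtn" triples from successive
# samples on stdin and print the change since the previous sample of the
# same AppHandl. The first sample of each handle only sets the baseline.
rows_delta() {
    awk '
        ($1 in rd) {
            printf "%s read+%d written+%d\n", $1, $2 - rd[$1], $3 - wr[$1]
        }
        { rd[$1] = $2; wr[$1] = $3 }
    '
}
```

Assuming the full 14-column -agents layout ($11 = Rowsread, $12 = Rowswrtn), the triples could come from: db2pd -agents -rep 10 3 | awk '/^0x/ && $2 != 0 { print $2, $11, $12 }' | rows_delta. The $2 != 0 filter is there because idle and pooled agents all share handle 0. With the numbers from the example above, AppHandl 120 reads 79998 rows in the first interval but only 2058 in the second.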

Monitoring progress and behavior of applications

db2pd -applications -db sample
Use to map an application to a coordinator agent
Use to determine the status of an application
Use to map an application to a dynamic SQL statement
db2pd -activestatements
Any statement that is part of the active statement list is reported
Gives the user the ability to identify all active dynamic statements for all applications

Use to map an application ID to an IP address and port
Simplified in v9.1: no need to convert
Monitoring currently executing dynamic SQL statements

db2pd -db sample -activestatements
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:43:12
Active Statement List:
Address AppHandl [nod-index] UOW-ID StmtID AnchID StmtUID EffISO EffLockTOut EffDegree StartTime LastRefTime
0x0780000020A405A0 51 [000-00051] 7 1 73 1 1 -2 0 Sat Jan 7 00:17:43 2006 Sat Jan 7 00:17:43 2006
0x07800000209FB8A0 44 [000-00044] 1 2 44 1 1 -2 0 Sat Jan 7 00:10:12 2006 Sat Jan 7 00:10:12 2006

db2pd -db sample -dyn

Dynamic SQL Statements:
Address            AnchID StmtUID Text
0x0780000020A08660 43     1       create table staff3 like staff
0x0780000020A076C0 44     1       select name from staff

NumEnv NumVar NumRef NumExe
2      2      4      3

Monitoring SQL statements executed on the instance

sqltext script
Uses the insert time as reported in the Dynamic SQL Variations section of db2pd -dynamic.

Monitoring transactions: db2pd -transactions


Transactions:
Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 Firstlsn Lastlsn LogSpace SpaceReserved TID AxRegCnt GXID
0x078000002024DA80 1000 [000-01000] 2 1020 READ 0x00000000 0x00000000 0x000177000C00 0x0000017FF000 1230 10000 0x000000006614 1 n/a
0x078000002024E780 1004 [000-01004] 3 35 WRITE 0x00000000 0x00000000 0x0001801CF000 0x000001999600 115 10000 0x000000006627 1 n/a

Useful for determining the amount of resources a transaction is using


db2pd -transactions provides
number of locks
first lsn, last lsn
Log Sequence Number represents the relative byte address, within the database log, of the first byte of the log record
logspace used (in pages)
space reserved (in pages)

Monitor the progress and behavior of any transaction

Monitoring application progress

Identifying slow or hanging applications
Monitor rows read and written for agents
db2pd -agents | awk '/Address|^0x/ { print $2, $11, $12, $14;}'
AppHandl Rowsread Rowswrtn DBName
51 109 58 SAMPLE
44 46 0 SAMPLE
9 0 0 n/a
8 126 107 SAMPLE
0 0 0 SAMPLE

Use AppHandl to determine the application to take action on
Use db2pd -dynamic to find the SQL statement
Use db2pd -static to find the package
Use AppId to find the application's IP address and port number (if applicable)

Monitoring application progress (contd)

Monitor start time of statements in db2pd -activestatements
db2pd -activestat -db sample | awk '/Address/ { print $2,$6,$7,$11,$12 } /^0x/ {print $2,$6,$7,substr($0,115);}'
AppHandl AnchID StmtUID StartTime LastRefTime
76 44 1 Sat Jan 7 17:57:35 2006 Sat Jan 7 17:57:35 2006

Use AnchID and StmtUID to identify SQL statement


Use AppHandl to identify application

Find non-committing transactions


Use the first and last LSN (log sequence number) of the transaction
db2flsn executable can be used to identify the log file for a specific lsn.
db2pd -trans -db sample | awk '/Address|^0x/ { print $2,$9,$10;}'
AppHandl Firstlsn Lastlsn
76 0x0001801CF000 0x000202999600
44 0x000177000C00 0x000177000C00
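
Because an LSN is a byte address within the log, the distance between Firstlsn and Lastlsn gives a rough idea of how much log a transaction spans. A minimal sketch, with lsn_span as a hypothetical helper built on shell arithmetic:

```shell
# lsn_span FIRST LAST: number of log bytes between two LSNs given as 0x...
# strings. Assumption: LSNs are plain byte offsets, as described earlier.
lsn_span() {
    echo $(( $2 - $1 ))
}

lsn_span 0x0001801CF000 0x000202999600   # AppHandl 76: prints 2189207040
lsn_span 0x000177000C00 0x000177000C00   # AppHandl 44: prints 0
```

A large span on a transaction that has been open for a long time suggests it is holding on to log space without committing.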

Find biggest lockers


Use AppHandl and Locks fields of db2pd -transactions
db2pd -trans -db sample | awk '/^0x/ { print $5, $2}' | sort -rn
7 76
4 74
2 44


Monitoring OS: db2pd -osinfo

Operating System Information:
OSName:   AIX
NodeName: mymachine
Version:  5
Release:  2
Machine:  000ABCD123

CPU Information:
TotalCPU OnlineCPU ConfigCPU Speed(MHz) HMTDegree
4        4         4         1453       1

Physical Memory and Swap (Megabytes):
TotalMem FreeMem AvailMem TotalSwap FreeSwap
16384    10201   n/a      16384     16366

Virtual Memory (Megabytes):
Total Reserved Available Free
40960 n/a      n/a       38907

Shared Memory Information:
ShmMax      ShmMin ShmIds ShmSeg
68719476736 1      131072 0

Using db2pd to diagnose hangs


Useful DB2 tools for hangs

The db2pd tool
Purpose: To gather information quickly and non-intrusively from the DB2 engine

The db2cos tool
Purpose: To be called inline from DB2 code to collect information about problems

Latch tracking
Purpose: To track latch ownership

Snapshot monitoring
Purpose: To understand point-in-time status of queries or entities in the DB2 engine

DB2 trace
Purpose: To investigate possible movement in DB2 agents

Troubleshooting deadlocks and lock wait timeouts
For transient/infrequent deadlocks or lock wait timeouts
These can last a short period of time and need to be caught quickly
db2pdcfg -catch uses inline code to trigger an action immediately
Default (primary) action is to run the DB2 call out script (db2cos)

db2pdcfg -catch locktimeout count=1


Error Catch #1
Sqlcode:        0
ReasonCode:     0
ZRC:            -2146435004
ECF:            0
Component ID:   0
LockName:       Not Set
LockType:       Not Set
Current Count:  0
Max Count:      1
Bitmap:         0x4A1
Action:         Error code catch flag enabled
Action:         Execute sqllib/db2cos callout script
Action:         Produce stack trace in db2diag.log

Collecting call stacks in v9.1

Used to determine what DB2 agent is doing


Call stack collection has been sped up significantly in v9.1
Stack traces (trap files) collection is dissociated from binary dump files

Produce stack trace for all PIDs or chosen PID


db2pd -stack [all|<pid>]
Produces trap file(s) in the DIAGPATH directory

Produce dump file and stack trace for all PIDs or chosen PID
db2pd -dump [all|<pid>]


db2pd -latches option (v9.1)


Latch tracking is always-on in v9.1
db2pd -latches
Latches:
Address            Holder  Waiter Filename       LOC  LatchType
0x07800000203C40C8 1695804 0      sqldpool.C     529  SQLO_LT_SQLB_CLNR_PAUSE_CB__preventSuspendLatch
0x07800000203C4190 1695804 0      sqlbpool.C     2254 SQLO_LT_SQLB_PTBL__pool_table_latch
0x07800000203C5678 1695804 0      sqlbistorage.h 5169 SQLO_LT_SQLB_PTBL__ptfLatches

Holder is agent process ID

LatchType is a latch identifier

To group latches by holders and waiters


db2pd -latches group
Latch Holders:
Address            Holder  Filename     LOC  LatchType
0x07800000204A8A48 1695804 sqlbilatch.C 1150 SQLO_LT_SQLB_POOL_CB__writeLatch

Latch Waiters:
Address Waiter Filename LOC LatchType

Detecting hangs for specific applications

Take a few application snapshots a minute apart to determine what the status of the application is (db2pd -app: Status) and whether any work is being done (db2pd -agent: RowsRead/Wrtn)
It is useful to have turned on all of the monitor switches prior to a re-creatable or recurring problem scenario

Use db2pd -app to determine the status of an application
db2pd -app -db sample | awk '/Address|^0x/ { print $6 }'

If status is UOW Waiting, the hang is not occurring at the DB2 server
The client application should be investigated to find out what it is waiting for.
In a DPF environment, this may indicate a problem with another partition

If status is Executing and counters like rows-read/written are increasing, it is likely a performance issue

If status is Lock-wait then it is a locking/concurrency issue
The exception is the case when the application being waited on is in UOW Executing and making no progress

If status is Executing yet no counters are increasing, then the agent or agents servicing the application may be in an abnormal state
More diagnostics are needed

Monitoring locks: lock contention

Monitor for slowdowns

Important to figure out who is waiting for whom

Identify the lock owner to consider action of releasing the lock

db2pd -db sample -loc

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:00:26

Locks:
... TranHdl Lockname                   Type Mode Sts Owner ...
... 2       00020003000000040000000052 Row  ..X  G   2
... 3       00020003000000040000000052 Row  ..X  W   2

Look for the W status for waiters

Transaction 2 is holding the lock that transaction 3 is waiting on
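
Matching waiters to holders by hand gets tedious with many locks. The sketch below pairs them by lock name; lock_waiters is a hypothetical helper that expects "TranHdl Lockname Sts" triples already cut out of the -locks output, because the exact column positions depend on the DB2 level and on multi-word lock types.

```shell
# lock_waiters: read "TranHdl Lockname Sts" triples on stdin and report,
# for every waiter (Sts = W), a holder (Sts = G) of the same lock name.
# Lock names are forced to strings so long digit runs are never treated
# as numbers. If a lock has several holders, only the last one is shown.
lock_waiters() {
    awk '
        $3 == "G" { holder[$2 ""] = $1 }
        $3 == "W" { n++; waiter[n] = $1; wanted[n] = $2 "" }
        END {
            for (i = 1; i <= n; i++)
                if (wanted[i] in holder)
                    printf "TranHdl %s waits on TranHdl %s (lock %s)\n", waiter[i], holder[wanted[i]], wanted[i]
        }
    '
}
```

Fed the two rows of the example above, it reports that TranHdl 3 waits on TranHdl 2.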

db2pd -db sample -locks wait

Show locks with a wait status and their waiter (v8.2 FP9)

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 01:17:17

Locks:
... TranHdl Lockname                   Type Mode Sts Owner Dur HldCnt Att Rlse
... 2       000200040000000D0000000052 Row  .NS  W   3     1   0      0   0x0
... 4       00020003000000270000000052 Row  .NS  W   0     1   0      0   0x0
... 2       00020003000000270000000052 Row  ..X  G   2     1   0      8   0x40

Look for the G status for holders

Monitoring locks (contd)


Lockname <==> hex representation of the physical object that is being waited on
db2pd -db sample -loc showlocks
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:10:42
Locks:
... Lockname                   Type
... 000200030000001A0000000052 Row        TbspaceID 2 TableID 3 RecordID 0x1A
... 000200030000001F0000000052 Row        TbspaceID 2 TableID 3 RecordID 0x1F
... 53514C4332453036C8324ABC41 Internal P Pkg UniqueID 53514c43 32453036 Name c8324abc
... 53514C4332453036C8324ABC41 Internal P Pkg UniqueID 53514c43 32453036 Name c8324abc
... 0002000D000000050000000052 Row        TbspaceID 2 TableID 13 RecordID 0x5
... 00020003000000000000000054 Table      TbspaceID 2 TableID 3
... 0002000D000000000000000054 Table      TbspaceID 2 TableID 13

The showlocks suboption will expand the lockname into meaningful explanations

To determine who is holding a lock in your database
db2pd -database sample -locks -transactions -agents -file lock.txt
-agents will contain the Userid for the transaction handle that is holding a lock (status is G (granted))

Map lock info to a table name
Use TableID from Lockname or showlocks output
db2pd -tcbstats -db sample | awk '/Address|^0x/ { print $2,$3,$4}'
TbspaceID TableID TableName
0 18 SYSROUTINES
0 81 SYSROUTINEPROPERTI
2 4 DEPARTMENT

Monitoring buffer pool

Determine whether we are spending time flushing buffers due to space constraints or poor allocation of pools
This is needed to identify areas for tuning

db2pd -buffer -db sample
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:00:41
Bufferpools:
First Active Pool ID        1
Max Bufferpool ID           1
Max Bufferpool ID on Disk   1
Num Bufferpools             5

Address Id Name PageSz PA-NumPgs BA-NumPgs BlkSize NumTbsp PgsToRemov CurrentSz PostAlter SuspndTSCt
0x078000002034A8C0 1 IBMDEFAULTBP 4096 1000 0 0 3 0 1000 1000 0

Buffer pool statistics (v9.1)


db2pd -db sample -bufferpools
DatLRds (DatPRds) number of logical (physical) data page reads for this
bufferpool
Hit ratio for data pages given the above logical and physical reads
Same for index pages: IdxLRds, IdxPRds, HitRatio
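
The hit ratio can be recomputed from the logical and physical read counters. A minimal sketch: hit_ratio is a hypothetical helper, and the formula (1 - physical/logical) * 100 is the usual definition, which reproduces the 71.79% data-page figure in the sample output from 78 logical and 22 physical reads.

```shell
# hit_ratio LOGICAL PHYSICAL: percentage of logical reads that were
# satisfied without a physical read. Prints n/a when there were no reads.
hit_ratio() {
    awk -v l="$1" -v p="$2" 'BEGIN {
        if (l == 0) { print "n/a"; exit }
        printf "%.2f%%\n", (1 - p / l) * 100
    }'
}

hit_ratio 78 22     # data pages: prints 71.79%
hit_ratio 100 58    # index pages: prints 42.00%
```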

Bufferpool Statistics for all bufferpools (when BUFFERPOOL monitor switch is ON):
BPID DatLRds DatPRds HitRatio TmpDatLRds TmpDatPRds HitRatio IdxLRds IdxPRds HitRatio TmpIdxLRds TmpIdxPRds HitRatio
1    78      22      71.79%                         00.00%   100     58      42.00%   0                     00.00%

BPID DataWrts IdxWrts DirRds DirRdReqs DirRdTime DirWrts DirWrtReqs DirWrtTime
1    0        0       42     5         0         0       0          0

BPID AsDatRds AsDatRdReq AsIdxRds AsIdxRdReq AsRdTime AsDatWrts AsIdxWrts AsWrtTime
1    0        0          0        0          0        0         0         0

BPID TotRdTime TotWrtTime VectIORds VectIOReq BlockIORds BlockIOReq PhyPgMaps FilesClose NoVictAvl UnRdPFetch
1    104

Buffer pool statistics (v9.1) (cont)

Filtering bufferpools output by bufferpool ID
-bufferpools <bpID>
db2pd -db sample -bufferpools 4099

Address            Id   Name           PageSz PA-NumPgs BA-NumPgs BlkSize NumTbsp PgsToRemov CurrentSz PostAlter SuspndTSCt
0x07800000203C99A0 4099 IBMSYSTEMBP32K 32768  16        0         0       0                  16        16        0

Bufferpool Statistics for bufferpool 4099 (when BUFFERPOOL monitor switch is ON):

db2pd -pages (v9.1)

db2pd -db sample -pages
Pages for all bufferpools
db2pd -db sample -pages [<bpID>]

Monitor bufferpool behavior
Tells which pages are in the bufferpool
Use to determine what is in the bufferpool that is causing the hit ratio to be lower than you expect
Allows the user to check how many pages each object (table, index, etc.) has within any particular bufferpool
Similar to IDS's onstat -b option
Might help to detect a problem, such as an insufficient number of buffers in the buffer pool or high read-aheads

db2pd -pages: example (v9.1)

db2pd -db sample -page
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:01:01
Bufferpool Pages:
First Active Pool ID        1
Max Bufferpool ID           1
Max Bufferpool ID on Disk   1
Num Bufferpools             5

Pages for all bufferpools:
Address            BPID TbspaceID TbspacePgNum ObjID ObjPgNum ObjClass ObjType Dirty Prefetched
0x07800000204DD040 1    0         6            19    6        Perm     Index   N     N
0x07800000204DD0F0 1    0         0            19    0        Perm     LOBA    N     N
0x07800000204DD1A0 1    0         0            19    0        Perm     Index   N     N
0x07800000204DD250 1    0         1            19    1        Perm     Index   N     N
0x07800000204DD300 1    0         2            19    2        Perm     Index   N     N
0x07800000204DD3B0 1    0         0            1     0        Perm     Data    N     N
0x07800000204DD460 1    0         4            19    4        Perm     Index   N     N
0x07800000204DD510 1    0         0            19    0        Perm     Data    N     N
0x07800000204DDB40 1    0         1            87    1        Perm     Index   N     N
0x07800000204DDBF0 1    0         7            19    7        Perm     Index   N     N
0x07800000204DDCA0 1    0         8            19    8        Perm     Index   N     N
0x07800000204DDD50 1    0         9            19    9        Perm     Index   N     N
0x07800000204DDE00 1    0         10           19    10       Perm     Index   N     N
<snip>
Total number of pages: 80

Summary info for all bufferpools:

Monitoring for SQL errors: db2pdcfg -catch

Functionality moved from db2pd to db2pdcfg in v9.1
Purpose
allow the user to catch any sqlcode (and reason code), zrc or ecf codes (internal error codes)
capture the information needed to solve the error code

Primary action: execute the db2cos (callout script)
template db2cos file is located in sqllib/bin (v9.1)
db2cos may be altered to run any command (db2pd, OS or other) needed to solve the problem: default is db2pd -db $database in the "SQLCODE" section

Defaults
Up to 10 catch points simultaneously
"Error catch array is full. Use 'clear' suboption to clear an element"
Up to 255 invocations
Max Count: 255

Index Statistics: db2pd -tcbstats all

Useful for performance tuning
Numerous statistics are reported to provide characteristics about each index's use
Only database activation/deactivation will reset these statistics
db2pd -tcbstats all or db2pd -tcbstats index
TCB Index Stats:
TbspaceID TableID TableName SchemaNm ID RootSplits Scans KeyUpdates Merg
0         2       SYSTABLES SYSIBM   6  0          0     0          0
0         2       SYSTABLES SYSIBM   5  0          0     0          0

Scans
The number of scans against the index

Detect full-table scans vs index scans

Detect full-table scans for every table
Use db2pd -tcb -db <dbname>
db2 "select * from employee"
db2pd -tcb -db sample | awk '/TCB Table Stats/ { found =1} found==1 { print}' | grep -i employee | awk '{print "Scans: ", $3}'

Detect number of index scans
Use the 'TCB Index Stats' portion of the 'db2pd -tcb index -db <dbname>' output
db2 "CREATE INDEX LNAME ON EMPLOYEE (LASTNAME ASC)"
db2 "select * from employee"
db2pd -tcb index -db sample | awk '/TCB Index Stats/ { found =1} found==1 { print}' | grep -i employee | awk '$8 > 0 {print "Index Scans: ", $8}'

DB2 table access ratio

db2pd can help determine the access frequency of each table
Operations such as select, update, insert and delete
Reported by db2pd -db <dbname> -tcbstats

Identify tables with most inserts done to them
db2pd -db sample -tcbstats | awk '/TCB Table Stats/ { found =1} found==1 { print}' | awk '/^0x/ { print $9, $2}' | sort -rn | head -5
36 STAFF3
32 EMPLOYEE
20 PROJECT
7 SYSCOLUMNS
1 SYSUSERAUTH

Monitoring progress of transaction logging

By watching the Pages Written output, you can determine whether the log usage is progressing
db2pd -logs -db sample
Logs:
Current Log Number  4
Pages Written       464

Address            StartLSN       State      Size Pages Filename
0x000000022022FEB8 0x000000FA0000 0x00000000 1000 597   S0000000.LOG
0x000000022022FF78 0x000001388000 0x00000000 1000       S0000001.LOG
0x0000000220008E78 0x000001770000 0x00000000 1000       S0000002.LOG
0x0000000220A57F58 0x000001B58000 0x00000000 1000 1000  S0000003.LOG
0x0000000220A32598 0x000001F40000 0x00000000 1000 1000  S0000004.LOG

Monitor amount of log space consumed over the course of 10 minutes
db2pd -db SAMPLE -logs -repeat 60 10

Monitoring log usage (FP9 enhancements)


db2pd -logs has some new information since v8.2.2:
Logs:
Current Log Number            5
Pages Written                 846
Method 1 Archive Status       Success
Method 1 Next Log to Archive  5
Method 1 First Failure        n/a
Method 2 Archive Status       Success
Method 2 Next Log to Archive  5
Method 2 First Failure        n/a

Address            StartLSN       State      Size Pages Filename
0x000000023001BF58 0x000001B58000 0x00000000 1000 1000  S0000002.LOG
0x000000023001BE98 0x000001F40000 0x00000000 1000 1000  S0000003.LOG
0x0000000230008F58 0x000002328000 0x00000000 1000 1000  S0000004.LOG

Two problems can be identified with this output

Problem with archiving
If Archive Status is set to Failure, the most recent log archive failed
If First Failure is set, an ongoing archive failure is preventing logs from archiving
Log archiving is proceeding very slowly
Next Log to Archive will be behind Current Log Number (this can cause the log path to fill up completely)
Monitor Next Log to Archive compared to Current Log Number
If next log is 3 and current is 5, then logs 3 and 4 haven't been archived yet
Log 5 is the current log being written into
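
The comparison between Next Log to Archive and Current Log Number can be turned into a list of log files that are still waiting. A minimal sketch: unarchived_logs is a hypothetical helper, and the file names follow the S0000003.LOG pattern seen in the output above.

```shell
# unarchived_logs NEXT CURRENT: print the log files that are full but not
# yet archived, i.e. log numbers NEXT up to CURRENT - 1.
unarchived_logs() {
    n=$1
    while [ "$n" -lt "$2" ]; do
        printf 'S%07d.LOG\n' "$n"
        n=$(( n + 1 ))
    done
}

unarchived_logs 3 5    # prints S0000003.LOG and S0000004.LOG
```

A list that keeps growing across db2pd -logs samples points to slow or failing archiving.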


Monitoring Tablespaces and Containers

A single tablespace can be monitored
db2pd -tab[lespaces] <tablespaceID> -rep[eat] <numSecs>
db2 "create tablespace dms1 managed by database using (file 'tbspace1' 1M)"
db2pd -db sample -tab | grep DMS1 | awk '{print $2, $15}'
6 DMS1
tablespace ID is 6
db2pd -db sample -tab 6 | perl -ane 'if (/TotalPgs * UsablePgs/ .. /^$/) { print "$F[2] $F[3]\n" } '
UsablePgs UsedPgs
224 96
db2 "insert into staff3 select * from staff"
repeat
db2pd -db sample -tab 6 | perl -ane 'if (/TotalPgs * UsablePgs/ .. /^$/) { print "$F[2] $F[3]\n" } '
UsablePgs UsedPgs
224 160

Verifying isolation level

Current isolation level of a dynamic SQL statement
How can I tell what isolation level is being used?
db2pd -db sample -dynamic
Iso column in the Dynamic SQL Environments section

Dynamic SQL Environments:
Address            AnchID StmtUID EnvID Iso QOpt Blk
0x0780000020CEBC40 41                   CS
0x0780000020BB5D80 42                   CS
0x0780000020CE2FC0 235                  RR

db2pd -activestatements
EffISO column
0=RR,1=CS,2=UR and 3=RS
Will setting DB2_EVALUNCOMMITTED to ON help?
DB2_EVALUNCOMMITTED enables deferred locking
available if application is using cursor stability or read stability
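
The EffISO codes listed above can be translated when post-processing -activestatements output. A minimal sketch, with effiso_name as a hypothetical helper:

```shell
# effiso_name CODE: map the numeric EffISO value to its isolation level,
# using the 0=RR, 1=CS, 2=UR, 3=RS mapping given on this slide.
effiso_name() {
    case "$1" in
        0) echo RR ;;   # repeatable read
        1) echo CS ;;   # cursor stability
        2) echo UR ;;   # uncommitted read
        3) echo RS ;;   # read stability
        *) echo unknown ;;
    esac
}

effiso_name 1    # prints CS
```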


Monitoring memory usage

db2pd -memsets -mempools
reports statistics about DB2 Memory Sets and Memory Pools, which helps in understanding memory usage

Memory Sets:
Name  Address            Id       Size      Key        DBP Type Ov OvSize
DBMS  0x0780000000000000 19398699 56639488  0x59FE1161 0   0    Y  8241152
FMP   0x0780000010000000 72220781 245284864 0x0        0   2    N  0
Trace 0x0770000000000000 28704896 134906824 0x59FE1174 0   -1   N  0

Memory Pools:
Address            MemSet PoolName Id Overhead LogSz   LogUpBnd  LogHWM  PhySz   PhyUpBnd  PhyHWM  Bnd BlkCnt CfgParm
0x0780000000001230 DBMS   fcmrqb   79 193888   1290312 1953864   1290312 1507328 1966080   1507328 Ovf 0      n/a
0x07800000000003C0 DBMS   eduah    72 2496     112032  112064    112032  114688  114688    114688  Ovf 0      n/a
0x07800000100003C0 FMP    undefh   59 48000    737400  245163840 737400  786432  245170176 786432  Phy 0      n/a

Monitoring memory usage (cont)

db2pd -memblocks
Reports all memory blocks in DBMS set (list)
Followed by the sorted 'per-pool' output
Memory blocks sorted by size for ostrack pool:
PoolID PoolName TotalSize(Bytes) TotalCount LOC  File
57     ostrack  5160048                     3047 698130716
57     ostrack  240048                      3034 698130716
57     ostrack  240                         2983 698130716
57     ostrack  80                          2999 698130716
57     ostrack  80                          2970 698130716
57     ostrack  80                          3009 698130716
Total size for ostrack pool: 5400576 bytes

Final section sorts the consumers of memory for the entire set
All memory consumers in DBMS memory set:
PoolID PoolName TotalSize(Bytes) %Bytes TotalCount %Count LOC  File
57     ostrack  5160048          71.90             0.07   3047 698130716
50     sqlch    778496           10.85             0.07   202  2576467555
50     sqlch    271784           3.79              0.07   260  2576467555
57     ostrack  240048           3.34              0.07   3034 698130716
50     sqlch    144464           2.01              0.07   217  2576467555
69     krcbh    73640            1.03              0.36   547  4210081592

Report memory blocks for private memory on UNIX and Linux
db2pd -memb pid=159770

Monitoring utilities: db2pd -utilities

Utilities:
ID Type   DBName StartTime                NumPhase CurPhase Desc
2  BACKUP SAMPLE Wed 12:35:00 Apr 28 2004 1        1        offlinedb

Progress:
ID PhaseNum StartTime                CompletedWork  TotalWork
2  1        Wed Apr 28 12:35:42 2004 22782661 bytes 24303325 bytes

Utility types: BACKUP, RUNSTATS, REORG, RESTORE, CRASH_RECOVERY, ROLLFORWARD_RECOVERY, LOAD, RESTART_RECREATE_INDEX
db2 backup db sample
db2 restore db sample

Use to monitor the progress of utilities
Determine whether to throttle or unthrottle the BACKUP or RUNSTATS utility (using SET UTIL_IMPACT_PRIORITY)

Monitoring Table Reorgs: db2pd -reorgs

Table reorg output including tablespace id, table id, table name, phases, counters, type (offline/online), start time, and end time is reported
Table Reorg Stats:
TbspaceID TableID TableName MaxPhase Phase   CurCount MaxCount Type
2         2       PDTEST    2        Replace 0        2        Offline

Phase field (only applies to offline table reorganization)
The phase of the table reorganization: Sort, Build, Replace, InxRecreat

Status field (only applies to online table reorganization)
The status of an online table reorganization: Started, Paused, Stopped, Done, Truncat
"Done" status indicates that the reorg utility has completed

Completion field
Success indicator for the table reorganization. Possible values:
0. The table reorganization completed successfully
-1. The table reorganization failed

Recovery: db2pd -recovery

Recovery:
Recovery Status  0x00000401
Current Log      S0000000.LOG
Current LSN      000000BB800C
Job Type         ROLLFORWARD RECOVERY
Job ID           2
Job Description  Database Rollforward Recovery
Invoker Type     User
Total Phases     2
Current Phase    1

Progress:
PhaseNum Description StartTime               CompletedWork TotalWork
1        Forward     Wed May 5 10:48:09 2004 0 bytes       Unknown
2        Backward    NotStarted              0 bytes       Unknown

Monitoring recovery
db2pd -recovery shows several counters to make sure recovery is progressing:
Current Log and Current LSN provide the log position
CompletedWork counts the number of bytes completed thus far

Figuring out which application is using up your tablespace

Identify number of Inserts for table (here, temp table TEMP1)
db2pd -tcbstats
TCB Table Stats:
Address            TbspaceID TableID TableName SchemaNm Scans ObjClass UDI DataSize Inserts
0x000000022094AA58 4         2       TEMP1     SESSION        Temp     0   1        124

Map to tablespace 4 in db2pd -tablespaces output:
Tablespaces:
Address            Id Type Content AS AR PageSize ExtentSize Auto Prefetch BufID BufIDDisk
0x0000000220942F80 4  DMS  UsrTmp  No No 4096     32         Yes  32       1     1

Containers:
Address            TspId ContainNum Type TotalPages UseablePgs StripeSet Container
0x0000000220377CE0 4     0          File 10000      9952                 /export/home/jmcmahon/tempspace2a

Notice the space filling up by watching UseablePgs vs. TotalPages

Figuring out which application is using up your tablespace (2)

Identify the dynamic SQL statement using a table called TEMP1
db2pd -db sample -dyn
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:13:06

Dynamic Cache:
Current Memory Used          1072198
Total Heap Size              1271398
Cache Overflow Flag          0
Number of References         7540
Number of Statement Inserts  3981
Number of Statement Deletes  3924
Number of Variation Inserts  2459
Number of Statements         57

Dynamic SQL Statements:
Address            AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x0000000220A08C40 78     1       2      2      3      2      declare global temporary table temp1 (c1 char(6)) not logged
0x0000000220A8D960 253    1       1      1      24     24     insert into session.temp1 values(TEST)

Figuring out which application is using up your tablespace (3)

Map this to -app output to identify the application
db2pd -app -db sample
Applications:
Address AppHandl [nod-index] NumAgents CoorPid Status C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
0x0000000200661840 501 [000-00501] 1 11246 UOW-Waiting 0 0 253 1 *LOCAL.jmcmahon.050202160426

db2pd -agent output will show the number of rows written as verification
Address AppHandl [nod-index] AgentPid Priority Type State ClientPid Userid ClientNm Rowsread Rowswrtn LkTmOt DBName
0x0000000200698080 501 [000-00501] 11246 0 Coord Inst-Active 26377 jmcmahon db2bp 100999 230448 NotSet SAMPLE

Monitoring implicit temporary table space

Steps are different for the implicit temporary table
Use db2pd -tcbstats to identify tables with large numbers of inserts
TCB Table Information:
Address            TbspaceID TableID PartID MasterTbs MasterTab TableName          SchemaNm  ObjClass DataSize ...
0x0780000020CC0D30 1                 n/a                        TEMP (00001,00002) <30> <JMC Temp     2470     ...
0x0780000020CC14B0 1                 n/a                        TEMP (00001,00003) <31> <JMC Temp     2367     ...
0x0780000020CC21B0 1                 n/a                        TEMP (00001,00004) <30> <JMC Temp     1872     ...

TCB Table Stats:
Address            TableName          Scans UDI PgReorgs NoChgUpdts Reads FscrUpdates Inserts ...
0x0780000020CC0D30 TEMP (00001,00002) 0     0   0        0          0     0           43219   ...
0x0780000020CC14B0 TEMP (00001,00003) 0     0   0        0          0     0           42485   ...
0x0780000020CC21B0 TEMP (00001,00004) 0     0   0        0          0     0           0       ...

Notice the large number of inserts for implicit temporary tables
tables with the naming convention "TEMP (TbspaceID, TableID)"

Identify the application doing the work
values in the SchemaNm column have a naming convention of <AppHandl><SchemaNm>

Monitoring implicit temporary table space (cont)

Map that info to the used space for table space 1
Use db2pd -tablespaces
Notice the UsedPgs vs the UsablePgs in the table space statistics
Tablespace Configuration:
Address            Id Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC NumCntrs MaxStripe LastConsecPg Name
0x07800000203FB5A0 1  SMS  SysTmp  4096   32       Yes  320      1     1         On  10       0         31           TEMPSPACE1

Tablespace Statistics:
Address            Id TotalPgs UsablePgs UsedPgs PndFreePgs FreePgs HWM State      MinRecTime NQuiescers
0x07800000203FB5A0 1  6516     6516      6516                           0x00000000 0          0

Tablespace Autoresize Statistics:
Address            Id AS AR InitSize IncSize IIP MaxSize LastResize LRF
0x07800000203FB5A0 1  No No 0        0       No  0       None       No

Identify the application handles 30 and 31
Seen in the -tcbstats output
db2pd -app

Map this to the Dynamic SQL using db2pd -dyn

db2pd -fmp command

Returns information about the process in which the fenced routines are executed
db2pd -fmp
FMP:
Pool Size:      1
Max Pool Size:  250
Keep FMP:       YES
Initialized:    YES
Trusted Path:   /home/dabrashk/sqllib/function/unfenced
Fenced User:    nobody

FMP Process:
Address            FmpPid  Bit Flags      ActiveThrd PooledThrd Active
0x07800000007A5FE0 6455492 64  0x00000000 0          0          No

Active Threads:
Address FmpPid EduPid ThreadId
No active threads.

Pooled Threads:
Address FmpPid ThreadId
No pooled threads.

db2pd -fmp (contd)


Useful flags (StateFlags field)

0x00000001 - JVM initialized


0x00000002 - Is threaded
0x00000004 - Used to run federated wrappers
0x00000008 - Used for Health Monitor
0x00000010 - Marked for shutdown and will not accept new tasks
0x00000020 - Marked for cleanup by db2sysc
0x00000040 - Marked for agent cleanup
0x00000100 - All ipcs for the process have been removed
0x00000200 - .NET runtime initialized
0x00000400 - JVM initialized for debugging
0x00000800 - Termination flag

ActiveTh - Number of active threads running in the fmp process.


PooledTh - Number of pooled threads held by the fmp process.
Active - Active state of the fmp process. Values are Yes or No.
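
The Flags column is a bitmask, so a value may combine several of the flags listed above. A minimal sketch of expanding it: decode_fmp_flags is a hypothetical helper, and the bit meanings are copied from this slide.

```shell
# decode_fmp_flags HEX: print the meaning of every flag bit set in HEX.
decode_fmp_flags() {
    flags=$(( $1 ))
    while read -r bit name; do
        # Test one known bit at a time against the value.
        if [ $(( $flags & $bit )) -ne 0 ]; then
            echo "$name"
        fi
    done <<'EOF'
0x00000001 JVM initialized
0x00000002 Is threaded
0x00000004 Used to run federated wrappers
0x00000008 Used for Health Monitor
0x00000010 Marked for shutdown and will not accept new tasks
0x00000020 Marked for cleanup by db2sysc
0x00000040 Marked for agent cleanup
0x00000100 All ipcs for the process have been removed
0x00000200 .NET runtime initialized
0x00000400 JVM initialized for debugging
0x00000800 Termination flag
EOF
}

decode_fmp_flags 0x00000003    # prints "JVM initialized" and "Is threaded"
```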

-fcm command improvements in v9.1

Use the new hwm option to see historical information about applications that consume large amounts of fast communication manager (FCM) resources
Retrieve high-watermark consumptions of FCM buffers and channels by applications since the start of the DB2 instance
The high-watermark consumption values of applications are retained even if they have disconnected from the database already

The output will now contain FCM channel usage statistics, including the high and low water mark values with respect to the number of channels used.

DB2 Troubleshooting and


Problem Determination
Resources

DB2 Monitoring and Troubleshooting: db2pd tool


http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r0011729.htm

Problem Determination Guide


http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.rn.doc/doc/c0023244.htm

What's new for V9.1: Troubleshooting and problem determination enhancements

http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.rn.doc/doc/c0023244.htm

DB2 Product Support site


http://www-306.ibm.com/software/data/db2/udb/support/index.html

DB2 APARs (Authorized Program Analysis Reports)


http://www-306.ibm.com/software/data/db2/udb/support/apars.html


Summary
Understanding requirements for PD and monitoring
Different monitoring scenarios
Tools available in v8.2 and v9.1
db2pd - Monitor and troubleshoot DB2 Universal Database Command
Options either added or extended for each release/fixpack
Proactive FFDC: catching errors

Questions


Appendix


Miscellaneous monitoring tools


Lock Timeout Report Tool

db2mc: The DB2 Monitoring Console - Open Source Project

Light-weight, web-based console for DB2 for Linux, UNIX and Windows

http://sourceforge.net/projects/db2mc

Help: http://sourceforge.net/docman/?group_id=211760

db2top: Single System View Monitor for DB2

Specifically designed for DPF environments

The user interface is character-based and built using the curses library

Unix only

http://dl.alphaworks.ibm.com/technologies/db2top/db2top.pdf


Troubleshooting lock timeouts

Lock Timeout Report Tool

Introduced in v9.5. Will be backported to v9.1 FP4 and v8.2 FP16

Lock timeouts and deadlocks happen very frequently in customer situations
The end result is that some application (unit of work) fails and gets rolled back, forcing the user to resubmit the work.
It is very desirable to avoid lock timeouts and deadlocks in a production environment.
The Lock Timeout Report Tool makes it possible to debug them and to make the application changes needed to avoid them.

Enabled by setting registry variable DB2_CAPTURE_LOCKTIMEOUT

To enable/disable lock timeout reporting

db2set DB2_CAPTURE_LOCKTIMEOUT=ON
db2set DB2_CAPTURE_LOCKTIMEOUT=


Troubleshooting lock timeouts (cont)

Lock timeout report

Lock in contention

Lock Requestor
Lock owner or representative

Lock name and type


Lock specifics, including row ID, table space ID, and table ID. Use this information to query the
SYSCAT.TABLES system catalog view to identify the name of the table.

There can be more than one lock owner: the first lock owner is the representative for other lock owners

Lock timeout report files

When the lock timeout reporting function is active and a lock timeout occurs, the report file is generated by the agent that receives the lock timeout error

The report is stored in a file using the following name format:


db2locktimeout.par.AGENTID.yyyy-mm-dd-hh-mm-ss, where

par is the database partition number. In non-partitioned database environments, par is set to 0.
AGENTID is the agent ID.
yyyy-mm-dd-hh-mm-ss is the time stamp, consisting of the year, month, day, hour, minute, and second.
An example of a lock timeout report file name is
/home/juntang/sqllib/db2dump/db2locktimeout.000.4944050.2006-08-11-11-09-43.
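
The naming format above can be split back into its three parts when many report files need to be sorted or filtered. This is a minimal sketch that assumes the file name follows the documented db2locktimeout.par.AGENTID.yyyy-mm-dd-hh-mm-ss layout exactly. The function name parse_lock_timeout_name is a hypothetical helper, not a DB2 utility.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# db2locktimeout.<par>.<AGENTID>.<yyyy-mm-dd-hh-mm-ss>, as described on the slide.
_REPORT_NAME = re.compile(
    r"^db2locktimeout\.(?P<par>\d+)\.(?P<agent>\d+)\."
    r"(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})$"
)


def parse_lock_timeout_name(path):
    """Return (partition number, agent ID, timestamp) for a report file path."""
    name = PurePosixPath(path).name
    match = _REPORT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a lock timeout report file name: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y-%m-%d-%H-%M-%S")
    return int(match["par"]), int(match["agent"]), stamp


if __name__ == "__main__":
    example = (
        "/home/juntang/sqllib/db2dump/"
        "db2locktimeout.000.4944050.2006-08-11-11-09-43"
    )
    print(parse_lock_timeout_name(example))
```

The example path is the one given above, without the sentence-ending period.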


Monitoring STMM

Tune a system from an out-of-the-box configuration to near-optimal memory usage in an hour or less
Retrieve the current size of a buffer pool set to AUTOMATIC

db2pd -database MYDB1 -bufferpools

See CurrentSz column

Monitor STMM bufferpool changes

db2diag -g "message:=Altering bufferpool" db2diag.log

Monitor DB configuration changes

db2diag -node 1 -g "changeevent:=STMM CFG DB" db2diag.log

Tool to parse the STMM log files

parseStmmLogFile.pl <log file> <database name> <options>

http://www.ibm.com/developerworks/db2/library/techarticle/dm-0708naqvi/


Monitoring memory usage: db2mtrk

Provides a complete report of memory status for instances, databases and agents
Outputs the following memory pool allocation information:
Current size
Maximum size (hard limit)

Largest size (high water mark)


Type (identifier indicating function for which memory will be used)
Agent who allocated pool (only if the pool is private)

The "Other Memory" reported is the memory associated with the overhead of operating the database management system


db2top tool

db2top: monitoring tool

Single System View Monitor for DB2

Specifically designed for DPF environments

The user interface is character-based and built using the curses library

Unix only
Uses the DB2 snapshot monitoring APIs to retrieve data

Uses both global as well as partition-specific monitoring information to provide

aggregation
quick drill-down capabilities

db2top info

db2top -h

http://dl.alphaworks.ibm.com/technologies/db2top/db2top.pdf

db2top configuration file: .db2toprc

Used to setup parameters at initialization time

Location is determined by $DB2TOPRC

Default is current directory (and home directory if not found)

Type w in db2top to generate resource configuration file


db2top Command Options

-n specifies the node to attach to.


-d specifies the database to monitor.
-u specifies the DB2 username used to access the database.
-p specifies the DB2 password.
-V specifies the default schema used in explains.
-i specifies the delay between screen updates.
-k specifies whether to display actual or delta values.
-R Reset snapshot at startup.
-P <number>. Snapshot issued on current or partition number.
-x specifies whether to display additional counters on session and appl screens
(might run slower on session).
-b tells db2top to run in background mode.
-a specifies only active queries will be displayed.
-C Runs db2top in snapshot collector mode, raw snapshot data is saved in
<db2snap-<Machine>.bin>.
-f <file> </pattern> <+offset>. Run db2top in replay mode when snapshot data has
been previously collected in <file>. offset will jump to a certain point in time
in the file. It can be expressed in seconds (+10s), minutes (+10m) or hours
(+10h). /pattern will analyse the file and display at which offsets matches appear.
pattern can be specified as a regular expression.
-m Will limit duration of db2top in minutes for -b and -C.
-o <outfile>. Outfile for -b option.
-h short help.
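
The replay offset accepted by -f is written as a plus sign, a number, and a unit (s, m or h). As an illustration of that notation only, the sketch below converts such an offset into seconds. The helper name offset_to_seconds is hypothetical, and how db2top itself parses the value is not shown in this material.

```python
import re

# Seconds per unit for the three suffixes named in the option description.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def offset_to_seconds(offset):
    """Convert an offset such as '+10s', '+10m' or '+10h' into seconds."""
    match = re.fullmatch(r"\+(\d+)([smh])", offset)
    if match is None:
        raise ValueError(f"unsupported offset: {offset!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for text in ("+10s", "+10m", "+10h"):
        print(text, "->", offset_to_seconds(text), "seconds")
```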


db2top Interactive Commands

d Goto database screen
l Goto sessions screen
a Goto application details for agent
G Toggle between all partitions and current partitions.
P Select db partition on which to issue snapshot.
t Goto tablespaces screen.
T Goto tables screen.
b Goto bufferpools screen.
D Goto the dynamic SQL screen.
m Display memory pools.
s Goto the statements screen.
U Goto the locks screen.
p Goto the partitions screen.
H Goto the history screen (experimental).
f Freeze screen.
W Watch mode for agent_id, os_user, db_user, application or netname.
/ Enter expression to filter data.
<|> Move to left or right of screen.
z|Z Sort on ascending or descending order.
c This option allows to change the order of the columns displayed on the screen.
S Run native DB2 snapshot.
L Allows to display the complete query text from the SQL screen.
R Reset snapshot data.
i Toggle idle sessions on/off.
k Toggle actual vs delta values.
g Toggle graph on/off.
X Toggle extended mode on/off.
C Toggle snapshot data collector on/off.
V Set default explains schema.
O Display session setup.
w Write session settings to .db2toprc.
q Quit db2top.


db2top example

Running db2top monitoring utility in interactive mode in a DPF environment

db2top -d TEST -n mynode -u user -p passwd -V skm4 -B -i 1


Command parameters are as follows:
-d TEST --> database name
-n mynode --> node name
-u user --> user id
-p passwd --> password
-V skm4 --> Schema name
-B --> Bold enabled
-i 1 --> Screen update interval: 1 second

Bufferpool snapshot

b command

+--------------+------------+------------+------------+-----------+
|              |         25%|         50%|         75%|       100%|
|Hit Ratio%    |--------------------------------------------------|
+--------------+--------------------------------------------------+

Bufferpool      Delta        Delta        Hit     Async   Delta
Name            l_reads/s    p_reads/s    Ratio%  Reads%  Writes/s
--------------- ------------ ------------ ------- ------- -----------
IBMDEFAULTBP           23015          241  98.95% 100.00%         212
IBMSYSTEMBP16K             0            0   0.00%   0.00%           0
IBMSYSTEMBP32K             0            0   0.00%   0.00%           0
IBMSYSTEMBP4K              0            0   0.00%   0.00%           0
IBMSYSTEMBP8K              0            0   0.00%   0.00%           0
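
The Hit Ratio% column can be cross-checked against the logical and physical read counts. The sketch below uses the conventional buffer pool hit ratio formula, (1 - physical reads / logical reads) * 100. That db2top uses exactly this formula is an assumption, although it reproduces the 98.95% shown for IBMDEFAULTBP. The function name hit_ratio_percent is hypothetical.

```python
def hit_ratio_percent(logical_reads, physical_reads):
    """Share of logical reads served without a physical read, in percent.

    Returns 0.0 when there were no logical reads, which is how the idle
    buffer pools appear in the snapshot.
    """
    if logical_reads == 0:
        return 0.0
    return (1 - physical_reads / logical_reads) * 100


if __name__ == "__main__":
    # Values taken from the IBMDEFAULTBP row of the snapshot.
    print(f"{hit_ratio_percent(23015, 241):.2f}%")
```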


Bufferpool performance

Determine performance of buffer pools

In real time and over time interval

Use performance analysis option of db2top (-A) for bufferpools

db2top -d sample -A -b b

--- Top twenty performance report for 'Bufferpools' between 13:19:52 and 13:20:18
--- Sort criteria 'Pages_VctIOs/s'

Rank Bufferpool_Name                Percentage  fromTime toTime    sum(Pages_VctIOs/s)
---- ------------------------------ ----------- -------- --------- ------------------------------
   1 IBMDEFAULTBP                     100.0000% 13:19:52 13:20:18  18
   2 IBMSYSTEMBP8K                      0.0000% 13:19:52 13:20:18  0
   3 IBMSYSTEMBP4K                      0.0000% 13:19:52 13:20:18  0
   4 IBMSYSTEMBP16K                     0.0000% 13:19:52 13:20:18  0
   5 IBMSYSTEMBP32K                     0.0000% 13:19:52 13:20:18  0

--- Performance report, breakdown by 300 seconds
fromTime sum(Pages_VctIOs/s)            Percentage
-------- ------------------------------ ----------
         18                             100.0000%

--- Performance report, breakdown by 0.5 hour
fromTime sum(Pages_VctIOs/s)            Percentage
-------- ------------------------------ ----------
         18                             100.0000%

Top Five in 300 seconds interval

+----------------------------------------------+
|Rank|Percentage|Bufferpool_Name               |
|   1| 100.0000%|IBMDEFAULTBP                  |
|   2|   0.0000%|IBMSYSTEMBP8K                 |
|   3|   0.0000%|IBMSYSTEMBP4K                 |
|   4|   0.0000%|IBMSYSTEMBP16K                |
|   5|   0.0000%|IBMSYSTEMBP32K                |
+----------------------------------------------+

Top Five in 0.5 hour interval

+----------------------------------------------+
|Rank|Percentage|Bufferpool_Name               |
|   1| 100.0000%|IBMDEFAULTBP                  |
|   2|   0.0000%|IBMSYSTEMBP8K                 |
|   3|   0.0000%|IBMSYSTEMBP4K                 |


Bufferpool performance (cont)

db2top will report in real time

data hit ratio for each bufferpool

hit ratio for all tablespaces

logical/physical reads/writes per second per bufferpool

efficiency of the prefetch (% of unread prefetch pages)

whether block based bufferpools are used

overall temp/data/index hit ratio

db2top collector mode

Gather a lot of data at once and replay it back later

db2top -C -d sample
Start collection
Answer N to create named pipe question
db2top -f db2snap-sample-AIX64.bin -d sample
Replay collection in another window
db2top does not need to attach to the DB2 instance in replay mode

Convenient for remote monitoring


Limit the content and size of the stream file

specify any of the suboptions available to the -C switch


db2top -C b -d sample


db2top: Bottleneck analysis

Type B when in interactive mode


+--------------+------------+------------+------------+-----------+
|              |         25%|         50%|         75%|       100%|
|wait lock ms  |                                                  |
|sort ms       |                                                  |
|bp r/w ms     | ------------------------------------------------ |
|async r/w ms  | ----------------------                           |
|pref wait ms  | ---------                                        |
|dir r/w ms    |                                                  |
+--------------+--------------------------------------------------+

    Server       Top      Resource Resource         Application
    Resource     Agent    Usage    Value            Name
    ------------ -------- -------- ---------------- --------------------
-=> Cpu          N/A            0% 0                N/A
 => SessionCpu   1223       99.87% 4.844            db2bp
 => IO r/w       1223      100.00% 4894             db2bp
 => Memory       1223       50.00% 384.0K           db2bp
 => Locks        N/A            0% 0                N/A
 => Sorts        N/A            0% 0                N/A
 => Sort Time    N/A            0% 0                N/A
 => Log Used     1223      100.00% 2.1M             db2bp
 => Overflows    N/A            0% 0                N/A
 => RowsRead     1223      100.00% 8960             db2bp
 => RowsWritten  1223      100.00% 8960             db2bp
 => TQ r/w       N/A            0% 0                N/A
 => MaxQueryCost N/A            0% 0                N/A


Overview of new db2pd features in v9.1

db2pdcfg tool

db2pdcfg tool new in v9.1

Configures the DB2 database for problem determination behavior
Sets flags in the DB2 database memory sets to influence the database system behavior for problem determination purposes
db2pdcfg -help
Clear separation between db2pd and db2pdcfg tools
db2pd is the read-only tool
It will never write to DB2 shared memory or change anything
db2pdcfg is the read-write tool
Sets and displays parameters used

db2pdcfg will be substantially extended in post-v9.1 releases


db2cos tool in v9.1

DB2 call-out script improvements in v9.1


In v9.1, importance of db2cos for PD is significantly increased
New cases of automatic invocation
Panic, trap, segmentation violation or exception
Configurable by db2pdcfg
ON by default
Diagnostic dumps
Configurable by db2pdcfg
OFF by default

Addition of Windows OS support: db2cos.bat


Standardized place for db2cos script
No need to copy file from sqllib/cfg
$HOME/sqllib/bin on Unix
%DB2PATH%\bin on Windows

Standardized place for db2cos output files
Directory specified by the DIAGPATH database manager configuration parameter


DB2 call-out script improvements in v9.1


Standardized header for db2cos report files
2006-09-07-01.32.32.481578
PID     : 7057616              TID  : 1           PROC : db2cos
INSTANCE: dabrashk             NODE : 0           DB   : LDSORTDB
APPHDL  :                      APPID: *LOCAL.dabrashk.060907053224
FUNCTION: oper system services, sqloEDUCodeTrapHandler, probe:999
EVENT   : Invoking /home/dabrashk/sqllib/bin/db2cos from oper system services sqloEDUCodeTrapHandler
Trap Caught
Instance dabrashk uses 64 bits and DB2 code release SQL09010

<detailed db2pd output follows>

Speed and scalability improvement for large customer systems


db2cos output files are named db2cosXXXYYY.ZZZ, where XXX is the process ID
(PID), YYY is the thread ID (TID) and ZZZ is the database partition number (or 000 for
single partition databases)
Necessitated by db2pdcfg catch feature (especially for catching errors)


Diagnosing traps

Call-out script (db2cos) trap output (v9.1)


Increased control over the set of diagnostic information produced when the database
manager encounters a panic, trap, exception, or segmentation violation

In such situations, the db2cos script is now automatically run

DB2 call-out script is executed on traps

search for "TRAP" in the db2cos script

To disable generation of the db2cos report in trap scenarios

db2pdcfg -cos off

Output goes to the db2cos<pid><tid>.<node> file

db2cos48621401.0

To print the status

db2pdcfg -cos status

Contains output of the db2pd command (all options)

db2pd -inst OR db2pd -db $database -inst

You can edit the db2cos script to collect more or less information


Call-out script (db2cos) trap output (v9.1)


START and STOP are logged into db2diag.log

db2diag -g "funcname=pdInvokeCalloutScript"

START : Invoking /home/dabrashk/sqllib/bin/db2cos from oper system services sqloEDUCodeTrapHandler

To specify the number of times to execute db2cos during a trap

db2pdcfg -cos count=<count>
default is 255

Specify how long to sleep between checks of the size of the output file generated by db2cos

db2pdcfg -cos sleep=<numsec>
default is 3 seconds

Specify the timeout used when checking whether the output file generated by db2cos is still growing in size

db2pdcfg -cos timeout=<numsec>
default is 30 seconds

Instruct the database manager to execute db2cos when it receives the SQLO_SIG_DUMP signal (db2pd -dump; kill -36 <agentPID> on AIX)

db2pdcfg -cos SQLO_SIG_DUMP


db2pd additions in v9.1


Overview of new db2pd features for DB2 v9

PD Infrastructure enhancements in v9.1


db2pdcfg tool
db2cos tool improvements

Improvements for diagnosing traps

v9.1 db2pd additions

Troubleshooting and monitoring using db2pd


db2pd additions in v9.1


New commands
-latches
-fmp
-pages
-memblocks
-dump

New options
-bufferpools command: Bufferpool ID that contains the page
-fcm command: high water mark option (hwm)

db2pd usability improvements


db2pd usability improvements


Agents

v8.2: -agents [db=<database>] [ [agent=<agentid>] | [application=<appid>] ]

v9.1: -agents [db=<database>] [<AgentId> | [app=<AppHandl>]]

Applications

v8.2: -applications [ [application=<appid>] | [agent=<agentid>] ]

v9.1: -applications [<AppHandl> | [agent=<AgentId>]]

Transactions

v8.2: -transactions [tran=<tranhdl>] [app=<apphdl>]

v9.1: -transactions [<TranHdl> | [app=<AppHandl>]]

Locks

v8.2: -locks [tran=<tranhdl>] [showlocks] [wait]

v9.1: -locks [<TranHdl>] [showlocks] [wait]

Table Control Block Stats

v8.2: -tcbstats [all|index] [tbspaceid=<tbspaceid> [tableid=<tableid>]]

v9.1: -tcbstats [all|index] [<TbspaceID> [<TableID>]]

Tablespaces/Containers

v8.2: -tablespaces [group] [tablespace=<tablespace id>]

v9.1: -tablespaces [<Tablespace ID>] [group]


Backup slides


db2pd -transactions states


-transactions State field
FREE - Free State
READ - Read, no log record written
WRITE - Log record written
COMMIT - In committing state
ABORT - In aborting state
ABORTDL - In aborting state, needs to be aborted on datalink file servers
SAVEPNT - In rollback to savepoint
PREP - Prepared transaction
HCOMT - Heuristically committed
HABRT - Heuristically rolled back
HAING - Heuristically rolling back, used by recovery if further rollback processing is required
FRG - The transaction is forgotten
REPAIR - Transaction needs repair due to I/O errors on tablespaces it uses or PIT tablespace rollforward
REUSE - Federated transaction has reached the beginning of federated two-phase-commit processing. If no other log record follows, this signals that prepare is not fully finished and the transaction needs to be rolled back at resync time.


Mapping Application ID to IP address and port
A TCP/IP-generated application ID is composed of three sections (v8.2)

IP address represented as a 32-bit number (8 hexadecimal chars)

port number (4 hexadecimal characters)

unique identifier for the instance of this application

When the IP address or port number begins with 0-9, that first character is changed to G-P respectively.
For example, "0" is mapped to "G", "1" is mapped to "H", and so on.
The application ID AC10150C.NA04.006D07064947 is interpreted as follows:

The IP address remains AC10150C, which translates to 172.16.21.12.

The port number is NA04. The first character is "N", which maps to "7". Therefore, port number
is 7A04, which translates to 31236 in decimal form.

The IP address and port can be used with lsof to find out which remote application is
using the application ID
AC10150C.NA04.006D07064947 = IP Address . Port . Generated ID
