
Introduction

In my last article, I discussed how to install Oracle's Statspack utility, a free tool for
monitoring your Oracle database instance. That article also covered how to generate a
Statspack report. This article will help you interpret much of the information you will find
in a Statspack report.

The Statspack Header


The beginning of the statspack report shows you some basic information about your
instance including the database name, instance name, DB ID, version, host and the start
and end times of the snapshots used in your report. Here is an example:
STATSPACK report for

DB Name         DB Id    Instance     Inst Num Release     Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
ORCL          2586436430 ORCL                1 9.2.0.4.0   NO      localhost

            Snap Id     Snap Time      Sessions Curs/Sess Comment
            ------- ------------------ -------- --------- -------------------
Begin Snap:    4873 13-Dec-05 05:00:05      110      37.4
  End Snap:    4875 13-Dec-05 07:00:04      651     203.7
   Elapsed:              119.98 (mins)
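
As a quick sanity check, the Elapsed figure can be recomputed from the two snapshot times. The following is a minimal Python sketch, not part of Statspack itself; the function name elapsed_minutes is made up for this illustration, and the timestamp format is assumed from the sample header above (English month abbreviations).

```python
from datetime import datetime

# Timestamp layout assumed from the sample header, e.g. "13-Dec-05 05:00:05".
SNAP_FORMAT = "%d-%b-%y %H:%M:%S"


def elapsed_minutes(begin_snap: str, end_snap: str) -> float:
    """Return the number of minutes between two snapshot timestamps."""
    begin = datetime.strptime(begin_snap, SNAP_FORMAT)
    end = datetime.strptime(end_snap, SNAP_FORMAT)
    return (end - begin).total_seconds() / 60


if __name__ == "__main__":
    # Snap times copied from the sample report header.
    minutes = elapsed_minutes("13-Dec-05 05:00:05", "13-Dec-05 07:00:04")
    print(f"Elapsed: {minutes:.2f} (mins)")
```

Knowing the exact interval matters because every "per second" figure later in the report is an average over this window.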

Cache Sizes
The next section, Cache Sizes, shows you some of your instance settings including:
Buffer Cache (DB_CACHE_SIZE), Standard Block Size (DB_BLOCK_SIZE), Shared
Pool Size (SHARED_POOL_SIZE), and Log Buffer (LOG_BUFFER). These are all
instance parameters set in your spfile/pfile (note that DB_BLOCK_SIZE is fixed when
the database is created):
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 3,008M Std Block Size: 8K
Shared Pool Size: 1,920M Log Buffer: 10,240K

Load Profile
The "Load Profile" section shows you the load on your instance per second and per
transaction. You can compare this section between two Statspack Reports to see how the
load on your instance is increasing or decreasing over time.

• Redo Size & Block Changes Increase: If you see an increase here, then more
DML statements are taking place (meaning your users are doing more INSERTs,
UPDATEs, and DELETEs than before).

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:            352,535.71              8,517.66
              Logical reads:            202,403.30              4,890.29
              Block changes:              2,713.47                 65.56
             Physical reads:                 44.22                  1.07
            Physical writes:                 27.46                  0.66
                 User calls:                787.32                 19.02
                     Parses:                301.40                  7.28
                Hard parses:                  0.05                  0.00
                      Sorts:                317.78                  7.68
                     Logons:                  0.10                  0.00
                   Executes:              2,975.84                 71.90
               Transactions:                 41.39

  % Blocks changed per Read:    1.34    Recursive Call %:    87.43
 Rollback per transaction %:   27.56       Rows per Sort:     7.22
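
Comparing two Load Profiles comes down to a percentage change per statistic. The sketch below shows one way to do that in Python. The "current" numbers are the per-second values from the sample above; the "baseline" numbers are hypothetical, invented purely for illustration, and the function names are not part of any Oracle tool.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change from an earlier per-second rate to a later one."""
    if before == 0:
        raise ValueError("cannot compute a percentage change from a zero baseline")
    return (after - before) / before * 100


def compare_load(baseline: dict, current: dict) -> dict:
    """Return the percentage change for every statistic present in both reports."""
    return {
        name: round(percent_change(baseline[name], current[name]), 2)
        for name in baseline
        if name in current
    }


# HYPOTHETICAL earlier report (made-up values for this example only).
baseline = {"Redo size": 300_000.00, "Block changes": 2_500.00}
# Per-second values taken from the sample Load Profile.
current = {"Redo size": 352_535.71, "Block changes": 2_713.47}

if __name__ == "__main__":
    for name, change in compare_load(baseline, current).items():
        print(f"{name}: {change:+.2f}%")
```

A rise in both Redo size and Block changes between two comparable time windows points to more DML activity, as described in the bullet above.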

Instance Efficiency Percentages


The "Instance Efficiency Percentages" section is very useful. It gives you an overview of
your instance health. Anytime you make instance parameter changes you should take a
look to see if this affects your instance efficiency in any way. Here is a description of
some of the fields (Note, as stated in the statspack report, your goal here is to have these
percentages be as close to 100% as possible):

• Buffer Nowait %: This is the percentage of time that the instance made a call to
get a buffer (all buffer types are included here) and that buffer was made available
immediately (meaning it didn't have to wait for the buffer...hence "Buffer
Nowait").

• Buffer Hit %: This means that when a request for a buffer took place, the buffer
was available in memory and physical disk I/O did not need to take place.

• Library Hit %: If your Library Hit percentage is low, it could mean that your
shared pool size is too small or that bind variables are not being used (or at
least not being used properly).

• Execute to Parse %: This is the formula used to get this percentage:

round(100*(1-parsevalue/executevalue),2)
So, if you run some SQL and it has to be parsed every time you execute it
(because no plan exists for this statement), then your percentage would be 0%. The
more often your SQL statement can reuse an existing plan, the higher your
Execute to Parse ratio is.

One way to increase your Execute to Parse ratio is to use bind variables. This allows
the same plan to be used for multiple SQL statements; the only thing that changes in
the SQL is the values supplied in your statement's WHERE clause. For
Java/JDBC programmers, that means using PreparedStatements as opposed to
regular Statements.

• Parse CPU to Parse Elapsd %: Generally, this is a measure of how available
your CPU cycles were for SQL parsing. If this is low, you may see "latch free" as
one of your top wait events.

• Redo NoWait %: You guessed it...the instance didn't have to wait to use the redo
log if this is 100%.

• In-memory Sort %: This means the instance could do its sorts in memory as
opposed to doing physical I/O...very good. You don't want to be doing your sorts
on disk...especially in an OLTP system. If your in-memory sorting is not between
95% and 100%, try increasing your SORT_AREA_SIZE or
PGA_AGGREGATE_TARGET in your spfile/pfile to see if that helps.

• Soft Parse %: This is an important one...at least for OLTP systems. This means
that your SQL is being reused. If this is low (not between 95% and 100%) then
make sure that you're using bind variables in the application and that they're being
used properly.

• Latch Hit %: This should be pretty close to 100%; if it's not, then check out what
your top wait events are to try to fix the problem (pay specific attention to the
'latch free' event).
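
The bind-variable advice in the bullets above is easier to see with a small example. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in database (not Oracle), and the orders table is hypothetical. The point it illustrates is that literals produce a different SQL text for every value, while a bind placeholder keeps one shared text.

```python
import sqlite3

# Stand-in database for illustration only; this is SQLite, NOT Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(i, "OPEN") for i in range(1, 6)],
)


def literal_statements(ids):
    """Anti-pattern: one distinct SQL text per value, so nothing can be shared."""
    return {f"SELECT status FROM orders WHERE id = {i}" for i in ids}


def bound_statements(ids):
    """Bind placeholder: the SQL text is identical no matter which value is used."""
    return {"SELECT status FROM orders WHERE id = ?" for _ in ids}


def fetch_status(order_id: int) -> str:
    """Run the shared statement, passing the value separately as a bind."""
    row = conn.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()
    return row[0]


if __name__ == "__main__":
    ids = range(1, 6)
    print(len(literal_statements(ids)), "distinct statements with literals")
    print(len(bound_statements(ids)), "distinct statement with a bind variable")
    print(fetch_status(3))
```

In Oracle, each distinct SQL text needs its own parent cursor in the shared pool, which is why the literal version tends to pull Soft Parse % and Library Hit % down.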

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:  100.00
               Buffer Hit %:   99.98    In-memory Sort %:  100.00
              Library Hit %:  100.04        Soft Parse %:   99.98
         Execute to Parse %:   89.87         Latch Hit %:   94.99
Parse CPU to Parse Elapsd %:   75.19     % Non-Parse CPU:   99.46
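
Two of these percentages can be reproduced from the Load Profile figures shown earlier. The sketch below applies the Execute to Parse formula quoted above to the per-second Parses and Executes values. The Soft Parse formula, 100*(1 - hard parses/parses), is not given in this article and is included here as an assumption about how Statspack derives it.

```python
def execute_to_parse_pct(parses: float, executes: float) -> float:
    """Execute to Parse %, using the formula quoted in the article."""
    return round(100 * (1 - parses / executes), 2)


def soft_parse_pct(hard_parses: float, parses: float) -> float:
    """Soft Parse %. ASSUMED formula: share of parse calls that were not hard parses."""
    return round(100 * (1 - hard_parses / parses), 2)


if __name__ == "__main__":
    # Per-second values from the sample Load Profile section.
    print("Execute to Parse %:", execute_to_parse_pct(parses=301.40, executes=2975.84))
    print("Soft Parse %:", soft_parse_pct(hard_parses=0.05, parses=301.40))
```

Both results agree with the sample report (89.87 and 99.98), which is a handy cross-check when you want to track these ratios from raw statistics rather than from the formatted report.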

Top 5 Timed Events (Called "Top 5 Wait Events" in 8i)


This section is crucial in determining what some of the performance drains in your
database are. It will actually tell you the amount of time the instance spent waiting. Here
are some common reasons for high wait events:

• db file scattered read: This can be seen fairly often. Usually, if this number is
high, then it means there are a lot of full table scans going on. This could be
because you need indexes or the indexes you do have are not being used.

• db file sequential read: This could indicate poor joining orders in your SQL or
waiting for writes to 'temp' space. It could mean that a lot of index reads/scans are
going on. Depending on the problem it may help to tune
PGA_AGGREGATE_TARGET and/or DB_CACHE_SIZE.

• CPU Time: This could be completely normal. However, if this is your largest
wait event then it could mean that you have some CPU intensive SQL going on.
You may want to examine some of the SQL further down in the Statspack report
for SQL statements that have large CPU Time.

• SQL*Net more data to client: This means the instance is sending a lot of data to
the client. You can decrease this time by having the client bring back less data.
Maybe the application doesn't need to bring back as much data as it is.

• log file sync: A Log File Sync happens each time a commit takes place. If there
are a lot of waits in this area then you may want to examine your application to
see if you are committing too frequently (or at least more than you need to).

• log buffer space: This happens when the instance is writing to the log buffer
faster than the log writer process can actually write it to the redo logs. You could
try getting faster disks but you may want to first try increasing the size of your
redo logs; that could make a big difference (and doesn't cost much).

• log file switch: This could mean that your committed DML is waiting for a
log file switch to occur. Make sure the filesystem where your archive logs reside
is not getting full. Also, the DBWR process may not be fast enough for your
system, so you could add more DBWR processes or make your redo logs larger so
log switches are not needed as often.

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                     % Total
Event                                               Waits    Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read                           187,787         906    88.60
SQL*Net more data to client                        49,707          57     5.55
CPU time                                                           54     5.33
log file parallel write                             1,011           2      .22
latch free                                          6,226           2      .16
          -------------------------------------------------------------
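
One figure this section does not print directly is the average time per wait, which helps tell "many fast waits" apart from "few slow waits". The sketch below derives it from the Waits and Time (s) columns of the sample. The helper name is made up for this example, and because the report rounds Time (s) to whole seconds, results for the small events are only rough.

```python
def avg_wait_ms(waits: int, time_s: float) -> float:
    """Average duration of a single wait, in milliseconds."""
    if waits <= 0:
        raise ValueError("event has no waits recorded")
    return time_s * 1000 / waits


# (event, waits, time in seconds) copied from the sample Top 5 Timed Events.
# "CPU time" is left out because it has no Waits figure.
TOP_EVENTS = [
    ("db file sequential read", 187_787, 906),
    ("SQL*Net more data to client", 49_707, 57),
    ("log file parallel write", 1_011, 2),
    ("latch free", 6_226, 2),
]

if __name__ == "__main__":
    for event, waits, time_s in TOP_EVENTS:
        print(f"{event:<30} {avg_wait_ms(waits, time_s):6.2f} ms per wait")
```

For the sample, db file sequential read works out to just under 5 ms per wait, so the large total comes from the sheer number of single-block reads rather than from unusually slow disks.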

The SQL Sections (Buffer Gets, Disk Reads, Executions, and Parse Counts)
The following sections show you the Top SQL (or 'worst performing' SQL) grouped by
four sections: Buffer Gets, Disk Reads, Executions, and Parse Counts. You'll want to
review the top SQL statements in each of these sections to see if they can be tuned better.
These sections are a great way to see how many times each SQL statement is being
executed, how much CPU time is being used to execute it, and the total time it takes
to execute.
SQL ordered by Parse Calls for DB: ORCL  Instance: ORCL  Snaps: 4873 -4875
-> End Parse Calls Threshold:      1000

                           % Total
 Parse Calls  Executions   Parses  Hash Value
------------ ------------ -------- ----------
     144,300      144,300     6.65 4199666855
Module: JDBC Thin Client
select parameter, value from nls_session_parameters

Note: If you take the hash value for the SQL statement, run the
ORACLE_HOME/rdbms/admin/sprepsql.sql script, and enter the hash value when it
prompts you, it will pull up the execution plan for that SQL statement. Pretty cool!

Instance Activity Stats


This section may provide some insight into some potential performance problems that
were not as easily visible from previous sections in the report. This section is also useful
when comparing statspack reports from the same timeframes on different days.

Tablespace and Data File I/O Statistics


These sections help give you some visibility into I/O rolled up to the tablespace level and
I/O stats on your data files.
Tablespace IO Stats for DB: ORCL  Instance: ORCL  Snaps: 4873 -4875
->ordered by IOs (Reads + Writes) desc

Tablespace
------------------------------
                 Av      Av     Av                    Av        Buffer Av Buf
         Reads Reads/s Rd(ms) Blks/Rd       Writes Writes/s      Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
UNDOTBS
           146       0    5.8     1.0      117,119       16     50,681    1.3
APP1
        19,395       3   10.5     1.0       32,613        5      1,886    2.8
INDEX1
        36,919       5    0.7     6.3          977        0        526    5.0
APP2
         6,969       1   11.7     1.0       13,559        2      2,513    2.5
SYSTEM
        15,056       2    0.8     1.8          360        0         13    3.8
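
One simple way to use these numbers is to pick out the tablespaces whose average read time stands out. The sketch below does this for the sample rows. The 10 ms threshold is an arbitrary value chosen for illustration, not an Oracle recommendation, and the names used are hypothetical.

```python
from typing import NamedTuple


class TablespaceIO(NamedTuple):
    name: str
    reads: int
    av_rd_ms: float      # "Av Rd(ms)" column
    writes: int
    buffer_waits: int


# Values copied from the sample Tablespace IO Stats listing.
ROWS = [
    TablespaceIO("UNDOTBS", 146, 5.8, 117_119, 50_681),
    TablespaceIO("APP1", 19_395, 10.5, 32_613, 1_886),
    TablespaceIO("INDEX1", 36_919, 0.7, 977, 526),
    TablespaceIO("APP2", 6_969, 11.7, 13_559, 2_513),
    TablespaceIO("SYSTEM", 15_056, 0.8, 360, 13),
]


def slow_read_tablespaces(rows, threshold_ms: float = 10.0):
    """Names of tablespaces whose average read time exceeds the (assumed) threshold."""
    slow = [row for row in rows if row.av_rd_ms > threshold_ms]
    return [row.name for row in sorted(slow, key=lambda r: r.av_rd_ms, reverse=True)]


if __name__ == "__main__":
    print(slow_read_tablespaces(ROWS))
```

With the sample data this singles out APP2 and APP1, which would be the first places to look at data file placement or at the SQL reading from them.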

http://www.troygeek.com/articles/Statspack101-Interpreting/

Reading & Interpreting Statspack

In Oracle, Performance Tuning is based on the following formula:

Response Time = Service Time + Wait Time


Where
• Service Time is time spent on the CPU
• Wait Time is the sum of time spent on Wait Events i.e. non-idle time spent waiting
for an event to complete or for a resource to become available.

Service Time is comprised of time spent on the CPU for Parsing, Recursive CPU usage
(for PLSQL and recursive SQL) and CPU used for execution of SQL statements (CPU
Other).

Service Time = CPU Parse + CPU Recursive + CPU Other

The above components of Service Time can be found from the following statistics:
• Service Time from CPU used by this session
• CPU Parse from parse time cpu
• CPU Recursive from recursive cpu usage

From these, CPU Other can be calculated as follows:

CPU other = CPU used by this session - parse time CPU - recursive CPU usage

Many performance-tuning tools (including Statspack) produce a list of the top wait
events. For example, before Oracle9i Release 2, the Statspack report contains a
"Top 5 Wait Events" section.
It is a common mistake to start dealing with Wait Events first without taking the
corresponding response time into consideration. So always compare the time consumed by
the top wait events to the 'CPU used by this session' and identify the biggest consumers.

Here is an example where CPU Other was found to be a significant component of total
Response Time even though the report shows “direct path read” as top wait event:

From these figures we can obtain:


• Wait Time = 10,827 x 100% / 52.01% = 20,817 cs
• Service Time = 358,806 cs
• Response Time = 358,806 + 20,817 = 379,623 cs
• CPU Other = 358,806 - 38 - 186,636 = 172,132 cs

If we now calculate percentages for the top Response Time components:


• CPU Other = 45.34%
• CPU Recursive = 49.16%
• direct path read = 2.85%
• etc. etc.

So we can see the I/O-related Wait Events actually are not a significant component of the
overall Response Time. For us it makes sense to concentrate our tuning effort on the
service time component.

CPU Other is a significant component of Response Time, so a possible next step is to
look at the CPU-intensive SQL and not at the direct path read wait event.
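
The arithmetic in the worked example above can be reproduced with a few lines of Python. This is only a sketch of the calculation described in the text. The function and parameter names are made up for this illustration; the input values are the ones quoted in the example (all in centiseconds, with direct path read at 10,827 cs and 52.01% of total wait time).

```python
def response_time_breakdown(cpu_used, cpu_parse, cpu_recursive, top_wait_cs, top_wait_pct):
    """Break Response Time into its components, following the formulas in the text."""
    # Total wait time is extrapolated from the top event's time and its share of waits.
    wait_time = round(top_wait_cs * 100 / top_wait_pct)
    service_time = cpu_used                      # 'CPU used by this session'
    response_time = service_time + wait_time
    cpu_other = cpu_used - cpu_parse - cpu_recursive

    def pct(value):
        return round(value / response_time * 100, 2)

    return {
        "wait_time": wait_time,
        "response_time": response_time,
        "cpu_other": cpu_other,
        "pct_cpu_other": pct(cpu_other),
        "pct_cpu_recursive": pct(cpu_recursive),
        "pct_top_wait": pct(top_wait_cs),
    }


if __name__ == "__main__":
    breakdown = response_time_breakdown(
        cpu_used=358_806, cpu_parse=38, cpu_recursive=186_636,
        top_wait_cs=10_827, top_wait_pct=52.01,
    )
    for name, value in breakdown.items():
        print(f"{name}: {value}")
```

Run against the example's figures, this gives the same 45.34% / 49.16% / 2.85% split, confirming that CPU dwarfs the top wait event here.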
Starting with Oracle9i Release 2, Statspack presents Service Time (obtained from the
statistic CPU used by this session) together with the top Wait Events in a section called
Top 5 Timed Events, which replaces the section Top 5 Wait Events of previous
releases.
Here is an example:
These figures give us directly the percentages of the Wait Events against the total
Response Time so no further calculations are necessary to assess the impact of Wait
Events. Service Time is presented as CPU time in this section and corresponds to the
total CPU utilisation. We can drill down to the various components of Service Time as
follows:

• CPU Other = 3,211 - 59 - 232 = 2,920 cs


• CPU Other = 2,920 / 3,211 x 5.79% = 5.26%
• CPU Parse = 59 / 3,211 x 5.79% = 0.11%
• CPU Recursive = 232 / 3,211 x 5.79% = 0.42%

In this example, the main performance problem was an issue related to the Library Cache.
The second most important time consumer was waiting for physical I/O due to
multiblock reads (db file scattered read).
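
The drill-down of CPU time into its components can be sketched the same way. The helper below is hypothetical and simply scales each component by the CPU time percentage from Top 5 Timed Events (5.79% in the example above).

```python
def split_cpu_time(cpu_used, cpu_parse, cpu_recursive, cpu_time_pct):
    """Split the 'CPU time' percentage into Other, Parse and Recursive shares."""
    cpu_other = cpu_used - cpu_parse - cpu_recursive

    def share(part):
        # Each component gets a share of cpu_time_pct proportional to its CPU usage.
        return part / cpu_used * cpu_time_pct

    return {
        "CPU Other": share(cpu_other),
        "CPU Parse": share(cpu_parse),
        "CPU Recursive": share(cpu_recursive),
    }


if __name__ == "__main__":
    # Figures from the example: 3,211 total, 59 parse, 232 recursive, 5.79% CPU time.
    for name, pct in split_cpu_time(3_211, 59, 232, 5.79).items():
        print(f"{name}: {pct:.2f}%")
```

The three shares always add back up to the CPU time percentage. Note that CPU Other comes out at roughly 5.27% when rounded normally, a hair above the 5.26% shown in the example, which appears to have been truncated so that the three figures sum to 5.79%.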

Identifying problematic SQL from Statspack

From the above calculations you will get the significant components that caused the
performance problem. Based on these components, let's decide which Statspack
sections to use to identify the problematic SQL.

• CPU Other

If the breakdown shows CPU Other as being significant, the next step will be to look at
the SQL performing the most block accesses in the SQL by Gets section of the Statspack
report. A better execution plan for such a statement, resulting in fewer Gets/Exec, will
reduce its CPU consumption.

• CPU Parse

If CPU Parse time is a significant component of Response Time, it can be because
cursors are repeatedly opened and closed every time they are executed instead of being
opened once, kept open for multiple executions and only closed when they are no longer
required. The SQL ordered by Parse Calls can help find such cursors.

• Disk I/O related waits.

Identifying the SQL statements responsible for most physical reads from the Statspack
section SQL ordered by Reads follows the same approach as for SQL ordered by Gets.
% Total can be used to evaluate the impact of each statement. Reads per Exec together
with Executions can be used as a hint of whether the statement has a suboptimal
execution plan causing many physical reads or whether it is there simply because it is
executed often. Possible reasons for high Reads per Exec include unselective indexes that
require large numbers of blocks to be fetched where such blocks are not cached well in the
buffer cache, index fragmentation, a large Clustering Factor in an index, etc.
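
The "Reads per Exec together with Executions" hint can be expressed as a small heuristic. The sketch below is only an illustration: the function name and the cut-off of 1,000 reads per execution are assumptions made for this example, and a sensible cut-off depends entirely on your own workload.

```python
def classify_read_heavy_sql(physical_reads: int, executions: int,
                            reads_per_exec_limit: float = 1_000.0) -> str:
    """Rough hint for why a statement appears in SQL ordered by Reads.

    reads_per_exec_limit is an ARBITRARY illustrative threshold.
    """
    if executions == 0:
        # The statement may still have been running when the end snapshot was taken.
        return "no completed executions in this interval"
    reads_per_exec = physical_reads / executions
    if reads_per_exec >= reads_per_exec_limit:
        return "high Reads per Exec: review the execution plan"
    return "low Reads per Exec: listed mainly because it is executed often"


if __name__ == "__main__":
    # Hypothetical statements with the same total reads but very different profiles.
    print(classify_read_heavy_sql(physical_reads=500_000, executions=10))
    print(classify_read_heavy_sql(physical_reads=500_000, executions=100_000))
```

Two statements with the same total physical reads can therefore call for different fixes: the first needs a better plan, while the second may only need to be executed less often or cached better.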

• Latch related waits.

Statspack has two sections to help find unsharable statements: SQL ordered by
Sharable Memory and SQL ordered by Version Count. These can help with Shared Pool
and Library Cache/Shared Pool latch tuning. Statements with many versions (multiple
child cursors with the same parent cursor, i.e. identical SQL text but different properties
such as owning schema of objects, optimizer session settings, types & lengths of bind
variables etc.) are unsharable. This means they can consume excessive memory resources
in the Shared Pool and cause performance problems related to parsing (e.g. Library Cache
and Shared Pool latch contention) or to lookup time (e.g. Library Cache latch contention).

http://arulselvaraj.blogspot.com/2007/07/reading-interpreting-statspack.html
