Make Sure
• Queries are simple
• Unused rows and columns are not fetched
• There is no unnecessary ORDER BY or GROUP BY clause
• Lock duration is minimized
• There are no redundant predicates
• Assuming that subquery 1 and subquery 2 are the same type of subquery
(either correlated or noncorrelated) and the subqueries are stage 2, DB2
evaluates the subquery predicates in the order they appear in the WHERE
clause. Subquery 1 rejects 10% of the total rows, and subquery 2 rejects
80% of the total rows.
• With 1000 rows to check, the predicate in subquery 1 (referred to as P1) is
evaluated 1000 times, and the predicate in subquery 2 (referred to as P2) is
evaluated 900 times, for a total of 1900 predicate checks. However, if the
order of the subquery predicates is reversed, P2 is evaluated 1000 times,
but P1 is evaluated only 200 times, for a total of 1200 predicate checks.
• Coding P2 before P1 appears to be more efficient if P1 and P2 take an
equal amount of time to execute. However, if P1 is 100 times faster to
evaluate than P2, then coding subquery 1 first might be advisable.
• Because subquery predicates can be thousands of times more processor- and
I/O-intensive than other predicates, the order of subquery predicates is
particularly important.
• Regardless of coding order, DB2 performs noncorrelated subquery
predicates before correlated subquery predicates, unless the subquery is
transformed into a join.
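The arithmetic above can be reproduced with a small sketch. It assumes 1000 rows, the rejection percentages quoted above, and that each predicate rejects its share of the rows that reach it. The function name and the unit costs are illustrative, not anything DB2 exposes.

```python
def predicate_checks(total_rows, predicates):
    """Count evaluations and weighted cost for predicates run in the given order.

    predicates: list of (name, reject_pct, unit_cost) tuples. reject_pct is the
    percentage of incoming rows the predicate rejects (a simplifying assumption).
    """
    remaining = total_rows
    checks = 0
    cost = 0
    for _name, reject_pct, unit_cost in predicates:
        checks += remaining                  # every surviving row is checked
        cost += remaining * unit_cost        # weighted by how expensive a check is
        remaining = remaining * (100 - reject_pct) // 100
    return checks, cost


# Equal cost per check: coding P2 first needs fewer checks.
print(predicate_checks(1000, [("P1", 10, 1), ("P2", 80, 1)]))    # (1900, 1900)
print(predicate_checks(1000, [("P2", 80, 1), ("P1", 10, 1)]))    # (1200, 1200)

# P1 is 100 times cheaper than P2: coding P1 first now costs less overall.
print(predicate_checks(1000, [("P1", 10, 1), ("P2", 80, 100)]))  # (1900, 91000)
print(predicate_checks(1000, [("P2", 80, 100), ("P1", 10, 1)]))  # (1200, 100200)
```

The last two calls show why the number of checks alone does not decide the best order: the cheaper predicate can be worth running first even though it filters less.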
If a query involves aggregate functions, make sure that they are coded as simply as
possible; this increases the chances that they will be evaluated when the data is
retrieved, rather than afterward. In general, an aggregate function performs best
when evaluated during data access and next best when evaluated during DB2 sort.
Least preferable is to have an aggregate function evaluated after the data has been
retrieved.
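An aggregate function can be evaluated during data retrieval only when all of the
following conditions are met:
• No sort is needed for GROUP BY. Check this in the EXPLAIN output.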
• No stage 2 (residual) predicates exist.
• No DISTINCT aggregate functions exist, such as COUNT(DISTINCT C1).
• If the query is a join, all aggregate functions must be on the last table joined.
• All aggregate functions must be on single columns with no arithmetic
expressions.
• The aggregate function is not one of the following aggregate functions:
– STDDEV
– STDDEV_SAMP
– VAR
– VAR_SAMP
• Does a query have an input variable in the predicate?
• When host variables or parameter markers are used in a query, the
actual values are not known when you bind the package or plan that
contains the query. DB2 therefore uses a default filter factor to
determine the best access path for the SQL statement.
Poor performance:
SELECT A.NAME, DECRYPT_CHAR(A.EMPNO) FROM EMP A, EMPPROJ B
WHERE DECRYPT_CHAR(A.EMPNO) = DECRYPT_CHAR(B.EMPNO) AND
B.PROJECT = 'UDDI Project';
Good performance:
SELECT A.NAME, DECRYPT_CHAR(A.EMPNO) FROM EMP A, EMPPROJ B
WHERE A.EMPNO = B.EMPNO AND B.PROJECT = 'UDDI Project';
Simple or compound
A compound predicate is the result of two predicates, whether simple or compound,
connected together by AND or OR Boolean operators. All others are simple.
Local or join
Local predicates reference only one table. They are local to the table and restrict
the number of rows returned for that table. Join predicates involve more than one
table or correlated reference. They determine the way rows are joined from two or
more tables.
Boolean term
Any predicate that is not contained by a compound OR predicate structure is a
Boolean term. If a Boolean term is evaluated false for a particular row, the whole
WHERE clause is evaluated false for that row.
• Table space scan
• Index scan
– Index-only access (INDEXONLY = Y)
– Multiple index scan (ACCESSTYPE = M, MI, MU, MX)
– Matching index scan (MATCHCOLS > 0)
– Non-matching index scan (MATCHCOLS = 0)
– One-fetch access (ACCESSTYPE = I1)
• A table space scan is chosen when
– A huge number of rows is returned
– The available indexes have a low cluster ratio
– No index is available
• Sequential prefetch is used (PREFETCH = S)
• Index-only access is used when the required data can be taken from the index
pages, with no need to access the data pages
• Much more efficient
• ACCESSTYPE = I AND INDEXONLY = Y
• A join retrieves rows from more than one table and combines them
• Application joins are coded as inner join, left outer join, right outer join, or
full outer join
• DB2 internally uses three join methods: nested loop join, merge scan join,
and hybrid join
[Figure: join example. Rows (X, B), (Y, A), (Z, C) of the first table are matched
with rows (A, PC), (B, OE), (C, BPR) of the second table on the shared column,
giving (X, OE), (Y, PC), (Z, BPR).]
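As a rough illustration of two of the join methods named above, the sketch below joins two tiny in-memory tables with a nested loop and with a merge scan. It is a simplified model and not DB2's implementation. The function names and sample rows are illustrative, and hybrid join is not shown.

```python
def nested_loop_join(outer, inner):
    """For each outer row, scan the inner table for rows with a matching key."""
    result = []
    for outer_id, key in outer:
        for inner_key, value in inner:
            if key == inner_key:
                result.append((outer_id, value))
    return result


def merge_scan_join(outer, inner):
    """Sort both inputs on the join key, then step through them together."""
    left = sorted(outer, key=lambda row: row[1])
    right = sorted(inner, key=lambda row: row[0])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][1], right[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Emit every inner row sharing this key, then move to the next outer row.
            k = j
            while k < len(right) and right[k][0] == lkey:
                result.append((left[i][0], right[k][1]))
                k += 1
            i += 1
    return result


outer = [("X", "B"), ("Y", "A"), ("Z", "C")]      # (id, join key)
inner = [("A", "PC"), ("B", "OE"), ("C", "BPR")]  # (join key, value)

print(nested_loop_join(outer, inner))  # [('X', 'OE'), ('Y', 'PC'), ('Z', 'BPR')]
print(merge_scan_join(outer, inner))   # [('Y', 'PC'), ('X', 'OE'), ('Z', 'BPR')]
```

Both methods return the same rows. The merge scan returns them in join-key order because it sorts its inputs first, which is the extra cost it pays to avoid rescanning the inner table.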
• If DB2 does not choose prefetch at bind time, it can sometimes use prefetch at
execution time. This method is called sequential detection.
• If a table is accessed repeatedly using the same statement (for example, SQL in
a do-while loop), the data or index leaf pages of the table can be accessed
sequentially.
• DB2 can use this technique if it did not choose sequential prefetch at bind
time because of an inaccurate estimate of the number of pages to be accessed.
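The idea behind sequential detection can be sketched as a toy model that watches recent page requests and reports when most of them move forward in small steps. The class name, window size, threshold, and maximum gap are illustrative assumptions, not documented DB2 values.

```python
from collections import deque


class SequentialDetector:
    """Toy detector: flags access as sequential when most recent page
    requests advance by a small step.

    window, threshold and max_gap are illustrative assumptions,
    not documented DB2 values.
    """

    def __init__(self, window=8, threshold=5, max_gap=16):
        self.recent = deque(maxlen=window)  # one True/False flag per recent access
        self.threshold = threshold
        self.max_gap = max_gap
        self.last_page = None

    def access(self, page):
        """Record one page request; return True if prefetch would be switched on."""
        if self.last_page is not None:
            gap = page - self.last_page
            self.recent.append(0 <= gap <= self.max_gap)
        self.last_page = page
        return sum(self.recent) >= self.threshold


looping = SequentialDetector()
sequential_flags = [looping.access(page) for page in range(100, 110)]

scattered = SequentialDetector()
random_flags = [scattered.access(page) for page in [500, 3, 250, 90, 700, 12, 400, 60]]
```

With these settings, the run of consecutive pages turns the flag on at the sixth request, and the scattered requests never do.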
• A sort is needed for subquery processing. The result of the subquery is sorted
and put into the work file for later reference by the parent query.
• DB2 sorts RIDs into ascending page number order in order to perform list
prefetch. This sort is very fast and is done totally in memory.
• If a sort is required during CURSOR processing, it is done during OPEN
CURSOR. If the cursor is closed and reopened, the sort is performed again.
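The RID sort for list prefetch can be pictured with a minimal sketch. It models a RID as a (page number, slot number) pair, which is a simplification for illustration, and the function name is hypothetical.

```python
from itertools import groupby


def list_prefetch_order(rids):
    """Sort RIDs by (page, slot) and group them so each data page is read once."""
    ordered = sorted(rids)
    return [(page, [slot for _page, slot in group])
            for page, group in groupby(ordered, key=lambda rid: rid[0])]


# RIDs as they might come back from an index, in key order rather than page order.
rids = [(42, 3), (7, 1), (42, 1), (19, 5), (7, 4)]
plan = list_prefetch_order(rids)
print(plan)  # [(7, [1, 4]), (19, [5]), (42, [1, 3])]
```

After the sort, pages 7, 19, and 42 are each visited once in ascending order, instead of jumping between pages 42 and 7 twice.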
• OPTIMIZE FOR n ROWS
• Reducing the number of matching columns for index scan
• Adding extra local predicates
• Changing inner join to outer join
• Updating Catalog Statistics