Online Database
Maintenance for 24x7
OLTP Systems
Vijay Sitaram
Per-Se Technologies
Today’s DBAs are challenged to support OLTP environments with little or no downtime, and to
do so without sacrificing performance. DB2 UDB has kept up with these industry challenges and
gives the DBA the tools to perform maintenance tasks online, unobtrusively, in the background.
We start with the assumption that the need for a reorg has been identified with the help of:
1. Snapshot and event monitors that show page reorgs and overflows.
2. Query explain output that shows extra I/O being performed. The access plan has changed: the
query has picked the wrong index, or it is using a hash join instead of a nested-loop join. This is
typical when the optimizer does not have current information on the data layout through statistics.
3. REORGCHK recommending that the table be reorganized (this is accurate only as long as the
statistics are current).
Having set the stage, we now dive into the presentation, where we pick apart the two
commands: reorg and runstats. We then explore how to run them as efficiently as possible
while causing the least service impact to customers.
About Per-Se Technologies
Your Health Is The Bottom Line
At Per-Se Technologies, our business is delivering Connective Healthcare solutions to help ensure
the financial success of physicians, pharmacies, hospitals and healthcare organizations. A long-time
leader in business services and information technology solutions for the healthcare industry, Per-Se
understands that rising costs of treatment and rampant reimbursement inefficiencies diminish the
ability of providers to stay competitive, run profitable businesses and deliver optimal patient care. Our
Connective Healthcare solutions can help.
Connective Healthcare is a comprehensive class of business solutions that help providers achieve
their income potential by streamlining and simplifying the complex administrative burden of providing
healthcare. Innovative technology, integrated networks, data mining, provider training and extensive
healthcare expertise enable us to accelerate the movement of funds to benefit our clients.
Providers who implement Connective Healthcare solutions will benefit from decreased expenses
through Per-Se’s resource management solutions and increased revenue through our revenue cycle
management solutions. Our solutions work to streamline cash flow, increase collections and
reimbursements, ensure compliance and improve customer service. The result is an approach that
empowers physicians, pharmacies, hospitals and healthcare organizations to collect the
compensation they deserve so they can focus on the important services they provide.
Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce or, better, avoid reorg!
• Asked and Answered!
Outline:
a. Reorg table & Index with examples
b. Runstats with examples
c. Nature and Benefits of Online Reorg.
d. Nature and Benefits of Online Runstats.
Design:
a. Plan before design.
b. Designing online maintenance.
c. Phases of online reorg and runstats
d. Internals
e. Locks and Logs
f. Parallel reorgs and runstats
Utilities for performing online maintenance
REORG command:
• Table
• Reclaim Space
• Recluster data
• Index
• Reclaim Space
• Rebuild index
Examples:
db2 REORG TABLE db2inst1.employee INDEX db2inst1.emp_IX1
INPLACE NOTRUNCATE TABLE ALLOW WRITE ACCESS START
db2 REORG INDEXES ALL FOR TABLE db2inst1.employee ALLOW
WRITE ACCESS CONVERT
The REORG command now supports new options which, used in combination, address most of the scenarios and challenges
posed to the DBA. The part of the command that is highlighted on the slide is the focus of attention.
The INPLACE option runs the table reorg in online mode, and one may choose ALLOW WRITE ACCESS or ALLOW READ ACCESS
while the inplace reorg runs. The operation is asynchronous: the command is submitted to DB2 as a background task. It
can be run on regular tables only; MDC tables allow offline reorg only. Index reorg differs from table reorg in several
ways, one being that it is a synchronous operation, and it offers more flexibility. More on this during the course of
the presentation.
Utilities for performing Online maintenance
RUNSTATS command:
• Column level statistics
• Group of columns (CGS)
• Distribution columns statistics
• Detailed index statistics
• Table Row and page level sampling
• Index sampling
• LIKE predicates
• NUM_FREQVALUES and NUM_QUANTILES at table
and column level
• Statistics Profiling
The ways to collect statistics have improved over the past few years, thanks to the LEO project. DB2 can collect statistics
automatically and store them in a profile for reuse at a later time. This also lets the optimizer learn from the statistics
gathered and suggest what to collect and when to collect it. The whole system runs on proactive, feedback-based statistics collection.
The RUNSTATS command is now very flexible, granular, fast, easy and effective. The options are so many that they could be a
topic of discussion on their own. RUNSTATS is a throttled utility; throttling is controlled through the UTIL_IMPACT_PRIORITY
option together with the UTIL_IMPACT_LIM database manager configuration parameter. For this presentation, we will look at
the following clauses:
Index-clause:
1. SAMPLED – Use sampling when collecting index key statistics. Instead of reading all the keys, DB2 samples a set of keys and
derives the statistics from it.
2. DETAILED – Gathers detailed index statistics such as cluster factor, page-fetch pairs and prefetch statistics.
3. INDEX <index_name> vs. INDEX ALL – Allows you to collect statistics on one index instead of all the indexes for the table.
Column-stats-clause:
1. ALL vs. KEY COLUMNS – Collect statistics on all table columns or just on the columns used by indexes (key columns).
2. Set of columns – You can also specify a set of columns (key or non-key) on which to collect statistics, with or without LIKE STATISTICS.
Distribution-clause:
1. DISTRIBUTION statistics – Collect histograms on table columns, which provide more detailed information on column values.
2. ALL vs. KEY COLUMNS – Collect distribution statistics on all columns or just the key columns. For each of these columns, you
may also specify NUM_QUANTILES and NUM_FREQVALUES.
Runstats by example
Column level statistics:
RUNSTATS ON TABLE db2inst1.employee
ON KEY COLUMNS AND COLUMNS (empno, empname)
Index Statistics:
RUNSTATS ON TABLE db2inst1.employee
FOR INDEXES db2admin.INX1, db2admin.INX2, db2admin.INX3
Distribution statistics:
RUNSTATS ON TABLE db2inst1.employee
WITH DISTRIBUTION DEFAULT NUM_FREQVALUES 40
Using CGS:
RUNSTATS ON TABLE db2inst1.employee
ON COLUMNS ((empno, empname), mrgno, (admrdept, location))
Runstats by example
Statistics profile:
RUNSTATS ON TABLE db2inst1.employee AND INDEXES ALL SET PROFILE ONLY
Sampling:
Table -> Row-level:
RUNSTATS ON TABLE db2admin.department WITH DISTRIBUTION
TABLESAMPLE BERNOULLI (10) REPEATABLE (1024)
Table -> Page-level:
RUNSTATS ON TABLE db2admin.department AND INDEXES ALL
TABLESAMPLE SYSTEM (10)
Index -> Key-level:
RUNSTATS ON TABLE db2admin.department WITH DISTRIBUTION
ON KEY COLUMNS AND SAMPLED DETAILED INDEXES ALL
Planning for Online maintenance
The elements:
• Hardware
• Is the disk always running at full capacity?
• Is there cpu power left on the system?
• Is there an archive target like TSM to manage logs?
• User
• Can one identify the online workload for the day, week and month?
• DBA
• What’s the best day to run it? Should one run it all or spread it out?
• Is it scripted with all the options based on the scenario?
• Can it be monitored, with email/page alerts sent to the DBA?
• Can the process be controlled without disrupting the customer?
Before starting anything, it helps to determine whether online maintenance is the best fit for the
application and its workload. Online maintenance runs in trickle mode and can therefore
stretch for hours before completion. During this time the DBA needs to ensure that user
application performance is not affected and that the system has enough cycles to process the
maintenance request. Agree on a maintenance window with the end users. Running maintenance
and a batch workload on the same table at the same time is not recommended: the batch can
invalidate the work already done and/or delay the maintenance process. Try to eliminate the need
for maintenance, especially reorgs, by using MDC (Multi-Dimensional Clustering) tables. MDC
provides further benefits in addition to eliminating maintenance.
Online table reorg
• Nature of ONLINE table reorg:
• It’s trickle and not meant for speed
• It’s just like any other transaction acquiring locks and generating logs
• It’s a background process and knows to play well with online user workload
• Currently, reorg is not one of the throttled utilities
• It’s an I/O bound operation
• Multiple online reorgs can run in parallel
One of the questions everyone asks is “Why is it slow?”. So let’s set the record straight: ONLINE
reorg is NOT meant for speed. It runs in trickle mode in the background without disrupting service. It
behaves like a user transaction: it holds locks on tables and writes to the transaction logs to
make itself recoverable in case of failure. Multiple reorgs can be run in parallel; how many depends
greatly on the system and its available resources. The online reorg can be seen as “db2reorg” in the
output of the list applications command. The reorg process is submitted and runs in background mode.
Currently it does not set a priority and runs like any other user connection. Because online reorg
adapts to system usage, its completion time cannot be predicted; it differs for each run.
Because data pages are physically moved in place, without requiring extra space in the tablespace
where the table resides, online reorg takes longer to finish than offline reorg, but it has its own
advantages. The data pages are moved in batches, providing immediate clustering benefits. Online
reorg no longer needs the tablespace to be twice the size of the table: it does all the work within
the space already allocated to the table, which makes it very space efficient. Online reorg has
execution phases, and at the end of each phase it needs to acquire locks on the table. As we go
through this presentation, we will learn more about the locks involved and tips to avoid them. The
online table reorg is like an application connected to the database. Therefore, one may use snapshot
and event monitors to gather data on its status and progress.
Online table reorg – Reclustering
Online reorg can be run in different ways to benefit the application. In this slide, we discuss
reclustering with online reorg. If the requirement is to reorganize the table based on a
specific index, so that the improved cluster ratio/cluster factor of that index helps application
performance, the table reorg can be told to use that index, provided the table does not have a
clustered index built on it. If the table does have a clustered index, any reorg performed on the
table is based on the clustered index by default.
The slide is a simple, classic representation of data page movement based on an index key.
White stands for unused data pages and green stands for data pages that contain data.
The online reorg in this case is moving data pages from left to right. This is called the
“vacate and fill” phase of online table reorg. During the vacate phase, pages are moved to
adjacent extents in batches. During the fill phase, the rows are brought back in cluster order
into the pages that were emptied in the vacate phase. This repeats until the whole table is
reorganized in index order. If the reorg truncates the table, the free space reclaimed from the
reorganized table is returned to the tablespace and can be used by any other table.
Online table reorg – Reclaim space
Using NOTRUNCATE for a table reorg does not reclaim space. Why is
the empty fragmented space not reused?
Answer: If NOTRUNCATE is used, free space within the table being
reorganized is not returned to the tablespace.
This is a classic representation of an online table reorg run to reclaim space only, without using an
index to recluster/order the data.
Here, blue represents data and the white pockets represent empty space within the data
pages. During the reorg, the operation runs from right to left, moving rows into the left-most
extents/pages that have empty pockets. This is called the vacate and fill phase, and the reorg
utility repeats this cycle in batches until all empty pockets are filled with rows from the right-most
extents. Note that the vacate and fill phase is uni-directional and uses a first-fit algorithm
to fill the empty pages, moving from the right-most to the left-most extent. Towards the end of the
reorg, the empty extents at the tail end of the table are truncated and the space allocated to the
table is released back to the tablespace.
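The right-to-left movement described above can be pictured with a small toy model. This is only an illustrative sketch, not DB2's internal algorithm: pages are modelled as fixed-size Python lists, `None` marks an empty pocket, and the function name `reclaim_space` and its `truncate` flag are hypothetical names chosen for this example.

```python
from typing import List, Optional

Page = List[Optional[str]]  # one slot per row; None marks an empty pocket


def reclaim_space(pages: List[Page], truncate: bool = True) -> List[Page]:
    """Toy model: move rows from the right-most pages into the first
    free slots on the left, then optionally drop empty tail pages."""
    table = [list(page) for page in pages]  # work on a copy
    left, right = 0, len(table) - 1
    while left < right:
        if None not in table[left]:
            left += 1  # this page is full, look at the next one
            continue
        occupied = [i for i, row in enumerate(table[right]) if row is not None]
        if not occupied:
            right -= 1  # this tail page is already vacated
            continue
        source = occupied[-1]
        target = table[left].index(None)  # first free slot (first fit)
        table[left][target] = table[right][source]
        table[right][source] = None
    if truncate:
        # Corresponds to the truncation step; skipped to mimic NOTRUNCATE.
        while table and all(row is None for row in table[-1]):
            table.pop()
    return table


fragmented = [["a", None], [None, "b"], ["c", None], [None, "d"]]
print(reclaim_space(fragmented))
print(len(reclaim_space(fragmented, truncate=False)))
```

With `truncate=False` the rows are still packed to the left, but the emptied pages stay allocated to the table, which mirrors the NOTRUNCATE behaviour discussed on the slide.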
Online table reorg – Progress chart
This ties together everything that has been discussed so far about online table reorg. The slide
shows the phases the reorg process goes through before and after the vacate and fill phase.
The online reorg process has the following milestones/commit points:
1. Drain existing scanners while allowing new scanners to start.
2. Move rows in batches (vacate and fill phase).
3. Truncate empty space at the tail end of the table (if requested): acquire a shared table
lock, drain existing scanners while allowing new scanners to start, truncate the table and finally commit.
Also note that clustering improves incrementally as the reorg makes progress through the vacate
and fill cycles.
Online index reorg
• Nature of online index reorg:
• Synchronous operation
• Does not support throttling
• Allows transactions to update the index during reorg
• Supports type-2 indexes only
• In DMS mode, can run in most cases with online backup
• Writes to transaction logs
REORG INDEXES is synchronous, unlike table reorg. Index reorg does not support throttling.
Transactions can update the index while the reorg is in progress, up to the switch-over phase.
To support online index reorganization, all type-1 indexes must be converted to type-2 indexes,
either with the CONVERT option of the reorg command (if the database was migrated from a previous
version) or by dropping and recreating them as type-2 indexes in V8. If the index resides in a DMS
tablespace, the online index reorg can make a backup wait on a lock because of the switch-over
phase, which is NOT recommended. Depending on the size of the indexes, the index reorg can
generate a substantial amount of transaction log for recovery purposes, which also helps an HADR
solution. In SMS mode, online index reorganization and online backup are incompatible in the same
way that online index create and online backup are incompatible. In DMS mode, online index
reorganization and online backup can run concurrently in most cases. The only case where they are
not compatible is the same case as for online index create and online backup. In
addition, online index reorganization quiesces the table before the switch phase and gets a Z lock,
which prevents an online backup.
Online index reorg – Progress chart
Online index reorg:
1. Rebuild Phase
2. Log catch-up phase
3. Switch-Over phase
This slide explains the phases of index reorg in detail. The index reorg consists of 3 major phases:
1. Rebuild phase
2. Log catch-up phase
3. Switch-over phase
In the first phase, the index keys are rebuilt. The reorg creates a copy of the index, called the
“shadow”, and does all its work on this copy, letting existing scanners work on the original index.
The keys are sorted and RIDs are assigned to the index during this phase. Concurrent
read and write access to the index object remains available. These operations are captured in a
special log in a memory buffer and are not reflected in the shadow object until the next phase.
In the second phase, while the existing read and write scanners keep working on the index, all
changes to the original index continue to be captured in the in-memory log. This log is replayed
on the shadow index so that it catches up with the original index. This ensures that all
transactions done during the rebuild phase are applied to the shadow index.
In the third phase, DB2 catches up on the log one more time. Once the log is fully applied
and the shadow index is consistent with the original index, a Z lock is acquired to switch
indexes. During this switch phase, existing scanners are drained and new scanners are pointed to
the shadow index object. The old/original index object is dropped as part of cleanup.
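The three phases above can be walked through with a small toy model. This is a sketch for illustration only and not how DB2 implements its indexes: the index is modelled as a plain list of keys, concurrent changes are passed in as `("insert", key)` or `("delete", key)` tuples, and all function names are hypothetical.

```python
import bisect


def replay(shadow, log):
    """Apply logged index changes to the sorted shadow copy."""
    for op, key in log:
        if op == "insert":
            bisect.insort(shadow, key)
        else:  # "delete"
            shadow.remove(key)


def apply_to_original(original, op, key):
    """Concurrent transactions keep modifying the original index."""
    if op == "insert":
        original.append(key)
    else:  # "delete"
        original.remove(key)


def online_index_reorg(original, changes_during_rebuild, changes_during_catchup):
    original = list(original)
    change_log = []

    # Phase 1, rebuild: build a sorted shadow copy while changes to the
    # original are captured in an in-memory log.
    shadow = sorted(original)
    for op, key in changes_during_rebuild:
        apply_to_original(original, op, key)
        change_log.append((op, key))

    # Phase 2, log catch-up: replay the captured changes on the shadow;
    # new changes arriving meanwhile are logged again.
    pending, change_log = change_log, []
    replay(shadow, pending)
    for op, key in changes_during_catchup:
        apply_to_original(original, op, key)
        change_log.append((op, key))

    # Phase 3, switch-over: with writers held off (the Z lock in DB2),
    # replay what is left; the shadow then replaces the original.
    replay(shadow, change_log)
    return shadow, sorted(original)


shadow, expected = online_index_reorg(
    [5, 3, 9, 1],
    changes_during_rebuild=[("insert", 4), ("delete", 9)],
    changes_during_catchup=[("insert", 7)],
)
print(shadow == expected)
```

The point of the model is that the shadow only becomes consistent with the original after the final replay, which is why the last catch-up has to happen while no further changes can arrive.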
Online Runstats
• Nature of ONLINE RUNSTATS:
• Exploits SMP parallelism
• It can be throttled
• CPU and memory intensive
• Can be run in parallel for multiple tables
Runstats has grown over the years and has many features and benefits. One of them is
exploiting SMP parallelism. It can be throttled using the UTIL_IMPACT_PRIORITY option in
conjunction with the UTIL_IMPACT_LIM database manager configuration parameter. Throttling is
required when the system is resource constrained. Runstats is CPU and memory intensive, since it
has to calculate the statistics that can affect application performance.
It aids the optimizer in making the correct decisions and producing efficient access plans. It can
also support what-if scenarios for troubleshooting query performance.
Runstats can be tuned to be CPU efficient and still provide accurate statistics by using page-level or
row-level sampling on the table. Index statistics can also be sampled to speed up runstats.
Runstats provides accurate data to utilities like reorgchk and also populates the SYSSTAT and SYSCAT
views. It helps determine the multiple factors used to decide whether a table is fragmented and
therefore needs a reorg.
With CGS (Column Group Statistics), one can avoid creating indexes purely for statistical information.
CGS helps with data correlation. CGS is the number of distinct combinations of values for a set of
columns (the cardinality of a group of columns). CGS does not assume independence and gives better
joint selectivity.
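A small worked example may help show why the cardinality of a column group matters. The data below is invented for illustration (a hypothetical table with correlated `make` and `model` columns), and the selectivity formulas assume uniformly distributed values; this is a sketch of the idea, not the optimizer's actual computation.

```python
def distinct_count(rows, columns):
    """Number of distinct value combinations for the given columns."""
    return len({tuple(row[c] for c in columns) for row in rows})


# Hypothetical data: the model determines the make, so the columns correlate.
rows = [
    {"make": "Toyota", "model": "Camry"},
    {"make": "Toyota", "model": "Corolla"},
    {"make": "Honda", "model": "Civic"},
    {"make": "Honda", "model": "Accord"},
    {"make": "Toyota", "model": "Camry"},
    {"make": "Honda", "model": "Civic"},
]

make_card = distinct_count(rows, ["make"])
model_card = distinct_count(rows, ["model"])

# Without column group statistics, independence is assumed:
independent_estimate = make_card * model_card
# With column group statistics, the real number of combinations is known:
group_card = distinct_count(rows, ["make", "model"])

# Estimated selectivity of "make = ? AND model = ?" under each assumption.
selectivity_independent = 1 / independent_estimate
selectivity_with_cgs = 1 / group_card

print(independent_estimate, group_card)
print(selectivity_independent, selectivity_with_cgs)
```

In this toy data the independence assumption expects twice as many combinations as actually exist, so it underestimates how many rows a predicate on both columns returns.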
DB2 now provides automated statistics collection (ASC). As part of ASC, statistics for a table can be
profiled and stored in a table. DB2 analyzes this information and decides what statistics to gather
and when. More on this is available from the LEO (LEarning Optimizer) project.
Online Runstats
What to collect?
How much to collect?
When to collect?
Ways to collect?
WHAT to collect: Runstats can be run on the table, on its indexes individually, or on both. This gives the
flexibility to collect statistics on the table only, on all indexes or one specific index, or on both the table
and its indexes. Say we have just added a new index to the table: statistics can be collected for the new index
alone without having to run on the table and all of the existing indexes. It is good to keep the table and index
statistics up to date and in sync to avoid query performance issues. The data collected with these options is
stored in the SYSSTAT.TABLES and SYSSTAT.INDEXES views.
HOW MUCH to collect: To start with, it’s a tough decision! To collect statistics one needs to know the data.
Once you know the data, you need to determine how it is being used by queries. Statistics collected on the
table and related objects cannot be based on one query alone. Therefore, one needs to collect statistics that
provide enough information to the optimizer. This is where runstats stands out, with its ability to
drill down to column and key level. One may choose to collect detailed index key statistics to provide more
detail than just the high-key and low-key values. You can also collect distribution statistics on the table
columns, which provide more information on the data. We can now also store the statistics options in a
profile and just use the profile the next time.
WHEN to collect: The DBA can decide when to collect statistics based on the needs of the application and
the availability of resources to perform such an operation. Or the DBA may choose to leave this to DB2 by
turning on AUTOMATIC statistics collection.
WAYS to collect:
Runstats command
Load command
Reorgchk using update statistics
Create index with collect statistics (supported for Declared Global Temporary table indexes at the time of
creating the index)
Automatic Maintenance using Statistics Profile warehouse.
Designing online runstats
Backup existing statistics
Before collecting fresh statistics, it is always advisable to back up the existing statistics. This really helps in debugging
“if” the new statistics affect query performance. It can be done using the db2look utility in mimic mode.
We recommend backing up statistics on a daily basis, or before they change, using db2look in mimic mode:
db2look -d SAMPLE -z DB2INST1 -t EMPLOYEE -m -p -xd -o employee.stats
Key column statistics satisfy queries covered by indexes. But there are exceptions to the rule, so check whether statistics
collected on all columns would help query performance through better correlation and selectivity estimates.
Use CGS and avoid creating multiple indexes just to provide statistical information. CGS improves optimizer estimates,
resulting in better access plans.
Sampling improves runstats performance and consumes less CPU. At the same time, don’t sample small tables. Runstats will not
sample if the table has fewer than 5 pages. Sample related tables at a similar sampling rate to achieve the same level of accuracy.
Sampling with REPEATABLE provides consistent results across multiple invocations of runstats if the table changes
infrequently.
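The difference between row-level and page-level sampling, and the effect of a repeatable seed, can be sketched in a few lines. This is a conceptual illustration only, not the sampling code DB2 uses: the function names are hypothetical, `percent` plays the role of the number in `TABLESAMPLE BERNOULLI (10)`, and `seed` plays the role of the `REPEATABLE` value.

```python
import random


def bernoulli_sample(rows, percent, seed=None):
    """Row-level sampling: every row is kept independently with
    probability percent/100."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() * 100 < percent]


def system_sample(pages, percent, seed=None):
    """Page-level sampling: whole pages are kept or skipped, so fewer
    pages need to be read than with row-level sampling."""
    rng = random.Random(seed)
    return [row for page in pages if rng.random() * 100 < percent for row in page]


rows = list(range(1000))
pages = [rows[i:i + 50] for i in range(0, 1000, 50)]

first = bernoulli_sample(rows, 10, seed=1024)
second = bernoulli_sample(rows, 10, seed=1024)
print(first == second)  # same seed, unchanged table: same sample
print(len(system_sample(pages, 10, seed=1024)) % 50)  # whole pages only
```

If the table changes between runs, the same seed no longer selects the same rows, which matches the caveat that REPEATABLE is only consistent for tables that change infrequently.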
UTIL_IMPACT_PRIORITY can throttle runstats. This works only if UTIL_IMPACT_LIM is set to less than 100. For
example, setting UTIL_IMPACT_LIM=90 and UTIL_IMPACT_PRIORITY=50 would increase overall system usage by
45% (50% of the 90% limit).
What to expect and avoid!
Logging
+ Reorg writes to transaction logs
+ Use DB2_USE_ALTERNATE_PAGE_CLEANING
+ Monitor log archives
+ Consider using infinite logging
- Avoid log full condition
- Reorg placed in “Paused State” when log is full
Logging:
Reorg writes to the transaction logs for tables and, optionally, for indexes.
Use the DB2_USE_ALTERNATE_PAGE_CLEANING registry variable to make page cleaning proactive.
Monitor log archives and ensure that archiving keeps up with the rate at which logs are generated
by the reorg.
Consider using infinite logging.
Avoid a log full condition. Maintain enough space on the active log path and the archive log path.
If the active log fills up during a reorg, the reorg is automatically placed in “Paused” state. The DBA
then has to resume the reorg manually, or stop and restart it if required. Use the table snapshot to check its state.
What to expect and avoid!
Locking
+ Use NOTRUNCATE option
+ Use CLEANUP option
+ Watch for reorgs waiting on “Lock Conversion”
- Avoid reorg waiting on locks for extended duration
- Avoid running table and index reorg on same table at same time
Does the table reorg hold the log file as “active” if it is in “lock-wait” state?
Answer: Yes, just like any other transaction in lock-wait.
Locking:
Use reorg with the NOTRUNCATE TABLE option. This eliminates the “S” lock required during truncation.
Take advantage of the index CLEANUP options to avoid the replace phase, which requires a “Z” lock.
Watch for reorgs waiting on “Lock conversion”. Find the application holding the lock and take the necessary action.
Avoid prolonged reorg lock-waits. Stopping or pausing a reorg is itself an asynchronous operation, and so is forcing
the application that runs the online reorg. The request takes effect only after the lock-wait ends, so the time the
reorg takes to stop is directly proportional to the time it has been waiting on locks. Lock-waits left unattended
may also build lock chains, which is not a desirable scenario.
Avoid running table and index reorgs together on the same table.
What to expect and avoid!
Recovery
+ Full backup at end of reorg required
+ Index reorgs are logged
+ History file contains reorg information
Throttling
+ Not supported for Table and Index reorg
+ Support for runstats
Recovery:
Take a full backup of the database at the end of the reorg, so that the logs generated during the reorg do not have to be
replayed during rollforward in case of disaster.
Index reorgs are logged when the new database configuration parameter LOGINDEXBUILD is enabled; table reorg is logged by
default. This has to do with the design of HADR.
The history file is not an accurate representation of the status of previous reorgs. It does not always provide failure
information, especially for index reorgs.
Throttling:
Reorg for tables and indexes does not support throttling.
Runstats is a throttled utility, using UTIL_IMPACT_PRIORITY with UTIL_IMPACT_LIM.
What to expect and avoid!
Performance
+ Reorg without specifying an index is the fastest
+ Reorg can be paused and resumed
+ SQL is sensitive to order of data
+ Run reorgs in parallel
- Long running transactions impact reorg
+ Split large tables into smaller ones
Can a user application (one that does not have REORG privileges) get an "Internal
Reorg Lock" on a table even while it is NOT doing a reorg? The lock
snapshot shows a user application holding an "Internal Reorg Lock".
Answer: An application that is executing against the same table that is being
reorganized will show a shared Internal Reorg Lock, as observed. This does not
mean that the application is performing a reorg, just that it is
executing at the same time as the reorg, and on the same table. This type of
lock is used as a synchronization mechanism for inplace reorg processing.
Performance:
Reorg without specifying an index packs data pages in the order they are presented to it. This is the fastest approach.
Reorg is easy to manage. Pause it if required, and resume it soon enough that the benefits of clustering are not lost.
SQL is sensitive to the order of data. Therefore, it is common to see reorgs based on an index; if the table has a clustered
unique index, the reorg uses that index by default.
Use ALLOW WRITE ACCESS.
Run reorgs in parallel and measure the impact. How many parallel reorgs can be executed without affecting the end user
varies from system to system.
A reorg on the same table run at different times does not run at the same speed. It depends on what else is running on the
system at that time.
Long-running transactions can impact the progress of a reorg. Pick a suitable time to do the reorg. If no window is available,
go with the best available time.
Split big tables into smaller tables and use UNION ALL views (in V9, use partitioned tables) if the table is so big
that by the time the reorg is done, a portion of the table has already changed. Divide and conquer.
Reorg is not a throttled utility. However, runstats can be throttled using UTIL_IMPACT_PRIORITY set to a percentage and
UTIL_IMPACT_LIM set to less than 100.
What to expect and avoid!
Errors/Warnings
- Index statistics inconsistent with table statistics – SQL0437W, rc=6
- Online table reorg not supported when using APPEND ON – SQL2219N, rc=12
- Online table reorg not supported for MDC tables – SQL0270N, rc=46
- Online index reorg fails due to space allocation – SQL0289N
Space allocation
+ Extra space not required for online table reorg
+ Space equal to size of index required for online index reorg
+ Runstats with distribution needs more space under system catalog
Errors/Warnings:
Index statistics are not consistent with table statistics - SQL0437W, reason code 6
INPLACE table reorg is not supported for tables in APPEND MODE ON - SQL2219N, reason code 12
INPLACE table reorg is not supported for MDC tables - SQL0270N, reason code 46
ONLINE index reorg fails with SQL0289N when run with ALLOW READ/WRITE ACCESS and there is not enough
space in the tablespace. With ALLOW NO ACCESS, the index reorg would succeed, but it holds a Z lock for the whole
duration of the index reorg, which is not very desirable for OLTP applications.
Space allocation:
INPLACE table reorg does not need extra space and does all its work in batches within the space allocated in the tablespace
where the table resides.
Online index reorg creates a shadow index during the rebuild phase in the same tablespace as the index. It is important that
there is enough space in the index tablespace to hold a shadow copy of the index.
Runstats with distribution needs more space allocated to the system catalog tablespace.
Working with reorg online
• Reorg can be stopped, paused and resumed online. Reasons:
Working with reorg online
The DBA wishes to alter the table reorg:
For example, change the access from read-only to write access and do not truncate the table,
to make the reorg more concurrent with the online user workload:
db2 reorg table db2inst1.employee inplace allow read access start
db2 reorg table db2inst1.employee inplace pause
db2 reorg table db2inst1.employee inplace notruncate table resume
Another example: change the table reorg to use an index while keeping the same level of access:
db2 reorg table db2inst1.employee inplace allow write access start
db2 reorg table db2inst1.employee inplace pause
db2 reorg table db2inst1.employee index db2inst1.employee_idx1
inplace resume
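When the start, pause and resume steps above are scripted, it can help to build the command text in one place. The helper below is a minimal sketch with hypothetical names; it only composes the command string in the option order used in this presentation and does not talk to DB2. Rejecting extra options on PAUSE and STOP is a simplification chosen for this sketch.

```python
def build_inplace_reorg(table, action, index=None, access=None, notruncate=False):
    """Compose an inplace REORG TABLE command string (nothing is executed)."""
    action = action.upper()
    if action not in {"START", "RESUME", "PAUSE", "STOP"}:
        raise ValueError(f"unknown action: {action}")

    if action in {"PAUSE", "STOP"}:
        # Simplification for this sketch: no other options on PAUSE/STOP.
        if index or access or notruncate:
            raise ValueError("PAUSE/STOP take no further options in this sketch")
        return f"REORG TABLE {table} INPLACE {action}"

    parts = ["REORG TABLE", table]
    if index:
        parts += ["INDEX", index]
    parts.append("INPLACE")
    if access:
        access = access.upper()
        if access not in {"READ", "WRITE"}:
            raise ValueError("access must be READ or WRITE for inplace reorg")
        parts.append(f"ALLOW {access} ACCESS")
    if notruncate:
        parts.append("NOTRUNCATE TABLE")
    parts.append(action)
    return " ".join(parts)


# The first scenario from the slide, step by step:
for command in (
    build_inplace_reorg("db2inst1.employee", "start", access="read"),
    build_inplace_reorg("db2inst1.employee", "pause"),
    build_inplace_reorg("db2inst1.employee", "resume", notruncate=True),
):
    print("db2", command)
```

Keeping the options in one function makes it easier to log exactly which variant was submitted, which is useful when a paused reorg is later resumed with different options.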
Monitoring Reorg & Runstats
Snapshot Monitors:
Database : Watch log usage, Oldest transaction application ID
Table : Full table and reorg information available
Application : Idle time, Sorts, Rows: read, selected and written, Log used
Lock: Detailed lock information and status
Utilities:
db2 list utilities
db2pd
AIX: nmon, vmstat and iostat
If reorg operations are written to the log files and those logs are
later used for rollforward, do the reorg log records get
replayed?
Answer: Yes. Inplace reorg is like any other transaction.
Application snapshot (idle time, hit ratios, sorts, rows read, selected and written, log space used):
get snapshot for application agentid <agentid_of_db2reorg_process>
get snapshot for locks for application agentid <agentid_of_db2reorg_process>
select * from table(sysproc.snapshot_appl_info('sample', -1)) as appsnap
select * from table(sysproc.snapshot_lock('sample', -1)) as locksnap
Administrative notification log and FFDC log for progress and errors
Monitoring Table reorg
$ db2 reorg table db2inst1.employee1 index idx1 inplace allow write access start
DB20000I The REORG command completed successfully.
DB21024I This command is asynchronous and may not be effective immediately
(Output of db2 get snapshot for tables on sample given below):
Table Schema = db2inst1
Table Name = EMPLOYEE1
Table Type = User
Data Object Pages = 6752
Index Object Pages = 819
Rows Read = 0
Rows Written = 0
Overflows = 0
Page Reorgs = 602
Table Reorg Information:
Reorg Type =
    Reclustering
    Inplace Table Reorg
    Allow Write Access
Reorg Index = 1
Reorg Tablespace = 2
Start Time = 12/20/2005 11:32:41.263978
Reorg Phase =
Max Phase =
Phase Start Time =
Status = Completed
Current Counter = 6751
Max Counter = 6751
Completion = 0
End Time = 12/20/2005 11:33:08.599186
Reorg attributes:
• Reclustering
• Reclaiming
• Inplace table reorg
• Allow write access
• Allow read access
• Allow no access
• No table truncation
• Recluster via index scan
• Reorg long field LOB data
• Reorg data only w/o LOBS
Reorg Status:
• Started/Resumed
• Paused
• Stopped
• Completed
• Truncate
Reorg progress = reorg_current_counter / reorg_max_counter * 100
Completion status:
• Success = 0
• Failure = -1
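The progress formula on this slide (reorg_current_counter / reorg_max_counter * 100) is easy to apply to captured snapshot text in a monitoring script. The sketch below assumes the snapshot output has already been saved as a string and uses the "Current Counter" and "Max Counter" labels as they appear on this slide; the function name is hypothetical and nothing here calls DB2.

```python
import re


def reorg_progress(snapshot_text):
    """Return reorg progress in percent from table snapshot text,
    or None if the counters are missing or the maximum is zero."""

    def counter(label):
        match = re.search(rf"{label}\s*=\s*(\d+)", snapshot_text)
        return int(match.group(1)) if match else None

    current = counter("Current Counter")
    maximum = counter("Max Counter")
    if current is None or not maximum:
        return None
    return current / maximum * 100


sample = """
Status = Completed
Current Counter = 6751
Max Counter = 6751
Completion = 0
"""
print(reorg_progress(sample))
```

Returning `None` rather than 0 when the counters are absent lets a monitoring script tell "no reorg information in this snapshot" apart from "reorg has just started".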
Monitoring Index Reorg
Application snapshot elements:
- Rows read – DB2 has to read ALL rows
- Rows written – Overflow rows to system temporary space (like a sort spill)
Monitoring Runstats
Application snapshot elements:
- Rows read
- Rows written
- Bufferpool data/index logical reads
- Bufferpool data/index physical reads
Application id = 1182
Application status = UOW Waiting
Snapshot timestamp = 03/01/2006 11:04:20.028821
Buffer pool data logical reads = 6163820
Buffer pool data physical reads = 381
Buffer pool temporary data logical reads = 0
Buffer pool temporary data physical reads = 0
Buffer pool data writes = 0
Buffer pool index logical reads = 770705
Buffer pool index physical reads = 53
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool index writes = 0
Direct reads = 348
Direct writes = 348
Direct reads elapsed time (ms) = 2
Rows deleted = 0
Rows inserted = 0
Rows updated = 0
Rows selected = 0
Rows read = 45519430
Rows written = 10808
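The buffer pool counters in this snapshot can be turned into hit ratios. The sketch below assumes the commonly used formula (1 - physical reads / logical reads) * 100; the function name hit_ratio is hypothetical and the formula is not stated in the slides themselves.

```python
def hit_ratio(logical_reads: int, physical_reads: int) -> float:
    """Percentage of logical reads that did not need a physical read.

    Assumed formula: (1 - physical / logical) * 100.
    """
    if logical_reads == 0:
        return 0.0  # no read activity yet, nothing to measure
    return (1 - physical_reads / logical_reads) * 100


# Values copied from the application snapshot above.
data_ratio = hit_ratio(6163820, 381)
index_ratio = hit_ratio(770705, 53)
print(f"data hit ratio:  {data_ratio:.2f}%")
print(f"index hit ratio: {index_ratio:.2f}%")
```

A high ratio here suggests that runstats is reading mostly from pages already in the buffer pool rather than driving extra physical I/O.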
Locking/Concurrency – Table reorg
Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 560
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/05/2006 13:31:15.638064
reorg table db2inst1.employee inplace allow write access start
TABLE NAME | TYPE | MODE | LOCK COUNT | HOLD COUNT | CURRENT | APP_WORKING | APP_BLOCKING
-----------------| ----------------- | ------ | ---------- | ---------- | --------| ------------ | --------
- | Tablespace | IX | 2 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Row | IN | 1 | 0 | 553 | 949 |
DB2INST1.EMPLOYEE| Row | NS | 1 | 0 | 2 | 949 |
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Table | IS | 1 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IN | 1 | 1 | 2 | 949 |
The lock that is missing here is the Share (S) lock taken at the time of table truncation. If you take
frequent lock snapshots while REORG is waiting on locks, the lock that it is waiting to acquire will be
shown in “Conversion”: the IS lock is being upgraded to an S lock.
Locking/Concurrency – Index reorg
Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 6
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/16/2006 11:43:28.665147
db2 reorg indexes all for table db2inst1.employee allow write access
TABLE NAME | TYPE | MODE | LOCK COUNT | HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLK
-----------------| ----------------- | ------ | ---------- | ---------- | ------------| ------------ | -------
SYSIBM.SYSTABLES | Row | NS | 1 | 0 | 1 | 870 |
- | Tablespace | IX | 1 | 0 | 2 | 870 |
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 0 | 1 | 870 |
SYSIBM.SYSTABLES | Table | IS | 1 | 0 | 1 | 870 |
DB2INST1.EMPLOYEE| Table | IN | 1 | 0 | 1 | 870 |
The above is formatted output from a lock snapshot. Different locks are acquired during reorg, and
not all of them are shown above. The lock conversion occurs before the final switch-over phase
starts. Note that the index reorg will quiesce the object and escalate to a Z lock to switch
objects.
Locking/Concurrency - Runstats
Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 3
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/05/2006 12:55:43.977863
runstats on table db2inst1.employee with distribution and detailed indexes all allow write access
APP.NAME APP.USER HANDLE APP.ID APP.STATUS LOCKS WAIT.ms BLOCKING.ID
-------------------- -------- ------ ------------------------------ ----------------- ------- ------- -----------
db2bp db2inst1 62 *LOCAL.db2inst1.060104183406 UOW Executing 3 0
TABLE NAME | TYPE | MODE| LOCK COUNT | HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLOCKING
------------------| ----------------- | ----| ---------- | ---------- | ------------| ------------ | ------------
DB2INST1.EMPLOYEE | Table | IN | 2 | 0 | 1 | 62 |
SYSIBM.SYSTABLES | Row | U | 1 | 0 | 1 | 62 |
SYSIBM.SYSTABLES | Table | IX | 1 | 0 | 1 | 62 |
Throttling Runstats
db2 runstats on table db2inst1.employee with distribution and detailed indexes all allow write access
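The command above carries no throttling clause. As a hedged sketch, the Python helper below appends the RUNSTATS UTIL_IMPACT_PRIORITY clause to the same command. The helper name throttled_runstats and the default priority of 10 are assumptions for illustration, and the clause is understood to take effect only when the UTIL_IMPACT_LIM database manager parameter is set below 100; check the command reference for your release.

```python
def throttled_runstats(table: str, priority: int = 10) -> str:
    """Build a RUNSTATS command string with a throttling priority.

    Assumption: priority is in 1..100, where a higher value throttles the
    utility less.
    """
    if not 1 <= priority <= 100:
        raise ValueError("priority must be between 1 and 100")
    return (
        f"runstats on table {table} "
        "with distribution and detailed indexes all "
        f"allow write access util_impact_priority {priority}"
    )


# Prints the command text only; nothing is sent to a database.
print(throttled_runstats("db2inst1.employee"))
```

The string can then be passed to the db2 command line processor by whatever scheduling script runs the maintenance.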
Agenda
• Utilities for performing online maintenance
• Designing online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
Tweaks & Considerations
• General:
• Stagger big tables and run in parallel
• Online reorg may not benefit tables with high transaction volume
• Performance is not affected by fragmentation
• DMS highwater mark does not reset after reorg or runstats
• Use SYSPROC for monitoring and storing snapshot data for analysis
• Do NOT compete with batch jobs; the shorter the transactions, the faster the reorg
• Table reorg:
• Use NOTRUNCATE TABLE
• Without index is the fastest
• Consider using clustered indexes
• Index reorg:
• CLEANUP after the mass delete
• CLEANUP ONLY ALL to avoid Z locks
• Runstats:
• Sample statistics
• Column-group statistics (CGS) for group cardinalities
Consider running multiple table reorgs in parallel. This can be scripted to start “n” table reorgs in
parallel initially and then add more based on system activity.
When running table reorgs in parallel, avoid running big tables at the same time. Stagger them to
sustain optimal system and database performance.
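One way to script the staggering described above is sketched below in Python. It is only an illustration: the function name stagger, the page-count threshold for a "big" table, and all table names other than EMPLOYEE are hypothetical, and it does not model adding reorgs based on live system activity.

```python
def stagger(tables: dict, n: int, big_pages: int) -> list:
    """Group tables into batches of at most n, with at most one big table per batch.

    tables maps table name -> data object pages; big_pages is the (assumed)
    threshold at or above which a table counts as big.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    big = sorted((t for t in tables if tables[t] >= big_pages),
                 key=tables.get, reverse=True)
    small = sorted((t for t in tables if tables[t] < big_pages),
                   key=tables.get, reverse=True)
    batches = []
    while big or small:
        batch = []
        if big:
            batch.append(big.pop(0))        # never two big tables together
        while small and len(batch) < n:
            batch.append(small.pop(0))      # fill the rest with smaller tables
        batches.append(batch)
    return batches


# Hypothetical page counts; only EMPLOYEE comes from the snapshots in this deck.
pages = {"ORDERS": 50000, "ITEMS": 40000, "EMPLOYEE": 6752, "DEPT": 120, "REGION": 10}
for i, batch in enumerate(stagger(pages, n=2, big_pages=10000), start=1):
    print(f"batch {i}: {batch}")
```

Each batch would be started together, with the next batch launched only after the previous one completes or system activity allows.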
If the table is constantly changing, then online reorg may not be able to keep up with the changes to
the table.
Prefer running table reorg without an index for better performance: an index-based table reorg is
slower than a table reorg that specifies no index. Note that if the table has a clustered unique index,
online reorg will by default pick that index for ordering data pages.
Consider using clustered indexes. They carry an insert penalty but reduce the need for reorg. If the
table does have to be reorganized, however, the reorg will be slower because it has to go through the
index to order the data.
Tweaks & Considerations
• Reorg and Backup:
• Internal B Locks can cause reorgs to run slower during online backup
• Online table reorg clean up phase cannot start during online backup
• Online index reorg and online backup have the following compatibility
issues:
• In SMS mode, they are not compatible
• In DMS mode, they are compatible with recovery issues related to lifeLSN. In
addition, online backup will wait because of Z lock during switch-over phase
Consider index reorg using CLEANUP ONLY ALL option to avoid Z locks.
Consider using runstats with sampled statistics.
Consider collecting column-group statistics (CGS), which is better than adding indexes to satisfy
queries and helps cardinality calculations by capturing the correlation between columns.
Performance is not affected by fragmentation, but it is affected by the order of the data. Therefore,
with the help of “last known good statistics”, you could get away without a table reorg if space is not
an issue. Remember, DB2 will try to fit a new row into an existing page using information from the
SMP/EMP extents; the DB2MAXFSCRSEARCH registry variable limits how many EMP extents
are checked for extents with empty pages before a new extent is created.
CLOBs have to be reorganized for both non-MDC and MDC tables. The actual CLOB space usage
can be obtained online using the INSPECT command, which gives an idea of how many pages
would be freed by a CLOB reorg.
Some configuration parameters that could help speed up reorg are:
- DB2_USE_ALTERNATE_PAGE_CLEANING to avoid ADM1822W errors during online table reorg.
- Larger UTILHEAPSZ to avoid ADM9500W while running online index reorg.
- Bigger bufferpool for the system temporary tablespace.
- Enough I/O cleaners and servers to handle the reorgs.
Note: Other cfg parameters, such as sortheap, intra_parallel and max_querydegree, can also speed
up reorg, but they don’t suit OLTP environments.
Large objects (LOBs) Considerations
Reorg table with LONGLOBDATA option to reclaim wasted lob space
Deleting data from LOBS does not decrease SMS tablespace size
DB2 allocates space for LOB objects in 256 KB blocks rather than in units of the tablespace page
size. These chunks of space are called “buddy spaces”. The LOB manager keeps track of buddy
spaces using a special page called the “allocation page”. LOBs have to be reorganized using the
LONGLOBDATA option in offline mode.
List tablespace show detail does not show “real” usage.
The SQL function length(lob_column) cannot be used to calculate “real” LOB space usage.
After deleting data from the table and running a REORG, no space is reclaimed; the size of the table
space did not decrease. When an SMS tablespace is used a REORG is not going to reclaim space. An
SMS table space is defined as a directory. The directory uses the space available on the underlying
file system. When a row is inserted and subsequently deleted in an SMS table space, the space is
reused. The data for an SMS table space is always contiguous and there are never any empty pages in
the table space. The files in the SMS directory will not change in size. The only way to make the
LOB file decrease in size in an SMS table space is to do an EXPORT/LOAD of the data in the table.
When you LOAD data into an empty table the data is placed in a contiguous fashion and there should
be no free space. Here are your steps for doing the EXPORT/LOAD:
1. EXPORT the data
2. Drop the table
3. Re-create the table from its DDL
4. LOAD the data
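The EXPORT/LOAD steps above can be sketched as a command plan. This Python helper only builds the command strings and runs nothing. It rests on several assumptions that are not in the original notes: the DDL is captured with db2look before the table is dropped, the data is exported in IXF format with LOBs written to separate files, and the function name, paths and file names are hypothetical. Verify each command against the command reference for your release before use.

```python
def export_load_plan(db: str, schema: str, table: str, workdir: str) -> list:
    """Return the ordered commands for an EXPORT / drop / re-create / LOAD cycle."""
    fq_name = f"{schema}.{table}"
    data_file = f"{workdir}/{table}.ixf"   # hypothetical export file
    ddl_file = f"{workdir}/{table}.ddl"    # hypothetical DDL file
    lob_dir = f"{workdir}/lobs/"           # hypothetical LOB directory
    return [
        # Assumption: capture the DDL first, since it is lost once the table is dropped.
        f"db2look -d {db} -z {schema} -t {table} -e -o {ddl_file}",
        # Step 1: EXPORT the data
        f'db2 "export to {data_file} of ixf lobs to {lob_dir} '
        f'modified by lobsinfile select * from {fq_name}"',
        # Step 2: drop the table
        f'db2 "drop table {fq_name}"',
        # Step 3: re-create the table from the captured DDL
        f"db2 -tvf {ddl_file}",
        # Step 4: LOAD the data
        f'db2 "load from {data_file} of ixf lobs from {lob_dir} '
        f'modified by lobsinfile insert into {fq_name}"',
    ]


for command in export_load_plan("sample", "db2inst1", "employee", "/tmp/reorg"):
    print(command)
```

Because the table is unavailable between the drop and the end of the LOAD, this sequence belongs in a maintenance window rather than in the online path.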
Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce, or better, avoid reorg!
Reduce/Avoid REORG
• MDC – Multi Dimensional Clustering
• MDC tables behave just like normal tables except for its enhanced
clustering and processing aspects
“Multidimensional clustering is primarily intended for data warehousing and large database environments, and it can also
be used in online transaction processing (OLTP) environments.” – DB2 Information Center
Source: http://www-306.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=c0007201.htm
“Multidimensional clustering (MDC). MDC is new clustering technology that provides a method for automatic,
continuous clustering of data along multiple dimensions. And MDC tables don't require database maintenance operations,
such as reorganization. MDC primarily benefits data warehousing type queries and large database environments;
however, it can also be useful for transaction processing.”
Source: http://www.db2mag.com/shared/printArticle.jhtml?article=/db_area/archives/2002/q4/gunning.shtml&pub=db2mag
“MDC scheme lends itself to relatively efficient maintenance. Most utilities such as Reorg, Backup, Restore, and Runstats
have been modified appropriately for use with MDC tables.” Source: SIGMOD2003.
“If the dimensions and blocksizes are chosen appropriately, then clustering benefits will translate into significant
performance and maintenance advantages.” Source: SIGMOD2003
“MDC tables behave just like normal tables except for its enhanced clustering and processing aspects.”
Source: DBLP Project.
“Type-2 Indexes. Type-2 indexes improve performance by eliminating most next-key-share locks, as entries are marked
deleted instead of physically deleted from the page. Type-2 indexes are required for online load, online reorganization,
and MDC”
Source: http://www.db2mag.com/shared/printArticle.jhtml?article=/db_area/archives/2002/q4/gunning.shtml&pub=db2mag
Note: Any changes to the data layout should be balanced. Table and index structures can be designed to reduce
maintenance while still providing optimal insert/update/delete/select performance for online workloads.
Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce, or better, avoid running maintenance
• Asked and Answered!
Asked and Answered!
1. Is Index reorg any different than Table reorg with respect
to concurrency?
Asked and Answered!
4. If we DON'T use NOTRUNCATE option for inplace table
reorg, then we reclaim space but at the cost of
concurrency. Why does it behave this way?
Summary
• Table reorgs run in trickle mode! Not meant for speed!
• The goal is to run reorgs and runstats without impacting customers
• Understand the application before designing reorg
• Know your peak and off-peak hours
• Stagger reorgs and runstats
• Reorg without index and without table truncation provides high
concurrency
• Table reorg can be paused and restarted
• Index reorg rebuild requires Z lock following catch-up phase
• Know the SQL and collect required statistics
• Consider using statistical profiling
• Consider collecting Column Group Statistics (CGS)
• Monitor utilities and adjust accordingly
Acknowledgements
This presentation would not be complete without valuable inputs from the following
people and their work in the area of database maintenance and performance:
• Bishwaranjan Bhattacharjee
– MDC Team, IBM TJ Watson Research Center
• Danny Arnold
– Senior Product Manager, IBM Competitive Database Technologies
• Matt Huras
- Distinguished Engineer, Lead Kernel Architect, DB2 Development – IBM
Toronto Lab
• Paul Turpin
– Thread Chair IDUG 2006 NA, DBA - Wachovia
Session E02
Online database maintenance for 24x7 OLTP Systems
Vijay Sitaram
Vijay.Sitaram@gmail.com
DB2 performs!
Thank you
Additional Material
Appendix A – Answers to Questions
• Q1 Answer: Inplace table reorg was designed to have minimal impact on
concurrent apps. Inplace table reorg holds S row locks for very short periods of
time while it moves rows around in the table. Online index reorg is not inplace, it
uses a shadow object which is invisible to the users until we switch between the
original and the shadow object. This means that for an index reorg that specifies
ALLOW WRITE ACCESS, minimal locking will be done during the index build
phase but we will need to quiesce all activity on the indexes in order to switch the
original and rebuilt (shadow) objects. Inplace table reorg has no comparable need
for a quiesce, and therefore you'd see less of a concurrency issue with that
operation.
Appendix A – Answers to Questions
• Q3 Answer: Once a page is allocated to the index object, it's
never returned back to the tablespace until the index object is
reorganized or the last index is dropped. If you do a lot of deletes,
you could run reorg with the CLEANUP option more often to clean up
the pseudo-empty pages, so that the empty pages are reused sooner.
Appendix A – Answers to Questions
• Q5 Answer: To avoid this, one should NOT have long
running and uncommitted transactions (writers and
readers) that hold up the reorg index lock request. The
two lock requests are only made after reorg indexes
has done most of the work to rebuild indexes.
Appendix B – Monitoring reorg using sysproc
db2 " \
SELECT \
SUBSTR(LTRIM(RTRIM(TABLE_SCHEMA))||'.'||LTRIM(RTRIM(TABLE_NAME)),1,20) as TABLE_NAME, \
PAGE_REORGS, \
REORG_CURRENT_COUNTER, \
REORG_MAX_COUNTER, \
CASE REORG_TYPE \
WHEN 268500992 THEN 'Start inplace with write access' \
WHEN 268959744 THEN 'Resume inplace with write access' \
WHEN 536936448 THEN 'Start inplace with read access' \
ELSE 'Inplace Table reorg' \
END AS TYPE , \
CASE REORG_STATUS \
WHEN 1 THEN 'Started' \
WHEN 2 THEN 'Paused' \
WHEN 3 THEN 'Stopped' \
WHEN 4 THEN 'Completed' \
WHEN 5 THEN 'Truncate' \
ELSE NULL \
END AS STATUS, \
CASE REORG_COMPLETION \
WHEN 0 THEN 'Success' \
ELSE 'Failure' \
END AS COMPLETION, \
REORG_START, \
REORG_END, \
REORG_INDEX_ID, \
REORG_TBSPC_ID, \
PARTITION_NUMBER \
FROM \
TABLE ( SYSPROC.SNAPSHOT_TBREORG('SAMPLE',-1)) as REORG"
The table snapshot can be obtained through the SYSPROC.SNAPSHOT_TBREORG table function. The SQL shown above
queries that table function and provides all the details relevant to a table reorg:
1. Reorg progress
2. Type of reorg
3. Status of reorg
4. Completion of reorg
5. Start and End timestamp for reorg
6. Index used for reorg
7. Tablespace used for reorg
8. Partition currently running the reorg.
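If the snapshot rows are fetched into a script rather than formatted in SQL, the same decoding can be done client-side. The Python sketch below mirrors the CASE expressions in the query above; the function name describe_reorg and the dictionary-shaped row are assumptions for illustration, and the numeric codes are taken only from that query.

```python
# Code-to-text mappings copied from the CASE expressions in the Appendix B query.
REORG_TYPE = {
    268500992: "Start inplace with write access",
    268959744: "Resume inplace with write access",
    536936448: "Start inplace with read access",
}
REORG_STATUS = {1: "Started", 2: "Paused", 3: "Stopped", 4: "Completed", 5: "Truncate"}


def describe_reorg(row: dict) -> dict:
    """Decode one SNAPSHOT_TBREORG row (assumed to be a dict keyed by column name)."""
    return {
        "type": REORG_TYPE.get(row["REORG_TYPE"], "Inplace Table reorg"),
        "status": REORG_STATUS.get(row["REORG_STATUS"]),  # None for unknown codes
        "completion": "Success" if row["REORG_COMPLETION"] == 0 else "Failure",
    }


# Hypothetical row for a reorg that finished successfully.
sample_row = {"REORG_TYPE": 268500992, "REORG_STATUS": 4, "REORG_COMPLETION": 0}
print(describe_reorg(sample_row))
```

Keeping the mappings in one place makes it easier to store raw snapshot rows for later analysis and decode them only when reporting.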
Appendix – C: Locking/Concurrency – Table reorg
Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 1280
Applications currently connected = 2
Agents currently waiting on locks = 1
Snapshot timestamp = 01/05/2006 14:46:59.329012
reorg table db2inst1.employee inplace allow write access start
select count(*) from db2inst1.employee
TABLE NAME | TYPE | MODE| LOCK COUNT |HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLOCKING
-----------------| ----------------- | --- | ---------- | --------- | ----------- | ------------ | -----------
DB2INST1.EMPLOYEE| Row | IN | 1 | 0 | 1269 | 438 | 930
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IN | 1 | 1 | 2 | 438 | 930
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IX | 1 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 1 | 1 | 438 | 930
- | Tablespace | IX | 2 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Table | IS | 1 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Row | NS | 0 | 0 | 1 | 930 |
DB2INST1.EMPLOYEE| Table | IS | 2 | 0 | 1 | 930 |
- | Internal Variatio | S | 1 | 0 | 1 | 930 |
DB2INST1.EMPLOYEE| Inplace Reorg Loc | S | 1 | 1 | 1 | 930 |
DB2INST1.EMPLOYEE| Row | NS | 1 | 0 | 1 | 930 |
- | Internal Plan Loc | S | 1 | 0 | 1 | 930 |
The above is a formatted output of lock snapshot. Note that there are different locks acquired during
reorg and not all locks are shown above. Some of the interesting locks:
Inplace Reorg Lock – This ensures that no two applications can run reorg on same object. This is
meant for synchronization.
Internal Table Alter Lock – This ensures that the table DDL statements cannot be run during reorg.
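Formatted lock output like the table above can be parsed by a small script to spot which application is waiting on which. The Python sketch below is an illustration only: the function names are hypothetical, the sample is a shortened copy of the rows above, and it assumes that a non-empty APP_BLOCKING column names the application that the APP_WORKING application is waiting on.

```python
SAMPLE = """\
TABLE NAME | TYPE | MODE | LOCK COUNT | HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLOCKING
-----------------| ----------------- | --- | ---------- | --------- | ----------- | ------------ | -----------
DB2INST1.EMPLOYEE| Row | IN | 1 | 0 | 1269 | 438 | 930
DB2INST1.EMPLOYEE| Table | IS | 1 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Row | NS | 0 | 0 | 1 | 930 |
- | Internal Plan Loc | S | 1 | 0 | 1 | 930 |
"""


def parse_lock_rows(text: str) -> list:
    """Turn the pipe-delimited lock listing into a list of dicts keyed by header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        if all(cell and set(cell) == {"-"} for cell in cells):
            continue  # skip the dashed separator line
        rows.append(dict(zip(header, cells)))
    return rows


def blocked_pairs(rows: list) -> list:
    """Return sorted (working, blocking) application pairs where a blocker is listed."""
    return sorted({(r["APP_WORKING"], r["APP_BLOCKING"]) for r in rows if r["APP_BLOCKING"]})


print(blocked_pairs(parse_lock_rows(SAMPLE)))
```

Run against successive snapshots, a check like this shows whether the reorg or a concurrent query keeps turning up in the blocking column.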
Appendix D – Technotes from IBM
• Internal B Locks
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21153425&loc=en_US&cs=utf-8&lang=en
• Utilities compatible with ONLINE backup
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21214717&loc=en_US&cs=utf-8&lang=en
• What can run during online backup?
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=inspect&uid=swg21214717&loc=en_US&cs=utf-8&lang=en
• Estimating & viewing space usage for large objects (LOBs)
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=inspect&uid=swg21218554&loc=en_US&cs=utf-8&lang=en
• Tablespace size does not change after deleting data, then running reorg
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21204819&loc=en_US&cs=utf-8&lang=en
• Highwater mark is the list tablespace show detail not RESET after REORG or RUNSTATS
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21007334&loc=en_US&cs=utf-8&lang=en
• Introduction to SQL administrative routines SYSPROC
• http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0505melnyk/
• Automatic statistics profiling and updates to the warehouse feedback tables
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21216981&loc=en_US&cs=utf-8&lang=en
Appendix E – Reduce/Avoid reorg
Appendix E: Reduce/Avoid reorg
Consider the following for MDC: