
Session: E02

Online Database Maintenance for 24x7 OLTP Systems

Vijay Sitaram
Per-Se Technologies

Monday, May 8, 2006 • 01:00 p.m. – 02:10 p.m.

Platform: DB2 for Linux, Unix, Windows

In today’s world, DBAs are challenged to support OLTP environments with little or no downtime, and to do so without sacrificing performance. DB2 UDB has kept up with industry challenges and gives the DBA the ammunition to perform maintenance tasks online, unobtrusively, in the background.

We start with the assumption that the need for reorg has been identified with the help of:
1. Snapshot and event monitors that show page reorgs and overflows.
2. Query explain output that shows extra I/O being performed. The access plan has changed: the query has picked the wrong index, or it is using a hash join instead of a nested-loop join. This is typical when the optimizer does not have current information on the data layout through statistics.
3. Reorgchk recommending that the table needs a reorg (this is accurate as long as the statistics are current).

Having set the stage, we now dive into the presentation, where we pick apart the two commands, reorg and runstats, and explore how to run them most efficiently with the least service impact to customers.

About Per-Se Technologies
Your Health Is The Bottom Line
At Per-Se Technologies, our business is delivering Connective Healthcare solutions to help ensure
the financial success of physicians, pharmacies, hospitals and healthcare organizations. A long-time
leader in business services and information technology solutions for the healthcare industry, Per-Se
understands that rising costs of treatment and rampant reimbursement inefficiencies diminish the
ability of providers to stay competitive, run profitable businesses and deliver optimal patient care. Our
Connective Healthcare solutions can help.
Connective Healthcare is a comprehensive class of business solutions that help providers achieve
their income potential by streamlining and simplifying the complex administrative burden of providing
healthcare. Innovative technology, integrated networks, data mining, provider training and extensive
healthcare expertise enable us to accelerate the movement of funds to benefit our clients.

Providers who implement Connective Healthcare solutions will benefit from decreased expenses
through Per-Se’s resource management solutions and increased revenue through our revenue cycle
management solutions. Our solutions work to streamline cash flow, increase collections and
reimbursements, ensure compliance and improve customer service. The result is an approach that
empowers physicians, pharmacies, hospitals and healthcare organizations to collect the
compensation they deserve so they can focus on the important services they provide.

Our vision is: To enable every provider to achieve their income potential.
Visit us at: www.per-se.com

Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce, or better, avoid reorg!
• Asked and Answered!

Questions at the end of presentation. Thank you!



Outline:
a. Reorg table & index with examples
b. Runstats with examples
c. Nature and benefits of online reorg
d. Nature and benefits of online runstats

Design:
a. Plan before design
b. Designing online maintenance
c. Phases of online reorg and runstats
d. Internals
e. Locks and logs
f. Parallel reorgs and runstats

What to expect and avoid:
a. Logging
b. Recovery
c. Locking
d. Performance
e. Errors and warnings

Monitoring REORG and RUNSTATS:
a. Monitor using the snapshot command
b. Monitor using db2 command line utilities to record progress
c. Monitor system performance using AIX utilities

Tweaks and Considerations:
a. Options for better reorg and runstats performance

Reduce, or better, avoid table and index maintenance:
a. Using MDC to avoid reorgs
b. Using parameters to reduce the frequency of reorg

Asked and Answered:
This section has some interesting questions with answers.

Agenda

• Utilities for online maintenance

Utilities for performing online maintenance
REORG command:
• Table
  - Reclaim space
  - Recluster data
• Index
  - Reclaim space
  - Rebuild index

REORG {TABLE table-name Table-Clause | INDEXES ALL FOR TABLE table-name Index-Clause}
Table-Clause:
[INDEX index-name] [INPLACE [ [ALLOW {WRITE | READ} ACCESS] [NOTRUNCATE TABLE] [START | RESUME] | {STOP | PAUSE} ]]
Index-Clause:
[ALLOW {READ | NO | WRITE} ACCESS] [{CLEANUP ONLY [ALL | PAGES] | CONVERT}]

Examples:
db2 REORG TABLE db2inst1.employee INDEX db2inst1.emp_IX1 INPLACE ALLOW WRITE ACCESS NOTRUNCATE TABLE START
db2 REORG INDEXES ALL FOR TABLE db2inst1.employee ALLOW WRITE ACCESS CONVERT

The reorg command now supports new options which, used in combination, can address most of the scenarios and challenges posed to the DBA. The part of the command that’s highlighted here is the focus of attention.
The INPLACE option runs the table REORG in online mode, and one may choose ALLOW WRITE ACCESS or ALLOW READ ACCESS while running an inplace reorg. The operation is asynchronous: the command is submitted as a background task to DB2. It can be run on regular tables only; MDC tables allow offline reorg only. Index reorg differs from table reorg in many ways, one of them being that it is a synchronous operation and gives a lot more flexibility. More on this is discussed during the course of the presentation.

Table-clause: This is meant for reorganizing tables.
1. INDEX index-name – An index can be specified to order the data. This index can be a clustered or a non-clustered index. However, if a clustered index exists on the table, that index is chosen to order the data. If an index other than the clustered index is specified for ordering data (when the table has a clustered index), the reorg command fails.
2. NOTRUNCATE TABLE – This option comes in handy when the goal is ONLY to recluster data. The DBA can skip reclaiming space during the reorg by NOT truncating the empty extents at the tail end of the table. This option provides more concurrency and is discussed later in the presentation.
3. START | RESUME | STOP | PAUSE – These options allow the DBA to control the reorg process based on system activity. They are especially handy for pausing the reorg when the system is under high workload, or for stopping it to make way for batch processing. These options are discussed further later in the presentation.

Index-clause: This is meant for reorganizing indexes only.
1. CLEANUP options – These options can be used just to clean up the index, without reclaiming empty extents allocated to the index.
2. CONVERT – This converts type-1 indexes to type-2 indexes if the database has V7-style indexes. Type-2 indexes reduce next-key (NKS) locking and provide better concurrency. Use db2dart (offline) or inspect (online) to determine the type of index. Online index reorg can be run only on type-2 indexes.
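
The START, PAUSE and RESUME options described above lend themselves to scripting. The following is a minimal sketch, not a definitive procedure: it assumes the generated statements are saved to a CLP script and run with db2 -tvf against an already connected database. The helper name reorg_stmt is hypothetical, the table and index names are the ones used on the slide, and the sketch assumes that RESUME is given the same options as the original START.

```shell
#!/bin/sh
# Sketch: emit CLP statements that control an inplace (online) table reorg.
# reorg_stmt is a hypothetical helper; it only prints statements, it does not run them.
reorg_stmt() {
    table=$1; action=$2; index=${3:-}
    case $action in
        START|RESUME)
            # START and RESUME carry the access mode; the INDEX clause is optional.
            printf 'REORG TABLE %s%s INPLACE ALLOW WRITE ACCESS %s;\n' \
                "$table" "${index:+ INDEX $index}" "$action"
            ;;
        PAUSE|STOP)
            # PAUSE and STOP take no further options.
            printf 'REORG TABLE %s INPLACE %s;\n' "$table" "$action"
            ;;
        *)
            echo "unknown action: $action" >&2
            return 1
            ;;
    esac
}

# Start before the quiet period, pause when the workload peaks, resume afterwards.
reorg_stmt db2inst1.employee START db2inst1.emp_IX1
reorg_stmt db2inst1.employee PAUSE
reorg_stmt db2inst1.employee RESUME db2inst1.emp_IX1
```

Redirecting the output to a file and reviewing it before running db2 -tvf on that file keeps a record of exactly which statements were submitted.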

Utilities for performing Online maintenance
RUNSTATS command:
• Column level statistics
• Group of columns (CGS)
• Distribution columns statistics
• Detailed index statistics
• Table Row and page level sampling
• Index sampling
• LIKE predicates
• NUM_FREQVALUES and NUM_QUANTILES at table and column level
• Statistics Profiling

The ways to collect statistics have improved in the past few years, thanks to the LEO project. DB2 can collect statistics automatically and store them for profiling and reuse at a later time. This also lets the optimizer learn from the statistics gathered and suggest what to collect and when to collect it. The whole system runs on proactive, feedback-based statistics collection.
The runstats command is now very flexible, granular, fast, easy and as effective as always. Runstats is a throttled utility: it can be throttled using the UTIL_IMPACT_PRIORITY option together with the UTIL_IMPACT_LIM database manager configuration parameter. The options are so many that they could be a topic of discussion on their own. For this presentation, we will look at:

Index-clause:
1. SAMPLED – Use sampling for collecting index key statistics. Instead of collecting based on all the keys, DB2 can sample a set of keys and derive the statistics.
2. DETAILED – This gathers detailed index statistics such as clusterfactor, page fetch pairs and prefetch statistics.
3. INDEX <index_name> vs. INDEXES ALL – Allows you to collect statistics on one index instead of all the indexes for the table.
Column-stats-clause:
1. ALL vs. KEY COLUMNS – Collect statistics on all table columns or just on the columns which are used by indexes (key columns).
2. Set of columns – You can also specify a set of columns (key or non-key) on which to collect statistics, with or without LIKE STATISTICS.
Distribution-clause:
1. DISTRIBUTION statistics – Collect histograms on table columns, which provide more detailed information on column values.
2. ALL vs. KEY COLUMNS – Collect distribution statistics on all columns or just the key columns. For each of these columns, you may also specify NUM_QUANTILES and NUM_FREQVALUES.

Table-sampling-options: Row-level sampling & page-level sampling

Profile-options: Create a new profile & update an existing profile

Runstats by example
Column level statistics:
RUNSTATS ON TABLE db2inst1.employee
ON KEY COLUMNS AND COLUMNS (empno, empname)

Index Statistics:
RUNSTATS ON TABLE db2inst1.employee
FOR INDEXES db2admin.INX1, db2admin.INX2, db2admin.INX3

Distribution statistics:
RUNSTATS ON TABLE db2inst1.employee
WITH DISTRIBUTION DEFAULT NUM_FREQVALUES 40

Using CGS:
RUNSTATS ON TABLE db2inst1.employee
ON COLUMNS ((empno, empname), mgrno, (admrdept, location))

Using LIKE statistics:
RUNSTATS ON TABLE db2inst1.employee
ON ALL COLUMNS AND COLUMNS (empname LIKE STATISTICS)

Runstats by example
Statistics profile:
RUNSTATS ON TABLE db2inst1.employee AND INDEXES ALL SET PROFILE ONLY

RUNSTATS ON TABLE db2inst1.employee USE PROFILE

Sampling:
Table → Row-level:
RUNSTATS ON TABLE db2admin.department WITH DISTRIBUTION
TABLESAMPLE BERNOULLI (10) REPEATABLE (1024)

Table → Page-level:
RUNSTATS ON TABLE db2admin.department AND INDEXES ALL
TABLESAMPLE SYSTEM (10)

Index → Key-level:
RUNSTATS ON TABLE db2admin.department WITH DISTRIBUTION
ON KEY COLUMNS AND SAMPLED DETAILED INDEXES ALL

Agenda

• Utilities for online maintenance
• Design online maintenance

Planning for Online maintenance
The elements:
• Hardware
  - Is the disk always running at full capacity?
  - Is there CPU power left on the system?
  - Is there an archive target like TSM to manage logs?
• User
  - Can one identify the online workload for the day, week and month?
• DBA
  - What’s the best day to run it? Should one run it all at once or spread it out?
  - Is it scripted with all the options based on the scenario?
  - Can it be monitored, with email/page alerts to the DBA?
  - Can the process be controlled without disrupting the customer?

• Before designing online reorg, consider the following:
  - CUSTOMER FIRST – Is it the right fit?
  - Set goals and expectations
  - Find a window
  - Never compete with batch jobs. The shorter the transaction, the faster the reorg!
  - Eliminate reorg using MDC (Multi-Dimensional Clustering) tables

Before designing maintenance, consider the following requirements:
1. Hardware requirements: Study whether the current hardware is capable of running the online maintenance workload. Areas to examine include the disk subsystem, CPU power, and offline storage for logs.
2. User requirements: With the help of the end users and the application group, the DBA can understand the application and its pattern of usage. This helps in identifying the peak and off-peak hours for the application. The DBA can then run online maintenance during hours of low to average user load on the system.
3. DBA requirements: The DBA can study the application and its usage, and based on that come up with a maintenance strategy and scripts to run it. The DBA also needs monitoring and control mechanisms in place while running online maintenance.

Before starting anything, it helps to determine whether online maintenance is the best fit for the application and its workload. Online maintenance runs in trickle mode and can therefore stretch for hours before completion. During this time the DBA needs to ensure that user application performance is not affected and that the system has enough cycles to process the maintenance request. Find a window to run maintenance by discussing it with the end users. Running maintenance and batch workload on the same table is not recommended, because it invalidates the work done and/or delays the maintenance process. Try to eliminate the need for maintenance, especially reorgs, by using MDC (Multi-Dimensional Clustering) tables. MDC provides more benefits in addition to eliminating maintenance.

Online table reorg
• Nature of ONLINE table reorg:
  - It runs in trickle mode and is not meant for speed
  - It’s just like any other transaction, acquiring locks and generating logs
  - It’s a background process and plays well with the online user workload
  - Currently reorg is not one of the throttled utilities
  - It’s an I/O-bound operation
  - Multiple online reorgs can run in parallel

• Benefits of ONLINE table reorg:
  - Immediate and incremental clustering
  - Minimal extra storage is required
  - The process can be written to avoid strong locks
  - Eliminates overflow records
  - Minimal impact on system and online workload performance
  - Can be monitored and controlled online

Does INPLACE reorg cause extra logging?
Answer: Yes. Inplace reorg causes a significant amount of logging to be performed. The logging can be especially high if there are many indexes on the table.

One of the questions everyone asks is “Why is it slow?” So let’s set the record straight: ONLINE reorg is NOT meant for speed. It runs in trickle mode in the background without disrupting service. It behaves like a user transaction: it holds locks on tables and writes to the transaction logs to make itself recoverable in case of failure. Multiple reorgs can be run in parallel; how many depends greatly on the system and its available resources. The online reorg can be seen as “db2reorg” in the output of the list applications command. The reorg process is submitted and runs in background mode. Currently it does not set a priority and runs like any other user connection. Also, because online reorg has to adapt to system usage, its completion time cannot be predicted; it differs for each run.

Because data pages are moved physically in place, without requiring extra space in the tablespace where the table resides, online reorg takes longer than offline reorg to finish, but it has its own advantages over offline reorg. The data pages are moved in batches, providing immediate clustering benefits. Online reorg does not need the tablespace to be twice the size of the table; it does all the work within the space already allocated to the table, with no additional space made available to it, which makes it very space efficient. Online reorg has execution phases, and at the end of each phase it needs to acquire locks on the table. As we go through this presentation, we will learn more about the locks involved and tips to avoid them. The online table reorg is like an application connected to the database. Therefore, one may use snapshot and event monitors to gather data on its status and progress.

Online table reorg – Reclustering


Online reorg can be run in different ways to benefit the application. In this slide, we discuss reclustering with online reorg. If the requirement is to reorg the table based on a specific index, so that an improved index clusterratio/clusterfactor helps application performance, the table reorg can be made to use a specified index, provided the table does not have a clustered index built on it. Should the table have a clustered index, any reorg performed on the table is based on the clustered index by default.

The slide is a classic, simple representation of data page movement based on an index key. White stands for unused data pages and green stands for data pages with data in them. The online reorg in this case is moving data pages from left to right. This is called the “vacate and fill” phase of online table reorg. During the vacate phase, pages are moved to adjacent extents in batches. During the fill phase, the pages are brought back, in cluster order, into the pages vacated in the vacate phase. This repeats until the whole table is reorganized in index order. If the reorg truncates the table, the free space reclaimed from the reorged table is returned to the tablespace and can be used by any other table.

The reclustering occurs in a bi-directional mode, as follows:
1. Move data from left to right (vacate phase)
2. Move data from right to left (fill phase)
The above is repeated until there are no more pages to vacate, as they are all packed in index key order.

Online table reorg – Reclaim space

Using NOTRUNCATE for table reorg does not reclaim space. Why is the empty fragmented space not reused?
Answer: If NOTRUNCATE is used, free space within the table being reorged is not returned to the tablespace.

This is a classic representation of an online table reorg written to reclaim space only, without using an index to recluster/order data.

Here, blue represents data and the white pockets represent empty spaces within the data pages. During reorg, the operation runs from right to left, moving rows into the leftmost extents/pages that have empty pockets. This is called the vacate and fill phase, and the reorg utility repeats this cycle in batches until all empty pockets are filled with rows from the rightmost extents. Notice that the vacate and fill phase here is uni-directional and applies a first-fit algorithm to fill empty pages, moving from the rightmost to the leftmost extent. Towards the end of the reorg, the empty extents at the tail end of the table are truncated and the space allocated to the table is released back to the tablespace.

Online table reorg – Progress chart


This ties together everything that has been discussed so far about online table reorg. The slide shows how the reorg process goes through phases before and after the vacate and fill phase. There are some milestones/commit points for the online reorg process:
1. Drain existing scanners while allowing new scanners to start.
2. Move rows in batches (vacate and fill phase).
3. Truncate empty space at the tail end of the table (if requested): acquire a Shared table lock, drain existing scanners while allowing new scanners to start, truncate the table, and finally commit.

Also note that incremental clustering is provided as the reorg continues to make progress through the vacate and fill cycles.

Online index reorg
• Nature of online index reorg:
  - Synchronous operation
  - Does not support throttling
  - Allows transactions to update the index during reorg
  - Only type-2 indexes are supported
  - In DMS mode, can run in most cases with online backup
  - Writes to transaction logs

• Benefits of online index reorg:
  - Online index cleanup
  - More concurrency compared to type-1 indexes
  - Can run with write access, providing maximum flexibility

REORG for INDEXES is synchronous in nature, unlike table reorg. Index reorg does not support throttling. During index reorg, transactions can update the index while the reorg is in progress, until the switch-over phase. To support online index reorganization, all type-1 indexes must be converted to type-2 indexes using the CONVERT option of the reorg command (if migrated from a previous version), or dropped and recreated as type-2 indexes in V8. If the index resides in a DMS tablespace, the online index reorg can make the backup wait on a lock because of the switch-over phase, which is NOT recommended. Depending on the size of the indexes, the index reorg can generate a substantial amount of transaction logs for recovery purposes, which also supports an HADR solution. In SMS mode, online index reorganization and online backup are incompatible in the same way that online index create and online backup are incompatible. In DMS mode, online index reorganization and online backup can run concurrently in most cases. The only case where they are not compatible is the same case as for online index create and online backup. In addition, online index reorganization quiesces the table before the switch phase and gets a Z lock, which prevents an online backup.

Online index reorg – Progress chart
Online index reorg:
1. Rebuild Phase
2. Log catch-up phase
3. Switch-Over phase


This slide explains the phases of index reorg in detail. The index reorg consists of 3 major phases:
1. Rebuild phase
2. Log catch-up phase
3. Switch-over phase

In the first phase, the index keys are rebuilt. The reorg creates a copy of the index, called the “shadow”, and does all its work on this copy, letting existing scanners work on the original index. The keys are sorted and RIDs are assigned to the index during this phase. Concurrent read and write access is available on the index object throughout. These operations are captured in a special log in a memory buffer and are not reflected on the shadow object until the next phase.

In the second phase, while the existing read and write scanners keep working on the original index, the changes captured in the in-memory log are replayed on the shadow index so that it catches up with the original. This ensures that all transactions done during the rebuild phase are applied to the shadow index.

In the third phase, DB2 catches up on the log one more time. Once the log is caught up and the shadow index is consistent with the original index, a Z lock is acquired to switch indexes. During this switch, existing scanners are drained and new scanners are pointed to the shadow index object. The old (original) index object is dropped as part of cleanup.
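
The three phases above apply to a full index rebuild. The locking advice later in this presentation suggests using the index cleanup options to avoid the Z lock of the switch phase. The sketch below is an illustration under assumptions, not a prescribed method: it emits either form as a CLP statement, the helper name index_reorg_stmt is hypothetical, and the table name is the one from the earlier examples.

```shell
#!/bin/sh
# Sketch: emit an online index reorg statement for a CLP script.
# index_reorg_stmt is a hypothetical helper; it prints the statement only.
#   mode REBUILD -> full rebuild (rebuild, log catch-up, switch-over)
#   mode PAGES   -> CLEANUP ONLY PAGES (free committed pseudo-empty pages)
#   mode ALL     -> CLEANUP ONLY ALL (also remove committed pseudo-deleted keys)
index_reorg_stmt() {
    table=$1; mode=${2:-REBUILD}
    case $mode in
        REBUILD)   suffix='' ;;
        PAGES|ALL) suffix=" CLEANUP ONLY $mode" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
    printf 'REORG INDEXES ALL FOR TABLE %s ALLOW WRITE ACCESS%s;\n' "$table" "$suffix"
}

index_reorg_stmt db2inst1.employee REBUILD
index_reorg_stmt db2inst1.employee PAGES
```

A cleanup leaves the index where it is, so it trades a complete rebuild for less disruptive locking; which form is appropriate depends on how fragmented the index is.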

Online Runstats
• Nature of ONLINE RUNSTATS:
  - Exploits SMP parallelism
  - It can be throttled
  - CPU and memory intensive
  - Can be run in parallel for multiple tables

• Benefits of ONLINE RUNSTATS:
  - Provides optimizer with the most current statistics
  - First step towards troubleshooting
  - Perform what-if scenarios
  - Sampling for tables and indexes
  - Provides accurate data to utilities
  - CGS: eliminate creating indexes for correlation

Runstats has grown over the years and has many features and benefits. One of them is that it exploits SMP parallelism. It can be throttled using the UTIL_IMPACT_PRIORITY option in conjunction with the UTIL_IMPACT_LIM database manager configuration parameter. Throttling is required when the system is resource constrained. Runstats is CPU and memory intensive, given that it needs to calculate statistics that could affect application performance.

It aids the optimizer in making the correct decisions and producing efficient access plans. It can be used to perform what-if scenarios for troubleshooting query performance.

Runstats can be tuned to be CPU efficient and still provide accurate statistics by using page-level or row-level sampling on the table. Index statistics can also be sampled to speed up runstats.

Runstats provides accurate data to utilities like reorgchk and also populates the SYSSTAT and SYSCAT views. It aids in evaluating the multiple factors that decide whether a table is fragmented and subsequently needs a reorg.

With CGS (Column Group Statistics), one can avoid creating indexes for statistical information only. CGS helps with data correlation. A column group statistic is the number of distinct combinations of values for a set of columns (the cardinality of a group of columns). CGS does not assume independence and gives better joint selectivity.

DB2 now provides automated statistics collection (ASC). As a part of ASC, statistics for the table can be profiled and stored in a table. DB2 analyzes this and decides what statistics to gather and when. More on this is available from the LEO (LEarning Optimizer) project.

Online Runstats
On What to collect?
How much to collect?
When to collect?
Ways to collect?

Cost depends on cardinality
Cardinality can be inaccurate
Bad plan due to poor cardinality estimate
Cardinality assumptions:
- Currency, Uniformity, Independence

WHAT to collect: Runstats can be run on a table, on its indexes individually, or on both. This gives the flexibility to collect statistics on the table only, on all indexes or a specific index, or on both the table and its indexes. Say we just added a new index to the table: statistics can be collected for the new index alone, without having to run on the table and all of the existing indexes. It is good to keep the table and index statistics up to date and in sync to avoid query performance issues. The data collected from these options is stored in the sysstat.tables and sysstat.indexes views.

HOW MUCH to collect: To start with, it’s a tough decision! To collect statistics, one needs to know the data. Once one knows the data, one needs to determine how it’s being used by queries. Statistics collected on the table and related objects cannot be based on one query alone. Therefore, one needs to collect statistics that provide enough information to the optimizer. Here is where runstats stands out in its ability to drill down to the column and key level. One may choose to collect detailed index key statistics to provide more detail than just the highkey and lowkey values. One can also collect distribution statistics on the table columns, which provide more information on the data. We can now also store the runstats options in a statistics profile and just use the profile the next time.

WHEN to collect: The DBA can decide when to collect statistics based on the needs of the application and the availability of resources to perform such an operation. Or the DBA may choose to leave this to DB2 by turning on AUTOMATIC statistics collection.

WAYS to collect:
- Runstats command
- Load command
- Reorgchk using update statistics
- Create index with collect statistics (supported for Declared Global Temporary table indexes at time of creating the index)
- Automatic Maintenance using the Statistics Profile warehouse

Designing online runstats
Back up existing statistics
KEY COLUMN statistics satisfy most queries
Use CGS (Column Group Statistics)
Use throttling on resource constrained systems
Gather statistics while creating indexes
Consider statistical profiling
Consider using Automatic Runstats

Row level sampling vs. Page level sampling
REPEATABLE clause can provide consistent information
Don’t sample small tables

Before collecting fresh statistics, it is always advisable to back up the existing statistics. This helps in debugging “if” the new statistics affect query performance. Back up statistics daily, or before they change, using the db2look utility in mimic mode:
db2look -d SAMPLE -z DB2INST1 -t EMPLOYEE -m -p -xd -o employee.stats

Key column statistics do satisfy queries covered by indexes. But there are exceptions to the rule, so check whether statistics collected on all columns would help query performance through correlation and selectivity.

Use CGS and avoid creating multiple indexes just to provide statistical information. CGS improves optimizer estimates, leading to better access plans.

Sampling improves runstats performance and consumes less CPU. At the same time, don’t sample small tables. Runstats will not sample if the table has fewer than 5 pages. Sample related tables with a similar sampling rate to achieve the same level of accuracy. Sampling statistics with REPEATABLE provides consistent results across multiple invocations of runstats if the table changes infrequently.

Row-level sampling is fast, but I/O is not reduced:
- Every data page is read, as in a full table scan. This method is best suited for clustered tables, and the results are the most accurate.
Page-level sampling is also fast and saves on I/O:
- Only the sampled pages are prefetched. It may not provide accurate results if the data is clustered, even though runstats detects and corrects for clustering. But if the data in the table is distributed randomly, the P percent of pages picked for sampling can provide accurate results.

UTIL_IMPACT_PRIORITY can throttle runstats. This works only if UTIL_IMPACT_LIM is set to less than 100%. For
example setting UTIL_IMPACT_LIM=90 and UTIL_IMPACT_PRIORITY=50 would increase overall system usage by
45%.
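
A throttled runstats can be sketched as a short sequence of CLP statements. This is a minimal illustration under assumptions, not a tuning recommendation: the helper name throttled_runstats is hypothetical, the values 10 and 50 are placeholders, and the statements are meant to be saved to a script and run with db2 -tvf by a user with the authority to update the database manager configuration and with a connection to the database.

```shell
#!/bin/sh
# Sketch: emit CLP statements for a throttled RUNSTATS.
# throttled_runstats is a hypothetical helper; the numeric values are illustrative only.
throttled_runstats() {
    table=$1; limit=$2; priority=$3
    # UTIL_IMPACT_LIM is set at the instance (dbm cfg) level; 100 means no throttling.
    printf 'UPDATE DBM CFG USING UTIL_IMPACT_LIM %s;\n' "$limit"
    # The priority (1-100) is attached to this RUNSTATS invocation.
    printf 'RUNSTATS ON TABLE %s ON KEY COLUMNS AND INDEXES ALL UTIL_IMPACT_PRIORITY %s;\n' \
        "$table" "$priority"
    # Lists running utilities with their IDs and priorities.
    printf 'LIST UTILITIES SHOW DETAIL;\n'
}

throttled_runstats db2inst1.employee 10 50
```

While the utility is running, its priority can be changed with SET UTIL_IMPACT_PRIORITY FOR utility-id TO priority, using the ID reported by LIST UTILITIES.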

Agenda

• Utilities for performing online maintenance.
• Designing online maintenance.
• What to expect and avoid!

What to expect and avoid!
Logging
+ Reorg writes to transaction logs
+ Use DB2_USE_ALTERNATE_PAGE_CLEANING
+ Monitor log archives
+ Consider using infinite logging
- Avoid log full condition
- Reorg placed in “Paused State” when log is full

Can an online table reorg under prolonged “lock-wait” time cause a LOG FULL CONDITION?
Answer: Yes. Reorg is similar to any long-running transaction that is in lock-wait.

Logging:
Reorg for both table and index (as an option) writes to the transaction logs.
Use the DB2_USE_ALTERNATE_PAGE_CLEANING registry variable to automate page cleaning.
Monitor log archives and ensure the archiving keeps up with the rate at which logs are generated by reorg.
Consider using infinite logging.
Avoid the log full condition. Maintain enough space on the active log path and the archive log path.
If the active log fills up during reorg, the reorg is automatically placed in “Paused” state. The DBA has to resume the reorg manually, or stop and start it if required. Use the table snapshot to check the reorg status.
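
The checks described above can be collected into a small script. The sketch below is an assumption-laden illustration rather than a complete monitoring solution: the helper names are hypothetical, SAMPLE is the database name used in the db2look example, the statements are meant to be run with db2 -tvf, and the level of detail reported by the snapshots may depend on the monitor switch settings.

```shell
#!/bin/sh
# Sketch: emit CLP statements to check on an inplace reorg and on log usage.
# Both helpers are hypothetical; they print statements and do not run them.
reorg_log_checks() {
    db=$1
    # The table snapshot reports the reorg status (for example started or paused).
    printf 'GET SNAPSHOT FOR TABLES ON %s;\n' "$db"
    # The database snapshot reports log space used and the application
    # holding the oldest transaction.
    printf 'GET SNAPSHOT FOR DATABASE ON %s;\n' "$db"
}

# Optional: infinite logging. This assumes log archiving is already configured.
infinite_logging_stmt() {
    printf 'UPDATE DB CFG FOR %s USING LOGSECOND -1;\n' "$1"
}

reorg_log_checks SAMPLE
infinite_logging_stmt SAMPLE
```

Running the checks at a fixed interval and keeping the output gives a simple record of how quickly the reorg is consuming log space.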

What to expect and avoid!
Locking
+ Use NOTRUNCATE option
+ Use CLEANUP option
+ Watch for reorgs waiting on “Lock Conversion”
- Avoid reorg waiting on locks for extended duration
- Avoid running table and index reorg on same table at same time

Does the table reorg hold the log file as “active” if it is in “lock-wait” state?
Answer: Yes, just like any other transaction in lock-wait.

Locking:
Use reorg with the NOTRUNCATE TABLE option. This eliminates the “S” lock required during truncation.
Take advantage of the index cleanup options to avoid the replace (switch-over) phase, which requires a “Z” lock.
Watch for reorgs waiting on “Lock conversion”. Find the application holding the lock and take the necessary action.
Avoid prolonged reorg lock-wait. Stopping or pausing a reorg is an asynchronous operation, as is forcing the application that is running the online reorg, so the request takes effect only after a delay that grows with the time the reorg has been waiting on locks. Lock-waits left unattended may also cause lock chains, which is not a desirable scenario.
Avoid running table and index reorg together on the same table.
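
Finding the application that holds the lock can start from the lock snapshot. The following is a minimal sketch under assumptions: the helper name lock_wait_checks is hypothetical, the statements are meant to be run with db2 -tvf, and forcing an application is shown only as a last resort once the blocking application handle has been identified by the DBA.

```shell
#!/bin/sh
# Sketch: emit CLP statements to investigate a reorg stuck in lock-wait.
# lock_wait_checks is a hypothetical helper; it prints statements only.
lock_wait_checks() {
    db=$1; handle=${2:-}
    # Lock details in snapshots need the LOCK monitor switch.
    printf 'UPDATE MONITOR SWITCHES USING LOCK ON;\n'
    # Shows each application and its status, including lock-wait.
    printf 'LIST APPLICATIONS FOR DATABASE %s SHOW DETAIL;\n' "$db"
    printf 'GET SNAPSHOT FOR LOCKS ON %s;\n' "$db"
    # Last resort, emitted only when a blocking application handle is supplied.
    if [ -n "$handle" ]; then
        printf 'FORCE APPLICATION (%s);\n' "$handle"
    fi
}

lock_wait_checks SAMPLE
```

Passing the application handle as a second argument adds the FORCE APPLICATION statement; leaving it out keeps the script read-only.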

What to expect and avoid!

Recovery
+ Full backup at end of reorg required
+ Index reorgs are logged
+ History file contains reorg information

Throttling
+ Not supported for Table and Index reorg
+ Support for runstats

Is table reorg a low priority thread?
Answer: No. There is no priority assigned to an agent doing an inplace reorg, but the nature of the design is that in the majority of cases, the inplace reorg will let the scanners proceed.

Recovery:
Take a full backup of the database at end of reorg. This is to avoid using the logs generated during reorg to rollforward incase
of disaster.
Index reorgs are logged with the new db cfg “LOGINDEXREBUILD” and table reorg is logged by default. This has to do with
the design of HADR.
History file is not an accurate representation of the status of previous reorg. It does not always provide failure information,
especially for index reorgs.

Throttling:
Reorg for table and index does not support throttling.
Runstats can be throttled by using UTIL_IMPACT_PRIORITY together with UTIL_IMPACT_LIM.

What to expect and avoid!
Performance
+ Reorg without specifying an index is the fastest
+ Reorg can be paused and resumed
+ SQL is sensitive to order of data
+ Run reorgs in parallel
- Long running transactions impact reorg
+ Split large tables into smaller ones

Can a user application (who does not have REORG privileges) get "Internal
Reorg Lock" on a table even while it is NOT doing a reorg? The lock
snapshot shows user application holding "Internal Reorg Lock".
Answer: An application that is executing against the same table that is being
reorg’d will show a shared Internal Reorg Lock, as observed. This does not
mean that that particular application is performing a reorg, just that it is
executing at the same time as the reorg, and on the same table. This type of
lock is used as a synchronization mechanism for inplace reorg processing.


Performance:
Reorg without specifying an index packs data pages in the order presented to it. This is the fastest approach.
Reorg is easy to manage. Pause it if required and resume it soon enough that the benefits of clustering are not lost.
SQL is sensitive to the order of data. Therefore, it is common to see reorgs based on an index; if the table has a clustered
unique index, the reorg uses that index by default.
Allow write access.
Run reorgs in parallel and measure the impact. How many parallel reorgs can be executed without affecting the end user
varies from system to system.
A reorg on the same table run at different times does not run at the same speed. It depends on what else is running on the
system at that time.
Long running transactions can impact the progress of reorg. Pick a suitable time to do reorg. If no window is available, go with the
best available time.
Split big tables into smaller tables and use UNION ALL views (V9: use partitioned tables). If a table is so big that a portion
of it has already changed by the time the reorg is done, divide and conquer.
Reorg is not a throttled utility. However, Runstats can be throttled by giving it a UTIL_IMPACT_PRIORITY and setting
UTIL_IMPACT_LIM to less than 100%.

What to expect and avoid!
Errors/Warnings
- Index statistics inconsistent with table statistics – SQL0437W, rc=6
- Online table reorg not supported when using APPEND ON – SQL2219N, rc=12
- Online table reorg not supported for MDC tables – SQL0270N, rc=46
- Online index reorg fails due to space allocation – SQL0289N

Space allocation
+ Extra space not required for online table reorg
+ Space equal to size of index required for online index reorg
+ Runstats with distribution needs more space under system catalog

Can table reorg lock timeout an end-user application?


Answer: Yes, just like any other transaction.

Errors/Warnings:
Index statistics not consistent with table statistics - SQL0437W, reason code 6
INPLACE table reorg is not supported for tables in APPEND MODE ON - SQL2219N, reason code 12
INPLACE table reorg is not supported for MDC tables - SQL0270N, reason code 46
ONLINE index reorg will get SQL0289N error when run with ALLOW READ/WRITE ACCESS and there is not enough
space in the tablespace. If ALLOW NO ACCESS is used, the reorg for index would succeed with a Z lock throughout the
duration of index reorg, thus is not very desirable for OLTP applications.

Space allocation:
INPLACE table reorg does not need extra space and does all its work in batches within the space allocated in the tablespace
where the table resides.
Online index reorg creates a shadow index during the rebuild phase in the same tablespace as the index. It is important that
there is enough space in the index tablespace to hold a shadow copy of the index.
Runstats with distribution needs more space allocated to system catalog tablespace.

Agenda

• Utilities for performing online maintenance
• Designing online maintenance
• What to expect and avoid!
• Monitoring


Working with reorg online
• Reorg can be stopped, paused and resumed online. Reasons:

• Database running low on log space

• Application on “lock-wait” due to table reorg process

• Table reorg on “lock-wait” due to long running user process

• Top disk busy at 95% and is affecting application performance

• Table used by heavy batch jobs

Does ONLINE Table reorg issue commit? If it does issue commit,
what happens when it does not issue a commit due to "lock-wait"
status?
Answer: Yes, the inplace reorg issues commits after each phase of
processing. If the inplace reorg is in lock-wait, it will behave like
any other transaction that is in lock-wait that hasn't issued a
commit.

Working with reorg online
The DBA wishes to alter the table reorg:
For example, change the access from read only to write access and not truncate the table
to make it more concurrent with online user workload:
db2 reorg table db2inst1.employee inplace allow read access start
db2 reorg table db2inst1.employee inplace pause
db2 reorg table db2inst1.employee inplace notruncate table resume
Another example, change the table reorg to use an index with the same level of access:
db2 reorg table db2inst1.employee inplace allow write access start
db2 reorg table db2inst1.employee inplace pause
db2 reorg table db2inst1.employee index db2inst1.employee_idx1 inplace resume

Is there a relationship between table reorg in “lock-wait” status and
reorg holding active log files?
Answer: Yes. This is the same as any other application in lock-wait
that is holding active log files.

Monitoring Reorg & Runstats
Snapshot Monitors:
Database : Watch log usage, Oldest transaction application ID
Table : Full table and reorg information available
Application : Idle time, Sorts, Rows: read, selected and written, Log used
Lock: Detailed lock information and status

Utilities:
db2 list utilities
db2pd
AIX: nmon, vmstat and iostat

DB2 recovery history file


Administrative notification log and FFDC log for progress and errors

If reorg transactions are written to log files and these logs are later
used for rollforward, do the reorg log records get replayed?
Answer: Yes - inplace reorg is like any other transaction.

Database snapshot (watch log usage):
get snapshot for db on sample
select * from table(sysproc.snapshot_db('sample',-1)) as dbsnap

Table snapshot (full table and reorg information available):
get snapshot for tables on sample
select * from table(sysproc.snapshot_tbreorg('sample',-1)) as tbsnap

Application snapshot (idle time, hit ratios, sorts, rows read, selected and written, log space used):
get snapshot for application agentid <agentid_of_db2reorg_process>
select * from table(sysproc.snapshot_app_info('sample',-1)) as appsnap

Lock snapshot (detailed lock information and status):
get snapshot for locks for application agentid <agentid_of_db2reorg_process>
select * from table(sysproc.snapshot_lockinfo('sample',-1)) as locksnap

List utilities: db2 list utilities show detail

db2pd: db2pd -utilities

DB2 recovery history file:
db2 list history reorg all for db sample (or)
db2 list history reorg containing db2inst1.employee for db sample

Administrative notification log and FFDC log for progress and errors

AIX: nmon, vmstat and iostat

Monitoring Table reorg
$ db2 reorg table db2inst1.employee1 index idx1 inplace allow write access start
DB20000I The REORG command completed successfully.
DB21024I This command is asynchronous and may not be effective immediately

(Output of db2 get snapshot for tables on sample given below):
Table Schema = db2inst1
Table Name = EMPLOYEE1
Table Type = User
Data Object Pages = 6752
Index Object Pages = 819
Rows Read = 0
Rows Written = 0
Overflows = 0
Page Reorgs = 602
Table Reorg Information:
Reorg Type =
    Reclustering
    Inplace Table Reorg
    Allow Write Access
Reorg Index = 1
Reorg Tablespace = 2
Start Time = 12/20/2005 11:32:41.263978
Reorg Phase =
Max Phase =
Phase Start Time =
Status = Completed
Current Counter = 6751
Max Counter = 6751
Completion = 0
End Time = 12/20/2005 11:33:08.599186

Reorg attributes: Reclustering, Reclaiming, Inplace table reorg, Allow write access, Allow read access, Allow no access,
No table truncation, Recluster via index scan, Reorg long field LOB data, Reorg data only w/o LOBS
Reorg Status: Started/Resumed, Paused, Stopped, Completed, Truncate
Reorg progress = reorg_current_counter / reorg_max_counter * 100
Completion status: Success = 0, Failure = -1
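
The reorg progress formula on this slide can be worked through with the counters from the snapshot. The following is only a minimal sketch: the function name is hypothetical, and the guard for a zero maximum counter is an assumption about how a not-yet-started reorg might be reported, not documented DB2 behavior.

```python
def reorg_progress_pct(current_counter: int, max_counter: int) -> float:
    """Percent complete for an inplace table reorg, from the two snapshot counters.

    Implements: reorg_current_counter / reorg_max_counter * 100
    """
    if max_counter <= 0:
        # Assumption: no work units reported yet, so treat progress as 0%.
        return 0.0
    return current_counter / max_counter * 100


# Counters taken from the table snapshot shown on this slide.
print(reorg_progress_pct(6751, 6751))
```

With the slide's values (Current Counter = 6751, Max Counter = 6751) the two counters are equal, which matches the reported status of Completed.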

Monitoring Index Reorg
Application snapshot elements:
- Rows read – DB2 has to read ALL rows
- Rows written – Overflow rows to the system temporary tablespace (like a sort spill)

At end of index reorg (if it overflowed to the system temporary tablespace):
- Rows read = # of rows in table * # of indexes * 2
- Rows written = # of rows in table * # of indexes

At end of index reorg (if it did NOT overflow to the system temporary tablespace):
- Rows read = # of rows in table * # of indexes
- Rows written = 0

Therefore, approximate % of completion for index reorg:
Without sort overflow:
= (app_snapshot_rows_read)*100/((total_rows_in_table)*(no_of_indexes_on_table))
With sort overflow:
= (app_snapshot_rows_read)*100/((total_rows_in_table)*(no_of_indexes_on_table)*2)

Note: Use SYSPROC.SNAPSHOT_APPL to query the progress of index reorg
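
The two completion formulas for index reorg can be sketched as one small helper. This is an illustration only: the function name and the sample numbers are hypothetical, and capping the result at 100 is an added convenience (the estimate is approximate and can drift slightly past 100), not something the formulas themselves specify.

```python
def index_reorg_pct(rows_read: int, table_rows: int, index_count: int,
                    sort_overflow: bool) -> float:
    """Approximate percent complete for an index reorg.

    Expected rows read = rows in table * number of indexes,
    doubled when the sort overflows to the system temporary tablespace.
    """
    expected = table_rows * index_count * (2 if sort_overflow else 1)
    if expected <= 0:
        return 0.0
    # Cap at 100 because the estimate is approximate (added assumption).
    return min(rows_read * 100 / expected, 100.0)


# Hypothetical table: 1,000,000 rows, 3 indexes, 1,500,000 rows read so far.
print(index_reorg_pct(1_500_000, 1_000_000, 3, sort_overflow=False))
print(index_reorg_pct(1_500_000, 1_000_000, 3, sort_overflow=True))
```

A non-zero "Rows written" value in the application snapshot is the hint that the overflow variant of the formula applies.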


SAMPLE APPLICATION SNAPSHOT at end of index reorg


Application id = 870
Application status = UOW Waiting
Snapshot timestamp = 03/01/2006 12:56:44.788357
Buffer pool data logical reads = 38716332
Buffer pool data physical reads = 23223
Buffer pool temporary data logical reads = 2009915
Buffer pool temporary data physical reads = 811
Buffer pool data writes = 24
Buffer pool index logical reads = 2781134
Buffer pool index physical reads = 51
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool index writes = 905
Direct reads = 348
Direct writes = 16
Direct reads elapsed time (ms) = 17
Rows deleted = 0
Rows inserted = 0
Rows updated = 0
Rows selected = 0
Rows read = 364111940
Rows written = 182055968

Monitoring Runstats
Application snapshot elements:
- Rows read
- Rows written
- Bufferpool data/index logical reads
- Bufferpool data/index physical reads

RUNSTATS ON TABLE DB2.EMPLOYEE WITH DISTRIBUTION AND DETAILED INDEXES ALL
- Full tablescan to get all data rows
- Read the index keys

Therefore, approximate % of completion:
= ((# of data pages read so far + # of index pages read so far)*100)
/ ((total data pages to read + total index pages to read)*1.5)

Note: Use SYSPROC.SNAPSHOT_APPL to query the progress of runstats
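
The runstats estimate above can be expressed as a short function. This is a sketch under assumptions: the function and argument names are hypothetical, the page counts read so far are assumed to come from the buffer pool data/index read counters in the application snapshot, and the totals are assumed to come from the table's data and index object page counts.

```python
def runstats_pct(data_pages_read: int, index_pages_read: int,
                 total_data_pages: int, total_index_pages: int) -> float:
    """Approximate percent complete for RUNSTATS WITH DISTRIBUTION AND DETAILED INDEXES ALL.

    Implements: (pages read so far * 100) / (total pages to read * 1.5)
    """
    expected = (total_data_pages + total_index_pages) * 1.5
    if expected <= 0:
        return 0.0
    return (data_pages_read + index_pages_read) * 100 / expected


# Hypothetical table with 800 data pages and 200 index pages, 750 pages read so far.
print(runstats_pct(600, 150, 800, 200))
```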


SAMPLE application snapshot output for runstats with distribution and detailed indexes all

Application id = 1182
Application status = UOW Waiting
Snapshot timestamp = 03/01/2006 11:04:20.028821
Buffer pool data logical reads = 6163820
Buffer pool data physical reads = 381
Buffer pool temporary data logical reads = 0
Buffer pool temporary data physical reads = 0
Buffer pool data writes = 0
Buffer pool index logical reads = 770705
Buffer pool index physical reads = 53
Buffer pool temporary index logical reads = 0
Buffer pool temporary index physical reads = 0
Buffer pool index writes = 0
Direct reads = 348
Direct writes = 348
Direct reads elapsed time (ms) = 2
Rows deleted = 0
Rows inserted = 0
Rows updated = 0
Rows selected = 0
Rows read = 45519430
Rows written = 10808

Locking/Concurrency – Table reorg
reorg table db2inst1.employee inplace allow write access start

Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 560
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/05/2006 13:31:15.638064

APP.NAME APP.USER HANDLE APP.ID APP.STATUS LOCKS WAIT.ms BLK_ID


-------------------- -------- ------ ------------------------------ ----------------- ------- ------- ---------
db2reorg db2inst1 949 *LOCAL.db2inst1.060105183107 Connect Completed 560 0

TABLE NAME | TYPE | MODE | LOCK COUNT | HOLD COUNT | CURRENT | APP_WORKING | APP_BLOCKING
-----------------| ----------------- | ------ | ---------- | ---------- | --------| ------------ | --------
- | Tablespace | IX | 2 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Row | IN | 1 | 0 | 553 | 949 |
DB2INST1.EMPLOYEE| Row | NS | 1 | 0 | 2 | 949 |
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Table | IS | 1 | 1 | 1 | 949 |
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IN | 1 | 1 | 2 | 949 |

Why is reorg a LOGGED Transaction?
Answer: Inplace table reorg is recoverable. As such, it needs to be
logged.

The lock that’s missing here is the Shared lock taken at the time of table truncation. If you take frequent
lock snapshots while REORG is waiting on locks, the lock that it is waiting to acquire will show as
“Conversion”: the IS lock is being upgraded to an S lock.

Locking/Concurrency – Index reorg
db2 reorg indexes all for table db2inst1.employee allow write access

Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 6
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/16/2006 11:43:28.665147

APP.NAME APP.USER HANDLE APP.ID APP.STATUS LOCKS WAIT.ms BLK_ID


---------------- -------- ------ ------------------------------ ----------------- ------- ------- ------
db2bp DBIERIK 870 *LOCAL.dbierik.060116164307 UOW Executing 6 0

TABLE NAME | TYPE | MODE | LOCK COUNT | HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLK
-----------------| ----------------- | ------ | ---------- | ---------- | ------------| ------------ | -------
SYSIBM.SYSTABLES | Row | NS | 1 | 0 | 1 | 870 |
- | Tablespace | IX | 1 | 0 | 2 | 870 |
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 0 | 1 | 870 |
SYSIBM.SYSTABLES | Table | IS | 1 | 0 | 1 | 870 |
DB2INST1.EMPLOYEE| Table | IN | 1 | 0 | 1 | 870 |

Is it normal to see the table grow during INPLACE table reorg?
Answer: Yes. Online reorg moves rows around in the table. It will
search for a page to move rows. 10% of FSCR or 2x number of
FSCR (whichever is greater) would be searched by the normal insert
operation free space search algorithm. If no space is found, the row is
appended.

The above is a formatted output of lock snapshot. Note that there are different locks acquired during
reorg and not all locks are shown above. The lock conversion will occur before the final switch-over
phase starts. Note that the index reorg will quiesce the object and escalate to a Z lock to switch
objects.

Locking/Concurrency - Runstats
Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 3
Applications currently connected = 1
Agents currently waiting on locks = 0
Snapshot timestamp = 01/05/2006 12:55:43.977863
runstats on table db2inst1.employee with distribution and detailed indexes all allow write access
APP.NAME APP.USER HANDLE APP.ID APP.STATUS LOCKS WAIT.ms BLOCKING.ID
-------------------- -------- ------ ------------------------------ ----------------- ------- ------- -----------
db2bp db2inst1 62 *LOCAL.db2inst1.060104183406 UOW Executing 3 0

TABLE NAME | TYPE | MODE| LOCK COUNT | HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLOCKING
------------------| ----------------- | ----| ---------- | ---------- | ------------| ------------ | ------------
DB2INST1.EMPLOYEE | Table | IN | 2 | 0 | 1 | 62 |
SYSIBM.SYSTABLES | Row | U | 1 | 0 | 1 | 62 |
SYSIBM.SYSTABLES | Table | IX | 1 | 0 | 1 | 62 |

How can one get the structure of SYSPROC table function?
Answer:
db2 "describe select * from TABLE
(SYSPROC.SNAPSHOT_TBREORG('sample',-1)) as tbreorg"

Throttling Runstats
db2 runstats on table db2inst1.employee with distribution and detailed indexes all allow write access

db2 list utilities show detail


ID = 33
Type = RUNSTATS
Database Name = SAMPLE
Partition Number = 0
Description = DB2INST1.EMPLOYEE
Start Time = 01/09/2006 10:48:31.929343
Throttling:
Priority = Unthrottled
db2pd -utilities
Database Partition 0 -- Active -- Up 0 days 18:33:08
Utilities:
Address ID Type Priority DBName StartTime NumPhases CurPhase Description
0x07800000003B3D00 33 RUNSTATS 0 SAMPLE Mon Jan 9 10:48:31 2006 0 0
Progress:
Address ID PhaseNum Description StartTime CompletedWork TotalWork
db2 attach to db2inst1
db2 connect to sample
db2 update dbm cfg using UTIL_IMPACT_LIM 70 IMMEDIATE
db2 runstats on table db2inst1.employee with distribution and detailed indexes all allow write access UTIL_IMPACT_PRIORITY 50

db2 list utilities show detail


ID = 34
Type = RUNSTATS
Database Name = SAMPLE
Partition Number = 0
Description = DB2INST1.EMPLOYEE
Start Time = 01/09/2006 14:45:20.725701
Throttling:
Priority = 50
db2pd -utilities
Database Partition 0 -- Active -- Up 0 days 19:01:43
Utilities:
Address ID Type Priority DBName StartTime NumPhases CurPhase Description
0x07800000003B3260 34 RUNSTATS 50 SAMPLE Mon Jan 9 14:45:20 2006 0 0
Progress:
Address ID PhaseNum Description StartTime CompletedWork TotalWork

Agenda
• Utilities for performing online maintenance
• Designing online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations


Tweaks & Considerations
• General:
• Run reorgs in parallel, but stagger the big tables
• May not benefit tables with high transaction volume
• Performance is not affected by fragmentation
• DMS highwater mark does not reset after reorg or runstats
• Use SYSPROC for monitoring and storing snapshot data for analysis
• Do NOT compete with batch jobs; the shorter the transactions, the faster the reorg

• Table reorg:
• Use NOTRUNCATE TABLE
• Without index is the fastest
• Consider using clustered indexes
• Index reorg:
• CLEANUP after the mass delete
• CLEANUP ONLY ALL to avoid Z locks
• Runstats:
• Sample statistics
• Column-group statistics (CGS) for group cardinalities

Consider running multiple table reorgs in parallel. This can be scripted to start “n” table reorgs in
parallel initially and based on system activity add to it.

For running multiple table reorgs in parallel, try not to run big tables in parallel. Stagger them to
sustain optimal system & database performance.
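
One way to picture the staggering advice is a small planner that groups tables into waves, with at most "n" reorgs per wave and never more than one big table in the same wave. This is a hypothetical sketch, not part of the original presentation: the function name, the page-count threshold for "big", and the table names are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def plan_waves(tables: List[Tuple[str, int]], max_parallel: int,
               big_pages: int) -> List[List[str]]:
    """Group (table_name, page_count) pairs into waves of parallel reorgs.

    Each wave holds at most max_parallel tables and at most one table
    whose page count is >= big_pages, so big tables are staggered.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    # Largest first, so the heaviest work is scheduled earliest.
    big = sorted((t for t in tables if t[1] >= big_pages), key=lambda t: -t[1])
    small = sorted((t for t in tables if t[1] < big_pages), key=lambda t: -t[1])
    waves: List[List[str]] = []
    while big or small:
        wave: List[str] = []
        if big:
            wave.append(big.pop(0)[0])      # at most one big table per wave
        while small and len(wave) < max_parallel:
            wave.append(small.pop(0)[0])    # fill the rest with small tables
        waves.append(wave)
    return waves


# Hypothetical tables with their data page counts.
tables = [("ORDERS", 900_000), ("ITEMS", 750_000), ("CUSTOMER", 40_000),
          ("REGION", 10), ("NATION", 25)]
for wave in plan_waves(tables, max_parallel=3, big_pages=500_000):
    print(wave)
```

Each wave would then be started together, and the next wave launched once system activity allows, in line with the note about adding reorgs based on system activity.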

If the table is constantly changing, then online reorg may not be able to keep up with the changes to
the table.

Consider table reorg using NOTRUNCATE TABLE option to avoid S locks.

Prefer to run table reorg without index for better performance. Index based table reorg is slower than
a table reorg without specifying indexes. Note that if the table has clustered unique indexes, then
online reorg will by default pick the clustered unique index for ordering data pages.

Consider using clustered indexes. They carry an insert penalty but reduce the need for reorg; if the
table does have to be reorg’d, the reorg is slower because it must go through the index to order the data.

Tweaks & Considerations
• Reorg and Backup:
• Internal B Locks can cause reorgs to run slower during online backup
• Online table reorg clean up phase cannot start during online backup
• Online index reorg and online backup have the following compatibility
issues:
• In SMS mode, they are not compatible
• In DMS mode, they are compatible with recovery issues related to lifeLSN. In
addition, online backup will wait because of Z lock during switch-over phase

• Some configuration parameters to tweak reorg/runstats:


• DB2_USE_ALTERNATE_PAGE_CLEANING
• UTILHEAPSZ
• SORTHEAP & SHEAPTHRES
• Bigger bufferpool for System temporary tablespace.
• I/O Cleaners & Prefetchers
• STAT_HEAP_SZ (additional 2 meg for sampling)


Consider index reorg using CLEANUP ONLY ALL option to avoid Z locks.
Consider using runstats with sampled statistics.
Consider collecting column-group statistics (CGS) which is better than adding indexes to satisfy
queries and helps in cardinality calculations by providing stronger correlation.
Performance is not affected by fragmentation, but it is affected by the order of data. Therefore,
with the help of “last known good statistics”, you could get away without a table reorg if space is not an
issue. Remember, DB2 will try to fit a new row in an existing page using information from
SMP/EMP extents, and the DB2MAXFSCRSEARCH registry variable limits how many EMP
extents are checked for extents with empty pages before a new extent is created.
CLOBS have to be reorganized for non-MDC and MDC tables. The CLOB space usage (the actual
usage) can be obtained online using INSPECT command which would give an idea of how many
pages would be freed following clob reorg.
Some configuration parameters that could help speed up reorg are:
DB2_USE_ALTERNATE_PAGE_CLEANING to avoid ADM1822W errors during online
table reorg.
Larger UTILHEAPSZ to avoid ADM9500W while running online index reorg.
Bigger bufferpool for System temporary tablespace.
Enough I/O Cleaners and Servers to handle the reorgs.

Note: There are other cfg parameters that can speed up reorg, but don’t suit OLTP
environments like sortheap, intra_parallel, max_querydegree, etc.

Large objects (LOBs) Considerations
Reorg table with LONGLOBDATA option to reclaim wasted lob space

Space organized into 256k blocks called “buddy spaces”

LOB Manager works on “buddy space”

Special page “allocation page” to track buddy pages

Deleting data from LOBS does not decrease SMS tablespace size

Only Inspect command can provide exact lob space usage


DB2 allocates 256 KB blocks instead of the tablespace page size for LOB objects. These chunks of
space are called “buddy spaces”. The LOB manager keeps track of “buddy spaces” using a special
page called the “allocation page”. LOBs have to be reorged using the LONGLOBDATA option in
offline mode.
LIST TABLESPACES SHOW DETAIL does not show “real” usage.
The SQL function length(lob_column) cannot be used to calculate “real” LOB space usage.
After deleting data from the table and running a REORG, no space is reclaimed; the size of the table
space did not decrease. When an SMS tablespace is used a REORG is not going to reclaim space. An
SMS table space is defined as a directory. The directory uses the space available on the underlying
file system. When a row is inserted and subsequently deleted in an SMS table space, the space is
reused. The data for an SMS table space is always contiguous and there are never any empty pages in
the table space. The files in the SMS directory will not change in size. The only way to make the
LOB file decrease in size in an SMS table space is to do an EXPORT/LOAD of the data in the table.
When you LOAD data into an empty table the data is placed in a contiguous fashion and there should
be no free space. Here are your steps for doing the EXPORT/LOAD:
1. EXPORT the data
2. drop the table
3. re-create the DDL for the table
4. LOAD data

Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce or, better, avoid reorg!


Reduce/Avoid REORG
• MDC – Multi Dimensional Clustering
• MDC tables behave just like normal tables except for its enhanced
clustering and processing aspects

• MDC scheme lends itself to relatively efficient maintenance

• MDC clustering can be used in OLTP/DSS/ODS environments

• MDC is new clustering technology that provides a method for automatic,
continuous clustering of data along multiple dimensions

• For regular tables/indexes:
• Reduce maintenance using MAXFSCRSEARCH, PCTFREE and MINPCTUSED
• Consider using CLUSTERED INDEXES for table to keep up the clusterratio/clusterfactor


“Multidimensional clustering is primarily intended for data warehousing and large database environments, and it can also
be used in online transaction processing (OLTP) environments.” – DB2 Information Center
Source: http://www-306.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8infocenter.d2w/report?target=mainFrame&fn=c0007201.htm
“Multidimensional clustering (MDC). MDC is new clustering technology that provides a method for automatic,
continuous clustering of data along multiple dimensions. And MDC tables don't require database maintenance operations,
such as reorganization. MDC primarily benefits data warehousing type queries and large database environments;
however, it can also be useful for transaction processing.”
Source:
http://www.db2mag.com/shared/printArticle.jhtml?article=/db_area/archives/2002/q4/gunning.shtml&pub=db2mag
“MDC scheme lends itself to relatively efficient maintenance. Most utilities such as Reorg, Backup, Restore, and Runstats
have been modified appropriately for use with MDC tables.” Source: SIGMOD2003.
“If the dimensions and blocksizes are chosen appropriately, then clustering benefits will translate into significant
performance and maintenance advantages.” Source: SIGMOD2003
“MDC tables behave just like normal tables except for its enhanced clustering and processing aspects.”
Source: DBLP Project.
“Type-2 Indexes. Type-2 indexes improve performance by eliminating most next-key-share locks, as entries are marked
deleted instead of physically deleted from the page. Type-2 indexes are required for online load, online reorganization,
and MDC”
Source:
http://www.db2mag.com/shared/printArticle.jhtml?article=/db_area/archives/2002/q4/gunning.shtml&pub=db2mag

Note: Any changes to the data layout should be balanced. Table and index structures can be designed to reduce
maintenance while still providing optimal insert/update/delete/select performance for online workloads.

Agenda
• Utilities for online maintenance
• Design online maintenance
• What to expect and avoid!
• Monitoring
• Tweaks & considerations
• Reduce or, better, avoid running maintenance
• Asked and Answered!


Asked and Answered!
1. Is Index reorg any different than Table reorg with respect
to concurrency?

2. If the online reorg is just like any other transaction, does it
time out based on the LOCKTIMEOUT parameter?

3. For index reorg, using CLEANUP ONLY PAGES option
does not reclaim space and at the same time does NOT
rebuild indexes causing tablespace to reach 100%
capacity. Why is the space not reused?


Asked and Answered!
4. If we DON'T use NOTRUNCATE option for inplace table
reorg, then we reclaim space but at the cost of
concurrency. Why is this behavior?

5. For index reorg, if CLEANUP ONLY PAGES is NOT used,
indexes are rebuilt and reclaim space. The tablespace
used pages drops due to index rebuild. But this puts user
applications on lock-waits status causing concurrency
issues. How can we avoid this?

6. Can online backup run during online table reorg?


Summary
• Table reorgs run in trickle mode! Not meant for speed!
• The goal is to run reorgs and runstats without impacting customer
• Understand the application before designing reorg
• Know your peak and off-peak hours
• Stagger reorgs and runstats
• Reorg without index and without table truncation provides high
concurrency
• Table reorg can be paused and restarted
• Index reorg rebuild requires Z lock following catch-up phase
• Know the SQL and collect required statistics
• Consider using statistical profiling
• Consider collecting Column Group Statistics (CGS)
• Monitor utilities and adjust accordingly


Acknowledgements
This presentation would not be complete without valuable inputs from the following
people and their work in the area of database maintenance and performance:

• Bishwaranjan Bhattacharjee
– MDC Team, IBM TJ Watson Research Center

• Danny Arnold
– Senior Product Manager, IBM Competitive Database Technologies

• Matt Huras
– Distinguished Engineer, Lead Kernel Architect, DB2 Development – IBM Toronto Lab

• Paul Turpin
– Thread Chair IDUG 2006 NA, DBA - Wachovia


Session E02
Online database maintenance for 24x7 OLTP Systems

Vijay Sitaram

Vijay.Sitaram@gmail.com


Session E02
Online database maintenance for 24x7 OLTP Systems

DB2 performs!

Thank you


Additional Material


Appendix A – Answers to Questions
• Q1 Answer: Inplace table reorg was designed to have minimal impact on
concurrent apps. Inplace table reorg holds S row locks for very short periods of
time while it moves rows around in the table. Online index reorg is not inplace, it
uses a shadow object which is invisible to the users until we switch between the
original and the shadow object. This means that for an index reorg that specifies
ALLOW WRITE ACCESS, minimal locking will be done during the index build
phase but we will need to quiesce all activity on the indexes in order to switch the
original and rebuilt (shadow) objects. Inplace table reorg has no comparable need
for a quiesce, and therefore you'd see less of a concurrency issue with that
operation.

• Q2 Answer: For online table reorg: The LOCKTIMEOUT parameter doesn't
influence online reorg, i.e. it doesn't time out based on LOCKTIMEOUT. Since
inplace table reorg was designed to run in the background with minimal impact to
concurrent applications, it will wait on locks indefinitely. If it did not, there is the
possibility of the user having to often resume the reorg.
For online index reorg: At the end of the catch-up phase, it first quiesces the
writers by requesting an S lock, and then before switching the shadow object to the
real index object, it requests a Z lock. Neither lock request times out, and both are
less likely to be chosen as the deadlock victim if a deadlock occurs.


Appendix A – Answers to Questions
• Q3 Answer: Once a page is allocated to the index object, it's
never returned back to the tablespace until the index object is
reorged or the last index is dropped. If you do lots of deletion, you
could run reorg with the cleanup option more often to clean up the pseudo
empty pages, so that the empty pages will be reused sooner.

• Q4 Answer: The only time that there is any concurrency difference
is when the table is being truncated. The table lock is upgraded
from an IS to an S lock (assuming allow write access was specified).
We cannot allow writers while the table is being truncated,
otherwise, those writers could use the reclaimed free space from a
reorg. Hence we would not be able to return any space to the
tablespace since it's no longer free.


Appendix A – Answers to Questions
• Q5 Answer: To avoid this, one should NOT have long
running and uncommitted transactions (writers and
readers) that hold up the reorg index lock request. The
two lock requests are only made after reorg indexes
has done most of the work to rebuild indexes.

• Q6 Answer: The clean up phase of online table
reorganization cannot start while an online backup is
running as it requires an “S” lock. A customer can pause
the table reorganization, if required, to allow the online
backup to finish before resuming the online table reorg.


Appendix B – Monitoring reorg using sysproc
db2 " \
SELECT \
SUBSTR(LTRIM(RTRIM(TABLE_SCHEMA))||'.'||LTRIM(RTRIM(TABLE_NAME)),1,20) as TABLE_NAME, \
PAGE_REORGS, \
REORG_CURRENT_COUNTER, \
REORG_MAX_COUNTER, \
CASE REORG_TYPE \
WHEN 268500992 THEN 'Start inplace with write access' \
WHEN 268959744 THEN 'Resume inplace with write access' \
WHEN 536936448 THEN 'Start inplace with read access' \
ELSE 'Inplace Table reorg' \
END AS TYPE , \
CASE REORG_STATUS \
WHEN 1 THEN 'Started' \
WHEN 2 THEN 'Paused' \
WHEN 3 THEN 'Stopped' \
WHEN 4 THEN 'Completed' \
WHEN 5 THEN 'Truncate' \
ELSE NULL \
END AS STATUS, \
CASE REORG_COMPLETION \
WHEN 0 THEN 'Success' \
ELSE 'Failure' \
END AS COMPLETION, \
REORG_START, \
REORG_END, \
REORG_INDEX_ID, \
REORG_TBSPC_ID, \
PARTITION_NUMBER \
FROM \
TABLE ( SYSPROC.SNAPSHOT_TBREORG('SAMPLE',-1)) as REORG"


The table snapshot can be obtained through the SYSPROC.SNAPSHOT_TBREORG table function. The SQL for the table function
is given above and provides all details relevant to table reorg:
1. Reorg progress
2. Type of reorg
3. Status of reorg
4. Completion of reorg
5. Start and End timestamp for reorg
6. Index used for reorg
7. Tablespace used for reorg
8. Partition currently running the reorg.
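
The CASE expressions in the query can also be applied on the client side when raw snapshot rows are stored for later analysis. The sketch below is illustrative only: the function name and the dictionary-shaped row are assumptions, and the code-to-label mappings are copied from the query in this appendix rather than from any separate DB2 reference.

```python
# Code-to-label mappings copied from the CASE expressions in the Appendix B query.
REORG_TYPE = {
    268500992: "Start inplace with write access",
    268959744: "Resume inplace with write access",
    536936448: "Start inplace with read access",
}
REORG_STATUS = {1: "Started", 2: "Paused", 3: "Stopped", 4: "Completed", 5: "Truncate"}


def describe_reorg(row: dict) -> dict:
    """Translate the numeric reorg columns of one snapshot row into labels."""
    return {
        # Unknown type codes fall back to the query's ELSE branch.
        "type": REORG_TYPE.get(row["REORG_TYPE"], "Inplace Table reorg"),
        # Unknown status codes map to None, like ELSE NULL in the query.
        "status": REORG_STATUS.get(row["REORG_STATUS"]),
        "completion": "Success" if row["REORG_COMPLETION"] == 0 else "Failure",
    }


# Hypothetical row, as it might be fetched from SYSPROC.SNAPSHOT_TBREORG.
row = {"REORG_TYPE": 268500992, "REORG_STATUS": 4, "REORG_COMPLETION": 0}
print(describe_reorg(row))
```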

Appendix – C: Locking/Concurrency – Table reorg
reorg table db2inst1.employee inplace allow write access start
select count(*) from db2inst1.employee

Database name = SAMPLE
Database path = /db2home/db2inst1
Input database alias = SAMPLE
Locks held = 1280
Applications currently connected = 2
Agents currently waiting on locks = 1
Snapshot timestamp = 01/05/2006 14:46:59.329012

APP.NAME APP.USER HANDLE APP.ID APP.STATUS LOCKS WAIT.ms BLK_ID


-------------------- -------- ------ ------------------------------ ----------------- ------- ------- ---------
db2reorg db2inst1 438 *LOCAL.db2inst1.060105194557 Lock-wait 1274 4973 930
db2bp db2inst1 930 *LOCAL.db2inst1.060105182954 UOW Executing 6 0

TABLE NAME | TYPE | MODE| LOCK COUNT |HOLD COUNT | LOCKS HELD | APP_WORKING | APP_BLOCKING
-----------------| ----------------- | --- | ---------- | --------- | ----------- | ------------ | -----------
DB2INST1.EMPLOYEE| Row | IN | 1 | 0 | 1269 | 438 | 930
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IN | 1 | 1 | 2 | 438 | 930
DB2INST1.EMPLOYEE| Inplace Reorg Loc | IX | 1 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Internal Table Al | X | 1 | 1 | 1 | 438 | 930
- | Tablespace | IX | 2 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Table | IS | 1 | 1 | 1 | 438 | 930
DB2INST1.EMPLOYEE| Row | NS | 0 | 0 | 1 | 930 |
DB2INST1.EMPLOYEE| Table | IS | 2 | 0 | 1 | 930 |
- | Internal Variatio | S | 1 | 0 | 1 | 930 |
DB2INST1.EMPLOYEE| Inplace Reorg Loc | S | 1 | 1 | 1 | 930 |
DB2INST1.EMPLOYEE| Row | NS | 1 | 0 | 1 | 930 |
- | Internal Plan Loc | S | 1 | 0 | 1 | 930 |

55

The above is the formatted output of a lock snapshot. Note that a reorg acquires several different locks
and not all of them are shown above. Some of the interesting locks:

Inplace Reorg Lock – This ensures that no two applications can run a reorg on the same object. It is
meant for synchronization.
Internal Table Alter Lock – This ensures that DDL statements cannot be run against the table during the reorg.
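
To make the application rows of the snapshot easier to read, the following minimal sketch pairs each waiting application with the application named in its BLK_ID column. The `App` tuple and the `blocking_pairs` helper are hypothetical names, and the values are transcribed by hand from the slide rather than parsed from real snapshot output.

```python
from typing import List, NamedTuple, Optional, Tuple


class App(NamedTuple):
    """One application row of the formatted lock snapshot (hypothetical structure)."""
    name: str
    handle: int
    status: str
    locks: int
    wait_ms: int
    blocker: Optional[int]  # BLK_ID column; None when the application is not waiting


# Values transcribed by hand from the two application rows on the slide.
apps = [
    App("db2reorg", 438, "Lock-wait", 1274, 4973, 930),
    App("db2bp", 930, "UOW Executing", 6, 0, None),
]


def blocking_pairs(rows: List[App]) -> List[Tuple[App, App]]:
    """Return (waiter, holder) pairs for every application in Lock-wait status."""
    by_handle = {a.handle: a for a in rows}
    pairs = []
    for a in rows:
        if a.status == "Lock-wait" and a.blocker in by_handle:
            pairs.append((a, by_handle[a.blocker]))
    return pairs


for waiter, holder in blocking_pairs(apps):
    print(f"{waiter.name} (handle {waiter.handle}) has waited {waiter.wait_ms} ms "
          f"on {holder.name} (handle {holder.handle})")
```

In this snapshot the reorg (handle 438) is the waiter and the session with handle 930 is the holder.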

55
Appendix D – Technotes from IBM
• Internal B Locks
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21153425&loc=en_US&cs=utf-8&lang=en
• Utilities compatible with ONLINE backup
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21214717&loc=en_US&cs=utf-8&lang=en
• What can run during online backup?
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=inspect&uid=swg21214717&loc=en_US&cs=utf-8&lang=en
• Estimating & viewing space usage for large objects (LOBs)
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=inspect&uid=swg21218554&loc=en_US&cs=utf-8&lang=en
• Tablespace size does not change after deleting data, then running reorg
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21204819&loc=en_US&cs=utf-8&lang=en
• Highwater mark in the list tablespace show detail not RESET after REORG or RUNSTATS
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21007334&loc=en_US&cs=utf-8&lang=en
• Introduction to SQL administrative routines SYSPROC
• http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0505melnyk/
• Automatic statistics profiling and updates to the warehouse feedback tables
• http://www-1.ibm.com/support/docview.wss?rs=71&context=SSEPGG&q1=reorg&uid=swg21216981&loc=en_US&cs=utf-8&lang=en
56

Appendix E – Reduce/Avoid reorg

57

Appendix E: Reduce/Avoid reorg
Consider the following for MDC:

• Single Dimension MDC table:
• Block/Extent size
• Rows per Cell - RpC
• Space used per Cell - SpC
• Blocks per Cell - BpC
• Bucket ratio

• Too Many vs. Too few BpC

• Create the dimension column based on a cluster key and apply coarsification

• The dimension column can be a generated column

• Must be a monotonic expression
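
The coarsification idea above can be sketched with a small calculation. This example assumes a date cluster key coarsified to year-month, which corresponds to a generated column defined as INTEGER(date_col)/100 in SQL. It then counts rows per cell and makes a rough blocks-per-cell estimate. The row size, page size and extent size are hypothetical inputs, and the estimate ignores page overhead and free space.

```python
from collections import Counter
from datetime import date
from math import ceil


def yyyymm(d: date) -> int:
    """Coarsify a date to year-month, mirroring the SQL expression INTEGER(date_col)/100."""
    return (d.year * 10000 + d.month * 100 + d.day) // 100


def is_monotonic(values) -> bool:
    """True when the sequence never decreases."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Sample dates in ascending order (made-up values).
dates = [date(2006, 1, 5), date(2006, 1, 31), date(2006, 2, 1), date(2006, 5, 8)]
cells = [yyyymm(d) for d in dates]
print(cells)                # [200601, 200601, 200602, 200605]
print(is_monotonic(cells))  # True: ascending dates never map to a smaller cell value

# Rows per Cell (RpC): one cell per distinct dimension value.
rows_per_cell = Counter(cells)

# Hypothetical sizing inputs; adjust to the actual table and tablespace.
ROW_BYTES = 200
PAGE_BYTES = 4096
EXTENT_PAGES = 32
BLOCK_BYTES = PAGE_BYTES * EXTENT_PAGES  # a block is one extent


def blocks_per_cell(rows: int) -> int:
    """Rough BpC estimate: space used per cell divided by block size, at least one block."""
    return max(1, ceil(rows * ROW_BYTES / BLOCK_BYTES))


print(blocks_per_cell(rows_per_cell[200601]))  # 1: a nearly empty cell still takes a whole block
```

With these assumed sizes, a cell holding only a few rows still occupies a full block, which is the "too few rows per cell" case. A coarser dimension such as year-month instead of day puts more rows in each cell.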


58

