
An Unlucky Query

Impacts of Full Table Refresh on Queries Requiring FTS

Query Environment
DB: taxext (sp2-taoextdb)
Table: TAO.DIM_TAO_ADVERTISER, list partitioned by column IS_CURRENT; two partitions in total.
Table stats (as of 2012/03/25, old but very close): rows: 2,118,910; blocks: 311,297. More than 90% of the data is in partition P1 (IS_CURRENT=1).
Parameters: db_block_size=8192, db_file_multiblock_read_count=32
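
The partition-level stats can be cross-checked with a query along these lines (a minimal sketch; owner and table name taken from above):

select partition_name, num_rows, blocks, last_analyzed
  from dba_tab_partitions
 where table_owner = 'TAO'
   and table_name  = 'DIM_TAO_ADVERTISER';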

The Query
select buyer_line_crt_id, to_char(surrogate_key) surrogate_key, null as dummy
  from TAO.DIM_TAO_ADVERTISER
 where is_current = 1

The Plan
-----------------------------------------------------------------------------------------------------------
| Id | Operation             | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT      |                    |       |       | 58381 (100)|          |       |       |
|  1 |  PARTITION LIST SINGLE|                    | 2062K |   45M | 58381   (1)| 00:11:41 |   KEY |   KEY |
|  2 |   TABLE ACCESS FULL   | DIM_TAO_ADVERTISER | 2062K |   45M | 58381   (1)| 00:11:41 |     1 |     1 |
-----------------------------------------------------------------------------------------------------------

Query Stats From AWR


DATE         08/07       08/09       08/11       08/13       08/14
TIME (SEC)   24,276      25,225      35,056      38,365      237
DISK_READS   3,555,310   3,799,893   3,734,663   4,072,043   301,607
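
Figures like these can be pulled from AWR with a query along these lines (a sketch; :sql_id is a placeholder for the statement's SQL_ID):

select s.begin_interval_time,
       round(q.elapsed_time_delta / 1e6) elapsed_sec,   -- microseconds to seconds
       q.disk_reads_delta
  from dba_hist_sqlstat  q
  join dba_hist_snapshot s
    on s.snap_id         = q.snap_id
   and s.dbid            = q.dbid
   and s.instance_number = q.instance_number
 where q.sql_id = :sql_id
 order by s.begin_interval_time;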

Questions
Why are the elapsed times and disk_reads so different, compared with the 08/14 run, when the query plan stays the same? Why did the query read more than 10 times as much data as the table holds? Note the query contains only a single table scan.

ASH Wait Events


Real-time tracking, top ASH counts (about 2.5 hours):

Event                     ASH Count
db file sequential read   9,578
CPU                       57
gc cr disk read           41
direct path read          25
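
Counts like these can be reproduced with a sketch against v$active_session_history (:sql_id is a placeholder; a NULL event in ASH means the session was on CPU):

select nvl(event, 'CPU') event, count(*) ash_count
  from v$active_session_history
 where sql_id = :sql_id
 group by nvl(event, 'CPU')
 order by ash_count desc;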

Why are there so many db file sequential read waits? Shouldn't an FTS use direct path read or db file scattered read?

What is the Query Reading With db file sequential read?


v$session snapshot: row_wait_obj#: 0, p1: 3 (file#)
1. There is no object with object_id 0; usually this points to system-related data.
2. From dba_data_files, file_id=3 belongs to tablespace UNDOTBS1.
3. ASH from AWR for the query shows db file sequential read with current_obj#=0 as the top event, far ahead of all other events.
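
A minimal sketch of the checks above (:sid is a placeholder for the session of interest):

-- What is the session currently waiting on?
select event, p1 file#, row_wait_obj#
  from v$session
 where sid = :sid;

-- Which tablespace does file 3 belong to?
select tablespace_name, file_name
  from dba_data_files
 where file_id = 3;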

Why did the query need so many UNDO reads?


UNDO is used by Oracle for consistent reads: a query sees only data committed just before the query started. If a data block has a newer SCN, Oracle looks up UNDO records to reconstruct the data back to the value as of the query start (not a past image or the original block). If Oracle cannot find appropriate UNDO for this purpose, it raises the ORA-01555 error. Note that for this query, almost every UNDO record applied came with a physical block read. Here is a snapshot from v$sesstat; physical reads direct reflects the actual disk reads for the table data itself.
Statistic                                             Value       Unit
data blocks consistent reads - undo records applied   4,643,835   records
physical reads                                        3,605,637   blocks
physical reads direct                                 281,237     blocks
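
These session statistics can be captured with a sketch like the following (:sid is a placeholder):

select sn.name, ss.value
  from v$sesstat  ss
  join v$statname sn on sn.statistic# = ss.statistic#
 where ss.sid = :sid
   and sn.name in ('data blocks consistent reads - undo records applied',
                   'physical reads',
                   'physical reads direct');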

Data Block and UNDO Block


Data Block            UNDO Block
------------------    ------------------
Header and Summary    Header and Summary
ITL List              Space Summary
Row Directory         Control Section
Free Space            Record Directory
Row Heap              Free Space
                      Record Heap

ITL (Interested Transaction List)


Column    Description
Itl       The array index for the list.
Xid       The transaction id of a recent transaction that has modified this block: (undo segment).(undo slot).(undo sequence number)
Uba       Undo record address: (absolute block address).(block sequence number).(record within block)
Flag      Transaction state.
Lck       Number of rows locked by this transaction in this block.
Scn/Fsc   Committed SCN, or the number of bytes of free space that would be available if this transaction committed.
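
The ITL entries of a specific block can be inspected with a block dump; the file# and block# below are placeholders:

-- Writes a trace file (under user_dump_dest / diagnostic_dest) whose
-- Itl / Xid / Uba / Flag / Lck / Scn/Fsc lines show the block's ITL array.
alter system dump datafile 7 block 12345;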

How Is UNDO Used for Consistent Reads?


1. Given a DBA (data block address), Oracle reads the block from disk or the buffer cache.
2. If the block has an SCN newer than the query requires, Oracle checks the block's ITL list.
3. Oracle clones the block in memory and, using the content of the first ITL entry, finds an UNDO block (which may be in memory or on disk) and applies its content (the related ITL entry is removed).
4. This procedure is repeated until the SCN satisfies the query requirement. Oracle may have to read multiple UNDO blocks from disk to construct one consistent read (CR) block.

A Possible Scenario for the Unlucky Query


1. Oracle reads a set of blocks (for example, 32) with direct path read.
2. Inside each block there is approximately one distinct ITL entry per row, each referring to a different UNDO block (the data is updated by rows, not blocks, so the rows inside a block can be updated at different times, by the same or different transactions). In the end, for each block, the number of UNDO blocks Oracle has to read is at least close to the row count.
3. If the same row has been updated multiple times by the time the query reads it, Oracle has to read a whole chain of UNDO blocks just to recover the original data for that one row.
4. The UNDO block reads from disk are single-block reads; they are inefficient because of the many round trips.

Data From AWR


Snap Start Time   Disk_reads   Buffer_gets   Direct_writes   Rows      Blocks/row
08/13 19:00:00    560,142      1,026,784     0               807,001   0.69
08/13 20:00:00    553,892      933,813       0               465,000   1.19
08/13 21:00:00    494,847      661,918       0               210,000   2.36
08/13 22:00:00    484,243      661,642       0               150,000   3.22
08/13 23:00:00    480,869      653,183       0               117,000   4.11
08/14 00:00:00    100,370      244,330       0               36,000    2.79
08/14 01:00:00    223,669      301,585       0               45,000    4.97
08/14 02:00:00    334,939      453,387       0               63,000    5.31
08/14 03:00:00    449,861      594,926       0               69,000    6.52
08/14 04:00:00    489,225      639,530       0               66,000    7.41
08/14 05:00:00    308,843      392,100       0               37,677    8.20

Here is the snap-by-snap summary from AWR dba_hist_sqlstat for one execution. Except for one snap (08/14 00:00:00), the number of UNDO blocks read from disk to return one table row grows almost linearly, with about one additional block every hour.
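
The Blocks/row column is simply Disk_reads divided by Rows for each snap; a sketch of that computation from dba_hist_sqlstat (:sql_id is a placeholder):

select s.begin_interval_time,
       q.disk_reads_delta,
       q.rows_processed_delta,
       round(q.disk_reads_delta / nullif(q.rows_processed_delta, 0), 2) blocks_per_row
  from dba_hist_sqlstat  q
  join dba_hist_snapshot s
    on s.snap_id = q.snap_id and s.dbid = q.dbid
   and s.instance_number = q.instance_number
 where q.sql_id = :sql_id
 order by s.begin_interval_time;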

The Root Cause


1. Table TAO.DIM_TAO_ADVERTISER is refreshed every hour.
2. The refresh is not a simple update adding or changing a small number of records: it updates almost every record in the concerned partition, using MERGE (a sketch of the shape follows this list).
3. The update order is row-based, not block-based, so the rows of the same block end up referencing different UNDO blocks.
4. The concerned query most likely started near the end of one round of table refresh.
5. The longer the unlucky query runs, the more UNDO blocks it needs to read to recover one row, because more and more changes have been applied to each single row.
6. The exception in the table on the previous page was caused by the refresh job itself running for more than one hour.
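
A hypothetical sketch of the shape of such a refresh; the staging table and column list are illustrative, not the actual job:

merge into TAO.DIM_TAO_ADVERTISER t
using stage_advertiser s                    -- hypothetical staging source
   on (t.surrogate_key = s.surrogate_key and t.is_current = 1)
 when matched then
      update set t.buyer_line_crt_id = s.buyer_line_crt_id
 when not matched then
      insert (surrogate_key, buyer_line_crt_id, is_current)
      values (s.surrogate_key, s.buyer_line_crt_id, 1);

Because the MERGE touches nearly every row, one row at a time, each row's change lands in whatever UNDO block is current at that instant, spreading one data block's undo across many UNDO blocks.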

Workaround
Since the concerned query can complete within 4 minutes, here is the workaround:
Aggressive: lock the table (with wait) before running the query, and release the lock after the query is done.
Conservative: lock the table (with wait), release it, and run the query immediately (a sketch follows below). The table refresh will spend a while on its join operation, so its first row update will come well after 4 minutes.
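
A sketch of the conservative variant; LOCK TABLE waits by default until the in-flight refresh MERGE commits:

lock table TAO.DIM_TAO_ADVERTISER in exclusive mode;  -- blocks until the refresh commits
commit;                                               -- releases the lock immediately
-- Run the query right away, before the next refresh starts updating rows:
select buyer_line_crt_id, to_char(surrogate_key) surrogate_key, null as dummy
  from TAO.DIM_TAO_ADVERTISER
 where is_current = 1;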
