Yanamala
• Introduction to Partitioning
• Partitioning Methods
• Partitioned Indexes
• Partitioning to Improve Performance
Introduction to Partitioning
Partitioning addresses key issues in supporting very large tables and indexes by letting
you decompose them into smaller and more manageable pieces called partitions. SQL
queries and DML statements do not need to be modified in order to access partitioned
tables. However, after partitions are defined, DDL statements can access and
manipulate individual partitions rather than entire tables or indexes. This is how
partitioning can simplify the manageability of large database objects. Also, partitioning is
entirely transparent to applications.
Each partition of a table or index must have the same logical attributes, such as column
names, datatypes, and constraints, but each partition can have separate physical
attributes such as pctfree, pctused, and tablespaces.
Partition independence for partition maintenance operations lets you perform concurrent
maintenance operations on different partitions of the same table or index. You can also
run concurrent SELECT and DML operations against partitions that are unaffected by
maintenance operations.
Figure 11-1 offers a graphical view of how partitioned tables differ from nonpartitioned
tables.
Partition Key
Partitioned Tables
Tables can be partitioned into up to 64,000 separate partitions. Any table can be
partitioned except those tables containing columns with LONG or LONG RAW
datatypes. You can, however, use tables containing columns with CLOB or BLOB
datatypes.
You can range partition index-organized tables. This feature is very useful for providing
improved manageability, availability and performance for index-organized tables. In
addition, data cartridges that use index-organized tables can take advantage of the
ability to partition their stored data. Common examples of this are the Image and
interMedia cartridges.
Partitioning Methods
• Range Partitioning
• List Partitioning
• Hash Partitioning
• Composite Partitioning
Range Partitioning
Range partitioning maps data to partitions based on ranges of partition key values that
you establish for each partition. It is the most common type of partitioning and is often
used with dates. For example, you might want to partition sales data into monthly
partitions.
A typical example is given in the following section. The statement creates a table
(sales_range) that is range partitioned on the sales_date field.
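A minimal sketch of such a statement follows. The table name sales_range and the sales_date partitioning column come from the text above; the other columns, the partition names, and the monthly boundaries are assumptions chosen for illustration.

```sql
-- Range-partitioned table: each partition holds one month of sales.
CREATE TABLE sales_range (
  salesman_id   NUMBER(5),
  salesman_name VARCHAR2(30),
  sales_amount  NUMBER(10),
  sales_date    DATE
)
PARTITION BY RANGE (sales_date) (
  PARTITION sales_jan2000 VALUES LESS THAN (TO_DATE('02/01/2000', 'MM/DD/YYYY')),
  PARTITION sales_feb2000 VALUES LESS THAN (TO_DATE('03/01/2000', 'MM/DD/YYYY')),
  PARTITION sales_mar2000 VALUES LESS THAN (TO_DATE('04/01/2000', 'MM/DD/YYYY')),
  PARTITION sales_apr2000 VALUES LESS THAN (TO_DATE('05/01/2000', 'MM/DD/YYYY'))
);
```

Each bound is exclusive, so a row dated 31 January 2000 lands in sales_jan2000 and a row dated 1 February 2000 lands in sales_feb2000. In this sketch a row dated on or after 1 May 2000 has no partition to map to and would be rejected unless a further partition is added.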
List Partitioning
List partitioning enables you to explicitly control how rows map to partitions. You do this
by specifying a list of discrete values for the partitioning key in the description for each
partition. This is different from range partitioning, where a range of values is associated
with a partition and from hash partitioning, where a hash function controls the row-to-
partition mapping. The advantage of list partitioning is that you can group and organize
unordered and unrelated sets of data in a natural way.
The details of list partitioning can best be described with an example. In this case, let's
say you want to partition a sales table by region. That means grouping states together
according to their geographical location as in the following example.
A row is mapped to a partition by checking whether the value of the partitioning column
for a row falls within the set of values that describes the partition. For example, the rows
are inserted as follows:
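The following sketch illustrates the mapping. The table name sales_list, its columns, the partition names, and the state groupings are assumptions made for this example, not definitions taken from the text.

```sql
-- List-partitioned table: each partition names the states it accepts.
CREATE TABLE sales_list (
  salesman_id   NUMBER(5),
  salesman_name VARCHAR2(30),
  sales_state   VARCHAR2(20),
  sales_amount  NUMBER(10),
  sales_date    DATE
)
PARTITION BY LIST (sales_state) (
  PARTITION sales_west    VALUES ('California', 'Hawaii'),
  PARTITION sales_east    VALUES ('New York', 'Virginia', 'Florida'),
  PARTITION sales_central VALUES ('Texas', 'Illinois')
);

-- 'Hawaii' is in the sales_west value list, so this row maps to sales_west.
INSERT INTO sales_list VALUES (10, 'Jones', 'Hawaii', 100, DATE '2000-01-05');

-- 'Florida' is in the sales_east value list, so this row maps to sales_east.
INSERT INTO sales_list VALUES (21, 'Smith', 'Florida', 150, DATE '2000-01-15');

-- 'Texas' is in the sales_central value list, so this row maps to sales_central.
INSERT INTO sales_list VALUES (32, 'Lee', 'Texas', 130, DATE '2000-01-21');

-- 'Nevada' appears in no value list, so this insert fails with an error.
INSERT INTO sales_list VALUES (43, 'Katz', 'Nevada', 120, DATE '2000-01-26');
```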
Unlike range and hash partitioning, multicolumn partition keys are not supported for list
partitioning. If a table is partitioned by list, the partitioning key can only consist of a
single column of the table.
The DEFAULT partition enables you to avoid specifying all possible values for a list-
partitioned table by using a default partition, so that all rows that do not map to any
other partition do not generate an error.
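As a sketch, assuming a list-partitioned table named sales_list that is partitioned on a sales_state column and has no default partition yet (both names are hypothetical), a default partition could be added like this:

```sql
-- Catch-all partition: receives every row whose key matches no other value list.
ALTER TABLE sales_list ADD PARTITION sales_other VALUES (DEFAULT);

-- With the default partition in place, an unlisted state is stored in sales_other
-- instead of causing the insert to fail.
INSERT INTO sales_list (salesman_id, salesman_name, sales_state, sales_amount, sales_date)
VALUES (43, 'Katz', 'Nevada', 120, DATE '2000-01-26');
```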
Hash Partitioning
Hash partitioning enables easy partitioning of data that does not lend itself to range or
list partitioning. It does this with a simple syntax and is easy to implement. It is a better
choice than range partitioning when:
• You do not know beforehand how much data maps into a given range
• The sizes of range partitions would differ quite substantially or would be
difficult to balance manually
• Range partitioning would cause the data to be undesirably clustered
• Performance features such as parallel DML, partition pruning, and partition-
wise joins are important
The concepts of splitting, dropping or merging partitions do not apply to hash partitions.
Instead, hash partitions can be added and coalesced.
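A minimal sketch of a hash-partitioned table and of the two maintenance operations mentioned above. The table name sales_hash, its columns, and the tablespace names ts1 through ts4 are assumptions for illustration.

```sql
-- Hash-partitioned table: rows are spread over four partitions
-- by a hash of salesman_id.
CREATE TABLE sales_hash (
  salesman_id   NUMBER(5),
  salesman_name VARCHAR2(30),
  sales_amount  NUMBER(10),
  week_no       NUMBER(2)
)
PARTITION BY HASH (salesman_id)
PARTITIONS 4
STORE IN (ts1, ts2, ts3, ts4);

-- Add one hash partition; existing rows are redistributed into it.
ALTER TABLE sales_hash ADD PARTITION;

-- Remove one hash partition; its rows are redistributed to the remaining ones.
ALTER TABLE sales_hash COALESCE PARTITION;
```

A power-of-two partition count is a common choice for hash partitioning because it tends to give the most even data distribution.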
Composite Partitioning
Composite partitioning partitions data using the range method, and within each partition,
subpartitions it using the hash or list method. Composite range-hash partitioning
provides the improved manageability of range partitioning and the data placement,
striping, and parallelism advantages of hash partitioning. Composite range-list
partitioning provides the manageability of range partitioning and the explicit control of
list partitioning for the subpartitions.
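A sketch of composite range-hash partitioning, using hypothetical table, column, and partition names: rows are first placed in a monthly range partition, then hashed on salesman_id into one of four subpartitions.

```sql
-- Range partitions by month, each split into four hash subpartitions.
CREATE TABLE sales_composite (
  salesman_id   NUMBER(5),
  salesman_name VARCHAR2(30),
  sales_amount  NUMBER(10),
  sales_date    DATE
)
PARTITION BY RANGE (sales_date)
SUBPARTITION BY HASH (salesman_id)
SUBPARTITIONS 4
(
  PARTITION sales_jan2000 VALUES LESS THAN (TO_DATE('02/01/2000', 'MM/DD/YYYY')),
  PARTITION sales_feb2000 VALUES LESS THAN (TO_DATE('03/01/2000', 'MM/DD/YYYY')),
  PARTITION sales_mar2000 VALUES LESS THAN (TO_DATE('04/01/2000', 'MM/DD/YYYY'))
);
```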
Partitioned Indexes
Local partitioned indexes are easier to manage than other types of partitioned indexes.
They also offer greater availability and are common in DSS environments. The reason
for this is equipartitioning: each partition of a local index is associated with exactly one
partition of the table. This enables Oracle to automatically keep the index partitions in
sync with the table partitions, and makes each table-index pair independent. Any
actions that make one partition's data invalid or unavailable only affect a single partition.
You cannot explicitly add a partition to a local index. Instead, new partitions are added
to local indexes only when you add a partition to the underlying table. Likewise, you
cannot explicitly drop a partition from a local index. Instead, local index partitions are
dropped only when you drop a partition from the underlying table.
A local index can be unique. However, in order for a local index to be unique, the
partitioning key of the table must be part of the index's key columns. Unique local
indexes are useful for OLTP environments.
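A sketch of both kinds of local index, assuming a table sales_range that is range partitioned on sales_date and has a salesman_id column (the table, column, and index names are illustrative).

```sql
-- Nonunique local index: one index partition per table partition,
-- created and dropped together with the table partitions.
CREATE INDEX sales_range_date_idx
  ON sales_range (sales_date)
  LOCAL;

-- Unique local index: allowed because the partitioning key (sales_date)
-- is part of the index key.
CREATE UNIQUE INDEX sales_range_uk
  ON sales_range (salesman_id, sales_date)
  LOCAL;
```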
Global partitioned indexes are flexible in that the degree of partitioning and the
partitioning key are independent from the table's partitioning method. They are
commonly used for OLTP environments and offer efficient access to any individual
record.
The highest partition of a global index must have a partition bound, all of whose values
are MAXVALUE. This ensures that all rows in the underlying table can be represented
in the index. Global prefixed indexes can be unique or nonunique.
You cannot add a partition to a global index because the highest partition always has a
partition bound of MAXVALUE. If you wish to add a new highest partition, use the
ALTER INDEX SPLIT PARTITION statement. If a global index partition is empty, you
can explicitly drop it by issuing the ALTER INDEX DROP PARTITION statement. If a
global index partition contains data, dropping the partition causes the next highest
partition to be marked unusable. You cannot drop the highest partition in a global index.
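A sketch of a range-partitioned global index and of adding a new partition at the high end by splitting. The table sales_range, its sales_amount column, and all index and partition names are assumptions for illustration.

```sql
-- Global index partitioned on sales_amount, independently of how the table
-- itself is partitioned. The highest partition is bounded by MAXVALUE.
CREATE INDEX sales_amount_gidx
  ON sales_range (sales_amount)
  GLOBAL PARTITION BY RANGE (sales_amount) (
    PARTITION p_low  VALUES LESS THAN (1000),
    PARTITION p_high VALUES LESS THAN (MAXVALUE)
  );

-- A partition cannot be added above MAXVALUE, so split the highest partition:
-- p_mid covers 1000 up to (but not including) 10000, p_max covers the rest.
ALTER INDEX sales_amount_gidx
  SPLIT PARTITION p_high AT (10000)
  INTO (PARTITION p_mid, PARTITION p_max);
```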
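By default, the following partition maintenance operations on a heap-organized table
mark all global indexes as unusable:
• ADD (HASH)
• COALESCE (HASH)
• DROP
• EXCHANGE
• MERGE
• MOVE
• SPLIT
• TRUNCATE
You can keep these indexes maintained by appending the UPDATE GLOBAL INDEXES clause
to the statement that performs the operation. Doing so has two advantages: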
• The index remains available and online throughout the operation. Hence no
other applications are affected by this operation.
• The index doesn't have to be rebuilt after the operation.
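One way to get this behavior is the UPDATE GLOBAL INDEXES clause, which can be appended to a partition maintenance statement. A minimal sketch, assuming a table sales_range with a partition named sales_jan2000 (both names are hypothetical):

```sql
-- Drop a table partition and maintain global indexes as part of the same
-- statement, so they remain usable afterward.
ALTER TABLE sales_range
  DROP PARTITION sales_jan2000
  UPDATE GLOBAL INDEXES;
```

The trade-off is that the maintenance statement itself takes longer, because the index entries are updated while it runs.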
Global nonpartitioned indexes behave just like a nonpartitioned index. They are
commonly used in OLTP environments and offer efficient access to any individual
record.
You can create bitmap indexes on partitioned tables, with the restriction that the bitmap
indexes must be local to the partitioned table. They cannot be global indexes.
Global indexes can be unique. Local indexes can only be unique if the partitioning key is
a part of the index key.
Here are a few guidelines for OLTP applications:
• Global indexes and unique local indexes provide better performance than
nonunique local indexes because they minimize the number of index partition probes.
• Local indexes offer better availability when there are partition or subpartition
maintenance operations on the table.
Here are a few guidelines for data warehousing and DSS applications:
• Local indexes are preferable because they are easier to manage during data
loads and during partition-maintenance operations.
• Local indexes can improve performance because many index partitions can
be scanned in parallel by range queries on the index key.
Partitioning to Improve Performance
Partitioning can help you improve performance and manageability. Some topics to keep
in mind when using partitioning for these reasons are:
• Partition Pruning
• Partition-wise Joins
• Parallel DML
Partition Pruning
The Oracle server explicitly recognizes partitions and subpartitions. It then optimizes
SQL statements to mark the partitions or subpartitions that need to be accessed and
eliminates (prunes) unnecessary partitions or subpartitions from access by those SQL
statements. In other words, partition pruning is the skipping of unnecessary index and
data partitions or subpartitions in a query.
For each SQL statement, depending on the selection criteria specified, unneeded
partitions or subpartitions can be eliminated. For example, if a query only involves
March sales data, then there is no need to retrieve data for the remaining eleven
months. Such intelligent pruning can dramatically reduce the data volume, resulting in
substantial improvements in query performance.
If the optimizer determines that the selection criteria used for pruning are satisfied by all
the rows in the accessed partition or subpartition, it removes those criteria from the
predicate list (WHERE clause) during evaluation in order to improve performance.
However, the optimizer cannot prune partitions if the SQL statement applies a function
to the partitioning column (with the exception of the TO_DATE function). Similarly, the
optimizer cannot use an index if the SQL statement applies a function to the indexed
column, unless it is a function-based index.
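The two queries below sketch the difference, assuming a table sales_range that is range partitioned by month on sales_date (table and column names are illustrative). Both ask for March 2000 sales, but only the first compares the partitioning column directly.

```sql
-- Candidate for pruning: the predicate is a plain range on the partitioning
-- column, so only the partition holding March 2000 needs to be accessed.
SELECT SUM(sales_amount)
FROM   sales_range
WHERE  sales_date >= TO_DATE('03/01/2000', 'MM/DD/YYYY')
AND    sales_date <  TO_DATE('04/01/2000', 'MM/DD/YYYY');

-- Not a candidate for pruning under the rule above: TO_CHAR is applied to the
-- partitioning column, so every partition may have to be examined.
SELECT SUM(sales_amount)
FROM   sales_range
WHERE  TO_CHAR(sales_date, 'MM/YYYY') = '03/2000';
```

Note that TO_DATE in the first query is applied to the literal, not to the column, which is why it does not interfere with pruning.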
Pruning can eliminate index partitions even when the underlying table's partitions
cannot be eliminated, but only when the index and table are partitioned on different
columns. You can often improve the performance of operations on large tables by
creating partitioned indexes that reduce the amount of data that your SQL statements
need to access or modify.
Equality, range, LIKE, and IN-list predicates are considered for partition pruning with
range or list partitioning, and equality and IN-list predicates are considered for partition
pruning with hash partitioning.
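For example, consider the following query:
SELECT SUM(value)
FROM orders
WHERE order_date BETWEEN '28-MAR-98' AND '23-APR-98';
If orders is range partitioned by month on order_date, the partitions for the other
months are pruned, and the query is then satisfied by one of the following:
• An index scan of the March and April data partitions, when index selectivity is high
• A full scan of the March and April data partitions, when index selectivity is low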
Partition-wise Joins
A partition-wise join is a join optimization that you can use when joining two tables that
are both partitioned along the join column(s). With partition-wise joins, the join operation
is broken into smaller joins that are performed sequentially or in parallel. Another way of
looking at partition-wise joins is that they minimize the amount of data exchanged
among parallel slaves during the execution of parallel joins by taking into account data
distribution.
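A sketch of a schema that makes a partition-wise join possible. The table and column names are hypothetical; the key assumption is that both tables are hash partitioned on the join column with the same number of partitions, so matching rows always sit in matching partitions.

```sql
-- Both tables are hash partitioned on cust_id into 16 partitions.
CREATE TABLE customers_hash (
  cust_id   NUMBER,
  cust_name VARCHAR2(50)
)
PARTITION BY HASH (cust_id) PARTITIONS 16;

CREATE TABLE orders_hash (
  order_id    NUMBER,
  cust_id     NUMBER,
  order_total NUMBER(10,2)
)
PARTITION BY HASH (cust_id) PARTITIONS 16;

-- The join is on the partitioning column, so it can be broken into 16 smaller
-- joins, one per pair of matching partitions.
SELECT c.cust_name, SUM(o.order_total) AS total_spent
FROM   customers_hash c
JOIN   orders_hash o ON o.cust_id = c.cust_id
GROUP  BY c.cust_name;
```

Whether the optimizer actually chooses a partition-wise join depends on its cost estimates; the execution plan is the place to confirm it.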
Parallel DML
Managing Indexes
This chapter discusses the management of indexes, and contains the following topics:
• About Indexes
• Guidelines for Managing Indexes
• Creating Indexes
• Altering Indexes
• Monitoring Space Use of Indexes
• Dropping Indexes
• Viewing Index Information
About Indexes
Indexes are optional structures associated with tables and clusters that allow SQL
statements to execute more quickly against a table. Just as the index in this manual
helps you locate information faster than if there were no index, an Oracle Database
index provides a faster access path to table data. You can use indexes without rewriting
any queries. Your results are the same, but you see them more quickly.
Indexes are logically and physically independent of the data in the associated table.
Being independent structures, they require storage space. You can create or drop an
index without affecting the base tables, database applications, or other indexes. The
database automatically maintains indexes when you insert, update, and delete rows of
the associated table. If you drop an index, all applications continue to work. However,
access to previously indexed data might be slower.
Guidelines for Managing Indexes
This section discusses guidelines for managing indexes.
Data is often inserted or loaded into a table using either SQL*Loader or an import
utility. It is more efficient to create an index for a table after inserting or loading the data.
If you create one or more indexes before loading data, the database then must update
every index as each row is inserted.
Creating an index on a table that already has data requires sort space. Some sort space
comes from memory allocated for the index creator. The amount for each user is
determined by the initialization parameter SORT_AREA_SIZE. The database also
swaps sort information to and from temporary segments that are only allocated during
the index creation in the user's temporary tablespace.
Under certain conditions, data can be loaded into a table with SQL*Loader direct-path
load and an index can be created as data is loaded.
Note:
Primary and unique keys automatically have indexes, but you might
want to create an index on a foreign key.
• Small tables do not require indexes. If a query is taking too long, then the
table might have grown from small to large.
Some columns are strong candidates for indexing. Columns with one or more of the
following characteristics are candidates for indexing:
This is because the first uses an index on COL_X (assuming that COL_X is a numeric
column).
Columns with the following characteristics are less suitable for indexing:
• There are many nulls in the column and you do not search on the not null
values.
The size of a single index entry cannot exceed roughly one-half (minus some overhead)
of the available space in the data block.
If you create a single index across columns to speed up queries that access, for
example, col1, col2, and col3, then queries that access just col1, or just col1
and col2, also run faster. But a query that accesses just col2, just col3, or just
col2 and col3 does not use the index.
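A sketch of the leading-column rule, using a hypothetical table t with columns col1, col2, col3, and payload:

```sql
-- Composite index; col1 is the leading column.
CREATE INDEX t_col123_idx ON t (col1, col2, col3);

-- These predicates start with the leading column, so the index can serve them.
SELECT payload FROM t WHERE col1 = 10;
SELECT payload FROM t WHERE col1 = 10 AND col2 = 20;

-- These predicates omit col1, so by the guideline above the index is not used.
SELECT payload FROM t WHERE col2 = 20;
SELECT payload FROM t WHERE col2 = 20 AND col3 = 30;
```

When choosing the column order, put the column that most queries filter on first.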
A table can have any number of indexes. However, the more indexes there are, the
more overhead is incurred as the table is modified. Specifically, when rows are inserted
or deleted, all indexes on the table must be updated as well. Also, when a column is
updated, all indexes that contain the column must be updated.
Thus, there is a trade-off between the speed of retrieving data from a table and the
speed of updating the table. For example, if a table is primarily read-only, having more
indexes can be useful; but if a table is heavily updated, having fewer indexes could be
preferable.
• It does not speed up queries. The table could be very small, or there could be
many rows in the table but very few index entries.
• The queries in your applications do not use the index.
• The index must be dropped before being rebuilt.
When an index is created for a table, data blocks of the index are filled with the existing
values in the table up to PCTFREE. The space reserved by PCTFREE for an index
block is only used when a new row is inserted into the table and the corresponding
index entry must be placed in the correct index block (that is, between preceding and
following index entries).
If no more space is available in the appropriate index block, the indexed value is placed
where it belongs (based on the lexical set ordering). Therefore, if you plan on inserting
many rows into an indexed table, PCTFREE should be high to accommodate the new
index values. If the table is relatively static without many inserts, PCTFREE for an
associated index can be low so that fewer blocks are required to hold the index data.
PCTUSED cannot be specified for indexes.
Estimating the size of an index before creating one can facilitate better disk space
planning and management. You can use the combined estimated size of indexes, along
with estimates for tables, the undo tablespace, and redo log files, to determine the
amount of disk space that is required to hold an intended database. From these
estimates, you can make correct hardware purchases and other decisions.
Use the estimated size of an individual index to better manage the disk space that the
index uses. When an index is created, you can set appropriate storage parameters and
improve I/O performance of applications that use the index. For example, assume that
you estimate the maximum size of an index before creating it. If you then set the
storage parameters when you create the index, fewer extents are allocated for the table
data segment, and all of the index data is stored in a relatively contiguous section of
disk space. This decreases the time necessary for disk I/O operations involving this
index.
The maximum size of a single index entry is approximately one-half the data block size.
Indexes can be created in any tablespace. An index can be created in the same tablespace
as the table it indexes or in a different one. If you use the same tablespace for a table
and its index, it can be more convenient to perform database maintenance (such as
tablespace or file backup) or to ensure application availability. All the related data is
always online together.
Using different tablespaces (on different disks) for a table and its index produces better
performance than storing the table and index in the same tablespace. Disk contention is
reduced. But, if you use different tablespaces for a table and its index and one
tablespace is offline (containing either data or index), then the statements referencing
that table are not guaranteed to work.
When creating an index in parallel, storage parameters are used separately by each
query server process. Therefore, an index created with an INITIAL value of 5M and a
parallel degree of 12 consumes at least 60M of storage during index creation.
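The arithmetic can be seen in a sketch like the following, where the index, table, and column names are illustrative:

```sql
-- Each of the 12 parallel servers allocates its own INITIAL extent of 5M,
-- so the build consumes at least 12 x 5M = 60M.
CREATE INDEX emp_ename ON emp (ename)
  STORAGE (INITIAL 5M)
  PARALLEL 12;
```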
You can create an index and generate minimal redo log records by specifying
NOLOGGING in the CREATE INDEX statement.
In general, the relative performance improvement is greater for larger indexes created
without LOGGING than for smaller ones. Creating small indexes without LOGGING has
little effect on the time it takes to create an index. However, for larger indexes the
performance improvement can be significant, especially when you are also parallelizing
the index creation.
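A sketch that combines both options, with illustrative index, table, and column names and an arbitrary degree of parallelism:

```sql
-- Build the index with minimal redo generation and four parallel servers.
CREATE INDEX emp_ename ON emp (ename)
  NOLOGGING
  PARALLEL 4;
```

Because the build generates minimal redo, the index cannot be recovered from the redo logs after a media failure; taking a backup after the build is a sensible precaution.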
In situations where you have B-tree index leaf blocks that can be freed up for reuse, you
can merge those leaf blocks using the following statement:
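For the index vmoore discussed in this section, the statement would look like this sketch:

```sql
-- Merge adjacent, partially filled leaf blocks and free the emptied ones.
ALTER INDEX vmoore COALESCE;
```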
Figure 15-1 illustrates the effect of an ALTER INDEX COALESCE on the index vmoore.
Before performing the operation, the first two leaf blocks are 50% full. This means you
have an opportunity to reduce fragmentation and completely fill the first block, while
freeing up the second. In this example, assume that PCTFREE=0.
Because unique and primary keys have associated indexes, you should factor in the
cost of dropping and creating indexes when considering whether to disable or drop a
UNIQUE or PRIMARY KEY constraint. If the associated index for a UNIQUE key or
PRIMARY KEY constraint is extremely large, you can save time by leaving the
constraint enabled rather than dropping and re-creating the large index. You also have
the option of explicitly specifying that you want to keep or drop the index when dropping
or disabling a UNIQUE or PRIMARY KEY constraint.
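A sketch of the keep and drop options, assuming a table emp with a primary key constraint (the table name is illustrative). Each statement is an independent alternative, not a sequence to run in order.

```sql
-- Disable the constraint but keep its index, so that re-enabling later
-- does not require rebuilding a large index.
ALTER TABLE emp DISABLE PRIMARY KEY KEEP INDEX;

-- Drop the constraint but keep the index for query use.
ALTER TABLE emp DROP PRIMARY KEY KEEP INDEX;

-- Drop the constraint and its index together.
ALTER TABLE emp DROP PRIMARY KEY DROP INDEX;
```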
Creating Indexes
This section describes how to create indexes. To create an index in your own schema,
at least one of the following conditions must be true:
To create an index in another schema, all of the following conditions must be true:
You can create indexes explicitly (outside of integrity constraints) using the SQL
statement CREATE INDEX. The following statement creates an index named
emp_ename for the ename column of the emp table:
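A sketch of such a statement follows. The index, table, and column names come from the text; the tablespace name and the storage values are assumptions for illustration.

```sql
-- Explicitly created index with its own tablespace and storage settings.
CREATE INDEX emp_ename ON emp (ename)
  TABLESPACE users
  STORAGE (INITIAL 20K
           NEXT 20K
           PCTINCREASE 75);
```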
Notice that several storage settings and a tablespace are explicitly specified for the
index. If you do not specify storage options (such as INITIAL and NEXT) for an index,
the default storage options of the default or specified tablespace are automatically used.
See Also:
Oracle Database SQL Reference for syntax and restrictions on the use
of the CREATE INDEX statement
Indexes can be unique or nonunique. Unique indexes guarantee that no two rows of a
table have duplicate values in the key column (or columns). Nonunique indexes do not
impose this restriction on the column values.
Use the CREATE UNIQUE INDEX statement to create a unique index. The following
example creates a unique index:
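A minimal sketch, in which the table dept, the column dname, the index name, and the tablespace indx are all assumed names:

```sql
-- Unique index: rejects any second row with the same dname value.
CREATE UNIQUE INDEX dept_unique_index ON dept (dname)
  TABLESPACE indx;
```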
Alternatively, you can define UNIQUE integrity constraints on the desired columns. The
database enforces UNIQUE integrity constraints by automatically defining a unique
index on the unique key. This is discussed in the following section. However, it is
advisable that any index that exists for query performance, including unique indexes, be
created explicitly.
Note:
An efficient procedure for enabling a constraint that can make use of
parallelism is described in "Efficient Use of Integrity Constraints: A
Procedure".
You can set the storage options for the indexes associated with UNIQUE and PRIMARY
KEY constraints using the USING INDEX clause. The following CREATE TABLE
statement enables a PRIMARY KEY constraint and specifies the storage options of the
associated index:
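A sketch of such a statement. The columns, the constraint name emp_pk, the tablespace, and the storage values are assumptions for illustration.

```sql
-- The USING INDEX clause sets the physical attributes of the index that the
-- database creates to enforce the primary key.
CREATE TABLE emp (
  empno NUMBER(5),
  ename VARCHAR2(30),
  CONSTRAINT emp_pk PRIMARY KEY (empno)
    USING INDEX
      PCTFREE 10
      TABLESPACE users
      STORAGE (INITIAL 20K NEXT 20K)
);
```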
If you require more explicit control over the indexes associated with UNIQUE and
PRIMARY KEY constraints, the database lets you:
• Specify an existing index that the database is to use to enforce the constraint
• Specify a CREATE INDEX statement that the database is to use to create the
index and enforce the constraint
These options are specified using the USING INDEX clause. The following statements
present some examples.
Example 1:
CREATE TABLE a (
  a1 INT PRIMARY KEY USING INDEX (CREATE INDEX ai ON a (a1)));
Example 2:
CREATE TABLE b (
  b1 INT,
  b2 INT,
  CONSTRAINT bu1 UNIQUE (b1, b2)
    USING INDEX (CREATE UNIQUE INDEX bi ON b (b1, b2)),
  CONSTRAINT bu2 UNIQUE (b2, b1) USING INDEX bi);
Example 3:
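The body of this example is not shown here. A minimal sketch, assuming it demonstrates the first option listed above (enforcing a constraint with an index that already exists), with hypothetical table, index, and constraint names:

```sql
-- Create the table and an ordinary index first.
CREATE TABLE c (c1 INT, c2 INT);
CREATE INDEX ci ON c (c1, c2);

-- Then add the constraint and tell the database to enforce it with the
-- existing index ci, whose leading column is the constrained column c1.
ALTER TABLE c ADD CONSTRAINT cpk PRIMARY KEY (c1) USING INDEX ci;
```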
If a single statement creates an index with one constraint and also uses that index for
another constraint, the system will attempt to rearrange the clauses to create the index
before reusing it.
Oracle Database provides you with the opportunity to collect statistics at very little
resource cost during the creation or rebuilding of an index. These statistics are stored in
the data dictionary for ongoing use by the optimizer in choosing a plan for the execution
of SQL statements. The following statement computes index, table, and column
statistics while building index emp_ename on column ename of table emp:
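A sketch of the statement, using the names given in the text:

```sql
-- Gather statistics as a by-product of the index build.
CREATE INDEX emp_ename ON emp (ename)
  COMPUTE STATISTICS;
```

In later database releases, statistics are collected automatically when an index is created or rebuilt, so the clause may be unnecessary there.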
Using this procedure can avoid the problem of expanding your usual, and usually
shared, temporary tablespace to an unreasonably large size that might affect future
performance.
Creating an Index Online
You can create and rebuild indexes online. This enables you to update base tables at
the same time you are building or rebuilding indexes on that table. You can perform
DML operations while the index build is taking place, but DDL operations are not
allowed. Parallel execution is not supported when creating or rebuilding an index online.
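A minimal sketch of an online build; the index name emp_name and the indexed column are assumptions for illustration:

```sql
-- Build the index while DML against emp continues.
CREATE INDEX emp_name ON emp (ename) ONLINE;
```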
Note:
While you can perform DML operations during an online index build,
Oracle recommends that you do not perform major/large DML
operations during this procedure. This is because while the DML on the
base table is taking place it holds a lock on that resource. The DDL to
build the index cannot proceed until the transaction acting on the base
table commits or rolls back, thus releasing the lock.
For example, if you want to load rows that total up to 30% of the size of
an existing table, you should perform this load before the online index
build.
To create a function-based index, you must have the COMPATIBLE parameter set to
8.1.0.0.0 or higher. In addition to the prerequisites for creating a conventional index, if
the index is based on user-defined functions, then those functions must be marked
DETERMINISTIC. Also, you must have the EXECUTE object privilege on any user-defined
function(s) used in the function-based index if those functions are owned by
another user.
CREATE INDEX stores the timestamp of the most recent function used
in the function-based index. This timestamp is updated when the index
is validated. When performing tablespace point-in-time recovery of a
function-based index, if the timestamp on the most recent function used
in the index is newer than the timestamp stored in the index, then the
index is marked invalid. You must use the ANALYZE INDEX ...
VALIDATE STRUCTURE statement to validate this index.
In the following SQL statement, when area(geo) is referenced in the WHERE clause,
the optimizer considers using the index area_index.
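A sketch of the index and a query that can use it. The names area, geo, and area_index come from the text; the table rivers and its id column are assumptions, and area is assumed to be a user-defined function declared DETERMINISTIC.

```sql
-- Index the result of area(geo) rather than the geo column itself.
CREATE INDEX area_index ON rivers (area(geo));

-- The WHERE clause references the same expression, so the optimizer can
-- consider area_index instead of evaluating area(geo) for every row.
SELECT id, geo, area(geo)
FROM   rivers
WHERE  area(geo) > 5000;
```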
Table owners should have EXECUTE privileges on the functions used in function-based
indexes.
• You have a nonunique index where ROWID is appended to make the key
unique. If you use key compression here, the duplicate key is stored as a prefix entry on
the index block without the ROWID. The remaining rows become suffix entries
consisting of only the ROWID.
• You have a unique multicolumn index.
You enable key compression using the COMPRESS clause. The prefix length (as the
number of key columns) can also be specified to identify how the key columns are
broken into a prefix and suffix entry. For example, the following statement compresses
duplicate occurrences of a key in the index leaf block:
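A sketch of such a statement, with illustrative index, table, column, and tablespace names. A prefix length of 1 means the first key column forms the prefix entry.

```sql
-- Store each distinct ename value once per leaf block as a prefix entry.
CREATE INDEX emp_ename ON emp (ename)
  TABLESPACE users
  COMPRESS 1;
```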
The COMPRESS clause can also be specified during rebuild. For example, during
rebuild you can disable compression as follows:
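A sketch, assuming an existing compressed index named emp_ename:

```sql
-- Rebuild the index with key compression turned off.
ALTER INDEX emp_ename REBUILD NOCOMPRESS;
```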
Altering Indexes
To alter an index, your schema must contain the index or you must have the ALTER
ANY INDEX system privilege. Among the actions allowed by the ALTER INDEX
statement are:
Alter the storage parameters of any index, including those created by the database to
enforce primary and unique key integrity constraints, using the ALTER INDEX
statement. For example, the following statement alters the emp_ename index:
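A sketch of such a statement; the storage values are assumptions chosen for illustration:

```sql
-- Change the size of future extents and the extent limit for the index.
ALTER INDEX emp_ename
  STORAGE (NEXT 40K MAXEXTENTS 100);
```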
The storage parameters INITIAL and MINEXTENTS cannot be altered. All new settings
for the other storage parameters affect only extents subsequently allocated for the
index.
For indexes that implement integrity constraints, you can adjust storage parameters by
issuing an ALTER TABLE statement that includes the USING INDEX subclause of the
ENABLE clause. For example, the following statement changes the storage options of
the index created on table emp to enforce the primary key constraint:
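A sketch of the syntax only; the PCTFREE value is an assumption, and the exact effect on an index that already exists should be checked against the SQL reference for your release:

```sql
-- Enable the primary key and specify attributes for its enforcing index.
ALTER TABLE emp
  ENABLE PRIMARY KEY USING INDEX
  PCTFREE 5;
```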
See Also:
Oracle Database SQL Reference for syntax and restrictions on the use
of the ALTER INDEX statement
Before rebuilding an existing index, compare the costs and benefits associated with
rebuilding to those associated with coalescing indexes as described in Table 15-1.
When you rebuild an index, you use an existing index as the data source. Creating an
index in this manner enables you to change storage characteristics or move to a new
tablespace. Rebuilding an index based on an existing data source removes intra-block
fragmentation. Compared to dropping the index and using the CREATE INDEX
statement, re-creating an existing index offers better performance.
You have the option of rebuilding the index online. The following statement rebuilds the
emp_name index online:
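A sketch of the statement:

```sql
-- Rebuild the index while DML against the base table continues.
ALTER INDEX emp_name REBUILD ONLINE;
```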
If you do not have the space required to rebuild an index, you can choose instead to
coalesce the index. Coalescing an index is an online operation.
The view V$OBJECT_USAGE can be queried for the index being monitored to see if
the index has been used. The view contains a USED column whose value is YES or
NO, depending on whether the index has been used within the time period being monitored.
The view also contains the start and stop times of the monitoring period, and a
MONITORING column (YES/NO) to indicate if usage monitoring is currently active.
Each time that you specify MONITORING USAGE, the V$OBJECT_USAGE view is
reset for the specified index. The previous usage information is cleared or reset, and a
new start time is recorded. When you specify NOMONITORING USAGE, no further
monitoring is performed, and the end time is recorded for the monitoring period. Until
the next ALTER INDEX ... MONITORING USAGE statement is issued, the view
information is left unchanged.
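A sketch of one monitoring cycle, assuming an index named emp_ename owned by the current user:

```sql
-- Start monitoring; this resets any earlier usage information for the index.
ALTER INDEX emp_ename MONITORING USAGE;

-- After a representative workload has run, check whether the index was used.
SELECT index_name, monitoring, used, start_monitoring, end_monitoring
FROM   v$object_usage
WHERE  index_name = 'EMP_ENAME';

-- Stop monitoring; the end time of the period is recorded.
ALTER INDEX emp_ename NOMONITORING USAGE;
```

An index that still shows USED = NO after a full business cycle is a candidate for dropping.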
Monitoring Space Use of Indexes
The percentage of index space usage varies according to how often index keys are
inserted, updated, or deleted. Develop a history of average efficiency of space usage for
an index by performing the following sequence of operations several times:
• Analyzing statistics
• Validating the index
• Checking PCT_USED
• Dropping and rebuilding (or coalescing) the index
When you find that index space usage drops below its average, you can condense the
index space by dropping the index and rebuilding it, or coalescing it.
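The validate-and-check steps can be sketched as follows, assuming an index named emp_ename:

```sql
-- Validate the index; this populates INDEX_STATS for the current session.
ANALYZE INDEX emp_ename VALIDATE STRUCTURE;

-- INDEX_STATS holds the result of the most recent validation.
SELECT name, pct_used
FROM   index_stats;

-- If PCT_USED has fallen below its usual average, condense the index.
ALTER INDEX emp_ename COALESCE;
```

Recording the PCT_USED value after each run builds the history that the average is based on.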
Dropping Indexes
To drop an index, the index must be contained in your schema, or you must have the
DROP ANY INDEX system privilege.
When you drop an index, all extents of the index segment are returned to the containing
tablespace and become available for other objects in the tablespace.
How you drop an index depends on whether you created the index explicitly with a
CREATE INDEX statement, or implicitly by defining a key constraint on a table. If you
created the index explicitly with the CREATE INDEX statement, then you can drop the
index with the DROP INDEX statement. The following statement drops the emp_ename
index:
DROP INDEX emp_ename;
You cannot drop only the index associated with an enabled UNIQUE key or PRIMARY
KEY constraint. To drop a constraint's associated index, you must disable or drop the
constraint itself.
Viewing Index Information
View: DBA_INDEXES, ALL_INDEXES, USER_INDEXES
Description: DBA view describes indexes on all tables in the database. ALL view
describes indexes on all tables accessible to the user. USER view is restricted to
indexes owned by the user. Some columns in these views contain statistics that are
generated by the DBMS_STATS package or ANALYZE statement.

View: DBA_IND_COLUMNS, ALL_IND_COLUMNS, USER_IND_COLUMNS
Description: These views describe the columns of indexes on tables. Some columns in
these views contain statistics that are generated by the DBMS_STATS package or
ANALYZE statement.

View: DBA_IND_EXPRESSIONS, ALL_IND_EXPRESSIONS, USER_IND_EXPRESSIONS
Description: These views describe the expressions of function-based indexes on tables.

View: INDEX_STATS
Description: Stores information from the last ANALYZE INDEX ... VALIDATE STRUCTURE
statement.

View: INDEX_HISTOGRAM
Description: Stores information from the last ANALYZE INDEX ... VALIDATE STRUCTURE
statement.

View: V$OBJECT_USAGE
Description: Contains index usage information produced by the ALTER INDEX ...
MONITORING USAGE functionality.