Executive Summary
Introduction
Why Parallel Execution?
  The ultimate goal: scalability
  Shared everything – the Oracle advantage
Fundamental Concepts of Oracle's Parallel Execution
  Processing parallel SQL statements
    Query Coordinator (QC) and parallel servers
    Producer/consumer model
    Granules
    Data redistribution
  Enabling parallel execution in Oracle
Controlling SQL Parallel Execution in Oracle
  Understand your target workload
  Controlling the degree of parallelism
  Controlling the usage of parallelism
Oracle SQL Parallel Execution best practices
  Start with a balanced system
    Calibrate your configuration
    Stripe And Mirror Everything (S.A.M.E.) – use ASM
  Set database initialization parameters for good performance
    Memory allocation
    Controlling parallel servers
    Enabling efficient I/O throughput
  Use parallel execution with common sense
    Don't enable parallelism for small objects
    Use parallelism to achieve your goals, not to exceed them
    Avoid using hints
  Combine parallel execution with Oracle Partitioning
  Ensure statistics are good enough
  Monitor parallel execution activity
  Whether or not to use parallel execution in RAC
  Use Database Resource Manager
  Don't try to solve hardware deficiencies with other features
  Don't ignore other features
Monitoring SQL Parallel Execution
EXECUTIVE SUMMARY
Parallel execution is one of the fundamental database technologies that enable
organizations to manage and access tens, if not hundreds, of terabytes of
data. Without parallelism, these large databases, commonly used for data
warehouses but increasingly found in operational systems as well, would not
exist.
Parallel execution is the ability to apply multiple CPU and I/O resources to the
execution of a single database operation. While every major database vendor
today provides parallel capabilities, there remain key differences in the
architectures provided by the various vendors.
SQL parallel execution was first introduced in Oracle more than a decade ago1
and has been enriched and improved since. This paper discusses the parallel
execution architecture of Oracle Database 11g and shows its superiority over
alternative architectures for real-world applications. This paper also touches on
how to control and monitor parallel execution; lastly, it gives insight into
upgrade considerations when migrating from earlier versions of Oracle.
While the focus of this paper is on Oracle Database 11g, the fundamental
concepts are also applicable to earlier versions of Oracle.
INTRODUCTION
Databases today, irrespective of whether they are data warehouses, operational
data stores, or OLTP systems, contain a wealth of information. However,
finding and presenting the right information in a timely fashion can be a
challenge because of the vast quantity of data involved.
Parallel execution is the capability that addresses this challenge. Using
parallelism, terabytes of data can be processed in minutes or even less, not hours
or days. Parallel execution uses multiple processes to accomplish a single task –
to complete a SQL statement in the case of SQL parallel execution. The more
effectively the database software can leverage all hardware resources – multiple
cores, multiple I/O channels, or even multiple nodes in a cluster - the more
efficiently queries and other database operations will be processed.
1 Parallel execution was first introduced in Oracle Version 7.3 in 1996
Figure: absolute processing time (vertical axis) versus resources in units of x (1x through 10x)
The graph does not look linear to you, right? Look again: it shows the absolute
processing time, not a relative speedup factor. For example, using 2x the
resources reduces the processing time from 360 to 180, and going from 2x to 4x
reduces it further to 90; both are cases of linear scalability. It's just that the
absolute performance gain of each additional doubling of resources becomes
smaller and smaller.
In a shared nothing system CPU cores are solely responsible for individual data
sets, and the only way to access a specific piece of data is to use the CPU
core that owns this subset of data2; such systems are also commonly known
as massively parallel processing (MPP) systems.
2 Some implementations use a small, static number of cores as the smallest unit; for the sake
of simplicity we will discuss them as one core, as the architectural trade-offs are identical
----------------------------------------
| Id | Operation | Name |
----------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | HASH JOIN | |
| 2 | TABLE ACCESS FULL| CUSTOMERS |
| 3 | TABLE ACCESS FULL| SALES |
----------------------------------------
Figure 4: customer purchase information, serial plan
----------------------------------------------------------------------------------------
| Id | Operation | Name | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | PX COORDINATOR | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 | P->S | QC (RAND) |
| 3 | HASH JOIN | | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | Q1,01 | PCWP | |
| 5 | PX SEND BROADCAST | :TQ10000 | Q1,00 | P->P | BROADCAST |
| 6 | PX BLOCK ITERATOR | | Q1,00 | PCWC | |
| 7 | TABLE ACCESS FULL | CUSTOMERS | Q1,00 | PCWP | |
| 8 | PX BLOCK ITERATOR | | Q1,01 | PCWC | |
| 9 | TABLE ACCESS FULL | SALES | Q1,01 | PCWP | |
-----------------------------------------------------------------------------------------
...
Figure 7: parallel server processes seen on the OS level using 'ps -ef'
3 Parallel plans will look different in versions prior to Oracle Database 10g.
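A parallel plan like the one above can be generated for any suitable query. As a sketch (the DOP of 4 and the use of per-table parallel hints are illustrative choices, not taken from the original example), you can request parallelism and display the plan with DBMS_XPLAN:

explain plan for
select /*+ parallel(c 4) parallel(s 4) */ *
from customers c, sales s
where s.customer_id = c.id ;

select * from table(dbms_xplan.display) ;

The 'IN-OUT' column of the displayed plan then shows how data flows between parallel servers and the QC (for example P->P for parallel to parallel, P->S for parallel to serial).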
Producer/consumer model
Continuing with our car counting example, imagine the job is to count the total
number of cars per car color you see. Well, if you and your friend are going to
cover one side of the road each, each one of you potentially sees the same colors
and gets a subtotal for the colors, but not the complete result for the street. You
could go ahead, memorize all this information and tell it back to the third person
(the “person in charge”), but this poor individual then has to sum up all of the
results by himself – what if all cars in the street had a different color? The third
person would redo exactly the same work as you and your friend. To parallelize
the counting per color, you instead ask two additional friends for help4: you and
your friend pass on the color of every car you see to the friend responsible for
that color, and each of these two “car color counters” keeps the running totals
for the colors he is in charge of.
Figure 9: slave set 1 “produces” rows from table CUSTOMERS; slave set 2 “consumes” these records and joins them with table SALES
Operations (rowsources) that are processed by the same set of parallel servers
can be identified in an execution plan by looking in the 'TQ' column. As
shown in Figure 9, the first slave set (Q1,00) is reading table CUSTOMERS in
parallel and producing rows that are sent to the second slave set (Q1,01), which
consumes these records and joins them with table SALES. Whenever data is
distributed from producers to consumers, a corresponding pair of PX SEND and
PX RECEIVE operations appears in the execution plan.
4 Note that the number of additional friends is not related to the number of distinct car
colors, but matches exactly the number of people that are counting cars. We want to use
our additional friends in the most efficient manner and - assuming that all “car
scanners” have equally distributed incremental results on a continuous basis -
having as many “car color counters” as “car scanners” keeps them continuously
busy as well; using three friends instead of two to count the car colors would
leave all three of them without work for about 30% of their time (on average).
Granules
A granule is the smallest unit of work when accessing data. Oracle Database
uses a shared everything architecture, which from a storage perspective means
that any CPU core in a configuration can access any piece of data; this is the
most fundamental architectural difference between Oracle and all other major
database products on the market. Unlike all other systems, Oracle can, and will,
choose this smallest unit of work based solely on a query's requirements.
The basic mechanism the Oracle Database uses to distribute work for parallel
execution is block ranges on disk – so-called block-based granules. This
methodology is unique to Oracle and is not dependent on whether the
underlying objects have been partitioned. Access to the underlying objects is
divided into a large number of granules, which are given out to parallel servers to
work on (when a parallel server finishes the work for one granule, the next
one is given out). The number of granules is always much higher than the
requested DOP, so that work can be distributed evenly among the parallel servers.
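Block-based granules are visible in a parallel execution plan as the PX BLOCK ITERATOR rowsource shown earlier. As a simple illustration (the DOP of 8 and the hint are arbitrary choices for this sketch), even a parallel scan of a completely unpartitioned table is broken into block-range granules:

select /*+ parallel(s 8) */ count(*)
from sales s ;

In the plan for this statement a PX BLOCK ITERATOR appears above the TABLE ACCESS FULL of SALES, even though SALES is not partitioned.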
Data redistribution
Parallel operations – except for the most basic ones – typically require data
redistribution. Data redistribution is required in order to perform operations such
as parallel sorts, aggregations and joins. At the block-granule level the database
has no knowledge of the actual data content of an individual granule. Data
has to be redistributed as soon as a subsequent operation relies on the actual
content. Remember the last car example? The car color mattered, but you don't
know – or even control – what color car is parked where on the street. You
redistributed the information about the number of cars per color to the additional
two friends based on their color responsibility, enabling them to do the total
counting for the colors they're in charge of.
Data redistribution takes place between individual parallel servers either within
a single machine, or, in the case of parallel execution across multiple machines
in a Real Application Clusters (RAC) database, between parallel servers on
multiple machines. Of course in the latter case interconnect communication is
used for the data redistribution while shared-memory is used for the former.
Data redistribution is not unique to the Oracle Database. In fact, this is one of
the most fundamental principles of parallel processing, being used by every
product that provides parallel capabilities. The fundamental difference and
advantage of Oracle's capabilities, however, is that parallel data access
(discussed in the granules section earlier) and therefore the necessary data
redistribution are not constrained by a predetermined, static distribution of the
data.
Serial join
In a serial join a single session reads both tables and performs the join. In this
example we assume two large tables CUSTOMERS and SALES are involved in
the join.
The database uses full table scans to access both tables. For a serial join the
single serial session (red arrows) can perform the full join because all matching
values from the CUSTOMERS table are read by one process. Figure 11 depicts
the serial join5.
5 Please note that the figures in this section represent logical diagrams to explain data
redistribution. In an actual database environment data would typically be striped across
multiple physical disks, accessible to any parallel server. This complexity has deliberately
been left out from the images.
Processing the same simple join in parallel, a redistribution of rows will become
necessary. Parallel servers scan parts of either table based on block ranges and in
order to complete the join, rows have to be distributed between parallel servers.
Figure 12 depicts the data redistribution for a parallel join at a DOP 2,
represented by the green and red arrow respectively. Both tables are read in
parallel by both the red and green process (using block-range granules) and then
each parallel server has to redistribute its result set based on the join key to the
subsequent parallel join operator.
There are many data redistribution methods. The following are among the most
common ones:
– HASH: Hash redistribution is very common in parallel execution in order to
achieve an equal distribution of work among individual parallel servers based
on a hash distribution. Hash (re)distribution is the basic parallel execution
enabling mechanism for most data warehouse database systems, most
notably MPP systems.
– BROADCAST: Broadcast redistribution happens when one of the two
result sets in a join operation is much smaller than the other result set.
Instead of redistributing rows from both result sets, the database sends the
smaller result set to all parallel servers in order to guarantee that the individual
servers are able to complete their join operation. The small result set may be
produced in serial or in parallel.
– RANGE: Range redistribution is generally used for parallel sort operations.
Individual parallel servers work on data ranges so that the QC does not have
to do any sorting but only to present the individual parallel server results in
the correct order.
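The optimizer normally chooses the redistribution method automatically based on the sizes of the row sources. Purely for experimentation, a specific method can be requested with the PQ_DISTRIBUTE hint; the following is a hedged sketch, not a recommendation (this paper advises against hints in production code):

select /*+ parallel(c 8) parallel(s 8) pq_distribute(s hash hash) */
       c.state_province, sum(s.amount)
from customers c, sales s
where s.customer_id = c.id
group by c.state_province ;

Here 'hash hash' requests a HASH redistribution of both join inputs on the join key; 'broadcast none' would instead request that the other (outer) row source be broadcast to all parallel servers.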
Figure 13: Data redistribution for a simple parallel join using a HASH
redistribution.
Data redistribution is shown in the SQL execution plan in the 'PQ Distrib'
column. The execution plan for the simple parallel join illustrated in Figure 13
shows a HASH redistribution in this column.
If at least one of the tables accessed in the join has been partitioned on the join
key the database may decide to use a partition-wise join. If both tables are equi-
partitioned on the join key the database may use a full partition-wise join.
Otherwise a partial partition-wise join may be used in which one of the tables is
dynamically partitioned in memory followed by a full partition-wise join.
A partition-wise join does not require any data redistribution because individual
parallel servers will work on the equivalent partitions of both joined tables.
As shown in Figure 14, the red parallel process reads data partition one of the
CUSTOMERS table AND data partition one of the SALES table; the equi-
partitioning of both tables on the join key guarantees that there will be no
matching rows for the join outside of these two partitions. The red parallel
process will always be able to complete the full join by reading just these
matching partitions. The same is true for the green parallel server process, and
for any pair of partitions of these two tables. Note that partition-wise joins use
partition-based granules rather than block-based granules.
The partition-wise join is the fundamental enabler for shared nothing systems.
Shared nothing systems typically scale well as long as they can take advantage
of partition-wise joins. As a result, the choice of partitioning (distribution) in a
shared nothing system is critical as well as the access path to the tables.
Operations that do not use partition-wise operations in an MPP system often do
not scale well.
The tables are initially not partitioned and there are no indexes on the tables.
You want to know the total revenue for the last two months of 2007 in the
United States, by state. The following query retrieves this result:
select c.state_province
, sum(s.amount) revenue
from customers c
, sales s
where s.customer_id = c.id
and s.purchase_date
between to_date('01-NOV-2007','DD-MON-YYYY')
and to_date('31-DEC-2007','DD-MON-YYYY')
and c.country = 'United States of America'
group by c.state_province
/
Assume you run the query without enabling parallel execution and it takes 10
minutes to complete.
The end user who runs the query expects a faster response time (less than 3
minutes) and one way to achieve this, assuming there are surplus resources
available, is to execute in parallel.
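One way to enable this (a sketch assuming that DEFAULT parallelism, discussed later, is appropriate for these tables) is to mark both tables for parallel execution:

alter table customers parallel ;
alter table sales parallel ;

From then on, the database considers parallel execution for every statement that accesses these tables.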
By default the Oracle Database is configured to support parallel execution
out-of-the-box. The most relevant database initialization parameters are:
– parallel_max_servers: the maximum number of parallel servers that
can be started by the database instance. In order to execute an operation in
parallel, parallel servers must be available (i.e. not in use by another parallel
operation). By default the value for parallel_max_servers is derived
from other database settings and will be discussed later in this paper. Going
back to the example of counting cars and using help from friends:
parallel_max_servers is the maximum number of friends that you
can call for help.
– parallel_min_servers: the minimum number of parallel servers that
are always started when the database instance is running.
parallel_min_servers enables you to avoid any delay in the
execution of a parallel statement caused by having to spawn parallel servers
first.
Verify that parallel execution is enabled for your database instance (connect to
the database as a DBA or SYSDBA):
SQL> show parameter parallel_max_servers
Parallel execution can enable a single operation to utilize all system resources.
While this may not be a problem in certain scenarios there are many cases in
which this would not be desirable. Consider the workload to which you want to
apply parallel execution to get optimum use of the system while satisfying your
requirements.
Single-user workload
DEFAULT parallelism
Unlike DEFAULT parallelism, a specific DOP can be requested from the
Oracle database. For example, you can set a fixed DOP at a table or index level:
alter table customers parallel 8 ;
alter table sales parallel 16 ;
In this case queries accessing just the customers table use a requested DOP of 8,
and queries accessing the sales table request a DOP of 16. A query accessing
both the sales and the customers table will be processed with a DOP of 16 and
potentially allocate 32 parallel servers (producers and consumers); whenever
different DOPs are specified, Oracle uses the higher DOP7.
Adaptive parallelism
When using Oracle's adaptive parallelism capabilities, the database will use an
algorithm at SQL execution time to determine whether a parallel operation
should receive the requested DOP or be throttled down to a lower DOP.
In a system that makes aggressive use of parallel execution by using a high DOP,
the adaptive algorithm will throttle the DOP down even when only a few
operations are running in parallel. While the algorithm will still ensure optimal
resource utilization, users may experience inconsistent response times. Using
solely the adaptive algorithm will therefore not provide predictable response
times; if predictability matters, consider controlling parallelism explicitly, for
example with the Database Resource Manager discussed later in this paper.
6 We are oversimplifying here for the purpose of an easy explanation. The multiplication
factor of two is derived from the initialization parameter parallel_threads_per_cpu, an OS-
specific parameter that is set to two on most platforms
7 Some statements do not fall under this rule, such as a parallel CREATE TABLE AS
SELECT; a discussion of these exceptions is beyond the scope of this paper.
Once a SQL statement starts execution at a certain DOP it will not change the
DOP throughout its execution. However if you start at a low DOP – either as a
result of adaptive parallel execution or because there were simply not enough
parallel servers available - it may take a very long time to complete the
execution of the SQL statement. If the completion of a statement is time-critical
then you may want to either guarantee a minimal DOP or not execute at all (and
maybe warn the DBA or programmatically try again later when the system is
less loaded).
To guarantee a minimal DOP, use the initialization parameter
parallel_min_percent. This parameter controls the minimal percentage
of parallel server processes that must be available to start the operation; it
defaults to 0, meaning that Oracle will always execute the statement,
irrespective of the number of available parallel server processes.
For example, if you want to ensure to get at least 50% of the requested parallel
server processes for a statement:
SQL> alter session set parallel_min_percent=50 ;
If there are insufficient parallel query servers available – in this example less
than 64 parallel servers for a simple SQL statement (or less than 128 slaves for a
more complex operation, involving producers and consumers) - you will see
ORA-12827 and the statement will not execute. You can capture this error in
your code and retry later.
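A minimal PL/SQL sketch of this capture-and-retry approach (the statement being executed, the table name sales_summary, and the DOP of 128 are placeholders for illustration; with parallel_min_percent=50 and a requested DOP of 128, at least 64 parallel servers must be available):

declare
  insufficient_px exception;
  pragma exception_init(insufficient_px, -12827);
begin
  execute immediate 'alter session set parallel_min_percent=50';
  execute immediate 'create table sales_summary parallel 128 as
                     select customer_id, sum(amount) amount
                     from sales group by customer_id';
exception
  when insufficient_px then
    null;  -- not enough parallel servers available; schedule a retry later
end;
/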
Depending on your expected workload pattern you might want to ensure that
Oracle's parallel execution capabilities are used optimally for your
environment. This implies two basic tasks: (a) controlling the usage of parallelism
and (b) ensuring that the system does not get overloaded with parallel
processing.
Oracle Database Resource Manager (DBRM) enables you to group users based
on characteristics, and restrict parallel execution for some users. DBRM is the
ultimate instance in determining the maximum degree of parallelism, and no
user in a resource group (using a specific resource plan) will ever be able to run
with a higher DOP than the resource group's maximum. For example, if your
resource plan limits a consumer group to a maximum DOP of 4, any statement
issued by a user in that group will run with a DOP of at most 4, regardless of
the DOP requested.
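As a sketch, a DBRM plan directive capping a consumer group's DOP could look as follows (the plan name DAYTIME_PLAN and consumer group ADHOC_USERS are hypothetical, and both are assumed to already exist):

begin
  dbms_resource_manager.create_pending_area();
  dbms_resource_manager.create_plan_directive(
    plan                     => 'DAYTIME_PLAN',
    group_or_subplan         => 'ADHOC_USERS',
    parallel_degree_limit_p1 => 4);
  dbms_resource_manager.validate_pending_area();
  dbms_resource_manager.submit_pending_area();
end;
/

While this plan is active, any statement issued by a session mapped to ADHOC_USERS runs with a DOP of at most 4.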
You should set a baseline for the performance you expect out of the Oracle
Database. The Oracle Database software will not achieve better performance
than the hardware configuration can achieve. Hence you should know what the
operating system can achieve before you introduce the Oracle software, and use
it as a baseline if later you think the performance is insufficient.
SQL parallel execution is typically very I/O intensive, so you want to measure
the maximum I/O performance you can achieve without the Oracle Database.
You can use ORION8 (ORacle I/O Number calibration tool, a free Oracle-
provided utility designed to simulate Oracle I/O workloads) or basic operation
system utilities (such as the Linux/Unix dd command) to measure the I/O
performance for your system. Make sure to calibrate the configuration in the
way Oracle will use it (how the data will be laid out across storage devices) and
use a calibration workload that resembles the type of workload the Oracle
Database will perform when running SQL statements in parallel (typically large
random I/Os).
Conservatively, any physical disk may be able to sustain 20-30 MB/s for large
random reads. Considering that you need about 200 MB/s to keep a single CPU
core busy (i.e. 8 - 10 physical disks), you should realize that you need a lot of
physical spindles to get good performance for database operations running in
parallel. Do not use a single 1 TB disk for your 800 GB database, because you
will not get good performance running operations in parallel against the
database; this might work well for your single-user home video archive, but not
for a database leveraging parallel query with multiple users.
The way to utilize multiple physical spindles with Oracle's shared everything
architecture is to stripe across multiple devices. For high availability you should
use a RAID configuration (storage-based RAID1 or RAID5 are commonly
used) to ensure you can survive the failure of a single disk. For many years
Oracle has recommended the Stripe And Mirror Everything (S.A.M.E.)
methodology using a stripe size of 1 MB. Such a configuration is most easily
achieved by using Oracle Automatic Storage Management (ASM), which
implements S.A.M.E. out of the box.
Memory allocation
Large parallel operations may use a lot of execution memory, and you should
take this into account when allocating memory to the database. You should also
bear in mind that the majority of operations that execute in parallel bypass the
buffer cache. A parallel operation will only use the buffer cache if the object has
been either explicitly created with the CACHE option or if the object size is
smaller than 2% of the buffer cache. If the object size is less than 2% of the
buffer cache then the cost of the checkpoint to start the direct read is deemed
more expensive than just reading the blocks into the cache.
shared_pool_size
Parallel servers communicate among themselves and with the Query
Coordinator by passing messages. The messages are passed via memory buffers
that are allocated from the shared pool. When a parallel server is started it will
allocate buffers in the shared pool so it can communicate; if there is not enough
free space in the shared pool to allocate the buffers, the parallel server will fail to
start. In order to size your shared pool appropriately, use the following
formula to calculate the additional overhead parallel servers will put on the
shared pool when doing inter-node parallel operations:
(((2 + (cpu_count X parallel_threads_per_cpu)) X 2) X
(cpu_count X parallel_threads_per_cpu)) X
parallel_execution_message_size X # concurrent queries
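As a worked example of this formula (the values are assumptions for illustration): with cpu_count = 8, parallel_threads_per_cpu = 2, the default parallel_execution_message_size of 2 KB (2048 bytes) and 10 concurrent queries, the overhead is ((2 + (8 x 2)) x 2) x (8 x 2) x 2048 x 10 = 36 x 16 x 2048 x 10 = 11,796,480 bytes, i.e. roughly 11 MB of additional shared pool memory.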
pga_aggregate_target
The pga_aggregate_target parameter controls the total amount of
execution memory that can be allocated by Oracle. Oracle attempts to keep the
amount of private memory below the target you specified by adapting the size of
the work areas. When you increase the value of this parameter, you indirectly
increase the memory allotted to work areas. Consequently, more memory-
intensive operations are able to run fully in memory and fewer will spill to
disk. For environments that run a lot of parallel operations you should
set pga_aggregate_target as large as possible. A good rule of thumb is to
have a minimum of 100MB X parallel_max_servers.
parallel_execution_message_size
As mentioned above, parallel servers communicate among themselves and
with the Query Coordinator by passing messages via memory buffers. If you
execute a lot of large operations in parallel, it is advisable to reduce the
messaging latency by increasing parallel_execution_message_size
(the size of these buffers). By default the message size is 2 KB. Ideally you
should increase it to 16 KB (16384). However, a larger value for
parallel_execution_message_size will increase the memory
requirement for the shared pool: if you increase it from 2 KB to 16 KB, the
parallel server memory requirement will be 8 times higher.
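For example (since this is a static parameter, the change only takes effect after an instance restart):

alter system set parallel_execution_message_size=16384 scope=spfile ;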
cpu_count
cpu_count is automatically derived by Oracle and is used to determine the
default number of parallel servers and the default degree of parallelism for an
object. Do not change the value of this parameter.
parallel_min_servers
This parameter determines the number of parallel servers that are started
during database startup. By default the value is 0. It is recommended that you set
parallel_min_servers to “average number of concurrent queries *
maximum degree of parallelism needed by a query”. This ensures that there are
ample parallel server processes available for the majority of the queries executed
on the system, and queries will not suffer the additional overhead of having to
spawn extra parallel servers. However, if extra parallel servers are required for
additional queries above your average workload, they can be spawned “on the
fly” up to the value of parallel_max_servers. Bear in mind that any
additional parallel server processes spawned above
parallel_min_servers will be killed after they have been inactive for a
certain amount of time and will have to be re-spawned if they are needed again
in the future.
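For example, for a system that on average runs 4 concurrent parallel queries with a maximum DOP of 16 (illustrative numbers), the rule of thumb gives 4 * 16 = 64:

alter system set parallel_min_servers=64 ;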
parallel_max_servers
This parameter determines the maximum number of parallel servers that may be
started for a database instance, should there be demand for them. The default
value on Oracle Database 10g and higher is 10 * cpu_count *
parallel_threads_per_cpu. A good rule of thumb is to ensure
parallel_max_servers is set to a number greater than the “maximum
number of concurrent queries * maximum degree of parallelism needed by a
query”. By doing this you will ensure every query gets the appropriate number
of parallel servers.
parallel_adaptive_multi_user
This parameter controls whether or not Oracle proactively downgrades
parallel operations to prevent an overloading of the system.
Depending on the workload and the user expectations you should set this
parameter to true or false. Realize that if you set the parameter to true, then
parallel operations may be downgraded aggressively, which can significantly
impact the execution time. For predictable response times on a busy server it is
better to set this parameter to false.
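For example, to favor predictable response times:

alter system set parallel_adaptive_multi_user=false ;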
db_file_multiblock_read_count
SQL parallel execution is generally used for queries that will access a lot of
data, for example when doing a full table scan. Since parallel execution will
bypass the buffer cache and access data directly from disk, you want each I/O to be
as efficient as possible, and using large I/Os is a way to reduce latency. Set
db_file_multiblock_read_count such that when it is multiplied by the
block size you end up with 1 MB. E.g. for 8K block size, use
db_file_multiblock_read_count=128.
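For example, for an 8 KB block size:

alter system set db_file_multiblock_read_count=128 ;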
disk_asynch_io
For optimum performance make sure you use asynchronous I/O; it is the
default on the majority of platforms.
In general you should avoid using hints to enable parallel execution. Hints are
hard to maintain and may not give the right behavior over time when objects and
business requirements change.
Initially tables SALES and CUSTOMERS are not partitioned. The following
shows a portion of the execution plan.
explain plan for select c.state_province
, sum(s.amount) revenue
from customers c, sales s
where s.customer_id = c.id
and s.purchase_date
between to_date('01-NOV-2007','DD-MON-YYYY')
and to_date('31-DEC-2007','DD-MON-YYYY')
and c.country = 'United States of America'
group by c.state_province;
10 Note that some columns in the execution plan have been removed to improve the
readability of this example.
Using all the information discussed in the concept section of this paper, you will
be able to identify the following parallel processing steps:
– The CUSTOMERS table is read in parallel (ID 11) and is then broadcast
to all parallel servers (ID 9) that read the SALES table.
– After the join, the data is redistributed using a HASH redistribution (ID 5)
on the group by column.
– Hash join and hash group by take place in parallel without a need for
redistribution (ID 6 and ID 7). Every parallel server process performs the
incremental aggregation of its disjoint data set.
– Results are returned to the query coordinator in random order (ID 2), since
no order by was specified in the SQL statement; whenever a parallel server
finishes the computation of its incremental result it is returned to the QC.
Large databases and particularly data warehouses – the types of databases that
mostly use parallel execution – should always use Oracle Partitioning.
Partitioning can provide great performance improvements because of partition
elimination (pruning) capabilities, but also because parallel execution plans can
take advantage of partitioning.
Let's recreate the tables SALES and CUSTOMERS as follows:
– HASH partitioning on the ID column for the CUSTOMERS table using 128
partitions.
– HASH partitioning on the CUSTOMER_ID column for the SALES table
using 128 partitions.
– Tables SALES and CUSTOMERS are now equi-partitioned on the join
column.
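A sketch of the corresponding DDL (the column lists are abbreviated to the columns used by the example query; the real tables would contain more columns):

create table customers (
  id             number,
  country        varchar2(60),
  state_province varchar2(60)
)
partition by hash (id) partitions 128 ;

create table sales (
  customer_id   number,
  purchase_date date,
  amount        number
)
partition by hash (customer_id) partitions 128 ;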
Figure 17: customer purchase information, parallel plan , hash partitioning with partition-wise joins
Figure 17 shows the execution plan for the same query using the now hash
partitioned tables. Unlike in previous examples, you do not see the granules for
table SALES and CUSTOMERS right away in the plan. The simple reason for
this is that we are now using partition-based granules, so Oracle does not
have to carve the data into granules for parallel access at runtime; the database
simply has to iterate over the existing partitions.
Furthermore, we are joining two equi-partitioned tables leveraging a partition-
wise join. The partition-based granules are not only identical for both tables, but
the iteration (processing) of granules is now a processing of pairs of partitions
that includes the join as well; one parallel server process is working on one
equivalent partition pair at a given point in time. Consequently, the partition-
based granule iterator is ABOVE the hash join operation in the execution plan.
Besides the known processing steps of parallel execution this new behavior of a
partition-wise join is seen in the execution plan in Figure 17.
– Tables SALES and CUSTOMERS are accessed in parallel, iterating over the
existing equi-partitioned hash partition-based granules (ID 7). You can read
this operation as “loop over all hash partitions and process the operations
below”. A set of parallel servers is working on n partitions at a time (n
equals the DOP), from partition 1 to 128 (identified through columns
'Pstart' and 'Pstop').
– For each HASH partition pair, a parallel server process joins the table
CUSTOMERS and SALES.
No data redistribution is taking place to join tables SALES and CUSTOMERS.
In the case of inter-node parallel query, there would be no data transfer
necessary between the compute nodes, and the Oracle database – although built
on the shared everything paradigm - would behave like a shared nothing system
for this operation.
8 - access("S"."CUSTOMER_ID"="C"."ID")
9 - filter("C"."COUNTRY"='United States of America')
10 - filter("TIME_ID">=TO_DATE(' 2007-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
AND "TIME_ID"<=TO_DATE(' 2007-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
A full partition-wise join only requires the partitioning strategy on the join
column(s) to be identical. If we change the SALES table to become a composite
RANGE-HASH partitioned table, using PURCHASE_DATE for range partitioning
(7 years' worth of data, partitioned by month) and CUSTOMER_ID for hash sub-
partitioning with 128 sub-partitions, we still adhere to the condition for a full
partition-wise join, and the plan changes only slightly, as shown in
Figure 18.
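A composite RANGE-HASH definition of SALES as described above could be sketched as follows (only two of the 84 monthly partitions are written out; the column lists and partition names are illustrative assumptions):

```sql
-- SALES recreated as composite RANGE-HASH: range partitioned by month
-- on PURCHASE_DATE, hash sub-partitioned on CUSTOMER_ID.
-- The hash sub-partitioning on CUSTOMER_ID (128 sub-partitions) still
-- matches the hash partitioning of CUSTOMERS on ID, so the full
-- partition-wise join remains possible.
CREATE TABLE sales (
  customer_id    NUMBER NOT NULL,
  purchase_date  DATE
  -- ... further columns ...
)
PARTITION BY RANGE (purchase_date)
SUBPARTITION BY HASH (customer_id) SUBPARTITIONS 128
( PARTITION sales_2007_11 VALUES LESS THAN (DATE '2007-12-01'),
  PARTITION sales_2007_12 VALUES LESS THAN (DATE '2008-01-01')
  -- ... one partition per month, 84 in total ...
);
```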
However, the query against the new partitioned tables returns even faster than
before. Besides the benefits of the parallel full partition-wise join, a big
performance improvement is achieved through partition elimination: the Oracle
database analyzes all existing predicates in the query to see whether some
partitions can be ruled out from the processing completely. In our case, the
composite range-hash partitioned table SALES has 84 x 128 = 10,752
subpartitions in total. Analyzing the filter predicate on purchase_date leads
to a reduction down to two range partitions (#72 and #73, shown in Pstart/Pstop
of ID 10); we only have to access 256 out of 10,752 subpartitions, providing
approximately a 40x performance improvement.
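The pruning arithmetic above can be verified with a quick back-of-the-envelope check (partition counts taken from the text):

```python
# Composite RANGE-HASH partitioned SALES table from the text:
range_partitions = 84      # 7 years x 12 months
hash_subpartitions = 128   # hash sub-partitions per range partition

total = range_partitions * hash_subpartitions
print(total)               # 10752 sub-partitions in total

# The filter predicate prunes the table down to two range partitions,
# so only their sub-partitions have to be accessed:
accessed = 2 * hash_subpartitions
print(accessed)            # 256

# Reduction in data accessed through partition elimination alone:
print(total / accessed)    # 42.0, i.e. roughly a 40x improvement
```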
Partition-wise joins can also be leveraged when joining REF-partitioned tables,
or as so-called partial partition-wise joins, where a small table is joined with a
significantly larger table and the database enforces a data redistribution to match
the partitioning strategy of the larger table. For the sake of focusing on parallel
execution only, we will not further discuss partition-wise joins for REF-
partitioned tables, nor do we discuss partial partition-wise joins.
Wait events
Almost all SQL statements executing in parallel read data directly from disk
rather than out of memory. As a result, parallel statements can be very I/O
intensive. Oracle Enterprise Manager Database Control 11g provides I/O
throughput information on the main performance page – on the “I/O tab” – as
well as on the detailed I/O pages.
Figure 20: Detailed I/O page in OEM 11g Database Console for a parallel
DML workload.
With Oracle Database 11.1.0.6 you can only use the textual output from the
GV$SQL_MONITOR view. Starting with the Oracle Enterprise Manager database
console in 11.1.0.7 there is a graphical interface to GV$SQL_MONITOR. Oracle
Enterprise Manager Grid Control 11g will also provide the graphical interface.
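As a sketch of the textual route, the monitoring data can be queried directly and rendered as a text report with the DBMS_SQLTUNE package; the column selection below is illustrative, and :sql_id is a placeholder for the statement of interest:

```sql
-- List currently monitored statements and their parallel server usage:
SELECT sql_id, status, username,
       px_servers_requested, px_servers_allocated
FROM   gv$sql_monitor;

-- Render a textual monitoring report for one statement:
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
         sql_id => :sql_id,
         type   => 'TEXT') AS report
FROM   dual;
```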
The examples and screenshots in this section show Oracle Enterprise Manager
11.1.0.7 database console on a single instance 2 CPU database server12.
The SQL Monitoring screen shows the execution plan of a long-running
statement or a statement that is running in parallel. In near real-time (the default
refresh cycle is 5 seconds) you can monitor which step in the execution plan is
being worked on and whether there are any waits (see Figure 22). For a parallel
statement, the screen also shows the parallel server sets. The SQL Monitor
output is extremely valuable for identifying which parts of an execution plan are
expensive throughout the total execution of a SQL statement.
The SQL Monitoring screens also provide information about the parallel server
sets and work distribution between individual parallel servers on the “Parallel”
tab (see Figure 23).
11 Oracle Database Enterprise Manager Tuning Pack must be licensed in order to access
(G)V$SQL_MONITOR.
12 As of publication Oracle Database 11.1.0.7 is not yet available. The example shows
screenshots of an early version of database console on a development version.
Ideally you will see an equal distribution of work across the parallel servers. If
there is a skew in the distribution of work between parallel servers in one
parallel server set, then you have not achieved optimal performance: the
statement will have to wait for the parallel server performing the most work to
complete.
The third tab in the SQL Monitoring interface shows the activity for the
statement over time in near real-time (see Figure 24). Use this information to
identify, at the statement level, which resources are used most intensively.
If you always use hints, and nothing but hints, to enable SQL parallel execution
on Oracle Database 9i, then there is little to worry about when upgrading. You
should verify that every operation with parallel hints actually runs in parallel on
Oracle Database 9i; if it does, it will do so on Oracle Database 10g and beyond
as well.
If on Oracle Database 9i you used session settings to enable SQL parallel execution
If you always used only session settings to enable parallel execution, then you
should look at the operations that are executed in the sessions that enable or
force parallel execution. Expect more operations to execute in parallel after an
upgrade to Oracle Database 10g or beyond. If only parallel operations ran in
your parallel-enabled sessions on Oracle Database 9i, then you should expect
minimal changes, if any, after an upgrade.
If you set the parallel properties at the table or index level in order to enable
parallel execution, then you will face the highest likelihood to experience
changes. Expect some operations that access parallel enabled objects which
would not execute in parallel on Oracle Database 9i to run in parallel after an
upgrade.
Carefully review the parallel settings at the table level, and reset the parallel
setting on small database objects (those with no more than a few thousand
records and/or only a few database blocks in size) to NOPARALLEL. Operations
that complete in a few seconds or less when running serially benefit little from
executing in parallel. Rather, you want operations that take minutes or even
hours to complete serially to benefit from parallel execution.
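A review along these lines might be sketched as follows; the size thresholds are illustrative, and the object names in the ALTER statements are hypothetical:

```sql
-- Find parallel-enabled tables that are probably too small to benefit
-- (adjust the thresholds to your system; statistics must be current):
SELECT owner, table_name, degree, num_rows, blocks
FROM   dba_tables
WHERE  TRIM(degree) NOT IN ('1')          -- parallel-enabled objects
AND    (num_rows < 10000 OR blocks < 1000);

-- Reset such small objects to serial access:
ALTER TABLE sales_small NOPARALLEL;
ALTER INDEX sales_small_idx NOPARALLEL;
```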
parallel_max_servers
The default value in Oracle Database 9i was 10. For Oracle Database 10g and
higher, assuming you use automatic memory management for execution
memory (i.e. you set pga_aggregate_target or, starting with Oracle
Database 11g, memory_target), the default equates to 10 * cpu_count.
Generally 10 * cpu_count is a lot more than 10, which means that SQL
parallel execution may end up using far more system resources.
If your system was already heavily loaded on Oracle Database 9i with some
operations running in parallel, then review whether this much higher default
leaves enough resources for the rest of your workload.
parallel_adaptive_multi_user
In Oracle Database 9i, parallel_adaptive_multi_user was by default
derived from parallel_automatic_tuning and defaulted to false. In
Oracle Database 10g and beyond, parallel_adaptive_multi_user
defaults to true. As a result, the database will aggressively reduce the DOP for
parallel SQL operations when other statements are already using parallel
servers. If you did not explicitly change parallel_automatic_tuning or
parallel_adaptive_multi_user on Oracle Database 9i, then you should
explicitly set parallel_adaptive_multi_user to false when you upgrade
to Oracle Database 10g or beyond.
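For example, the Oracle Database 9i behavior can be preserved explicitly after the upgrade:

```sql
-- Preserve the pre-upgrade behavior: do not adaptively reduce the DOP
-- when other parallel statements are running.
ALTER SYSTEM SET parallel_adaptive_multi_user = FALSE;
```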
CONCLUSION
The objective of parallel execution is to reduce the total execution time of an
operation by using multiple resources concurrently. Resource availability is the
most important prerequisite for scalable parallel execution.
The Oracle Database provides a powerful SQL parallel execution engine that
can run almost any SQL-based operation – DDL, DML and queries – in the
Oracle Database in parallel. This paper explained how to enable SQL parallel
execution and provided some best practices to ensure its successful use.
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com