You must have seen sessions waiting on the event latch: cache buffers chains from time to
time. If you ever wondered what this means and how you can reduce time spent on it, read on.
Here you will learn how the buffer cache works, how Oracle multi-versioning works, how buffers
are allocated and deallocated, what a hash chain is and how buffers are linked to it, what the role
of the cache buffers chains latch is and why sessions wait on it, how to find the objects causing the
contention, and how to reduce the time spent on that event.
While exploring the reasons for the slowness in database sessions, you check the wait interface
and see the following output:
SQL> select state, event from v$session where sid = 123;
STATE   EVENT
------- ---------------------------
WAITING latch: cache buffers chains
This event is quite common, especially in applications that repeatedly scan a few blocks of
data. To resolve it, you should understand what the cache buffers chains latch is and why
sessions have to wait on it. To understand that, you must understand how the Oracle buffer cache
works. We will explore these one by one, and close with solutions for reducing cache
buffers chains latch waits.
This is the fifth in the series "100 Things You Probably Didn't Know About Oracle". If
you haven't already, I urge you to read the other parts, starting with
Part 1 (Commit does not force writing of buffers into the disk).
Consider an update statement that changes the NAME column to ROB for the record with
EMPNO = 1 in the table EMP. To execute it, the Oracle server process assigned to the session
performs the following actions:
1) locates the block that contains the record with EMPNO = 1
2) loads the block from the database file to an empty buffer in the buffer cache
3) if an empty buffer is not immediately found, keeps searching for one or forces the DBWn
process to write some dirty buffers to make room
4) updates the NAME column to ROB in the buffer
In step 1, we assume an index is present and hence the server process can locate the single block
immediately. If the index is not present, Oracle will need to load all the blocks of the table EMP
into the buffer cache and check for matching records one by one.
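The difference between the two lookup paths in step 1 can be sketched in a few lines of Python. This is a conceptual model only; the block layout and the dict-based "index" are illustrative assumptions, not Oracle structures:

```python
# Conceptual sketch: with an "index" (a dict mapping key -> block number),
# the server process locates the single block holding EMPNO = 1 directly;
# without it, every block of the table must be checked one by one.

blocks = {                      # block# -> rows stored in that block (made up)
    220: [{"EMPNO": 1, "NAME": "SAM"}, {"EMPNO": 2, "NAME": "KIM"}],
    221: [{"EMPNO": 3, "NAME": "LEE"}],
}
index = {1: 220, 2: 220, 3: 221}   # EMPNO -> block#

def find_with_index(empno):
    """One block visit: the index points straight at the block."""
    blk = index[empno]
    return blk, [r for r in blocks[blk] if r["EMPNO"] == empno][0]

def find_full_scan(empno):
    """Visits blocks one by one until the row is found."""
    visited = 0
    for blk, rows in blocks.items():
        visited += 1
        for r in rows:
            if r["EMPNO"] == empno:
                return visited, r
    return visited, None
```

With the index, EMPNO = 1 costs a single block visit; the full scan for EMPNO = 3 has to visit every block up to the match.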
The above description has two very important concepts:
1) a block, that is the smallest unit of storage in the database
2) a buffer, that is a placeholder in the buffer cache used to hold a block.
Buffers are just placeholders, which may or may not be occupied. They can hold exactly one
block at a time. Therefore for a typical database where the block size is set to 8KB, the buffers
are also of size 8KB. If you use multiple block sizes, e.g. 4KB or 16KB, you have to
define a separate buffer cache for each additional block size. In that case the buffer sizes
in each cache match the corresponding block size.
When buffers come to the cache, the server process must scan through them to get the value it
wants. In the example shown above, the server process must find the record where EMPNO=1.
To do so, it has to know the location of the blocks in the buffers. The process scans the buffers in
a sequence. So, buffers should ideally be placed in a sequence, e.g. 10 followed by 20, then 30,
etc. However, this creates a problem. What happens when, after this careful placement of buffers,
buffer #25 comes in? Since it falls between 20 and 30, it must be inserted in between, i.e. Oracle
must move all the buffers after 20 one step to the right to make room for the new buffer
#25. Moving memory areas around is not a good idea. It costs expensive CPU cycles,
requires all actions on the buffers (even reading) to stop for the duration, and is prone to errors.
Therefore, instead of moving the buffers around, a better approach is to put them in something
like a linked list. Fig 1 shows how that is done. Each of the buffers has two pointers: one to the
buffer ahead of it and one to the buffer behind it. In this figure, buffer 20 shows that 10 is ahead and 30 is
behind. This would be the case regardless of the actual position of the buffers. When 25
comes in, all we have to do is update the behind pointer of 20 and the ahead pointer of 30 to
point to 25. Similarly, the ahead pointer and behind pointer of 25 are updated to point to 20
and 30 respectively. This simple update is much quicker, does not require activity to stop on any
buffers except the ones being updated, and is less error-prone.
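The pointer updates described above can be sketched as a toy doubly linked list in Python. The ahead/behind names follow the article's figure, not Oracle's internal structures:

```python
# Minimal sketch: linking buffer 25 in between 20 and 30 by updating a few
# pointers, without moving anything in memory.

class Buf:
    def __init__(self, n):
        self.n, self.ahead, self.behind = n, None, None

def link_between(new, left, right):
    """Insert 'new' after 'left' and before 'right' with four pointer updates."""
    left.behind = new                  # 20's behind pointer now points to 25
    right.ahead = new                  # 30's ahead pointer now points to 25
    new.ahead, new.behind = left, right  # 25 points ahead to 20, behind to 30

# Build the initial chain 10 -> 20 -> 30
b10, b20, b30 = Buf(10), Buf(20), Buf(30)
b10.behind, b20.ahead = b20, b10
b20.behind, b30.ahead = b30, b20

b25 = Buf(25)
link_between(b25, b20, b30)   # only two existing buffers are touched
```

Walking the chain from buffer 10 via the behind pointers now yields 10, 20, 25, 30 without any buffer having physically moved.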
However, there is another problem. This is just one of the lists. Buffers are used for other
purposes as well. For instance, the LRU algorithm needs a list of buffers in LRU order, the
DBWn process needs a list of buffers for writing to the disk, etc. So, physically placing the
buffers in several lists at the same time is not just impractical, it's impossible. Oracle employs a
simpler technique to overcome the obstacle. Rather than placing the actual buffers in the linked
list, Oracle creates a much lighter structure called a buffer header, which acts as a pointer to an actual
buffer. This buffer header is what is linked and moved around, leaving the actual buffer in place. This way, a buffer
header can appear in many types of lists at the same time. These buffer headers are located in
the shared pool, not the buffer cache. This is why you will find references to buffers in the
shared pool.
Buffer Chains
The buffers are placed in strings. Compare that to rows of spots in a parking lot. Cars come into
an empty spot in a row. If they don't find one, they go to the next row, and so on. Similarly
buffers are located on the cache as rows. However, unlike the parking spots which are physically
located next to each other, the buffers are logically placed as a sequence in the form of a linked
list, described in the above section. Each linked list of buffers is known as a buffer chain, as
shown in Fig 2.
Notice how each of the three chains has a different number of buffers. This is quite normal.
Buffers are occupied only when some server process loads a block into them. Otherwise
the buffers are free and not linked to anything. When buffers are freed up, perhaps because
some process such as DBWn writes their contents to the disk, they are removed from the list, a
process known as unlinking from the chain. So, in a normal database, buffers will be constantly
linked to and unlinked from a chain, making the chain longer or shorter depending on the
frequency of either activity. The number of buffer chains is determined by the hidden database
parameter _db_block_hash_buckets, which is automatically calculated from the size of the
buffer cache.
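The linking and unlinking just described can be pictured with a toy hash-bucket model. The bucket count and the modulo hash here are illustrative stand-ins for _db_block_hash_buckets and Oracle's real hash function:

```python
# Sketch of buffer chains as hash buckets: a buffer is linked to a chain when
# its block is read in, and unlinked when the buffer is freed (e.g. after
# DBWn writes it to disk). Chain lengths vary naturally as a result.

N_CHAINS = 4
chains = {i: [] for i in range(N_CHAINS)}

def chain_of(dba):
    """Toy hash: map a data block address to a chain number."""
    return dba % N_CHAINS

def link(dba):
    chains[chain_of(dba)].append(dba)    # block read in: buffer joins a chain

def unlink(dba):
    chains[chain_of(dba)].remove(dba)    # buffer freed: removed from its chain

for dba in (100, 101, 102, 104, 108):
    link(dba)
unlink(104)   # e.g. DBWn wrote this buffer out
```

After these operations the chains have different lengths, mirroring the unequal chains of Fig 2.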
When a server process wants to access a specific buffer in the cache, it starts at the head of the
chain and goes on to inspect each buffer in sequence until it finds what it needs. This is called
walking the chain. You might be wondering about a nagging question here: when a buffer
comes to the cache, who decides which of the three chains it should be linked to, and how? A
corollary to that is the challenge the server process faces in trying to find a specific buffer in
the cache. How does the process know which chain to walk? If it always starts at chain 1, it
will take an extraordinary amount of time to locate the block. Typical buffer caches are huge, so
the number of chains may run into the tens of thousands, if not hundreds of thousands. Searching all the chains is
not practical. On the other hand, maintaining a memory table showing which blocks are located
in which buffers is not practical either, because maintaining that memory table
would be time-consuming and would make the process sequential. Several processes couldn't read chains in
parallel then.
Oracle solves the problem in a neat manner. Consider the parking lot example earlier. What if
you forget where you parked your car? Suppose after you come out of the mall, you find that all
the cars have been buried under a thick pile of snow making identification of any of the cars
impossible. So, you would have to start at the first car at the first row, dust off the snow from the
license plate, check for your car, move on to the next, and so on. Sounds like a lot of work,
doesn't it? So, to help forgetful drivers, the mall marks the rows with letter codes and asks the
drivers to park in the row matching the first letter of their last name. John Smith will need to park
in row S, and in row S only, even if row T or row R are completely empty. In that case, when
John returns to find his car and forgets where it is, he will know to definitely find it in row S.
That will be the domain of his search, much, much better than searching the entire parking lot.
Similarly, Oracle determines which specific chain a buffer should be linked to. Every block is
uniquely identified by a data block address (DBA). When the block comes to the buffer cache,
Oracle applies a hash function to determine the buffer chain number and places the block in a
buffer in that chain alone. Similarly, while looking up a specific buffer, Oracle applies the same
hash function to the DBA, instantly knows the chain on which the buffer will be found, and walks that
specific chain only. This makes accessing a buffer much easier compared to searching the entire
cache.
To find out the data block address, you need to first get the relative file# and block#. Here is an
example where I want to find out the blocks of the table named CBCTEST.
SQL> select
  2    col1,
  3    dbms_rowid.rowid_relative_fno(rowid) rfile#,
  4    dbms_rowid.rowid_block_number(rowid) block#
  5  from cbctest;

      COL1     RFILE#     BLOCK#
---------- ---------- ----------
         1          6        220
         2          6        220
         3          6        220
         4          6        221
         5          6        221
         6          6        221

6 rows selected.
From the output we see that there are 6 rows in this table and they are all located in two blocks in
a file with relative file# 6. The blocks are 220 and 221. Using this, we can get the data block
address. To get the DBA of the block 220:
SQL> select dbms_utility.make_data_block_address(6,220) from dual;

DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(6,220)
-------------------------------------------
                                   25166044
The output shows the DBA of that block is 25166044. If there are three chains, we could apply a
modulo function that returns the remainder after dividing the input by 3:

SQL> select mod(25166044,3) from dual;

MOD(25166044,3)
---------------
              1

So, we will put it in chain #1 (assuming there are three chains, numbered starting at 0).
The other block of that table, block# 221 will end up in chain #2:
SQL> select dbms_utility.make_data_block_address(6,221) from dual;

DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(6,221)
-------------------------------------------
                                   25166045

SQL> select mod(25166045,3) from dual;

MOD(25166045,3)
---------------
              2
And so on. Conversely, if we are given a DBA, we can apply the mod() function and the output
shows the chain on which it can be found. Oracle does not use the exact mod() function shown here,
but a more sophisticated hash function. The exact mechanics of the function are not important; the
concept is similar. Oracle can identify the exact chain a buffer needs to go to by applying a
hash function to the DBA of the buffer.
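You can reproduce the arithmetic above: a DBA packs the relative file# into the bits above a 22-bit block number, which is effectively what dbms_utility.make_data_block_address computes. Here is a short Python check; the mod-3 hash is this article's simplification, not Oracle's actual hash function:

```python
# Reproduces the article's numbers: the DBA places the relative file# in the
# bits above the 22-bit block number. The mod-3 "hash" is a simplification.

def make_dba(rfile, block):
    return (rfile << 22) | block

dba_220 = make_dba(6, 220)    # 25166044, matching dbms_utility's output
dba_221 = make_dba(6, 221)    # 25166045

print(dba_220 % 3, dba_221 % 3)   # chains 1 and 2, as computed above
```

Adjacent blocks in the same file get consecutive DBAs, so a simple hash tends to spread them across different chains.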
Multi-versioning of Buffers
Consider the update SQL statement shown in the beginning of the paper. When Oracle updates
the buffer that already exists in the buffer cache, it does not directly update it. Instead, it creates a
copy of the buffer and updates that copy. When a query selects data from the block as of a certain
SCN number, Oracle creates a copy of the buffer as of the point in time of interest and returns the
data from that copy. As you can see, there might be more than a single copy of the same block in
the buffer cache. While searching for a buffer the server process needs to search for the versions
of the buffer as well. This makes the buffer chain even longer.
To find out the specific buffers of a block, you can check the view V$BH (the buffer headers).
The column OBJD is the object_id. (Actually it's the DATA_OBJECT_ID; in this case both are
the same, but that may not be true in all cases.) The columns of interest to us are FILE#, BLOCK#,
CLASS# and STATUS. To make it simpler to understand, we will use a decode() on the class# field
to show the type of the block. With that, here is our query:
select file#, block#,
decode(class#,
1,'data block',
2,'sort block',
3,'save undo block',
4,'segment header',
5,'save undo header',
6,'free list',
7,'extent map',
8,'1st level bmb',
9,'2nd level bmb',
10,'3rd level bmb',
11,'bitmap block',
12,'bitmap index block',
13,'file header block',
14,'unused',
15,'system undo header',
16,'system undo block',
17,'undo header',
18,'undo block')
class_type,
status
from v$bh
where objd = 99360
order by 1,2,3
/
     FILE#     BLOCK# CLASS_TYPE        STATUS
---------- ---------- ----------------- ------
         6        219 segment header    cr
         6        221 segment header    xcur
         6        222 data block        xcur
         6        220 data block        xcur

4 rows selected.
There are 4 buffers. In this example we have not restarted the cache, so there are two buffers for
the segment header. There is one buffer for each data block, 220 and 221. The status "xcur"
stands for Exclusive Current. It means that the buffer was acquired (or filled by a block)
with the intention of being modified. If the intention is merely to select, the status would
show CR (Consistent Read). In this case, since the rows were inserted, modifying the
buffer, the blocks were acquired in xcur mode. Now, from a different session, update a single row. For
easier identification I have used Sess2> as the prompt:
Sess2> update cbctest set col2 = 'Y' where col1 = 1;
1 row updated.
STATUS
------
cr
xcur
xcur
cr
xcur

5 rows selected.
There are 5 buffers now, up one from the previous four. Note there are two buffers for block ID
220. One CR and one xcur. Why two?
It's because when the update statement was issued, it would have modified the block. Instead of
modifying the existing buffer, Oracle creates a "copy" of the buffer and modifies that. This copy
is now XCUR status because it was acquired for the purpose of being modified. The previous
buffer of this block, which used to be xcur, is converted to "CR". There can't be more than one
XCUR buffer for a specific block, that's why it is exclusive. If someone wants to find out the
most recently updated buffer, it will just have to look for the copy with the XCUR status. All
others are marked CR.
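The xcur/CR rule just described can be modelled in a few lines. This is a conceptual sketch; the dictionaries stand in for buffer headers and are not Oracle structures:

```python
# Sketch of multi-versioning: an update does not modify the current buffer in
# place. It clones the buffer, the clone becomes the single "xcur" copy, and
# the previous current copy is demoted to "cr". Status names follow V$BH.

versions = [{"block": 220, "status": "xcur", "data": "original"}]

def update_block(versions, new_data):
    cur = next(v for v in versions if v["status"] == "xcur")
    cur["status"] = "cr"                  # old current copy demoted to CR
    versions.append({"block": 220, "status": "xcur", "data": new_data})

update_block(versions, "col2 = 'Y'")

xcur = [v for v in versions if v["status"] == "xcur"]
```

However many versions accumulate, exactly one copy of the block stays xcur; everything else is a CR copy.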
Now, from a third session, update a different row in the same block:
Sess3> update cbctest set col2 = 'Y' where col1 = 2;
1 row updated.
STATUS
------
xcur
cr
xcur
xcur
cr
cr
cr
cr
cr
Whoa! There are 9 buffers now. Block 220 alone accounts for 6 of them. The statement was
merely an update of a single row; why did Oracle create so many new buffers?
Again, the answer is CR processing. CR processing creates copies of the buffer and rolls
them back or forward to build the CR copy as of the correct SCN. This created the
additional CR copies. From one block you now have 6 buffers, several of them created purely
for read consistency. This is how Oracle creates multiple versions of the same block in the
buffer cache.
Latches
Now that you know how many buffers can be created and how they are located on the chains in
the buffer cache, consider another problem. What happens when two sessions want to
access buffers in the cache at the same time? There are several possibilities:
1) they want the very same buffer
2) they want different buffers, but the buffers are on the same chain
3) they want different buffers located on different chains
Possibility #3 is not an issue; but #2 will be. We don't allow two processes to walk the same chain at
the same time. So there needs to be some sort of mechanism that prevents other processes from
performing an action while another process is doing it. This is enabled by a mechanism called a
latch. A latch is a memory structure that processes compete to acquire. Whoever gets it is said to
hold the latch; all others must wait until the latch is available. In many respects it sounds like a
lock. The purpose is the same, to provide exclusive access to a resource, but locks have
queues: several processes waiting for a lock will get it, when the lock is released, in the same
sequence in which they started waiting. Latches, on the other hand, are not sequential. Whenever a latch
becomes available, every interested process jumps into the fray to capture it. Again, only one gets it;
the others must wait. A process first loops up to 2000 times, actively checking for the
availability of the latch. This is called spinning. After that, the process sleeps for 1 ms and then
retries. If not successful, it sleeps for 1 ms, then 2 ms, 2 ms, 4 ms, 4 ms, etc. until the latch is obtained.
The process is said to be in a sleep state in between.
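The spin-then-sleep behaviour can be sketched like this. It is a simulation only: try_get stands in for the atomic test-and-set a real latch uses, and the sleeps are recorded rather than actually performed:

```python
import itertools

def acquire(try_get, spin_limit=2000):
    """Return the list of simulated sleep durations (ms) before the latch is got."""
    sleeps = []
    for _ in range(spin_limit):          # spin phase: actively re-check the latch
        if try_get():
            return sleeps
    for i in itertools.count():          # sleep phase: 1, 1, 2, 2, 4, 4, ... ms
        sleeps.append(2 ** (i // 2))     # record the escalating sleep, then retry
        if try_get():
            return sleeps
```

With a try_get that succeeds only on the 2005th attempt, the process spins 2000 times, then sleeps 1, 1, 2, 2 and 4 ms before finally holding the latch; with an uncontended latch it returns immediately with no sleeps.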
So, latches are the mechanism for making sure that no two processes access the same chain at the
same time. This latch is known as the cache buffers chains (CBC) latch. There is one parent CBC latch and several
child CBC latches. However, latches consume memory and CPU, so Oracle does not create as
many child latches as there are chains. Instead, a single child latch may protect two or more chains,
as shown in Fig 3. The number of child latches is determined by the hidden parameter
_db_block_hash_latches.
Latches are identified by latch# and, in the case of child latches, child#. A specific latch
instance is identified by its address in memory (the latch address). To find the latch that
protects a specific buffer, get the file# and block# as shown earlier and issue this SQL:
select hladdr
from x$bh
where dbarfil = 6
and dbablk = 220;
Going back to CBC latches, let's see how you can find the correlation between chains and
latches. First, find the latch# of the CBC latch. The latch# may change from version to version or
across platforms, so it's a good idea to check for it.

select latch# from v$latch
where name = 'cache buffers chains';

    LATCH#
----------
       203
This is the parent latch. To find the child latches (the ones that protect the chains), you should
look into another view, V$LATCH_CHILDREN. To find out how many child latches there are:

SQL> select count(1) cnt from v$latch_children where latch# = 203;

       CNT
----------
     16384
If you check the values of the two hidden parameters explained earlier, you will see:
_db_block_hash_buckets 524288
_db_block_hash_latches 16384
The parameter _db_block_hash_buckets decides how many buffer chains there are, and the
parameter _db_block_hash_latches decides the number of CBC latches. Notice the
value 16384: it matches the count we got from V$LATCH_CHILDREN, confirming that it is indeed the
number of CBC latches.
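A quick sanity check on those two values shows how many chains each child latch covers, assuming, as Fig 3 suggests, that the buckets are distributed evenly across the latches:

```python
# Values reported by the hidden parameters above.
buckets = 524288     # _db_block_hash_buckets: number of buffer chains
latches = 16384      # _db_block_hash_latches: number of child CBC latches

chains_per_latch = buckets // latches
print(chains_per_latch)   # each child CBC latch protects 32 chains
```

So contention on a single child latch can be caused by hot buffers on any of the 32 chains it protects, not just one.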
Let's now jump into resolving CBC latch issues. Sessions suffering from CBC latch
waits will show up in V$SESSION. Suppose one such session is SID 366. To find the CBC
latch, check the P1, P1RAW and P1TEXT values in V$SESSION, as shown below:

select p1, p1raw, p1text
from v$session where sid = 366;

        P1 P1RAW            P1TEXT
---------- ---------------- -------
5553027696 000000014AFC7A70 address
P1TEXT clearly shows the description of the P1 column, i.e. the address of the latch. In this case
the address is 000000014AFC7A70. We can check the name of the latch and examine how many
times this latch has been requested by sessions but has been missed.
SQL> select gets, misses, sleeps, name
  2  from v$latch where addr = '000000014AFC7A70';

 GETS MISSES SLEEPS NAME
----- ------ ------ --------------------
49081     14     10 cache buffers chains

From the output we confirm that this is a CBC latch. It has been acquired 49,081 times; 14 times
it was missed and 10 times processes have gone to sleep waiting for it.
Next, identify the object whose buffer is so popular. Get the File# and Block# from the buffer
cache where the CBC latch is the latch address we identified to be the problem:
select dbarfil, dbablk, tch
from x$bh
where hladdr = '000000014AFC7A70';

DBARFIL DBABLK   TCH
------- ------ -----
      6    220 34523
The TCH column shows the touch count, i.e. how many times the buffer has been accessed, a
measure of its popularity and hence of how likely it is to be subject to CBC latch waits.
From the file# and block# we can get the object ID. The easiest way is to dump the block and get
the object ID from the dump file. Here is how you dump the above-mentioned block:
alter system dump datafile 6 block min 220 block max 220;
Get the object ID (the value after objn in the trace file). Using that value you can get the object
name from DBA_OBJECTS. Now you know the table whose blocks are so popular that they are causing CBC latch waits.
You can rewrite the code to select the data from the table into a collection using bulk collect
and then select from that collection rather than from the table. The SQL_ID column of
V$SESSION will show you which SQLs are causing the CBC latch waits, and getting to the object
shows you which specific object in that query is causing the problem, allowing you to devise a
better solution.
You can also proactively look for objects contributing to the CBC latch wait in the Active
Session History, as shown below:
select p1raw, count(*)
from v$active_session_history
where sample_time > sysdate - 1/24
and event = 'latch: cache buffers chains'
group by p1raw
order by 2 desc;
The P1RAW value shows the latch address, using which you can easily find the file# and block#:
select o.name, bh.dbarfil, bh.dbablk, bh.tch
from x$bh bh, sys.obj$ o
where o.dataobj# = bh.obj
and bh.hladdr = '000000014AFC7A70';
With the approach shown earlier, you can now get the object information from the file# and
block#. Once you know the objects contributing to the CBC latch waits, you can reduce the waits
by reducing the number of times the latch is requested. That is something you can do by making
the blocks of the table less popular. The fewer the rows in a block, the less popular the
block will be. You can reduce the number of rows in a block by increasing PCTFREE or by using
ALTER TABLE MINIMIZE RECORDS_PER_BLOCK. If that does not help, you can partition the table.
That forces the data block addresses to be recomputed for each partition, making it more likely that
the buffers will end up in different buffer chains and hence reducing competition for the same
chain.
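The effect of spreading rows over more blocks can be illustrated with a small simulation. The numbers are illustrative only, and a simplistic modulo stands in for Oracle's hash:

```python
# Sketch of why fewer rows per block reduces CBC latch contention: the same
# row accesses, spread over more blocks, hash to more chains, so fewer
# accesses pile up on the hottest chain.

def hottest_chain_hits(n_rows, rows_per_block, n_chains):
    hits = [0] * n_chains
    for row in range(n_rows):
        block = row // rows_per_block          # which block holds this row
        hits[block % n_chains] += 1            # simplistic block -> chain hash
    return max(hits)

dense = hottest_chain_hits(1000, 100, 64)   # 100 rows packed per block
sparse = hottest_chain_hits(1000, 10, 64)   # e.g. after raising PCTFREE
```

With 100 rows per block, all 1,000 accesses concentrate on 10 blocks and the hottest chain takes 100 hits; at 10 rows per block the hottest chain takes only 20.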
Conclusion
In this blog you learned how Oracle manages the buffer cache and how latches are used to ensure
only one process can walk the chain to access a buffer. This latch is known as Cache Buffer
Chain latch. You learned why this latch is obtained by Oracle and how to reduce the possibility
that two processes will want the latch at the same time. I hope this helps you understand and
resolve Cache Buffers Chains latch related waits. Your feedback will be highly appreciated.
Imagine a situation where you have to create two different schemas in the same database, both with the same
name. A typical example is Peoplesoft applications, which require a specific schema name, SYSADM,
that can't be changed. If you want to install two Peoplesoft applications in the same database, you will soon
discover that it's not possible, since you can't have two schemas named SYSADM in the same database. So, what
are your choices?
Well, you could create two different databases. In fact, prior to Oracle Database 12c that was your only choice. But
with two different databases come two different sets of overheads: two Oracle instances (the memory areas such
as the SGA and the processes such as pmon and smon) which consume memory and CPU cycles on the host. The more
databases you have, the more the CPU and memory usage - all because you want to create multiple schemas with the
same name.
Not any more, with the multi-tenancy option in Oracle Database 12c. Instead of creating a physical database for each
SYSADM schema you want, you can create a virtual database for each schema. Each virtual database behaves
like an independent database, but runs on top of a real, physical database which may be hidden from the end
users. These virtual databases are called containers. The physical database that houses these containers is, in
effect, a database of containers, and is known as a Container Database (CDB). You can pull out (or "unplug") a
container from one CDB and place it (or "plug" it) into another CDB. This is why a container is also known as
a Pluggable Database (PDB). For all practical purposes, from the perspective of the clients, the PDBs are just
regular databases.
Please note a very important point: it is NOT necessary that the database be created as a CDB with PDBs inside it.
You can also create a database exactly as it was in the prior versions (a non-CDB). The multi-tenancy option of
Oracle Database 12c is required to create multiple PDBs. That is an extra cost option; but there is no cost to create exactly one
PDB inside a CDB. Later in this article you will see how to create a database as a PDB. To find out whether the database
has been created as a CDB or not, just check the column called CDB in the view V$DATABASE.
You can check the containers (or PDBs) created in a database in a view named V$PDBS, which is brand new in
Oracle Database 12c.
select con_id, dbid, name
from v$pdbs;
    CON_ID       DBID NAME
---------- ---------- ------------------------------
         2 4050437773 PDB$SEED
         3 3315520345 PDB1
         4 3874438771 PDB2
         5 3924689769 PDB3
Note how the DBIDs are also different for each PDB. There are two striking oddities in this output:
There is no CON_ID of 1. The answer is simple: there is a special container, the "root" container
known as CDB$Root, that is created to hold the metadata. This container has the CON_ID of 1.
There is a PDB called PDB$SEED, which is something we didn't create. You will get the explanation of this
PDB later in the article.
There are new built-in functions to identify PDBs from their details without querying the V$PDBS view. Here is an
example how to identify the container ID from the name:
SQL> select con_name_to_id('PDB2') from dual;

CON_NAME_TO_ID('PDB2')
----------------------
                     4
And, here is how you can get the container ID from the DBID:
SQL> select con_dbid_to_id(3924689769) from dual;

CON_DBID_TO_ID(3924689769)
--------------------------
                         5
Operating on Specific PDBs
The next big question you may have is considering the unusual nature of the PDBs (they are virtual inside a real
database) how you can operate on a specific PDB. There are several approaches. Let's examine them one by one.
Session Variable. You can set a session variable called container to the name of the PDB you want to
operate on. First connect to the CDB as usual, for instance as a SYSDBA user, and then switch to the PDB:

SQL> alter session set container = pdb1;

Now all commands in this session will be executed in the context of the PDB called PDB1. For instance, if you
want to shut down the PDB named PDB1, you would issue:

SQL> shutdown immediate
Service Name. When you create a PDB, Oracle automatically adds it as a service in the listener. You can
confirm it by looking at the listener status:
The service "pdb1" actually points to the PDB called PDB1. It's very important to note that this is not a service
name set in the initialization parameter of the database, as you can see from the service_names parameter of the
database.
PDB1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = prosrv1.proligence.com)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = PDB1)
)
)
Now you can connect to PDB1 using that connect string, e.g. sqlplus <username>@PDB1.
Using TWO_TASK. A third way is to define the TWO_TASK operating system variable to point to the PDB
you want to connect to, e.g. export TWO_TASK=PDB1, before invoking SQL*Plus.
DBCA Approach
1. Start DBCA.
5. In the next screen you will see a list of container databases (CDBs). Choose the one where you want the
PDB to be created. In this case we have only one CDB, called CONA.
6. In the next screen click on the option "Create a New Pluggable Database."
7. On the next screen you will have to answer some questions about the PDB you are about to create.
8. Enter the name PDB2, or something else you want to name it. Let's examine the options on the screen.
9. The PDB uses some storage exclusive to its own use, which is not part of the root container CDB$Root. You
will need to specify on the screen how you want that storage to be created. In this case I chose Oracle Managed Files
(OMF), which lets Oracle put the files in the proper location. I could have instead chosen to put these files in a
common location, which means I would have to remember to clean up the files later if I drop the PDB. The overall
storage occupied by the CDB is the sum of the root container CDB$Root, the seed PDB PDB$Seed and all the PDBs
contained in it.
10. I also checked the box called "Create a Default User Tablespace". Every PDB may contain its own USERS
tablespace that will be the default tablespace of its users if not explicitly specified. This is very useful if you want the
default tablespace to be different for each PDB. That way you can make sure that the users from one PDB do not
take over the space in a common tablespace.
11. You have to use a special user who can administer the PDB. In the above screen I used the name "syspdb2"
and entered its password.
12. After the PDB is created, you will see a confirmation message.
13. After the PDB creation, you may examine the alert log to see the various activities performed to create the
PDB.
Manual Approach
You don't have to fire up the DBCA interface. A simple SQL command does the trick. Connect to the CDB as
SYSDBA:
$ sqlplus sys/oracle as sysdba
SQL> create pluggable database pdb3
2 admin user syspdb3 identified by syspdb3
3 roles=(CONNECT,DBA)
4 /
Pluggable database created.
You will learn about the different clauses used here (admin user, roles, etc.) later. The PDB is created but not open
yet. You have to manually open it:
SQL> alter pluggable database pdb3 open;
Pluggable database altered.
Now that the database is open, you can connect to it using the different methods shown earlier. Note that you have to
be authenticated with SYSDBA role for this operation.
The column COMMON shows you if the user is common or not. From the output you know C##FINMASTER is
common while HRMASTER is not. You can also see that C##FINMASTER shows up in all containers while
HRMASTER shows up only in container 3, where it was originally created.
Although common users can be created in a CDB, there is little use of that in a real life application. Ordinarily you will
create local users in each PDB as required and that is what Oracle recommends.
Administration
So far you have learned how the PDBs are independent from each other, allowing you to create users with the
same names without proliferating the actual databases. The next important topic you are probably interested in
is how to manage this entire infrastructure. Since the PDBs are logically separate, it's quite conceivable that
separate DBAs are responsible for managing them. In that case, you want to make sure the privileges of these DBAs
fall within the context of the respective container and not outside of it.
Earlier you saw how to create a PDB. Here it is once again:
SQL> create pluggable database pdb3
2 admin user syspdb3 identified by syspdb3
3 roles=(CONNECT,DBA);
Note the clause "admin user syspdb3 identified by syspdb3". It means the PDB has a user called syspdb3, which is an
admin user. The next line, "roles=(CONNECT,DBA)", indicates that the user has the CONNECT and DBA roles. This
becomes the DBA user of the PDB. Let's verify that by connecting as that user and confirming that the roles have
been enabled:
[oracle@prosrv1 trace]$ sqlplus syspdb3/syspdb3@pdb3
...
SQL> select * from session_roles;
ROLE
------------------------------
CONNECT
DBA
PDB_DBA
... output truncated ...
Note that this is for this PDB alone; not in any other PDB. For instance, if you connect to PDB2, it will not work:
[oracle@prosrv1 ~]$ sqlplus syspdb3/syspdb3@pdb2
ERROR:
ORA-01017: invalid username/password; logon denied
Back in PDB3, since this user is a DBA, it can alter the parameters of the PDB as needed:
SQL> alter system set optimizer_index_cost_adj = 10;
System altered.
This is a very important point to consider here. The parameter changed here is applicable only to PDB3; not to any
other PDBs. Let's confirm that: In PDB2:
SQL> conn syspdb2/syspdb2@pdb2
Connected.
SQL> show parameter optimizer_index_cost_adj
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
optimizer_index_cost_adj             integer     100
However, in PDB3:
SQL> conn syspdb3/syspdb3@pdb3
Connected.
SQL> show parameter optimizer_index_cost_adj
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_index_cost_adj             integer     10
PDB2 has the old, unchanged value while PDB3 has the changed value. This is a very important property of PDBs:
you can change parameters in specific containers to suit the application. There is no need to force a common
parameter value on all the containers in the CDB. A classic example is the case of two containers, one for
production and the other for development. You may want to force a parameter value for performance reasons, either
as a permanent setting or just temporarily for some experiments. You can change the value in only one
container without affecting the others.
Note that not all the parameters can be modified in a PDB. A column ISPDB_MODIFIABLE in V$PARAMETER shows
whether the parameter can be modified in a PDB or not. Here is an example:
SQL> select name, ispdb_modifiable
2 from v$parameter
3 where name in (
4 'optimizer_index_cost_adj',
5 'audit_trail'
6* )
SQL> /
NAME                           ISPDB
------------------------------ -----
audit_trail                    FALSE
optimizer_index_cost_adj       TRUE
The audit_trail parameter applies to the entire CDB; you can't modify it for individual PDBs. That makes sense in many
ways: the audit trail belongs to the physical database, not a virtual one, so it is not modifiable for individual
PDBs. Similarly, parameters such as db_block_buffers, which apply to an Oracle instance, are non-modifiable as
well. A PDB doesn't have an instance, so such a parameter has no relevance in the PDB context and hence is
non-modifiable.
Additionally, you can also use any of the normal ALTER SYSTEM commands. A common example is identifying
errant sessions and killing them. First we identify the session from V$SESSION. However, since V$SESSION shows
background processes for the CDB as well, you need to trim the output down to the current PDB. To do that, get the
container ID and filter the output from V$SESSION using it.
SQL> show con_id
CON_ID
------------------------------
5
SQL> select username, sid, serial#
2 from v$session
3 where con_id = 5;
USERNAME                              SID    SERIAL#
------------------------------ ---------- ----------
SYSPDB3                                49      54303
C##FINMASTER                          280      13919
C##FINMASTER 280 13919
2 rows selected.
SQL> alter system kill session '280,13919';
System altered.
There is a special case for starting and shutting down PDBs. Remember, the PDBs themselves don't have an
instance (processes and memory areas) or controlfile and redo logs. These elements of an Oracle database
belong to the CDB, and shutting them down would shut down all the PDBs. Therefore there is no concept of an
instance shutdown for PDBs. When you shut down a PDB, all that happens is that the PDB is closed.
Similarly, the startup of a PDB merely opens it; the instance is already started, since that belongs to the CDB.
Let's see with an example.
[oracle@prosrv1 pluggable]$ sqlplus sys/oracle@pdb1 as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sat Mar 9 14:51:38 2013
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
SQL> shutdown
Pluggable Database closed.
Here is the corresponding entry from alert log:
2013-03-09 14:51:50.022000 -05:00
ALTER PLUGGABLE DATABASE CLOSE
ALTER SYSTEM: Flushing buffer cache inst=0 container=3 local
Pluggable database PDB1 closed
Completed: ALTER PLUGGABLE DATABASE CLOSE
Adding Services in PDBs
Remember, Oracle automatically creates service names matching the names of the PDBs. This lets you connect to the
PDBs directly from clients using the SERVICE_NAME clause in the TNS connect string. However, occasionally
you may want to add services to the PDBs themselves. To do so, use the SRVCTL command with a special
parameter "-pdb" to indicate the PDB the service should be created in:
[oracle@prosrv1 ~]$ srvctl add service -db CONA -s SERV1 -pdb PDB1
If you want to check on the service SERV1, use:
[oracle@prosrv1 ~]$ srvctl config service -db CONA -s SERV1
Service name: SERV1
Service is enabled
Cardinality: SINGLETON
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Global: false
Commit Outcome: false
Failover type:
Failover method:
TAF failover retries:
TAF failover delay:
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name: PDB1
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
1.
The first parameter is "shares", which determines how the CPU will be divided among the PDBs in case of a
crunch. So, if you have two PDBs - PDB1 and PDB2 - with shares of 1 and 2 respectively, it tells the DBRM that
PDB2 should get twice the amount of CPU consumed by PDB1. Note that DBRM kicks in only when there is
contention for CPU. If the total demand is less than 100%, everyone gets as much as they want; but if there is
contention, PDB2 is guaranteed 2/(1+2), i.e. 2/3rd of the available CPU. Since PDB1's share is only 1, it's
guaranteed 1/3rd of the CPU.
2.
The second parameter is "utilization_limit", which puts a ceiling on the CPU consumption of a specific PDB
even if there is spare CPU available in the CDB. This is specified as a percentage of the total CPU. This parameter
allows you to put a cap on the CPU consumption of a PDB for any reason.
3.
The third parameter is "parallel_server_limit", which limits the number of parallel query servers that can be
kicked off in the PDB. This is a percentage of the overall maximum parallel query servers in the CDB.
Let's see how to implement this with an example. Suppose we have three PDBs named PDB1, PDB2 and PDB3.
PDB1 hosts the most important applications. If there is a CPU contention, we want to give PDB1 50%, PDB2 25%
and PDB3 25% of the available CPU respectively. When there is plenty of CPU, we don't want to limit any CPU
consumption by PDB1, since it hosts critical apps; but we want to limit PDB2 and PDB3 so that they can't ever take
up more than 50% and 70% of the CPUs respectively. We also want to limit the parallel query servers to 50% and
70% of the value defined by parallel_max_servers.
To implement this structure, we will execute the following PL/SQL block:
begin
dbms_resource_manager.clear_pending_area();
dbms_resource_manager.create_pending_area();
-- create the CDB resource plan
dbms_resource_manager.create_cdb_plan(
plan => 'dayshift_cona_plan',
comment => 'cdb plan for cona'
);
-- give the limits in the plan for PDB1
dbms_resource_manager.create_cdb_plan_directive(
plan => 'dayshift_cona_plan',
pluggable_database => 'pdb1',
shares => 2,
utilization_limit => 100,
parallel_server_limit => 100
);
-- and, now the same for PDB2
dbms_resource_manager.create_cdb_plan_directive(
plan => 'dayshift_cona_plan',
pluggable_database => 'pdb2',
shares => 1,
utilization_limit => 50,
parallel_server_limit => 50
);
-- and now, PDB3
dbms_resource_manager.create_cdb_plan_directive(
plan => 'dayshift_cona_plan',
pluggable_database => 'pdb3',
shares => 1,
utilization_limit => 70,
parallel_server_limit => 70
);
dbms_resource_manager.validate_pending_area();
dbms_resource_manager.submit_pending_area();
end;
/
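The arithmetic behind the shares in the plan above can be sketched in a few lines. This is a hypothetical Python helper, not part of Oracle; it simply mirrors the share math described earlier (each PDB's guarantee under contention is its shares divided by the total shares):

```python
def cpu_guarantees(shares):
    """Return the fraction of CPU each PDB is guaranteed under contention.

    shares: dict mapping PDB name -> its 'shares' value in the CDB plan.
    """
    total = sum(shares.values())
    return {pdb: s / total for pdb, s in shares.items()}

# Shares from the dayshift_cona_plan above: PDB1=2, PDB2=1, PDB3=1
guarantees = cpu_guarantees({"pdb1": 2, "pdb2": 1, "pdb3": 1})
print(guarantees)  # pdb1 is guaranteed 50%, pdb2 and pdb3 25% each
```

Note that these guarantees apply only under contention; utilization_limit is a separate, unconditional cap.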
With the plan in place, you should now enable it for the CDB to start enforcing it, by setting the resource manager plan from the CDB root:
SQL> alter system set resource_manager_plan = 'dayshift_cona_plan';
In this example I used Oracle Managed Files (OMF), which allows the Oracle Database to determine the location of the
files. While it is a good practice, it's not absolutely necessary. If you use specific locations, and the locations differ on
the target PDB, you can use the file_name_convert clause to have the files copied to the desired location.
Cloning is primarily the way PDBs are created. But cloning needs a very important ingredient: the source PDB
that is cloned. In this example the source was PDB1. When you created your very first PDB, you didn't specify a source;
so where does Oracle get the files from?
But there is indeed a source. Do you remember seeing a PDB named PDB$SEED? Let's check the PDBs in our
CDB:
SQL> conn system/oracle
Connected.
SQL> select name
2 from v$pdbs;
NAME
------------------------------
PDB$SEED
PDB1
PDB2
PDB3
PDB4
5 rows selected.
Here you can see the newly created PDB, PDB4. The PDB named PDB$SEED is the "seed" container from which all
other containers are cloned. So when you create a fresh new PDB, with the syntax shown below:
create pluggable database pdb6
admin user syspdb6 identified by syspdb6;
it's actually cloned from PDB$SEED. This is a very important fact to remember. It means that if you want the database
to be created in a certain way, e.g. with a SYSTEM tablespace of a specific size, you can change that in the
seed PDB, and the new PDB will have the same value since Oracle simply copies the files from the seed to
the new PDB.
Transport
The idea of cloning is not limited to the same CDB, or even the same server. You can clone a PDB from a
different CDB or even a different host. You can also "move" a PDB from one CDB to another. For instance,
suppose you have an application running against a PDB on a host called serv1. To debug some issue in the app, the
developers want to point the test app at that database; but there is a little problem: the database is inside the
production firewall and the test app server can't connect to it. You are asked to create a copy of the database outside
the firewall. Normally, you would have resorted to backup and restore - a possible but time-consuming process, not to
mention the careful planning and additional work. But with PDBs, it's a breeze; you just "transport" the PDB to a
different CDB. Let's see an example where we transport a PDB called PDB4 to a different host.
Copy this file as well as the datafiles to the target server. You probably have a listing of the datafiles already;
if not, simply refer to the XML file - all the files are listed there. If the datafiles are on ASM, which they most likely
are, use the remote copy command of ASMCMD.
Here is the final installment in the series of blog posts I published over the last three days, about interesting
situations faced by John the DBA at Acme Bank. In the first post, you saw how John restored a
controlfile when autobackup was not enabled. In the second post you learned how John
discovered the DBID when someone forgot to record it. In this final installment you
will see what John does when the controlfile backup simply does not exist, or exists somewhere
but can't be found, rendering the previous tips useless.
This time, John had to recreate the controlfile from scratch. Let me reiterate, he had to recreate
the controlfile, using SQL; not restore it from somewhere. How did he do it? Following his own
"best practices", honed by years and years of managing Oracle databases, wise ol' John always
takes a backup of the controlfile to trace using this command:
alter database backup controlfile to trace as '/tmp/cont.sql' reuse;
This command produces a text file named cont.sql, which is invaluable in recreating the
controlfile. John runs the command as a cron job (on Unix; as a scheduled task on Windows) on his
database servers so that it gets executed every day, recreating the text file. The "reuse"
option at the end ensures the command overwrites the existing file, which means the text file
always contains fresh data from the database. Here is an excerpt from the beginning of
the generated file.
-- The following are current System-scope REDO Log Archival related
-- parameters and can be included in the database initialization file.
--
-- LOG_ARCHIVE_DEST=''
-- LOG_ARCHIVE_DUPLEX_DEST=''
--
-- ... output removed for brevity...
It is a very long file. John scrolls down to the section that shows the following information:
-------
Below are two sets of SQL statements, each of which creates a new
control file and uses it to open the database. The first set opens
the database with the NORESETLOGS option and should be used only if
the current versions of all online logs are available. The second
set opens the database with the RESETLOGS option and should be used
if online logs are unavailable.
--------
The appropriate set of statements can be copied from the trace into
a script file, edited as necessary, and executed when there is a
need to re-create the control file.
Set #1. NORESETLOGS case
The following commands will create a new control file and use it
As you can see, this file contains the complete syntax for creating the controlfile using the CREATE
CONTROLFILE command. More important, the command lists all the data files and
online redo logs of the database - invaluable information for creating the controlfile. John
creates a SQL script file called create_controlfile.sql where he puts the CREATE
CONTROLFILE SQL command. It's one long command spanning several lines. Here is how the file
looks (with lines removed in between for brevity). Remember, this is just one command; so
there is just one semicolon at the end:
CREATE CONTROLFILE REUSE DATABASE "PROQA3" NORESETLOGS ARCHIVELOG
MAXLOGFILES 32
... output removed for brevity ...
'+PROQA3DATA1/PROQA3/PROQA1_sysaux_03.dbf'
CHARACTER SET AL32UTF8
;
Then John extracts the following commands immediately following the CREATE
CONTROLFILE command from that above mentioned file and puts them on another file named
create_temp_tablespaces.sql:
-- Commands to add tempfiles to temporary tablespaces.
-- Online tempfiles have complete space information.
-- Other tempfiles may require adjustment.
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_01.dbf'
SIZE 31744M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_02.dbf'
SIZE 30720M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_03.dbf'
SIZE 30720M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_04.dbf'
SIZE 30720M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_05.dbf'
SIZE 30720M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_06.dbf'
SIZE 31744M REUSE AUTOEXTEND OFF;
ALTER TABLESPACE TEMP1 ADD TEMPFILE '+PROQA3DATA1/PROQA3/PROQA1_temp1_07.dbf'
SIZE 31744M REUSE AUTOEXTEND OFF;
-- End of tempfile additions.
With the preparations completed, John proceeds to the next steps. First, he starts up the instance with
the NOMOUNT option. He has to use NOMOUNT anyway, since the controlfile is missing:
startup nomount
This command brings up the instance only. Next, John creates the controlfile by executing the
file he created earlier, create_controlfile.sql. When the command succeeds, he gets the following
message:
Control file created.
Voila! The controlfile is now created from scratch, and the database is mounted
automatically. However, this newly created controlfile does not have any information
on backups, log sequence numbers, etc. It reads what it can from the datafile headers; but
the datafiles may have been checkpointed at points in the past. John has to bring them
forward as much as possible by performing recovery on the datafiles. From the SQL*Plus
prompt, he issues this statement:
SQL> recover database using backup controlfile;
ORA-00279: change 7822685456060 generated at 04/25/2014 17:11:38 needed for
thread 1
ORA-00289: suggestion : +PROQA3ARCH1
ORA-00280: change 7822685456060 for thread 1 is in sequence #3
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
It's important that John uses the "using backup controlfile" option. This controlfile is not the
current one, so the recovery process must know that. John carefully notes the SCN of the
archived log being asked for: 7,822,685,456,060. He has to provide an archived log that contains
changes with this SCN. To find it, he opens another SQL*Plus window, connects as
sysdba and gathers the archived log information:
col first_change# head "First SCN# in Archive" format 999,999,999,999,999
col name format a80
select first_change#, name
from v$archived_log
order by 1
/
Referring to this output, he sees that the latest archived log has the starting SCN# of
7,822,685,453,816, which is less than the SCN# being asked for. Therefore this archived log may
or may not contain the changes being asked by the recovery process. He decided to give that
archived log anyway. So he pastes the entire path of the archived log at the prompt:
+PROQA3ARCH1/PROQA3/archivelog/2014_04_25/thread_1_seq_2.330.845829419
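The comparison John performs by eye - find the latest archived log whose first SCN does not exceed the SCN being asked for - can be sketched as a small helper. This is hypothetical Python for illustration only; the SCNs are the ones from this story, and the log names are made up:

```python
def pick_archived_log(logs, needed_scn):
    """logs: list of (first_change#, name) tuples from v$archived_log.

    Returns the name of the latest log whose first SCN <= needed_scn,
    or None when no such log exists.
    """
    candidates = [(scn, name) for scn, name in logs if scn <= needed_scn]
    return max(candidates)[1] if candidates else None

logs = [
    (7822685440000, "thread_1_seq_1"),
    (7822685453816, "thread_1_seq_2"),  # latest archived log in the story
]
print(pick_archived_log(logs, 7822685456060))  # thread_1_seq_2
```

As in the story, the chosen log may still fall short of the needed SCN, because the changes may only exist in the online redo logs.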
Clearly, the archived log John supplied is not what the recovery process was looking
for. But that was the latest archived log; there is nothing after it. Remember, the changes could
also be in the online redo logs, which have not been archived yet. John has to make a
decision here. If the online redo logs are not available, he needs to end the recovery by
typing:
On the other hand, if the online redo logs are intact and available, he will need to just pass it to
the recovery process. He gathers the details on the online redo logs from the other SQL*Plus
window:
select sequence#, member
from v$log l, v$logfile f
where l.group# = f.group#;
From the first SQL*Plus window, John starts the recovery process again (the recovery process
ends when it does not get the file it expects) and this time he supplies the name of the online redo
log file:
SQL> recover database using backup controlfile;
Voila! The database does not need any other recovery. Since the online logfile contains the last
known change, Oracle knows that no further recovery is required and stops asking
for more changes. John has recovered all the changes made to the database; nothing was
lost. He proceeds to open the database:
alter database open resetlogs;
Resetlogs is necessary here because John used a controlfile that he created. Remember, this is a
complete recovery (nothing was lost), but the database must still be opened with resetlogs, which
restarts the log sequence at 1. From a different window, John opens the alert log of the database
and checks the output:
... previous output removed for brevity ...
The output shows that the online redo logs started at sequence #1. With the recovery now
complete, John creates the temporary tablespaces using the script he had created earlier,
create_temp_tablespaces.sql. Then he hands the database back to the users for normal processing.
Takeaways
What did you learn from this story and John? Here is a summary:
1. Always use a recovery catalog. This post assumes that you lost that catalog as well; but
now you see how difficult things are without it.
2. Always set the controlfile to autobackup. From the RMAN command prompt, issue
configure controlfile autobackup on. The default is OFF.
3. Always back up the RMAN logfile to tape or another location where it would be
available even if the main server with the database itself becomes inaccessible.
4. Always backup the controlfile to trace with a cron job that executes once a day and
updates the existing file.
5. If the controlfile backup is missing, check for the controlfile backup in the following
possible locations:
o snapshot controlfile
o backup taken in some location
6. Look for possible controlfile backups from RMAN log files.
7. If no backup of controlfile is available, create the controlfile from the trace you have
presumably created.
8. While recovering the database after creating the controlfile, always try giving the most
recent online redo logs as archived log names to achieve a complete recovery.
Thank you for reading. As always, I will appreciate your feedback. Tweet me at @ArupNanda,
or just post a comment here.
2. Set a tracefile identifier for easy identification of the trace file that will be generated.
SQL> alter session set tracefile_identifier = arup;
Session altered.
3. Dump the first few blocks of a datafile. A file of the SYSTEM tablespace works
perfectly; 10 blocks will do nicely.
SQL> alter system dump datafile
'+PROQA3DATA1/PROQA3/PROQA1_system_01.dbf' block min 1 block max 10;
System altered.
4. Check the trace file directory for a file with the term "ARUP" in it:
prolin1:/PROQA/orabase/diag/rdbms/PROQA3/PROQA31/trace>ls -l *ARUP*
-rw-r--r-- 1 oracle asmadmin 145611 Apr 24 21:17
PROQA31_ora_61079250_ARUP.trc
-rw-r--r-- 1 oracle asmadmin 146 Apr 24 21:17
PROQA31_ora_61079250_ARUP.trm
2014-04-24 21:17:16.957
SESSION ID:(937.3) 2014-04-24 21:17:16.957
CLIENT ID:() 2014-04-24 21:17:16.957
SERVICE NAME:() 2014-04-24 21:17:16.957
MODULE NAME:(sqlplus@prolin1 (TNS V1-V3)) 2014-04-24 21:17:16.957
ACTION NAME:() 2014-04-24 21:17:16.957
6. Note the DBID, prominently displayed in the trace file:
Db ID=2553024456
Without the controlfile, the recovery was stuck, even though all the valid pieces were there. It
was a rather alarming situation. Others would have panicked; but not John. As always, he
managed to resolve the situation by completing recovery. Interested to learn how? Read on.
Background
Since the controlfile was also damaged, the first task at hand was to restore it. To restore
the controlfile, John needs a very special piece of information: the DBID, the database identifier. This is not
something that is available until the database is at least mounted. In the unmounted state - which is where
the database is right now - John couldn't just go and get it from the database.
Fortunately, he follows a best practice: he records the DBID in a safe place.
These are the commands John used to restore the controlfile from the backup. They
assume the use of Data Domain Boost, the media management layer (MML) plugin for the Data
Domain backup appliance; but the approach applies to any MML - NetBackup, TSM, etc.
SQL> startup nomount;
RMAN> run {
2> allocate channel c1 type sbt_tape PARMS
'BLKSIZE=1048576,SBT_LIBRARY=/prodb/oradb/db1/lib/libddobk.so,ENV=(STORAGE_UNI
T=DDB01,BACKUP_HOST=prolin1.proligence.com,ORACLE_HOME=/prodb/oradb/db1)';
3> set dbid = 2553024456;
4> restore controlfile from autobackup;
5> release channel c1;
6> }
using target database control file instead of recovery catalog
allocated channel: c1
channel c1: SID=1045 device type=SBT_TAPE
channel c1: Data Domain Boost API
sent command to channel: c1
executing command: SET DBID
Starting restore at 22-APR-14
channel c1: looking for AUTOBACKUP on day: 20140422
channel c1: looking for AUTOBACKUP on day: 20140421
channel c1: looking for AUTOBACKUP on day: 20140420
channel c1: looking for AUTOBACKUP on day: 20140419
channel c1: looking for AUTOBACKUP on day: 20140418
channel c1: looking for AUTOBACKUP on day: 20140417
channel c1: looking for AUTOBACKUP on day: 20140416
channel c1: no AUTOBACKUP in 7 days found
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/22/2014 16:08:25
RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece
So, RMAN couldn't locate the backup of the controlfile. John knew that by default RMAN
searches only 7 days of backups. Thinking that perhaps the controlfile somehow was not backed
up in the last seven days, he expanded the search to 20 days using the special parameter
maxdays, shown below:
RMAN> run {
2> allocate channel c1 type sbt_tape PARMS
'BLKSIZE=1048576,SBT_LIBRARY=/prodb/oradb/db1/lib/libddobk.so,ENV=(STORAGE_UNI
T=DDB01,BACKUP_HOST=prolin1.proligence.com,ORACLE_HOME=/prodb/oradb/db1)';
3> send 'set username ddboostadmin password password servername
prolin1.proligence.com';
4> set dbid = 2553024456;
5> restore controlfile from autobackup maxdays 20;
6> release channel c1;
7> }
allocated channel: c1
channel c1: SID=1045 device type=SBT_TAPE
channel c1: Data Domain Boost API
sent command to channel: c1
executing command: SET DBID
Starting restore at 22-APR-14
channel c1: looking for AUTOBACKUP on day: 20140422
channel c1: looking for AUTOBACKUP on day: 20140421
channel c1: looking for AUTOBACKUP on day: 20140420
channel c1: looking for AUTOBACKUP on day: 20140419
channel c1: looking for AUTOBACKUP on day: 20140418
channel c1: looking for AUTOBACKUP on day: 20140417
channel c1: looking for AUTOBACKUP on day: 20140416
channel c1: looking for AUTOBACKUP on day: 20140415
channel c1: looking for AUTOBACKUP on day: 20140414
channel c1: looking for AUTOBACKUP on day: 20140413
channel c1: looking for AUTOBACKUP on day: 20140412
channel c1: looking for AUTOBACKUP on day: 20140411
channel c1: looking for AUTOBACKUP on day: 20140410
channel c1: looking for AUTOBACKUP on day: 20140409
channel c1: looking for AUTOBACKUP on day: 20140408
channel c1: looking for AUTOBACKUP on day: 20140407
channel c1: looking for AUTOBACKUP on day: 20140406
channel c1: looking for AUTOBACKUP on day: 20140405
channel c1: looking for AUTOBACKUP on day: 20140404
channel c1: looking for AUTOBACKUP on day: 20140403
channel c1: no AUTOBACKUP in 20 days found
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/22/2014 16:17:56
RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece
No luck; it gave the same error. So, John concluded, it was not an issue with the absence of a
controlfile backup; something else was making the backup of the controlfile invisible. He did,
however, know that the controlfiles are backed up along with the regular backups. Without the
database in mounted mode, he couldn't find out the location of those controlfile backups. If this
database had been registered in a catalog, he could have gotten that information from the catalog; but
unfortunately, being a new database, it was not yet registered. That avenue was closed.
He did, however, follow another best practice: saving the RMAN log files. As a rule, he sends the
RMAN output logs to tape along with the backup. He retrieved the most recent backup log
and checked it for the names of the backup pieces. Here is an excerpt from the log:
... output truncated ...
channel c8: starting piece 1 at 21-APR-14
channel c8: finished piece 1 at 21-APR-14
piece handle=14p69u7q_1_1 tag=TAG20140421T141608 comment=API Version 2.0,MMS
Version 1.1.1.0
channel c8: backup set complete, elapsed time: 00:00:01
channel c5: finished piece 1 at 21-APR-14
piece handle=10p69rhb_1_1 tag=TAG20140421T141608 comment=API Version 2.0,MMS
Version 1.1.1.0
channel c5: backup set complete, elapsed time: 00:47:33
channel c6: finished piece 1 at 21-APR-14
... output truncated ...
Looking at the output, John notes the names of the backup pieces created, listed next to "piece
handle": 14p69u7q_1_1, 10p69rhb_1_1, etc. He still did not know exactly which one contained
the controlfile backup; but it was not difficult to try them one by one. He tried to get the
controlfile from the first backup piece, using the following command with a special
clause: restore controlfile from a specific location.
RMAN> run {
2> allocate channel c1 type sbt_tape PARMS
'BLKSIZE=1048576,SBT_LIBRARY=/prodb/oradb/db1/lib/libddobk.so,ENV=(STORAGE_UNI
T=DDB01,BACKUP_HOST=prolin1.proligence.com,ORACLE_HOME=/prodb/oradb/db1)';
3> set dbid = 2553024456;
4> restore controlfile from '14p69u7q_1_1';
5> release channel c1;
6> }
allocated channel: c1
channel c1: SID=1045 device type=SBT_TAPE
channel c1: Data Domain Boost API
sent command to channel: c1
executing command: SET DBID
Starting restore at 22-APR-14
channel c1: restoring control file
... output file names removed for brevity ...
It worked; the controlfile was restored! If it hadn't worked, John would have tried the other
backup pieces one by one until he hit the one with the controlfile backup.
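The try-them-one-by-one approach can be sketched as a simple loop. This is hypothetical Python for illustration; try_restore stands in for running the RMAN restore command against a piece and reporting whether it succeeded:

```python
def find_controlfile_piece(pieces, try_restore):
    """Try restoring the controlfile from each backup piece in turn;
    return the first piece for which the restore succeeds, else None."""
    for piece in pieces:
        if try_restore(piece):
            return piece
    return None

# Piece handles noted from the backup log
pieces = ["14p69u7q_1_1", "10p69rhb_1_1"]
found = find_controlfile_piece(pieces, lambda p: p == "14p69u7q_1_1")
print(found)  # 14p69u7q_1_1
```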
[Update April 27th, 2013: This tip came from a reader, Kamil Stawiarski of Poland, who is an
Oracle Certified Master (http://education.oracle.com/education/otn/KStawiarski.html). Thank
you, Kamil.] If the specific location of the controlfile backup is not known and the backup is on
disk, John could have used a trick to locate it using the RMAN duplicate command:
c:\> rman auxiliary /
Next, John used the following command (he chose the auxiliary database name ORCL
completely at random; any name would have been fine, as long as there is no real instance with
that name):
RMAN> duplicate database to orcl backup location='c:\temp\oraback';
Remember, he has no intention of actually duplicating; all he wants is the location of
the controlfile backup. He gets that from the above output:
restore clone primary controlfile from
'C:\TEMP\oraback\05P6PAN2_1_1.RMAN';
Now he knows the location of the controlfile backup. He presses Control-C to stop the process. With the
location known, he uses the command shown earlier, restore controlfile from 'location', to restore
the controlfile.
[Update Apr 27th, 2013: This tip came from Anuj Mohan, another reader, from the US. Excellent
tip, Anuj, and thank you for sharing.] When RMAN starts, it creates a snapshot controlfile, whose
default location is $ORACLE_HOME/dbs. The snapshot controlfile is usually named with a .f
extension.
With the controlfile restored, John mounted the database.
RMAN> alter database mount;
database mounted
The rest was easy; all he had to do was issue "restore database" and "recover database using
backup controlfile". The first thing John did after the database was mounted was check the
controlfile autobackup setting:
RMAN> show CONTROLFILE AUTOBACKUP;
RMAN configuration parameters for database with db_unique_name prodb3 are:
CONFIGURE CONTROLFILE AUTOBACKUP OFF; #default
Someone suggested that he could have tried to restore the controlfile from the TAG instead of the
actual backup piece. Had he attempted the restore from the TAG, he would have got a different
error:
RMAN> run {
2> allocate channel c1 type sbt_tape PARMS
'BLKSIZE=1048576,SBT_LIBRARY=/prodb/oradb/db1/lib/libddobk.so,ENV=(STORAGE_UNI
T=DDB01,BACKUP_HOST=prolin1.proligence.com,ORACLE_HOME=/prodb/oradb/db1)';
3> send 'set username ddboostadmin password password servername
prolin1.proligence.com';
4> set dbid = 2553024456;
5> restore controlfile from tag=TAG20140421T141608;
6> release channel c1;
7> }
allocated channel: c1
channel c1: SID=1045 device type=SBT_TAPE
channel c1: Data Domain Boost API
sent command to channel: c1
executing command: SET DBID
Starting restore at 22-APR-14
released channel: c1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 04/22/2014 16:10:04
RMAN-06563: control file or SPFILE must be restored using FROM AUTOBACKUP
No Backup of Controlfile
Let's consider another scenario: there is no backup of the controlfile at all. Dreaded as it sounds, it's still
not the end of the world. John could create the controlfile from a special backup, which could have
been created by one of two commands:
SQL> alter database backup controlfile to '/tmp/cont.dbf';
Database altered.
The above command creates a copy of the controlfile with the data as of the time of the
command. The other command is:
SQL> alter database backup controlfile to trace as '/tmp/cont.sql';
This command creates a text file that you can use as a SQL statement (after some minor editing) to
create a controlfile. The major difference between the two approaches is that the first
produces a snapshot of the controlfile as of that time, along with all its data - the backups, the
archived logs, etc. The second approach creates a brand-new "blank" controlfile that is populated
when you bring the database up. John uses both options as a Plan B. In another post we will see how he
saved the day using these two special controlfile backups.
Takeaways
What did you learn from the story? Here are some key takeaways:
1. Always write down the DBID of all your databases somewhere. If you use a recovery
catalog, it's there; but it's good to note it down separately. This number does not change
unless you use NID utility; so recording once is enough.
2. Always configure controlfile autobackup. The default is OFF; make it ON.
3. Always save the backup log files. In a crunch, they yield valuable information otherwise
not available.
4. When a controlfile backup is not found, you can use the restore controlfile from 'location' syntax in
RMAN to pull the controlfile from that location. If that location does not have a
controlfile backup, don't worry; just try all available locations. One might contain what
you are looking for. You have nothing to lose and everything to gain.
5. Always use a script for this type of RMAN restore activity instead of typing at the prompt.
You will find changing data, e.g. various backup locations, easier and will make fewer mistakes
that way.
6. Always create a backup controlfile every day, even if you don't think you need it. You
may need it someday, and you will thank yourself when you do.
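Takeaways 2 and 6 are one-liners to put into practice; a sketch (the file paths are illustrative):

```sql
RMAN> configure controlfile autobackup on;

SQL> alter database backup controlfile to '/tmp/cont.dbf' reuse;
SQL> alter database backup controlfile to trace as '/tmp/cont.sql';
```

The reuse keyword lets the daily binary backup overwrite yesterday's copy at the same path.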
A System for Oracle Users and Privileges with Automatic Expiry Dates
Tired of tracking down all the users in the database to deactivate them when they
cease to exist, change roles, or fulfill their temporary need for the database? Or
tracking down privileges you granted to existing users at the end of their requested
period? The solution is to think out of the box - developing a system that allows you
to create a database user account with an expiration date. This fire-and-forget
method allows you to create users with the assurance that they will be expired
(locked or dropped) at the expiration date automatically, without your intervention.
Interested? Read on to see how I developed such a system--along with source code for you
to try.
Introduction
What is a database user? In my opinion, there are two kinds of users:
1. Permanent Residents - those who live in the database until there is no longer a
purpose for them. These are non-human users. Typical examples: admin
accounts (sys, system) and application schemas.
2. Human Users - these are accounts created for real human beings.
It's the second category that is subject to a lot of scrutiny from many sources: Payment Card Industry (PCI) mandates, the Health Insurance Portability and
Accountability Act (HIPAA), Sarbanes-Oxley (SOX), etc. All these mandates and
regulations have one thing in common - the need to identify and regulate the
human users. Common requirements in the mandates include: database accounts
should be removed when the users leave the organization, accounts should be revalidated
every so often (usually every 90 days), users should get only the privileges for which they
can justify a business need, and so on.
Concept
DBVisitor is a tool to create Oracle database user accounts with an expiration date.
A user in the Oracle database is permanent; there is no such thing as a temporary
user. Using the DBVisitor tool the DBA can create a visitor, which is a regular
database user but with a built-in expiration date (from as little as 5 minutes to as
much as needed) after which the user is either dropped or locked (the exact action
can be defined for each user specifically). This tool can also grant visitor privileges,
which are regular Oracle database privileges such as create table, select on
TableName, etc., with built-in expiration dates, after which the privilege is
automatically revoked. The expiration time can be extended for both the visitor and
the privilege. The tool keeps track of the creation, deletion, re-activation of the
users. The source code as well as all the scripts used in this tool can be downloaded
here.
Components
There are 7 major stored procedures in the tool. (Please note: I plan to have all
these in a single package in a later release.) Among them:
ADD_VISITOR
ADD_PRIVILEGE
EXPIRE_VISITORS
EXPIRE_VISITOR_PRIVS (to revoke the privileges at expiration, via a job)
SEND_REMINDER_EMAILS
UNLOCK_VISITOR
The actions are recorded in a table called DBVISITOR_EXPIRATION (for visitors) and in
DBVISITOR_PRIVS (for the privileges granted). Records in these tables are never deleted. When the
expiration date is extended, a new record is inserted and the old record updated, to
leave an audit trail which can be examined later.
How it Works
When a visitor is created by this tool, a record goes into the DBVISITOR_EXPIRATION
table with the expiry date. A job searches that table and, when it finds a visitor
whose expiration date is past, deactivates that visitor. The exact action of
deactivation could be DROP, i.e. the user is completely dropped; or LOCK, i.e. the user
is not dropped but its account is locked so it can't log in any more. The latter action
preserves any tables or other objects created by the user, but prevents the login.
The record is marked I (for Inactive). Active visitors are marked with A. The
same mechanism applies to privileges too, except that those records are located in
the table DBVISITOR_PRIVS.
When the expiration time is extended, DBVisitor creates a new record with the new
expiration date and status as A. The status of the old record is updated with the
flag X, for Extended. Similarly, when the account is unlocked, the status is shown
as U in the old record.
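The logic of that expiry job can be pictured with a sketch like the following; the actual EXPIRE_VISITORS procedure in the downloadable source may differ, and the column names are the ones described above:

```sql
-- Sketch only: deactivate visitors whose expiry date has passed
begin
  for v in (select dbuser, exp_process
              from dbvisitor_expiration
             where status = 'A'
               and expiry_dt < sysdate) loop
    if v.exp_process = 'DROP' then
      -- drop the user and all its objects
      execute immediate 'drop user ' || v.dbuser || ' cascade';
      update dbvisitor_expiration
         set status = 'I', dropped_dt = sysdate
       where dbuser = v.dbuser and status = 'A';
    else
      -- LOCK: keep the objects but prevent login
      execute immediate 'alter user ' || v.dbuser || ' account lock';
      update dbvisitor_expiration
         set status = 'I', locked_dt = sysdate
       where dbuser = v.dbuser and status = 'A';
    end if;
  end loop;
  commit;
end;
/
```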
Not all parameters to the stored procedures are mandatory. If not specified, they
are picked up from the defaults. Here is an example of adding a visitor:
SQL> @addv
Enter value for username: jsmith
Enter value for duration: 3
Enter value for dur_unit: hour
Enter value for role:
Enter value for password:
Enter value for expiration_process:
Enter value for email: john.smith@proligence.com
Enter value for comments:
There is a very important thing you should note here: we omitted entering some
fields, e.g. password, role, etc. These values are picked up from the default settings.
The default values are defined in the table DBVISITOR_PROPERTIES. At the end, an
email will go out to the visitor and you will see a small confirmation for the user
created:
* UserID         : JSMITH
* Email          : JOHN.SMITH@PROLIGENCE.COM
* Password       : changem3
* Expires in     : 3 HOUR
* Expiry Date    : 10/07/13 18:28:10
* Role           : VISITOR
* Expiry Process : DROP
And here is how you would grant a privilege, create table, to the visitor:
SQL> @addp
Enter value for usrname: jsmith
Enter value for privilege: create table
Enter value for duration: 2
Enter value for duration_unit: hours
* CREATE TABLE
* granted to JSMITH
* until 03/07/13 17:30:58
Note a very important point: we created the visitor for 3 hours but the privilege for
only 2 hours. This is allowed. If you need to add more privileges, just execute
addp.sql for each privilege. Do not give multiple privileges in the script.
Extension
When you need to extend the visit time or the privilege time, use extv.sql and
extp.sql respectively. You can extend the time only if the visitor or the privilege
being extended is active. Here is an example where you extend the visit time of
JSMITH by 2 more hours:
SQL> @extv
Enter value for username: jsmith
Enter value for extend_time: 2
Enter value for extend_dur: hours
Enter value for comments: to continue from earlier
*********************************************
*
* Expiration Date Change for JSMITH
* Old: 10/07/13 18:28:10
* New: 10/07/13 20:28:10
*
*********************************************
Updated.
Similarly, to extend the CREATE TABLE privilege to this user by 2 more hours, you
will need to execute the extp.sql script.
SQL> @extp
Enter value for username: jsmith
Enter value for priv_name: create table
Enter value for extend_time: 2
Enter value for extend_unit: hours
Enter value for comments:
*********************************************
*
* Expiration Date Change for JSMITH
* for CREATE TABLE
* Old 10/11/13 14:52:37
* New 10/11/13 16:52:37
*
*********************************************
Updated.
Reporting
To find out the visitors and their privileges, you can select from the tables
DBVISITOR_EXPIRATION and DBVISITOR_PRIVS. To make it easier, three scripts
have been provided:
selv.sql - shows the visitors you have created earlier, along with the
expiration dates. The expired visitors are also shown. The Status column shows
Active (A) or Inactive (I). If it shows X, then the visitor's time was extended.
Here is a sample report:
SQL> @selv
                  Expiry
DB User    Status Process Created on        Expires on        Locked on Dropped on        Changed on Misc
---------- ------ ------- ----------------- ----------------- --------- ----------------- ---------- ---------------------
JSMITH     A      DROP    09/30/13 09:50:30 09/30/13 12:50:30                                        Change Ticket 3456789
JOHN.SMITH@PROLIGENCE.COM
JOHNSMITH  I      DROP    09/29/13 21:59:24 09/29/13 23:59:24           09/29/13 23:59:48
ARUP@PROLIGENCE.COM
selp.sql - shows the privileges granted to the visitors, active or not.
Here is a sample report:
selxv.sql - shows the visitors who are expiring in the next <n> hours,
where <n> is something you supply.
Quick Reference
To add a visitor: addv.sql (default expiration: 2 hours)
To add a privilege: addp.sql (default expiration: 2 hours)
To deactivate a visitor now: update dbvisitor_expiration set expiry_dt = sysdate - 1/24/60 where dbuser = '<username>'; commit;
To revoke a privilege now, issue the above update against dbvisitor_privs.
Never delete records from these tables.
To extend the visit time: extv.sql
To extend the privilege time: extp.sql
To reduce the expiration (make it expire earlier): extv.sql and extp.sql, but use a negative number in the duration
To get reports on visitors: selv.sql
To get reports on privileges: selp.sql
To get the list of visitors expiring in the next <n> hours: selxv.sql
The column STATUS: A = Active, I = Inactive, X = the visitor or privilege was initially granted and then extended.
To unlock a visitor after it has been locked: unlock.sql
Important
You can extend the expiry only if the visitor or the privilege is active (status = A). If
the visitor has already expired, you can't extend it. You must re-add it (with the same
name).
Specification
Here are the descriptions of each of the procedures and tables.
Tables
DBVISITOR_EXPIRATION
This table holds the visitor information as part of the tool. A visitor is a
user in the database which has a built-in expiration date, after which the user is
either dropped or locked.
Column            Purpose
----------------  ----------------------------------------------------
DBUSER            The database user (visitor) name
STATUS            A (Active), I (Inactive), X (Extended), U (Unlocked)
CREATED_DT        When the visitor was created
EXPIRY_DT         When the visitor expires
EXP_PROCESS       The expiration action: LOCK or DROP
CHANGE_DT         When the record was last changed
REMINDER_SENT_DT  When the reminder email was sent
EMAIL             The Email ID
LOCKED_DT         When the account was locked
DROPPED_DT        When the account was dropped
COMMENTS          Free-format comments
DBVISITOR_PRIVS
This holds temporary privileges granted to the visitor users. The privileges have a
built-in expiration date after which they are revoked automatically.
Column     Purpose
---------  ----------------------------------------
DBUSER     The database user (visitor) name
STATUS     A (Active), I (Inactive), X (Extended)
EXPIRY_DT  When the privilege expires
GRANT_DT   When the privilege was granted
CHANGE_DT  When the record was last changed
REVOKE_DT  When the privilege was revoked
PRIV_NAME  The privilege granted
COMMENTS   Free-format comments
DBVISITOR_PROPERTIES
Part of the tool, this table stores the default values of various parameters used in the
DBVisitor tool.
Column  Purpose
------  --------------------------------
NAME    The name of the property
VALUE   The default value of the property
Procedures
ADD_VISITOR
Purpose : Adding a visitor user to the database which has a built-in expiration date
after which the database user account is either locked or dropped, based on
settings.
Usage: This accepts 8 parameters:
p_username = the username to be created. This is prefixed by a predefined
    prefix when the user is created. If omitted, the default is
    VISITOR<n>, where <n> is a unique number.
p_duration = the duration after which the user is expired
p_dur_unit = the unit in which the above parameter is mentioned. Valid values
    are DAY(S), HOUR(S) and MINUTE(S). Can't exceed 90 days.
p_role = the role granted to the visitor automatically
p_password = the password to be used for the user. This password is used only
    for the initial login; the user must change the password immediately.
p_exp_proc = how the account is to be expired, i.e. LOCK or DROP
p_email = the email ID of the user. For convenience you can specify SW
    for starwoodhotels.com; starwood, sw.com and star will work too. acn,
    acc and accenture will work for accenture.com.
p_comments = any free-format comments (up to 2000 chars) can be used.
All these parameters are optional. If omitted, the default values are picked up
from a table called DBVISITOR_PROPERTIES.
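Based on the parameter list above, a direct call (instead of the addv.sql wrapper) might look like this; the values are illustrative:

```sql
begin
  add_visitor(
    p_username => 'jsmith',
    p_duration => 3,
    p_dur_unit => 'HOURS',
    p_exp_proc => 'DROP',
    p_email    => 'john.smith@proligence.com'
  );
end;
/
```

The omitted parameters (p_role, p_password, p_comments) fall back to the defaults in DBVISITOR_PROPERTIES.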
ADD_PRIVILEGE
Purpose: Adding a database privilege (create session, select on tableName, etc.)
to a visitor user in the database, with a built-in expiration date after which the
privilege is revoked automatically.
Usage: The duration unit accepts the valid values DAY(S), HOUR(S) and MINUTE(S),
and can't exceed 90 days.
p_comments = any free-format comments (up to 2000 chars) can be used.
All these parameters, except user and privilege, are optional. If omitted,
the default values are picked up from a table called DBVISITOR_PROPERTIES.
There is no default for the p_comments parameter.
EXTEND_VISIT_TIME
Purpose: DBVisitor is a tool to create a user in the DB with a built-in expiration,
after which the database user account is either locked or dropped, based on
settings. This procedure is used to extend that expiration date.
Usage: The username passed in is the visitor whose time is to be extended.
p_extend_time = the duration by which the time is to be extended
p_extend_unit = the unit in which the above parameter is mentioned. Valid values
    are DAY(S), HOUR(S) and MINUTE(S). Can't exceed 90 days.
p_comments = any free-format comments (up to 2000 chars) can be used.
All these parameters are optional. If omitted, the default values are picked up
from a table called DBVISITOR_PROPERTIES.
EXTEND_PRIV_TIME
Purpose: To extend the expiration time for a database privilege (e.g. create
session) of a visitor user in the database, which has a built-in expiration date
after which the privilege is revoked automatically.
Usage: The duration unit accepts the valid values DAY(S), HOUR(S) and MINUTE(S),
and can't exceed 90 days.
p_comments = any free-format comments (up to 2000 chars) can be used.
All these parameters, except user and privilege, are optional. If omitted,
the default values are picked up from a table called DBVISITOR_PROPERTIES.
There is no default for the p_comments parameter.
UNLOCK_VISITOR
Purpose: This procedure is used to unlock an account that was locked earlier by the
tool after it expired. You can only unlock an account; it will not work if
the visitor was dropped. You can set the expiration time (from now) for the
newly unlocked account.
Usage: The new expiration time is counted from now.
p_extend_unit = the unit in which the above parameter is mentioned. Valid values
    are DAY(S), HOUR(S) and MINUTE(S). Can't exceed 90 days.
p_comments = any free-format comments (up to 2000 chars) can be used.
All these parameters are optional. If omitted, the default values are picked up
from a table called DBVISITOR_PROPERTIES.
SEND_REMINDER_EMAILS
Purpose: This stored procedure reads through dbvisitor_expiration and sends
reminder emails to the visitors whose accounts are expiring in the stated number of
days. The reminder is sent only once: the column reminder_sent_dt is populated, and
the reminder for that user is not sent again.
Usage:
EXPIRE_VISITOR_PRIVS
Purpose: This stored procedure reads through dbvisitor_privs and revokes the
privileges for which the expiration date is past.
Usage:
That line shows you when you last logged in successfully. The purpose of that little
output is to alert you about the last time you (the user ARUP) logged in, very similar
to the message you get after logging in to a Unix session. If you didn't log in at that
time, this message alerts you to a possible compromise of your account.
Suppression
What if you don't want to show this timestamp? Simply start SQL*Plus with the
-nologintime option:
$ sqlplus -nologintime arup/mypassword
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit
Production
With the Partitioning, OLAP, Advanced Analytics and Real Application
Testing options
The login time has been suppressed, going back to the old behavior.
Each option has its own pros and cons. Let's examine them and see if we can get the right fit for
our specific case.
Distributed Management
Under this model each component of Exadata is managed as an independent entity by a group
traditionally used to manage that type of infrastructure. For instance, the system admins would
manage the Linux OS, overseeing all aspects of it such as creation of users to applying the
patches and RPMs. The storage and database would be managed likewise by the specialist teams.
The benefit of this solution is its seeming simplicity - components are managed by their
respective specialists without a need for advanced training. The only need for training is for
storage, where the Exadata Storage Server commands are new and specific to Exadata.
While this approach seems like a no-brainer on the surface, it may not be so in reality. Exadata is not just
something patched up from these components; it is an engineered system. There is a huge
meaning behind that qualifier. These components are not designed to act alone; they are put
together to make the entire structure a better database machine. And, note the stress here - not an
application server, not a fileserver, not a mail server; not a general purpose server - but a
database machine alone. This means the individual components - the compute nodes, the storage
servers, the disks, the flashdisk cards and more - are tuned to achieve that overriding objective.
Any incremental tuning of a specific component has to be within the framework of the entire
frame; otherwise it may fail to produce the desired result or, worse, produce an undesirable one.
For instance, the disks where the database resides are attached to the storage cell servers, not the
database compute nodes. The cell servers, or Cells, run Oracle Enterprise Linux, which is very
similar to Red Hat Linux. Under this model of administration, the system admins
are responsible for managing the operating system. A system admin looks at the host and
determines that it is undertuned since the filesystem cache is very low. In a normal Linux
system, that would have been a correct observation; but in Exadata the database is in ASM, and a
filesystem cache is less important. On the other hand, the Cells need the memory to hold the
Storage Indexes on the disk contents. Configuring a large filesystem cache not only does nothing
to help the filesystem; it actually hurts performance by paging out the Storage Indexes.
This is just one example of how the engineered systems are closely interrelated. Assuming they
are separate and assigning multiple groups with different skillsets may not work effectively.
Database Machine Administrator
This leads to the other approach - making a single group responsible for the entire frame from
storage to the database. The single group would be able to understand the impact of the changes
in one component to the overall effectiveness of the rack and will be in a better position to plan
and manage. The single role that performs the management of Exadata is known as Database
Machine Administrator (DMA).
I can almost hear the questions firing off inside your brain. The most likely question probably is
whether it is even possible to have a single skillset that encompasses storage, system, database
and network.
Yes, it definitely is. Remember, the advantages of an engineered system do not stop at
carefully coordinated individual components. Another advantage is the reduced number of controls in those
components. There are fewer knobs to turn on each component of an Exadata system. Take for
instance the Operating System. There are two types of servers - the compute nodes and the cells.
In the cells, the activity performed by a system admin is severely limited - almost to the point of
being none. On the compute nodes, the activities are limited as well. The only allowable
activities are setting up users, setting up email relays, possibly setting up an NFS mount and
a handful more. This can easily be done by a non-expert. One does not have to be a System Admin
to manage the servers.
Consider storage, the other important component. Traditionally storage administrators perform
critical functions such as adding disks, carving out LUNs, managing replication for DR and so
on. These functions are irrelevant in Exadata. For instance, the disks are preallocated in Exadata,
the LUNs are created at installation time, and there is no storage replication since DR is by Data Guard,
which operates at the Oracle database level. One need not be a storage expert to perform the tasks in
Exadata. Additionally, Storage Admins are experts in a specific brand of storage, e.g. EMC
VMax or IBM XIV. In Exadata, the storage is different from all the other brands your storage
admins may be managing. They have to learn about the Exadata storage anyway; so why not
have someone else, specifically the DMA, learn it?
Consider the network. In Exadata the network components are very limited, since they serve only the
components inside the rack. This reduces the flexibility of the configuration compared to a
regular general-purpose network configuration. The special kind of hardware used in Exadata,
InfiniBand, requires some special skills which the network ops folks would have to learn anyway.
So, why not the DMAs instead of them? Besides, Oracle already provides a lot of tools to
manage this layer.
That leaves the most visible component - the database which is, after all, the heart and soul of
Exadata. This layer is amenable to a considerable degree of tuning and the depth of skills in this
layer is vital to managing Exadata effectively. Transferring the skills needed here to a non-DBA
group or individual is difficult, if not impossible. This makes the DBA group the most natural
choice for evolving into the DMA role after absorbing the relevant other skills. The other skills
are not necessarily at par with those of the administrators of the respective components. For instance, the
DMA does not need to be a full-scale Linux system admin; he or she just needs to know a few relevant
concepts, commands and tools to perform the job well. Network management in Exadata is a
fraction of the skills expected from a network admin. The storage management in cell servers is
new to any group; so the DMA will find it as easy as any other group would, if not easier.
By understanding the available knobs on all the constituent components of Exadata, the DMA
can be better prepared to be an effective administrator of the Exadata system; not by divvying up
the activities to individual groups which are generally autonomous. The advantages
are particularly seen when troubleshooting or patching Exadata. Hence, I submit here for your
consideration - a new role called DMA (Database Machine Administrator) for the management
of Exadata. The role should have the following skillsets:
60% Database Administration
20% Cell Administration
15% Linux Administration
5% Miscellaneous (Infiniband, network, etc.)
I have written an article series on Oracle Technology Network - Linux for Oracle DBAs. This 5-part
article series has all the commands and concepts the Oracle DBA should understand about
Linux. I have also written a 4-part article series - Commanding Exadata - for DBAs to learn the
20% cell administration. With these two, you will have everything you need to be a DMA.
Scroll down to the bottom of this page and click on "Collection of Some of My Very Popular
Web Articles" to locate all these articles and more.
Summary
In this blog entry, I argued for creating a single role to manage the Exadata system instead of
multiple groups managing individual parts. Here are the reasons in a nutshell:
1. Exadata is an engineered system where all the components play collaboratively instead of
as islands. Managing them separately may be ineffective and detrimental.
2. The support organizations for components such as systems, storage, DBA, etc. in an
organization are designed with a generic purpose in mind. Exadata is not generic. Its
management needs unprecedentedly close coordination among various groups, which may
be new to the organization and perhaps difficult to implement.
3. The needed skillsets are mostly database centric; other components have very little to
manage.
4. These other skills are easy to add to the DBA skills making the natural transition to the
DMA role.
Best of luck in becoming a DMA and implementing Exadata.
Primary Keys Guarantee Uniqueness? Think Again.
When you create a table with a primary key or a unique constraint, Oracle
automatically creates a unique index to ensure that the column does not contain
duplicate values; or so you have been told. It must be true, because
that is a fundamental tenet of an Oracle database, or for that matter, any
database.
Well, the other day I was checking a table. There is a primary key on the column
PriKey. Here are the rows:
I got two rows with the same value. The table does have a primary key on this
column and it is enforced. I can test it by inserting another record with the same
value - 1:
SQL> insert into TableName values (1, ...);
It errors with the ORA-00001: unique constraint violated error. The question is: why
are there two rows with duplicate values in a column that has an enforced primary
key, and why does it refuse to accept the very value that already violates the primary
key? It could be a great interview question to test your mettle; or just entertaining. Read on
for the answer.
Setup
Let's start with creating a table:
SQL> create table pktest1 (
  2    pk1   number,
  3    col2  varchar2(200)
  4  );
Table created.
Notice how I deliberately decided not to add a primary key now. Let's add
some records. Note that I inserted two records with the pk1 value of 1.
SQL> insert into pktest1 values (1,'One');
1 row created.
SQL> insert into pktest1 values (1,'Second One');
1 row created.
SQL> insert into pktest1 values (2,'Two');
1 row created.
SQL> commit;
Commit complete.
SQL> create index in_pktest1_01 on pktest1 (pk1);
Index created.
Note that I did not use a uniqueness clause, so the index is created as nonunique. I
can confirm that by checking the status and uniqueness of the index quickly:
SQL> select index_name, status, uniqueness
2 from user_indexes
3 where table_name = 'PKTEST1';
INDEX_NAME      STATUS   UNIQUENESS
--------------- -------- ----------
IN_PKTEST1_01   VALID    NONUNIQUE
SQL> alter table pktest1 add constraint in_pktest1_01 primary key (pk1);
alter table pktest1 add constraint in_pktest1_01 primary key (pk1)
*
ERROR at line 1:
ORA-02437: cannot validate (ARUP.IN_PKTEST1_01) - primary key violated
The constraint creation failed, as expected, since there are two rows with the same
value in the column. We would have to delete the offending row for this PK to be created. I
should have done that; but instead I used something like the following:
SQL> alter table pktest1 add constraint in_pktest1_01 primary key (pk1)
disable keep index;
Table altered.
The constraint was created, even with the duplicate values! So, how did Oracle
allow that? The statement succeeded because you created the constraint with a
disabled status. You can confirm that by checking the status of the constraint:
SQL> select constraint_name, status
2 from user_constraints
3 where table_name = 'PKTEST1';
CONSTRAINT_NAME   STATUS
----------------- --------
IN_PKTEST1_01     DISABLED
This is where you need to understand a very important attribute of key
enforcement in Oracle Database through an index. It's true that PK or UK
constraints are enforced through indexes, and the only reason a unique
index is created in that case is to enforce the uniqueness. But what if Oracle
already has an index on that column? In that case, Oracle decides to
repurpose that index for the primary key. That's what happened in this case.
But wait, the index we created was non-unique; how did Oracle use it to enforce the
PK? Well, the answer is simple: Oracle simply doesn't care. If an index exists, Oracle
just uses it, unique or not.
When you disable the constraint, the purpose of the index is also eliminated and it
is dropped. However, if you pre-created the index, it is not dropped. If the index was
created with the constraint definition, then the keep index clause preserves the
index. You can check that the index still exists even when the PK is gone.
SQL> select index_name, status
2 from user_indexes
3 where table_name = 'PKTEST1';
INDEX_NAME      STATUS
--------------- --------
IN_PKTEST1_01   VALID
Next, I enabled the constraint with the novalidate clause:
SQL> alter table pktest1 modify constraint in_pktest1_01 enable novalidate;
Table altered.
The data still shows duplicate rows. The trick is the novalidate clause, which
instructs Oracle to skip checking the existing data in the table. This is why the
duplicate values in the table were tolerated while enabling the constraint. The
constraint is now enabled:
SQL> select constraint_name, status
2 from user_constraints
3 where table_name = 'PKTEST1';
CONSTRAINT_NAME   STATUS
----------------- --------
IN_PKTEST1_01     ENABLED
However, only the existing rows are skipped; the future rows are subject to the
enforcement of the constraint, as shown below:
SQL> insert into pktest1 values (1,'Third One');
insert into pktest1 values (1,'Third One')
*
ERROR at line 1:
ORA-00001: unique constraint (ARUP.IN_PKTEST1_01) violated
This is yet another property of the Oracle database you should be aware of: the
NOVALIDATE clause in constraint enablement. This is why there were duplicate rows
in the table despite an enabled primary key constraint.
Conclusion
Let's summarize what we learned from this:
(1) Just because there is an enabled constraint on a table does not mean that all the
rows will conform to the constraint. For instance, a primary key on a column does
not mean that the column will not contain any duplicate values.
(2) The primary key constraint is normally enforced through a unique index. However, if
Oracle already finds an index on the column, unique or not, it uses it for the primary key.
(3) In that case, if the index was initially created as nonunique, it will continue to
show as nonunique, even though it is being used to enforce uniqueness.
(4) If you disable the primary key constraint, the index, if created by Oracle to
support the PK, is also dropped. To keep the index, use the keep index clause.
(5) When you enable a constraint with the novalidate clause the constraint does
not check for the existing data; so there could be non-conforming values in the
table.
By the way, what is the difference between primary key and unique constraints?
They both enforce uniqueness of values in a column; but unique constraints allow nulls
while primary keys don't. Both of them enforce the uniqueness through an index.
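The null-handling difference is easy to verify yourself; a quick sketch (table names are illustrative):

```sql
SQL> create table uq_demo (c1 number unique);
SQL> insert into uq_demo values (null);   -- succeeds; a second NULL would too
SQL> create table pk_demo (c1 number primary key);
SQL> insert into pk_demo values (null);   -- fails: ORA-01400 cannot insert NULL
```

A primary key implies NOT NULL on its columns, while a unique constraint does not; NULLs do not collide with each other in a unique index.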
challenge of indexing the entire World Wide Web in its servers so that it could present
search results very, very quickly. Both of these represent issues that
others probably hadn't faced earlier. Here are the relatively unique aspects of this
data, which are known as the "Three V's of Big Data":
Volume - the sheer mass of the data made it difficult, if not impossible, to
sort through them
Velocity - the data was highly transient. Website logs are relevant only for
their time period; for a different period the data was different
Variety - the data was not pre-defined and not quite structured - at least not
the way we think of structure when we think of relational databases
Both these companies realized they were not going to address these challenges using
traditional relational databases, at least not at the scale they wanted. So, they
developed tools and technologies to address these very concerns. They took a page
from the super-computing paradigm of divide and conquer. Instead of processing the
dataset as a whole, they divided it into smaller chunks to be processed by
hundreds, even thousands of small servers. This approach solved three basic,
crippling problems:
1. There was no need to use large servers, which typically cost a lot more than
small servers
2. There was built-in data redundancy, since the data was replicated between
these small servers
3. Most important of all, it could scale well - very well - simply by adding more
of those small servers
This is the fundamental concept that gave rise to Hadoop. But before we cover
that, we need to learn about another important concept.
Name=Value Pairs
A typical relational database works by logically arranging the data into rows and
columns. Here is an example. You decide on a table design to hold your customers,
named simply CUSTOMERS. It has the columns CUST_ID, NAME, ADDRESS, PHONE.
Later, your organization decides to provide some incentives to the spouses as well
and so you added another column - SPOUSE.
Everything was well until the time you discovered that customer 1 and the spouse were
divorced and there is a new spouse now. However, the company decides to keep the
names of the ex-spouses as well, for marketing analytics. Like a good relational
application designer, you decide to break SPOUSE away from the main table and create a
new table - SPOUSES, which is a child of CUSTOMERS, joined by CUST_ID. This
requires massive code and database changes; but you survive. Later you had the
same issue with addresses (people have different addresses - work, home, vacation,
etc.) and phone numbers (cell phone, home phone, work phone, assistant's phone,
etc.). So you decide to break them into different tables as well. Again, code and
database changes. But the changes did not stop there. You had to add various
tables to record hobbies, associates, weights, dates of birth - the list is endless.
Every thing you record requires a database change and a code change. But worse not all the tables will be populated for every customer. In your company's quest to
build a 360 degree view of the customer, you collect some information; but there is
no guarantee that all the data points will be gathered. you are left with sparse
tables. Now, suddenly, someone says there is yet another attribute required for the
customer - professional associations. So, off you go - build yet another table,
followed by change control, code changes to incorporate that.
If you look at the scenario above, you will find that the real issue is trying to force a
structure around a dataset that is inherently unstructured - akin to a square peg in a
round hole. The lack of structure of the data is what makes it agile and useful; but
the lack of structure is also what makes it a poor fit for a relational database, which
demands structure. This is the primary issue if you want to capture social media
data - Twitter feeds, Facebook updates, LinkedIn updates and Pinterest posts. It's
impossible to predict in advance, at least accurately, the exact information you will
expect to see in them. So, putting a structure around the data storage not only
makes life difficult for everyone - the DBAs will constantly need to alter the
structures and the developers/designers will constantly wait for the structure to be
in the form they want - but it also slows down the capture and analysis of the data.
So, what is the solution? If you think about it, think about how we human beings
process information. Do we parse information in the form of rows in some table?
Hardly. We process and store information by associations. For instance, let's say I
have a friend John. I probably have nuggets of information like this:
Last Name = Smith
Lives at = 13 Main St, Anytown, USA
Age = 40
Birth Day = June 7th
Wife = Jane
Child = Jill
Jill goes to school = Top Notch Academy
Jill is in Grade = 3
... and so on. Suppose I meet another person - Martha - who tells me that her child
also goes to Grade 3 in Top Notch Academy. My brain probably goes through a
sequence like this:
Search for "Top Notch Academy"
Found it. It's Jill
Search for Jill.
Found it. She is child of John
Who is John's wife?
Found it. It's Jane.
Where do John and Jill live? ...
And finally, after this processing is all over, I say to Martha as a part of the
conversation - "What a coincidence! Jill - the daughter of my friends John and Jane
Smith - goes there as well. Do you know them?" "Yes, I do," replies Martha. "In fact
they are in the same class, that of Mrs. Gillen-Heller."
Immediately my brain processed this new piece of information and filed the data as:
Jill's Teacher = Mrs. Gillen-Heller
Jane's Friend = Martha
Martha's Child Goes to = ...
Months later, I meet Jane and mention to her that I met Martha, whose child
went to Mrs. Gillen-Heller's class - the same one as Jill. "Glad you met Martha," Jane
says. "Oh, Jill is no longer in that class. Now she is in Mr. Fallmeister's class."
Aha! My brain probably stored that information as:
Jill's former teacher = Mrs. Gillen-Heller
This is called storing by a name=value pair. You see, I stored the information as a
pair: a property and its value. As information arrives, I keep adding more and more
pairs. When I need to retrieve information, I just look up the relevant property and,
by association, get all the data I need. Storing data as name=value pairs gives me
enormous flexibility to record all kinds of information without modifying
any data structures I may currently have.
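To make the idea concrete, here is a minimal sketch in Python; the dictionary and the helper function are mine, purely illustrative:

```python
# A name=value (key-value) store of facts about John, sketched as a
# plain dictionary: no schema is defined up front.
john = {
    "Last Name": "Smith",
    "Lives at": "13 Main St, Anytown, USA",
    "Age": 40,
    "Wife": "Jane",
    "Child": "Jill",
    "Jill goes to school": "Top Notch Academy",
    "Jill is in Grade": 3,
}

# New facts arrive (Jill changed classes): just add pairs - no ALTER
# TABLE, no code release, no sparse columns.
john["Jill's former teacher"] = "Mrs. Gillen-Heller"
john["Jill's teacher"] = "Mr. Fallmeister"

def properties_mentioning(facts, value):
    """Retrieve by association: which properties hold this value?"""
    return [name for name, val in facts.items() if val == value]

print(properties_mentioning(john, "Top Notch Academy"))
# prints ['Jill goes to school']
```

Notice that adding "Jill's former teacher" required no change to any structure - exactly the flexibility the relational CUSTOMERS example above lacked.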
This is also how Big Data is tamed for processing. Since the data coming off
Twitter, Facebook, LinkedIn, Pinterest, etc. is impossible to categorize in advance, it
is practically impossible to put it all in the relational format. Therefore, a
name=value pair type of storage is the logical step in compiling and collating the data.
The name is also known as the "key"; so the model is sometimes called key-value pair.
The value doesn't have to have a datatype. In fact, it's probably a BLOB, so anything
can go in there - booking amounts, birth dates, comments, XML documents, pictures,
audio and even movies. It provides immense flexibility in capturing information
that is inherently unstructured.
NoSQL Database
Now that you know about name=value pairs, the next logical question you may have
is - how do we store them? Our thoughts about databases are typically colored by
our long-standing association with relational databases, making them almost
synonymous with the concept of a database. But before relational databases existed,
even as a concept, big machines called mainframes ruled the earth. The databases
inside them were stored in hierarchical format; one such database from IBM was
IMS/DB. Later, when relational databases were up and
coming, another type of database concept - called a network database - was
developed to compete against them. An example of that category was IDMS (now
owned by Computer Associates), developed for mainframes. The point is, relational
databases were not the answer to all the questions then; and it is clear that they
are not now either.
This led to the development of a different type of database technology based on
the key-value model. Relational database systems are queried with the SQL language,
which I am sure is familiar to almost anyone reading this blog post. SQL is a
set-oriented language - it operates on sets of data. In the key-value pair model,
however, that does not work anymore. Therefore these key-value databases are
usually known as NoSQL, to separate them from their relational, SQL-based
counterparts. Since their introduction, some NoSQL databases have actually added
support for SQL, which is why "NoSQL" is not quite a correct term anymore; hence
they are sometimes referred to as "Not only SQL" databases. The point is that their
structure does not depend on the relational model. How exactly the data is stored is
usually left to the implementer. Some examples are MongoDB, Dynamo and Bigtable
(from Google).
I would stress here that almost any type of non-relational database can be classified
as NoSQL, not just the name-value pair models. For instance, ObjectStore, an object
database, is also NoSQL. But for this blog post, I am treating only key-value pair
databases as NoSQL.
Map/Reduce
Let's summarize what we have learned so far:
1. The key-value pair model in databases offers flexibility in data storage without
the need for a predefined table structure
2. The data can be distributed across many machines, where it is
independently processed and then collated.
When the system gets a large chunk of data, e.g. a Facebook feed, the first task is
to break it down into keys and their corresponding values. After that, the values
may be collated for a summary result. The process of dividing the raw data into
meaningful key-value pairs is known as "mapping". Combining the values afterwards
to form summaries, or just eliminating the noise from the data to extract meaningful
information, is known as "reducing". For instance, you may see both "Name" and
"Customer Name" among the keys. They mean the same thing, so you reduce them to
a single key - "Name". The two steps are almost always used together; hence the
operation is known as Map/Reduce.
Here is a very rudimentary but practical example of Map/Reduce. Suppose you get
Facebook feeds and you are expected to find the total number of likes for your
company's recent post. The Facebook feed comes in the form of a massive dataset.
The first task is to divide it among many servers - a principle described earlier to
make the process scale well. Once the dataset is divided, each machine runs some
code to extract and collate the information and then presents the result to a central
coordinator for the final collation. Here is pseudo-code for the process each server
runs on its subset of data:
begin
  like_count := 0
  other_count := 0
  while (there_are_remaining_posts) loop
    get next post
    extract status of "like" for the specific post
    if status = "like" then
      like_count := like_count + 1
    else
      other_count := other_count + 1
    end if
  end loop
end
Let's name this program counter(). counter() runs on all the servers, which are called
nodes. As shown in the figure, there are three nodes. The raw dataset is divided
into three sub-datasets, which are then fed to each of the three nodes. A copy of each
sub-dataset is kept on another server as well; that takes care of redundancy. Each
node performs its computation and sends its results to an intermediate result set,
where they are collated.
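The divide-compute-collate flow above can be sketched in a few lines of Python. This is a single-machine toy: the chunks stand in for the three nodes, and the function and variable names are mine, not part of any Hadoop API:

```python
# A toy Map/Reduce flow: split the feed into chunks (one per "node"),
# run a counter on each chunk (map), then collate the partial counts
# (reduce).
posts = [
    {"post_id": 1, "status": "like"},
    {"post_id": 1, "status": "comment"},
    {"post_id": 1, "status": "like"},
    {"post_id": 1, "status": "like"},
    {"post_id": 1, "status": "comment"},
    {"post_id": 1, "status": "like"},
]

def split(dataset, nodes):
    """Divide the raw dataset into one sub-dataset per node."""
    return [dataset[i::nodes] for i in range(nodes)]

def counter(subset):
    """The map step: one node counts the likes in its own chunk."""
    return sum(1 for post in subset if post["status"] == "like")

chunks = split(posts, 3)                       # three "nodes"
partial_counts = [counter(c) for c in chunks]  # each node works alone
total_likes = sum(partial_counts)              # reduce: collate results
print(total_likes)
# prints 4
```

In real Hadoop, the splitting and the shipping of partial results to the coordinator are what the framework and HDFS do for you; you supply only the map and reduce logic.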
Map/Reduce Processing
How does this help? In many ways. Let's see:
(1) First, since the data is stored in chunks and a copy of each chunk is stored on a
different node, there is built-in redundancy. There is no need to protect the data
being fed, since a copy is available elsewhere.
(2) Second, since the data is available elsewhere, if a node fails, the other
nodes simply pick up the slack. There is no need to reshuffle or
restart the job.
(3) Third, since the nodes all perform their tasks independently, when the data size
grows, all you have to do is add a new node. The data will then be
divided four ways instead of three, and so will the processing load.
This is very similar to parallel query processes in Oracle Databases, with PQ servers
being analogous to nodes.
There are two very important points to note here:
(1) The subset of data each node gets does not need to be visible to all the nodes.
Each node gets its own set of data to process. A copy of the subset is
maintained on a different node - making simultaneous access to the data
unnecessary. This means you can keep the data in local storage, not in expensive
SANs. This not only brings cost down significantly but may also perform better
due to local access. As the cost of solid state devices and flash-based storage
plummets, it could also mean that the storage cost per unit of performance will be
even better.
(2) The nodes need not be super fast. A relatively simple commodity-class server is
enough for the processing, as opposed to a large server. Typically servers are priced
for their use; e.g. an enterprise-class server with 32 CPUs is probably roughly
equivalent in performance to eight 4-CPU blades, but its cost is way more than
eight times the cost of one blade server. This model takes advantage of the
cheaper computers by scaling horizontally, not vertically.
Hadoop
Now that you know how processing data in parallel using a concept called
Map/Reduce allows you to throw several compute-intensive applications at
large amounts of data, you may wonder - aren't there a lot of moving
parts to be taken care of just to empower this process? In a monolithic server
environment you just have to kick off multiple copies of the program. The operating
system does the job of scheduling these programs on the available CPUs, taking
them off the CPU to roll in another process, preventing processes from
corrupting each other's memory, and so on. When these processes run on
multiple computers, all of that coordination has to be provided to make sure they
work. For instance, in this model you have to ensure that the jobs are split between
the nodes reasonably equally, the dataset is split equitably, the queues for feeding
data to and getting data back from the Map/Reduce jobs are properly maintained,
the jobs fail over in case of node failure, and so on. In short, you need an operating
system of operating systems to manage all these nodes as one monolithic processor.
What if these operating procedures were already defined for you? Well,
that would make things really easy, wouldn't it? You could then focus on what you
are good at - developing the procedures to slice and dice the data and derive
intelligence from it. This "procedure", or framework, is available, and it is called
Hadoop. It's an open source offering, similar to Mozilla and Linux; no single
company has exclusive ownership of it. However, many companies have adopted it
and evolved it into their own offerings, similar to Linux distributions such as Red
Hat, SUSE and Oracle Enterprise Linux. Some of those companies are Cloudera,
Hortonworks, IBM, etc. Oracle does not have a Hadoop distribution of its own;
instead it licenses the one from Cloudera for its Big Data Appliance. The Hadoop
framework runs on all the nodes of the cluster and acts as the coordinator.
A very important point to note here is that Hadoop is just a framework, not the
actual program that performs Map/Reduce. Compare that to the operating system
analogy: an OS like Windows does not offer a spreadsheet. You need to either
develop one or buy an off-the-shelf product such as Excel to get that functionality.
Similarly, Hadoop offers a platform to run the Map/Reduce programs that you
develop; you put into that code the logic of what you "map" and how you "reduce".
Remember another important advantage you saw in this model earlier - the
ability to replicate data between multiple nodes so that the failure of a single node
does not cause the processing to be abandoned. This is offered through a new type
of filesystem called the Hadoop Distributed File System (HDFS). HDFS, which is a
distributed (not a clustered) filesystem, by default keeps 3 copies of the data on
three different nodes - two on the same rack and the third on a different rack. The
nodes communicate with each other using an HDFS-specific protocol that is built on
TCP/IP. The nodes are aware of the data present on the other nodes, which is
precisely what allows the Hadoop job scheduler to divide the work among the nodes.
By the way, HDFS is not absolutely required for Hadoop; but as you can see, HDFS is
what lets Hadoop know which node has what data for smart job scheduling.
Without it, the division of labor would not be as efficient.
Hive
Now that you have learned how Hadoop fills a major void for computations on
massive datasets, you can't help but see its significance for data warehouses, where
massive datasets are common. Also common are jobs that churn through this data.
However, there is a little challenge. Remember the NoSQL databases mentioned
earlier? By and large they do not support SQL. To get the data you have to write a
program using the APIs the vendor supplies. This may reek of the COBOL programs
of yesteryear, where you had to write a program to get the data out, making it
inefficient and highly programmer-driven (although it did shore up job security,
especially during the Y2K transition times). The inefficiency of that model gave rise
to 4th-generation languages like SQL, which brought the power of queries to
common users, ripping the power away from programmers. In other words, it
brought the data and its users closer, reducing the role of the middleman
significantly. In data warehouses this was especially true, since power users issued
queries after getting the result from the previous queries. It was like a conversation
- ask a question, get the answer, formulate your next question - and so on. If the
conversation were dependent on writing programs, it would have been impossible to
be effective.
With that in mind, consider the implications of the lack of SQL in these databases,
which are otherwise highly suitable for data warehouses. The requirement to write a
program to get the data every time would take us straight back to the COBOL days.
Well, not to worry: the Hadoop ecosystem has another product - Hive - that offers a
SQL-like language called HiveQL. Just as users can query relational databases with
SQL very quickly, HiveQL allows users to get the data for analytical processing
directly. It was initially developed at Facebook.
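To give a flavor of it, here is roughly what the like-counting job from the Map/Reduce section might look like in HiveQL. This is a sketch against a hypothetical table of parsed feed records; the table and column names are mine:

```sql
-- Hypothetical table fb_posts(post_id, action, ...), populated from
-- the parsed feed. Count the "like" actions per post declaratively,
-- instead of writing a Map/Reduce program by hand.
SELECT post_id, COUNT(*) AS like_count
FROM fb_posts
WHERE action = 'like'
GROUP BY post_id;
```

Behind the scenes, Hive compiles a query like this into Map/Reduce jobs and runs them on the cluster for you.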
Hortonworks - many of the folks who founded this company came from
Google and Yahoo!, where they built or added to the building blocks of Hadoop
IBM - they have a suite called Big Insights, which includes their distribution of
Hadoop. This is one of the very few companies that offer both the hardware
and the software. The most impressive feature from IBM is a product called
Streams that can mine data from a non-structured stream like Facebook in
realtime and send alerts and data feeds to other systems.
EMC
MapR
Conclusion
If the buzzing of the buzz-words surrounding any new technology annoys you, and
all you get is tons of websites on the topic but not a small, consolidated compilation
of terms, you are just like me. I was frustrated by the lack of information in
digestible form on these buzzwords, which are too important to ignore but would
take too much time to understand fully. This is my small effort to bridge that gap
and get you going on your quest for more information. If you have hundreds or
thousands of questions after reading this, I would congratulate myself - that is
precisely what my objective was. For instance, how HiveQL differs from SQL, or how
Map/Reduce jobs are written - these are questions that should be flying around in
your mind now. My future posts will cover them, and some more topics like HBase,
Zookeeper, etc., that will unravel the mysteries around a technology that is going to
be commonplace in the very near future.
Welcome aboard. I wish you luck in learning. As always, your feedback will be highly
appreciated.
Demonstration
Let's examine this with an example. First, let's check the various pools defined in the database
instance right now:
SQL> show parameter sga_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
sga_target                           big integer 0

SQL> show parameter db_cache_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
db_cache_size                        big integer 300M

SQL> show parameter streams_pool_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------
streams_pool_size                    big integer 0
After the Data Pump job is complete, check the size of the buffer cache again:
SQL> show parameter db_cache_size
The buffer cache got compressed from 300 MB earlier to 280 MB. But you didn't do that; Oracle
did.
Well, where did the 20 MB of missing memory go? Now, check the size of the Streams Pool:
SQL> show parameter streams_pool_size
The Streams Pool was 0 earlier, as you intended it to be; but Oracle allocated 20 MB to it by
stealing that much memory from the buffer cache. The reason: the Streams Pool was used for the
Data Pump Export job, even though it does not sound intuitive. If you check the alert log, you
will see the activity recorded there:
$ adrci
The next question you may be wondering about is why Oracle decided to give only 20 MB
to the Streams Pool. Why not 100 MB, or 10 MB? Is it dependent on the size of the table being
exported? The answer is no.
Oracle by default gives 10% of the size of the shared pool to the Streams Pool. Let me find out
the size of the shared pool:
SQL> show parameter shared_pool_size
The shared pool is 200 MB; 10% of that is 20 MB, which is how much was assigned to the
Streams Pool. That size does not depend on the size of the exported data, but on the size of
the shared pool.
It's important to understand that the shared pool is used only to compute the default size of the
Streams Pool; the actual memory is carved out of the buffer cache, not the shared pool.
If you check the database's resize operations, you can confirm Oracle's adjustment of the
pools:
SQL> select component, oper_type, parameter, initial_size, target_size,
final_size
2 from v$sga_resize_ops
3 order by start_time;
COMPONENT            OPER_TYPE PARAMETER         INITIAL_SIZE TARGET_SIZE FINAL_SIZE
-------------------- --------- ----------------- ------------ ----------- ----------
DEFAULT buffer cache STATIC    db_cache_size                0   314572800  314572800
DEFAULT buffer cache SHRINK    db_cache_size        314572800   293601280  293601280
streams pool         GROW      streams_pool_size            0    20971520   20971520
The output has been truncated to show only the relevant records. From the output you can see
clearly that the buffer cache was defined statically as 314572800 bytes, or 300 MB, initially.
Later the buffer cache shrank from 314572800 to 293601280 (about 280 MB). The amount of
shrinkage was 314572800 - 293601280 = 20971520 (i.e. 20 MB), the exact amount allocated
to the Streams Pool.
Why is this a problem? Well, the biggest problem is that the buffer cache size is now reduced
without your knowledge. Here the buffer cache lost only 10% of a small shared pool; on systems
with a large shared pool, the loss could be substantial. Worse, the amount allocated to the
Streams Pool stays there; it is not returned to the buffer cache as you might expect. You have
to give it back manually:
SQL> alter system set streams_pool_size = 0;
In the case of a RAC database, it's possible that only one instance sees this change in Streams
Pool size; the other instances will be unaffected.
It would be prudent to note here that this surprise occurs only when you do not use automatic
SGA settings. When auto SGA is used, i.e. sga_target is set to a non-zero value, you give
complete control to Oracle to manipulate the memory structures. In that case Oracle juggles
the memory between the various pools - including the Streams Pool - without your control
anyway.
While it is not very well known, this behavior is not undocumented. It's mentioned in the
Utilities Guide at
http://docs.oracle.com/cd/E11882_01/server.112/e22490/dp_perf.htm#SUTIL973.
Conclusion
Just because you haven't defined the streams_pool_size parameter (as you don't use
Streams) doesn't mean that Oracle will not assign some memory to the Streams Pool. Data
Pump, which is frequently used in many databases, uses the Streams Pool, and Oracle will size
it at 10% of the shared pool, reducing the buffer cache by that amount to fund the memory for
the Streams Pool. So you should configure the Streams Pool even if you don't use Streams, so
that Data Pump can use a precisely allocated pool rather than stealing the memory from the
buffer cache. If you don't do that now, or don't intend to do it, then regularly check the
streams_pool_size value and set it to zero if it is not so.
Application Design is the only Reason for Deadlocks? Think Again
Have you ever seen a message ORA-00060: Deadlock detected and automatically
assumed that it was an application coding issue? Well, it may not be. There are
DBA-related causes as well, and you may be surprised to find that even INSERTs can
cause deadlocks. Learn all the conditions that precipitate this error, how to read the
"deadlock graph" to determine the cause, and most important, how to avoid it.
Introduction
I often get a lot of questions, in one form or another, on deadlocks.
What's a Deadlock
Deadlock is one of those little-understood and often misinterpreted concepts in the
Oracle Database. The word rhymes with locking, so most people assume that it is
some form of row locking. Broadly speaking, that's accurate; but not entirely: there
could be causes other than row-level locking. The concept is also often confusing to
people new to Oracle technology, since the term deadlock may have a different
meaning in other databases. To add to the confusion, Oracle's standard response to
the problem is that it's an application design issue and therefore should be solved
through application redesign. Well, in a majority of cases application design is the
problem; but not in all cases. In this post, I will describe:
1. Why Deadlocks Occur
2. Primer on Oracle Latching, Locking
3. How to Interpret Deadlock Traces
4. Various Cases of Deadlocks
5. Some Unusual Cases from My Experience
Deadlocks Explained
With two Oracle sessions each locking the resource requested by the other, there
can never be a resolution, because both will be hanging, denying them the
opportunity to commit or roll back and thereby release their locks. Oracle
automatically detects this deadly embrace and breaks it by forcing one statement
to roll back abruptly (releasing its lock) and letting the other transaction
continue.
Here is how a deadlock occurs. Two sessions are involved, doing updates on
different rows, as shown below:
Step  Session 1             Session 2
----  --------------------  --------------------
1     Update Row1
      (does not commit)
2                           Update Row2
                            (does not commit)
3     Update Row2
4     Waits on TX enqueue
5                           Update Row1
At step 5 above, since Row1 is locked by Session 1, Session 2 will wait; but this
wait will be forever, since Session 1 is also waiting and can't perform a commit or
rollback until its own wait is over. But Session 1's wait will continue to exist until
Session 2 commits or rolls back - a catch-22 situation. This situation is a cause of
deadlock, and Oracle forces the statement at step 3 to be rolled back (since that is
where it detected the deadlock). Note that only the statement that detected the
deadlock is rolled back; the previous statements stay. For instance, the update of
Row1 in step 1 stays.
This is the most common cause of deadlocks; it is purely driven by application
design and can only be solved by reducing the possibility of that scenario
occurring. Now that you understand how a deadlock occurs, we will explore some
other causes of deadlocks. But before that, we will explore the different types of
locks in Oracle.
Types of Locks
Database locks are queue-based, i.e. the session that started waiting for a lock first
will get it before another session that started waiting for the same resource later.
The requesters are placed in a queue; hence locks are also called enqueues. There
are several types of enqueues; but we will focus on row locking, and specifically on
only two types of them:
TX - this is the row-level lock. When a row is locked by a session, this type
of lock is acquired.
When a deadlock occurs and one of the statements gets rolled back, Oracle records
the incident in the alert log. Here is an example entry:
ORA-00060: Deadlock detected. More info in file
/opt/oracle/diag/rdbms/odba112/ODBA112/trace/ODBA112_ora_18301.trc.
Along with the alert log entry, the incident creates a tracefile (as shown above). The
trace file shows valuable information on the deadlock and should be your first stop
in diagnosis. Let's see the various sections of the tracefile:
Deadlock Graph
The first section is important; it shows the deadlock graph. The deadlock graph tells
you which sessions are involved, what types of locks are being sought, and so on.
Let's examine the deadlock graph, shown in the figure below:
Row Information
The next critical section shows the information on the rows locked during the
activities of the two sessions. From the tracefile you can see the object ID. Using
that, you can get the object owner and the name from the DBA_OBJECTS view. The
information in on rowID is also available here. You can get primary key information
from the object using that rowID.
Process Information
The tracefile also shows the Oracle process information which displays the calling
user. That information is critical since the schema owner may not be the one that
With the information collected from the various sections of the trace file, you now
know the following:
The object (table, materialized view, etc.) whose row was in the deadlock
The machine the session came from, with the module, program (e.g.
SQL*Plus) and userid information
Now it is a cinch to find the cause of that deadlock and the specific part of the
application you need to address to fix it.
Other Causes
The case described above is just one type of locking scenario causing deadlocks;
but it is not the only one. Other types of locks also cause deadlocks. These
scenarios are usually difficult to identify and diagnose, and are often misinterpreted.
Well, not for you. You will learn how to diagnose these other causes in this post.
These causes include:
1. ITL Waits
2. Bitmap Index Update
3. Direct Path Load
4. Overlapping PK Values
When a session - session1 - wants to lock row1, it uses slot#1 of the ITL, as
shown in Figure 3 below. Later, another session - session2 - updates row2. Since
there is no free ITL slot, Oracle creates a new slot - slot#2 - for this transaction.
However, at this stage the block is almost packed. If a third transaction comes in,
there will be no room for a third ITL slot to be created, causing that session to
wait on ITL. Remember, this new session wants to lock row3, which is not locked by
anyone and could otherwise have been locked by the session; it is artificially
prevented from being locked due to the absence of an ITL slot.
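A query like the following against V$SEGMENT_STATISTICS produces the listing below (a sketch; the filter and ordering are my own):

```sql
-- Segments that have experienced ITL waits, busiest last.
select owner, object_name, value
from   v$segment_statistics
where  statistic_name = 'ITL waits'
and    value > 0
order  by value;
```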
OWNER    OBJECT_NAME               VALUE
-------- ------------------------- -----
SYSMAN   MGMT_METRICS_1HOUR_PK        19
ARUP     DLT2                         23
ARUP     DLT1                        131
If you check the EVENT column of V$SESSION to see which sessions are
experiencing this right now, you will see that the sessions are waiting on the event
enq: TX - allocate ITL entry.
Deadlock Scenario
Here is the scenario where two sessions cause a deadlock due to ITL shortage.
Imagine rows row1 and row2 in two different blocks, block1 and block2, each block
so tightly packed that no additional ITL slot can be created in it.

Step  Session1                          Session2
----  --------------------------------  --------------------------------
1     Update row1 in block1
2                                       Update row2 in block2
3     Update another row in block2
      (waits: no free ITL slot)
4                                       Update another row in block1
                                        (waits: no free ITL slot)
At step 4, Session 2's hang can't be resolved until Session 1 releases its ITL slot,
which is not possible since Session 1 itself is hanging. This never-ending situation is
handled by Oracle: it detects it as a deadlock and rolls back one of the statements.
Deadlock Graph
To identify this scenario as the cause of the deadlock, look at the deadlock graph.
This is how a deadlock graph looks when caused by ITL waits.
The absence of row information for one of the sessions is a dead giveaway that this
is a block-level issue, not one related to specific rows. Here are the clues in this
deadlock graph:
The holders held the lock in "X" (exclusive) mode (this is expected for TX
locks)
However, only one of the waiters is waiting in "X" mode. The other is
waiting in "S" (shared) mode, indicating that it's not really a row lock
the session is waiting for.
These clues confirm that this is an ITL-related deadlock, not one caused by
application design. Further down the tracefile we see:
As you can see, it's not 100% clear from the tracefile that the deadlock was caused
by ITL. However, by examining the tracefile we see that the locks are of TX type and
the wait is in S (shared) mode. This usually indicates an ITL-wait deadlock. You
can confirm that this is the case by checking the ITL shortages on that segment in
the view V$SEGMENT_STATISTICS, as shown earlier.
Update on 4/19/2013: [Thanks, Jonathan Lewis] Occasionally you may see two rows
here as well, as a result of a previous wait (e.g. buffer busy wait) on the block which
has not been cleaned out yet. In such a case you will see information on two rows;
but there are some other clues that may point to this cause. The row portion of the
rowid will be 0, meaning it was not a row but the block. The other clue might be that
the row information points to a row that has nothing to do with the SQL statement.
For instance, you may find the row information pointing to a row in table Table1
whereas the SQL statement is "update Table2 set col2 = 'X' where col1 = 2".
The solution is very simple: just increase the INITRANS value of the table. INITRANS
determines the initial number of ITL slots. Please note, this value will affect only
new blocks; the old ones will still be left with the old value. To affect the old ones,
you can issue ALTER TABLE TableName MOVE to move the table into new blocks and
hence a new structure.
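For instance (a sketch, with a hypothetical table and index name; remember that a MOVE leaves the table's indexes unusable, so rebuild them afterwards):

```sql
alter table bookings initrans 10;   -- new blocks get 10 ITL slots
alter table bookings move;          -- rewrites the existing blocks
alter index bookings_pk rebuild;    -- indexes go UNUSABLE after a move
```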
Deadlock due to Foreign Key
This is a really tricky one; but not impossible to identify. When a key value in parent
table is updatd or a row is deleted, Oracle attempts to takes TM lock on the entire
child table. If an index is present on the foreign key column, then Oracle locates the
corresponding child rows and locks only those rows. The documentation in some
versions may not very clear on this. There is a documentation bug (MOS Bug#
2546492). In the absense of the index, a whole table TM lock may cause a deadlock.
Let's see the scenario when it happens.
Scenario
Here is the scenario when this deadlock occurs. [Step-by-step table: actions of
Session1 and Session2 leading to the deadlock]
Deadlock Graph
This is how the deadlock graph looks when caused by an unindexed foreign key. As
you can see, the deadlock graph does not clearly say that the issue has to do with
foreign key columns not being indexed. Instead, the clues here are:
TM locks for both the sessions, instead of TX. Remember: TM locks are metadata
related, as opposed to TX, which is a row-related lock.
The lock type of the holders is Share Exclusive (SX) as opposed to Exclusive (X).
These clues together show that this deadlock is due to FK contention rather
than the conventional row locks.
So, what do you do? Simple: create indexes on those FKs and you will not see
this again. As a general rule you should have indexes on FKs anyway, but there are
exceptions, e.g. a table whose parent key is never updated and deleted only infrequently
(think of a table with country codes, state codes or something pervasive like that). If
you see a lot of deadlocks in those cases, perhaps you should create indexes on
those FK columns anyway.
Deadlock due to Bitmap Index
Scenario
Here is the scenario when this deadlock occurs. [Step-by-step table: one session
updates Row1, which locks a bitmap index piece; the other session updates Row2; each
session then tries to update the other row and hangs for a TX row lock, because the
bitmap index piece locked by the other session can't be released until that session
commits. Deadlock!]
Deadlock Graph
As usual, the deadlock graph confirms this condition. You can confirm this
occurrence from reading the deadlock graph:
The lock mode for both the holders and waiters is X (indicating a row lock).
The lock wait mode is S (shared) but the type of lock is TX rather than TM.
The row information is available, but the object ID is not the ID of the table;
it is that of the bitmap index.
The solution to this deadlock is really simple: just alter the application logic in such
a way that the two updates do not happen in sequence without commits in
between. If that's not possible, then you have to re-evaluate the need for a bitmap
index. Bitmap indexes are usually for data warehouses only, not for OLTP.
Deadlock due to Primary Key Overlap
This is a very special case of deadlock, which occurs during inserts, not updates or
deletes. This is probably the only case where inserts cause deadlocks. When you
insert a record into a table but do not commit it, the record goes in, but a further insert
with the same primary key value waits. This lock is required because the
first insert may be rolled back, allowing the second one to pass through. If the first
insert is committed, then the second insert fails with a PK violation. But in the
meantime, the second insert simply hangs.
Scenario
[Step-by-step table: each session first inserts a row with a unique PK value without
committing; then Session1 inserts PK Col = 2 and hangs until Session2 commits, while
Session2 inserts PK Col = 1 and hangs as well. Deadlock!]
Deadlock Graph
The deadlock graph looks like the following.
The waiters are waiting for locks in S mode, even when the locks type TX.
The subsequent parts of the tracefile dont show any row information.
However, the latter parts of the tracefile shows the SQL statement, which should be
able to point to the cause of the deadlock as the primary key deadlock. Remember,
this may be difficult to diagnose first since there is no row information. But this is
probably normal since the row is not formed yet (it's INSERT, remember?).
Special Cases
I have encountered some very interesting cases of deadlocks which may be rather
difficult to diagnose. Here are some of these special cases.
Autonomous Transactions
Autonomous transactions are ones that are kicked off form inside another
transaction. The autonomous one follows its own commit, i.e. it can commit
independently of the outer transaction. The autonomous transaction may lock some
records the parent transaction might be interested in and vice versa a perfect
condition for deadlocks. Since the autonomous transactions is triggered by its
parent, the deadlocks are usually difficult to catch.
Here is how the deadlock graph looks (excerpted from the tracefile):
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-0005002d-00001a40        17      14     X             17      14           X
session 14: DID 0001-0011-00000077
session 14: DID 0001-0011-00000077
Rows waited on:
Session 14: obj - rowid = 000078D5 - AAAHjVAAHAAAACOAAA
(dictionary objn - 30933, file - 7, block - 142, slot - 0)
Information on the OTHER waiting sessions:
End of information on OTHER waiting sessions.
Here are the interesting things about this deadlock graph, which are clues to
identifying this type of deadlock:
The lock type is TX (row lock) and the mode is "X", which is exclusive. This
indicates a simple row lock.
The row information is not there because the autonomous transaction acts
independently of the parent.
If you see a deadlock graph like this, you can be pretty much assured that
autonomous transactions are to blame.
Update on 4/19/2013. [Thanks, Mohamed Houri] The above cause is not limited to
TX locks; it could happen in TM locks as well. The diagnosis remains the same.
This code locks the rows selected by the parallel query slaves. Since the select is
done in parallel, the PQ slaves distribute the rows to be selected. Therefore the
locking is also distributed among the PQ slaves. Since no two rows are updated by
the same PQ slave (and hence the same session), there is no cause for deadlocks.
However, assume the code is kicked off more than once concurrently. This kicks off
several PQ slaves and many query coordinators. In this case there is no guarantee
that two slaves (from different coordinators) will not pick up the same row. In that
case, you may run into deadlocks.
Freelists
In case of tablespaces defined with manual segment space management, if too
many process freelists are defined, it's possible to run out of transaction freelists,
causing deadlocks.
In Conclusion
The most common cause of deadlocks is the normal row level locking, which is
relatively easy to find. But that's not the only reason. ITL Shortage, Bitmap Index
Locking, Lack of FK Index, Direct Path Load, PK Overlap are also some of the
potential causes. You must check the tracefile and interpret the deadlock graph to
come to a definite conclusion on the cause of the deadlock. Some of the causes,
e.g. ITL shortage, have to do with the schema design, not the application design, and
are quite easy to solve. In some cases, such as PK overlap, even INSERTs cause
deadlocks.
I hope you found it useful in diagnosing the deadlock conditions in your system. As
always, your feedback is very much appreciated.
Switching Back to Regular Listener Log Format
Did you ever miss the older listener log file format and want to turn off the ADR-style log introduced in 11g? Well, it's really very simple.
Problem
Oracle introduced the Automatic Diagnostic Repository (ADR) with Oracle 11g
Release 1. This introduced some type of streamlining of various log and trace files
generated by different Oracle components such as the database, listener, ASM, etc.
This is why you don't find the alert log in the usual location specified by the familiar
background_dump_dest initialization parameter, but in a directory specified by a
different parameter: ADR_BASE. Similarly, listener logs now go in this format:
$ADR_BASE/tnslsnr/Hostname/listener/alert/log.xml
Remember, this is in XML format, not the usual listener.log. The idea was to
present the information in the listener log in a consistent, machine-readable format
instead of the cryptic, inconsistent older listener log format. Here is an
example of the new format:
<msg time='2013-03-31T13:17:22.633-04:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='oradba2'
host_addr='127.0.0.1' version='1'
>
<txt>31-MAR-2013 13:17:22 * service_update * D112D2 * 0</txt>
</msg>
<msg time='2013-03-31T13:17:25.317-04:00' org_id='oracle' comp_id='tnslsnr'
type='UNKNOWN' level='16' host_id='oradba2'
host_addr='127.0.0.1'
>
<txt>WARNING: Subscription for node down event still pending </txt>
</msg>
Being in XML format, the files can be read unambiguously by many tools, since the
data is enclosed within meaningful tags. Additionally, the listener log
file (the XML format) is now rotated: after reaching a certain threshold the
file is renamed to log_1.xml and a new log.xml is created, somewhat akin to the
archived log concept for redo log files.
While this proved useful for new tools, there were also myriad tools
that read the older log format perfectly. So Oracle didn't stop the practice of writing
the old format log. The old format log is still called listener.log, but the
directory it is created in is different: $ADR_BASE/tnslsnr/Hostname/listener/trace.
Unfortunately there is no archiving scheme for this file, so it simply keeps growing.
In the pre-11g days you could temporarily redirect the log to a different location and
archive the old one by setting the following parameter in listener.ora:
log_directory = tempLocation
However, in Oracle 11g R1 and beyond, this will not work; you can't set the location
of the log_directory.
Solution
So, what's the solution? Simple. Just set the following parameter in listener.ora:
diag_adr_enabled_listener = off
This will disable the ADR style logging for the listener. Now, suppose you want to set
the directory to /tmp and log file name to listener_0405.log, add the following into
listener.ora (assuming the name of the listener is "listener"; otherwise make the
necessary change below):
log_file_listener = listener_0405.log
log_directory_listener = /tmp
That's it. The ADR-style logging will be permanently gone and you will be reunited
with your much-missed pre-11g style logging. You can confirm it:
LSNRCTL> status
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     listener
Version                   TNSLSNR for Linux: Version 11.2.0.1.0 - Production
Start Date                26-NOV-2012 16:50:58
Uptime                    129 days 15 hr. 33 min. 31 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/oracle/product/11.2.0/grid/network/admin/listener.ora
Listener Log File         /tmp/listener_0405.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=oradba2)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
... output truncated ...
Happy logging.
P.S. By the way, you can also change the values by issuing set commands from
LSNRCTL command prompt:
LSNRCTL> set log_file '/tmp'
However, if you have heeded my earlier advice, you might have set
admin_restrictions to ON, so you can't use the set command. Instead, you would put the
value in listener.ora and reload the listener for the desired effect.
Why should you set the ADMIN_RESTRICTIONS_LISTENER to ON
Recently someone probably went through the slides of my session on "Real Life DBA
Best Practices" and had a question on OTN forum why I was recommending setting
the parameter to ON, as a best practice. I responded on the forum; but I feel it's
important enough to put it here as well.
As a best practice, I recommend setting this parameter to ON (the default is OFF).
But as I profess, a best practice is not one without a clear explanation. Here is the
explanation.
Over time, the Oracle Database has encountered several security
vulnerabilities, some of them in the listener. Some are related to buffer overflows;
others involve unauthorized access into the listener process itself. Some of the
listener access exploits come from external listener manipulations. Did you know
that you do not even need to log into a server to connect to the listener? As long as
the port the listener is listening on is open (and it will be, for obvious reasons) you
can connect to the listener from a remote server.
In 10g, Oracle provided a default mechanism that does not require password from
the oracle user manipulating the listener via online commands. Having said that,
there have been bugs and there will be. Those vulnerabilities usually get fixed later;
but most often the fix does not get to the software quickly enough.
So, what should you do to protect against these vulnerabilities? A simple
thing to do is to remove the possibility altogether, and that's where the admin
restrictions come into the picture. After setting this parameter, you can't dynamically
change listener parameters. So even if a connection is somehow made from an
outside server, bug or not, eliminating the possibility altogether mitigates the risk.
And that's why I recommend it.
Let's ponder the problem a little bit more. Is there a problem in setting the
parameter? Absolutely not. When you need to change a parameter, you simply log
on to the server, update listener.ora and issue "lsnrctl reload". This reloads the
parameter file dynamically. Since you never stopped the listener, you will not see
unsuccessful connection requests from clients. So, it is effectively dynamic. If you are
the oracle user, then you can log on to the server; so there is no issue there.
I advocate this policy, rather than dynamic parameter changes, for these simple
reasons:
(1) It plugs a potential hole due to remote listener vulnerability attacks, regardless
of the probability of that happening.
(2) It forces you to make changes to the listener.ora file, which records the timestamp.
(3) I ask my DBAs to put extensive comments in the parameter files, including the
listener.ora file, to explain each change. I also ask them to comment out the previous
line and create a new line with the new value, rather than updating a value directly.
This sort of documentation is a gem during debugging. Changing the parameter file
allows that, while a dynamic change does not.
So, I don't see a single functionality I lose by this practice; and I just showed you
some powerful reasons to adopt this practice. No loss, and some gain, however
small you consider that to be - and that's why I suggest it.
As I mentioned earlier, a best practice is not one without a clear explanation. I hope
this explanation makes it clear.
Quiz: Mystery of Create Table Statement
Happy Friday! I thought I would jumpstart your creative juices with this little, really
simple quiz. While it's trivial, it may not be that obvious to many. See if you can
catch it. Time yourself exactly 1 minute to get the answer. Tweet answer to me
@arupnanda
Here it goes. Database is 11.2.0.3. Tool is SQL*Plus.
The user ARUP owns a procedure that accepts an input string and executes it. Here
is the procedure.
create or replace procedure manipulate_arup_schema
(
p_input_string varchar2
)
is
begin
execute immediate p_input_string;
end;
/
The user ARUP has granted EXECUTE privileges on this to user SCOTT. The idea is
simple: SCOTT can create and drop tables and other objects in ARUP's schema
without requiring the dangerous CREATE ANY TABLE system privilege.
With this, SCOTT tries to create a table in the ARUP schema:
SQL> exec arup.manipulate_arup_schema ('create table abc (col1 number)')
PL/SQL procedure successfully completed.
The table creation was successful. Now SCOTT tries to create the table in a slightly
different manner:
SQL> exec arup.manipulate_arup_schema ('create table abc1 as select * from
dual');
Huh? After checking, you confirmed that the user indeed doesn't have a quota on
tablespace USERS, so the error is genuine; but how did the first table creation
command go through successfully?
Tweet me the answer @arupnanda. Aren't on Twitter? Just post the answer here as a
comment. I will post the answer right here in the evening. Let's see who posts the
first answer. It shouldn't take more than 5 minutes to get the answer.
Have fun.
Update at the end of the Day. Here is the answer:
Hashing
Here is a simple example. Suppose you are negotiating a rate for your babysitter
and you agreed on an amount - $123. Now you ask the sitter to tell your spouse
that amount. Well, how do you make sure she mentions that very amount?
After all, she has an incentive to say a higher amount, doesn't she? She could say that
you agreed on $125 or even $150; your spouse would not be able to ascertain that.
(Imagine for a moment that you don't have access to normal modern technology
like a cellphone for you to communicate directly with your spouse.) So you
develop a simple strategy: you come up with a formula that creates a number from
the amount. It could be as simple as, say, the total of all the digits. So your amount $123 becomes:
1+2+3=6
You write that down on a piece of paper, seal it in an envelope and ask the sitter to give it to
your spouse in addition to mentioning the agreed-upon amount. You and your spouse
both know this formula, but the sitter doesn't. Suppose she fudges the amount you
agreed on, making it, say, $125. Upon hearing it, your spouse computes the
magic number:
1+2+5=8
Your spouse compares this with the number inside the sealed envelope and
immediately comes to the conclusion that the amount you agreed on was something
different, not $125. The authenticity of the value is now definitively established to
be false.
This process is called hashing, and this magic number is called a hash value. Of
course, real hashing is much more complex than merely adding the digits; I
just wanted to show the concept with a very simple example. The mechanics of the
process, which here was simply adding up the digits, is known as the hashing algorithm.
Here are some properties of this hashing process:
(1) The process is one-way. You can determine the hashvalue by adding the digits
(1+2+3), but you can't determine the source number from the hashvalue (6). Your
spouse can't determine from the hashvalue what amount you agreed on. So it's not
the same as encryption, which allows you to decrypt the data and come up with the
source value.
(2) The purpose is not to store values; it's merely to establish authenticity. In
this example, your spouse determines that the amount mentioned by the babysitter
($125) must be wrong, because its hashvalue would have been 8, not 6. After
the authenticity is established (or rejected, as in this case), the purpose of the
hashvalue ceases to exist.
(3) The hashing function is deterministic, i.e. it will always come up with the same
value every time it is invoked against the same source value.
(4) What if the babysitter had mentioned $150? The hashvalue, in that case, would
have been 1+5+0 = 6, exactly the hashvalue computed by you. Your spouse would
then have accepted the value $150 as authentic, which would have been wrong. So
it's important that the hashvalue be as close to unique as possible, to reduce the
possibility of two different numbers producing the same result. Two inputs producing
the same hashvalue is known as a "collision".
The algorithm is the key to making sure the possibility of collisions is reduced. There
are several algorithms in use. Two very common ones are MD5 (Message Digest)
and SHA-1 (Secure Hash Algorithm).
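To make the toy example concrete, here is a small shell sketch of the digit-sum "hash" used above, including the collision between $123 and $150 (the function name is mine, not part of any standard tool):

```shell
# Toy hash: add up the digits of a number (the "algorithm" from the example).
digit_sum() {
  n=$1
  s=0
  while [ "$n" -gt 0 ]; do
    s=$(( s + n % 10 ))   # add the last digit
    n=$(( n / 10 ))       # drop the last digit
  done
  echo "$s"
}

digit_sum 123   # prints 6  (1+2+3)
digit_sum 125   # prints 8  (1+2+5)
digit_sum 150   # prints 6  (1+5+0): a collision with 123
```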
Since the source value can't be computed back from hash value, this is considered
by some as a more secure process than encryption. This process is useful in
situations where the reverse computation of values is not necessary; merely the
matching of hashvalues is needed. One such example is passwords. If you want to
establish that the password entered by the user matches the stored password, all
you have to do is generate the hashvalue and match that with the hashvalue stored
in the database. If they match, you establish that the password is correct; if not
then, well, it's not. This has an inherent security advantage. If someone somehow
manages to read the stored passwords, all that will be exposed are the hashvalues of
the passwords, not the actual passwords themselves. As we saw
earlier, it is practically impossible to recover the original password from the hashvalue.
That's why hashing is common in password storage.
Salt
So, that's great, with some higher degree of security for password store. What's the
problem?
The problem is that the hashvalues are way too predictable. Recall from the
previous section that the hashvalue of a specific input value is always the same.
Considering the simple hash function (adding digits), the input value $123 will
always return 6 as the hashvalue. Consider this: an adversary can see the
hashvalue and guess the input value, as shown below.
Is the input value $120? The hash value is 1+2+0 = 3, which does not match "6", so
it must not be the correct number.
Is it $121? Hash value of 121 is 1+2+1=4, different from 6; so this is not correct
either.
Is it $122? Hashvalue of 122 is 5; so not correct.
Is it $123? Hashvalue is 6. Bingo! The adversary now knows the input value.
In just 4 attempts the adversary figured out the input value from the hashvalue.
Consider this scenario for passwords. The adversary can see the password hash
(from which he can't decipher the password), but he can generate hashes from
multiple input strings and check which one matches the stored password
hashvalue. Using the computing power of modern computers, this becomes almost
trivial. So a hash value alone is not inherently secure.
What is the solution, then? What if the hash value were not so predictable? If the
hash value generated from an input value were different each time, it would be
impossible to match it against a stored value. This element of randomness in an
otherwise deterministic function is brought in by introducing a modifier to the process,
called a "salt". Like its real-life namesake, the salt adds spice to the hashvalue to give it
a unique "flavor", i.e. a different value. Here is an example where we are storing the
password value "Secret":
hash("Secret") = "X"
hash("Secret" + salt1) = "Y"
hash("Secret" + salt2) = "Z"
Every time a different salt is mixed into the input, a different value is produced, so
the stored values can't simply be matched against precomputed hashes of the
passwords.
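Here is a shell sketch, using sha1sum from GNU coreutils, of how mixing a random salt into the input changes the stored value (the exact salting scheme here is illustrative only, not any particular product's):

```shell
pass="Secret"

# Unsalted: the same input always produces the same SHA-1 value.
h_plain=$(printf '%s' "$pass" | sha1sum | awk '{print $1}')

# Salted: a random salt is prepended to the input, so the stored
# value differs from the plain hash (the salt is stored alongside it).
salt=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
h_salted=$(printf '%s%s' "$salt" "$pass" | sha1sum | awk '{print $1}')

echo "plain : $h_plain"
echo "salted: $h_salted"
```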
In case of LinkedIn, the passwords were stored without salt. Therefore it was easy
for the adversary to guess the passwords by creating SHA-1 hash values from
known words and comparing against the stored value. Here is a rough pseudo-code:
for w in ( ... list of words ... ) loop
  l_hash := hash(w);
  if l_hash = stored_value then
    show 'Bingo! The password is ' || w;
    exit;
  end if;
end loop;
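The same guessing loop can be sketched in the shell with sha1sum; the word list and the password "tiger" here are made up purely for the demonstration:

```shell
# Hash the adversary somehow read from the password store
# (here we just compute it ourselves for the demo).
stored=$(printf '%s' "tiger" | sha1sum | awk '{print $1}')

# Try each candidate word: hash it and compare with the stored value.
found=""
for w in lion puma tiger bear; do
  h=$(printf '%s' "$w" | sha1sum | awk '{print $1}')
  if [ "$h" = "$stored" ]; then
    found=$w
    break
  fi
done
echo "Bingo! The password is $found"
```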
The password is hashed and thus undecipherable, but we know that SCOTT's
password is "tiger." Therefore, the hash value for "tiger" when the userid is "SCOTT" is
F894844C34402B67. Now, if SCOTT's password changes, this hash value also
changes. You can then check the view DBA_USERS to see if SCOTT's stored hash
matches this value, which verifies the password as "tiger".
So how can an adversary use this information? It's simple. If he creates the user
SCOTT with the password TIGER, he will come to know the hash value stored in
the password column. He can then build a table of such accounts and the hashed
values of their passwords, and compare them against the password hashes stored in
the data dictionary. What's worse, he can create this user in any Oracle database,
not necessarily the one he is attacking right now.
This is why you must never use default passwords and easily guessed passwords.
Protection
Now that you know how adversaries use the password hash to guess passwords,
you should identify all such users and expire them, or force them to change their
passwords. How can you get a list of such users?
In Oracle Database 11g, this is easy, almost to the point of being trivial. The
database has a special view, dba_users_with_defpwd, that lists the usernames with
the default passwords. Here is an example usage:
select * from dba_users_with_defpwd;
USERNAME
------------------------------
DIP
MDSYS
XS$NULL
SPATIAL_WFS_ADMIN_USR
CTXSYS
OLAPSYS
OUTLN
OWBSYS
SPATIAL_CSW_ADMIN_USR
EXFSYS
ORACLE_OCM
output truncated
The output clearly shows the usernames that have the default password. You can
join this view with DBA_USERS to check on the status of the users:
select d.username, account_status
from dba_users_with_defpwd d, dba_users u
where u.username = d.username;
USERNAME                       ACCOUNT_STATUS
------------------------------ --------------------------------
PM                             EXPIRED & LOCKED
OLAPSYS                        EXPIRED & LOCKED
BI                             EXPIRED & LOCKED
SI_INFORMTN_SCHEMA             EXPIRED & LOCKED
OWBSYS                         EXPIRED & LOCKED
XS$NULL                        EXPIRED & LOCKED
ORDPLUGINS                     EXPIRED & LOCKED
APPQOSSYS                      EXPIRED & LOCKED
output truncated
Oracle 10g
What if you don't have Oracle 11g?
In January 2006, Oracle made a downloadable utility available for identifying default
passwords and their users. This utility, delivered via patch 4926128, is available on
My Oracle Support as described in document ID 361482.1. As of this writing, the
utility checks a handful of default accounts in a manner similar to that described
above; by the time you read this, however, its functionality may well have
expanded.
Security expert Pete Finnigan has done an excellent job of collecting all such default
accounts created during various Oracle and third-party installations, which he has
published for public use on his website, petefinnigan.com. Rather than reinventing the
wheel, we will use Pete's work and thank him profusely. I have changed his original
approach a little bit, though.
First, create a table to store the default accounts and default passwords:
CREATE TABLE osp_accounts (
product VARCHAR2(30),
security_level NUMBER(1),
username VARCHAR2(30),
password VARCHAR2(30),
hash_value VARCHAR2(30),
commentary VARCHAR2(200)
);
Then you can load the table using data collected by Pete Finnigan from many
sources. (Download the script here.) After the table is loaded, you are ready to
search for default passwords. I use a very simple SQL statement to find the
users:
col password format a20
col account_status format a20
col username format a15
select o.username, o.password, d.account_status
from dba_users d, osp_accounts o
where o.hash_value = d.password
/
USERNAME        PASSWORD             ACCOUNT_STATUS
--------------- -------------------- --------------------
CTXSYS          CHANGE_ON_INSTALL    OPEN
OLAPSYS         MANAGER              OPEN
DIP             DIP                  EXPIRED & LOCKED
DMSYS           DMSYS                OPEN
EXFSYS          EXFSYS               EXPIRED & LOCKED
SYSTEM          ORACLE               OPEN
WMSYS           WMSYS                EXPIRED & LOCKED
XDB             CHANGE_ON_INSTALL    EXPIRED & LOCKED
OUTLN           OUTLN                OPEN
SCOTT           TIGER                OPEN
SYS             ORACLE               OPEN
Here you can see some of the most vulnerable situations, especially the last line,
where the username is SYS and the password is "ORACLE" (as is SYSTEM's)!
It may not be "change_on_install", but it's just as predictable.
Action Items
Now that you know how one adversary used the salt-less hashing algorithm to
guess passwords, you have some specific actions to take.
(1) Advocate the use of non-dictionary words as passwords. Remember, the adversary
can hash candidate passwords and compare the resultant hashes against the stored
hash to see if they match. Making the input values impossible to guess makes that
comparison useless.
(2) Immediately check the database for users with default passwords. Either
change the passwords, or expire and lock those accounts.
(3) Whenever you use hashing (and not encryption), use a salt, to make sure it is
difficult, if not impossible, for the adversary to guess the input.
With these two differentiators in place, the tool has a great future. Check out everything on this tool at http://www.oracle.com/technetwork/database/globalization/dmu/overview/index330958.html or just visit the booth at #OOW Demogrounds in Moscone South.
Oh, did I mention that the tool is free?
Suppose a directory contains file1 (owner ananda, group users, 70 bytes) and five
other files, file2 through file6 (owner oracle, group dba, 132 bytes each), all dated
Aug 4 04:02,
and you need to change the permissions of all the files to match those of file1. Sure, you could
issue chmod 644 * to make that change; but what if you are writing a script to do that and you
don't know the permissions beforehand? Or perhaps you are making several permission changes
based on many different files, and you find it infeasible to go through the permissions of each
of those and modify them accordingly.
A better approach is to make the permissions similar to those of another file. This command
makes the permissions of file2 the same as file1:
# chmod --reference file1 file2
# ls -l file[12]
-rw-r--r--    1 ananda   users          70 Aug  4 04:02 file1
-rw-r--r--    1 oracle   dba           132 Aug  4 04:02 file2
The file2 permissions were changed exactly as in file1. You didn't need to get the permissions of
file1 first.
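A minimal, self-contained sketch of that scripted use (GNU coreutils syntax; the temp files are created just for the demo):

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/file1" "$tmpdir/file2"
chmod 644 "$tmpdir/file1"    # the reference permissions
chmod 600 "$tmpdir/file2"    # some other, unknown-to-the-script permissions

# Copy file1's mode to file2 without ever reading it explicitly.
chmod --reference="$tmpdir/file1" "$tmpdir/file2"

mode=$(stat -c '%a' "$tmpdir/file2")
echo "$mode"    # prints 644
rm -rf "$tmpdir"
```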
You can also use the same trick in group membership in files. To make the group of file2 the
same as file1, you would issue:
# chgrp --reference file1 file2
# ls -l file[12]
-rw-r--r--    1 ananda   users          70 Aug  4 04:02 file1
-rw-r--r--    1 oracle   users         132 Aug  4 04:02 file2
Of course, what works for changing groups will work for owner as well. Here is how you can use
the same trick for an ownership change. If permissions are like this:
# ls -l file[12]
-rw-r--r--    1 ananda   users          70 Aug  4 04:02 file1
-rw-r--r--    1 oracle   dba           132 Aug  4 04:02 file2
you can make file2's ownership match file1's:
# chown --reference file1 file2
# ls -l file[12]
-rw-r--r--    1 ananda   users          70 Aug  4 04:02 file1
-rw-r--r--    1 ananda   users         132 Aug  4 04:02 file2
This is a trick you can use to change ownership and permissions of Oracle executables in a
directory based on some reference executable. This proves especially useful in migrations where
you can (and probably should) install as a different user and later move them to your regular
Oracle software owner.
More on Files
The ls command, with its many arguments, provides some very useful information on files. A
different, less well-known command, stat, offers even more useful information.
Here is how you can use it on the executable oracle, found under $ORACLE_HOME/bin.
# cd $ORACLE_HOME/bin
# stat oracle
  File: `oracle'
  Size: 93300148        Blocks: 182424         IO Block: 4096   Regular File
Device: 343h/835d       Inode: 12009652        Links: 1
Access: (6751/-rwsr-s--x)  Uid: (  500/  oracle)   Gid: (  500/     dba)
Access: 2006-08-04 04:30:52.000000000 -0400
Modify: 2005-11-02 11:49:47.000000000 -0500
Change: 2005-11-02 11:55:24.000000000 -0500
Note the information you got from this command: In addition to the usual filesize (which you
can get from ls -l anyway), you got the number of blocks this file occupies. The typical Linux
block size is 512 bytes, so a file of 93,300,148 bytes would occupy (93300148/512=) 182226.85
blocks. Since blocks are used in full, this file uses some whole number of blocks. Instead of
making a guess, you can just get the exact blocks.
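The rounding described above can be checked with shell arithmetic; note that the Blocks value actually reported by stat (182424) is a bit higher than this minimum, since the filesystem may allocate extra blocks for metadata such as indirect pointers:

```shell
size=93300148
# Round up to whole 512-byte blocks: ceil(size / 512)
min_blocks=$(( (size + 511) / 512 ))
echo "$min_blocks"   # prints 182227
```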
You also get from the output above the GID and UID of the ownership of the file and the octal
representation of the permissions (6751). If you want to reinstate it back to the same permissions
it has now, you could use chmod 6751 oracle instead of explicitly spelling out the permissions.
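For example, here is a sketch of capturing the current octal mode with stat and reinstating it later (the temp file and modes are made up for the demo; stat -c is the GNU coreutils syntax):

```shell
f=$(mktemp)
chmod 751 "$f"

saved=$(stat -c '%a' "$f")   # capture the octal mode, e.g. "751"
chmod 600 "$f"               # change it temporarily
chmod "$saved" "$f"          # reinstate the captured mode

restored=$(stat -c '%a' "$f")
echo "$restored"             # prints 751
rm -f "$f"
```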
The most useful part of the above output is the file access timestamp information. It shows you
that the file was accessed on 2006-08-04 04:30:52 (as shown next to Access:), or August 4,
2006 at 4:30:52 AM. This is when someone started to use the database. The file was modified on
2005-11-02 11:49:47 (as shown next to Modify:). Finally, the timestamp next to Change:
shows when the status of the file was changed.
The -f modifier to the stat command shows information on the filesystem instead of the file:
# stat -f oracle
  File: "oracle"
    ID: 0        Namelen: 255     Type: ext2/ext3
Blocks: Total: 24033242   Free: 15419301   Available: 14198462   Size: 4096
Inodes: Total: 12222464   Free: 12093976
Another option, -t, gives exactly the same information but on one line:
# stat -t oracle
oracle 93300148 182424 8de9 500 500 343 12009652 1 0 0 1154682061 1130950187 1130950524 4096
This is very useful in shell scripts where a simple cut command can be used to extract the values
for further processing.
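For instance, here is a sketch of extracting the size field (the second space-separated field of stat -t output) in a script; the temp file is created just for the demo:

```shell
f=$(mktemp)
printf 'hello' > "$f"        # a 5-byte file

# stat -t prints: name size blocks ... all on one line;
# cut pulls out the second field, the size in bytes.
size=$(stat -t "$f" | cut -d' ' -f2)
echo "$size"                 # prints 5
rm -f "$f"
```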
Tip for Oracle Users
When you relink Oracle (often done during patch installations), it moves the existing executables
to a different name before creating the new ones. For instance, you could relink all the utilities by
issuing:
relink utilities
This recompiles, among other things, the sqlplus executable. It moves the existing executable sqlplus
to sqlplusO. If the recompilation fails for some reason, the relink process renames sqlplusO back to
sqlplus and the changes are undone. Similarly, if you discover a functionality problem after
applying a patch, you can quickly undo the patch by renaming the file yourself.
# stat sqlplusO
  File: `sqlplusO'
  Size: 8851         Blocks: 24        IO Block: 4096   Regular File
Device: 343h/835d    Inode: 9125991    Links: 1
Access: (0751/-rwxr-x--x)  Uid: ( 500/  oracle)   Gid: ( 500/     dba)
Access: 2005-11-02 11:55:24.000000000 -0500
Modify: 2005-11-02 11:50:46.000000000 -0500
Change: 2006-08-04 05:13:57.000000000 -0400
It shows sqlplusO was modified on November 2, 2005, while sqlplus was modified on August 4, 2006, which also corresponds to the status change time of sqlplusO. It indicates that the original version of sqlplus was in effect from Nov 2, 2005 to Aug 4, 2006. If you want to diagnose some functionality issue, this is a great place to start. In addition to the content changes, since you know when the permissions were changed, you can correlate that time with any perceived functionality issues.
Another important output is the size of the file, which differs: 9865 bytes for sqlplus as opposed to 8851 bytes for sqlplusO, indicating that the versions are not mere recompiles; they actually changed, perhaps with additional libraries linked in. This also points to a potential cause of some problems.
File Types
When you see a file, how do you know what type of file it is? The command file tells you that.
For instance:
# file alert_DBA102.log
alert_DBA102.log: ASCII text
The file alert_DBA102.log is an ASCII text file. Let's see some more examples:
# file initTESTAUX.ora.Z
initTESTAUX.ora.Z: compress'd data 16 bits
This tells you that the file is a compressed file, but how do you know the type of the file that was compressed? One option is to uncompress it and run file against the result; but then you would have to uncompress it first, defeating the purpose. A cleaner option is to use the parameter -z:
# file -z initTESTAUX.ora.Z
initTESTAUX.ora.Z: ASCII text (compress'd data 16 bits)
This is useful; but suppose the file is a symbolic link: what type of file does it point to? Instead of running file against the target yourself, you can use the option -L to follow the link:
# file -L spfile+ASM.ora.ORIGINAL
spfile+ASM.ora.ORIGINAL: data
This clearly shows that the file contains binary data. Note that the spfile is a binary file, as opposed to the text-based init.ora; so file reports it as data.
Tip for Oracle Users
Suppose you are looking for a trace file in the user dump destination directory but are unsure if
the file is located on another directory and merely exists here as a symbolic link, or if someone
has compressed the file (or even renamed it). There is one thing you know: it's definitely an ASCII file. Here is what you can do:
file -Lz * | grep ASCII | cut -d":" -f1 | xargs ls -ltr
This command checks the ASCII files, even if they are compressed, and lists them in
chronological order.
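A minimal sketch of that pipeline, using made-up filenames in a scratch directory (on a real system you would run it in the user dump destination directory instead):

```shell
cd "$(mktemp -d)"                                    # scratch directory
printf 'ORA-00600: some trace text\n' > demo.trc     # a plain ASCII "trace" file
printf 'more trace text\n' | gzip > old.trc.gz       # a compressed ASCII file
printf '\000\001\002' > notes.dat                    # a binary file, for contrast
# keep only the ASCII files (following links, peeking inside compressed ones)
file -Lz * | grep ASCII | cut -d":" -f1
```

Only demo.trc and old.trc.gz survive the filter; the binary file is screened out.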
Comparing Files
How do you find out if two files, file1 and file2, are identical? There are several ways, and each approach has its own appeal.
diff. The simplest command is diff, which shows the difference between two files. Here are the contents of the two files:
# cat file1
In file1 only
In file1 and file2
# cat file2
In file1 and file2
In file2 only
If you use the diff command,
you will be able to see the difference between the files as shown
below:
# diff file1 file2
1d0
< In file1 only
2a2
> In file2 only
#
In the output, a "<" in the first column indicates that the line exists only in the file mentioned first, that is, file1. A ">" in that place indicates that the line exists only in the second file (file2). The characters 1d0 in the first line of the output are ed-style editing instructions describing what must be done to file1 to make it the same as file2.
Another option, -y, shows the same output, but side by side:
# diff -y file1 file2 -W 120
In file1 only                     <
In file1 and file2                   In file1 and file2
                                  >  In file2 only
The -W option is optional; it merely instructs the command to use a 120-character-wide screen, useful for files with long lines.
If you just want to know whether the files differ, not necessarily how, you can use the -q option.
# diff -q file3 file4
# diff -q file3 file2
Files file3 and file2 differ
Files file3 and file4 are the same so there is no output; in the other case, the fact that the files
differ is reported.
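In a script you rarely need the message at all: diff sets an exit code (0 when the files match, 1 when they differ, 2 on trouble such as a missing file), which a conditional can test directly. A small sketch with throwaway files:

```shell
cd "$(mktemp -d)"
printf 'same content\n' > file3
cp file3 file4                      # identical copy
printf 'different\n'  > file2       # differing file
# exit code 0 = identical, 1 = different; the message itself is discarded
diff -q file3 file4 > /dev/null && echo "file3 file4: same"
diff -q file3 file2 > /dev/null || echo "file3 file2: differ"
```

This is the usual way to drive a branch in a shell script without parsing any text.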
If you are writing a shell script, it might be useful to produce the output in such a manner that it
can be parsed. The -u option does that:
# diff -u file1 file2
--- file1       2006-08-04 08:29:37.000000000 -0400
+++ file2       2006-08-04 08:29:42.000000000 -0400
@@ -1,2 +1,2 @@
-In file1 only
 In file1 and file2
+In file2 only
The output shows the contents of both files but suppresses duplicates; the + and - signs in the first column indicate lines present in only one of the files. No character in the first column indicates presence in both files.
The command takes whitespace into consideration. If you want to ignore whitespace, use the -b option. Use the -B option to ignore blank lines. Finally, use -i to ignore case.
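Returning to the unified format for a moment: since every changed line starts with a single + or -, a script can count insertions and deletions with grep. A sketch with throwaway files:

```shell
cd "$(mktemp -d)"
printf 'In file1 only\nIn file1 and file2\n' > file1
printf 'In file1 and file2\nIn file2 only\n' > file2
# ^+[^+] and ^-[^-] skip the "+++"/"---" header lines of the unified format
added=$(diff -u file1 file2 | grep -c '^+[^+]')
removed=$(diff -u file1 file2 | grep -c '^-[^-]')
echo "added=$added removed=$removed"
```

This kind of counting is handy, for example, when summarizing how much two init.ora versions drifted apart.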
The diff command can also be applied to directories. The command
diff dir1 dir2
shows the files present in either directory, whether a file exists in only one of the directories or in both. If it finds a subdirectory of the same name in both, it does not go down to check whether any individual files differ. Here is an example:
# diff DBA102 PROPRD
Common subdirectories: DBA102/adump and PROPRD/adump
Only in DBA102: afiedt.buf
Only in PROPRD: archive
Only in PROPRD: BACKUP
Only in PROPRD: BACKUP1
Only in PROPRD: BACKUP2
Note that the common subdirectories are simply reported as such but no comparison is made. If
you want to drill down even further and compare files under those subdirectories, you should use
the following command:
diff -r dir1 dir2
This command recursively goes into each subdirectory to compare the files and reports the
difference between the files of the same names.
Tip for Oracle Users
One common use of diff is to differentiate between different versions of init.ora files. As a best practice, I always copy the file to a new name, e.g., initDBA102.ora to initDBA102.080306.ora (to indicate August 3, 2006), before making a change. A simple diff between all versions of the file tells quickly what changed and when.
This is a pretty powerful command to manage your Oracle Home. As a best practice, I never update an Oracle Home when applying patches. For instance, suppose the current Oracle version is 10.2.0.1. The ORACLE_HOME could be /u01/app/oracle/product/10.2/db1. When the time comes to patch it to 10.2.0.2, I don't patch this Oracle Home. Instead, I start a fresh installation on /u01/app/oracle/product/10.2/db2 and then patch that home. Once it's ready, I shut the database down, point the environment to the new home, and start it with:
# sqlplus / as sysdba
and so on.
The purpose of this approach is that the original Oracle Home is not disturbed and I can easily fall back in case of problems. It also means the database is down and up again pretty much immediately. If I installed the patch directly on the Oracle Home, I would have had to shut the database down for a long time, for the entire duration of the patch application. In addition, if the patch application had failed for any reason, I would not have a clean Oracle Home.
Now that I have several Oracle Homes, how can I see what changed? It's really simple; I can use:
diff -r /u01/app/oracle/product/10.2/db1 /u01/app/oracle/product/10.2/db2 |
grep -v Common
This tells me the differences between the two Oracle Homes and the differences between the files
of the same name. Some important files like tnsnames.ora, listener.ora, and sqlnet.ora should not
show wide differences, but if they do, then I need to understand why.
cmp. The command cmp is similar to diff:
# cmp file1 file2
file1 file2 differ: byte 10, line 1
The output reports the first point of difference. You can use this to identify where the files might be different. Like diff, cmp has a lot of options, the most important being the -s option, which produces no output but merely returns an exit code: 0, if the files are identical; 1, if they differ; 2 on trouble, such as a missing file.
Here is an example:
# cmp -s file3 file4
# echo $?
0
The special variable $? indicates the return code from the last executed command. In this case it's 0, meaning the files file3 and file4 are identical.
# cmp -s file1 file2
# echo $?
1
Here the return code is 1, since file1 and file2 differ.
Summary of Commands in This Installment

Command   Use
--------  ---------------------------------------------------------------
chmod     To change permissions of a file, using the --reference parameter
chown     To change the owner of a file, using the --reference parameter
chgrp     To change the group of a file, using the --reference parameter
stat      To see the extended attributes of a file or filesystem, such as
          timestamps, block counts, and permissions
file      To see the type of a file, such as ASCII text or binary data
diff      To see the difference between two files
cmp       To compare two files
comm      To see what's common between two files, with the output in three
          columns
md5sum    To calculate the MD5 checksum of a file
md5sum. This command generates a 128-bit MD5 hash value of a file, displayed as 32 hexadecimal characters:
# md5sum file1
ef929460b3731851259137194fe5ac47
file1
Two files with the same checksum can be considered identical. However, the usefulness of this
command goes beyond just comparing files. It can also provide a mechanism to guarantee the
integrity of the files.
Suppose you have two important files, file1 and file2, that you need to protect. You can use the --check option to confirm the files haven't changed. First, create a checksum file for both these important files and keep it safe:
# md5sum file1 file2 > f1f2
Later, when you want to verify that the files are still untouched:
# md5sum --check f1f2
file1: OK
file2: OK
This shows clearly that the files have not been modified. Now change one file and check the
MD5:
# cp file2 file1
# md5sum --check f1f2
file1: FAILED
file2: OK
md5sum: WARNING: 1 of 2 computed checksums did NOT match
Along the same lines, you can also create MD5 checksums for all the executables in $ORACLE_HOME/bin and compare them from time to time to detect unauthorized modifications.
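That idea can be sketched like this; a scratch directory stands in for the real $ORACLE_HOME/bin to keep the example self-contained:

```shell
bindir=$(mktemp -d)                      # stands in for $ORACLE_HOME/bin here
printf 'executable one' > "$bindir/tool1"
printf 'executable two' > "$bindir/tool2"
# take the baseline once, and store the checksum file somewhere safe
md5sum "$bindir"/* > /tmp/oh_bin.md5
# later: verify that nothing has been modified since the baseline
md5sum --check /tmp/oh_bin.md5
```

Any tampered executable would show up as FAILED in the verification pass, just as in the file1/file2 example above.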
Conclusion
Thus far you have learned only some of the Linux commands you will find useful for performing
your job effectively. In the next installment, I will describe some more sophisticated but useful
commands, such as strace, whereis, renice, skill, and more.
Arup Nanda ( arup@proligence.com ) has been an Oracle DBA for more than 12 years, handling all aspects of database administration, from performance tuning to security and disaster recovery. He is a coauthor of PL/SQL for DBAs (O'Reilly Media, 2005), was Oracle Magazine's DBA of the Year in 2003, and is an Oracle ACE.
The Problem
It seems really simple. We have an Oracle database (on all nodes of a full-rack Exadata, to be exact), which a lot of end users connect to through apps designed in a rather ad hoc and haphazard manner: Excel spreadsheets, Access forms, TOAD reports, and other assorted tools. We want to control the access from these machines and streamline it.
The database machine sits behind a firewall. Allowing the ad hoc tools to access the database from the client machines means we have to change the firewall rules. Had it been one or two clients, that would have been reasonable; but with 1000+ client machines, it becomes impractical.
So I was asked to provide an alternative solution.
The Solution
This is not a unique problem; it arises whenever machines need to access resources that sit across firewalls. The easy solution is to punch a hole through the firewall to allow that access; but that is not desirable, for obvious security reasons. A better solution, often implemented, is to have a proxy server. The proxy sits between the two layers of access and can reach the servers behind the firewall. Clients make their requests to the proxy, which passes them on to the server.
Such a proxy solves the problem; but we are looking for a simpler solution. Does one exist?
Yes, it does. The answer is Connection Manager from Oracle. Among its many functions, one stands out: it acts as a proxy between the different layers of access and passes through the requests. It's not a separate product; it's an option in the Oracle Client software (not the database or grid infrastructure software). This option is not automatically installed: when installing the client software, choose "Custom" and explicitly select "Connection Manager" from the list.
The Architecture
Let's quickly go through the architecture of the tool. Assume there are three hosts:
client1 - the machine where the client runs and wants to connect to the database
cmhost1 - the machine running the Connection Manager processes
dbhost1 - the database server the client ultimately needs to reach
Ordinarily, the client would use a TNS alias pointing straight at the database host:
TNS_REG =
(DESCRIPTION =
(ADDRESS =
(PROTOCOL = TCP)(HOST = dbhost1)(PORT = 1521)
)
(CONNECT_DATA =
(SERVICE_NAME=srv1)
)
)
However, this will not work, since the client does not even know the routing for dbhost1. Instead, the client connects to the CM host. Connection Manager has two processes: the gateway (CMGW) and the admin (CMADMIN) process. The CMGW processes let the client connections come in through them; the admin process manages the gateways. We will cover more on that later.
After setting up the CM processes, you will need to rewrite the TNSNAMES.ORA in the
following way:
TNS_CM =
(DESCRIPTION =
(SOURCE_ROUTE = YES)
(ADDRESS =
(PROTOCOL = TCP)(HOST = cmhost1)(PORT = 1950)
)
(ADDRESS =
(PROTOCOL = TCP)(HOST = dbhost1)(PORT = 1521)
)
(CONNECT_DATA =
(SERVICE_NAME=srv1)
)
)
How it Works
Note the special parameter:
SOURCE_ROUTE = YES
This tells the client to attempt the addresses in the order listed: the first address first, and only then the next one. This is different from load-balancing setups, where you would expect the client tool to pick one of the addresses at random. So the client attempts this address first:
(PROTOCOL = TCP)(HOST = cmhost1)(PORT = 1950)
This is the listener for Connection Manager. The clients are allowed to connect to the port 1950
(the port where CM listener listens on) on the host cmhost1.
After that the connection attempts the second address
(PROTOCOL = TCP)(HOST = dbhost1)(PORT = 1521)
Left to itself, this attempt would fail, since the client does not have access to port 1521 of the host dbhost1.
This is where CM comes in. The client does not make the request; CM does on behalf of the
client connection that just came in. The connection manager (running on cmhost1) makes the
request with that address. Since the host cmhost1 can access dbhost1 on port 1521, that
connection request goes through successfully. When the response comes back from the database,
CM passes it back to the original client.
A single CM connection can handle many client connection requests.
Setting Up
Now that you know how CM works, let's see how to enable it, step by step.
(1) Install CM, if you don't have it already. Check for a file cmctl under $ORACLE_HOME/bin. If you have it, CM may already be installed. If not, install CM by running the installer from the Oracle Client (not Database or Grid Infrastructure) software. Choose Custom Install and explicitly select Connection Manager.
(2) Go to $OH/network/admin (remember the $OH of the client software home; not the database
home)
(3) You need to create a configuration file called cman.ora. Instead of creating it from scratch, go
to the samples subdirectory and copy the cman.ora sample file back into the admin directory.
(4) In the file cman.ora, make the changes to the following lines. Of course, I assumed cmhost1 as the server running the Connection Manager process; substitute whatever name you choose for the CM server. I also assumed you would use port 1950 for the CM listener. It does not have to be that port, but whatever you choose will need to be opened in the firewall.
The lines you are changing will be at the beginning and the end of the cman.ora file.
cman_cmhost1 =
(configuration=
(address=(protocol=tcp)(host=cmhost1)(port=1950))
(parameter_list =
...
...
...
# conn_stats = connect_statistics
(rule_list=
(rule=
(src=*)(dst=*)(srv=*)(act=accept)
(action_list=(aut=off)(moct=0)(mct=0)(mit=0)(conn_stats=on))
)
)
)
Keep the remaining lines as is, for now. I will explain the meaning of these parameters later.
(5) Start the CM command-line interface by executing "cmctl":
# cmctl
CMCTL for Linux: Version 11.2.0.1.0 - Production on 29-AUG-2011 15:16:01
Copyright (c) 1996, 2009, Oracle.
Trace Level               OFF
Instance Config file      /opt/oracle/product/11gR2/client1/network/admin/cman.ora
Instance Log directory    /opt/oracle/product/11gR2/client1/network/log
Instance Trace directory  /opt/oracle/product/11gR2/client1/network/trace
If it does not start, refer to the troubleshooting section later in this blog.
(9) Make the TNSNAMES.ORA file change at the client as shown earlier.
(10) Make the connection using this new TNS connection alias:
C:\> sqlplus arup/arup@TNS_CM
You should be able to connect to the database server now. Note, you still can't access the
database host directly. If you use the regular TNS connect string - TNS_REG - you will fail. This
new connection was established through the connection manager.
(11) Check the number of connections coming through the CM, using CMCTL tool:
CMCTL:CMAN_cmhost1> show connections
Number of connections: 1.
The command completed successfully.
The output shows there is one connection through the CM listener. As you connect more, you
will see the number next to "Number of connections:" increasing.
That's it. You have successfully configured Connection Manager interface.
Fine Tuning
In the previous setup I asked you to enter some values without really explaining their significance. Let's go through them now.
One of the powerful features of the CM interface is that it can act as a sort of firewall, i.e. allow connections only from/to certain hosts and for specific services. You define these rules inside the rule_list section, as shown below:
(rule_list=
(rule=
(src=x)(dst=x)(srv=*)(act=accept)
(action_list=(aut=off)(moct=0)(mct=0)(mit=0)(conn_stats=on))
)
)
Here are the parameters and what they mean:
src = the source server the connection request comes from. If you want to leave it unrestricted, use "*" as a wildcard.
dst = the destination server, typically the database server the request goes to. Again, unrestricted access is given as "*".
For the src and dst parameters you can give hostnames or IP addresses, as well as wildcards. You would use this section to allow or deny access between specific servers, making it a really powerful firewall-like tool.
The action_list parameter allows you to fine tune the actions on the connection.
aut = whether the Oracle Advanced Security Option authentication filter should be applied. The value shown here is off, meaning it is not to be applied.
moct = the maximum outbound connect time: how long before an outbound connection attempt times out. The value set here is 0, meaning it is never timed out.
mct = the maximum connect time: how long before the session is disconnected. The value is 0, i.e. never.
mit = the maximum idle time: how long an idle connection is allowed before being disconnected; again, 0 means never.
conn_stats = whether connection statistics are recorded in the log; here it is set to on.
Note the use of parentheses. You can use different rules and actions for each combination of sources and destinations, which allows you to fine-tune the access. For instance, suppose database D1 is highly secure and you want the ASO filter there, but not on database D2. For requests coming from the same client, you can have a different set of actions for each destination; for D1, the more secure database host, you can also establish various timeouts.
CMCTL Primer
Now that you know about the CMAN.ORA file, let's see the activities you can perform in CMCTL. The first command you should explore is "help":
CMCTL> help
The following operations are available
An asterisk (*) denotes a modifier or extended command:

administer     close*        exit          quit
reload         resume*       save_passwd   set*
show*          shutdown      sleep         startup
suspend*

The show* and set* commands accept further modifiers, among them: connections, parameters, version, defaults, rules, events, services, connection_statistics, idle_timeout, log_directory, outbound_connect_timeout, session_timeout, and trace_level.
Most of these are self-explanatory: show status shows the status of CM, show connections shows the connections established through CM, and so on.
Troubleshooting
Of course, things may not go well the first time. Don't despair: you can perform extensive diagnostics and enable logging and tracing. The most common errors tend to show up during startup.
Takeaways
Connection Manager is a great product from the Oracle Net family that can, among many other things, act as a connection concentrator for multiple client requests, as a rule-based mini-firewall for database requests, and as a proxy between different access domains. Here you learned how to set it up, fine-tune its parameters, and manage it effectively.
Hope you liked it. As always, please provide your feedback.
Difference between Select Any Dictionary and Select_Catalog_Role
When you want to give a user the privilege to select from data dictionary and dynamic performance views such as V$DATAFILE, you have two options:
grant the SELECT ANY DICTIONARY system privilege
grant the SELECT_CATALOG_ROLE role
Did you ever wonder why there are two options for accomplishing the same
objective? Is one of them redundant? Won't it make sense for Oracle to have just
one privilege? And, most important, do these two privileges produce the same
result?
The short answer to the last question is: no, these two do not produce the same result. Since they are fundamentally different, there is a place for each of them; one is not a replacement for the other. In this blog I will explain the subtle but important differences between the two seemingly similar privileges and how to use them properly.
Create the Test Case
First let me demonstrate the effects by a small example. Create two users called
SCR and SAD:
SQL> create user scr identified by scr;
SQL> create user sad identified by sad;
Grant the necessary privileges to these users, taking care to grant a different one to each user.

SQL> grant create session, select any dictionary to sad;

Grant succeeded.

SQL> grant create session, select_catalog_role to scr;

Grant succeeded.
Both users have the privilege to select from the dictionary views as we expected.
So, what is the difference between these two privileges? To understand that, let's create a procedure referencing a dictionary view in each schema. Since we will create the same procedure twice, let's first put it in a script, which we will call p.sql; the script creates a procedure that selects from V$SESSION. When you connect as SAD and run the script, the procedure compiles cleanly:

Procedure created.

But when you connect as SCR and execute the same script:
SQL> @p.sql
LINE/COL ERROR
-------- ------------------------------------------------
4/2      PL/SQL: SQL Statement ignored
6/7      PL/SQL: ORA-00942: table or view does not exist
That must be perplexing. We just saw that the user has the privilege to select from the V$SESSION view; you can double-check that by selecting from the view one more time. So why did it report ORA-00942: table or view does not exist?
Not All Privileges have been Created Equal
The answer lies in the way Oracle performs compilations. To compile code referencing a named object, the user must have been granted privileges on that object by direct grants, not through roles. Selecting or performing DML does not care how the privileges were received; the SQL will work as long as the privileges are there. But compilation does care. The privilege SELECT ANY DICTIONARY is a system privilege, similar to CREATE SESSION or UNLIMITED TABLESPACE, so it counts during compilation as well. This is why the user SAD, which had the system privilege, could successfully compile the procedure P.
The user SCR had the role SELECT_CATALOG_ROLE, which allowed it to select from V$SESSION but not to create the procedure. Remember, to create an object on a base object, the user must have a direct grant on the base object, not one received through a role. Since SCR had the role but not a direct grant, it can't compile the procedure.
So while both the privileges allow the users to select from v$datafile, the role does
not allow the users to create objects; the system privilege does.
Why the Role?
Now that you know how the privileges differ, you may be wondering why the role is there at all. It seems that the system grant can do everything and there is no need for a role. Not quite: the role has a very different purpose. Roles provide privileges, but only when they are enabled. To see which roles are enabled in a session, use this query:

SQL> select role from session_roles;

ROLE
------------------------------
SELECT_CATALOG_ROLE
HS_ADMIN_SELECT_ROLE
2 rows selected.
Just because a role was granted to the user does not necessarily mean that the role is enabled. The roles marked DEFAULT for the user are enabled automatically; the others are not. Let's see that with an example. As the SYS user, execute the following:

SQL> alter user scr default role none;

User altered.

Now connect as the SCR user and check which roles have been enabled:

SQL> select role from session_roles;

no rows selected
None of the roles has been enabled. Why? Because none of them is a default role for the user any longer, the effect of the alter user statement issued by SYS. At this point, if you select from a data dictionary or dynamic performance view, you will get the ORA-00942 error, because the role is not enabled, or active. Without the role, the user does not have any privilege to select from the data dictionary or dynamic performance views. To enable the role, the user has to execute the SET ROLE command:

SQL> set role select_catalog_role;

Role set.
SQL> select role from session_roles;

ROLE
------------------------------
SELECT_CATALOG_ROLE
HS_ADMIN_SELECT_ROLE

2 rows selected.
Now the roles have been enabled. Since the roles are not default, the user must explicitly enable them using the SET ROLE command. This is a very important characteristic of roles: we can control how the user gets the privilege. Merely granting a role to a user does not enable the role; the user's action is required, and that action can be performed programmatically. In security-conscious environments, you may want to take advantage of that property: the user does not hold the privilege at all times, but can acquire it when needed.
The SET ROLE command is a SQL statement; to issue it from within PL/SQL, use:
begin
   dbms_session.set_role ('SELECT_CATALOG_ROLE');
end;
/
You can also set a password for the role, so that it can be enabled only when the correct password is given. For example (the password here is just an illustration):

SQL> alter role select_catalog_role identified by s3cr3t;

Role altered.

SQL> set role select_catalog_role identified by s3cr3t;

Role set.
You can also revoke the execute privilege on dbms_session from public. After that
the user will not be able to use it to set the role. You can construct another wrapper
procedure to call it. Inside the wrapper, you can have all sorts of checks and balances to make sure the call is acceptable.
We will close this discussion with a tip: how do you know which roles are default? Simply query DBA_ROLE_PRIVS for the grantee:

SQL> select granted_role, default_role
  2  from dba_role_privs
  3  where grantee = 'SCR';

GRANTED_ROLE                   DEF
------------------------------ ---
SELECT_CATALOG_ROLE            NO
Conclusion
In this blog entry I started with a simple question: what is the difference between two seemingly similar privileges, SELECT ANY DICTIONARY and SELECT_CATALOG_ROLE? The former is a system privilege, which remains active throughout the session and allows the user to create stored objects on the objects the grant covers. The latter is not a system grant; it's a role, and it does not allow the grantee to build stored objects on the granted objects. The role can also be non-default, which means the grantee must execute SET ROLE or an equivalent command to enable it. The role can also be password protected, if desired.
The core message you should get from this is that roles are different from privileges.
Privileges allow you to build stored objects such as procedures on the objects on
which the privilege is based. Roles do not.
Nulls in Ordering
You want to find out the tables with the highest number of rows in a database.
Pretty simple, right? You whip up the following query:
select owner, table_name, num_rows
from dba_tables
order by num_rows desc;
Whoa! The NUM_ROWS column comes up with blanks; actually, they are nulls. Why are they coming up first? Because these tables have not been analyzed. CRM_ETL seems like an ETL user; the tables with GTT_ in their names seem to be global temporary tables, hence there are no statistics. The others belong to SYS and SQLTXPLAIN, which are Oracle default users and probably never analyzed. Nulls are not comparable to actual values, so they are neither less than nor greater than the others; Oracle sorts them as if they were the highest values, which is why they come up first in a descending ordered list.
You need to find the tables with the highest number of rows fast. If you scroll down, you will eventually reach those rows, but it takes time and makes you impatient. You could add a predicate such as where num_rows is not null, but it's not really elegant: it still performs the null processing, and what if you want the table names with null num_rows as well? That construct would eliminate them. So you need a different approach.
Nulls Last
If you want to fetch the nulls but push them to the end of the list rather than the beginning, add a new clause to the order by: NULLS LAST, as shown below.

select owner, table_name, num_rows
from dba_tables
order by num_rows desc nulls last;

OWNER            TABLE_NAME         NUM_ROWS
---------------- --------------- -----------
CRM_ETL          F_SALES_SUMM_01  1664092226
CRM_ETL          F_SALES_SUMM_02   948708587
CRM_ETL          F_SALES_SUMM_03   167616243

This solves the problem: the nulls are still shown, but after the last of the rows with non-null num_rows values.
A question came up on my blog entry http://arup.blogspot.com/2011/01/more-on-interestedtransaction-lists.html. I think the question warrants a more comprehensive explanation instead of an answer of a few lines, so I decided to create another blog post.
Here was the question:
Could you please explain the scenario when multiple transactions try to update the same row as well. Will there be any ITL allocated? Yes, I am talking about the real locking scenario.
Paraphrased differently, the reader wants to know what happens when this series of events occurs:
1. update row 1 (locked by transaction 1, and occupying one ITL slot)
2. update row 2 (locked by transaction 2, occupying a different ITL slot)
3. Transaction 3 now wants to update either row 1 or row 2. It will hang of course. But will
it trigger the creation of a new ITL slot?
I also decided to expand the questions to cover one more scenario. Transaction 4 wants to update
row 1 and row 4 in the same statement. Row 4 is not locked; but row 1 is. So will transaction 4
be allowed to lock row 4, even though the statement itself will hang? Will it trigger the creation
of another ITL?
Examination
Let's examine these questions via a case study. To demonstrate, let me create a table with three rows:
SQL> create table itltest2 (col1 number, col2 number)
2 /
Table created.
SQL> insert into itltest2 values (1,1);
1 row created.
SQL> c/1,1/1,2
1* insert into itltest2 values (1,2)
SQL> /
1 row created.
SQL> c/1,2/2,2
1* insert into itltest2 values (2,2)
SQL> /
1 row created.
SQL> commit;
Now open three sessions and issue different statements.

Session1> update itltest2 set col2 = col2 + 1 where col1 = 1;

2 rows updated.

If you check the transaction ID from Session 1, you will see the transaction details:

SQL> select dbms_transaction.local_transaction_id from dual;

LOCAL_TRANSACTION_ID
--------------------
7.10.33260
Now, in Session 2, update rows 2 and 3 (both have col2 = 2); the exact SET clause is immaterial, for example:

Session2> update itltest2 set col1 = col1 + 1 where col2 = 2;

This will hang. The reason is obvious: the transaction is trying to get a lock on rows 2 and 3. Since row 2 is already locked by transaction 1, it can't be locked. However, what about row 3? It could have been locked. Was it? Let's make a simple check by updating only row 3, the row transaction 2 also attempted to lock, from another session:
Session3> update itltest2 set col2 = col2 + 1 where col1 = 2 and col2 = 2;
1 row updated.
We know that there are three transactions and three lock requests. Or, are there? Let's check in
V$TRANSACTION:
SQL> select XIDUSN, XIDSLOT, XIDSQN
  2  from v$transaction;

    XIDUSN    XIDSLOT     XIDSQN
---------- ---------- ----------
         7         10      33260
        10          4      33214
There are only two transactions that have placed locks. If you combine XIDUSN, XIDSLOT and XIDSQN, separated by periods, you get the transaction IDs shown earlier. The transaction that is hanging has not placed a lock even on the row it could have locked. That is consistent with the atomicity of statements inside transactions: either all the rows are updated or none of them, not a piecemeal subset. If one of the rows can't be locked, none of them will be.
What about ITL slots? Let's see them by doing a block dump. First we need to know which block the rows are in; dumping that block shows the ITL entries:
Itl           Xid                  Uba                Flag  Lck        Scn/Fsc
0x01   0x000a.004.000081be  0x00c004fe.1873.23  ----    1  fsc 0x0000.00000000
0x02   0x0007.00a.000081ec  0x00c00350.194c.18  ----    2  fsc 0x0000.00000000
There are just two ITL slots; not three. Remember the XID column is in hexadecimal. If you
convert the XID columns in the v$transaction view:
SQL> select
  2     to_char(XIDUSN,'XXXXXX'),
  3     to_char(XIDSLOT,'XXXXXX'),
  4     to_char(XIDSQN,'XXXXXX')
  5  from v$transaction;

TO_CHAR TO_CHAR TO_CHAR
------- ------- -------
      7       A    81EC
      A       4    81BE
Note how the output matches the entries under the column marked "Xid" in the ITL output: you
saw the same transaction IDs in the ITL slots. There are just two ITL slots, and each slot points
to a transaction that has placed a lock. The transaction that has not placed a lock is not given an
ITL slot; there is no need for one.
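The decimal-to-hex correspondence is easy to verify programmatically. A small sketch; the 4/3/8-digit field widths are inferred from the dump output above:

```python
def xid_from_vtransaction(xidusn, xidslot, xidsqn):
    """Format the decimal XIDUSN.XIDSLOT.XIDSQN from V$TRANSACTION
    the way a block dump prints the ITL Xid. The field widths
    (4, 3 and 8 hex digits) are inferred from the dumps above."""
    return "0x%04x.%03x.%08x" % (xidusn, xidslot, xidsqn)

# The two transactions from the demo:
print(xid_from_vtransaction(7, 10, 33260))   # -> 0x0007.00a.000081ec
print(xid_from_vtransaction(10, 4, 33214))   # -> 0x000a.004.000081be
```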
Lock Change
Now suppose transactions 1 and 3 have ended, by either commit or rollback. Transaction 2, which
was hanging until now, is free to acquire its locks. Let's see the ITL slots:
Itl           Xid                  Uba          Flag  Lck        Scn/Fsc
0x01   0x0008.00f.0000a423  0x00c013ce.1e11.05  C---    0  scn 0x0000.0244bcb2
0x02   0x0006.005.0000a43f  0x00c008fe.1d22.12  ----    2  fsc 0x0000.00000000
If you examine the hexadecimal values of the XID columns from V$TRANSACTION:

SQL> select
  2     to_char(XIDUSN,'XXXXXX'),
  3     to_char(XIDSLOT,'XXXXXX'),
  4     to_char(XIDSQN,'XXXXXX')
  5  from v$transaction;
This matches the transaction ID we see in the "Xid" column of the ITL slot. The other ITL slot is
now free of any lock.
COMPLETIO
---------
18-JUL-08
19-JUL-08
20-JUL-08
21-JUL-08
22-JUL-08
Bingo! BCT use ceased on the 20th of July. That was why the
whole file was being scanned. But why did it stop? No one actually stopped it.
Investigating further, I found this in the alert log of Node 1:
Sun Jul 20 00:23:52 2008
CHANGE TRACKING ERROR in another instance, disabling change tracking
Block change tracking service stopping.
From Node 2:
Sun Jul 20 00:23:51 2008
CHANGE TRACKING ERROR in another instance, disabling change tracking
Block change tracking service stopping.
Alert log of Node 3 showed the issue:
Sun Jul 20 00:23:50 2008
Unexpected communication failure with ASM instance:
ORA-12549: TNS:operating system resource quota exceeded
CHANGE TRACKING ERROR 19755, disabling change tracking
Sun Jul 20 00:23:50 2008
Errors in file /xxx/oracle/admin/XXXX/bdump/xxx3_ctwr_20729.trc:
ORA-19755: could not open change tracking file
ORA-19750: change tracking file: '+DG1/change_tracking.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DG1/change_tracking.dbf
ORA-12549: TNS:operating system resource quota exceeded
Block change tracking service stopping.
The last message shows the true error: the operating system resource
quota was exceeded, making the diskgroup unavailable. Since the ASM diskgroup was
down, all of its files became unavailable as well, including the BCT file. Surprisingly, Oracle
decided to stop BCT altogether rather than report it as a problem and let the user
decide the next steps. So block change tracking was silently disabled, and the DBAs
didn't get a hint of it. Ouch!
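Because the disabling is silent, the alert log is the only place the evidence shows up, so a periodic scan can be scripted. A minimal sketch in Python; the marker strings come from the log excerpts above, and the log path is yours to supply:

```python
def find_bct_errors(alert_log_path):
    """Return (line_number, text) pairs for alert-log lines that
    indicate block change tracking was disabled. The marker strings
    are taken from the alert-log excerpts above."""
    markers = ("CHANGE TRACKING ERROR",
               "Block change tracking service stopping")
    hits = []
    with open(alert_log_path) as log:
        for num, line in enumerate(log, 1):
            if any(marker in line for marker in markers):
                hits.append((num, line.rstrip()))
    return hits
```

Run it against each instance's alert log; any hit means incrementals may have quietly fallen back to full scans.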
Resolution
Well, now that we had discovered the issue, we took the necessary steps to correct it.
Because of the usual change control process, it took some time to get the change
approved and put in place. We executed the following to re-create the BCT file:
alter database enable block change tracking using file '+DG1/change_tracking.dbf';
The entry in the alert log (on all nodes) confirms it:
Block change tracking file is current.
But this did not solve the issue completely. To use block change tracking, there has
to be a baseline, which is generally a full backup. We never take a full backup; we
always take an incremental image copy and then merge it into a full backup in a
separate location. So the first order of business was to take a full backup. After that
we immediately took an incremental. It took just about an hour, down from some
18+ hours earlier.
Here is some analysis, looking at the backup of just one file - file# 1, i.e., the SYSTEM
datafile:
select COMPLETION_TIME, USED_CHANGE_TRACKING, BLOCKS, BLOCKS_READ
from v$backup_datafile
where file# = 1
order by 1
/
The output:
COMPLETIO USE     BLOCKS BLOCKS_READ
--------- --- ---------- -----------
18-AUG-08 NO       31713      524288
18-AUG-08 NO       10960      524288
20-AUG-08 NO       12764      524288
21-AUG-08 NO        5612      524288
22-AUG-08 NO       11089      524288
23-AUG-08 NO        8217      524288
23-AUG-08 NO        8025      524288
25-AUG-08 NO        3230      524288
26-AUG-08 NO        6629      524288
27-AUG-08 NO       11094      524288  <= the file size was increased
28-AUG-08 NO        3608      786432
29-AUG-08 NO        8199      786432
29-AUG-08 NO       12893      786432
31-AUG-08 YES       1798        6055
01-SEP-08 YES       7664       35411
Columns descriptions:
USE - was Block Change Tracking used?
BLOCKS - the number of blocks backed up
BLOCKS_READ - the number of blocks read by the backup
Note that when BCT was not used, the *entire* file - 524288 blocks - was
read every time. Of course, only a fraction of those blocks was actually backed up,
since only that fraction had changed; but the whole file was checked.
After BCT, note how the BLOCKS_READ figure dropped dramatically. That is
the magic behind the reduced time.
I wanted to find out exactly how much I/O savings BCT was bringing us. A simple
query would show that:
select sum(BLOCKS_READ)/sum(DATAFILE_BLOCKS)
from v$backup_datafile
where USED_CHANGE_TRACKING = 'YES'
/
The output:
.09581342
That's just 9.58%. After BCT, only 9.58% of the blocks of the datafiles were scanned!
Consider the impact of that. Before BCT, the entire file was scanned for changed
blocks. After BCT, only about 9.58% of the blocks were scanned for changed blocks.
Just 9.58%. How sweet is that?!!!
Here are three representative files:
File#  Blocks Read  Actual # of Blocks  Pct Read
-----  -----------  ------------------  --------
  985          109             1254400      .009
  986            1              786432      .000
  987            1             1048576      .000
Note that files 986 and 987 were virtually unread (only one block each). Before BCT,
all 1048576 blocks were read; after BCT, only one was. This makes perfect sense:
these files hold essentially older data, so nothing changes there. RMAN incrementals
are now blazing fast because they scan less than 10% of the blocks. The I/O problem
disappeared too, making overall database performance even better.
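The per-file arithmetic behind those percentages is just blocks read divided by total blocks. A quick check with the three representative files above:

```python
# (file#, blocks_read, total_blocks) for the three representative
# files shown in the table above
files = [(985, 109, 1254400), (986, 1, 786432), (987, 1, 1048576)]
for fno, blocks_read, total in files:
    pct = 100 * blocks_read / total
    print(f"file {fno}: {pct:.3f}% of blocks read")
# -> file 985: 0.009% of blocks read
#    file 986: 0.000% of blocks read
#    file 987: 0.000% of blocks read
```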
So, we started with a random I/O issue that caused a node failure, which led to
increased incremental backup times, which was tracked down to the block change
tracking file being silently disabled by Oracle without raising an error.
Takeaways:
The single biggest takeaway is this: just because BCT is defined, don't
assume it will stay enabled. A periodic check of the BCT file is a must. I
will work on developing an automated tool to check for non-use of the BCT file. The tool
will essentially issue:
SELECT count(1)
FROM v$backup_datafile
where USED_CHANGE_TRACKING = 'NO'
/
If the output is greater than zero, an alert should be issued. Material for the next blog.
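The alerting logic around that query is trivial to wire into a monitoring script. A sketch: the count would come from running the query above; here it is passed in as a number, and restricting the query to a recent COMPLETION_TIME window is my suggestion, not part of the original query.

```python
def bct_alert(backups_without_bct):
    """Return an alert string if any datafile backup ran without
    block change tracking, else None.

    `backups_without_bct` is the count returned by the query above.
    In practice you would restrict that query to a recent
    COMPLETION_TIME window so old, pre-BCT backups don't raise
    false alarms (my assumption, not part of the original query)."""
    if backups_without_bct > 0:
        return f"ALERT: {backups_without_bct} datafile backup(s) did not use BCT"
    return None

print(bct_alert(3))   # -> ALERT: 3 datafile backup(s) did not use BCT
print(bct_alert(0))   # -> None
```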
Thanks for reading.