AUDIO Options
You can:
- Either listen to the audio broadcast on your computer
- Or join the teleconference (dial in)
Objectives

Agenda
[Diagram: LUN / System LUN → Cell Disk → Grid Disk 1 … Grid Disk nn → ASM disks]
Physical Disk
A physical disk is an actual device within the storage cell that constitutes a single disk drive spindle.
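As a quick way to see the physical disks from the storage cell, a minimal sketch (attribute names assumed from CellCLI; verify on your image):

CellCLI> list physicaldisk attributes name, diskType, status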
Cell Disk
- A cell disk is an Oracle Exadata Cell abstraction that is built on top of a LUN
- It is a higher level of abstraction for the data storage on a physical disk
- It can be further divided into grid disks, which are directly exposed to ASM/DB
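A minimal sketch of listing cell disks and the device partition they sit on (the devicePartition attribute is assumed to be available on your CellCLI release):

CellCLI> list celldisk attributes name, devicePartition, size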
Grid Disk
A grid disk is a potentially noncontiguous partition of the cell disk that is directly exposed to ASM, to be used for ASM disk group creation.
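For illustration, a hedged sketch of how grid disks are typically carved out of cell disks (the prefix and size here are hypothetical, not taken from this deck):

CellCLI> create griddisk all harddisk prefix=DATA, size=2208G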
Grid Disk
To list grid disks and query their name, size, and offset:

CellCLI> list griddisk where celldisk=CD_06_dmorlcel05 attributes name, size, offset
DATA_DMORL_CD_06_dmorlcel05     2208G           32M
RECO_DMORL_CD_06_dmorlcel05     552.109375G     2208.046875G
DBFS_DG_CD_06_dmorlcel05        33.796875G      2760.15625G
Grid disks with a lower offset are placed on the outermost (faster) tracks.
See: Examples of Space Allocation for Grid Disks on an Exadata Database Machine (Doc ID 1513068.1)
SYSTEM LUN
[Diagram: system disks sda and sdb — sda3 / sdb3 (size 528 GB) form the cell disk; sda5 to sda11 / sdb5 to sdb11 (size 29 GB) hold the OS, Exadata SW, etc.]
[Diagram: cell disks on data LUNs (e.g. sdb, sdc) are divided into DATA, RECO, and DBFS grid disks]
Model: LSI MR9261-8i (scsi)
Disk /dev/sdad: 3000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      32.8kB  123MB   123MB   ext3         primary  raid
 2      123MB   132MB   8225kB               primary
 3      132MB   2964GB  2964GB               primary
 4      2964GB  2964GB  32.8kB               primary
 5      2964GB  2975GB  10.7GB  ext3         primary  raid
 6      2975GB  2985GB  10.7GB  ext3         primary  raid
 7      2985GB  2989GB  3221MB  ext3         primary  raid
 8      2989GB  2992GB  3221MB  ext3         primary  raid
 9      2992GB  2994GB  2147MB  linux-swap   primary  raid
10      2994GB  2995GB  732MB                primary  raid
11      2995GB  3000GB  5369MB  ext3         primary  raid

Information: Don't forget to update /etc/fstab, if necessary.
Model: LSI MR9261-8i (scsi)
Disk /dev/sdb: 3000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      32.8kB  123MB   123MB   ext3         primary  raid
 2      123MB   132MB   8225kB  ext2         primary
 3      132MB   2964GB  2964GB               primary
 4      2964GB  2964GB  32.8kB               primary
 5      2964GB  2975GB  10.7GB  ext3         primary  raid
 6      2975GB  2985GB  10.7GB  ext3         primary  raid
 7      2985GB  2989GB  3221MB  ext3         primary  raid
 8      2989GB  2992GB  3221MB  ext3         primary  raid
 9      2992GB  2994GB  2147MB  linux-swap   primary  raid
10      2994GB  2995GB  732MB                primary  raid
11      2995GB  3000GB  5369MB  ext3         primary  raid

Information: Don't forget to update /etc/fstab, if necessary.
MD (Multipath Device)
- MD devices are used to mirror the two system LUNs
- This system area contains the OS image, swap, Exadata SW, logs, and other config files
- /dev/md5 & /dev/md6 - system partition (root partition)
- /dev/md7 & /dev/md8 - software (Exadata software installation)
- /dev/md4 - boot
- /dev/md11 - /var/log/oracle - to store cellos and crash files, etc.
At any point, 4 md partitions will be mounted.
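To see the md devices and which of them are currently mounted, a minimal sketch (standard Linux commands; device numbers follow the layout above):

# cat /proc/mdstat            # software RAID status of all md devices
# df -h | grep /dev/md        # md partitions currently mounted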
MD (Multipath Device)
[Diagram: md devices mapped to the ROOT, Exadata SW, Boot, and cellos/crash areas]
Agenda
- Disk Layout at Storage cell side
- Disk Layout at ASM/DB node side
- Exadata Auto Management
- Replacing the failed/failing disk
- Troubleshooting disk replacement
- 3-disk RAID 5 with 1 global hot spare on images 11.2.3.1.1 and earlier
- 4-disk RAID 5 on images 11.2.3.2.0 and later
X2-8 Linux only (dual-boot Solaris image partition has been reclaimed):
- 7-disk RAID 5 with 1 global hot spare on images 11.2.3.1.1 and earlier
- 8-disk RAID 5 on images 11.2.3.2.0 and later
X3-8 Linux only: 8-disk RAID 5
NOTE: Both options only apply if the hot spare isn't already claimed as part of a previous (11.2.3.2.0) update. Customers who require their hot spare 'back' after a previous upgrade (to 11.2.3.2.0) need to reimage to a release < 11.2.3.2.0, then upgrade to 11.2.3.2.1, and then follow the steps in this note, guided by Oracle Support.
# /opt/oracle.SupportTools/reclaimdisks.sh -check
Hotspare reclaimed

# /opt/oracle.SupportTools/reclaimdisks.sh -check
Disk /dev/sda: 896.9 GB, 896998047744 bytes
255 heads, 63 sectors/track, 109053 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start      End      Blocks   Id  System
/dev/sda1   *         1       65      522081   83  Linux
/dev/sda2            66    36351   291467295   8e  Linux LVM
/dev/sda3         36352   109054   583985248   8e  Linux LVM
[Diagram (repeated on two slides): DATA, RECO, and DBFS grid disks in disk slots 3, 6, 8, and 11, shown from the ASM/DB node side]
Agenda
- Disk Layout at Storage cell side
- Disk Layout at ASM/DB node side
- Exadata Auto Management
- Replacing the failed/failing disk
- Troubleshooting disk replacement
What is automated?
What requires human intervention?
What is automated?
What requires human intervention?
- Diskgroup mount is not automated. If a diskgroup gets dismounted due to, say, loss of all physical mirrors, the ASM admin has to mount the diskgroup manually when those disks become accessible again.
- Disks taken offline by users should be brought back online manually after maintenance.
- Disks dropped by users should be added back manually.
XDMG and XDWK are new background processes which perform the auto management. Both are restartable processes.
For more detail, kindly refer to: Auto disk management feature in Exadata (Doc ID 1484274.1)
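To confirm these background processes are present on the ASM instance, a minimal sketch (assuming v$process exposes PNAME on your release; otherwise check from the OS):

SQL> select pname from v$process where pname in ('XDMG', 'XDWK');
$ ps -ef | egrep -i 'xdmg|xdwk'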
Agenda
- Disk Layout at Storage cell side
- Disk Layout at ASM/DB node side
- Exadata Auto Management
- Replacing the failed/failing disk
- Troubleshooting disk replacement
Disk Failure (Dead disk)
- The disk controller detects that the disk is dead
- Exadata auto management force drops the grid disks on the dead disk from the ASM disk group
- Need to replace the disk ASAP
Disk/Media Problem (Predictive failure)
Poor Performance
- MS detects a disk in poor performance
- MS moves the LUN to warning and the physical disk to predictive failure
- The cell disk and its associated grid disks go to proactive failure
- Cellsrv sends a message to force drop the grid disks from the ASM side. If the grid disks cannot be force dropped (due to offline partners), the disks will be dropped normally (the data gets relocated off the slow disk eventually).
Disk Status

                   Dead disk (critical)   Predictive failure   Poor performance
Lun status         warning                warning              warning
Cell disk status   not present            proactive failure    proactive failure
Grid disk status   not present            proactive failure    proactive failure
ASM action         force drop
Automatic Removal of Underperforming Disk
From 11.2.3.2 onwards, an underperforming disk can be removed from the active configuration.

                Normal status   1st Phase                  2nd Phase                   3rd Phase
Physical Disk   normal          warning - confinedOnline   warning - confinedOffline
Lun             normal          warning - confinedOnline   warning - confinedOffline
Cell Disk       normal          normal - confinedOnline    normal - confinedOffline    predictive failure
Grid Disk       active          active - confinedOnline    active - confinedOffline
ASM Action      None            None                       Take grid disk offline      Online the disk if the disks are fine,
                                                            if possible                 or drop force the disk if it detects
                                                                                        poor performance
Things to check before replacing/pulling the disk out
- First, identify the disk that needs to be replaced at the cell side: system disk or data disk?
- Identify the corresponding LUN, cell disk, and grid disks (see the CellCLI sketch below)
- If it is a system disk, check the partner's status in the mdadm software RAID
- Check whether the grid disks are online at the ASM side or not
- Check whether any rebalance is going on
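A minimal CellCLI sketch for mapping a physical disk to its LUN, cell disk, and grid disks, and for checking whether ASM can tolerate taking them offline (asmModeStatus/asmDeactivationOutcome are assumed to exist on your CellCLI release):

CellCLI> list physicaldisk attributes name, status
CellCLI> list lun attributes name, cellDisk, status
CellCLI> list griddisk attributes name, celldisk, asmModeStatus, asmDeactivationOutcome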
Replacing the Disk (Hard failure or Predictive failure)
If you have configured the system for alert notification, cellsrv would have sent an alert related to the disk failure. Below is an example of the email received.
Drop Physical Disk for Replacement
From 11.2.3.3.0 and later, execute this command to prepare the disk for replacement.
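The command referred to above (not reproduced in the original slide text) is the CellCLI drop-for-replacement form; a sketch with a hypothetical disk name:

CellCLI> alter physicaldisk 20:5 drop for replacement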
Cont..
11.2.3.2.1 and earlier versions (manual method to verify)

1. Check the grid disks' status on the ASM side.
Use the following query to validate that the grid disks are DROPPED from ASM for a proactive failure. For a hard failure (dead disk), the mode_status should be OFFLINE and the mount_status should be CLOSED.

col name format a30
col path format a40
col group_number format 99
col mount_status format a10
set linesize 200
select path, name, group_number, mount_status, mode_status
from v$asm_disk
where path like '%CD_07_dmorlcel06%';

PATH
o/192.168.10.6/SYSTEMDG_CD_07_dmorlcel06    CLOSED  OFFLINE
o/192.168.10.6/DATA_CD_07_dmorlcel06        CLOSED  OFFLINE
o/192.168.10.6/RECO_CD_07_dmorlcel06        CLOSED  OFFLINE
2. If a grid disk that needs to be replaced still shows online, drop those grid disks manually:

alter diskgroup <dg name> drop disk <grid disk name> rebalance power 11;

Note: the recommended rebalance power limit is 1 to 32 only.

3. Wait for the rebalance to complete, only for a normal drop (Predictive failure or online disk replacement only):

SQL> select * from gv$asm_operation;

If it returns no rows, then there is no rebalance going on currently.

4. Verify the MS process is running on the cell node before replacing the disk:

CellCLI> list cell attributes cellsrvStatus, msStatus, rsStatus detail
         cellsrvStatus:       running
         msStatus:            running
         rsStatus:            running
5. If it is a system disk, check the other system partner's status.
Run the command below to verify the MD device volume status:

for x in 1 2 4 5 6 7 8 11; do mdadm -Q --detail /dev/md$x; done

Sample output from one of the md devices (output truncated to highlight the required information):

# mdadm -Q --detail /dev/md5
/dev/md5:
    State : clean
    Number   Major   Minor   RaidDevice   State
       0       8       5        0         active sync   /dev/sda5
       1       0       0        1         removed
       2       8      21        -         faulty spare  /dev/sdb5

The most important things to check are the partner disk state and whether State is "clean", "clean, degraded", or "active". If it is "clean", it is safe to hot remove the disk. "active" means it is actively syncing the disk mirrors, and you should wait until it is "clean" before hot removing the disk. If the disk stays in the "active" state, then follow the steps in MOS Note 1524329.1 to set it to removed before replacing.
Perform the following steps to replace the physical disk:
- Locate the failed disk (the system disks are the leftmost in the system)
- Validate that the failed disk has the amber LED turned on, or the blue OK-to-Remove LED on
- Verify that the green LED begins to flicker as the system recognizes the new drive
When you replace a physical disk, the disk must be acknowledged by the RAID controller before it can be used. This does not take a long time, and you can use the LIST PHYSICALDISK command to monitor the status until it returns to normal.
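For example (a sketch; the disk name is hypothetical):

CellCLI> list physicaldisk 20:5 attributes name, status
         20:5    normal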
Post Replacement check
1. Validate that the cell disk and grid disks were created.
Status should be normal for the cell disk and active for the grid disks. This can be checked using the cellcli command line:

cellcli -e list celldisk
cellcli -e list griddisk

2. Connect to the ASM instance and identify the status of the rebalance operation:

SQL> select * from gv$asm_operation;

An active rebalance operation can be identified by STATE=RUN. The group_number and inst_id columns provide the disk group number of the disk group being rebalanced and the instance number where the operation is running.
Post Replacement check
3. Use the following query to validate that all failgroups have the same number of disks and the correct status (MODE_STATUS = ONLINE and MOUNT_STATUS = CACHED):

o/192.168.10.6/SYSTEMDG_CD_07_dmorlcel06    1   CACHED   ONLINE
o/192.168.10.6/DATA_CD_07_dmorlcel06        2   CACHED   ONLINE
o/192.168.10.6/RECO_CD_07_dmorlcel06        3   CACHED   ONLINE

If the disk shows as CANDIDATE, then you need to add it manually using the command below:

alter diskgroup <DG NAME> add disk '<disk path>' rebalance power 11;
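A sketch of such a validation query against v$asm_disk (grouping columns assumed; adapt as needed):

SQL> select group_number, failgroup, mode_status, mount_status, count(*)
     from v$asm_disk
     group by group_number, failgroup, mode_status, mount_status
     order by group_number, failgroup;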
Post Replacement check
4. If it is a system disk, check that the replaced disk synced with the other system partner disk:

for x in 1 2 4 5 6 7 8 11; do mdadm -Q --detail /dev/md$x; done

Below is the correct output if the md devices are synchronized. The important data to validate are the State, which should be clean, and the information at the bottom, where it shows the partitions with the other disk are in sync.
MOS Article References:
- Oracle Exadata Diagnostic Information required for Disk Failures and some other Hardware issues (Doc ID 761868.1)
- How to Replace a Hard Drive in an Exadata Storage Server (Hard Failure) (Doc ID 1386147.1)
- How to Replace a Hard Drive in an Exadata Storage Server (Predictive Failure) (Doc ID 1390836.1)
- Things to Check in ASM When Replacing an ONLINE disk from Exadata Storage Cell (Doc ID 1326611.1)
Agenda
- Disk Layout at Storage cell side
- Disk Layout at ASM/DB node side
- Exadata Auto Management
- Replacing the failed/failing disk
- Troubleshooting disk replacement
If it shows as "Unconfigured(bad)", "Spun down", or "FAILED", then the replaced disk is having an issue. Please contact Oracle Support.
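These statuses come from the LSI RAID controller; a sketch of how they are typically checked on a storage server (MegaCli path assumed):

# /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep -i 'firmware state'
Firmware state: Online, Spun Up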
Summary
What we covered today:
- Exadata Storage and database node disk layout
- Overview of Exadata Auto management
- Handling disk failure
- Things to check before replacing a disk
- Troubleshooting tips for disk replacement issues
Learn More
Available References and Resources to Get Proactive:
- About Oracle Support Best Practices: www.oracle.com/goto/proactivesupport
- Get Proactive in My Oracle Support: https://support.oracle.com | Doc ID: 432.1
- My Oracle Support Blog: https://blogs.oracle.com/supportportal/
- Ask the Get Proactive Team: get-proactive_ww@oracle.com
- Or directly: https://communities.oracle.com
Q & A

THANK YOU
Copyright © 2014, Oracle and/or its affiliates. All rights reserved.