
Oracle Platinum Services Fault Monitoring
What to Expect
Contents
Document Objective
Overview
Remote Fault Monitoring
Fault Monitoring Framework
Key Components of the Gateway
Managing the OASG
Oracle Advanced Support Portal
Fault Monitoring Details
Customer Requirement and Obligations
Fault Monitoring Roles and Responsibilities
Activities
Oracle Platinum Services Fault Monitoring Implementation Prerequisites
Oracle Platinum Services Fault Monitoring Implementation
Oracle Platinum Services Fault Monitoring Event for Oracle Exadata and Oracle Zero Data Loss Recovery Appliance
Oracle Platinum Services Fault Monitoring Events for Oracle SuperCluster, Oracle Exalogic, and Oracle Exadata
Appendix I: Oracle Platinum Services Fault Monitoring Events
ASR Fault Events
OEM Fault Events
Appendix II: Description of Common For-Fee Monitoring Items
Appendix III: Access Requirements
Oracle Access to Data
Appendix IV: Process Flow Diagrams
High-level Process Flow for Oracle SuperCluster, Oracle Exalogic, and Oracle Exadata

Updated: February 1, 2017 Page 2 of 34 Author: Oracle


Copyright 2017, Oracle and/or its affiliates. All rights reserved.
High-level Process Flow for Oracle Exadata and Oracle Recovery Appliance (OEM Auto Generated SRs)
Appendix V: Sample Service Request Details
SR Example I: OEM Detected Fault
SR Example II: ASR Detected Fault
Appendix VI: Sample Fault Notification Email
Appendix VII: Sample Notification of EM generated SR
Appendix VIII: Fault Event Telemetry and Configuration Data
Sample Fault Event Telemetry Data via OASP
Sample Configuration Item via OASP
Sample Configuration Item Drilldown
Oracle Collection Manager Collections
Sample Configuration Collection

Document Objective
The objective of this document is to provide an overview of the Oracle Platinum Services Remote Fault
Monitoring framework and detail a sample list of activities that may be performed to monitor a Certified
Platinum Configuration. The information included in this document is for informational purposes only
and is subject to change. This document is not binding on either party, will not be deemed an agreement
between the parties, and does not amend and/or modify the terms of any order or agreement.

Overview
Remote Fault Monitoring
Remote Fault Monitoring, referred to as Fault Monitoring in this document, is a deliverable of Oracle
Platinum Services. Oracle Platinum Services remotely monitors for faults in the hardware, database,
operating system and networking components of Certified Platinum Configurations twenty-four (24)
hours per day, seven (7) days per week and provides a mechanism to trigger the creation of a Service
Request (SR) on behalf of the customer. Fault Monitoring is subject to the Oracle Platinum Services
Technical Support Policy.

Please review the Oracle Platinum Services Technical Support Policy at
http://www.oracle.com/us/support/library/platinum-services-policies-1652886.pdf.
A list of Certified Platinum Configurations is available at
http://www.oracle.com/us/support/library/certified-platinum-configs-1652888.pdf.

Fault Monitoring focuses on helping you maintain system and component functionality. Oracle
determines whether an event constitutes a fault. For a list of Oracle Platinum Services fault monitoring
events, please see Appendix I.

You may purchase additional monitoring services for a fee. Examples of for-fee monitoring include but
are not limited to performance, availability and capacity monitoring. For a description of each, please see
Appendix II.

For assistance with for-fee monitoring, please contact acsdirect_us@oracle.com.

Fault Monitoring Framework
At the heart of the Fault Monitoring Framework is the Oracle Advanced Support Gateway (OASG). The
OASG is a multi-purpose platform designed to facilitate and enable a number of Oracle connected
services including Oracle Platinum Services.

One gateway can monitor multiple Engineered Systems (for example, up to eight (8) Full Rack
SuperCluster machines) as long as they are network accessible and the network connection between the
OASG and the Engineered System is reliable with low latency. In conjunction with the Oracle
Continuous Connection Network (OCCN) transport layer, the OASG establishes secure connectivity to
Oracle via SSL. Learn more about the gateway security by watching this video.

Key Components of the Gateway


The gateway has several key components that facilitate Fault Monitoring. These include:

- Oracle Enterprise Manager: Oracle Enterprise Manager (OEM) is the standard tool for
monitoring and managing Oracle products. With Oracle Platinum Services Fault Monitoring,
OEM is the primary tool for detecting software faults. OEM software included with the OASG
also includes rule-based fault detection functionality that automatically creates a Service Request
(SR) and uploads related diagnostics, when available, upon detection of critical OEM issues with
Exadata and Recovery Appliance. A client-side OEM agent is installed on the Certified Platinum
Configuration as a communication mechanism with OEM.
- Oracle Auto Service Request: Oracle Auto Service Request (ASR) is used to detect hardware
faults and automatically create the associated SR. ASR detects faults in compute nodes, storage
cells, and their Oracle Integrated Lights Out Managers (ILOM). For more information on ASR,
see Auto Service Request (ASR) documentation.
- Oracle Configuration Manager: Oracle Configuration Manager (OCM) captures Engineered
System configuration information and uploads the data to My Oracle Support. The configuration
data is extracted and uploaded every twenty-four (24) hours and is analyzed by Oracle Support
Engineers when working to resolve SRs. For more information on data collected, see OCM
documentation and Appendix VIII for a sample configuration collection.
- Oracle Advanced Customer Support Services (ACS) Monitoring Framework (MFW): The
ACS MFW is a centralized framework for receiving, filtering, categorizing and enriching events
from multiple monitoring sources. It qualifies events as faults and forwards consolidated
information to Oracle Support for SR creation. See Appendix VIII for details on fault event
telemetry data collected from the Certified Platinum Configuration.
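
The filter, categorize, and enrich steps attributed to the ACS MFW can be pictured with a minimal sketch. It is illustrative only and is not Oracle's implementation: the Event fields, the qualify function, and the example codes are hypothetical names chosen for this illustration. It assumes that every ASR event qualifies as a fault and that an OEM event qualifies only when its code is on the Oracle-determined list in Appendix I.

```python
from dataclasses import dataclass

# Example subset of OEM fault codes from Appendix I (not the full list).
OEM_FAULT_CODES = {"ORA-600", "ORA-1578", "CRS-1604"}


@dataclass
class Event:
    source: str  # monitoring source, "ASR" or "OEM"
    target: str  # hypothetical name of the monitored component
    code: str    # hypothetical event identifier


def qualify(events):
    """Filter, categorize, and enrich events; return only qualified faults."""
    faults = []
    for event in events:
        # Filter: ignore unknown sources and OEM events not on the fault list.
        if event.source not in ("ASR", "OEM"):
            continue
        if event.source == "OEM" and event.code not in OEM_FAULT_CODES:
            continue
        # Categorize (simplification): ASR reports hardware faults, OEM software faults.
        category = "hardware" if event.source == "ASR" else "software"
        # Enrich: attach the category and a one-line summary for SR creation.
        faults.append({
            "target": event.target,
            "code": event.code,
            "category": category,
            "summary": f"{category} fault {event.code} on {event.target}",
        })
    return faults
```

For example, passing one ASR event, one OEM event with code ORA-600, and one OEM event with an unlisted code returns two faults, the first categorized as hardware and the second as software.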

Managing the OASG

The OASG will be monitored, managed, and maintained by Oracle remotely via the OCCN connection.
Oracle monitors the entire event flow, from the OEM agent installed on the Certified Platinum
Configuration to the OCM Collections housed at Oracle. This ensures Oracle is alerted to any breakdown
in communication between components or any software failure, including issues with OEM.
Oracle Platinum Services leverages OEM to monitor OASG system resources such as disk, memory, and
CPU. If the OASG is running on Oracle-owned hardware, Oracle Platinum Services will leverage
ASR to monitor the key components of the hardware and engage Oracle Support accordingly.

Oracle Advanced Support Portal


The Oracle Advanced Support Portal (OASP) is a fully integrated, ITIL-based operations management
framework, including tools, processes and technology, which is hosted by Oracle and delivered via a Web
interface. It enables users to monitor and manage their infrastructure elements.

The OASP provides a view of your configuration items, incident management, change management, user
account management, and reporting.

See the OASP Quick Reference Guide or the Oracle Advanced Support Portal Demo for more
information. Sample fault event telemetry and sample configuration item details visible to the customer
can be found in Appendix VIII.

Fault Monitoring Details
Customer Requirement and Obligations
The table below identifies customer requirements and obligations for successful Fault Monitoring with
Oracle Platinum Services.

Item                  Requirement / Obligation

Network Connectivity  - Provide continuous inbound VPN connection via OCCN.
                      - Open ports between OASG and Engineered System for agent
                        communication and diagnostics.
Deployment            - Provide root or sudo [1] access for agent deployment and
                        monitoring configuration.
                      - Provide a dedicated user for deploying the OEM agent.
                      - Provide monitoring account credentials.
Service Delivery      - Provide root or sudo access for management of agents and
                        SR troubleshooting.
                      - Provide notification of changes to Engineered System and
                        associated targets, such as new databases to be monitored,
                        databases that are removed, IP address changes, and
                        password changes.
                      - Work with Oracle Support Services (OSS) to resolve any
                        agent issues that cannot be corrected remotely.

For additional details on required firewall ports, please see the Oracle Advanced Support Gateway
Security Guide. For additional details on access requirements, please see Appendix III.

Note: Without continuous inbound connection, Oracle will not be able to validate faults, which negates
the 15-minute resolution / 30-minute joint debug Oracle Platinum Service target response times.

[1] sudo allows a user to execute a command or process with the privileges of another user, typically
superuser or root, without having to grant full access to those privileged accounts.
Fault Monitoring Roles and Responsibilities
Role: Oracle Platinum Driver
Responsibility: Oracle assigns each customer a Platinum Driver to provide key information during the
customer's consideration of Platinum Services. The goal is to verify that the customer is fully
qualified, fully understands the requirements and responsibilities of the service, is committed to
Platinum Services, and completes prerequisites before implementation begins. Once implementation of
Platinum Services is underway, the Platinum Driver may engage the customer to help resolve delays and
to see that customer expectations are being well managed and executed.

Role: Oracle Implementation Engineer (IE)
Responsibility: The Implementation Engineer is the primary point of contact and technical manager for
customers during the Oracle Platinum Services implementation. From the point of receiving ownership
of the Platinum Implementation SR (PISR) to the point of hand over to the delivery organization, the
IE acts as the technical project lead during the implementation and remotely installs all technical
aspects of the fault monitoring, Oracle Auto Service Request (ASR), and Oracle Configuration Manager
(OCM) solution. The IE is also responsible for coordinating the resources and activities to deliver
and install the Engineered System.

Role: Oracle Platinum Control Center
Responsibility: The Oracle Platinum Control Center is responsible for fault event management after a
fault is detected, including managing faults in OASP, fault notification, and SR creation.

Role: Customer Contact
Responsibility: Customer contact(s) are notified of verified fault events received by Oracle Platinum
Services. Notification is made by email only and can be sent to individuals or an alias.

Role: Oracle Field Engineer
Responsibility: The Oracle Field Engineer (FE) is responsible for the Oracle Platinum hardware gateway
installation (on Oracle hardware), OASG installation, and Platinum connectivity to Oracle.

Role: Customer Platinum Manager
Responsibility: The customer assigns an employee or contractor to fill the Customer Platinum Manager
role. The Customer Platinum Manager is the point of contact (POC) for Oracle and is responsible for
the coordination of customer resources, installation-related activities (for example, opening
firewall ports), and decisions needed for a smooth implementation. This POC is also responsible for
the integration with customer processes and meeting the planned Go Live schedule. Additional
responsibilities include managing customer stakeholder decisions and, when necessary, consulting
within the company to acquire expertise for service integration: network expert(s), security
expert(s), and the target system owner(s).

Activities
Oracle Platinum Services Fault Monitoring Implementation Prerequisites
The table below identifies a sample of the prerequisite activities associated with implementing Fault
Monitoring. Any activity in this list not completed may result in implementation delays.

Activity / Responsibility (Customer, Oracle, or Both) / When

1. Service Implementation Worksheet
Your Oracle account team will guide you through the process of completing the Service
Implementation Worksheet online. This worksheet is used to collect details for Oracle to initiate
the Oracle Platinum implementation process. Some of the key details collected are:
- Customer contact information for fault notification, change management, SRs and remote patch
deployment.
- Configuration information for the OASG.
- External and internal firewall requirements.
- Access requirements.

2. Open Firewall Ports
Open necessary firewall ports. See Oracle Advanced Support Gateway Security Guide Network
Protocol and Port Matrix for details.

3. Install and Configure the Certified Platinum Configuration
The Oracle FE and Oracle IE will install and configure the Engineered System.

4. Provide suitable hardware or virtual environment for the OASG software
You must provide a suitable environment for the OASG software. This can be an x86 machine that
meets the specifications outlined in the Gateway Host System Requirements document, or an Oracle
Virtual Machine running on an Oracle Virtual Server (that is, using the Oracle VM Server software).
Note: Oracle Database Appliance is not recommended.

5. Complete Network Connectivity Form (IPSEC VPN only)
Oracle Global IT will assist the customer in completing the OCCN Network Connectivity Form if
IPSEC VPN is required.

6. Deploy OCCN
- For SSL VPN, Oracle will deploy OCCN after the customer has enabled the outbound connection
from the gateway.
- For IPSEC VPN, Oracle Global IT will assist the customer in deploying OCCN.

Oracle Platinum Services Fault Monitoring Implementation
The table below identifies a sample of the activities associated with implementing Fault Monitoring. Any
activity in this list not completed may result in implementation delays.

Activity / Who (You, Oracle, or Both) / When

1. Deploy the OASG
Options:
a. The customer may download, deploy, and test the OASG software prior to the tasks performed by
the IE. The IE must perform the installation and configuration of OEM and components of the
Oracle Platinum Services.
b. The IE will deploy the OASG software and configure components of the Oracle Platinum Services
on hardware provided by the customer that meets the requirements outlined in the Gateway Host
System Requirements document.
c. If the customer opts for non-Oracle hardware to deploy the OASG, the customer will install the
OASG image software and install and configure the OASG software.

2. Install and deploy OEM agents to Certified Platinum Configuration
The IE will install and deploy monitoring OEM agents to the target Engineered System and will, if
required, upgrade the OEM agents with the latest patches.

3. Discover Monitoring Targets
The IE will discover monitoring component targets and deploy monitoring templates.

4. Activate ASR
The IE will enable ASR on the target Engineered System and the OASG.
Note: For existing ASR installations, reconfiguration is required for Oracle Platinum Services.

5. Install and Configure OCM
The IE will install and configure OCM to capture configuration information on the target
Engineered System.

6. Configure OASP
The IE will configure the OASP for use.

7. Validation
The IE will validate the OASG and monitoring setup.

Oracle Platinum Services Fault Monitoring Event for Oracle Exadata and Oracle Zero Data Loss Recovery Appliance
The table below identifies the activities and owners associated with a Fault Monitoring event for Oracle
Exadata and Oracle Zero Data Loss Recovery Appliance (Oracle Recovery Appliance) where the OEM
software automatically opens SRs when it detects Oracle Platinum faults. For a list of Oracle Platinum
Services fault monitoring events, please see Appendix I.

Activity / Who (You, Oracle, or Both) / When

1. Receive Oracle Platinum Services Fault Event
A fault event is detected via the OASG.

2. Validate Oracle Platinum Services Fault
Oracle determines whether a fault is a valid Oracle Platinum Services fault (see Appendix I for a
list of Oracle Platinum Services fault monitoring events).
- If the fault is a valid Oracle Platinum Services fault, an SR will be automatically opened (see
Step 3 below).
- If the fault is not a valid Oracle Platinum Services fault, there is no further action required
of Oracle.

3. Open Oracle Platinum Services SR & Notify Customer of Oracle Platinum Services Fault Event
When: Within 5 minutes of receiving fault event
Fault notification will be sent by email to the distribution email list and the customer contact
defined for the Certified Platinum Configuration after a fault is validated and the SR is opened.
See sample fault notification in Appendix VI, and sample SR in Appendix V, SR Example I.

4. Diagnostic Upload
Diagnostics are collected automatically if the fault is an Automatic Diagnostic Repository (ADR)
covered fault (see Exadata ORA Events (Oracle Platinum Services Only) for a list of covered faults,
including ADR). Otherwise, diagnostics will be collected by Oracle or with the assistance of the
customer.

5. Resolve Oracle Platinum Services SR
Oracle Support and the customer contact will work together to adjust severity levels as needed
(see Technical Support Policy Severity Definitions) and resolve the SR.

Oracle Platinum Services Fault Monitoring Events for Oracle SuperCluster, Oracle Exalogic, and Oracle Exadata
The table below identifies the activities and owners associated with a Fault Monitoring event for Oracle
SuperCluster, Oracle Exalogic, and Oracle Exadata where the Oracle Platinum Control Center opens SRs
when OEM detects Platinum faults. For a list of Oracle Platinum Services fault monitoring events, please
see Appendix I.

Activity / Who (You, Oracle, or Both) / When

1. Receive Oracle Platinum Services Fault Event
A fault event is received in the Oracle Platinum Control Center.

2. Create Incident ticket in OASP
The Oracle Platinum Control Center creates an incident ticket in OASP.

3. Notify Customer of Fault
When: Within 5 minutes of receiving fault event
Based upon customer preference, the Oracle Platinum Control Center will provide fault notification
to the customer's identified customer contact by email. See sample fault notification in
Appendix VI.

4. Validate Fault
Oracle determines whether a fault is a valid Oracle Platinum Services fault (see Appendix I for a
list of Oracle Platinum Services fault monitoring events).
- If the fault is a valid Oracle Platinum Services fault, an SR will be opened (see step 5 below).
- If the fault is not a valid Oracle Platinum Services fault, the incident ticket opened in step 2
will be closed and there is no further action required of Oracle.

5. Open SR
When: Within 15 minutes of notification
- For a hardware fault event, the Oracle Platinum Control Center will validate that an ASR has been
created. See sample SR in Appendix V, SR Example III.
- For a software fault event, the Oracle Platinum Control Center will create an SR. See sample SR
in Appendix V, SR Example II.

6. Resolve Oracle Platinum Services SR
Oracle Support and the customer contact will work together to adjust severity levels as needed
(see Technical Support Policy Severity Definitions) and resolve the SR.
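
The two timing targets in the table above (notification within 5 minutes of receiving the fault event, and SR opening within 15 minutes of notification) can be turned into concrete clock times with a small sketch. This is an illustration under stated assumptions, not an Oracle tool: the function name target_times and its parameters are hypothetical, and when the actual notification time is not yet known the sketch assumes the latest allowed notification time as the start of the 15-minute window.

```python
from datetime import datetime, timedelta

NOTIFY_TARGET = timedelta(minutes=5)    # notify within 5 minutes of receiving the fault event
OPEN_SR_TARGET = timedelta(minutes=15)  # open the SR within 15 minutes of notification


def target_times(event_received, notified=None):
    """Return the latest notification time and the latest SR opening time."""
    notify_by = event_received + NOTIFY_TARGET
    # The SR window starts at the actual notification time when it is known;
    # otherwise assume notification happens at the latest allowed moment.
    window_start = notified if notified is not None else notify_by
    return {"notify_by": notify_by, "open_sr_by": window_start + OPEN_SR_TARGET}
```

For a fault event received at 10:00, the sketch gives a notification target of 10:05 and, if notification is actually sent at 10:02, an SR target of 10:17.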

Appendix I: Oracle Platinum Services Fault Monitoring Events
Fault Monitoring covers the events documented below and is a combination of all ASR fault events and
an Oracle determined set of events generated by OEM. This list of fault events is subject to change and
some gateways may not yet have the latest software update.

ASR Fault Events


ASR fault events are documented publicly. Please consult the following for details on ASR fault events:

- ASR Fault Event Coverage for Oracle Exadata:
  o Exadata Database Machine
  o Exadata Database Servers
  o Exadata Storage Servers
  o Exadata Servers
- ASR Fault Event Coverage for Oracle Exalogic
  o Exalogic Server
  o Exalogic Storage Appliance
- ASR Fault Event Coverage for Oracle SuperCluster
  o SuperCluster Products
  o See Exadata Storage Servers
  o See SuperCluster Storage Appliance
- ASR Fault Event Coverage for Oracle Zero Data Loss Recovery Appliance
  o See Exadata Database Machine X5-2
  o See Exadata Storage Servers
- ASR Fault Event Coverage for Oracle ZFS Storage Appliance Racked System
  o ZFS Products

OEM Fault Events


OEM software included with the OASG includes rule-based fault detection functionality that
automatically creates an SR and uploads related diagnostics, when available, upon detection of
critical OEM faults for Oracle Exadata and Oracle Recovery Appliance. Oracle SuperCluster and
Oracle Exalogic SRs are manually created by the Oracle Platinum Control Center.

The Oracle Platinum Services monitored faults listed below are determined by Oracle, are standard,
and are not subject to customization.
Coverage shows one x per covered system, from the columns Exadata, Exalogic, SuperCluster, Zero Data
Loss Recovery Appliance, and ZS3-ES and ZS4-4 Racked System.

Item Name                                                    Coverage   Description
1    Fan Failure on an Infiniband Switch                     x x x x    A fan in the Infiniband switch has failed or dropped below a safe operating level
2    OS Kernel Panic and Errors                              x x x x    The Kernel has encountered an error condition which may have led to a restart of the system
3    SCSI and PCI Errors                                     x x x x    SCSI Errors have been detected by the OS
4    Memory Errors                                           x x x x    Memory Errors have been detected by the OS
5    Disk Errors                                             x x x x    Disk Errors have been detected by the OS
6    I/O Errors                                              x x x x    I/O Errors have been detected by the OS
7    ZFS Cluster                                             x          Detection of standby node failure
8    ZFS Critical and Major Alerts                           x          Major and Critical alerts reported in problem, alert and fault logs
9    ZFS # spare disks available                             x          Detection of spare disks availability
8    Voting Disk Alert*                                      x x x      CRS-160(4|5|6)
9    Control VM Database*                                    x          ORA errors (see list of ORA errors below)
10   Node Configuration Alert*                               x x x      CRS-180(2|3|4|5), CRS-1607
11   OCR Alert*                                              x x x      CRS-(1006|1008|1010|1011|1009)
12   Oracle High Availability Service Alert*                 x x x      CRS-(1202|1402|1602|1603)
13   CRS Resource Alert*                                     x x x      CRS-120(3|5|6)
14   Generic Incident*                                       x x x x    ORA-(227|239|240|255|445|494|3137|4036|24982|25319|29770|29771|32701|32703|32704|56729)
15   Cluster Error*                                          x x x      ORA-29740
16   Data Block Corruption*                                  x x x x    ORA-1578
17   Generic Internal Error (Exadata Storage Cell and DB)*   x x x x    ORA-600
18   Access Violation*                                       x x x x    ORA-7445 (Exadata Storage Cell and DB), ORA-3113, RS-7445 (Exadata Storage Cell)
19   Redo Log Corruption*                                    x x x x    ORA-(353|355|356)
20   Out of Memory*                                          x x x x    ORA-403(0|1)
21   File Access Error*                                      x x x x    ORA-376
22   Deadlock (System)*                                      x x x x    ORA-4020
23   Soft Internal Error*                                    x x x x    ORA-700, RS-700 (Exadata Storage Cell)
24   Data file cannot be identified/locked*                  x x x x    ORA-1157
25   Media failure*                                          x x x x    ORA-(1242|1243)
26   Invalid file header information*                        x x x x    ORA-27048
27   Recovery Appliance task failure                         x          ORA-(45168|45111)
28   Recovery Appliance timer failure                        x          ORA-45169
29   Recovery Appliance metadata corruption                  x          ORA-45109
30   Corruption in backup piece                              x          ORA-(45132|45167)
31   Corruption in backup data                               x          ORA-45165
32   Temperature, Value (Celsius)                            x x x x    Cisco Switch (> 56F)
33   Fan State                                               x x x x    Cisco Switch
34   Power Supply State                                      x x x x    Cisco Switch
35   Module1 Phase2 Threshold Evaluation                     x x x x    PDU
36   Module1 Phase1 Threshold Evaluation                     x x x x    PDU
37   Module1 Phase3 Threshold Evaluation                     x x x x    PDU

*Related to Oracle Database component only
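
Several descriptions in the table above use a compact notation such as CRS-160(4|5|6) or ORA-(353|355|356). The sketch below shows one way to read that notation, assuming it is a regular-expression alternation over error numbers; this reading is an assumption, not something the document states. The helper names compile_pattern and classify are hypothetical, only a few rows are included, and the comma in the Node Configuration Alert row is written as an alternation. Because Oracle messages commonly zero-pad ORA codes (for example ORA-00353), the sketch also accepts leading zeros.

```python
import re

# A few rows copied from the table above, treated as regex alternations.
FAULT_PATTERNS = {
    "Voting Disk Alert": r"CRS-160(4|5|6)",
    "Node Configuration Alert": r"CRS-180(2|3|4|5)|CRS-1607",
    "Redo Log Corruption": r"ORA-(353|355|356)",
    "Out of Memory": r"ORA-403(0|1)",
    "Media failure": r"ORA-(1242|1243)",
}


def compile_pattern(pattern):
    """Allow zero-padded ORA codes and reject codes with extra trailing digits."""
    padded = pattern.replace("ORA-", "ORA-0*")
    return re.compile(r"\b(?:" + padded + r")(?!\d)")


COMPILED = {name: compile_pattern(p) for name, p in FAULT_PATTERNS.items()}


def classify(line):
    """Return the names of all table rows whose pattern occurs in the line."""
    return [name for name, rx in COMPILED.items() if rx.search(line)]
```

With these definitions, a log line containing ORA-00353 is classified as Redo Log Corruption and a line containing CRS-1605 as Voting Disk Alert, while a code with extra digits such as ORA-03530 matches nothing.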

Appendix II: Description of Common For-Fee Monitoring Items

The descriptions below are for common for-fee monitoring items. These are examples of items not
covered by Fault Monitoring and do not represent a complete list.

Performance Monitoring: Measures IT service components against agreed-upon metrics and thresholds.
Availability Monitoring: Measures the availability of key IT infrastructure components against a
defined availability target.
Capacity Monitoring: Measures resource utilization and performance against the defined capacity plan
with the ability to adjust based on changing demand.

For assistance with for-fee monitoring, please contact acsdirect_us@oracle.com.

Appendix III: Access Requirements
Oracle requires a continuous connection to the Certified Platinum Configuration during delivery of Oracle
Platinum Services, as described in the Oracle Platinum Services Technical Support Policy. The following
table describes the user account access required by Oracle during the implementation and ongoing
delivery of Oracle Platinum Services.

Activation/Patch shows the Service Activation and Patch and Restore columns. Systems shows one x per
system on which the account is required, from the columns Exadata, Exalogic, SuperCluster, Recovery
Appliance, and ZS4-4 Racked System.

System Component                                          Login Account  Activation/Patch  Systems    Justification
Integrated Lights Out Manager                             root           Yes / Yes         x x x x x  To set SNMP parameters and create orarom monitoring account
Integrated Lights Out Manager                             orarom         Yes / Yes         x x x x x  Ongoing Monitoring. This account is created during the setup by Oracle
Compute/DB hosts                                          root           Yes / Yes         x x x x    Required for implementing solution, creating orarom user and configuring monitoring
Compute/DB hosts                                          orarom         Yes / Yes         x x x x    Ongoing Monitoring, primary owner of the OEM agent. This account is created during the setup by Oracle
Storage cells                                             root           Yes / Yes         x x x      SSH keys for agent login without password, define SNMP parameters
Storage cells                                             cellmonitor    Yes               x x x      Ongoing monitoring
ASM                                                       asmsnmp        Yes / Yes         x x x      To configure ASM monitoring from OEM and ongoing monitoring
DBMS                                                      dbsnmp         Yes / Yes         x x x      To configure DB monitoring for OEM, ongoing monitoring and configuration data collection
IB Switches                                               root           Yes / Yes         x x x x    SSH keys for agent login without password, define SNMP parameters
IB Switches                                               nm2user        Yes / Yes         x x x x    To monitor Infiniband Switches
Cisco Switch                                              Admin          Yes / Yes         x x x x    To define SNMP parameters; only required for initial configuration
Cisco Switch                                              enable         Yes               x x x x
PDUs                                                      Admin          Yes / Yes         x x x x    To define SNMP parameters
ZFS                                                       root           Yes / Yes         x x x      To create shares for agent installation (Exalogic only) and to run workflow to enable OEM monitoring
ZFS                                                       orarom         Yes / Yes         x x x      Created during installation and assigned to the agent role, which is used for ongoing monitoring
Control VMs - for Exalogic for release 2.0.6.x.x          root           Yes / Yes         x
Ops Center VM and Exalogic OVMM VM for release 2.0.4.x.x  root           Yes / Yes         x
Domains & Zones                                           root           Yes / Yes         x
Recovery Appliance (Admin)                                rasys          Yes / No          x          Ongoing monitoring
Recovery Appliance                                        root           Yes / Yes         x          Initial Activation and one time SSH communication between nodes

Oracle Access to Data
OEM agents are installed using a unique account created specifically for monitoring (orarom).
This account can be read-only and does not need administrative access to the Operating System
or Oracle Database.
Within the Oracle Database, OEM agents use a generic DBSNMP account, which is enabled for
monitoring including configuration collection. This configuration data can be used as diagnostics
for restoration planning and for patch planning.
The generic DBSNMP account has restricted access to the database for monitoring purposes only.
The users cannot run SQL commands, navigate Tablespaces, or maliciously query the Oracle
Databases.
As described in the table above, Oracle requires administrative-level privileged access to the
Certified Platinum Configuration during Oracle Platinum Services implementation, including
setup of fault monitoring; during remote patch deployment events; and to assist with diagnostics
and fault restoration.
o Privileged access to root or oracle accounts, for example, does not need to be
continuous. It can be provided on a temporary basis and then revoked upon completion of
the task. For example, access can be provided for a remote patch deployment event, then
revoked when the remote patch deployment event is complete.
o During Oracle Platinum Services implementation, including setup of fault monitoring,
direct access to the root and other privileged accounts is required as described in the table
above.
o During ongoing fault monitoring activities, including collection of diagnostic
information to assist with fault restoration activities, access to root and other privileged
accounts can be constrained and monitored with the use of tools such as sudo.
o During a remote patch deployment event, access to root and other privileged accounts
can be constrained and monitored with the use of tools such as sudo.
Group read and write access must be set for each database node's diagnostic directory,
/u01/app/oracle/diag, for uploading relevant diagnostic files during the OEM SR automation
process. Detailed information is available in How to setup diagnostic directory group permissions
for Platinum Automated Diagnostic Upload (Doc ID 1633603.1).
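
The group permission requirement on the diagnostic directory can be checked ahead of time. The following is a minimal Python sketch, assuming a POSIX host and a user who is allowed to stat the directory. The function name has_group_read_write is hypothetical and not part of any Oracle tooling, and the authoritative procedure remains the one described in Doc ID 1633603.1.

```python
import os
import stat

# Path taken from the text above; adjust if the diagnostic directory differs.
DIAG_DIR = "/u01/app/oracle/diag"


def has_group_read_write(path):
    """Return True if the group permission bits on path include read and write."""
    mode = os.stat(path).st_mode
    required = stat.S_IRGRP | stat.S_IWGRP
    return (mode & required) == required


if __name__ == "__main__":
    if os.path.isdir(DIAG_DIR):
        status = "OK" if has_group_read_write(DIAG_DIR) else "group read/write missing"
        print(f"{DIAG_DIR}: {status}")
    else:
        print(f"{DIAG_DIR}: not found on this host")
```

The check only inspects the mode bits of the directory itself; whether subdirectories need the same bits is not stated here, so the sketch does not assume it.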

Appendix IV Process Flow Diagrams
High-level Process Flow for Oracle SuperCluster, Oracle Exalogic, and
Oracle Exadata
The table below identifies the activities and owners associated with a Fault Monitoring event for Oracle
SuperCluster, Oracle Exalogic, and Oracle Exadata where the Oracle Platinum Control Center opens SRs
when OEM detects Platinum faults.
High-level Process Flow for Oracle Exadata and Oracle Recovery Appliance
(OEM Auto Generated SRs)
The table below identifies the activities and owners associated with a Fault Monitoring event for Oracle
Exadata and Oracle Recovery Appliance where the Oracle Platinum control center for the OEM software
automatically opens SRs when it detects Platinum faults.

Appendix V Sample Service Request Details
Platinum Service Requests (SRs) may be created manually by the Control Center after an
OEM-detected fault for Oracle SuperCluster or Oracle Exalogic, or automatically from an
OEM-detected fault for Oracle Exadata or Oracle Recovery Appliance. An SR may also be created
automatically via ASR for your Certified Platinum Configuration.

SR Example I OEM Detected Fault


Source: OEM
Type: Automatic

Abstract: SASR:ORA-600 - This is an automated database error on an Exadata System

Description
Hostname: xyzdb02
Product Type: EM ASR PRODUCT
Summary:SASR:ORA-600 - This is an automated database error on an Exadata System

Message Payload Data:

problem_key = ORA 600 [1350]
target_name = db_db02
host_name = xyzdb02.oracle.com
target_type = oracle_database

Hardware Component:
Name:NA
Id:NA

SASR:ORA-600 - This is an automated database error on an Exadata System;

Alerts received in last 30 days (limit 10)


Date Summary SR
21 Jul 2014 03:56:40 SASR:oracle_ibswitch:metric_alert:Aggr [SR #]
21 Jul 2014 03:56:39 SASR:oracle_ibswitch:metric_alert:Aggr [SR#]
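
The Message Payload Data in SR Example I is a list of "name = value" lines, which makes it easy to read into a dictionary for scripting or triage. The sketch below is a minimal illustration in Python, assuming the payload always uses one "name = value" pair per line as in the sample; parse_payload is a hypothetical helper, not an Oracle API.

```python
def parse_payload(text):
    """Parse 'name = value' lines into a dict, skipping blank or malformed lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Payload copied from SR Example I.
sample = """
problem_key = ORA 600 [1350]
target_name = db_db02
host_name = xyzdb02.oracle.com
target_type = oracle_database
"""

fields = parse_payload(sample)
print(fields["problem_key"], "on", fields["host_name"])
```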

SR Example II ASR Detected Fault
Source: Automated Service Request (ASR)
Type: Automatic

Abstract: ASR:Memory module correctable errors exceeding acceptable levels.

Description
Hostname: xyzdb01
Product Type: ORCL,SPARC-T4-4
Summary:ASR:Memory module correctable errors exceeding acceptable levels.

Fault event knowledge article: https://support.oracle.com/msg/SUN4V-8002-3R

The number of correctable errors associated with this memory module has exceeded
Message-ID: SUN4V-8002-3R
UUID: [UUID #]
Time: Jun 9, 2014 6:44 AM (UTC)
Severity: Major

FRU = hc://:chassis-mfg=unknown:chassis-name=ORCL,SPARC-T4-4:chassis-part=7020893:chassis-serial=[chassis serial #]:fru-serial=[fru-serial #]:fru-part=07020578,HMT42GR7BMR4A-G/chassis=0/cpuboard=0/dimm=8
Part number = 07020578,HMT42GR7BMR4A-G
Certainty = 95
Class = fault.memory.dimm-page-retires-excessive

Alerts received in last 30 days (limit 10)


Date Summary SR
09 Jun 2014 12:25:09 ASR:Memory module correctable errors e [SR #]
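
The FRU value in SR Example II packs two kinds of information into one string: colon-separated attributes (chassis and part details) followed by a slash-separated hardware location. The Python sketch below splits the string into those two parts. It is based only on the layout visible in this one sample and is not a complete parser for the hc:// scheme; parse_hc_fru is a hypothetical name.

```python
def parse_hc_fru(fru):
    """Split an hc:// FRU string into (attributes, location).

    attributes: dict built from the ':'-separated name=value pairs.
    location:   list of (component, instance) tuples from the '/'-separated path.
    """
    prefix = "hc://"
    if not fru.startswith(prefix):
        raise ValueError("not an hc:// FRU string")
    # Everything before the first '/' holds the attributes; the rest is the path.
    authority, _, path = fru[len(prefix):].partition("/")
    attributes = {}
    for item in authority.split(":"):
        key, sep, value = item.partition("=")
        if sep:
            attributes[key] = value
    location = []
    for segment in path.split("/"):
        name, sep, instance = segment.partition("=")
        if sep:
            location.append((name, instance))
    return attributes, location


# FRU string from SR Example II, with the placeholders left as they appear there.
sample_fru = (
    "hc://:chassis-mfg=unknown:chassis-name=ORCL,SPARC-T4-4:chassis-part=7020893:"
    "chassis-serial=[chassis serial #]:fru-serial=[fru-serial #]:"
    "fru-part=07020578,HMT42GR7BMR4A-G/chassis=0/cpuboard=0/dimm=8"
)

attributes, location = parse_hc_fru(sample_fru)
print(attributes["fru-part"], location)
```

For the sample, the location identifies the faulted part as DIMM 8 on CPU board 0, and the fru-part attribute matches the Part number line of the SR.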

Appendix VI Sample Fault Notification Email
Below is a sample notification email that would be sent to the customer contact defined by the customer
for the Certified Platinum Configuration.

Appendix VII Sample Notification of EM generated SR
Below is a sample notification sent to the customer contact of the Certified Platinum Configuration when
an SR is automatically opened for an Oracle Platinum detected fault.

Appendix VIII Fault Event Telemetry and Configuration Data
The OASG collects fault and configuration data to aid in the delivery of Fault Monitoring, remote patch
planning and installation, and support and restoration services. Below are samples of data collected via
OASP and OEM under the Oracle collection manager (defined below).

For a demonstration of the OASP, see Oracle Platinum Services - Oracle Advanced Support Portal
(OASP) Overview - [ Video ] (Doc ID 1607117.1)

Sample Fault Event Telemetry Data via OASP


The information below is a sample of fault event telemetry data that is collected by the OASG and used
for incident management and resolution. The detail is visible to the customer via the OASP.

Agent              Platinum Connector
Alert Group        TT:oracle_database|EN:UserAudit:username|MG:D84385697496BC960548
Alert Key          ON:sample host|EC:Metric Alert|CAT:Security,
Article Id
CTA Receive Time   2014-08-31 23:01:12 PDT
Cleared Timestamp  2014-09-01 00:16:04 PDT
Correlation
Customer Id        55520521
Customer Name      Sample Customer
Debug Info1
Debug Info2        V:556(70678)|C:a3d0060a-a779-4287-9ce1-7fbd7afb6efe|
Event Time Drift   3
Grade              3
Identifier         88aa025e-ed45-4fa5-8e4d-18bf0cc5d58b|CM:0|Major|TT:oracle_database|EN:UserAudit:username|MG:D6448569B496BC9205481E8A70692F1E|ON:samplehost|EC:Metric Alert|CAT:Security,|Platinum Connector|WebEvent::OracleEnterpriseManager::V12c::Generic_user_audit
Manager            OracleEnterpriseManagerV12c
Managing Host      gateway name
Original Message   V:556(70678)
Original Severity  3
Reporter
Statu Flash
Severity           0
Summary            DOWNGRADE WARNINGS: UserAudit:username User SYS logged on from samplehost
Target uuid        99aa055f-ed45-4fab-8e6d-16be0cc5d58c

Sample Configuration Item via OASP


Below is a sample Exalogic configuration item as shown in OASP configured for Oracle Platinum
Services. The detail is visible to the customer via the OASP.

Target Name               System Name      Type                      Model
/sample_Exalogic          sample_exalogic  Application Server        Exalogic System
ec1-vm-sample.sample.org  sample_exalogic  Server                    SunFire X4170 M2
pc1-vm-sample.sample.org  sample_exalogic  Server                    SunFire X4170 M2
pc2-vm-sample.sample.org  sample_exalogic  Server                    SunFire X4170 M2
sample-ovmm.sample.org    sample_exalogic  Server                    SunFire X4170 M2
samplegw01.sample.org     sample_exalogic  Oracle Infiniband Switch  QDR Infiniband Switch
samplegw02.sample.org     sample_exalogic  Oracle Infiniband Switch  QDR Infiniband Switch
samplesn01.sample.org     sample_exalogic  Storage                   ZFS Storage Appliance
samplesn02.sample.org     sample_exalogic  Storage                   ZFS Storage Appliance

Sample Configuration Item Drilldown
Oracle Platinum customers are able to drill down into each configuration item via the OASP. A detailed
view may show the following information depending on the target type chosen:

Name: ec1-vm-sample.sample.org
Customer: SAMPLE CUSTOMER
Category: ComputerSystem
Type: Server
External Id:
Make: Sun
Model: SunFire X4170 M2
Description:
Status: Production
UUID:
Serial Number: N/A
Barcode: host
Architecture:
Firmware:

IP Address    Type           Primary  Assigned CI
xx.yy.zzz.aa  Management IP           ec1-vm-sample.sample.org

Oracle Collection Manager Collections
Oracle collection manager is a component of OEM housed on the OASG. Collections are made every 24
hours. If a change is detected between the current and the previous collection, the change is uploaded to
Oracle. This keeps the most current configuration data available to Oracle for diagnostic purposes in the
event of a fault. Collections are attached to an SR upon submission. Below is a sample of the data
collected via Oracle collection manager for an Oracle SuperCluster machine.

The item count in the collections will be configuration dependent.
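
The compare-then-upload behavior described above can be pictured as a difference between two snapshots of configuration data. The Python sketch below is only an illustration of that idea, assuming each collection can be represented as a flat dictionary of item names and values. The function names and the sample values are hypothetical, and the sketch does not reflect how Oracle collection manager is implemented.

```python
def diff_collections(previous, current):
    """Compare two configuration snapshots and report what differs."""
    added = {k: current[k] for k in current.keys() - previous.keys()}
    removed = {k: previous[k] for k in previous.keys() - current.keys()}
    changed = {
        k: (previous[k], current[k])
        for k in previous.keys() & current.keys()
        if previous[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def has_changes(diff):
    """True if any item was added, removed, or changed."""
    return any(diff.values())


# Item names follow the hardware fields listed in this appendix; values are made up.
yesterday = {"Memory Size (MB)": 262144, "Total CPU Cores": 32, "System BIOS": "A"}
today = {"Memory Size (MB)": 524288, "Total CPU Cores": 32, "Fan Count": 10}

diff = diff_collections(yesterday, today)
if has_changes(diff):
    print("change detected:", diff)
else:
    print("no change since the last collection")
```

Only the difference needs to be sent when something has changed, which matches the description of uploading "the change" rather than the full collection.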

Sample Configuration Collection

System Configuration Header


Configuration [, May 15, 2014]
Name
Type
Release
Last Collected
Host
Oracle Homes
Support ID
Level
Lifecycle
Source

Hardware Configuration Header
Hardware
NAME VALUE
Host Name
Domain
Vendor Name
Virtual
System Config
Machine Architecture
Clock Frequency(MHz)
Memory Size (MB)
Local Disk Space (GB)
Total CPU Sockets
Total CPU Cores
Total Enabled CPU Cores
Total CPU Threads
CPU Board Count
I/O Card Count
Host ID
System Serial Number
Fan Count
Power Supply Count
Boot Disk Volume Serial Number
System BIOS

Operating System Configuration Header
Operating System
NAME VALUE
Name
Vendor Name
Base Version
Update Level
Distributor Version
Max Swap Space (MB)
Address Length (bits)
Platform ID
Current OS Run Level
Default OS Run Level
Platform Version ID
Is DB Machine Member
Is Exalogic Member
Maximum Process Virtual Memory (MB)
Timezone
Timezone Region
Timezone Delta

Hardware Configuration Components (item count is configuration dependent)

Hardware Components
Name  Type  Manufacturer  Model  Serial Number  Part Number  Revision  Location  Size In Bytes  PCI Id

Operating System Configuration Registry
Operating System Registered Software

Parent Product Identifier
Software Architecture
Parent Product Name
Installed Location
Installation Date
Name/Identifier
Registry Source
Virtual Machine
Vendor Specific Information
Vendor Name
Description
Media Type
Version
Name
ID

Installed Firmware Register
Installed Firmware
Description Type Version Installation Date Provider Release Date

Installed Operating System Patch Register


Installed OS Patches
Id Vendor Applied Packages
