Strategies for Monitoring Large Data Centers with Oracle Enterprise Manager
Ana McCollum
Consulting Product Manager
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle's
products remains at the sole discretion of Oracle.
Agenda

Overview of Oracle Enterprise Manager

Monitoring Best Practices


Q&A
Business-Driven IT Management
Enterprise Manager Monitoring
Fundamental part of Integrated Application-to-Disk solution

Lights-out data center monitoring
Manage by Exception: continuous monitoring of targets, generation of alerts when exceptions are detected
Metric: mechanism used to monitor target conditions (availability, performance, etc.)
Alert: generated when metric crosses its thresholds (warning, critical)
Notifications: sending of alert information (email, etc.)
[Diagram: Monitoring scope, including Extended Infrastructure]
Complete and integrated across stack
Entire Oracle stack
Heterogeneous infrastructure monitoring by plug-ins
Extensible for custom needs
Manage Many as One
Features to set up and monitor many targets as one
Integrates with third party systems
Helpdesks and other management systems
Common Monitoring Questions

What's the best way to...

Set Up Monitoring
Deploy monitoring settings on targets
Set up notifications for administrators
Assign right level of target privileges to administrators

Manage Alerts
Controlling volume of alerts
Removing unwanted alerts
Automating fix for common alerts

[Diagram: Administrator and Data center]
Best Practices for Monitoring the Data Center

Goals:
Meet monitoring requirements
Deploy your monitoring standards
Alert notifications sent to the appropriate persons
Comply with security practices
Follow Principle of least privilege when granting target privileges
Easy to manage
Infrastructure does not become administrative task
Manage many as one for managed targets
Scalable as enterprise grows

Strategy:
Set up monitoring for economies of scale by laying the groundwork for monitoring
As enterprise grows, minimal effort to monitor new targets or add new Enterprise Manager administrators
Leverage Manage Many as One features
Best Practices: Setting Up Monitoring
4 Step Methodology

Step 1: Organize targets into groups

Step 2: Use Roles to segregate responsibilities

Step 3: Define and enforce monitoring standards

Step 4: Set up notifications using groups


Setting Up Monitoring
STEP 1: Organize targets into Groups

Plan your group structure
Considerations:
Group together targets monitored in the same way
Same monitoring settings due to:
Supporting same application
Same Deployment type (Production, Development, Test)
Visually monitor them together in a dashboard
Can have group hierarchies
Sample group hierarchy:
By Line of Business (e.g. FINANCE)
By Deployment (Prod vs Devt)
By Ownership (e.g. owners A, B, C, D, E)
Setting Up Monitoring
STEP 1: Organize targets into Groups

Create the group in Enterprise Manager
Example: create a group based on Production database and host targets in the Finance department
You can search targets by operational criteria (aka target properties)
Deployment Type, Line of Business, Location
Additional tips:
Can add new target properties via EMCLI add_target_property
Can bulk update target properties via EMCLI set_target_property_value
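To make grouping by operational criteria concrete, here is a minimal Python sketch that filters an inventory by target type and target properties. The `Target` class, the `select_targets` function, and the sample target names are hypothetical and only model the search; this is not the Enterprise Manager or EMCLI interface.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical stand-in for a managed target and its target properties."""
    name: str
    target_type: str
    properties: dict = field(default_factory=dict)


def select_targets(targets, target_types, criteria):
    """Return targets of the given types whose properties match every criterion."""
    return [
        t for t in targets
        if t.target_type in target_types
        and all(t.properties.get(key) == value for key, value in criteria.items())
    ]


# Illustrative inventory; names and property values are made up.
inventory = [
    Target("findb1", "database", {"Deployment Type": "Production", "Line of Business": "Finance"}),
    Target("finhost1", "host", {"Deployment Type": "Production", "Line of Business": "Finance"}),
    Target("findb_test", "database", {"Deployment Type": "Test", "Line of Business": "Finance"}),
    Target("salesdb1", "database", {"Deployment Type": "Production", "Line of Business": "Sales"}),
]

finance_production = select_targets(
    inventory,
    target_types={"database", "host"},
    criteria={"Deployment Type": "Production", "Line of Business": "Finance"},
)
print([t.name for t in finance_production])
```

Keeping properties such as Deployment Type and Line of Business filled in for every target is what makes this kind of selection reliable.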
Setting Up Monitoring
STEP 1: Organize targets into Groups

Make the group privilege-propagating
A privilege on the group that is granted to a user automatically extends to all members of the group
Includes subgroups
Requires:
Create Privilege-Propagating Group privilege
Full privilege on all targets to be added to the group
EMCLI modify_group verb to convert group to privilege propagating group
If a group is privilege propagating, all its parent groups must be privilege propagating.
[Diagram: an Administrator with Operator Privilege on the Production Group, which contains the Production Finance Group and the Production Sales Group]
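The propagation behavior can be pictured with a small Python model. This is a sketch under stated assumptions: `Group`, `all_members`, `effective_privileges`, and the user and target names are hypothetical, and the model covers only how a single grant on a group reaches members and subgroups, not how Enterprise Manager stores privileges.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical model of a privilege-propagating group."""
    name: str
    targets: list = field(default_factory=list)    # direct member target names
    subgroups: list = field(default_factory=list)  # nested Group objects


def all_members(group):
    """Collect every target in the group, descending into subgroups."""
    members = list(group.targets)
    for subgroup in group.subgroups:
        members.extend(all_members(subgroup))
    return members


def effective_privileges(group, grants):
    """Expand {user: privilege} granted on the group to every member target."""
    members = all_members(group)
    return {
        user: {target: privilege for target in members}
        for user, privilege in grants.items()
    }


finance = Group("Production Finance Group", targets=["findb1"])
sales = Group("Production Sales Group", targets=["salesdb1"])
production = Group("Production Group", subgroups=[finance, sales])

grants = {"admin1": "Operator"}  # one grant, made on the parent group
print(effective_privileges(production, grants))

# A target added later is covered without any new grant.
sales.targets.append("salesdb2")
print(effective_privileges(production, grants)["admin1"]["salesdb2"])
```

The point of the model is that the grant is recorded once, on the group, and the per-target view is derived from membership each time it is needed.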
Setting Up Monitoring
STEP 2: Use Roles to segregate responsibilities

Who should do what on the targets in the group?


Map operations on the groups to job responsibilities (Senior
Lead, DBA owner, First Line Support, etc.)
Planning Considerations:
For the groups created, who can do these operations on them?
Change group membership
Grant privileges on the group to other users
Who can do these operations on the targets in the group?
Add / delete the target from Enterprise Manager
Define monitoring settings
Define notification settings
View / receive notifications for alerts
Acknowledge an alert
Act on target to resolve alert
Blackout target for planned or unplanned downtime
Setting Up Monitoring
STEP 2: Use Roles to segregate responsibilities

Mapping of operations to Enterprise Manager privileges


Operations                                  Enterprise Manager Privilege
On the group:
Change group membership                     Group Administration
Grant privileges on group to users          Group Administration
On the member targets:
Delete target from Enterprise Manager       Full on group
                                            Operator on group
Set blackout for planned downtime           Blackout Target
Change monitoring settings                  Manage Target Metrics
Change monitoring configuration             Configure Target
View and acknowledge alerts, clear alerts   Manage Target Alerts
View target, receive alerts on target       View on group
Setting Up Monitoring
STEP 2: Use Roles to segregate responsibilities

Examples of common job responsibilities:


Group Administrator
Adds / Deletes target from Enterprise Manager
Manages group membership
Grants privilege on group to other users

Senior Administrator
Adds / Deletes target from Enterprise Manager
Sets up monitoring for targets
Sets up notification rules for targets

First Line Support


Receives notifications for alerts
Responds to alerts

Target Owner
Receives alerts and responds to alerts
Changes monitoring settings for targets
Performs target maintenance
Setting Up Monitoring
STEP 2: Use Roles to segregate responsibilities

Create roles for each job responsibility


Group Administrator ROLE
Add Any Target system privilege
Group Administration on the group

Senior Administrator ROLE


Add Any Target system privilege
Full on the (privilege propagating) group

First Line Support ROLE


Manage Target Alerts on the (privilege propagating) group

Target Owner ROLE


Operator on the (privilege propagating) group
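One way to check that each role follows the principle of least privilege is to model which operations a role's privilege permits. The sketch below is illustrative only: the `INCLUDES` hierarchy (Full includes Operator, Operator includes the four finer-grained privileges, each of which includes View) is an assumption inferred from the mapping table, and `expand` and `can` are hypothetical helpers, not product functions.

```python
# Assumed "includes" relationships between privileges (illustrative only).
INCLUDES = {
    "Full": {"Operator"},
    "Operator": {
        "Blackout Target",
        "Manage Target Metrics",
        "Configure Target",
        "Manage Target Alerts",
    },
    "Blackout Target": {"View"},
    "Manage Target Metrics": {"View"},
    "Configure Target": {"View"},
    "Manage Target Alerts": {"View"},
}

# Operation -> privilege needed, following the mapping table in STEP 2.
REQUIRED = {
    "delete target": "Full",
    "set blackout": "Blackout Target",
    "change monitoring settings": "Manage Target Metrics",
    "acknowledge alert": "Manage Target Alerts",
    "view target": "View",
}

# Role -> privilege held on the privilege-propagating group.
ROLES = {
    "Senior Administrator": "Full",
    "First Line Support": "Manage Target Alerts",
    "Target Owner": "Operator",
}


def expand(privilege):
    """Return the privilege plus everything it includes, directly or indirectly."""
    seen = {privilege}
    stack = [privilege]
    while stack:
        for included in INCLUDES.get(stack.pop(), ()):
            if included not in seen:
                seen.add(included)
                stack.append(included)
    return seen


def can(role, operation):
    """True when the role's privilege covers the privilege the operation needs."""
    return REQUIRED[operation] in expand(ROLES[role])


for role in ROLES:
    allowed = [op for op in REQUIRED if can(role, op)]
    print(role, "->", allowed)
```

A table like this makes it easy to spot a role that holds more than its job responsibility needs.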
Setting Up Monitoring
STEP 2: Use roles to segregate responsibilities

Create roles containing the appropriate privileges on the privilege propagating group
Examples:
SeniorAdmin Role = Full on Production Sales Group
FirstLineSupport Role = Manage Target Alerts on Production Sales Group
Grant roles to administrators who manage the group
Don't grant privileges on individual member targets
Harder to maintain as group grows
[Diagram: Senior Administrator holds the SeniorAdmin Role (Full) and Junior Administrator holds the FirstLine Support Role (Manage Target Alerts), both on the Production Sales Group (Privilege-Propagating)]
Setting Up Monitoring
STEP 3: Define and enforce monitoring standards

Simplify management of many targets by defining standards for monitoring (set of metrics and thresholds)
Monitoring standard for production databases
Monitoring standard for test databases
etc.
Create Monitoring Templates to encapsulate monitoring standards
Monitoring standard for Production systems: put this in one template
Monitoring standard for Test systems: put in a separate template
Monitoring Template typically contains complete set of metric settings
Specific to a target type
Examples:
Monitoring Template for production databases
Monitoring Template for test databases
Setting Up Monitoring
STEP 3: Define and enforce monitoring standards

Create monitoring template
Senior Administrator creates the template on behalf of the LOB / Team
Grant View on template to other consumers of the template
Grant Full on template only to other senior administrators (or role) who are entitled to edit the template
Example Monitoring Template:
Metric           Warn   Crit   Corr Action
Tablespace       75     90     xxx
Archive Area %   70     80     yyy
[Diagram: Monitoring Templates applied to the Production Group]
Apply monitoring templates to groups
Apply to highest level in group hierarchy
Will apply to the applicable targets in the group / subgroups
Example: Database template will only be applied to database targets in group
Usage Notes:
Apply requires at least Manage Target Metrics
Multiple templates can be applied on a target, potentially overriding metric settings
Target can have specific metric settings by setting prevent template override flag
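The apply behavior described in these notes can be sketched in Python. This is a simplified model, not the product's implementation: the dictionary layout, the `prevent_override` key, and the target names are assumptions made for illustration, and the thresholds are the sample values from the template table.

```python
def apply_template(template, targets):
    """Copy template thresholds onto matching targets; return the names visited."""
    applied_to = []
    for target in targets:
        if target["type"] != template["target_type"]:
            continue  # a template applies only to its own target type
        for metric, thresholds in template["metrics"].items():
            if metric in target.get("prevent_override", set()):
                continue  # the target keeps its own setting for this metric
            target["metrics"][metric] = thresholds
        applied_to.append(target["name"])
    return applied_to


# (warning, critical) thresholds taken from the sample template.
template = {
    "target_type": "database",
    "metrics": {"Tablespace": (75, 90), "Archive Area %": (70, 80)},
}

# Hypothetical members of a production group.
group = [
    {"name": "findb1", "type": "database", "metrics": {}},
    {
        "name": "findb2",
        "type": "database",
        "metrics": {"Tablespace": (85, 95)},
        "prevent_override": {"Tablespace"},
    },
    {"name": "finhost1", "type": "host", "metrics": {}},
]

print(apply_template(template, group))
for member in group:
    print(member["name"], member["metrics"])
```

In this model the host is skipped because the template is for databases, and `findb2` keeps its own Tablespace thresholds while still receiving the Archive Area % settings.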
Setting Up Monitoring
STEP 3: Define and enforce monitoring standards

What if I have one monitoring standard for all my targets?
Define Monitoring Template
Specify it as the Default for the target type
For targets added in the future, will be used instead of Oracle's out-of-box monitoring settings
Enterprise Manager will automatically apply template upon target discovery
Usage notes:
Manually apply template for existing targets
Requires Super Administrator privilege to specify template as Default
[Diagram: Enterprise Manager with a Default Template for database]
Setting Up Monitoring
STEP 3: Define and enforce monitoring standards

What if I have enterprise-wide settings and application-specific settings for my targets?
Use Default Template to specify enterprise-wide settings common to all targets
Use another template containing settings specific to application
Apply to target after discovery
[Diagram: Enterprise Manager holds a Default Template for database (all) and a Database Template for Finance Production DBs; a target is applied with the Default Template, then with the Finance Template]
Setting Up Monitoring
STEP 4: Set up notifications using groups

Notification Method: means of sending notifications (e.g. email)
You can extend Notification Methods to accommodate custom alert handling: OS Script, PL/SQL, SNMP traps
Notification Rule: when alert occurs, who gets notified and how
Use groups as the target for the notification rule
If a target is added to the group, the notification rule will automatically apply to the target
Example Notification Rule:
Target: Production Sales Group (Privilege-Propagating)
Critical alerts
Action: Send email to DBA team
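Why a group makes a good rule target can be shown with a short Python sketch. The `NotificationRule` class, the `recipients_for` function, the email address, and the target names are hypothetical; the only behavior modeled is that membership is looked up when the alert occurs, so new members are covered automatically.

```python
from dataclasses import dataclass


@dataclass
class NotificationRule:
    """Hypothetical notification rule whose target is a group."""
    group: str
    severities: tuple
    recipients: tuple


def recipients_for(target, severity, rules, membership):
    """Return who is notified for an alert, using group membership at alert time."""
    notified = []
    for rule in rules:
        in_group = target in membership.get(rule.group, set())
        if in_group and severity in rule.severities:
            notified.extend(rule.recipients)
    return notified


membership = {"Production Sales Group": {"salesdb1"}}
rules = [
    NotificationRule("Production Sales Group", ("CRITICAL",), ("dba-team@example.com",)),
]

before = recipients_for("salesdb2", "CRITICAL", rules, membership)
membership["Production Sales Group"].add("salesdb2")  # new target joins the group
after = recipients_for("salesdb2", "CRITICAL", rules, membership)
print(before, after)
```

The rule itself never changes; only the group membership does.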
Leveraging your monitoring setup
As your enterprise grows, minimal effort required for target setup

When new targets are added to Enterprise Manager:
Apply template: do nothing (if using Default Templates) or apply Template
Add to appropriate group
That's it!
Results:
Targets are monitored according to your standards
Notifications for alerts on the targets go to the right administrators
Administrators have the right privileges to manage the targets
[Diagram: Enterprise Manager and the Production Sales Group (Privilege-Propagating)]
Managing Alerts

1. Control alerts at the source


Was the alert raised prematurely?
Are the thresholds too high/low?
Review metric trend
Adjust thresholds, set number of occurrences
Do I care about this condition? If NOT, then:
Disable metric collection schedule
Note: Other metrics may be impacted
For database alert log metric, use alert log filters
To disable alerting for database TEMP, UNDO
tablespaces, see Support Note 816920.1
To deploy changed metric settings across targets, use
Monitoring Templates
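The effect of combining thresholds with a number of occurrences can be sketched as follows. This is a simplified model, assuming a metric where higher values are worse and where a severity is raised only after the threshold has been crossed on that many consecutive collections; the function name and sample values are hypothetical.

```python
def severities(samples, warning, critical, occurrences=1):
    """Return the severity after each collected value.

    A severity is reported only once its threshold has been met or exceeded
    for `occurrences` consecutive collections.
    """
    warning_run = 0
    critical_run = 0
    result = []
    for value in samples:
        critical_run = critical_run + 1 if value >= critical else 0
        warning_run = warning_run + 1 if value >= warning else 0
        if critical_run >= occurrences:
            result.append("CRITICAL")
        elif warning_run >= occurrences:
            result.append("WARNING")
        else:
            result.append("CLEAR")
    return result


# Made-up tablespace usage samples with a short spike in the second collection.
samples = [70, 92, 71, 91, 93, 95]

print(severities(samples, warning=75, critical=90, occurrences=1))
print(severities(samples, warning=75, critical=90, occurrences=3))
```

With a single occurrence the short spike raises a critical alert immediately; requiring three consecutive occurrences suppresses it and alerts only on the sustained condition at the end.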
Managing Alerts

2. Use Corrective Actions to auto-resolve alerts
Tasks that automatically run in response to an alert
Is the resolution of the alert a repeatable process that can be scripted?
[Diagram: Critical Alert -> Corrective Action -> Automated Alert Resolution]
Usage Notes:
Defined for a metric
Can be same or different for Warning vs Critical severity
Can have different tasks based on monitored object
Ex: Filesystem Space Available(%) can have different corrective actions for /u1 and /u2
Set up notifications for corrective action failure
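A minimal Python sketch of this dispatch logic follows. It assumes corrective actions are keyed by metric, monitored object, and severity, and that a failure triggers a notification; `handle_alert`, the two task functions, and the error message are hypothetical and stand in for real scripts.

```python
def handle_alert(actions, metric, monitored_object, severity, notify):
    """Run the matching corrective action, if any, and report the outcome."""
    action = actions.get((metric, monitored_object, severity))
    if action is None:
        return "no corrective action"
    try:
        action()
    except Exception as error:
        # Failures are surfaced so that someone follows up manually.
        notify(f"Corrective action failed for {metric} on {monitored_object}: {error}")
        return "failed"
    return "succeeded"


work_log = []


def purge_u1():
    """Stand-in for a script that frees space under /u1."""
    work_log.append("purged old files under /u1")


def extend_u2():
    """Stand-in for a script that fails while extending /u2."""
    raise RuntimeError("no free space left to allocate")


metric = "Filesystem Space Available (%)"
actions = {
    (metric, "/u1", "CRITICAL"): purge_u1,
    (metric, "/u2", "CRITICAL"): extend_u2,
}

notifications = []
result_u1 = handle_alert(actions, metric, "/u1", "CRITICAL", notifications.append)
result_u2 = handle_alert(actions, metric, "/u2", "CRITICAL", notifications.append)
result_u3 = handle_alert(actions, metric, "/u3", "CRITICAL", notifications.append)
print(result_u1, result_u2, result_u3)
print(notifications)
```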
Managing Alerts

3. Clear old log-based alerts regularly
Think about operational practice for regularly clearing old, resolved alerts
[Diagram: Alert Log -> Alert raised -> DBA Fixes Issues -> Alerts Auto-Cleared After N Days]
Automated:
Automate using duration-based notification rules
Tip: Create a separate notification rule for this
Do not combine with rules for sending notifications
Manual / On Demand:
Manually via EMCLI clear_stateless_alerts
Bulk clears stateless alerts for a target
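The age-based selection behind this practice can be sketched in Python. The alert dictionaries, the `stateless` flag, and the function name are assumptions for illustration; this models only the "older than N days" filter and does not call EMCLI.

```python
from datetime import datetime, timedelta


def split_old_stateless_alerts(alerts, now, older_than_days):
    """Separate stateless alerts raised at least N days ago from the rest."""
    cutoff = now - timedelta(days=older_than_days)
    to_clear, to_keep = [], []
    for alert in alerts:
        if alert["stateless"] and alert["raised"] <= cutoff:
            to_clear.append(alert)
        else:
            to_keep.append(alert)
    return to_clear, to_keep


now = datetime(2010, 9, 23)
alerts = [
    {"id": 1, "metric": "Alert Log", "stateless": True, "raised": datetime(2010, 8, 1)},
    {"id": 2, "metric": "Alert Log", "stateless": True, "raised": datetime(2010, 9, 20)},
    {"id": 3, "metric": "Tablespace", "stateless": False, "raised": datetime(2010, 7, 1)},
]

to_clear, to_keep = split_old_stateless_alerts(alerts, now, older_than_days=30)
print([a["id"] for a in to_clear], [a["id"] for a in to_keep])
```

Only the old log-based alert is selected; the recent one and the threshold-based alert, which clears when its metric recovers, are left alone.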
Managing Alerts

Auto-clearing log-based alerts using notification rules


First specify the duration-based condition (Alert Log alerts opened for at least 30 days)
Managing Alerts

Auto-clearing log-based alerts using notification rules

Then choose Clear Alert action


Managing Alerts

4. Perform proactive monitoring using the System Dashboard
Shows overall health of your group using universal colors of alarm
Use System Dashboard at any level in group hierarchy:
Highest level visibility into status/performance of ALL targets
Lower level details of alerts for specific group
[Diagram: group hierarchy FINANCE > PRODUCTION, DEVT > A, B, C, D, E, with a System Dashboard (overview) at the top level and a System Dashboard (details) at a lower level]
Managing Alerts

System Dashboard
Include metrics showing overall health
To help triage/assign the alert, add operational data using target properties (e.g. Owner Contact, Application Supported, etc.)
Latest comment for alert is shown. Use this to:
Display ticket ID
Show alert ownership
[Screenshot: System Dashboard showing Target Status, added operations data, and Alert Details]
More Monitoring Tips in the Appendix

Refer to Appendix for information on additional topics:


Choosing metrics and thresholds for alerting
Customizing email to add more operational context
Practices for setting up notification rules
Sending alert reminders using repeat notifications
Escalations through email
Did my fix resolve the alert?
Is this old alert still valid?
Are my targets following my monitoring standards?
Do I have correct notification coverage for my target's alerts?
Changing dbsnmp credentials across many databases
Benefits: Enterprise Manager Monitoring
Enabling value through best practices

Reduce manual tasks
Improve administrator productivity
Minimal effort to scale as enterprise grows
Flexible to meet monitoring and security requirements
Enables IT to meet service goals
Standardized approach to monitoring
Manage More with Less
Oracle Enterprise Manager 11g
Resource Center
Access Videos, Webcasts, White Papers, and More
Oracle.com/enterprisemanager11g
Appendix
Setting up Monitoring: Tips and Traps
Choosing metrics and thresholds for alerting

Default thresholds may over-alert


Some defaults designed for PRODUCTION use cases
Use template with adjusted thresholds to apply to DEV and TEST

Choose metrics for alerting carefully:


Time-based metrics superior for performance
Base resource consumption (CPU, I/O, Memory)
Workload or application-specific metrics or health checks

Threshold values
Use metric history to analyze value ranges
Be conservative with critical thresholds:
Reserve CRITICAL for high signal of serious problem
Database Performance Metrics

#1 Metric: Average Active Sessions


Measures active load on database instance
Sudden high spikes usually mean severe performance issue

Use Adaptive Thresholds:


Sets thresholds automatically and adjusts for workload cycles
Warn at 0.99 significance (measured sample statistic)
Critical at 0.9999 significance (estimated high significance value)

11g: New Adaptive Thresholds user interface


Supports threshold what-if analysis over recent history
Organizes metrics into Classes
Located under Baseline Metric Thresholds in Grid/Database Control
Setting Up Monitoring: Beyond the Basics
Additional tips for setting up notifications

Customize email format to add more operational context
Add target properties (Line of Business, Owner, Contact, etc.) in the email to provide additional operational information
Practices for setting up Notification Rules
Designate users (Senior Administrators) to create
rules on behalf of the team
Common use cases
Rules for production targets different than rules for non-production targets
Separate rules by Line-of-Business / team
Use naming convention (e.g. include team name in rule)
Facilitates searching for rules
Setting Up Monitoring: Beyond the Basics
Additional tips for setting up notifications

Send alert reminders using Repeat Notifications
Set the global defaults to the least frequent interval used and use this in most rules
Example: Global setting: Repeat every 30 minutes up to a max of 3 repeats
For rules that include important targets or critical alerts, set to higher repeat frequency
Example: For target down rule(s): repeat every 5 minutes up to a max of 10 repeats
Usage Note: Users need to acknowledge the alert in the Enterprise Manager console to stop repeat notifications
Remember to provide Manage Target Alerts privilege to your operators
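The two example settings can be compared with a small Python sketch of a repeat schedule. It assumes one initial notification followed by repeats at a fixed interval, and that acknowledging the alert stops any repeat due at or after the acknowledgement time; the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def notification_times(raised, interval_minutes, max_repeats, acknowledged_at=None):
    """Return the initial notification time plus each repeat that would be sent."""
    times = [raised]
    for repeat in range(1, max_repeats + 1):
        due = raised + timedelta(minutes=interval_minutes * repeat)
        if acknowledged_at is not None and due >= acknowledged_at:
            break  # acknowledging the alert stops further repeats
        times.append(due)
    return times


raised = datetime(2010, 9, 23, 9, 0)

# Global default: every 30 minutes, up to 3 repeats, never acknowledged.
global_default = notification_times(raised, 30, 3)

# Target down rule: every 5 minutes, up to 10 repeats, acknowledged after 12 minutes.
target_down = notification_times(
    raised, 5, 10, acknowledged_at=raised + timedelta(minutes=12)
)

print([t.strftime("%H:%M") for t in global_default])
print([t.strftime("%H:%M") for t in target_down])
```

The acknowledgement cutoff is why operators need the Manage Target Alerts privilege: without it they cannot stop the reminders.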
Setting Up Monitoring: Beyond the Basics
Additional tips for setting up notifications

Escalate unattended, important alerts via email
Send email to different person (e.g. manager level) if alert is open too long
To set up this rule:
Create new notification rule and put a duration condition associated with the alert
Rule action: Send email to the manager
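The duration condition amounts to a simple check, sketched below in Python. The 12-hour limit mirrors the example escalation rule in this deck; the function name and timestamps are hypothetical, and the sketch models only the decision, not how the rule is configured.

```python
from datetime import datetime, timedelta


def should_escalate(opened_at, now, acknowledged, max_open=timedelta(hours=12)):
    """True when an alert has stayed open too long without being acknowledged."""
    return (not acknowledged) and (now - opened_at) > max_open


opened_at = datetime(2010, 9, 22, 18, 0)

print(should_escalate(opened_at, datetime(2010, 9, 23, 5, 0), acknowledged=False))
print(should_escalate(opened_at, datetime(2010, 9, 23, 7, 0), acknowledged=False))
print(should_escalate(opened_at, datetime(2010, 9, 23, 7, 0), acknowledged=True))
```

Only the second case, open for 13 hours and still unacknowledged, would send email to the manager.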
Notification Rule for Alert Escalation via E-mail

In Metrics tab, define duration condition for the alert (Apply rule if alert opened > 12 hours and not acknowledged)
... then in Actions tab, send email to the manager
Managing Alerts: Other Tips

How do I know if my fix resolved the alert?
Use Reevaluate Alert feature
Alternative to waiting for next metric evaluation
Causes the agent to reevaluate the metric alert
Current severity will be provided
Usage: requires 10.2.0.5 agent or higher
Managing Alerts: Other Tips

Is this alert still valid?
Enterprise Manager does not change the alert triggered date
Validation shown in the Console:
Last Collected Value
Last Collected Timestamp
Example: Alert triggered: June 18, 2010; Last Collected: June 29, 2010
Monitoring: Ongoing Maintenance

Are my targets still following my monitoring standards?
Generate report using Monitoring Template Comparison reporting element and/or
Use Compare Settings feature in Monitoring Templates page
Shows differences between monitoring template and target's settings
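Conceptually, the comparison is a per-metric diff between the template and what is set on the target. The Python sketch below illustrates that idea only; the dictionary layout, metric names, and threshold values are assumptions, and the product's own report is what should be used in practice.

```python
def compare_settings(template, target_settings):
    """Return {metric: (template_value, target_value)} for every mismatch."""
    differences = {}
    for metric, expected in template.items():
        actual = target_settings.get(metric)  # None when the target lacks the metric
        if actual != expected:
            differences[metric] = (expected, actual)
    return differences


# (warning, critical) thresholds; the values are illustrative.
template = {"Tablespace": (75, 90), "Archive Area %": (70, 80)}
findb1 = {"Tablespace": (75, 90), "Archive Area %": (80, 95)}

print(compare_settings(template, findb1))
```

An empty result means the target still follows the standard encoded in the template.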
Monitoring: Ongoing Maintenance

Report using the Monitoring Template Comparison element

Differences are highlighted
Monitoring: Ongoing Maintenance

Do I still have correct notification coverage for my target?
Notification Rule Coverage report (per target)
For each metric contained in a rule:
Alert severities covered
Rule(s), if any
Type of notification
Shows alert-able metrics not covered in any rule
Potential missed notification
Monitoring: Ongoing Maintenance

What's the easiest way to change monitoring credentials (e.g. dbsnmp) across many databases?
EMCLI update_db_password
Changes password associated with the user in Enterprise Manager and database target
Changes the password across all features that use it:
Preferred credentials, Corrective Actions, Jobs, User-defined metrics, target monitoring credentials
Usage tip: Blackout the target during this operation to avoid metric collection errors due to invalid password
Q&A
