August 2013
Contents
Send Us Your Comments
Preface
    Document Purpose
    Audience
    Document Structure
    How to Use This Document
    Related Documents
    Conventions
1 Introduction
2 Common Management & Monitoring Standards
    2.1 IP Standards
        2.1.1 Simple Network Management Protocol
    2.2 Java™ Standards
4 Conceptual View
        4.5.4 Transaction Monitoring
        4.5.5 Patch Monitoring
        4.5.6 Environment Analysis
        4.5.7 Configuration Change Detection
        4.5.8 Policy Violation Detection
        4.5.9 User Experience Monitoring
        4.5.10 System Monitoring
    4.6 Integration
        4.6.1 Alert & Notification Integration
        4.6.2 Extensibility Framework
        4.6.3 Data Exchange
    4.7 Management Repository
        4.7.1 Monitoring Templates
        4.7.2 Job Library
        4.7.3 Software Library
        4.7.4 Policy Library
        4.7.5 Service Level Rules
        4.7.6 Corrective Action
        4.7.7 Historical Monitoring Data
        4.7.8 Deployment Procedures
        4.7.9 Reports
        4.7.10 Configurations
5 Logical View
    5.1 Logical Tiers
        5.1.1 Client Tier
        5.1.2 Management Tier
        5.1.3 Managed Target Tier
    5.2 Detailed Logical View
        5.2.1 Managed Target Tier
            5.2.1.1 Collection Manager, Collection Engine
            5.2.1.2 Job Executor
        5.2.2 Management Tier
            5.2.2.1 Resource Monitor
            5.2.2.2 Service Monitor
            5.2.2.3 System Monitor
            5.2.2.4 Composite Application Monitor
            5.2.2.5 End User Experience Monitor
            5.2.2.6 Configuration Change Monitor
            5.2.2.7 Alert Manager
            5.2.2.8 Job System
            5.2.2.9 Provisioning Engine
6 Product Mapping
    6.1 Products
    6.2 Product Mapping
7 Deployment View
8 Summary
Send Us Your Comments
Oracle welcomes your comments and suggestions on the quality and usefulness of this
publication. Your input is an important part of the information used for revision.
If you find any errors or have any other suggestions for improvement, please indicate
the title and part number of the documentation and the chapter, section, and page
number (if available). You can send comments to us at its_feedback_ww@oracle.com.
Preface
Some of the most talked-about concerns within IT operations today involve the need to
make enterprise computing more ubiquitous and agile, and the requirement to better
align with and support the needs of the business.
Many IT organizations currently use a variety of traditional IT management and
monitoring tools, such as event managers, network managers and help desk systems,
to monitor and manage their IT environment. However, as companies deploy
emerging computing strategies such as Service-Oriented Architectures (SOA),
Business Process Management (BPM), and Cloud Computing, which are designed to
make functions, processes, information, and computing resources more available, the
inadequacies of these traditional tools are being highlighted.
Traditionally, different stakeholders within an IT organization have used different
siloed IT management and monitoring tools, which have lent themselves to a more
bottom-up approach to IT management whereby the focus has been on the status of
individual low-level infrastructure components. In addition, these emerging
computing strategies represent an ongoing shift from locked-down, siloed,
monolithic applications to highly distributed and shared computing environments,
which makes the management and monitoring of the modern IT environment
more challenging and complex.
This shift in the IT environment increases the need to make holistic IT operational
decisions, perform root cause analysis, share information between the various
stakeholders, and manage IT with the end-user experience in mind.
There is a need to supplement an enterprise's existing bottom-up approach and tooling
with a more business-aligned, top-down approach and tooling. This enables a more
holistic approach, with managed dependencies, to the entire IT environment, which
facilitates improved information sharing, superior diagnostics and root cause analysis,
and the realization of service level management.
Document Purpose
This document provides a reference architecture for designing a management and
monitoring framework to address the needs of the modern IT environment. This
document does not cover the more traditional aspects of IT management and
monitoring such as database and network management but covers key areas that
should be considered when supplementing an existing management and monitoring
approach.
Audience
This document is intended for IT Operation architects, administrators and enterprise
architects. The material is designed for a technical audience that is interested in
learning about the intricacies of management and monitoring and how infrastructure
can be leveraged to satisfy the management and monitoring needs. In-depth
knowledge or specific expertise in management and monitoring fundamentals is not
required.
Document Structure
This document is organized into chapters that introduce management and monitoring
concepts, standards, and architecture views.
The first chapter provides background on management and monitoring and is
intended to give the novice reader an understanding of the needs and challenges of a
modern IT environment.
The next two chapters provide a primer on key management and monitoring
capabilities and common industry management and monitoring standards. These
chapters are intended to give the novice reader an understanding of key concepts for a
management and monitoring framework.
The remaining chapters describe a reference architecture for a management and
monitoring framework. The framework is presented using a set of common
viewpoints which include conceptual, logical, and deployment views. The architecture
is also mapped to Oracle products.
Related Documents
IT Strategies from Oracle (ITSO) is a series of documentation and supporting collateral
designed to enable organizations to develop an architecture-centric approach to
enterprise-class IT initiatives. ITSO presents successful technology strategies and
solution designs by defining universally adopted architecture concepts, principles,
guidelines, standards, and patterns.
ORA Management & Monitoring is one of the series of documents that comprise
Oracle Reference Architecture. ORA Management & Monitoring describes important
aspects of the Enterprise Management layer pertaining to the holistic monitoring and
management of resources such as business solutions, SOA Services, and application
infrastructure.
Please consult the ITSO web site for a complete listing of ORA documents as well as
other materials in the ITSO series.
Conventions
The following typeface conventions are used in this document:
Convention        Meaning
boldface text     Boldface type in text indicates a term defined in the text, the ORA
                  Master Glossary, or in both locations.
italic text
underline text
1 Introduction
A common thread running through many services and systems is the ability to
monitor and manage assets in a consistent and efficient manner. This ORA Monitoring
and Management document offers a framework for OA&M to rationalize these
capabilities and help optimize the operational aspects of enterprise computing.
This chapter provides background on the key drivers pushing IT
operations to consider evolving their current IT management and monitoring
environment. These drivers are influenced by organizations adopting enterprise
technology strategies such as SOA, BPM, and EDA, which warrant new management
capabilities. Therefore, this chapter does not cover traditional management and
monitoring capabilities such as network management.
managed and monitored by the IT operations team. The cost of managing large sets of
infrastructure components has increased linearly, or more, with each new
infrastructure component added to the enterprise. Conventional management and
monitoring tools struggle with both cost containment and the pressure to maintain
such a large number of infrastructure components.
Administrator productivity has taken a hit as the scale and complexity of the IT
environment increases. Administrators are now responsible for far more infrastructure
components and the relationships between the infrastructure components are much
too complicated to track manually. Firewalls, load-balancers, application servers,
service buses, shared services, composite applications, and clusters are all distributed
and connected through complex rules.
As businesses rely on IT more and more, they can lose revenue on an hourly basis if
their IT infrastructure cannot handle the load placed on it by their customers. In
addition, infrastructure components are becoming more distributed, complex, and
virtual.
Therefore, administrators require management and monitoring tools that enable the
quick deployment and configuration of resources in both a horizontal and vertical
manner whilst detecting and overcoming human error.
Conventional management and monitoring tools cannot increase access to
resources/services or automatically provision them based on current demand
conditions. Conventional approaches therefore leave a management and visibility gap
and do not fully support today's management and provisioning needs.
2 Common Management & Monitoring Standards
This chapter introduces some of the most common management & monitoring
standards available today. This is not an exhaustive list of everything that pertains to
management & monitoring, but rather a look at many of the most widely adopted
standards that support a modern computing environment. The following sections
provide a brief overview of each standard.
Figure 2-1 Management & Monitoring Standards
2.1 IP Standards
2.1.1 Simple Network Management Protocol
Simple Network Management Protocol (SNMP) is a well-known and popular
protocol for network management. It is utilized for collecting information from and
configuring network devices such as servers, printers, hubs, switches, and routers on
An SNMP Manager will learn of problems by receiving traps or change notices from
network devices implementing SNMP. SNMP uses protocol data units to send
information between management applications and agents distributed in the network.
This information is in the form of a standard Management Information Base (MIB)
which describes all objects that are managed by SNMP management applications. The
agents supply or change the values of MIB objects, as requested by the management
applications.
More information about SNMP can be found at: http://www.ietf.org/
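As a rough illustration of the manager and agent roles described above, the following sketch models an agent's MIB as an in-memory mapping from object identifiers to values. It is not an SNMP implementation: the `ToyAgent` class and its methods are hypothetical, no protocol data units are encoded, and nothing is sent over a network. The two object identifiers are the standard MIB-II `sysName.0` and `sysLocation.0` objects; the values are made up.

```python
class ToyAgent:
    """Hypothetical stand-in for an SNMP agent; holds MIB objects in memory."""

    def __init__(self, mib):
        self._mib = dict(mib)  # OID string -> current value

    def get(self, oid):
        """Supply the value of a MIB object, as a GET request would ask for."""
        return self._mib[oid]

    def set(self, oid, value):
        """Change the value of an existing MIB object, as a SET request would."""
        if oid not in self._mib:
            raise KeyError(f"no such object: {oid}")
        self._mib[oid] = value


SYS_NAME = "1.3.6.1.2.1.1.5.0"      # MIB-II sysName.0
SYS_LOCATION = "1.3.6.1.2.1.1.6.0"  # MIB-II sysLocation.0

# The management application reads one object and changes another.
agent = ToyAgent({SYS_NAME: "router-01", SYS_LOCATION: "lab"})
agent.set(SYS_LOCATION, "datacenter-2")
```

A real deployment would use an SNMP library and the device's published MIB modules rather than a hand-built dictionary.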
Java™ Standards
Within JMX, one or more Java objects known as Managed Beans (MBeans) instrument
a given resource. These MBeans are registered in a core managed object server, known
as an MBean server, which acts as a management agent and can run on most devices
enabled for the Java programming language. JMX agents directly control resources
and make them available to remote management applications.
JMX also defines standard connectors (JMX connectors) that allow access to JMX
agents from remote management applications. JMX connectors using different
protocols provide the same management interface. Hence a management application
can manage resources transparently, regardless of the communication protocol used.
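The registration pattern described above can be sketched conceptually as follows. This is a language-neutral analogue written in Python, not the `javax.management` API: `ToyMBeanServer`, `QueueStats`, and the object name `example:type=QueueStats` are all hypothetical. It only shows a managed object being registered under a name and then read through the registry rather than directly.

```python
class ToyMBeanServer:
    """Hypothetical registry of managed objects (conceptual analogue only)."""

    def __init__(self):
        self._beans = {}  # object name -> managed object

    def register(self, name, bean):
        if name in self._beans:
            raise KeyError(f"already registered: {name}")
        self._beans[name] = bean

    def get_attribute(self, name, attribute):
        """Read an attribute of a registered object by its registered name."""
        return getattr(self._beans[name], attribute)


class QueueStats:
    """Managed object that instruments a (hypothetical) message queue."""

    def __init__(self):
        self.depth = 0


server = ToyMBeanServer()
stats = QueueStats()
server.register("example:type=QueueStats", stats)
stats.depth = 7  # the resource changes; the registry exposes the new value
```

In JMX itself, remote access to the registry goes through a connector, so the management application never holds a direct reference such as `stats`.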
2.3.2 WS-Policy
The goal of WS-Policy is to provide the mechanisms needed to enable Web Services to
specify policy information. It provides a flexible and extensible XML grammar for
expressing the capabilities, requirements, and general characteristics of Web Services.
WS-Policy defines a policy to be a collection of policy alternatives, where each policy
alternative is a collection of policy assertions. Some assertions pertain to functional
capabilities, such as security or protocol requirements, while others are
non-functional, such as QoS characteristics. WS-Policy relies on other specifications,
such as WS-PolicyAttachment, to describe discovery and attachment scenarios.
WS-SecurityPolicy is one example of a specific policy definition specification.
More information on WS-Policy can be found at:
http://www.w3.org/Submission/WS-Policy/
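The structure of a policy as a collection of alternatives, each a collection of assertions, can be sketched as nested collections. The example below is a minimal sketch that works on plain strings instead of the XML grammar; the assertion names and the `supported_alternatives` helper are hypothetical. It assumes a requester supports an alternative only when it supports every assertion in it, and supports the policy when at least one alternative matches.

```python
def supported_alternatives(policy, capabilities):
    """Return the alternatives whose assertions are all in `capabilities`."""
    return [alt for alt in policy if alt <= capabilities]


# A policy is a list of alternatives; each alternative is a set of assertions.
# The assertion names are illustrative labels, not real WS-Policy QNames.
policy = [
    frozenset({"transport-security", "username-token"}),
    frozenset({"message-signing", "x509-token"}),
]

client = {"message-signing", "x509-token", "reliable-messaging"}
matches = supported_alternatives(policy, client)
```

Here the client satisfies the second alternative only, which is enough for it to satisfy the policy as a whole.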
2.3.3 WS-PolicyAttachment
WS-PolicyAttachment defines two general-purpose mechanisms for associating
policies with the subjects to which they apply. Policies may be defined as part of existing
metadata about the subject (e.g., attached to the service definition WSDL), or defined
independently and associated through an external binding (e.g., referenced to a UDDI
entry). As such, the specification describes the use of policies with WSDL 1.1, UDDI
2.0, and UDDI 3.0.
More information on WS-PolicyAttachment can be found at:
http://www.w3.org/Submission/WS-PolicyAttachment/
2.3.4 WS-SecurityPolicy
WS-SecurityPolicy defines a set of security policy assertions for use with the
WS-Policy framework with respect to security features provided in WS-Security,
WS-Trust, and WS-SecureConversation. It defines a base set of assertions that describe
how messages are to be secured. It is meant to be flexible with respect to token types,
algorithms, and mechanisms used, in order to allow for evolution over time.
between control requirements, technical issues, and business risks. COBIT enables
clear policy development and good practice for IT control throughout organizations.
COBIT emphasizes regulatory compliance, helps organizations increase the value
attained from IT, enables alignment, and simplifies implementation of the COBIT
framework.
More information on COBIT can be found at: http://www.isaca.org/
2.4.3 Sarbanes-Oxley
Sarbanes-Oxley (SOX) is a United States federal law enacted as a reaction to a number of
major corporate and accounting scandals. The legislation set new or enhanced
standards for all U.S. public company boards, management, and public accounting
firms.
Sarbanes-Oxley contains 11 titles that describe specific mandates and requirements for
financial reporting.
The text of the law can be found at:
http://frwebgate.access.gpo.gov/cgibin/getdoc.cgi?dbname=107_cong_bills&docid=f:h3763enr.tst.pdf
Control Objectives
    Build and Maintain a Secure Network
    Maintain a Vulnerability Management Program
    Maintain an Information Security Policy
The standard assists enterprises that process card payments to prevent credit card
fraud through increased controls around data and its exposure to compromise. The
3 Key Management & Monitoring Capabilities
This chapter introduces a number of key concepts and capabilities that pertain to
addressing the management and visibility gap when managing within a highly
distributed and shared computing environment.
These concepts and capabilities supplement the conventional bottom-up approach to
management and monitoring. They address aspects of a top-down management and
monitoring approach to delivering the highest quality of service for all types of
infrastructure components (See Figure 3-1, "Key Capabilities for a Unified
Management Infrastructure"). These key capabilities are complementary in nature to
each other and should not be seen as individual standalone capabilities.
Figure 3-1 Key Capabilities for a Unified Management Infrastructure
Service Management
performance, and service compliance. See Figure 3-2, "Service Management Phases"
for the high-level phases of Service Management.
Figure 3-2 Service Management Phases
3.1.1 Service
In the context of management and monitoring, a "Service" is a defined entity that
exposes a useful business and/or IT function to its consumers.
Note: The definition of "Service" within the context of management
and monitoring is broader in scope than SOA Services (also known as shared
services). The relationship between these constructs is represented in
Figure 3-3, "Concept: Service".
Figure 3-3 Concept: Service
Figure 3-3, "Concept: Service" above shows some example service types such as SOA
Service and Application. In addition, Services can be grouped into higher-level logical
Services called Aggregate Services. A Service may have an associated Service Level
Agreement (SLA) which establishes the goals for Service levels around availability,
performance, and usage.
Service Management enables the definition of the Service, which includes the modeling
and mapping of the System on which the Service relies. This Service modeling
enables intelligent root cause diagnostics through the entire stack to pinpoint any
offending infrastructure component.
3.1.2 System
A System is a logical grouping of hardware and software infrastructure components
that collectively support one or more Services.
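The relationship between a Service, the System it relies on, and the underlying infrastructure components can be sketched as a small data model. This is only an illustration under simple assumptions: the class names, the `healthy` flag, and the example component names are hypothetical, and real root cause diagnostics would weigh far more than a single up/down flag.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    healthy: bool = True


@dataclass
class System:
    """Logical grouping of components that collectively support a Service."""
    name: str
    components: list = field(default_factory=list)


@dataclass
class Service:
    name: str
    system: System

    def offending_components(self):
        """Walk the modeled stack and name every unhealthy component."""
        return [c.name for c in self.system.components if not c.healthy]


system = System("order-system", [
    Component("load-balancer"),
    Component("app-server", healthy=False),
    Component("database"),
])
service = Service("order-entry", system)
```

Because the Service is mapped to its System, a failure report for `order-entry` can point at `app-server` rather than leaving the administrator to search each tier by hand.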
As well as defining service levels, the underlying infrastructure components may have
a number of policies applied against them. Service Management enables policies to be
defined centrally and then propagated to the appropriate enforcement points
that govern infrastructure operations. See Section 3.5, "Policy Management" for
more details.
In addition to trend analysis, a key part of Service Management is actively monitoring
and reporting service level achievements against goals over a defined period of time.
Dashboards provide an accurate measure of the availability, performance, usage, and
compliance of the critical business Services which ensures that the line of business
executives are getting what they need from IT to ensure the productivity of their
people.
In addition, by constantly monitoring the service levels, IT organizations can identify
problems and their potential impact, diagnose root causes of Service failure, and fix
these in compliance with the service level agreements.
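Reporting service level achievement against a goal over a defined period can be sketched with a simple availability calculation. The sketch assumes that availability is sampled at regular intervals and recorded as up or down; the function names, the sample data, and the goal values are hypothetical.

```python
def availability_percent(samples):
    """Percentage of checks in the reporting period where the Service was up."""
    if not samples:
        raise ValueError("no samples in the reporting period")
    return 100.0 * sum(samples) / len(samples)


def meets_goal(samples, goal_percent):
    """Compare achieved availability with the service level goal."""
    return availability_percent(samples) >= goal_percent


# 100 checks in the period, 2 of which found the Service down.
checks = [True] * 98 + [False] * 2
achieved = availability_percent(checks)
```

A dashboard would typically show the achieved figure next to the goal and flag the Service when `meets_goal` returns False.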
Performance Management
Lifecycle Management
A transaction perspective is used to test performance and availability from remote
user locations. Important business activities are recorded as transactions, which are
then used to test the availability and performance of a Service. This gives insight into
issues experienced by real end users and allows work on a resolution to begin before end
users start complaining. The benefits include lower support costs through reduced call
center volumes, faster resolution of problems in poorly performing applications, and the
ability to adapt to changing needs through insight into business activity and user preferences.
A Service can also be monitored from an infrastructure component perspective, which
focuses on the underlying infrastructure components that support the Service. The
infrastructure components that are critical to running a Service are designated as key
infrastructure components, which are used to determine the performance and
availability of the Service.
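One way to derive Service availability from key infrastructure components is sketched below. It assumes a simple rule, which the text does not prescribe: the Service counts as available only when every key component is up, and components not designated as key are ignored. The function and component names are hypothetical.

```python
def service_is_available(component_status, key_components):
    """True only if every key component is currently up (assumed rule)."""
    return all(component_status[name] for name in key_components)


# Current up/down status of each monitored component.
status = {"app-server": True, "database": True, "report-cache": False}

# Only the app server and database are designated as key components here.
available = service_is_available(status, {"app-server", "database"})
```

Other rules are possible, for example treating the Service as available when at least one member of a cluster is up.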
Another important perspective is to record every user session and report on real user
traffic requested by, and generated from, the network. This perspective measures the
response times of pages and transactions at the most critical points within the network infrastructure.
Powerful session statistics and diagnostics can then be the basis of effective business
and operational decisions as well as an aid to perform root-cause analysis.
Configuration Management
Policy Management
Conformance is assessed by way of defining policies that provide rules against which
managed infrastructure components are evaluated. For example, an identity
management solution can provide a mechanism for implementing the user
management aspects of a corporate policy, as well as a means to audit users and their
access privileges.
3.5.1 Policy
A policy defines the desired behavior and is associated with one or more
infrastructure components. Policies fall into different categories, such as
configuration, security, and management rules. (See Figure 3-9, "Policy Types".)
A policy can map directly to, and support, an industry standard such as SOX, PCI,
COBIT, or ITIL, which helps ensure that an IT organization is adhering to the standard.
Policies are distributed to the appropriate policy enforcement points using common
approaches such as gateways and agents. These policies are monitored/assessed for
compliance and if infrastructure components fall out of compliance, remedial action
can bring the infrastructure component back into compliance.
Detailed compliance reporting highlights the infrastructure components that are in
and out of compliance and details any deviations. This enables administrators to take
action quickly and address the high impact items to improve the compliance score.
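The compliance assessment described above can be sketched as a set of rules evaluated against each managed component. The policy names, the component attributes, and the scoring formula (percentage of checks passed) are assumptions made for illustration; the source does not define how a compliance score is calculated.

```python
def evaluate(policies, components):
    """Return (score_percent, violations) across all policy checks."""
    violations = []
    total = 0
    for comp in components:
        for policy_name, rule in policies.items():
            total += 1
            if not rule(comp):
                violations.append((comp["name"], policy_name))
    score = 100.0 * (total - len(violations)) / total if total else 100.0
    return score, violations


# Each policy is a rule that a component either satisfies or violates.
policies = {
    "tls-enabled": lambda c: c["tls"],
    "min-patch-level": lambda c: c["patch_level"] >= 3,
}
components = [
    {"name": "app-server-1", "tls": True, "patch_level": 4},
    {"name": "app-server-2", "tls": False, "patch_level": 2},
]
score, violations = evaluate(policies, components)
```

The list of violations corresponds to the deviations a compliance report would detail, and the score gives administrators a single figure to track as they remediate.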
staging areas, security requirements, etc., has highlighted the need to approach
management by way of groups and the use of job automation.
3.6.1 Group
A group is a logical collection of hardware, software, network, and other
infrastructure components, which tends to reflect administrative groupings. This
grouping enables stakeholders to manage and monitor many infrastructure
components as one. A group can include infrastructure components of the same type
or of different types. In large enterprises, groups
can also contain other groups. For example, a system administrator may have
responsibility for the finance and human resources departments' application servers
and service buses. Defining an administrative group that includes these
infrastructure components enables a holistic management and monitoring approach
and forms part of an approach to delegated administration. A group must not be
confused with a system, which was previously defined as a logical grouping of
hardware and software infrastructure components that collectively support one or
more Services.
Figure 3-10 Concept: Group
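Groups that contain other groups form a tree, and managing the group "as one" means resolving that tree to its member components. The following is a minimal sketch of that idea; the `Group` class and the component names are hypothetical, and the sketch assumes that groups never contain themselves, directly or indirectly.

```python
class Group:
    def __init__(self, name, members=()):
        self.name = name
        # Members are component names (str) or nested Group objects.
        self.members = list(members)

    def components(self):
        """Flatten nested groups into a sorted list of unique component names."""
        found = set()
        for member in self.members:
            if isinstance(member, Group):
                found.update(member.components())
            else:
                found.add(member)
        return sorted(found)


finance = Group("finance", ["fin-app-server", "fin-service-bus"])
hr = Group("hr", ["hr-app-server", "hr-service-bus"])

# One administrative group covering both departments.
admin = Group("fin-hr-admin", [finance, hr])
```

Calling `admin.components()` gives the administrator a single list covering both departments, which is the basis for applying one operation to many components.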
3.6.2 Job
A job is a defined unit of work that automates commonly-run tasks. Jobs enable
automation for routine circumstances such as when the number of infrastructure
component instances needs to be increased or decreased to accommodate changes in
load.
Jobs can be scheduled to start immediately or start at a later date and time and can be
submitted to individual targets or against a group. Any job that is submitted to a
group is automatically extended to all its members and takes into account the
membership of the group as it changes. Having a single console as a central point of
control and the use of Groups allows administrators to perform common
administrative and monitoring tasks.
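The behavior of a job submitted against a group, including membership changes after submission, can be sketched as follows. The `Job` class, the restart action, and the target names are hypothetical. The key assumption is that group membership is read when the job runs rather than when it is created.

```python
class Job:
    """A defined unit of work applied to every member of a target group."""

    def __init__(self, name, action, targets):
        self.name = name
        self.action = action    # callable applied to each target
        self.targets = targets  # live membership list, read at run time

    def run(self):
        """Apply the action to the group's members as they are right now."""
        return {target: self.action(target) for target in list(self.targets)}


web_tier = ["web-1", "web-2"]
restart = Job("restart-web-tier", lambda t: f"restarted {t}", web_tier)

web_tier.append("web-3")  # membership changes after the job was submitted
results = restart.run()
```

Because the job holds a reference to the group rather than a copy of its members, `web-3` is included even though it joined after submission. A scheduler would call `run` at the requested date and time.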
A unified infrastructure management solution provides a comprehensive set of
performance and health metrics for all managed components as well as an approach to
use these metrics to be proactive and correct any impending problems with the
environment. See Figure 3-11, "Concept: Metric".
Figure 3-11 Concept: Metric
3.6.3 Metric
A metric is a unit of measurement used to report the health of the system that is
captured from the monitored infrastructure components. Metrics from all monitored
infrastructure components are stored and aggregated in the Management Repository,
providing administrators with a rich source of diagnostic information and trend
analysis data.
3.6.4 Threshold
A metric threshold is a boundary value against which monitored metric values are
compared. The comparison determines whether an alert should be generated. If a
metric crosses a warning or critical threshold, which indicates a potential problem
with the environment, an alert is generated and sent, using one of many delivery
mechanisms, to administrators who have registered interest in receiving such
notifications, for rapid resolution.
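The comparison of metric values with warning and critical thresholds can be sketched as below. The sketch assumes a metric where higher values are worse, such as CPU utilization; the function names, the metric name, and the threshold values are hypothetical, and alert delivery is reduced to building a message string.

```python
def classify(value, warning, critical):
    """Return 'critical', 'warning', or None. Assumes higher values are worse."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return None


def alerts(metric, samples, warning, critical):
    """Build one alert message for every sample that crosses a threshold."""
    messages = []
    for value in samples:
        severity = classify(value, warning, critical)
        if severity:
            messages.append(f"{severity}: {metric}={value}")
    return messages


cpu_alerts = alerts("cpu_percent", [42, 81, 97], warning=80, critical=95)
```

The critical threshold is checked first so that a value crossing both thresholds is reported once, at the higher severity.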
4 Conceptual View
Architecture Principles
Each principle has a Statement, Rationale, and Implications.
    Principle: Standards-based Integration
    Principle: Extensible
    Principle: Service Aware
    Principle: Discoverable
    Principle: Externalize Management
    Principle: Proactive
    Principle: Compliant
data and metadata. See the sub-systems illustrated in Figure 4-1, "High-level
Conceptual View".
Figure 4-1 High-level Conceptual View
The high-level conceptual view highlights user interaction capabilities that allow the
appropriate rendering of information into views that support comprehensive analysis,
while at the same time being able to manage the environment from anywhere by
supporting multiple devices such as browser, mobile, and portal.
Conceptually management and monitoring capabilities are viewed as two sets of
capabilities. This assists with defining capabilities utilizing the 'Separation of
Concerns' principle. The Management capabilities focus on consolidating
administration tasks for a variety of infrastructure components, while the monitoring
capabilities focus on allowing enterprises to define, model, capture, and consolidate
monitoring information into a single framework.
A management and monitoring framework requires the ability to integrate and
interact with existing heterogeneous IT management environments to enable the
consolidation and centralization of all management activities and monitoring
information in a central place. This allows the framework to streamline the correlation
of availability and performance problems across an entire set of IT infrastructure
components, by eliminating the need to compile critical information from many
different tools.
While management and monitoring benefits from consolidation and centralization,
there are a number of key areas that might not be eliminated due to these efficiencies.
Examples are:
an efficient and effective means of administration and at the same time supports a
unified management platform. See ORA Security document for more details.
Infrastructure components such as applications, Services, and policies have an
associated lifecycle which covers not only the operational aspects but also
development aspects such as development, testing, and packaging. This means that
management capabilities such as performance and availability reporting, and
administration must be available as Services are developed and deployed. Therefore a
management and monitoring framework intersects with the engineering framework to
make sure that all components, infrastructure, and metrics are in sync, especially when
it comes to migrating between environments and the eventual deployment of these
components into production. See ORA Engineering document for more details.
To address these needs the management and monitoring framework requires access to
a logical centralized storage of enterprise configuration information as this lays the
foundation for defining, deploying, auditing, enforcing, and maintaining the systems.
The diagram below (Figure 4-2, "Detailed Conceptual View") expands on this concept
by including some example capabilities for each of the major parts highlighted above.
Figure 4-2 Detailed Conceptual View
User Interaction
4.3.1 Administration
Administration enables the ability through a single console to manage and monitor the
entire environment, including all infrastructure components such as applications,
Services, and operating systems. As well as managing all infrastructure components it
enables administration tasks to be applied to logically related infrastructure
components. This facilitates administering many infrastructure components as one.
(See Section 4.4.3, "Group Management", Section 4.4.6, "Service Definition" and the
ORA Security document regarding delegated administration.)
The console has the built-in intelligence to understand the characteristics of each
infrastructure component and allow the appropriate administrative tasks. This
approach allows the framework to support new infrastructure component types in the
future.
4.3.2 Dashboard
Dashboards provide "at-a-glance" monitoring of all critical indicators for Services
and other infrastructure components. They offer access to a series of rich, real-time,
customizable, and consolidated views of the IT eco-system with the ability to drill
down. Dashboards present actionable information using intuitive icons and graphics,
which help administrators spot recent changes or issues and identify trends, patterns,
and anomalies.
See the ORA Engineering document for more details regarding quality management.
4.3.4 Query
Query enables the searching of the management and monitoring repository using
pre-defined or ad-hoc queries. For example, an administrator can use this capability to
find all resources with a given configuration. Commonly used user-defined queries
could be stored within the monitoring repository for future use.
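As a minimal sketch of the pre-defined and ad-hoc queries described above, the following Python example searches an in-memory stand-in for the repository. The `Resource` and `Repository` classes, the configuration keys, and the resource names are all hypothetical and are not taken from any product.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical record for one managed resource."""
    name: str
    kind: str
    config: dict = field(default_factory=dict)


class Repository:
    """Hypothetical in-memory stand-in for the monitoring repository."""

    def __init__(self, resources):
        self._resources = list(resources)
        self._saved = {}  # label -> stored query criteria

    def query(self, **criteria):
        """Ad-hoc query: resources whose config matches every criterion."""
        return [
            r for r in self._resources
            if all(r.config.get(k) == v for k, v in criteria.items())
        ]

    def save_query(self, label, **criteria):
        """Store a commonly used query for future use."""
        self._saved[label] = dict(criteria)

    def run_saved(self, label):
        """Run a previously stored query."""
        return self.query(**self._saved[label])


repo = Repository([
    Resource("app-server-1", "application", {"jvm_version": "1.6", "heap_mb": 1024}),
    Resource("app-server-2", "application", {"jvm_version": "1.7", "heap_mb": 2048}),
    Resource("db-1", "database", {"heap_mb": 1024}),
])
repo.save_query("small-heap", heap_mb=1024)
small_heap_names = [r.name for r in repo.run_saved("small-heap")]
```

Storing only the criteria, rather than the result set, means a saved query always reflects the current contents of the repository when it is run.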
4.3.5 Reporting
Reporting and publishing capabilities allow the definition of custom reports that can
be produced as needed or on a defined schedule. The reports present an intuitive
interface to critical decision-making information stored in the Management
Repository, and they should be distributable via several means, such as email and
portal access. For example, a report could be defined that reports on actual Service
levels achieved, helping IT and business to find out whether their Services indeed
function as expected to support business activities.
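The Service level report mentioned above could be sketched as follows. This is an illustrative Python example only: the service names, the availability samples, and the target values are invented, and availability is assumed to be measured as the fraction of successful checks.

```python
def service_level_report(samples, targets):
    """Compare achieved availability with the target for each service.

    samples: {service: [True/False availability checks]}  (assumed input shape)
    targets: {service: required availability as a fraction}
    """
    report = []
    for service in sorted(samples):
        checks = samples[service]
        achieved = sum(checks) / len(checks)  # fraction of successful checks
        report.append({
            "service": service,
            "achieved": achieved,
            "target": targets[service],
            "met": achieved >= targets[service],
        })
    return report


# Hypothetical data: 100 availability checks per service.
samples = {
    "order-entry": [True] * 99 + [False],
    "billing": [True] * 100,
}
targets = {"order-entry": 0.995, "billing": 0.99}
report = service_level_report(samples, targets)
```

Each row records both the achieved level and the target, so the same data can feed an on-demand report or a scheduled one.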
4.4 Management
The capabilities that supplement a conventional bottom-up approach to management
can vary from enterprise to enterprise depending on their current capability set. Below
are a number of key management capabilities that are commonly required:
Policy Violation.
Infrastructure unavailability.
Alert & Notification management makes sense of the events and determines the
appropriate action. This requires the maintenance of notification rules that specify the
alert conditions for which notifications are sent. This includes defining flexible
notification schedules and multiple delivery mechanisms, such as email, pager, SNMP
trap, and execution of custom scripts.
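A notification rule of the kind described above can be sketched as a condition, a schedule window, and a set of delivery mechanisms. The Python example below is a simplified illustration: the `NotificationRule` class, the numeric severity scale, and the channel labels are assumptions, and no messages are actually sent.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class NotificationRule:
    """Hypothetical rule: who is notified, for what, and when."""
    name: str
    min_severity: int   # alert condition: severity at or above this value
    start: time         # notification schedule window start (inclusive)
    end: time           # notification schedule window end (exclusive)
    channels: tuple     # delivery mechanisms, e.g. ("email", "pager")

    def applies(self, severity, at):
        return severity >= self.min_severity and self.start <= at < self.end


def route(rules, severity, at):
    """Return the delivery channels for an alert raised at time `at`."""
    channels = set()
    for rule in rules:
        if rule.applies(severity, at):
            channels.update(rule.channels)
    return sorted(channels)


rules = [
    NotificationRule("business-hours", 2, time(8, 0), time(18, 0), ("email",)),
    NotificationRule("critical-anytime", 4, time(0, 0), time(23, 59, 59),
                     ("pager", "snmp_trap")),
]
daytime_warning = route(rules, 3, time(10, 0))
night_critical = route(rules, 5, time(2, 0))
```

Keeping the rules as data, separate from the delivery code, is what allows schedules and delivery mechanisms to be changed without touching the alerting logic.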
In addition, Alert & Notification management should integrate with a help desk
solution to automatically raise an incident report or pass control to "Corrective Action
Management".
A live configuration.
Restore configuration to a fixed point in time when the configuration was reliable.
4.5 Monitoring
The capabilities that supplement a conventional bottom-up approach to monitoring
can vary from enterprise to enterprise depending on their current capability set. Below
are a number of key monitoring capabilities that are commonly required. These
capabilities should not be viewed in isolation, as many have a symbiotic relationship.
from components (e.g. URLs, Servlets, EJBs, DataSources, JVM, Connections, and
Caches) to monitor the performance, load, and usage of resources.
Environment Analysis can use both manual and automatic techniques, such as agent
discovery and metadata analysis, to establish knowledge regarding the infrastructure
environment.
4.6 Integration
While it is preferable to have a single management and monitoring solution, it is
unrealistic to expect that a single management and monitoring framework can support
every available infrastructure component now and in the future. Two-way integration
capabilities that cater for message exchange, bulk data exchange and extending the
framework are key in addressing the needs of the modern IT environment. Below are a
number of key integration capabilities:
Access business metrics such as KPIs and associate these metrics with existing
Services and SLA definitions. This allows administrators to correlate business
metrics with service availability, performance and usage, which in turn leads to
better diagnostic and root cause analysis.
Export bulk data to other solutions, such as business intelligence solutions, for
further consolidated analysis.
4.7.9 Reports
Reporting capabilities allow the definition of custom reports which can be saved in the
management repository to be reused and executed on an ad-hoc or scheduled basis.
4.7.10 Configurations
The management repository stores the infrastructure component configurations and
the static and dynamic relationships between infrastructure components. This enables
capabilities such as "Configuration Change Detection".
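To illustrate how stored configurations enable change detection, the sketch below compares a saved configuration with a live one. It is a minimal Python example under stated assumptions: configurations are modeled as flat dictionaries, the keys and values are invented, and `None` stands for a setting that is absent on one side.

```python
def detect_changes(saved, live):
    """Return {key: (saved_value, live_value)} for every setting that differs.

    None marks a key that is absent from one of the two configurations.
    """
    changes = {}
    for key in sorted(set(saved) | set(live)):
        old, new = saved.get(key), live.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical configurations for one infrastructure component.
saved_config = {"max_threads": 50, "log_level": "INFO", "port": 7001}
live_config = {"max_threads": 80, "log_level": "INFO", "ssl": True}
changes = detect_changes(saved_config, live_config)
```

The result covers modified, removed, and newly added settings in a single pass, which is the information a change report or an audit would need.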
5 Logical View
The logical view builds on the conceptual view by highlighting the architecture tiers
and the key interactions between capabilities. It is important to note that the
capabilities and interactions depicted in the Logical View are not specific to any
product or set of products.
Table 5-1 Example Collectors
Collector
Description
SQL Collector
Log Collector
The Log Collector reads through a log file for specific patterns
and returns any lines of the file that match. Log files can be
database alert logs, web server logs, or any other text-based file
where a pattern can be used to identify relevant content. For
example, this enables the monitoring of the response time data
generated by actual end-users as they access and navigate web
sites. Web servers collect the end-user performance data and
store it in the log file.
OS Command Collector
JVM Collector
DB Collector
Synthetic Transaction Collector
Configuration Collector
Component Collector
HTTP(S) Collector
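The behavior described for the Log Collector in the table above, reading a log and returning the lines that match a pattern, can be sketched in a few lines of Python. The log format, the sample lines, and the function name are hypothetical; a real collector would read from a file rather than a list.

```python
import re


def collect_matching_lines(lines, pattern):
    """Return every line (without its newline) that matches the pattern."""
    regex = re.compile(pattern)
    return [line.rstrip("\n") for line in lines if regex.search(line)]


# Hypothetical web server log entries: client, request, status, response time.
sample_log = [
    '10.0.0.1 "GET /index.html" 200 120ms\n',
    '10.0.0.2 "GET /missing" 404 15ms\n',
    '10.0.0.3 "POST /order" 500 2300ms\n',
]

# Pattern for HTTP 4xx and 5xx status codes following the quoted request.
errors = collect_matching_lines(sample_log, r'" [45]\d\d ')
```

Because the function accepts any iterable of lines, it can be passed an open file object directly, for example `collect_matching_lines(open(path), pattern)`.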
Once the data has been collected it is stored in an interim data store. The Threshold
Detector compares the data to any specified threshold to determine whether to trigger
an alert.
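The threshold comparison described above can be sketched as follows. This Python example assumes that collected data arrives as (metric, value) pairs and that a threshold is a simple upper limit; the metric names and limits are invented for illustration.

```python
def detect_threshold_breaches(samples, thresholds):
    """Return (metric, value, limit) for each sample above its threshold.

    Metrics without a specified threshold never trigger an alert.
    """
    alerts = []
    for metric, value in samples:
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            alerts.append((metric, value, limit))
    return alerts


# Hypothetical interim data and thresholds.
interim_data = [("cpu_percent", 93.0), ("heap_used_mb", 512.0), ("open_files", 40.0)]
thresholds = {"cpu_percent": 90.0, "heap_used_mb": 1024.0}
alerts = detect_threshold_breaches(interim_data, thresholds)
```

Only the CPU sample exceeds its limit here; the `open_files` metric has no threshold and is therefore ignored.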
The Upload Manager aggregates this interim target data with previously collected
target data. The Upload Manager then transmits the target data to the Monitoring
Engine. Examples of data transmitted include monitoring information, alert
conditions, target inventory details, and status information for any job or
administration operations that are performed on behalf of a client. The Monitoring
Engine in turn then stores the data in the Management Repository.
The DB Activity Monitor facilitates tracing of Java requests to the associated database
sessions and vice versa, enabling rapid resolution of problems that span different tiers.
The DB Activity Monitor reports SQL query performance, which helps facilitate SQL
and database performance tuning.
The Resource Monitor alerts administrators on abnormalities in Java memory
consumption.
The Memory Leak Analyzer captures multiple heap dumps over a period of time,
analyzes the differences between the heap dumps, and identifies the object causing the
memory leak.
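The comparison of heap dumps over time can be illustrated with a simplified sketch. The Python example below assumes each heap dump has already been reduced to a histogram of instance counts per class, and it uses "count grows in every successive dump" as a stand-in heuristic for a leak; real analyzers work on full object graphs. The class names and counts are invented.

```python
def find_leak_suspect(histograms):
    """Return the class whose instance count grows in every successive dump.

    histograms: list of {class_name: instance_count}, oldest first; at least
    two dumps are needed for a meaningful comparison. If several classes grow
    steadily, the one with the largest total growth is returned. Returns None
    when no class grows in every interval.
    """
    suspects = {}
    for cls in histograms[-1]:
        counts = [h.get(cls, 0) for h in histograms]
        if all(a < b for a, b in zip(counts, counts[1:])):
            suspects[cls] = counts[-1] - counts[0]
    if not suspects:
        return None
    return max(suspects, key=suspects.get)


# Hypothetical histograms from three heap dumps taken over time.
dumps = [
    {"java.lang.String": 1000, "com.example.CacheEntry": 200, "byte[]": 500},
    {"java.lang.String": 900, "com.example.CacheEntry": 900, "byte[]": 520},
    {"java.lang.String": 1100, "com.example.CacheEntry": 1800, "byte[]": 510},
]
suspect = find_leak_suspect(dumps)
```

Requiring growth across every interval, rather than only between the first and last dump, filters out classes whose counts merely fluctuate with load.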
The Root Cause Analyzer plays back transactions interactively from the browser and
enables an administrator to view the time spent in the network, the server, and the
response time breakdown by Servlet, JSP, EJB, JDBC, and SQL layers. This allows an
administrator to perform real-time and historical diagnostics on Java applications.
the comparison in a tabular format, and more detailed information is a drilling down
approach.
6 Product Mapping
This section describes how Oracle products fit together to realize the management &
monitoring framework defined in the previous sections.
6.1 Products
There are a number of products from Oracle that can be used individually to satisfy
specific management & monitoring needs, or used in combination to establish a
complete management & monitoring framework.
Table 6-1 Product List

Product: Oracle Enterprise Manager - Service Level Management Pack
Description: OEM - Service Level Management Pack actively monitors and reports on the
availability and performance of Services. In addition, it assesses the business impact
of any Service problem or failure, and indicates whether service level goals have been
met.

Product: Oracle Enterprise Manager - Diagnostic Pack for Oracle Middleware
Description: OEM - Diagnostic Pack for Oracle Middleware provides proactive monitoring
and advanced diagnostic capabilities that empower administrators to prevent crashes
and other undesirable outcomes in high load production environments. A lightweight
Java application monitoring and diagnostics tool enables administrators to diagnose
performance problems in production.

Product: Oracle Enterprise Manager - Diagnostic Pack for Non-Oracle Middleware
Description: OEM - Diagnostic Pack for Non-Oracle Middleware provides proactive
monitoring and advanced diagnostic capabilities for applications running on non-Oracle
middleware and for standalone Java applications to help administrators prevent crashes
and other undesirable outcomes in high load production environments.

Product: Oracle Enterprise Manager - Management Pack for Coherence
Description: OEM - Management Pack for Coherence provides comprehensive tools for
discovery, monitoring, reporting, events management, configuration management,
lifecycle management and deployment automation to simplify the management of an
organization's Oracle Coherence cluster.

Product: Oracle Real User Experience Insight
Description: Oracle Real User Experience enables enterprises to maximize the value of
their business critical applications by delivering insight into real end-user
experiences. It integrates performance and usage analysis enabling business and IT
stakeholders to develop a shared understanding of their application users' experience.

Product: Oracle Web Services Manager
The logical view only highlights a sampling of the capabilities of the conceptual
view
Extensive number of management packs, connectors, plug-ins to show on a single
mapping diagram.
Product Information
The mapping diagram positions each product with respect to its primary role. There
are several products that have some high-level functionality that overlaps with other
products, however this is not shown on the diagram. For a complete list of product
features, architecture documentation, and product usage, please consult the Oracle
Product documentation.
Note: Oracle Enterprise Manager addresses the core capabilities
required. This has been highlighted by the use of a light red box and
signifies that all capabilities fall within its boundaries. Other products
such as the individual packs are highlighted by the use of red. For
example, the "System Monitor" is addressed by the core Oracle
Enterprise Manager product while "Resource Monitor" is addressed
by the Diagnostic Pack.
Figure 6-1 Product Mapping
There are many management packs that define new target types, metrics and
collection definitions. These are highlighted in the Collection Engine and Target
sections within Figure 6-1, "Product Mapping".
7 Deployment View
Client Tier
8 Summary