
BPPM 9.5 Architecture & Scalability Best Practices

2/20/2014 version 1.4

Summary
This Best Practice document provides an overview of the core BPPM 9.5 architecture. Detailed
information and some configuration guidance are included so that the reader can get a solid
understanding of how the core solution components are connected and how they communicate
with each other. Best Practice recommendations are provided throughout. References to previous
versions are provided and discussed where appropriate.

Caveats & Limitations


This document covers basic implementation architecture and does not cover all possible functions. It is
focused on the core components of BMC ProactiveNet Performance Manager v9.5. For example, BPPM
Reporting is not included; it is addressed separately. The document also does not include
all possible implementation architecture options. Although the solution is very flexible and can be
implemented in multiple ways, this document follows Best Practice recommendations.
The information here is intended to augment the product documentation, not replace it. Additional
solution and product information can be found in the product documentation at the following URL.
https://docs.bmc.com/docs/display/public/proactivenet95/Home
The port numbers provided in this document are based on a default implementation.


Table of Contents
Summary
Caveats & Limitations
BPPM 9.5 Overall Architecture
    Architecture changes compared to BPPM 9.0
    BPPM Server Architecture
    Sybase Database Architecture
    Oracle Database Architecture
    Integration Service Hosts
    Event & Data Flow Processing
    Connection Details
Central Management & Administration (CMA)
    Single CMA Architecture Overview
    Multiple CMA Architecture
    Standalone BPPM Servers & CMA
    CMA Architecture Details
Staging Integration Services
    Overview & Functionality
    Staging Process Illustration
        Initial Agent Deployment
        Integration Service Policy Application
        Production Monitoring
Staging & Policy Management for Development, Test and Production
    Single CMA Instance Deployments
    Multiple CMA Instance Deployments
    General Recommendations
Interoperability
High Availability
    BPPM Application Server HA
    Data Collection Integration Services HA
    Staging Integration Service HA
    Event Management Cells HA
    PATROL Agents HA
    Sybase Database HA
    Oracle Database HA
BPPM 9.5 Scalability & Sizing
    BPPM Server Sizing Overview
    Integration Service Node Sizing Overview
    Configuring for Scalability
Implementation Order
Components & Dedicated Servers
Troubleshooting


BPPM 9.5 Overall Architecture


The diagram below illustrates the high-level architecture of the BPPM 9.5 core solution components.
[Diagram: the BPPM Server hosts the Web Operations Console, Service Impact Management & Alerting, Event Management, Root Cause Analytics, Monitoring/Trending/Reporting, and Central Management & Administration, backed by a local Sybase database or a remote Oracle database (RAC supported). User consoles (web GUI and Java Admin) connect to the BPPM Server, which also connects to an Event Management correlation cell. Integration Service hosts run 1) the Integration Service, 2) an Event Management Cell, 3) optional Event Adapters, and 4) an optional RT Server. Locally managed nodes (PATROL Agents, local monitoring), remotely managed nodes (PATROL Agent remote monitoring), and 3rd party data sources send events and performance data, including transaction response time data, through the Integration Service hosts to the BPPM Server. Arrows indicate the direction of connection requests.]

PATROL Agents collect performance data and generate events for availability metrics. Both
performance data and events from PATROL are streamed through the Integration Service nodes. (This
assumes the BPPM 9.5 Server and BPPM 9.5 Integration Service nodes are in use.)
The Integration Service nodes forward the performance data to the BPPM Server. Not all performance
data has to be forwarded. Performance data can be collected and stored at the PATROL Agents and
visualized as trends in the BPPM 9.5 console without having to stream the data to the BPPM Server.
This is configurable for each PATROL parameter. It is a Best Practice to limit the performance data
streamed to the BPPM Server to the following purposes.
1) Performance data for all parameters designated as KPIs should be streamed to the BPPM server
to support baselines, abnormality detection and predictive alarming.

2) Performance reporting in BPPM Reporting. Stream the data for all parameters that are required
in performance reports. This should be limited to KPI parameters but can be extended.
3) Include parameters that are necessary or desired for probable cause analysis leveraging
baselines and abnormalities. (A conceptual selection sketch follows this list.)
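
As a rough illustration of this selection logic, the following minimal Python sketch filters a parameter list down to the ones worth streaming. It is conceptual only; the parameter names and the kpi/report/probable-cause flags are hypothetical stand-ins for what you would actually configure in CMA policies.

    # Conceptual sketch: decide which parameters to stream to the BPPM Server.
    PARAMETERS = [
        {"name": "CPUUtilization", "kpi": True,  "in_reports": True,  "probable_cause": True},
        {"name": "MemUsedPct",     "kpi": True,  "in_reports": False, "probable_cause": True},
        {"name": "ProcCount",      "kpi": False, "in_reports": False, "probable_cause": False},
    ]

    def should_stream(param: dict) -> bool:
        """Stream only KPIs, reported parameters, and probable cause inputs."""
        return param["kpi"] or param["in_reports"] or param["probable_cause"]

    streamed = [p["name"] for p in PARAMETERS if should_stream(p)]
    print(streamed)  # ['CPUUtilization', 'MemUsedPct']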
The Integration Service processes forward events to the event management cells running on the Integration
Service hosts. These cells filter and enhance events, then forward them to an event
management cell used for correlation.
Best Practices and the options available are discussed further in this document.
BMC strongly recommends that you set up environments for BPPM development and BPPM test
separate from production.
Architecture changes compared to BPPM 9.0
Much of the overall architecture remains unchanged from the previous release; however, there are some
significant changes. The major high-level changes are listed below.
1) The Integration Service process has been significantly simplified.
2) Support for multiple Oracle schemas in the same Oracle instance is provided. This applies to the
BPPM Application Server database and the BPPM Reporting database.
3) Connectivity between the Integration Service processes and the Central Management &
Administration module (CMA) has been consolidated. CMA now communicates with each
Integration Service through the BPPM Server that the Integration Service is connected to.
4) In 9.0 data is sent from PATROL Agents to the Integration Service nodes, but the BPPM Server
polls a single data point every 5 minutes from the Integration Services. In 9.5 data is now
streamed from PATROL Agents through the Integration Service to the BPPM Server.
Consequently, every data point is now collected by the server and stored in the database for
performance parameters that are streamed.
5) With BPPM 9.5 events are now streamed from PATROL Agents to the Integration Services on the
same port that performance data streams to. The Integration Service then sends the events to
remote cells or directly to the BPPM Server. In BPPM 9.0 PATROL Agents send events directly to
remote cells on a separate port.
Details regarding these changes are discussed further in this document.

BPPM Server Architecture
The BPPM solution supports installing a single BPPM Server or multiple BPPM Servers in the same
environment. The overall architecture diagram earlier in this document illustrates a single server
environment. The diagram below illustrates a multiple BPPM Server environment with a Central BPPM
Server hosting the Central Management and Administration (CMA) module and Child Servers.
[Diagram: a Central BPPM Server with CMA distributes policies and service model information to Child BPPM Servers 1 through N+1. Each Child BPPM Server receives data and events from its Integration Host(s), which in turn receive data and events from the PATROL Agents they monitor. Arrows indicate the direction of connection requests.]

A multiple BPPM Server implementation supports distributed service models so that specific
Configuration Items in one BPPM Server can be visible in another model in a separate server. This is
supported by Web Services installed as standard with the BPPM Server(s). The Central BPPM Server acts as
a single point of entry for users and provides a common point to access service models.
Although not required, for most environments BMC recommends installing the top tier BPPM Server as
a Central server with the CMA module included.
A single BPPM solution implementation cannot support mixed versions of BPPM servers. This includes
the Central Management and Administration module. All BPPM Server versions must be the same in a
single environment.

The following are best practices for the BPPM Server.


1) Install and use the BPPM Administration console on a host separate from the BPPM Server. Use
the instance of the BPPM Administration console that is installed with the BPPM Server for
emergency use only.
2) Install IIWS and all other integrations on servers separate from the BPPM Server (for example,
on an Integration Service host) unless specifically stated otherwise in BMC documentation. This
requirement does not apply to the Integration for BMC Remedy Service Desk (IBRSD).
3) Install a PATROL Agent and the Monitor the Monitor Knowledge Module (KM) on the BPPM
Server in order to monitor the BPPM Server externally. The BPPM Server includes built-in self-monitoring;
however, the Monitor the Monitor KM provides a way to monitor the BPPM Server
externally.
4) Set up a separate event/notification path for external monitoring of the BPPM infrastructure so
that you are not dependent on the BPPM infrastructure to generate and process alarms related
to it being down or running in a degraded state.
5) Do not try to forward performance data to a Central BPPM Server. Performance data cannot be
forwarded to a Central BPPM Server; only events can.
Sybase Database Architecture
The BPPM Server is supported with one of two database options. You can install the embedded Sybase
database that comes with the product, or you can leverage an Oracle database that you provide. If you
choose the Sybase option, the Sybase database is installed with the BPPM Server on the same host with
the application server and web server components. The Sybase database cannot be installed on a
separate server.
The Sybase database should be used in the following situations.
1) An Oracle license is not available
2) No Oracle DBA is available
3) Robust database availability is not required
4) Small and medium environments where Oracle is not available
Please see the product documentation for details regarding database topics.

Oracle Database Architecture
If you choose the Oracle database option you must provide an Oracle instance. The Oracle instance must
be installed on a separate host from the BPPM Server. You have the option of allowing the BPPM Server
installer to create the schema for BPPM in the Oracle database, or you can create the schema manually
using scripts provided with the installer. Please see product documentation for additional details
regarding the install options and process.
With BPPM 9.5, multiple BPPM Application Servers can be supported with a single Oracle instance. This
is accomplished by creating/allocating separate Oracle schemas in the single Oracle instance, one for
each BPPM Application Server. Obviously database resources and the sizing of the Oracle instance SGA
have to be increased to support this. The diagram below illustrates how multiple BPPM Servers can
share a single Oracle instance. NOTE: This is not possible with the Sybase database.
[Diagram: a Central BPPM Server with CMA and the Development, Test, and Production BPPM Servers each connect to a dedicated schema in a single Oracle database instance on one Oracle server.]

Likewise, multiple BPPM Reporting instances can share the same Oracle instance.
WARNING: An Oracle instance should never contain a schema or schemas for the BPPM Server while
also containing a schema or schemas for BPPM Reporting. The BPPM Application Server instance(s) and
reporting instance(s) must be separated for performance reasons. Additionally, the Oracle database
instances for BMC components should be dedicated to BMC products and should not contain any third
party application data or code. The diagram below illustrates these requirements.
[Diagram: BPPM Servers 1 through N+1 each use a dedicated schema (Schema 1 through N+1) in one Oracle instance for the BPPM application database, while Report Engines 1 through N+1 use schemas in a separate Oracle instance for the BPPM Reporting database.]

Each schema in an Oracle instance must have a unique Oracle database user that owns the schema.
When you install the BPPM Server, the installer prompts you for the user who owns the schema for the
current instance. Be sure to enter a unique user for each BPPM Server instance you install.

Additionally, each unique BPPM Server schema should be installed into separate data files and
corresponding tablespaces in the Oracle instance. The BPPM Server installer allows you to specify these
criteria. (A hypothetical manual preparation sketch follows.)
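
For illustration, this separation can also be prepared manually ahead of the install. The sketch below uses Python with the cx_Oracle driver to create one tablespace and one owning user per BPPM Server schema; the host, credentials, paths, and object names are examples only, and your DBA's standards take precedence.

    import cx_Oracle  # classic Oracle driver; the newer python-oracledb is similar

    # Placeholders: adjust host, service, credentials, sizes, and paths for your site.
    dsn = cx_Oracle.makedsn("oradb01.example.com", 1521, service_name="BPPMDB")
    conn = cx_Oracle.connect("system", "YourDbaPassword", dsn)
    cur = conn.cursor()

    # One owning user and one tablespace per BPPM Server schema (names are examples).
    for owner in ("BPPM_PROD1", "BPPM_PROD2"):
        cur.execute(
            f"CREATE TABLESPACE {owner}_TS "
            f"DATAFILE '/u01/oradata/BPPMDB/{owner.lower()}_01.dbf' "
            f"SIZE 2G AUTOEXTEND ON"
        )
        cur.execute(
            f"CREATE USER {owner} IDENTIFIED BY ChangeMe_01 "
            f"DEFAULT TABLESPACE {owner}_TS QUOTA UNLIMITED ON {owner}_TS"
        )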

The BPPM Application Server installer requires remote connectivity to the Oracle instance and must be
able to connect as sysdba remotely. You should validate this connectivity before trying to install the
BPPM Application Server. It is a Best Practice to install SQL*Plus or the Oracle Instant Client on the
target BPPM Application Server and test/validate Oracle database connectivity as sysdba from that
server before starting the BPPM Server install (a connectivity test sketch follows). Please see product
documentation for additional information regarding Oracle.
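
A minimal way to script that validation is shown below, driving SQL*Plus from Python; the host, port, service name, and password are placeholders for your environment.

    import subprocess

    # Placeholders: replace with your Oracle host, listener port, and service name.
    CONNECT = "sys/YourSysPassword@//oradb01.example.com:1521/BPPMDB as sysdba"

    # -L: exit on a failed logon instead of reprompting; -S: suppress banners.
    result = subprocess.run(
        ["sqlplus", "-L", "-S", CONNECT],
        input="SELECT 1 FROM dual;\nEXIT;\n",
        capture_output=True, text=True, timeout=60,
    )
    print(result.stdout or result.stderr)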
Oracle can be configured so that each database instance has a unique Oracle listener, or a single listener
can support multiple database instances. As a best practice it is recommended to designate a unique
Oracle listener for each database instance. This isolates listener issues to a single instance. Additionally
high availability should be setup for the Oracle listeners and the databases. BMC recommends
leveraging Oracle RAC for database high availability. Please see BMC product documentation for details
regarding BPPM and Oracle RAC. Please see Oracle documentation for additional Oracle related high
availability configuration.
BMC recommends leveraging the same database platform for all BPPM Server databases across the
environment. Although it is technically possible to install some BPPM Servers using the embedded
Sybase database and others using Oracle, standardizing on one platform provides a common way to
manage high availability, backup/restore, and export/import of data from one instance to another.
Note that database export/import is only possible between the exact same versions and patch levels.
In previous releases of BPPM each instance of both the BPPM Application Server and the BPPM
Reporting components required a dedicated Oracle instance. (This assumes Oracle was the chosen
database for the BPPM Server, not Sybase.)
The Oracle database option should be used in the following situations.
1) Large environments
2) When an Oracle license is already available
3) When the customer has on-site Oracle DBA expertise
4) When Oracle is a standard database platform used in the environment
5) When robust database availability is required
The following are additional best practices when using Oracle as the database platform.
1) Use Oracle RDBMS v11.2.0.2.0 or v11.2.0.3.0.
2) Create at least two BMC ProactiveNet users, one for data storage and one for data views.
Consider a third backend user to manage issues like locked accounts.
3) Physically co-locate the BPPM App Server and the DB Server on the same subnet.
4) The backup and restore process must be executed by BMC ProactiveNet users.
5) Use BMC Database Recovery Management or an Oracle tool such as RMAN.
6) Enable archive logging.
7) Use Oracle RAC for High Availability
8) Use Oracle Data Guard for Disaster Recovery
9) Use a Storage Area Network (SAN) for Oracle storage.

Integration Service Hosts
The diagram below illustrates how Integration Service nodes fit into the BPPM 9.5 architecture. A
reference to the 9.0 architecture is included for comparison.
[Diagram: three deployment patterns. BPPM 9.0 (all environments): version 9.0 PATROL Agent nodes send data to the Integration Service and events to an event cell on the Integration Service host, which forward data and events to the BPPM Server on separate paths. BPPM 9.5 (very small or POC environments): version 9.5 PATROL Agent nodes stream data and events to the Integration Service and event cell running on the BPPM Server itself. BPPM 9.5 (recommended): version 9.5 PATROL Agent nodes stream data and events to remote Integration Service hosts, which forward them to the BPPM Server. Arrows indicate the direction of connection requests.]

The BPPM 9.5 Integration Service processes accept streaming of PATROL data and events on a
common connection port; the default port is 3183. This includes all data points and events from
PATROL for the parameters that you select. Once events arrive at the Integration Service, they are
separated and routed to one of the following destinations, based on configuration (a conceptual
routing sketch follows this list):
1) The Integration Service local cell (default behavior)
2) A Named Event Cell
3) The BPPM Server associated to the Integration Service
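
Conceptually, the routing decision looks like the sketch below. This is an illustration only; the configuration keys and helper functions are hypothetical stand-ins for the actual Integration Service settings.

    # Conceptual sketch of the per-Integration-Service event routing choice.
    def send_to_cell(event, host, port):
        print(f"-> cell {host}:{port}: {event}")

    def send_to_bppm_server(event, server):
        print(f"-> BPPM Server {server}: {event}")

    def route_event(event, config):
        """Send an incoming PATROL event to exactly one configured destination."""
        target = config.get("event_destination", "local_cell")  # default behavior
        if target == "local_cell":
            send_to_cell(event, "localhost", 1828)      # the IS local cell (default)
        elif target == "named_cell":
            host, port = config["named_cell"]
            send_to_cell(event, host, port)             # a named event cell
        else:
            send_to_bppm_server(event, config["bppm_server"])  # associated BPPM Server

    route_event({"class": "ALARM", "host": "web01"}, {"event_destination": "local_cell"})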

NOTE: PATROL streams raw performance data to the BPPM Server; it is not summarized in transit.
The data does get summarized in the BPPM Server (as in previous versions), but raw data is sent from
the PATROL Agents. This includes all data points for the parameters that you decide to send.

The architecture also supports buffering of PATROL performance data and events at the PATROL Agents
in case there is a network connectivity issue or the Integration Service otherwise cannot be reached.
When the PATROL Agent reconnects to an Integration Service process, the buffered data is sent. This
capability is not intended to support buffering for very large amounts of data. It is intended to support a
few minutes of lost connectivity, not hours or days. Testing has shown that the process can support up
to 30 minutes of data collected by PATROL Agents across one thousand managed servers.
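
For rough capacity intuition, the arithmetic below estimates how many data points such an outage window represents; the per-agent parameter count and polling interval are assumptions for illustration, not product limits.

    # Back-of-envelope buffer estimate; all inputs are assumptions for illustration.
    agents = 1000              # managed servers, from the tested scenario above
    params_per_agent = 50      # assumed monitored parameters per agent
    poll_interval_min = 1      # assumed polling interval in minutes
    outage_min = 30            # tested buffering window

    buffered_points = agents * params_per_agent * (outage_min // poll_interval_min)
    print(f"{buffered_points:,} buffered data points")  # 1,500,000 buffered data points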
The BPPM 9.5 Integration Service processes are generally stateless, meaning the following.
1) The 9.5 Integration Services do not cache PATROL Agent namespace data and data points as in
9.0. The data is now streamed directly through to the BPPM Server. The server now gets every
data point rather than only a 5-minute snapshot of data points cached at the Integration
Service.
2) There are no adapters associated with PATROL data collection.
a. All filtering of performance data is handled at the PATROL Agents.
b. All filtering of events is handled at the PATROL Agents and if necessary in the event
management cells.
3) The Integration Service acts as a proxy to receive and forward both data and events that are
sent to it from PATROL Agents. It also receives PATROL Agent and Knowledge Module (KM)
configuration data from CMA and passes that data to PATROL Agents.
The following components can be optionally installed and configured on the Integration Service host,
depending on whether or not they are needed in the environment. Before installing any of these
additional components, scalability and additional required resources must be considered.
1) Event Management Cell - the event management process installed locally on the same server as
the Integration Service. It is a recommendation and Best Practice to install the Event
Management Cell on all Integration Service hosts.
2) RT Server - This assumes the environment includes the PATROL Central Console, which is not
required. Refer to PATROL documentation for RT Server requirements. Note that the Console
Server process should be installed on a separate machine.
3) Event Adapters - These work with the event management cell to consume non-PATROL events,
for example SNMP traps. Significant non-PATROL event collection should be dedicated to
other event management cells, as recommended in Best Practices for previous BPPM versions. The
default event adapter classes, rules and files are installed with the cell that is installed with the
Integration Service installer.
4) PATROL Agent and Knowledge Module (KM) for monitoring the Integration Service host
processes.

5) BMC Impact Integration Web Services (IIWS).
Certain processes that ran on the older Integration Service hosts are no longer needed and should not
be installed or used with a BPPM 9.5 Integration Service node. These include the following.
1) A PATROL Agent acting as a Notification Server
2) Integration Service data collection Adapters (used in 9.0 and previous versions)
3) The BII4P3 or BII4P7 processes
4) The pproxy process
The BPPM 9.5 Integration Service is able to consume and forward both performance data and events.
Technically, the cell is not required in order to forward events to the BPPM Server, so the cell does
not have to be installed with the Integration Service.
For most environments BMC recommends propagating events from the Integration Service to a lower
tier event management cell. This is especially important in environments that meet any of the following
conditions.
1) Involve more than a few thousand events in the system at any one time
2) Include multiple event sources other than PATROL
3) Have more than a few users
4) Are medium or large environments involving more than 100 managed servers
The event management cells allow you to further process events before sending them on to the BPPM
Server: for example, event enrichment, filtering, correlation, de-duplication, auto closure, etc. (A
conceptual pre-processing sketch follows the list below.) This type of event processing should be
avoided on the BPPM Server as much as possible. Event processing in the BPPM Servers should be
controlled and limited to the following.
1) Presentation of actionable events only
2) Collection of events for Probable Cause Analysis
3) Events used in Service Modeling
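
The kind of pre-processing a lower tier cell performs can be illustrated conceptually as below. Real cells implement this logic in their own rule language; the Python sketch and every field name in it are hypothetical.

    # Conceptual sketch only: real cells implement this in their rule language.
    from collections import defaultdict

    def lookup_owner(host):
        return "unknown"  # hypothetical enrichment lookup (e.g., a CMDB query)

    def preprocess(events, forward):
        """Filter, de-duplicate, and enrich events before forwarding upstream."""
        seen = defaultdict(int)
        for ev in events:
            if ev["severity"] == "INFO":              # filter non-actionable events
                continue
            key = (ev["host"], ev["class"], ev["parameter"])
            seen[key] += 1
            if seen[key] > 1:                         # de-duplicate repeat conditions
                continue
            ev["support_group"] = lookup_owner(ev["host"])  # enrich the event
            forward(ev)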
Events sent to the BPPM Servers should be closely controlled and limited for the following reasons.
1) The event presentation in the BPPM Server should not be cluttered with non-actionable events
that distract or otherwise reduce the efficiency of end users.
2) The new capability in BPPM 9.5 to view PATROL performance data in the BPPM Server without
having to forward and store the data in the BPPM database will likely reduce the quantity of
parameters that are actually trended in the BPPM Server for most environments. This will likely
increase the number of events propagated from PATROL for parameters that do not require
baselines but do require static thresholds, which in turn will increase the load on the event
management cell in the BPPM Server.
3) PATROL events are approximately twice the size in bytes compared to events generated in the
BPPM Server. A larger volume of PATROL events will increase the memory consumption of the
event management cell on the BPPM Server and will additionally increase BPPM Server start up
time. Overall startup time for a BPPM Server at full capacity ranges from 15 to 20 minutes.
4) The automated event to monitor association introduced in 9.5 has slightly increased the load on
the event management cell that is embedded in the BPPM Server.
The Central BPPM Server can act as a presentation server for all events processed in the child BPPM
Servers. Events can be propagated from the Child BPPM Servers to the Central BPPM Server to
accomplish this. Additionally, event management cells on the Integration Service hosts should be
integrated with the BPPM Server so that events in the remote cells are accessible in the BPPM Server
web console under the Other Cells view.
BMC recommends integrating the Child BPPM Servers with Remedy and other BMC products, such as
Atrium Orchestrator, for event processing related to those products. These integrations
should not be configured with the Central BPPM Server, except for the BMC Atrium SSO component.
The following are additional Best Practices for Integration Services, event management cells and the
Integration Service hosts.
1) Install an Integration Service for each major network subnet.
2) Limit the usage of HTTPS between the Integration Service nodes and the BPPM Server(s). HTTPS
is not as scalable as HTTP and HTTPS requires more administration.
3) Do not send raw events directly to the BPPM Server. Every environment should have at least
one lower tier event management cell.
4) Install the event management cell on all Integration Service nodes.
5) Additional event management cells should not be installed on the BPPM Server.
6) Install additional event management cells on Integration Service hosts and remote hosts as
needed.
7) Do not configure IBRSD, Notifications, or other global event forwarding integrations on the
lower tier event processing cells. Global event forwarding integrations should be configured on the
child BPPM Server(s).

8) The number and placement of event management cells should be based on the quantity of
events, event source domains (secure zones, geography, etc.), and major event sources. Always
deploy multiple event management cells in the following situations.
a. Large environments
b. Geographically distributed managed infrastructure
c. Large numbers of events
d. When different event sources require different event management rules, for example
large numbers of SNMP traps compared to events from PATROL
e. Significantly different event management operations are divided by teams
9) Configure display of remote event management cells in the BPPM server when necessary.
10) Install dedicated event processing cells to manage large volumes of events from common
sources like SNMP traps, SCOM, and other significant sources of events.
11) Distribute event management cells as required, based on event loads and event sources.
12) Deploy event management cells close to or on the same node as the event sources for 3rd party
sources.
13) Filter, enrich, normalize, de-duplicate and correlate events at the lowest tier event management
cells as much as possible before propagating to the next level in the event flow path.
14) Do not collect unnecessary events. Limit event messages sent from the data sources to
messages that require action or analysis.
15) Do not try to use the event management cells as a high volume SNMP trap forwarding
mechanism.
16) Use dedicated Integration Service hosts for large domain data collection, for example vSphere,
remote operating system monitoring and other large sources of data.
17) Install Integration Service hosts close to the data sources that they process data for. Deploy by
geography, department, business, or application, especially if multiple Integration Services are
required for a single source.
18) Do not collect excessive or unnecessary performance data. Review the need for lower polling
intervals considering server performance and database size.
19) Do not collect trends for availability metrics.

Event & Data Flow Processing
As already discussed above, both performance data and events are sent to the Integration Service
process from the PATROL Agents over the same TCP communication path. The Integration Service
process forwards performance data directly to the BPPM Server. It forwards events to the event
management cell that is running locally on the same host as the Integration Service. The event
management cell further processes the events (filtering, enrichment, correlation, etc.) and forwards
them to the BPPM Server, either directly or through an enterprise event correlation cell.
PATROL Agent events are now considered "Internal Intelligent" events in BPPM 9.5. Previous versions
considered PATROL events "External" events, and they were managed like 3rd party events. PATROL
Agent events in BPPM 9.5 are mapped to the instance object they belong to (previously the mapping
was only at the device level). This monitor instance association of PATROL events improves Probable
Cause Analysis leveraging categorizations (e.g. Database, Application, Server, Network, etc.).
The following are additional Best Practice recommendations for Integration Services.
1) The Integration Service installed with the BPPM Server (locally on the BPPM Server) should be
configured as a Staging Integration Service. (In a POC environment it could instead be used for
data collection.) Staging Integration Services are discussed further in this document.
2) At least one remote Integration Service node should be deployed for all environments.
3) BMC generally recommends installing the Integration Service and event management cell in
pairs so that each Integration Service process has a corresponding event management cell
installed on the same host. In this configuration events are propagated from the Integration
Service to the event management cell running on the same host. The install of an event
management cell is an option available in the installer when installing the Integration Service.
4) It is very important to maintain the event flow path so that all events from any one PATROL
Agent are always processed through the same event management cell(s) (including cell HA pairs).
This ensures event processing continuity where automated processing of one event is
dependent on one or more other events from the same agent. A simple example of this type of
processing is the automated closure of critical events, triggered by OK events for the
same object that was in a state of critical alarm. If you do not maintain the same event flow
path per agent through the same event management cell(s), correlation of all events
from the same agent is not possible, because the necessary events are not received and available
in the same cell(s). (A conceptual sketch of this constraint follows this list.)
5) Some environments may require more than two IS nodes in a cluster and/or more than two
Integration Service nodes defined for each agent that is sending the data (events and
performance) through a 3rd party load balancer to the Integration Service nodes. This is
acceptable as long as all events from any one agent always flow through the same HA cell pair
so that event processing continuity as described above is maintained. For example if four
Integration Service nodes are clustered, then each node in the cluster should not have a cell
configured on it. Instead the cell should be on other systems (in an HA pair) so that the event
path remains the same for all events coming from the agents that the cluster handles.
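
The following minimal Python sketch shows why a stable agent-to-cell mapping matters; it is a conceptual illustration only, not how the product assigns cells, and all names are hypothetical.

    import hashlib

    CELL_PAIRS = ["cellpair_east", "cellpair_west"]  # hypothetical HA cell pairs

    def cell_for_agent(agent_name: str) -> str:
        """Deterministically map an agent to one HA cell pair.

        Because the mapping depends only on the agent name, a Critical event
        and the later OK event from the same agent always land in the same
        cell pair, so that cell can correlate them (e.g., auto-close the alarm).
        """
        digest = hashlib.md5(agent_name.encode()).hexdigest()
        return CELL_PAIRS[int(digest, 16) % len(CELL_PAIRS)]

    # Both events from "agent42" reach the same cell pair:
    assert cell_for_agent("agent42") == cell_for_agent("agent42")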
Configuration of the performance and event data that is sent from the PATROL Agents to the BPPM
Server is defined in policies that are automatically applied to the desired PATROL Agents. PATROL Agent
assignment is defined in each policy configuration based on specific criteria. The details of agent selection
criteria per policy are discussed further in the Central Management & Administration section of this
document. PATROL events and performance data are completely controlled at the PATROL Agent based
on these policies. You have complete control per parameter: data only, events only, data and events, or
neither. These configuration settings can be edited and changed on the fly without having to rebuild any
configurations or restart any processes.

Connection Details
PATROL Data Collection

The diagram below illustrates the default ports on which connections are made regarding
communications from the PATROL agents through the Integration Service process to the BPPM Server
for the BPPM 9.5 solution components. The direction of the arrows indicates the connection requests.
Please review the product documentation for further details.
[Diagram: default PATROL data collection ports. On the BPPM Server: Admin Cell on TCP 1827, Jserver/CMA, Agent Controller on TCP 12123, and Event Cell on TCP 1828. On the Integration Service node: Event Cell on TCP 1828, the Integration Service listening on TCP 12124 (control from the Agent Controller) and TCP 3183 (data and events from agents), and a self-monitoring PATROL Agent on TCP 3181. PATROL Agents on managed nodes listen on TCP 3181 and connect to the Integration Service on TCP 3183. Arrows indicate the direction of connection requests.]

Note the following simplifications and changes from BPPM 9.0.


1) Port 3182 is no longer listening on the Integration Service node for external connection requests.
The BPPM Server connects to port 12124 to send policies from CMA.
2) The number of processes and ports on the Integration Service host has been reduced. There is
no longer a pproxy process.

3) The Monitor the Monitor (MTM) KM does not discover and monitor the BPPM 9.5 Integration
Service and should not be used with it. The built-in self-monitoring is significantly enhanced in
BPPM 9.5, and the MTM KM is no longer needed. However, the PATROL Agent and operating
system KMs should be used for additional self-monitoring, and this is recommended.
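
A quick way to validate the default ports above during troubleshooting is a simple TCP connect test. The sketch below is a minimal Python example; the hostnames are placeholders, and the port list reflects the defaults described in this section.

    import socket

    # Default ports from the diagrams above; hosts are placeholders.
    CHECKS = [
        ("bppm-server.example.com", 12123),  # Agent Controller
        ("bppm-server.example.com", 1828),   # BPPM Server event cell
        ("is-host.example.com", 12124),      # Integration Service control port
        ("is-host.example.com", 3183),       # PATROL data/event streaming
        ("managed-node.example.com", 3181),  # PATROL Agent
    ]

    for host, port in CHECKS:
        try:
            with socket.create_connection((host, port), timeout=5):
                print(f"OK   {host}:{port}")
        except OSError as err:
            print(f"FAIL {host}:{port} ({err})")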
Administration & PATROL Consoles

The diagram below illustrates the ports and connections related to the BPPM Administration and
PATROL Consoles.
[Diagram: Administration and PATROL Console connections. On the BPPM Server: Admin Cell on TCP 1827, Jserver/CMA, Agent Controller on TCP 12123, and BEM Cell on TCP 1828. On the Integration Service node: BEM Cell on TCP 1828, the Integration Service on TCP 12124 and 3183, and a self-monitoring PATROL Agent. PATROL Central (Windows) connects to the PATROL Console Server, which connects to the RT Server on TCP 2059; PCM and the PATROL Classic Console connect directly to the PATROL Agents on managed nodes (TCP 3181). Paths carry data, events, policies, and authentication; arrows indicate the direction of connection requests.]

An instance of the BPPM Administration console should always be installed on a separate machine from
the BPPM Server. An instance of the BPPM Administration console is installed on the BPPM Server by
default. This instance of the BPPM Administration console should only be used in an emergency if
another instance is not available.

A PATROL Console is not required in every environment. The need to use PATROL Consoles and PATROL
Configuration Manager (PCM) is reduced with BPPM 9.5 due to enhancements in BPPM Central
Management and Administration. The PATROL Console should only be installed in environments where
specific PATROL console functionality is required.
The following are reasons to include PATROL Consoles and/or PCM. These will not apply to all
environments.
1) The environment has a legacy PATROL implementation and the PATROL Console functionality
needs to be continued for some period of time for migration and/or process related reasons.
2) Specific functionality in the PATROL Console is required that is not available in the BPPM 9.5
Console. Examples of functionality limited to the PATROL console include the following.
a. Menu commands that generate reports
b. Menu commands to initiate administrative actions that are built into the PATROL KMs
and run against the managed technology
c. Detailed analysis of certain events in PATROL
If functions like these are not used and/or not required in IT management processes, the PATROL
Consoles may not be necessary in the production environment. Do not install a PATROL Console
in the production environment if it is not needed.
3) Some PATROL Knowledge Modules (KMs) in use are not yet fully manageable in CMA. Check
BMC's website to verify which KMs are fully manageable in CMA, as this list is constantly being
updated. A list of compatible KMs can be found at the following URL.
https://docs.bmc.com/docs/display/public/proactivenet95/Monitoring+solutions+configurable+
through+Central+Monitoring+Administration
4) In certain situations detailed analysis of the PATROL Agent and KM operations may be necessary
for troubleshooting. This should be accomplished with a PATROL Central Console if it is required
in production. The PATROL Classic Console should only be used for KM development and never
used in a production environment.
5) Development of custom KMs to be loaded in CMA. This requires the PATROL Classic Console,
which can also be used to analyze the content of PATROL events at the PATROL Agent. Both of
these activities should be done in a development environment only; the PATROL Classic Console
should be used primarily for custom KM development.
6) When a KM's functionality and configuration cannot be understood without analyzing the KM
using the PATROL Console. This should be done in a development environment.

BPPM Administration Console

The diagram below illustrates the connections between the BPPM Administration Console and other
BPPM solution components. The ports listed are default ports and the arrows indicate the direction of
the connection requests.
[Diagram: the BPPM Java Administration Console connects to the BEM Cell (TCP 1828) and RMI Registry (TCP 1099) on the Integration Service host, and to the Jserver (TCP 12128), IAS (TCP 3084), BEM Cell (TCP 1828), and Admin Cell (TCP 1827) on the BPPM Server.]


Central Management & Administration (CMA)


BMC recommends implementing one of two architectures for CMA. The two choices are listed below
with their pros and cons.
Choice One - Implement a single CMA Instance for all environments including Development, Test and
Production.
Pros
1) The creation, testing, and deployment of monitoring policies into production are very easy
because you do not have to copy or export/import any data. The application of policies to
Development, Test and Production is simply managed in each policy's agent selection criteria.
2) It requires fewer infrastructure nodes and components. Only a single Staging Integration
Service host is needed, and only a single CMA instance is used.
Cons
1) This may not be supported at some sites where the necessary connections between the
Development, Test and Production environments are not available or not allowed over the
network.
2) Because applying policies is so easy, it is also easier for administrators to unintentionally apply
policies to production. However, this risk can be managed.
Choice Two - Implement a separate CMA instance for each of the Development, Test and Production
environments.
Pros
1) It is supported at sites where the necessary connections between the Development, Test and
Production environments are not available or not allowed over the network.
2) Provides a platform and supports policy management methods that help prevent
administrators from making mistakes when applying policies to production.
Cons
1) The creation, testing, and deployment of monitoring policies into production require more
manual effort because you have to export/import policy data from Development to Test and
from Test to Production.
2) Policies could get out of sync across the Development, Test, and Production environments
if not managed properly. Keeping them up to date is a more manual process supported
by the export/import utility.

3) It requires more infrastructure nodes and components. The Development, Test, and
Production environments should each have a dedicated Staging Integration Service host and
a dedicated CMA instance.
IMPORTANT: Neither method supports seamless creation, testing, and production deployment of
updates to, or deletions of, existing policies. Updates and deletions of policies that are already in
production should be created, tested, and populated to production leveraging the policy export/import
capability. This topic is discussed in detail in the configuration best practices.
In all scenarios, CMA communicates through the agent controller process on the BPPM Server(s) to the
Integration Service nodes.
These implementation architecture options are not installation options. The CMA components are the
same. These two implementation architecture options are simply choices in how you install CMA
instances and connect them to the various BPPM Servers.
Single CMA Architecture Overview
The diagram below illustrates the high-level architecture for a single CMA instance in a multiple BPPM
Server environment including Development, Test and Production.
[Diagram: single CMA instance. A Central BPPM Server with CMA distributes policies to the QA, Test, and Production BPPM Servers. Newly deployed PATROL Agents check in through a single Staging Integration Host; the QA, Test, and Production Integration Hosts receive data and events from their PATROL Agents and forward them to their BPPM Servers. Arrows indicate the direction of connection requests.]

With the Single CMA Architecture, a single Staging Integration Service node is used in the agent
deployment process for all agents. All BPPM Servers leverage the single CMA instance for all policy
management.

Multiple CMA Architecture


The diagram below illustrates the high-level architecture for multiple CMA instances across BPPM Server
environments including Development, Test and Production.
[Diagram: multiple CMA instances. The Development, Test, and Production environments each have a Central BPPM Server with CMA, a Staging Integration Service for newly deployed PATROL Agents, and their own child BPPM Servers, data collection Integration Services, and PATROL Agents. Policies move between the Development, Test, and Production CMA instances through manual policy export/import. Arrows indicate the direction of connection requests.]

With this architecture each environment has its own dedicated CMA instance and Staging Integration
Service. All policy application between environments is supported by the policy export/import utility.
Standalone BPPM Servers & CMA
In most multiple BPPM Server environments the CMA module will be installed with a Central BPPM
Server. However, it is possible to install CMA with a stand-alone BPPM Server and then manually
register the additional BPPM Servers with CMA after the install.
The following are reasons for installing CMA on stand-alone BPPM Servers and not leveraging the
Central Server capability.
1) The top tier BPPM Server is only needed to provide an enterprise event console.

2) BPPM Central Server functions are not needed.


a. Single point of entry for service model visualization.
b. Enterprise level map view.
CMA Architecture Details
The diagram below illustrates the default ports and connectivity that support Central Management and
Administration across multiple BPPM Servers. The arrows indicate the direction from which the
connections are established.
[Diagram: CMA connectivity. The BPPM Server with CMA exposes JMS on TCP 8093 and Web Services on TCP 80/443. Each child or leaf BPPM Server connects over JMS (8093) and Web Services (80/443), and its Agent Controller (TCP 12123) connects to the Integration Service (TCP 12124) on each Integration Service host. PATROL Agents on managed hosts (TCP 3181) connect to the Integration Service on TCP 3183.]

The detailed architecture above applies to both the Single CMA Architecture and the Multiple CMA
Architecture.


Staging Integration Services


Overview & Functionality
The Staging Integration Service was introduced in BPPM 9.5. It provides a single point in the
environment where all newly deployed PATROL Agents can register with the BPPM solution stack.
The staging process can leverage the Integration Service on the BPPM Server.
A Staging Integration Service supports a smoother process for deploying PATROL Agents into BPPM
environments in three major ways.
1) It eliminates the need to manage the decision and assignment of PATROL Agents to production
Integration Service nodes separately from the deployment process. (This assumes an environment
that includes multiple production Integration Service instances.) When you leverage a Staging
Integration Service, this decision and assignment is automated as part of the deployment
process.
2) It supports a smoother process for managing policies across Development, QA, Test, and
Production environments.
3) It reduces the number of PATROL silent install packages that have to be created and maintained.
PATROL Agent silent install packages are created so that the Staging Integration Service is defined in the
package. No other Integration Service is defined in the install packages. Although technically possible, it
is recommended as a best practice that other Integration Service instances not be defined in the install
packages. When a package is deployed and installed, the agent checks in through the Staging
Integration Service. When the agent checks in, the Central Management & Administration module
evaluates the agent selection criteria in a Staging policy and uses that data to automatically assign a data
collection Integration Service (or Integration Service cluster) to the agent. The agent selection criteria
can include any one of, or any combination of, the following (a conceptual matching sketch follows this
list).
1) A tag defined in the agent configuration
2) Hostname that the agent is running on
3) Operating System that the agent is running on
4) IP Address or IP Address range that the agent is running on
5) Agent Port
6) Agent version
7) Integration Service that the agent is already assigned to (assuming it is already assigned)
8) BPPM Server that the agent is already assigned to (in this case it is through a Staging Integration
Service)
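
To make the selection mechanics concrete, the sketch below shows conceptually how a Staging policy's criteria might be matched against a registering agent's attributes. This is an illustration only; the real matching is configured in the CMA policy UI, and every field name here is hypothetical.

    import fnmatch

    # Hypothetical representation of a Staging policy's agent selection criteria.
    STAGING_POLICY = {
        "tag": "prod-web",
        "os": "Linux",
        "hostname_pattern": "web*.example.com",
        "assign_to_integration_service": "is-prod-01.example.com:12124",
    }

    def matches(agent: dict, policy: dict) -> bool:
        """Return True when the agent satisfies every criterion in the policy."""
        return (
            policy["tag"] in agent["tags"]
            and agent["os"] == policy["os"]
            and fnmatch.fnmatch(agent["hostname"], policy["hostname_pattern"])
        )

    agent = {"tags": ["prod-web"], "os": "Linux", "hostname": "web03.example.com"}
    if matches(agent, STAGING_POLICY):
        print("Assign agent to", STAGING_POLICY["assign_to_integration_service"])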

Integration Service policies are the only CMA policies that are applied through a Staging Integration
Service. Monitoring policies are not applied through a Staging Integration Service. Additionally, Staging
Integration Service policies in CMA are only applied through a Staging Integration Service instance.
The architecture of network connections (communication protocol, ports, etc.) between the Staging
Integration Service, the PATROL Agents, and the BPPM Server is technically the same as with other
Integration Service instances.
The following are best practices for Staging Integration Service nodes.
1) Do not attempt to configure agents so that performance data and/or events are sent to a
Staging Integration Service.
2) Staging Integration Services must not be mixed with Data Collection Integration Services. They
must be configured, used, and managed separately from Data Collection Integration Services.
3) Configure the Integration Service on the BPPM Server as a Staging Integration Service. Do not
use it for data collection.
4) If firewall rules and security prevent using the Integration Service on the BPPM Server as a
Staging Integration Service, deploy a Staging Integration Service into the managed zone or zones.
5) Set up a single Staging Integration Service for each environment, for example one for
Development, one for Test, and one for Production. Or, if you have a single CMA instance for all
environments, set up a single Staging Integration Service for the entire implementation when
possible.
6) Consider high availability for Staging Integration Services. Refer to the High Availability section
for more information.

Staging Process Illustration
The diagrams in Steps 1 through 3 below illustrate the process of utilizing a Staging Integration Service.
Initial Agent Deployment
[Diagram: a newly deployed PATROL Agent connects to the Staging Integration Service, which connects to the BPPM Server. A General Integration Service collects from PATROL Agents performing general monitoring, and a Domain Integration Service collects from dedicated PATROL Agent and Integration Service nodes for domain monitoring. Arrows indicate the direction of connection requests.]

The diagram above illustrates three different Integration Service nodes and how they are used as follows.
1) The Staging Integration Service is used strictly for introducing new agents to the BPPM Server.
(An Integration Service has to be configured to work as a Staging Integration Service.)
2) The General Integration Service is used for collecting data from various deployments of PATROL
agents that are installed locally on the managed nodes. The term General is a description of
how the Integration Service is used and is not a configuration.
3) The Domain Integration Service is used for collecting data from PATROL Agents that provide
large volumes of data from a single source. Examples are VMware vCenter, PATROL Remote
Operating System Monitoring, NetApp, etc. The term Domain is a description of how the
Integration Service is used and is not a configuration.
The agent introduction process works as follows. A newly deployed PATROL Agent silent install package
is installed as shown above. The install package for the PATROL Agent contains configuration data
telling the PATROL Agent how to connect to the Staging Integration Service. When the new agent starts
for the first time, it registers with CMA through the Staging Integration Service. CMA then applies a
Staging policy to the agent based on the agent selection criteria in the policy. Agent selection criteria
define which agents the policy should be applied to. The Staging policy contains only agent selection
criteria and connectivity information for a data collection Integration Service node (or Integration
Service cluster). No other agent and/or KM configuration data can be defined in a Staging policy.
Integration Service Policy Application
[Diagram: after the Staging policy is applied, the newly deployed PATROL Agent switches from the Staging Integration Service to its assigned General or Domain Integration Service, which forwards its data and events to the BPPM Server.]

After receiving the Staging policy, the newly deployed agent switches to the data collection Integration
Service node (or Integration Service cluster) defined in the Staging policy. (The switch is represented by
the blue arrow in the diagram above.) The agent then receives any monitoring policies defined in CMA
that match each monitoring policy's agent selection criteria.
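Conceptually, the Staging policy rewrites the same connection variable so that it points at the data collection Integration Service node or cluster. A hedged before/after sketch, using hypothetical host names and the default port of 3183:

    # Before (from the install package): the agent connects to the Staging Integration Service
    "/AgentSetup/integration/integrationServices" = { REPLACE = "tcp:staging-is-01.example.com:3183" }

    # After the Staging policy is applied: the agent connects to its data collection Integration Service
    "/AgentSetup/integration/integrationServices" = { REPLACE = "tcp:gen-is-01.example.com:3183" }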
Production Monitoring
[Diagram: the Staging Integration Service now stands idle; PATROL Agents performing local monitoring report through the General Integration Service, and dedicated PATROL Agent and IS nodes for domain monitoring report through the Domain Integration Service, to the BPPM Server.]
The agent starts monitoring and continues to receive updates to existing monitoring policies and
new monitoring policies that match each monitoring policy's agent selection criteria.
NOTE: Agents do not move from Development to Test, and then to Production. All agents should first
check in with the appropriate Staging Integration Service, then move to their data collection Integration
Service. This supports the concept of creating install packages for Development and Test only, separate
from Production. This topic is discussed further in the BPPM 9.5 Configuration Best Practices.
Staging & Policy Management for Development, Test and Production


Single CMA Instance Deployments
A single Staging Integration Service and a single CMA instance can be used to support multiple BPPM
Servers. The diagram below illustrates how this is architected for Development, Test, and Production
BPPM Server environments.
[Diagram: a central BPPM Server with CMA distributes policies to the Development, Test, and Production (N, N+1) BPPM Servers. A newly deployed PATROL Agent registers through a single Staging Integration Host, and PATROL Agents performing monitoring report through the Development, Test, and Production (N, N+1) Integration Hosts to their respective BPPM Servers. Arrow direction indicates connection requests.]
All policies include agent selection criteria that allow you to completely control what policies are applied
to any and all PATROL Agents across the entire environment spanning Development, Test and
Production. This allows you to install a single CMA instance for the entire environment, and it
eliminates the need to recreate policies in production after they have been created in development and
tested. To accomplish this, define and edit one or more of the following agent assignment criteria in
the policies.
1) BPPM Server that the agent is assigned to (Best Practice)
2) Integration Service that the agent is assigned to
3) A tag defined in the agent configuration
4) Hostname that the agent is running on


5) IP Address that the agent is running on
The easiest method is to include the appropriate BPPM Servers in the agent selection criteria for the
policy, at the proper time, as illustrated by the example phases below.
Simply add the Test and Production BPPM Servers to the policy agent selection criteria when you are
ready to apply the policy to those environments. This simplifies the process of moving configuration
from QA to Test, and finally to Production. It also ensures a policy is not applied to any agents in
production until it has been tested and validated.
The following outlines the process as an example.
Phase 1: Only the BPPM Server named BPPMRHEL62-HM-QA is included in the agent
selection criteria when the policy is first created.

Phase 2: The BPPM Server named BPPMRHEL62-HM-TEST is added to the agent selection
criteria with an OR after the policy has been validated in QA.
Phase 3: The BPPM Server named BPPMRHEL62-HM-PROD is added to the agent selection
criteria with an OR after the policy has been tested and validated in the BPPMRHEL62-HM-TEST
environment and is ready to be applied to production.
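Expressed as a Boolean condition, the finished Phase 3 agent selection criteria take roughly the following shape. This is an illustration of the grouping only, not exact console syntax.

    ( BPPM Server equals "BPPMRHEL62-HM-QA"
      OR BPPM Server equals "BPPMRHEL62-HM-TEST"
      OR BPPM Server equals "BPPMRHEL62-HM-PROD" )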
WARNING: At least one BPPM Server must be included in the agent selection criteria in order to control
which BPPM Server environment(s) the policy is applied to. If you do not include at least one BPPM
Server, the policy will be applied to all agents, across all BPPM Servers, that match the agent selection
criteria of the policy. Additionally, the multiple BPPM Server values must be grouped with parentheses
and related with a Boolean OR as shown above. If you use a Boolean AND to relate the agent selection
criteria, the policy will not be applied because an agent cannot register with multiple BPPM Servers.
WARNING: Leveraging the BPPM Servers in agent selection criteria is powerful and has far reaching,
global implications. If you mistakenly add a production BPPM Server to the agent selection criteria for a
policy, the policy could be unintentionally applied to hundreds or thousands of agents in production.
Therefore it is extremely important that this process be managed carefully.
IMPORTANT: Updates to and deletion of existing policies apply to all agents that the policy's agent
selection criteria match. Consequently, using the process outlined above, it is not possible to test edits
to policies that currently apply to production without impacting production. Separate policies should be
created to test edits in the development and test environments, leveraging the export/import utility.
The details of this topic are discussed in the configuration best practices.
Tags in policies can also be leveraged to further control which agents policies are applied to. Tags
should be used to provide a second level of protection that prevents policies still in development or test
from being applied to production accidentally. Leveraging tags this way forces the user to not only add
the appropriate BPPM Server to the policy selection criteria, but also to add the appropriate tag. This
helps prevent the user from accidentally picking the production BPPM Server and saving the policy when
they did not mean to. Additionally, tags can be used to provide greater granularity for policy assignment
where the other agent selection criteria are not enough.
BMC recommends leveraging precedence in policies so that production policies have the highest
precedence and will not be superseded by development or test policies. If you follow this
recommendation, you will also have to adjust the policy precedence when you move a policy from
development to test, and finally from test to production.
The configuration topics above are discussed further in the BPPM 9.5 Configuration Best Practices.
Multiple CMA Instance Deployments
The policy export/import utility is used to move policies between environments where there are
multiple CMA instances. The diagram below illustrates how this is architected for Development, Test,
and Production BPPM Server environments.
[Diagram: separate Development, Test, and Production Central BPPM Servers, each with its own CMA instance; policies move between CMA instances via manual policy export/import. Each environment has its own Staging Integration Service for newly deployed PATROL Agents, its own data collection Integration Services (Production Integration Services N and N+1 serving Production BPPM Child Servers N and N+1), and its own PATROL Agents. Arrow direction indicates connection requests.]
In a multiple CMA instance deployment there is no need to specify BPPM Servers in the agent selection
criteria for Development and Test environments. This is not needed because the environments are
completely separated and do not share a common CMA instance with Production.
General Recommendations
Staging Integration Service nodes are integral to the proper and controlled usage of CMA. The Staging
Integration Service node(s) can be connected to any BPPM Server across the environment. The
Integration Service running on a BPPM Server should also be configured as a Staging Integration Service.
BMC recommends the following Best Practices for Staging Integration Services and CMA.
1) In large environments, designate at least one Integration Service node, separate from the BPPM
Server, as a Staging Integration Service node.
2) Set up high availability for Staging Integration Service nodes. This is especially important in
environments where deployment of managed nodes and PATROL Agents is rapid and frequent,
for example in a cloud services provider environment. Setting up high availability for the Staging
Integration Services ensures limited or no disruption of the monitoring deployment process.
3) Limit the number of Staging Integration Service nodes that are deployed in each environment.
Limit them to one or a single high availability pair per environment if possible. This will reduce
the number of components that have to be maintained. It will also ease administration and
assignment of monitoring policies across each environment.
4) Limit CMA to one instance in one BPPM Server (BPPM Server cluster) for each environment
where Development, QA, Test, and Production are each separate environments. This helps
control the application of policies within the different environments. You can facilitate this by
using the export/import capability of CMA policy content and by assigning the appropriate
BPPM Server to agent selection criteria.
5) Do not move agents from Development to Test, and then to Production. Only configuration
policies should be moved between environments.
Interoperability
1) In order to leverage all new functionality in BPPM 9.5, PATROL Agent version 9.5.00i or higher
must be used.
2) The BPPM 9.5 Integration Service can only communicate with PATROL Agent versions 9.5.00i or
higher. Older versions of agents are denied access if a connection is attempted. A
corresponding event is generated when this happens.
3) BPPM version 9.0.00 SP1 and 8.6.02 SP3 Integration Services can operate with BPPM 9.5 Servers.
The usage of 9.0.00 SP1 and 8.6.02 SP3 Integration Services in a BPPM 9.5 environment should
be limited to upgrade and migration scenarios where no other method is feasible.
4) The BPPM 9.5 Server and administration console must be manually configured to support 9.0.00
SP1 and 8.6.02 SP3 Integration Services if a fresh install is done.
5) When a BPPM Server 9.0.00 SP1 or 8.6.02 SP3 is upgraded, Integration Service nodes of the
same version are readily supported. Architecture specific to version 9.0.00 SP1 and 8.6.02 SP3
components is not included in this document. Please review BPPM 9.0.00 SP1 and 8.6.02 SP3
documentation and related version Best Practices for that information.
6) Integration Services older than 9.0.00 SP1 and 8.6.02 SP3 cannot interoperate with BPPM 9.5.
7) A PATROL Agent v9.5 can be connected to a v9.0.00 SP1 or 8.6.02 SP3 Integration Service if the
PATROL Agent is polled for data via a p3adapter profile in the pproxy configured for the
Integration Service. Streaming under this scenario is not allowed. Do not use this method in a
9.5 BPPM Server environment. Use this capability only in a 9.0 or 8.6 BPPM Server environment
where agents are already polled by the p3adapter. Although the preceding is possible, it should
generally be avoided.
8) A PATROL Agent v9.5 will be denied connection to older Integration Service versions if streaming
with the SA adapter configuration is attempted.
9) BPPM 9.5 can process events that are propagated from previous versions of the event
management cells. In some cases this may require configuration to support customizations
and/or specific event content and processing.
10) Mixed versions of BPPM Servers are discouraged in a single environment; however, events can
be propagated between different versions. Integration with CMA and integration across service
models between BPPM 9.5 and older BPPM Server versions should be avoided.
11) Consider the following regarding Intelligent Ticket integration with BMC Remedy ITSM.
a. There are two major parts to the Intelligent Ticketing 2.0 patch:
i. BPPM updates
ii. ITSM updates
b. The Intelligent Ticketing 2.0 patch updates are included in BPPM 9.5 and ITSM 8.1 SP1.
(BPPM 9.5 and ITSM 8.1 SP1 do not require the Intelligent Ticketing patch.)
c. Any version of ITSM below ITSM 8.1 SP1 requires the Intelligent Ticketing patch updates
to be applied to ITSM.
d. Intelligent Ticketing 2.0 includes support for ITSM 7.6.04 SP4 and 8.1. Versions prior to
these, including ITSM 8.0, are not supported.
Please refer to product documentation for additional details at the following URL.
https://docs.bmc.com/docs/display/public/proactivenet95/Supported+integrations+with+other+BMC+products
High Availability
The diagram below illustrates overall High Availability architecture to support fault tolerance for the
core BPPM 9.5 components.
[Diagram: the Production BPPM Server runs in an active/passive OS cluster, alongside a Central BPPM Server with CMA and a Test/QA/Dev BPPM Server. Data collection Integration Servers are clustered active/active, with optional 3rd party load balancers between them and the agents; a Staging Integration Server serves new deployments. Standard PATROL Agent nodes, dedicated nodes for Remote Monitoring, and dedicated PATROL Agent nodes for monitoring virtual metrics from vCenters connect up through the Integration Service tier. Arrow direction indicates connection requests.]
High Availability per component is supported and configured as follows.


BPPM Application Server HA
High Availability for the BPPM Application server is supported through operating system clustering. The
two servers in the cluster must be configured with shared storage between the two nodes. Please see
product documentation for details. BMC recommends leveraging a high speed SAN for storage. This is
unchanged from the previous product release. Although the QA, Test, Dev, and Central BPPM Servers in
the diagram above are not shown in operating system clusters for HA, they can be installed in a cluster.
Data Collection Integration Services HA
With BPPM 9.5, the Integration Service processes are stateless as previously discussed. This allows the
PATROL agent to automatically send performance data and events to another Integration Service
instance if the primary instance is not available. There is no concern for maintaining monitoring related
configuration at the Integration Service instances because no such configuration exists. Additionally
there is no association between Integration Service instances and specific PATROL Agents to be
maintained or otherwise managed by administrators at the Integration Service nodes.
BPPM 9.5 includes functionality to cluster Integration Service nodes. These Integration Service cluster
configurations are simple software settings referenced in policies. The configuration settings for a
cluster are stored in the CMA module as a Cluster. The cluster configuration simply contains
connectivity information, in the form of PATROL Agent variables, that instructs the agent(s) how to
connect to the first, second, third, and fourth Integration Service nodes grouped as the Cluster. There is
no built-in load balancing with these cluster configurations; however, all the Integration Service
instances are active, supporting active/active high availability.
You can include up to four Integration Service nodes in a single Cluster.
BMC recommends referencing clusters in Staging policies only.
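On the agent side, the net effect of a Cluster is an ordered, comma separated connection list. A minimal sketch of what a four-node Cluster might resolve to in the agent configuration, assuming hypothetical host names and the default port of 3183:

    "/AgentSetup/integration/integrationServices" = {
        REPLACE = "tcp:is-01:3183,tcp:is-02:3183,tcp:is-03:3183,tcp:is-04:3183"
    }

The agent works through this list in order, as described below.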
PATROL Agents will attempt to connect to the Integration Services in the cluster in the order in which
they are listed. When an agent loses its connection to the first Integration Service instance, it
automatically connects to the second instance in the list. When the first Integration Service becomes
available again, the agent does not automatically connect back to the first instance. It remains
connected to the instance to which it is currently, successfully connected.
Multiple Integration Service instances can also run behind a load balancer. This means a third party
load balancer can be placed between PATROL Agents and the Integration Services to support full
active/active High Availability fault tolerance and true load balancing of event and performance data
across multiple Integration Service processes running on different hosts. Generally, in large
environments BMC recommends leveraging load balancers as a Best Practice. This is a recommendation,
not a requirement. It ensures that the Integration Service tier will not be overloaded if/when there is an
event storm, or when an interruption in communication between the agents and IS nodes causes a
flood of cached data at the agents to be sent to the BPPM Server(s) through the Integration Service
nodes.
Staging Integration Service HA
The Staging Integration Service in the diagram is not shown in a cluster, and they are not included in
cluster configuration within the product. However, Staging Integration Service nodes can be configured
for redundancy. This is accomplished by setting up multiple Staging Integration Service nodes and
designating their connectivity information in a comma separated list for the PATROL Agent Integration
Service configuration variable.
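For example, a redundant staging pair could be delivered in the install package as a comma separated value (hypothetical host names, default port of 3183 assumed):

    "/AgentSetup/integration/integrationServices" = {
        REPLACE = "tcp:staging-is-01:3183,tcp:staging-is-02:3183"
    }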
An agent install package and/or a single policy should never contain configuration for multiple Staging
Integration Service nodes that are associated with different BPPM Servers.
Event Management Cells HA
High availability for the Event Management cells is provided through a built-in primary/secondary
configuration as an active and hot standby cell pair. Event sources, such as the Integration Services, are
configured to send events first to the primary cell. If the primary cell is not available the event source
sends events to the secondary cell. The cells automatically synchronize live event data so that events
are kept in sync between the two cells. The secondary cell is configured and operates as a hot
standby cell. The primary and secondary cells monitor each other. During a failover, the secondary cell
detects that the primary cell is not available and takes over event processing functionality. When the
secondary cell detects that the primary cell is back online, it synchronizes events with the primary cell
and switches back to standby mode. The primary cell then continues with event processing and
synchronizing with the secondary cell.
The following points are best practices regarding event management cell high availability.
1) The primary and secondary cells must be set up with the same knowledge base configuration. It
is critical that this requirement be followed. The synchronization process only synchronizes
event data including updates to events. It does not synchronize configuration data including
data in the knowledge base files, policies and data in data classes and data records.
2) The synchronization of knowledge base configuration and data in data classes must be manually
managed or automated with custom scripts, etc.
3) Never set up event propagation so that events only propagate to the primary or secondary cell.
Always leverage the multiple host definition (primary/secondary) for the destination
configuration of the HA cell pair in the mcell.dir configuration files (see the sketch after this list).
4) Use the same cell name for the primary and secondary cells.
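To illustrate item 3, an mcell.dir destination entry for an HA cell pair lists both member hosts under the single shared cell name. The following is a minimal sketch, assuming hypothetical host names and the default cell port of 1828; confirm the exact entry format in the cell documentation.

    # mcell.dir on each event source: one logical cell name, two physical hosts (primary, secondary)
    cell  pncell_prod  mc  cellhost-a:1828 cellhost-b:1828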
PATROL Agents HA
PATROL Agents running on the managed node that they are monitoring generally do not require high
availability. However PATROL Agents that monitor large domain sources such as vSphere or remote
operating system monitoring require high availability configurations in most environments. High
availability for the PATROL Agent is supported with operating system clustering, or other 3rd party
solutions like VMware HA.
Sybase Database HA
There are two database options that can be used with the BPPM Server. You can leverage the Sybase
database that is delivered with the product. The Sybase database is embedded and installed with the
BPPM Server if you choose this option. The second option is to use an Oracle database that you provide.
If the out of the box embedded Sybase database option is used, high availability for the database is
supported as part of the file system replication for the BPPM Server. The Sybase database cannot be
installed on a server separate from the BPPM Server. Please see product documentation for additional
details.
Oracle Database HA
High availability for the Oracle database is supported through a third party database availability
management solution. It is best supported using Oracle RAC. Please see BMC BPPM and Oracle product
documentation for details.
BPPM 9.5 Scalability & Sizing


This section does not provide comprehensive details on sizing. The intent of this section is to provide an
overview of sizing for BPPM 9.5. Please refer to product documentation for additional information.
Sizing and scalability details are provided in the product documentation at the following URL.
https://docs.bmc.com/docs/display/public/proactivenet95/Performance+and+scalability+details+for+common+deployments
BPPM Server Sizing Overview
A single large 9.5 BPPM Server host will scale to support the following maximum values. This includes
all processes and the event management cell.
1) Data for 1.7 million performance parameters
2) 20,000 devices
3) 250,000 monitored instances
4) 3 days of raw data
5) 3 months of rate data
6) 40,000 Intelligent Events per day
7) 350,000 External Events per day
The above values are independent maximums. If you reach any one of these values in a single BPPM
Server instance, you should consider that you have reached maximum capacity for that BPPM Server
instance.
This sizing is for a 64-bit host with 8 CPUs and 32 GB of RAM.
Do not try to exceed 250,000 monitor instances or 1,700,000 attributes on a single BPPM Server host.
Additional CPU and memory does not help to scale beyond these values.
In some environments the BPPM 9.5 database will require more storage space than previous versions.
This is due to the increased performance data collection rate. In previous versions performance data
was collected once every 5 minutes by default. In BPPM 9.5 PATROL agent collected data is streamed to
the BPPM Server and many of the parameters are collected every minute. You should allocate 600 GB
of storage (100 GB for the server + 500 GB for the database) in a large implementation.
Please refer to the documentation for additional details.
Integration Service Node Sizing Overview
A single large BPPM 9.5 Integration Service host will scale to support the following maximum values.
This includes the Integration Service process and the event management cell.
1) Data for 1.7 million performance parameters
2) 900 PATROL Agents
3) 250,000 monitored instances
4) 25 events per second
The above values are independent maximums. If you reach any one of these values in a single
Integration Service instance, you should consider that you have reached maximum capacity for that
Integration Service instance.
A 64-bit Integration Service host with 2 CPUs and 2 GB of RAM can scale up to 500 PATROL Agents.
A 64-bit Integration Service host with 4 CPUs and 8 GB of RAM can scale up to 900 PATROL Agents.
Do not try to exceed 250,000 monitor instances or 1,700,000 attributes on a single Integration Service
host. Additional CPU and memory does not help to scale beyond these values.
Please refer to the documentation for additional details.
Configuring for Scalability
Performance Data & Event Processing

Do not collect unnecessary data at the PATROL Agents. Configure monitoring in the order listed.
1) Do not configure or otherwise enable KMs for monitoring that are not needed. This applies to
entire KMs and application classes in KMs that are not needed. This has the biggest impact on
reducing unnecessary data.
2) Do not propagate performance data to the BPPM Server that is not needed in the BPPM Server.
Leverage the ability to visualize trended data from PATROL in the BPPM web console without
having to store it in the BPPM database for parameters that are not required in BPPM Reporting
and that do not require baselines and baseline related capabilities.
3) Leverage instance filtering in the monitoring policies so that only the instances that need to be
monitored will send performance data to the BPPM Server.
4) Leverage parameter performance and event data filtering in the monitoring policies so that only
the parameters that need to be monitored will send performance data or events to the BPPM
Server.
5) Reduce data collection frequency for collectors in the PATROL KMs when possible.
Do not send events to the BPPM Server from PATROL for parameters whose performance data is
already being sent to the BPPM Server. If the performance data for a parameter is sent to the BPPM
Server, events for that parameter should be generated in the BPPM Server. This eliminates unnecessary
events.
Do not propagate unnecessary events from PATROL or any other source to the remote cells or to the
BPPM Server. Filter events at the source when possible.
Filter events in the lowest tier event management cells when filtering is not possible at the event
source(s).
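As an illustration of cell-level filtering, a Knowledge Base filter rule written in MRL can drop low-value events before they are processed or propagated. The following is a minimal sketch of the general rule shape, assuming informational PATROL events are not needed upstream; verify the event class name and exact syntax against your cell's Knowledge Base.

    # Drop informational PATROL events so they are not processed or propagated upstream
    filter Drop_PATROL_Info_Events : PATROL_EV ($EV)
      where [ $EV.severity == INFO ]
    END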
Configure basic event processing, such as filtering, enrichment, timers, auto closure, and all other event
processing that can be instrumented at the lower tier event management cells, as much as possible. This
will reduce the event processing load on the BPPM Server and improve scalability as well as the start up
time of the BPPM Server.
Implement an event management cell configured for correlation as a middle tier event processing layer
between the remote event management cells and the BPPM Server. This will offload event correlation
from the BPPM Server and help further reduce the volume of events processed by the BPPM Server.
Data Retention

Raw data retention should be reduced to 3 days in large environments when acceptable (the default is 8
days). Reducing the raw data to three days will reduce the size of the database with no impact to
baselines and their calculations.
Do not store more raw data than necessary according to business and IT process requirements. If you
change this setting after having already collected and retained 8 or more days of raw data, there will be
an initial, one-time, nominal impact to performance on the BPPM Server as old data is pruned. If you
are concerned about this, you can reduce the raw data retention in increments until you get it down to
3 days.
Store raw data in BPPM Reporting separately for longer periods of time.
Do not reduce raw data retention in the BPPM Server to less than 3 days.
Do not extend raw data retention in the BPPM Server to more than 8 days.
Reporting in the BPPM Server

The following are best practices for reports generated within the BPPM Server and are not related to
BPPM Reporting.
The number of reports generated in the BPPM Server should be kept to a minimum.
The number of reports viewed by multiple concurrent users in the BPPM Server should be kept to a
minimum.
Historical reports should be configured in BPPM Reporting, instead of the BPPM Server.
Implementation Order
The following is the general recommended, high-level implementation order for the BPPM 9.5 core
components. Note that this has changed from previous releases. These steps are primarily for a fresh
install.
1) Install the BPPM Central server(s).
2) Install the BPPM Child server(s).
3) Install the BPPM Administration console on desktop node(s).
4) Install the Integration Service node(s) with event management cell(s).
5) Connect the Integration Service nodes to the BPPM Server(s) in the CMA console(s).
6) Configure the appropriate Integration Service nodes as Staging Integration Services including the
one(s) on the BPPM Server(s).
7) Configure Integration Service Clusters in the CMA console. Do not do this as part of step 5
above while you are connecting the Integration Service nodes to the BPPM Server(s).
8) Create agent/KM silent install packages for PATROL agents and Knowledge Modules in the CMA
Monitoring Repository, but do not deploy them to Production at this point in the process.
9) Create separate Staging policies for the BPPM Development, BPPM Test, and BPPM Production
environments that are assigned to the appropriate Integration Service instances.
10) Create monitoring policies to be tested in Development.
11) Create/configure event management policies and rules to be tested in Development.
12) Deploy the agent/KM silent install packages to Development/Test managed servers.
13) Validate the silent install packages, monitoring policies and event management configuration in
the Development/Test environments including all desired data collection and event
management processing.
14) Promote the validated monitoring policies and event management configuration into
Production.
15) Deploy the validated agent/KM silent install packages to Production servers.
16) Enable the monitoring policies in Production and verify proper functionality.
Components & Dedicated Servers


The chart below lists the major components of the BPPM solution showing which ones should be
installed on dedicated servers.
Component | Dedicated or Shared Server
BPPM Application Server | Dedicated
BPPM Server, Oracle database option | Installed on a server dedicated for Oracle databases
BPPM Server, Sybase database option | Installed as part of the BPPM Application server
Integration for BMC Remedy Service Desk | Installed as part of the BPPM Application server
Integration Service Host * | Dedicated server for event and data integration processes
Integration Service | Shared on an Integration Service Host
Impact Integration Web Service (optional) | Shared on an Integration Service Host
Remote Event Management Cell | Shared on an Integration Service Host
Event Adapters | Shared on an Integration Service Host
PATROL Agents for remote monitoring ** | Dedicated
PATROL RT Server (optional)+ | Shared on an Integration Service Host
PATROL Console Server (optional)+ | Dedicated
PATROL Central Web Console (optional)+ | Dedicated
* The Integration Service Host is not an installable component. It is a server dedicated for middle tier
data and event collection processes.
** PATROL Agents for Remote Monitoring include solutions for remote operating system monitoring,
vSphere, and other domains that involve larger amounts of data collected by a single agent.
+ The PATROL components marked optional should not be used in most production environments.
Note: The BPPM Application Server contains multiple processes, including the following.
1) Apache web server
2) JBoss Application server
3) Embedded Event Management Cell
4) Agent Controller
5) Rate Process
6) Additional application server processes

The above processes cannot be installed separately from the BPPM Server.

Troubleshooting
Please see product documentation at the following URL for troubleshooting information.
https://docs.bmc.com/docs/display/proactivenet95/Troubleshooting